Content Management With JudoScript
By James Jianbo Huang April 2004
Abstract
A content management system enables users to easily add and manipulate
content for publication services. The content is processed, either
statically or dynamically (in real time), into a viewable format such as
HTML. SGML is one of the most commonly used formats for content encoding,
and with JudoScript's convenient SGML statement, one can easily establish an
efficient CM system for writing and publishing. The
judoscript.com site, for example,
is managed by a custom CM system written entirely in JudoScript.
This short article demonstrates how to create a CM system with JudoScript, describes
the structure of the judoscript.com site, and includes a sample
site that is a subset of the real CM system.
A content management (CM) system enables users to easily add and
manipulate content for publication services, including web sites. The
content is almost always independent of its presentation. Content can
be stored in any format, such as SGML, XML, or a relational database. Many
publishers have developed their own content management
systems. Although most of them allow authors to use Microsoft Word®
to write books or articles, they typically mandate that authors strictly
use the publishers' Word template, so that manuscripts can be easily
converted to their internal content management format.
SGML is a viable format for publishing content. For instance, the
Document Style Semantics and
Specification Language (DSSSL) is a standard for specifying how
SGML documents are transformed and formatted for publication. XML is a
simplified, stricter subset of SGML, and many XML tools can be used to flexibly
produce different presentations, statistics, reports, etc.
However, XML may be too rigid for book and article writing, and
DOM-based XML software may run into memory problems with large documents.
In many situations, SGML is sufficient.
The presentation of content can be static or dynamic. Static
presentation preprocesses all the content into HTML for publication.
Dynamic presentation runs a presentation engine (such as
JSP/servlets, PHP, or any web application framework) to dynamically
return pages to viewers. There are pros and cons for both, and the
choice depends on your specific needs and requirements. Whenever
possible, static presentation is preferred, because the generated HTML
pages can be deployed to any kind of web server or to a local PC; it
is easier and performs better. The choice between static and dynamic
presentation is independent of the way content is encoded; that
separation is the nature of CM systems.
JudoScript's SGML scraping statement is the basis for SGML-based content
management systems. You can define custom tags to your heart's content,
and at the same time keep other tags such as those defined in HTML.
For instance, the following is the content of a document stored in a
file called article-1.sgml:
<doc title="La-la-la">
<!-- the first sentence -->
<J> is <em>the</em> <u>shell and scripting tool</u> for Java!
</doc>
The following code processes it into an HTML page:
// Process the document into an HTML page.
procArticle 'article-1.sgml', 'article-1.html';
/**
 * The "style-sheet" implementation for documents of type "article".
 */
function procArticle srcFile, destFile {
  htmlOut = openTextFile(destFile, 'w');
  do srcFile as sgml {
  <doc>:
    print <htmlOut> [[*
      <html>
      <head><title>(* $_.title *)</title>
      </head>
      <body>
      <h1>(* $_.title *)</h1>
    *]];
  <j>:
    print <htmlOut> '<b>JudoScript</b>';
  TEXT, <>, <!>:
    print <htmlOut> $_; // any other tags and text, print verbatim.
  </doc>:
    println <htmlOut> '</body></html>';
  }
  htmlOut.close();
}
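For readers who do not have JudoScript at hand, the same event-driven idea can be sketched in Python. This is an analogue under stated assumptions, not the article's implementation: the class name ArticleRenderer and the function render_article are hypothetical, and Python's standard html.parser stands in for the SGML statement. Custom tags get their own handlers, and everything else is copied through.

```python
from html import escape
from html.parser import HTMLParser


class ArticleRenderer(HTMLParser):
    """Hypothetical analogue of procArticle: custom tags are handled, the rest is copied."""

    def __init__(self):
        # Keep entity references as written instead of decoding them.
        super().__init__(convert_charrefs=False)
        self.out = []

    def handle_starttag(self, tag, attrs):
        if tag == "doc":
            # <doc title="..."> opens the page and supplies the title.
            title = escape(dict(attrs).get("title") or "")
            self.out.append(
                f"<html>\n<head><title>{title}</title>\n</head>\n"
                f"<body>\n<h1>{title}</h1>\n"
            )
        elif tag == "j":
            # <J> is a custom shorthand (the parser lowercases tag names).
            self.out.append("<b>JudoScript</b>")
        else:
            # Any other opening tag is copied as it appeared in the source.
            self.out.append(self.get_starttag_text())

    def handle_startendtag(self, tag, attrs):
        # Self-closing tags such as <br/> are copied as written.
        self.out.append(self.get_starttag_text())

    def handle_endtag(self, tag):
        if tag == "doc":
            self.out.append("</body></html>\n")
        elif tag != "j":
            self.out.append(f"</{tag}>")

    def handle_data(self, data):
        self.out.append(data)

    def handle_comment(self, data):
        self.out.append(f"<!--{data}-->")

    def handle_entityref(self, name):
        self.out.append(f"&{name};")

    def handle_charref(self, name):
        self.out.append(f"&#{name};")


def render_article(source: str) -> str:
    """Render one article document to an HTML string."""
    renderer = ArticleRenderer()
    renderer.feed(source)
    renderer.close()
    return "".join(renderer.out)


if __name__ == "__main__":
    sample = (
        '<doc title="La-la-la">\n'
        "<!-- the first sentence -->\n"
        "<J> is <em>the</em> <u>shell and scripting tool</u> for Java!\n"
        "</doc>\n"
    )
    print(render_article(sample))
```

The design mirrors the JudoScript version: one handler per custom tag, plus a catch-all that passes ordinary HTML through untouched, so authors can mix custom and standard tags freely.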
Documents in a content management system are always stored separately
from their presentations. The following code sample processes all the
articles and puts them onto a web site:
!include 'cm.judi'
// defines these 2 variables: cmroot and docroot
// and the procArticle() function.
function procAllArticles {
  list '*.sgml' in '${cmroot}/articles';
  for fname in $$fs_result {
    destFile = fname.getFileName().replace('.sgml', '.html');
    procArticle fname, '${docroot}/articles/${destFile}';
  }
}
Voilà! There you have an SGML-based content management system.
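The batch step can also be sketched in Python as a rough analogue. The function name process_all_articles and its parameters are assumptions made for illustration; the renderer is passed in as a plain function so that the sketch does not depend on any particular tag-processing code.

```python
from pathlib import Path
from typing import Callable, List


def process_all_articles(cm_root: Path, doc_root: Path,
                         render: Callable[[str], str]) -> List[Path]:
    """Render every *.sgml file under cm_root/articles into doc_root/articles."""
    src_dir = cm_root / "articles"
    dest_dir = doc_root / "articles"
    dest_dir.mkdir(parents=True, exist_ok=True)

    written = []
    for src in sorted(src_dir.glob("*.sgml")):
        # article-1.sgml becomes article-1.html in the destination directory.
        dest = dest_dir / src.with_suffix(".html").name
        dest.write_text(render(src.read_text(encoding="utf-8")), encoding="utf-8")
        written.append(dest)
    return written
```

Keeping the source tree (cm_root) and the published tree (doc_root) as separate arguments reflects the point made above: content lives apart from its presentation, and the published pages can be regenerated at any time.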
The whole judoscript.com site
is managed entirely by a custom content management system. The site
content falls into three groups:
- Articles -- these are handled much like the demonstration discussed
in the previous section.
- Reference -- this is a very complicated system. Only
a handful of document files are processed into many HTML
pages, which, along with sophisticated JavaScript code in the HTML,
make up the JudoScript Reference.
- Web site -- there are a few ad-hoc content formats. The
downloadable sample site is a cut-down version of this part. This is
discussed in detail in the next section.
The sample site download has the following directory structure:
/buildtools/build.judo
           /site.judo
           /common.judi
/home_src/home.sgml
         /articles.html
         /weblinks.html
         /contributions.html
         /sitemap.html
/share/....
The build.judo script is a menu-driven build script. It also
does some simple processing, but the major tasks are delegated to the
site.judo script.
The judoscript.com site-wide
pages all share a consistent look and feel: the logo and search box
at the top, a menu column on the left, and the content pane in the
lower right. The exception is the site map, which has no menu column
because the page itself lists all choices.
There are two kinds of content in this mini system. The content of
the home page uses a custom tag system, so it needs a corresponding
tag routine. The majority of the pages, however, are just plain HTML
content to be copied into the content pane of the final page. The site
map has its own layout and therefore its own routine. Thus,
we have three content processing routines:
genHomePage();
genSiteMapPage();
genSitePage(); // this is general-purpose.
Please refer to the site.judo script for details. The
complete build process can be illustrated as follows:
Content Source                                     Resultant Web Site
==============                                     ==================
share/* --------------------- copied ------------> ${docroot}/share/*
home_src/home.sgml ---------- genHomePage() -----> ${docroot}/index.html
home_src/sitemap.sgml ------- genSiteMapPage() --> ${docroot}/sitemap.html
home_src/weblinks.html ------ genSitePage(..) ---> ${docroot}/weblinks.html
home_src/articles.html ------ genSitePage(..) ---> ${docroot}/articles/index.html
home_src/contribution.html -- genSitePage(..) ---> ${docroot}/contrib/index.html
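To make the general-purpose routine concrete, here is a minimal Python sketch of wrapping plain HTML content in a shared page layout. It is an illustration only and does not reproduce site.judo: the layout markup, the names PAGE_LAYOUT and MENU_HTML, and the with_menu flag are all assumptions.

```python
from string import Template

# Hypothetical shared layout: top bar, optional menu column, content pane.
PAGE_LAYOUT = Template("""<html>
<head><title>$title</title></head>
<body>
<div id="top">logo and search box</div>
$menu<div id="content">
$content
</div>
</body>
</html>
""")

# Placeholder for the menu column shown on most pages.
MENU_HTML = '<div id="menu">menu column</div>\n'


def gen_site_page(title: str, content: str, with_menu: bool = True) -> str:
    """Place plain HTML content into the content pane of the shared layout."""
    return PAGE_LAYOUT.substitute(
        title=title,
        menu=MENU_HTML if with_menu else "",
        content=content.strip(),
    )


if __name__ == "__main__":
    # An ordinary page gets the menu column; the site map does not.
    print(gen_site_page("Web Links", "<p>Useful links.</p>"))
    print(gen_site_page("Site Map", "<ul><li>Home</li></ul>", with_menu=False))
```

Because the layout lives in one place, a change to the look and feel only requires rebuilding the pages; the content sources stay untouched.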
Anyone writing and publishing content quickly realizes how critical
a content management system can be. With a consistent data format for
your content, you can deliver it in different formats, collect
statistics and reports, produce cross-references, and so on.
SGML is one of the most commonly used formats for content management.
With JudoScript's convenient SGML statement, one can easily set up a content
management system. It may start with something simple, such as putting
a page layout around content, and then evolve to handle more complicated
pages.