This article is old and is being consolidated into the book.
Please refer to the corresponding chapter(s) therein.
If the chapters or sections are not completed yet, you can use this article.
Refer to the examples as they are tested against the latest code.
Content Management With JudoScript
By James Jianbo Huang, April 2004

Abstract

A content management system enables users to easily add and manipulate content for publication services. The content is processed, either statically or dynamically (in real time), into a viewable format such as HTML. SGML is one of the most commonly used formats for content encoding, and with JudoScript's convenient SGML statement, one can easily establish an efficient CM system for writing and publishing. The judoscript.com site, for example, is managed by a custom CM system written entirely in JudoScript. This short article demonstrates how to create a CM system with JudoScript, describes the structure of the judoscript.com site, and includes a sample site that is a subset of the real CM system.
1. Introduction to CM
A content management (CM) system enables users to easily add and manipulate content for publication services, including web sites. The content is almost always independent of its presentation. Content can be stored in any format, such as SGML, XML or a relational database. Many publishing houses have developed their own content management systems. Although most of them allow authors to use Microsoft Word® to write books or articles, they typically require that authors strictly follow the publisher's Word template, so that manuscripts can be easily converted to the internal content management format.

SGML is a very viable format for publishing content. For instance, the Document Style Semantics and Specification Language (DSSSL) is a standard for specifying how SGML documents are transformed and formatted, designed to cover most publishing needs. XML is a stricter, more formalized subset of SGML, and many XML tools can be used to flexibly produce different presentations, or to produce statistics, reports, etc. However, XML may be too rigid for book and article writing, and for DOM-based XML software, large documents may pose memory problems. SGML is, in many situations, sufficient.

The presentation of content can be static or dynamic. Static presentation preprocesses all the content into HTML for publication. Dynamic presentation runs a presentation engine (such as JSP/servlets, PHP or any web application framework) to return pages to viewers on request. There are pros and cons to both, and the choice depends on your specific needs and requirements. Whenever possible, static presentation is preferred, because the generated HTML pages can be deployed to any kind of web server or to a local PC; it is simpler and performs better. Whether presentation is static or dynamic is independent of how the content is encoded; this separation is the nature of CM systems.
2. The Basis for CM With JudoScript
JudoScript's SGML scraping statement is the basis for SGML-based content management systems. You can define custom tags to your heart's content, and at the same time keep other tags such as those defined in HTML. For instance, the following is the content of a document stored in an SGML file:

    <doc title="La-la-la">
    <!-- the first sentence -->
    <J> is <em>the</em> <u>shell and scripting tool</u> for Java!
    </doc>

The following code processes it into an HTML page:

    // Process that document into an HTML page
    procArticle 'article-1.sgml', 'articles-1.html';

    /**
     * The "style-sheet" impl. for the type "article" documents
     */
    function procArticle srcFile, destFile {
      htmlOut = openTextFile(destFile, 'w');
      do srcFile as sgml {
      <doc>:    print <htmlOut> [[*
                  <html>
                  <head><title>(* $_.title *)</title> </head>
                  <body>
                  <h1>(* $_.title *)</h1>
                *]];
      <j>:      print <htmlOut> '<b>JudoScript</b>';
      TEXT, <>, <!>:
                print <htmlOut> $_; // any other tags and text, print verbatim.
      </doc>:   println <htmlOut> '</body></html>';
      }
      htmlOut.close();
    }

Voilà! There you have an SGML-based content management system.

Documents in a content management system are always stored separately from their presentation. The following code sample processes all the articles and puts them onto a web site:

    !include 'cm.judi' // defines these 2 variables: cmroot and docroot
                       // and the procArticle() function.

    function procAllArticles {
      list '*.sgml' in '${cmroot}/articles';
      for fname in $$fs_result {
        destFile = fname.getFileName().replace('.sgml', '.html');
        procArticle fname, '${docroot}/articles/${destFile}';
      }
    }
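For readers without a JudoScript installation, the same tag-handler idea can be sketched in Python. This is only an analogue under stated assumptions, not the article's method: it uses the standard library's html.parser in place of JudoScript's SGML statement, and the names ArticleRenderer, render_article and SAMPLE are hypothetical. As in the JudoScript version, the custom <doc> and <J> tags get their own handlers, and everything else is passed through verbatim.

```python
from html import escape
from html.parser import HTMLParser


class ArticleRenderer(HTMLParser):
    """Turns a <doc> document with custom tags into an HTML page."""

    def __init__(self):
        super().__init__()
        self.out = []  # collected output fragments

    def handle_starttag(self, tag, attrs):
        # html.parser reports tag names in lower case, so <J> arrives as "j".
        if tag == "doc":
            title = escape(dict(attrs).get("title") or "")
            self.out.append(
                f"<html><head><title>{title}</title></head>\n"
                f"<body><h1>{title}</h1>"
            )
        elif tag == "j":
            self.out.append("<b>JudoScript</b>")
        else:
            # Any other tag is copied verbatim, attributes included.
            self.out.append(self.get_starttag_text())

    def handle_startendtag(self, tag, attrs):
        # Self-closing tags such as <br/> are copied verbatim, once.
        self.out.append(self.get_starttag_text())

    def handle_endtag(self, tag):
        if tag == "doc":
            self.out.append("</body></html>\n")
        else:
            self.out.append(f"</{tag}>")

    def handle_data(self, data):
        self.out.append(data)

    def handle_comment(self, data):
        self.out.append(f"<!--{data}-->")


def render_article(source: str) -> str:
    """Render one document held in a string and return the HTML page."""
    parser = ArticleRenderer()
    parser.feed(source)
    parser.close()  # flush any buffered trailing text
    return "".join(parser.out)


SAMPLE = """<doc title="La-la-la">
<!-- the first sentence -->
<J> is <em>the</em> <u>shell and scripting tool</u> for Java!
</doc>"""

if __name__ == "__main__":
    print(render_article(SAMPLE))
```

The design point carries over directly: the handlers play the role of a style sheet for one document type, so changing the presentation means editing the handlers and leaving the stored documents alone.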
3. The Structure of JudoScript.Com
The entire judoscript.com site is managed by a custom content management system. The site content falls into three groups.
4. Sample Site
The sample site download contains the following directory structure:

    /buidtools/build.judo
              /site.judo
              /common.judi
    /home_src/home.sgml
             /articles.html
             /weblinks.html
             /contributions.html
             /sitemap.html
    /share/....

The build.judo script is a menu-driven build script. It also does some simple processing, but the major tasks are delegated to the site.judo script.
The judoscript.com site-wide pages all employ a consistent look-and-feel: the logo and search box at the top, a menu column on the left and the content pane in the south-east corner. The exception is the site map, which has no menu column because the page itself lists all the choices.

There are two kinds of content in this mini system. The content of the home page uses a custom tag system, so it needs a corresponding tag routine. The majority of the pages, however, are just plain HTML content to be copied into the content pane of the final page. Thus, we have three content processing routines:

    genHomePage();
    genSiteMapPage();
    genSitePage(); // this is general-purpose.

Please refer to the site.judo script for details. The complete build process can be illustrated as follows:
    Content Source                                      Resultant Web Site
    ==============                                      ==================
    share/* --------------------- copied ------------>  ${docroot}/share/*
    home_src/home.sgml ---------- genHomePage() ----->  ${docroot}/index.html
    home_src/sitemap.sgml ------- genSiteMapPage() -->  ${docroot}/sitemap.html
    home_src/weblinks.html ------ genSitePage(..) --->  ${docroot}/weblinks.html
    home_src/articles.html ------ genSitePage(..) --->  ${docroot}/articles/index.html
    home_src/contribution.html -- genSitePage(..) --->  ${docroot}/contrib/index.html
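The genSitePage rows of this mapping lend themselves to a table-driven build step. The following Python sketch illustrates that idea only; it is not the contents of site.judo. The layout markup, the SITE_PAGES table and the function names gen_site_page and build_site_pages are assumptions made for the example, while the source and destination paths are taken from the diagram.

```python
import tempfile
from pathlib import Path

# (source under home_src, destination under docroot), as in the diagram.
SITE_PAGES = [
    ("weblinks.html", "weblinks.html"),
    ("articles.html", "articles/index.html"),
    ("contribution.html", "contrib/index.html"),
]


def gen_site_page(content: str, title: str) -> str:
    """Wrap plain HTML content in a shared page layout (hypothetical markup)."""
    return (
        f"<html><head><title>{title}</title></head><body>\n"
        '<div id="menu">menu column</div>\n'
        f'<div id="content">{content}</div>\n'
        "</body></html>\n"
    )


def build_site_pages(src_root: Path, docroot: Path) -> list:
    """Generate every general-purpose page whose source file exists."""
    written = []
    for src_name, dest_name in SITE_PAGES:
        src = src_root / "home_src" / src_name
        if not src.exists():
            continue  # nothing to build for this entry
        dest = docroot / dest_name
        dest.parent.mkdir(parents=True, exist_ok=True)
        page = gen_site_page(src.read_text(encoding="utf-8"), src.stem)
        dest.write_text(page, encoding="utf-8")
        written.append(dest)
    return written


if __name__ == "__main__":
    # Demonstration in a throwaway directory.
    with tempfile.TemporaryDirectory() as tmp:
        root = Path(tmp)
        (root / "home_src").mkdir()
        (root / "home_src" / "weblinks.html").write_text(
            "<p>Links</p>", encoding="utf-8"
        )
        for path in build_site_pages(root, root / "docroot"):
            print(path.relative_to(root))
```

Keeping the mapping in a table means that adding a page is a one-line change, and the layout lives in a single function.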
5. Summary
Anyone writing and publishing content quickly realizes how critical a content management system can be. With a consistent data format for your content, you can deliver it in different formats, collect statistics and reports, produce cross-references, and so on. SGML is one of the most commonly used formats for content management. With JudoScript's convenient SGML statement, one can easily set up a content management system. It may start with something simple, such as putting a page layout around content, and then evolve to produce more complicated pages.