XML Workflow for Publishers, Part 2

02, 2009
By Dr. Brijesh Kumar, Digital Media Initiatives

In Part 1 of this topic, we discussed text processing as a critical activity for a publisher, and the gradual evolution of generalized markup languages for single-source, multiple-format publishing and searchable documentation databases. The differences between the DITA and DocBook XML vocabularies were briefly touched upon. It was mentioned that simple formats, such as .doc (Word), have the potential to be converted into a generic DocBook XML document, which could be used for multiple-format publishing as well as for searchable databases.

Beginning to think in terms of XML requires some prerequisite understanding on the part of a publisher, including a conviction that managing content in a structured generalized markup language fetches far better returns than keeping all data and documentation in WYSIWYG desktop publishing tools. Prima facie, the best approach is to capture manuscripts in a common documentation interchange format. That may not be possible in all cases, so manuscripts may be accepted in simple traditional formats, such as a Word document (.doc) or an OpenOffice Writer ODF Text Document (.odt). These formats can easily be converted to another generic format such as HTML/XHTML and cleaned of any style information. Using HTML/XHTML has inherent benefits: both are markup languages with a distinct set of tags and can be managed in a WYSIWYG editor, and they are also easy to process programmatically, so they can be transformed into more structured XML formats such as DocBook XML or DITA.
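The cleaning step mentioned above, removing style information from an XHTML export, can be sketched with Python's standard library. This is a minimal illustration only: the list of presentational attributes, the function name strip_styles, and the sample document are assumptions made for the example, and a real Word or OpenOffice export would need a longer list and more careful handling.

```python
import xml.etree.ElementTree as ET

XHTML_NS = "http://www.w3.org/1999/xhtml"
# Keep the XHTML namespace as the default one when serializing.
ET.register_namespace("", XHTML_NS)

# Assumed list of attributes that carry presentation rather than structure.
STYLE_ATTRIBUTES = {"style", "class", "align", "bgcolor"}


def local_name(tag):
    """Return the tag name without its '{namespace}' prefix."""
    return tag.rsplit("}", 1)[-1]


def strip_styles(element):
    """Remove presentational attributes and <style> elements, in place."""
    for name in list(element.attrib):
        if name in STYLE_ATTRIBUTES:
            del element.attrib[name]
    for child in list(element):
        if local_name(child.tag) == "style":
            element.remove(child)  # drop embedded style sheets entirely
        else:
            strip_styles(child)
    return element


# Hypothetical fragment resembling a word-processor export.
SAMPLE = (
    '<html xmlns="http://www.w3.org/1999/xhtml">'
    '<head><title>Chapter 1</title><style>p { color: red; }</style></head>'
    '<body><h1 class="MsoTitle">Chapter 1</h1>'
    '<p style="margin:0" align="left">First paragraph.</p></body></html>'
)

root = strip_styles(ET.fromstring(SAMPLE))
cleaned = ET.tostring(root, encoding="unicode")
print(cleaned)
```

What remains after cleaning is only the structural markup (headings, paragraphs), which is the part a later transformation to DocBook XML or DITA would rely on.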

The potential grows as soon as the content is transformed into an XML structure. Schemas can be used to validate the content, which can then be transformed into presentation formats, both traditional paper print and numerous digital formats (Web, CD/DVD, eBook, Digital Talking Book). Key workflow processes should thus include an XML pipeline with quality assurance checks at various points. Human intervention and inspection can also be incorporated at appropriate levels. Automation should ensure consistent quality and flawless publishing of the end product(s). An XML pipeline is formed when XML processes, called transformations, are connected together. For instance, given two transformations T1 and T2, the two can be connected so that an input XML document is transformed by T1 and the output of T1 is then fed as the input document to T2. [1] There are significant efforts towards developing a standard XML pipeline language, such as the XML Pipeline Definition Language Version 1.0 [2]; many alternative efforts are also under way, including SXPipe [3], Apache Jelly [4], Apache Cocoon [5], and XProc [6]. We shall continue this discussion in subsequent posts.
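The chaining of T1 and T2 described above, with a quality assurance check placed between them, can be sketched in plain Python. This is not any of the pipeline languages listed; it only illustrates the idea that each step's output becomes the next step's input. The element mapping, the function names, and the sample input are all hypothetical.

```python
import xml.etree.ElementTree as ET
from functools import reduce


def t1_html_to_docbook(doc):
    """T1: map a tiny HTML-like subset to DocBook-style element names."""
    mapping = {"body": "article", "h1": "title", "p": "para"}
    out = ET.Element(mapping.get(doc.tag, doc.tag))
    out.text, out.tail = doc.text, doc.tail
    for child in doc:
        out.append(t1_html_to_docbook(child))
    return out


def qa_check(doc):
    """QA step: fail fast if any element is outside the expected vocabulary."""
    allowed = {"article", "title", "para"}
    unexpected = {el.tag for el in doc.iter()} - allowed
    if unexpected:
        raise ValueError(f"unexpected elements: {sorted(unexpected)}")
    return doc


def t2_add_ids(doc):
    """T2: give every para a sequential id attribute."""
    for number, para in enumerate(doc.iter("para"), start=1):
        para.set("id", f"para-{number}")
    return doc


def run_pipeline(doc, steps):
    """Feed the output of each step to the next one as its input."""
    return reduce(lambda current, step: step(current), steps, doc)


source = ET.fromstring(
    "<body><h1>Chapter 1</h1><p>First.</p><p>Second.</p></body>"
)
result = run_pipeline(source, [t1_html_to_docbook, qa_check, t2_add_ids])
output = ET.tostring(result, encoding="unicode")
print(output)
```

Because every step takes a document and returns a document, a check can be inserted or removed anywhere in the list without touching the transformations themselves. In a production workflow the QA step would more likely be schema validation than a simple tag-name check.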

---------

[1] http://en.wikipedia.org/wiki/XML_pipeline

[2] http://www.w3.org/TR/xml-pipeline/

[3] http://norman.walsh.name/2004/06/20/sxpipe

[4] http://commons.apache.org/jelly/pipeline.html

[5] http://cocoon.apache.org/

[6] http://www.w3.org/XML/Processing/

Copyright © 2010, ePubNow! All rights reserved.