Re: XML-based Help Authoring tools for customized help

Subject: Re: XML-based Help Authoring tools for customized help
From: David Neeley <dbneeley -at- oddpost -dot- com>
To: "TECHWR-L" <techwr-l -at- lists -dot- raycomm -dot- com>
Date: Sun, 14 Dec 2003 16:55:19 -0800 (PST)


Mark,

Thanks for your detailed treatment of my comments. I still think I may be a little hazy as to several of the points, though, so perhaps you can enlighten me further.

First, regarding whether DocBook is an "application" in the sense that Frame or Word are "applications": if it were, I would be surprised to see the mailing list for developers of applications called "docbook-apps" (see www.docbook.org).

Oh, I'll grant you that there are many solutions that are often used with the DocBook DTD so that they may seem a part of it, but at least officially DocBook is a specified DTD created originally for the documentation of computer software or hardware. Thus, it is no wonder that it is not appropriate for all types of documentation. Still, I can see no sense in which DocBook is an "application" in the same sense as Frame or Word...although I will readily grant that conceptually a DTD is very similar to a subset of applications such as Frame or Word. Conversely, of course, Word or Frame are both more than and less than DocBook or any DTD.

"What I mean is an application in the same general sense that Word or Frame
are applications. That is, as packages that include standard data formats
and collections of programs that operate on those data formats to accomplish
a defined set of objectives."

Perhaps I am simply not familiar enough with a DTD such as DocBook to understand where the "collections of programs that operate on those data formats" are.

"Bill's argument in favor of using Docbook rather than a custom markup
language was that the data format is well defined and published and that
there is a ready made set of programs to work with it, including both
editing applications and various processing and publishing programs. My
point was that Docbook, understood not only of the file format itself, but
the whole "tool chain" designed to work with that tool chain, is a packaged
application more or less equivalent to Word or Frame, or, for instance, Open
Office."

Now, I begin to see--I think. In other words, you are comparing a *group of tools* that work with the DTD in a "tool chain" as being equivalent to Word or Frame.

In that sense, though, you would also have to count the other tools that work alongside Word or Frame as part of their "tool chain" or process. Only then would the comparison hold against the rather strained collection of things you are lumping together and calling "DocBook."

"To illustrate the parallel further, let's compare Docbook to these other
packaged applications on several points.
Published file format: The Docbook format is published and downloadable
separately from the "tool chain", so are RTF, MIF and the Open Office
format."

Mark, I think you're being a bit disingenuous here. Besides the point that MIF is not the primary file format of Frame, to be fair we should compare structured Frame with OpenOffice or whatever-tool-chain-you-choose-to-produce-DocBook. That is, we might select Frame or OpenOffice as our editors of choice for producing...you guessed it!...DocBook. Again, the file format or DTD is *not* the application in the same sense of the word.

"Standard application package: Docbook has a standard package just like Word,
Frame, etc."

This, I readily admit, is a somewhat new concept to me. I have downloaded both the SGML and XML versions of DocBook--and all I ever found was a DTD. The parallel, unless there is something I have not previously found, again would not be Word or Frame, but the *format definitions* presumably of .rtf or .mif. Where do I err?

"Third party extensions: People can write Docbook processing applications
independent of the standard application package. Ditto for Word and Frame
(RoboHelp, WebWorks, etc., are examples.)"

So?

"User customizable: Users can extend the functionality of the Docbook tool
chain by writing their own code. People can use a standard parser to parse
the document structure. This is also true of Word. VBA gives you direct
access to the internals of a Word document without making you parse the file
format yourself."

Again, you compare "the DocBook tool chain" to Word in this case (or Frame with the Frame SDK or FrameScript, I suppose). However, again I am confused by the lumping together of whatever tool chain may be constructed to work with the DocBook DTD as a comparison with Word (or Frame or OpenOffice, for that matter).

In other words--Frame or OpenOffice (and perhaps the newest release of Word, but I am ignorant on that) may be readily employed *as part of the DocBook tool chain*--but that is not to say that DocBook *itself* is equivalent.
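
As a minimal illustration of that distinction, here is a sketch assuming Python's standard-library xml.etree.ElementTree parser and a made-up DocBook-style fragment (the sample text and the section_titles helper are invented for this example). It suggests that a generic XML parser, which knows nothing about DocBook as such, can read a DocBook instance:

```python
import xml.etree.ElementTree as ET

# Hypothetical DocBook-style fragment, invented for illustration only.
SAMPLE = """<article>
  <title>Widget Guide</title>
  <sect1>
    <title>Installing</title>
    <para>Unpack the archive.</para>
  </sect1>
  <sect1>
    <title>Configuring</title>
    <para>Edit the settings file.</para>
  </sect1>
</article>"""


def section_titles(xml_text):
    """Return the title of every sect1 element, in document order."""
    root = ET.fromstring(xml_text)
    return [section.findtext("title") for section in root.iter("sect1")]


print(section_titles(SAMPLE))
```

The parser here is a general-purpose tool; the element names are the only DocBook-specific part, and they come from the DTD rather than from any application.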

"Now you could certainly argue that Docbook had certain advantages over their
other packaged applications mentioned here (and some people would doubtless
reply with advantages of Word or Frame) but my point remains true: Docbook
is a packaged application like Frame, Word, and OpenOffice, and is designed
and used for broadly the same purpose."

Okay, I surrender. Show me the parts of DocBook that are an "application package like Frame, Word, and OpenOffice." If I understand Norman Walsh's diagram of publishing with DocBook, the tools required are various.

> However, they are not designed to be altered with any sort of equivalent to schema
> or DTD...

"Sure they have. A Word stylesheet or a Frame template are the equivalent of
an XML DTD: They define a set of named "elements" which can be assigned to
spans of text. You can change a stylesheet or template just as you can
change a DTD, and for much the same purpose"

Again, a very strained comparison, I believe. Again, I cannot speak for the latest incarnation of Word. For Frame, the DocBook DTD is more equivalent to a portion of the structured Frame EDD--which in this case would incorporate the DocBook DTD as a part and various text formatting rules as another, IIRC.

Still, we seem to be arguing at cross purposes. For the sake of proceeding, then, let us accept your "Docbook consists of Docbook (the DTD) and various assorted other tools" definition. I *still* do not understand why that is somehow equivalent to Word, Frame, or OpenOffice other than loosely. Any of these three, it seems to me, can and do substitute for *some* of the "various assorted other tools" in various DocBook-using shops.

In other words, under your definition you have to make the argument that DocBook is a superset of itself. Therefore, I still see no way to assert that DocBook is functionally equivalent to Word, Frame, or OpenOffice--especially since at least two of them can and do export DocBook document instances.

> The fact is that DocBook is as a standard a *superset* of what most users will need; it permits
> various combinations of styles which may work best for one organization and another subset
> for another without departing from the standard.

"That is a very good description of what it does. But notice what the
consequences are of it being a superset of each user's needs. It means that
it is not only bigger and more complex than some users need, it is actually
bigger and more complex than any one user needs.
And Docbook is, at best, only a superset of what most users need from a
document description language. There are document description needs it
doesn't meet. There are also a wide variety of content capture,
customization, automation, and exchange functions that it does not meet."

To assert that DocBook or *any* XML language is not all-encompassing is to create a straw man--easy to knock down, perhaps, but not useful in the real world outside of argumentation. DocBook, while very useful, was never *intended* to be an "all things to all people" XML language.

It should be readily apparent that individual organizations may need to extend DocBook--or create their own information description language (XML, SGML, or custom if they are so silly). It should also be apparent that for an organization with the luxury of time and budget, many opportunities exist which may not be tenable for most. Lacking either of these, I submit that it may be fastest, easiest, and actually most effective to begin with DocBook or similar and pare it down to a functional subset that serves that organization.
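
A rough sketch of what "paring down to a functional subset" might look like in practice follows. It assumes Python's standard-library xml.etree.ElementTree; the ALLOWED set and the sample document are hypothetical, and a real subset would be chosen by each organization. The check covers element names only, not content models, so it is not a substitute for a pared-down DTD.

```python
import xml.etree.ElementTree as ET

# Hypothetical house subset of DocBook element names (an assumption for
# illustration; each organization would choose its own).
ALLOWED = {"article", "title", "sect1", "para", "itemizedlist", "listitem"}


def elements_outside_subset(xml_text, allowed=ALLOWED):
    """Return the sorted names of elements used that are not in the subset."""
    root = ET.fromstring(xml_text)
    return sorted({el.tag for el in root.iter() if el.tag not in allowed})


# Made-up document that strays outside the subset in two places.
doc = (
    "<article><title>T</title><sect1><title>S</title>"
    "<para>Text <emphasis>x</emphasis></para>"
    "<sidebar><para>y</para></sidebar></sect1></article>"
)

print(elements_outside_subset(doc))
```

A document that stays within the subset yields an empty list, so a check of this kind could run before documents are accepted into a shared repository.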

Of course, just as for many years custom EDI vendors made huge amounts of money for the constant reinvention of a very similar set of wheels, so too are various information architecture firms now seeking to implement highly customized and usually quite expensive XML instances.

However, in my experience, outside consultants have a rather long "ramp up" to fully understand any organization's business needs. By starting with a framework that has shown itself extremely useful for many people, and for which training and tools are readily available, it seems to me that the organization itself has a better chance of winding up with a functional and useful product with a relative minimum investment in time and money.

I would propose that in either case, the organization is going to have to make various changes over time. By doing much of the initial tailoring themselves, I believe they will tend to be more productive over time than by having the original setup done for them. However, I would never suggest they would be ill-advised to employ outside or internal consultants to be sure they take into account the various factors necessary for a successful implementation if they lack the experience and skill required. Personally, I think in most cases that sort of consultant can be hired directly for a very cost-efficient amount if they begin with a well-understood standard such as DocBook.

> With an appropriate schema or DTD, of course, you can produce a wide variety of appearance
> details for any given instance of a DocBook-compliant document using any of the *applications*
> which you may prefer that can work with the standard.
>
> That is, in fact, why SGML and later XML were first developed.

"SGML and XML were developed for a number of reasons. But the key feature of
both is that they are *generalized* markup languages. They were developed
out of the recognition that no one tagging language was able to meet all
needs and that the development of custom tagging languages was an
unavoidable necessity. They were also developed out of the recognition that
it was easier to develop custom tagging languages if they shared a common
syntax and thus could be interpreted using a common parser. But the point is
this: SGML and XML were both developed out of the explicit recognition that
the development of custom application-specific tagging languages was both
necessary and appropriate."

Pardon me, but originally SGML was developed because of a customer's recognized need. That "customer" was the Department of Defense, and the need was for a better system for creating documentation for various weapons systems and aircraft. Printed shop manuals for an F-16 fighter, for instance, ran into the hundreds of feet of shelf space, with hundreds of subsystems documented by subcontractors in different styles. DoD put together a team, of which IBM was a principal member, to develop a method by which the source documents could be transformed into a uniform form--minimizing time and mistakes on the part of those who have to maintain this equipment. Further, by putting the documents in an easily transmitted form, the contractors could keep the latest information available electronically, so that the customers could be sure of having it irrespective of the labyrinthine logistics system of the DoD.

As I recall, one reason IBM was selected as a lead element of this effort was because of various research projects they had in motion already. The DoD work (later expanded to include other agencies of the government) simply provided the impetus and the funding. All the rest of the project fell out of these requirements...and the rest, as they say, is history.

Hence my statement that being able to alter the appearance of the documents at whim was precisely one of the underlying reasons for SGML's development originally. (For example, a weapons system might be used by several of the services--and each could have the documentation in the format it prefers from the same document source.)

> As for using DocBook or another XML standard versus some other custom-built approach, that
> also depends very largely upon the flexibility you wish to have later on in the life of the document.

"Yes, except that if you want flexibility you have to stop thinking in terms
of documents and start thinking in terms of content. Documents are just
presentation mechanisms for content."

Actually, I believe that not all information can be "chunked" with the granularity you seem to insist upon. In many cases, in fact, such chunking strains the ability of the reader to grasp the finished product. It is also very difficult to manage at present, since the organizations with the greatest need are often the ones least able to find out whether a particular "chunk" has already been defined elsewhere in their documentation base.

Frankly, I suspect strongly that this will continue until we have a better handle upon automated semantic analysis that can compare a given document base with what is being written or prepared. However, that is a topic for another day.

> Because of the rapid evolution of products such as content management systems that
> understand XML, and because of the growing understanding of and tools for the major
> standards such as DocBook, I believe that the place you should start today may well be
> a selection of the subset of one of these standards. By selecting the basics from HTML
> as you suggest would be abandoning the easy introduction of these fast-maturing tools.

"Never start with tools. Never never start with tools. Start with your
content and your business needs, and the process you need to apply to your
content. Then choose tools that support those processes in the most robust
and efficient manner possible."

On the other hand, as a human matter one *always* has some conception of tools available. I agree fully that determination of the process comes much after consideration of what information must be conveyed and the manner in which it is to be expressed. Fortunately, we are able to use quite a few tools to give us many options on both counts--but to fail to account for existing, off the shelf tools would be folly...just as starting with the tools is *often* folly. (Witness the "pointy headed bosses" who insist upon Word as a documentation tool...but I digress).

> Later, should you have enough data to make a CMS appropriate, wouldn't it be *nice* if you
> *already* have your documents in a form which can immediately be used?

"You have to stop thinking of documents as an input. Information is the
input. Documents are the output. In between there are processes that act on
information and that eventually produce documents or other information
vehicles. That is what you have to plan for."

As I have said, "it all depends." If you have many products consisting largely of various mixes of the same components, then it becomes relatively easy to do the logical chunking. If, on the other hand, a company is producing prototypes for other firms and all are widely different, then that chunking may in fact be meaningless. For "one off" projects, the chunk size may *be* at the document level.

However, by "document" I mean a collection of related information chunks. By no means must this consist of the traditional paper-based product. Of course, with a little over six years of my past spent consulting with "The Document Company" and going through countless hours of indoctrination into what a "document" might be, I may well have a somewhat different slant than most.

The point is, though, that not all information is readily reusable. In fact, many times the effort to parse it into various levels of detail so that any particular level might be extracted through some sort of automated tool for various forms of output is simply not worth the candle. In fact, with the present state of the art, reaching the true level of understanding of the information as information and not simply as linguistic elements implies a great deal of metadata identification which is all too often completely beyond any realistic budget of time, manpower, or money.

David

