Re: Frame to XML, SGML + Learning Curve

Subject: Re: Frame to XML, SGML + Learning Curve
From: Chris Despopoulos <cud -at- ARRAKIS -dot- ES>
Date: Wed, 7 Jul 1999 10:59:50 +0200

Chuck Allen wrote:

FrameMaker+SGML will read an unstructured FrameMaker file, but
cannot write SGML from that file. You can only "save as" or export
to native SGML if the file was imported into FrameMaker+SGML as
SGML. If you are starting with an unstructured FrameMaker document,
there isn't an automagic, "out-of-the box" way to convert it to SGML.

The spirit of this is sort of true, but I feel a need to
qualify... As far as I know, you cannot get SGML out of any
unstructured document. What I mean by that is, you must
always map *something* in your document to the SGML document
type definition. By definition, SGML is explicitly
structured, and an unstructured document is not.

That said, Maker+SGML does include mapping functionality you
can use to map an unstructured document to a structure
definition. Very briefly, it works like this:

* You define structure for Maker+SGML in an EDD (element
definition document). This corresponds roughly to the
SGML DTD, but the DTD is not required for you to have
an EDD. You *can* save an EDD as a DTD, and you can
convert a DTD into an EDD.
* For unstructured Maker docs, you create a conversion
table that maps para formats, char formats, and other
document objects to various elements in the EDD.
* Assuming you have all you need (EDD and DTD at the very
least), you can then save a document from Maker+SGML
to SGML. Likewise, you can then open that SGML as
structured Maker+SGML.

Is this out-o-th'-box? Certainly not! You must create the
conversion table. But you can do it in terms of the
unstructured document's formatting. And it IS a part of the
Maker+SGML product.
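
To make the idea of a conversion table concrete, here is a minimal
sketch in Python. It is NOT the Maker+SGML conversion table syntax,
and the format and element names are invented for illustration. It
only shows the principle: each paragraph format in the unstructured
document maps to an element, and anything without a mapping has to be
dealt with by hand.

```python
from xml.sax.saxutils import escape

# Hypothetical conversion table: paragraph format name -> element name.
# These names are assumptions for illustration only; they are not taken
# from FrameMaker+SGML or from any shipped EDD.
CONVERSION_TABLE = {
    "ChapterTitle": "Title",
    "Heading1": "Head",
    "Body": "Para",
    "Bulleted": "Item",
}


def apply_table(paragraphs, table):
    """Wrap each (format_name, text) pair in the element its format maps to.

    Returns the tagged lines plus the list of formats that had no mapping.
    """
    tagged = []
    unmapped = []
    for fmt, text in paragraphs:
        element = table.get(fmt)
        if element is None:
            # No rule for this format: report it instead of guessing.
            unmapped.append(fmt)
            continue
        tagged.append(f"<{element}>{escape(text)}</{element}>")
    return tagged, unmapped


doc = [
    ("ChapterTitle", "Getting Started"),
    ("Body", "Install the software & read the notes."),
    ("Sidebar", "A format with no mapping."),
]
tagged, unmapped = apply_table(doc, CONVERSION_TABLE)
```

The unmapped list is the useful part of the sketch: it is the work
you still have to do before the document can be valid against the
structure definition.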


The usual way to convert unstructured FrameMaker to SGML is to go
through an intermediate format, such as MIF, RTF, HTML, or XML, and
then use a conversion script to go to SGML. Using HTML as an
intermediate format can offer some advantages since HTML is an SGML
application and thus enables you to use an SGML-aware scripting tool
like Omnimark (now available at no cost at http://www.omnimark.com)
to accomplish the conversion efficiently.

I have to sound nasty here and call this very bad advice.
HTML is indeed an implementation of SGML, but it is a
special one, and very badly done from the perspective of
getting full value out of your SGML. If HTML can capture
all you need, why then convert the HTML to SGML? The fact
is, HTML does not have a rich document definition, and HTML
elements are used more to express formatting than
structure. Yes, HTML can distinguish <H1> from <H2>, but
can it tell you whether the headings are in a <Section1> or
a <Section2> or a <Chapter> or a <Glossary>? Compare HTML to
DocBook... How could you ever convert an HTML document into
meaningful DocBook? And if your original Maker document
*could* be meaningful DocBook, think of all you will have
lost once you saved it as HTML!

In other words:

* Richly formatted doc to HTML = many to one.
* HTML to DocBook = one to many.
* (many to one) to (one to many) = loss of data.

If there is a way around that dilemma, please let me know.
But it seems a bit like you need to program in a Maxwell's
Demon to pull it off without a loss of data.
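
The many-to-one problem can be sketched in a few lines of Python.
The format names and the mapping below are hypothetical, made up
only to illustrate the argument. Once several source formats collapse
onto one HTML tag, inverting the mapping gives you a list of
candidates for each tag rather than an answer.

```python
from collections import defaultdict

# Hypothetical paragraph formats in a richly formatted source document,
# each mapped to the HTML tag it might be saved as (many to one).
# These names are assumptions for illustration only.
FORMAT_TO_HTML = {
    "ChapterTitle": "h1",
    "GlossaryTitle": "h1",
    "SectionTitle": "h2",
    "GlossTerm": "h2",
    "Body": "p",
    "GlossDef": "p",
}


def invert(mapping):
    """Return, for each target, the sorted list of sources that map to it."""
    inverse = defaultdict(list)
    for source, target in mapping.items():
        inverse[target].append(source)
    return {target: sorted(sources) for target, sources in inverse.items()}


candidates = invert(FORMAT_TO_HTML)

# Any tag with more than one candidate cannot be converted back
# without information that the HTML no longer carries.
ambiguous = {tag: srcs for tag, srcs in candidates.items() if len(srcs) > 1}
```

In this toy mapping every tag is ambiguous: an h1 could have been a
chapter title or a glossary title, and nothing in the HTML says which.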

As for MIF, I think that is also a bad idea since MIF
includes much information that has nothing to do with
structure or content, per se. I suspect RTF is similar in
that sense, but there may be tools out there. As far as
that goes, there might be some tools that convert MIF to
specific DTDs. But I would use a conversion table in
Maker+SGML... Much easier, and much less like programming.

As for DocBook, Maker+SGML comes with an implementation of
the DocBook DTD, and you may be able to use it as is. Which
means you will still have to make a conversion table to map
formatting in your unstructured documents to valid DocBook
structure. But once you have your Maker+SGML docs
structured according to the DocBook EDD, you will be able to
make round trips between Maker & SGML.

As for choosing tools, that is a touchy subject. Everybody
has his favorites... Maker+SGML is mine because I know it.
I simply haven't had the opportunity to learn other tools.
I hear it is pretty good, and I can find people who have
done their homework, and who then decided to go with it.
But you should NOT take my preference as any proof that it
is the right tool for you.

I suggest you FIRST and FOREMOST be able to describe exactly
what it is you want to accomplish with markup, SGML or XML.
Then look for other people who already do that. Then see
what tools they use. Then maybe look for new tools they
didn't have a chance to consider. But it begins with
knowing EXACTLY what you want to do with the markup. (Well,
exactitude being nearly impossible, take a good stab at it,
anyway.)

Did I mention that you need to know WHAT you want to do with
your documents in XML or SGML? If I have not stressed that
point sufficiently to convince you, please send me a message
offline so I have a chance to do so. You MUST know this
before you can meaningfully evaluate your tools. Hey,
determining how you want to use XML or SGML could be fun...
researching all the things people have done, from IETMs
(Interactive Electronic Technical Manuals) for the Air
Force, to Encyclopaedia Britannica, to the Oxford English
Dictionary Project. Or who knows what else?

Cheers, cud

