Re: Converting Word files into XML

Subject: Re: Converting Word files into XML
From: "Yves Barbion" <yves -dot- barbion -at- gmail -dot- com>
To: "Kevin McGowan" <thatguy_80 -at- hotmail -dot- com>
Date: Fri, 30 May 2008 08:45:05 +0200

Hi Kevin

Yes, this "amazing little tool" does exist. In fact, it's a toolset,
consisting of FrameMaker, MIF2Go and DITA-FMx.

This is how I "convert" Word documents to DITA topics:

1. Open the Word file in FrameMaker.
2. Using MIF2Go, map Word styles to DITA elements (including nested elements).
3. Save the FrameMaker file as DITA XML topics (this will also chunk up
your big monolithic Word file and create a ditamap).
4. Open the ditamap and the topics in FrameMaker, or any other DITA-aware
editor, for example XMetaL.
5. *Validate and restructure the topics.*
6. Upload your DITA content to a DITA-aware content management system,
for example DITA Exchange.
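
To make the idea behind steps 2 and 3 concrete, here is a minimal Python
sketch of mapping paragraph styles to DITA elements and chunking at each
top-level heading. This is not how MIF2Go is configured or implemented; the
style names, the STYLE_TO_ELEMENT table and both function names are
assumptions made up for illustration.

```python
import xml.etree.ElementTree as ET

# Hypothetical Word style names mapped to DITA element names.
STYLE_TO_ELEMENT = {
    "Heading 1": "title",
    "Body Text": "p",
    "Note": "note",
}


def paragraphs_to_topics(paragraphs):
    """Turn (style, text) pairs into a list of DITA <topic> elements.

    A new topic starts at every paragraph whose style maps to <title>,
    which mimics chunking one monolithic document into topics.
    """
    topics = []
    body = None
    for style, text in paragraphs:
        # Unmapped styles fall back to a plain paragraph in this sketch.
        element_name = STYLE_TO_ELEMENT.get(style, "p")
        if element_name == "title":
            topic = ET.Element("topic", id=f"topic_{len(topics) + 1}")
            ET.SubElement(topic, "title").text = text
            body = ET.SubElement(topic, "body")
            topics.append(topic)
        elif body is not None:
            ET.SubElement(body, element_name).text = text
        # Paragraphs before the first heading are skipped in this sketch.
    return topics


def build_ditamap(topics):
    """Create a <map> with one <topicref> per topic (hypothetical file names)."""
    ditamap = ET.Element("map")
    for topic in topics:
        ET.SubElement(ditamap, "topicref", href=f"{topic.get('id')}.dita")
    return ditamap
```

Feeding it a list such as [("Heading 1", "Install"), ("Body Text", "Run
setup.")] yields one topic with a title and a body paragraph. The output is
well-formed, but it still needs the validation and restructuring of step 5.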

Step 5 is particularly important because the original Word file was not
written with DITA in mind. Therefore, converting Word documents (or any
other "unstructured" document) to DITA always involves two steps:

1. A "technical" conversion to valid XML, which can be automated to a
large extent.
2. Restructuring the content into the DITA content model by applying the
principles of topic-oriented authoring, which requires authors who have a
thorough understanding of DITA and topic-oriented authoring.

The second of these steps, the restructuring, is often neglected, and then
you end up with valid DITA topics whose content does not really fit the
DITA model.
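
As a rough illustration of "valid but not really DITA", the following Python
sketch flags converted topics that probably need restructuring. The two
heuristics, the threshold and the function name are my own assumptions, not
part of any DITA tool, and they are no substitute for an author's judgment.

```python
import xml.etree.ElementTree as ET


def restructuring_hints(topic_xml, max_paragraphs=10):
    """Return a list of restructuring hints for one topic given as an XML string."""
    root = ET.fromstring(topic_xml)
    hints = []
    # A numbered list inside a generic topic or a concept often describes
    # a procedure that would fit a task topic better.
    if root.tag in ("topic", "concept") and root.find(".//ol") is not None:
        hints.append("contains a numbered list; consider a task topic with steps")
    # Many paragraphs in a single topic suggest it covers more than one subject.
    if len(root.findall(".//p")) > max_paragraphs:
        hints.append("long body; consider splitting into several topics")
    return hints
```

A topic such as <topic id="t1"><title>Install</title><body><ol><li>Run
setup.</li></ol></body></topic> is well-formed but gets the first hint,
because its content reads like a procedure.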

I have a Flash movie that shows the main steps of the process I described
above. If you wish to see this movie, feel free to contact me off-list.

--
Yves Barbion
Managing Director
Adobe-Certified FrameMaker Instructor
____________________________________

Scripto
skype: yves.barbion
www.scripto.nu
____________________________________

On Thu, May 29, 2008 at 5:39 PM, Kevin McGowan <thatguy_80 -at- hotmail -dot- com>
wrote:

>
> Hi all,
>
> Recently started another new contract, and will most likely be moving some
> existing, thankfully small, documents from Word format into DITA - XML via
> FrameMaker or XMetal.
>
> Thing is, I just chatted with a couple of guys here whose exclusive job it
> is to take GIANT Word files (they could range from 50-1200 pages) and convert
> them into XML (not DITA, but some other DTD). I just got a tour of what they
> do, and they literally go through line-by-line in Dreamweaver, assigning
> tags as they go.
>
> Has no one yet developed an amazing little tool that could map Word styles
> into XML tags to provide clean output? There's gotta be a faster way to do
> this, isn't there?
>
> Cheers,
> Kevin
---
You are currently subscribed to TECHWR-L as archive -at- web -dot- techwr-l -dot- com -dot-

To unsubscribe send a blank email to
techwr-l-unsubscribe -at- lists -dot- techwr-l -dot- com
or visit http://lists.techwr-l.com/mailman/options/techwr-l/archive%40web.techwr-l.com


To subscribe, send a blank email to techwr-l-join -at- lists -dot- techwr-l -dot- com

Send administrative questions to admin -at- techwr-l -dot- com -dot- Visit
http://www.techwr-l.com/ for more resources and info.

