PDF extraction bug

Subject: PDF extraction bug
From: Martin Finke <mfinke -at- crimsonlanguage -dot- com>
To: "TECHWR-L" <techwr-l -at- lists -dot- raycomm -dot- com>
Date: Fri, 5 Apr 2002 12:07:20 -0500


I use Gemini to extract text from PDFs sent by a regular client who uses the
TBodoni font.

Every time I extract one of their files to RTF, the two consecutive
characters "fi" are deleted. Apostrophes and quotation marks are deleted as
well.

Is there a way to combat this? Perhaps a more sophisticated PDF extraction
tool than Gemini? My client will most likely not change their favourite font
just for me (I work at a translation agency).

Thanks,
Martin





