Subject: RE: Tools: PDF to SQL
From: Ed Klopfenstein <eklopfenstein -at- proclarity -dot- com>
To: "TECHWR-L" <techwr-l -at- lists -dot- raycomm -dot- com>
Date: Fri, 23 Aug 2002 09:27:15 -0600

Matthew Horn wrote:
*************************************************
1. Export the PDF as an RTF document
2. Open RTF in favorite word processor (Word).
3. Convert to simple text format (TXT).
4. Buy a book on Perl. Laura Lemay's book will get you up to speed fast. And
I believe she is a member of this list.
5. Write script that uses Perl's text-manipulation functions to extract data
from text file.
6. Extend this script or write another script that converts the extracted
data into SQL statements.
7. Insert SQL into database.
*************************************************
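
For anyone who wants to see the shape of steps 5 through 7, here is a minimal sketch. It is written in Python rather than the Perl suggested above, and it rests on several assumptions that are not in the original steps: the exported text has one record per line with fields separated by two or more spaces (or a tab), the target table is a hypothetical one called parts, and an in-memory SQLite database stands in for the real one.

```python
import re
import sqlite3

# Hypothetical sample of what the TXT export might contain (an assumption).
SAMPLE_TEXT = """\
Widget      12   4.50
Left gasket  3   0.75
"""


def extract_rows(text):
    """Step 5: split each non-blank line on a tab or a run of 2+ whitespace characters."""
    rows = []
    for line in text.splitlines():
        line = line.strip()
        if line:
            rows.append(re.split(r"\s{2,}|\t", line))
    return rows


def sql_literal(value):
    """Quote a value as a SQL string literal, doubling embedded single quotes."""
    return "'" + value.replace("'", "''") + "'"


def to_insert(table, row):
    """Step 6: turn one extracted row into an INSERT statement."""
    values = ", ".join(sql_literal(v) for v in row)
    return "INSERT INTO {} VALUES ({});".format(table, values)


rows = extract_rows(SAMPLE_TEXT)
statements = [to_insert("parts", row) for row in rows]

# Step 7: run the statements. SQLite is only a stand-in for the real database,
# and the table layout here is hypothetical.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE parts (name TEXT, qty INTEGER, price REAL)")
for statement in statements:
    conn.execute(statement)
conn.commit()
```

If the script talks to the database directly, parameterized queries are usually a safer choice than building SQL strings; the string version is shown only because step 6 asks for SQL statements.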
I've used a similar technique, but instead of learning Perl (which could take
time; it's not called the "toothpick" language for its easy-to-understand
syntax), I would suggest either using Word's search-and-replace capabilities
to massage the data or creating a quick macro to clean it up. You could
also do this by hand if you don't know how to code.
What you want is a final text file that uses a distinctive delimiter. I like
double colons (::) since they're less likely than commas to appear in the
data itself. SQL Server's DTS Wizard can pull data from structured text files
as easily as Access's import wizard can; just make sure you have the same
number of delimited columns in every row. If you're more visually oriented,
creating a Word table and then converting the table to delimited text might
also be an easy solution. Look up tables and double delimiters in Word's Help
files.
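
If you do end up scripting, a quick check of the column counts before running the import can save a failed load. This is only a sketch in Python: the function name check_columns and the sample rows are made up for illustration, and it assumes one record per line with :: between fields.

```python
DELIMITER = "::"


def check_columns(lines, delimiter=DELIMITER):
    """Compare every non-blank row against the first row's column count.

    Returns (expected_count, problems), where problems is a list of
    (line_number, column_count) pairs for rows that do not match.
    """
    expected = None
    problems = []
    for number, line in enumerate(lines, start=1):
        if not line.strip():
            continue  # ignore blank lines
        count = len(line.rstrip("\n").split(delimiter))
        if expected is None:
            expected = count  # the first data row sets the expectation
        elif count != expected:
            problems.append((number, count))
    return expected, problems


# Hypothetical sample rows; the second one is missing a column.
sample = ["Widget::12::4.50", "Left gasket::3", "Bolt::40::0.10"]
expected, problems = check_columns(sample)
print(expected, problems)
```

For a real file, pass the open file object in place of the sample list, since iterating over it yields one line at a time.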
A final hint: make sure the data in each column of your file matches that
column's data type in SQL Server. For instance, if your file has text in a
column where SQL Server expects an integer, the import will generate an error.
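
A rough way to catch that kind of mismatch ahead of time is to test the values yourself. The Python sketch below uses a hypothetical helper, non_integer_values, and made-up rows; it relies on Python's own int() parsing, which may not match SQL Server's conversion rules exactly, so treat it as a first pass rather than a guarantee.

```python
def non_integer_values(rows, column):
    """Return (row_number, value) pairs where the given column is not an integer."""
    problems = []
    for number, row in enumerate(rows, start=1):
        value = row[column].strip()
        try:
            int(value)  # Python's parsing, used here as an approximation
        except ValueError:
            problems.append((number, value))
    return problems


# Hypothetical rows already split on the delimiter; column 1 should be an integer.
rows = [["Widget", "12"], ["Left gasket", "three"], ["Bolt", "40 "]]
print(non_integer_values(rows, 1))
```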
Feel free to contact me offline. I'd be happy to help or give you some
quickie macros to provide a starting place.