Converting PDFs to text

Posted: Mon Oct 13, 2003 7:19 pm
by jeffy
I found this great tool for converting the text in PDFs to plain text:

http://pricelessware.org/2003/PL2003TEXT.htm#P110

It's a command-line program. Call it, for example, like this:

C:\applications\PDF-TXT1.EXE "C:\whatever\My PDF Document.pdf" C:\temp\from_pdf.txt

I have nothing to do with the creation of this program. I just use it and love it. Enjoy.
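
For anyone who wants to call it from a script instead of typing the command by hand, here is a minimal Python sketch. It assumes only what the example above shows: the program takes the input PDF first and the output text file second. The function names `build_command` and `convert` are made up for illustration, and the paths are the same example paths as above.

```python
import subprocess

def build_command(exe, pdf_path, txt_path):
    """Return the argument list: program, input PDF, output text file."""
    return [exe, pdf_path, txt_path]

def convert(exe, pdf_path, txt_path):
    # Passing a list (rather than one long string) lets subprocess take care
    # of quoting, so paths containing spaces do not need manual quotes.
    # check=True raises CalledProcessError if the tool exits with an error.
    subprocess.run(build_command(exe, pdf_path, txt_path), check=True)

if __name__ == "__main__":
    convert(
        r"C:\applications\PDF-TXT1.EXE",
        r"C:\whatever\My PDF Document.pdf",
        r"C:\temp\from_pdf.txt",
    )
```

Adjust the three paths to wherever the program and your PDF actually live.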

Posted: Fri Jan 30, 2004 9:08 pm
by lichudang
Thanks, Great!

Posted: Wed Apr 11, 2007 8:27 am
by Fredkc
Another handy command-line tool is PDFToHTML.

It reads a PDF file and does its best to make an HTML file of it. Multi-column pages are not its strong point, but it does a pretty good job.

Freeware:
http://sourceforge.net/projects/pdftohtml/

And no, I don't have a thing to do with this one, either; but yes I use it all the time.

Re: Converting PDFs to text

Posted: Thu Apr 12, 2007 5:49 am
by dak
jeffy wrote:
I found this great tool for converting the text in PDFs to plain text:

http://pricelessware.org/2003/PL2003TEXT.htm#P110

Looks like this utility is no longer available from this site.

It is listed on the front page, but is not available from the Text section.

Cheers,

dak

Posted: Thu Apr 12, 2007 11:55 am
by SteveH
It may also be worth trying Xpdf. The Windows version includes a command-line utility called pdftotext that will convert a PDF to text and, optionally, preserve the page layout.
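
A rough Python sketch of calling pdftotext from a script, in case it helps. It assumes pdftotext is on your PATH and that layout preservation is switched on with the `-layout` option (run `pdftotext -h` to confirm the options in your version). The function names and the file names `in.pdf` and `out.txt` are just placeholders.

```python
import subprocess

def pdftotext_command(pdf_path, txt_path, keep_layout=False):
    """Build the pdftotext argument list, optionally asking it to keep layout."""
    cmd = ["pdftotext"]
    if keep_layout:
        # Assumed option name; it asks pdftotext to keep the original
        # physical layout of the page as far as it can.
        cmd.append("-layout")
    cmd += [pdf_path, txt_path]
    return cmd

def pdf_to_text(pdf_path, txt_path, keep_layout=False):
    # check=True raises CalledProcessError if pdftotext reports a failure.
    subprocess.run(pdftotext_command(pdf_path, txt_path, keep_layout), check=True)

if __name__ == "__main__":
    pdf_to_text("in.pdf", "out.txt", keep_layout=True)
```

Leaving `keep_layout` off gives plain reading-order text, which is usually easier to feed into other programs.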

Posted: Wed Jul 07, 2010 10:13 am
by proximity4
A-PDF Text Extractor is a free utility designed to extract text from Adobe PDF files for use in other applications. There are three output modes: In PDF Order, Smart Rearrange and With Position.

The program is freeware, which means that you can use it either personally or commercially for free.

The program is a standalone application; no Adobe Acrobat is needed. A command-line version is also available, so you can call it from your own program or script.

If you want to grab images from PDF files, you may check out the A-PDF Image Extractor.