Converting PDFs to text

Usage tips, posted by users. No questions here please.

Moderators: AmigoJack, bbadmin, helios, Bob Hansen, MudGuard

jeffy
Posts: 323
Joined: Mon Mar 03, 2003 9:04 am
Location: Philadelphia

Post by jeffy »

I found this great tool for converting the text in PDFs to plain text:

http://pricelessware.org/2003/PL2003TEXT.htm#P110

It's a command-line program. Call it, for example, like this:

C:\applications\PDF-TXT1.EXE "C:\whatever\My PDF Document.pdf" C:\temp\from_pdf.txt

I have nothing to do with the creation of this program. I just use it and love it. Enjoy.
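
For anyone with a whole folder of PDFs, here is a minimal Python sketch that repeats the call shown above for each file. It assumes the converter lives at the path from that example and takes exactly two arguments (input PDF, output text file), as in that call. The names build_command and convert_folder are made up for illustration, and whether the tool returns a non-zero exit code on failure is an assumption.

```python
import subprocess
from pathlib import Path

# Assumed install location, copied from the example call in this post.
CONVERTER = r"C:\applications\PDF-TXT1.EXE"


def build_command(pdf_path, txt_path, converter=CONVERTER):
    """Return the argument list: converter, input PDF, output text file."""
    return [converter, str(pdf_path), str(txt_path)]


def convert_folder(pdf_dir, out_dir, converter=CONVERTER):
    """Convert every *.pdf in pdf_dir to a same-named .txt file in out_dir."""
    out_dir = Path(out_dir)
    out_dir.mkdir(parents=True, exist_ok=True)
    for pdf in sorted(Path(pdf_dir).glob("*.pdf")):
        txt = out_dir / (pdf.stem + ".txt")
        # Passing a list means paths with spaces need no manual quoting.
        # check=True raises if the tool exits non-zero (assumed behavior).
        subprocess.run(build_command(pdf, txt, converter), check=True)


# Example call (hypothetical folders):
# convert_folder(r"C:\whatever", r"C:\temp")
```

Passing the arguments as a list rather than one string is what takes care of file names like "My PDF Document.pdf".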
lichudang
Posts: 5
Joined: Tue Jan 27, 2004 8:36 pm

Post by lichudang »

Thanks, Great!
Fredkc
Posts: 6
Joined: Tue Apr 10, 2007 11:26 pm
Location: Riverside, Ca.

Post by Fredkc »

Another handy command-line tool is PDFToHTML.

It reads a PDF file and does its best to make an HTML file of it. Multi-column pages are not its strong point, but it does a pretty good job.

Freeware:
http://sourceforge.net/projects/pdftohtml/

And no, I don't have a thing to do with this one, either; but yes I use it all the time.
Life IS mystical. It's just that we're used to it.
dak
Posts: 18
Joined: Thu Mar 22, 2007 2:39 am

Re: Converting PDFs to text

Post by dak »

jeffy wrote: I found this great tool for converting the text in PDFs to plain text:

http://pricelessware.org/2003/PL2003TEXT.htm#P110

Looks like this utility is no longer available from this site.

It is listed on the front page, but is not available from the Text section.

Cheers,

dak
SteveH
Posts: 327
Joined: Thu Apr 03, 2003 11:37 am
Location: Edinburgh, Scotland

Post by SteveH »

It may also be worth trying Xpdf. The Windows version includes a utility called pdftotext that will convert a PDF to text and, optionally, preserve formatting.
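
A rough Python sketch of calling pdftotext from a script, assuming the pdftotext executable is on your PATH. The -layout switch is the pdftotext option that keeps the original page layout, which is presumably the formatting option meant here. The helper names pdftotext_command and pdf_to_text are made up for illustration.

```python
import subprocess


def pdftotext_command(pdf_path, txt_path, keep_layout=False):
    """Build a pdftotext call; -layout asks it to keep the page layout."""
    cmd = ["pdftotext"]  # assumed to be on PATH
    if keep_layout:
        cmd.append("-layout")
    cmd += [str(pdf_path), str(txt_path)]
    return cmd


def pdf_to_text(pdf_path, txt_path, keep_layout=False):
    """Run pdftotext and raise if it reports an error."""
    subprocess.run(pdftotext_command(pdf_path, txt_path, keep_layout), check=True)


# Example call (hypothetical file names):
# pdf_to_text("report.pdf", "report.txt", keep_layout=True)
```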
proximity4
Posts: 1
Joined: Wed Jul 07, 2010 10:08 am

Post by proximity4 »

A-PDF Text Extractor is a free utility designed to extract text from Adobe PDF files for use in other applications. There are three output modes: In PDF Order, Smart Rearrange, and With Position.

The program is freeware, which means that you can use it either personally or commercially for free.

The program is a standalone application; no Adobe Acrobat is needed. A command-line version is also available, so you can call it from your own program or script.

If you want to grab images from PDF files, you may check out the A-PDF Image Extractor.