Re: Reading Adobe PDF File

References <a54dcb32-1ecd-4186-81a7-3a55c275c9b0@4g2000pbz.googlegroups.com>
Date 2012-01-28 21:59 -0800
Subject Re: Reading Adobe PDF File
From Chris Rebert <clp2@rebertia.com>
Newsgroups comp.lang.python
Message-ID <mailman.5191.1327816772.27778.python-list@python.org>


On Sat, Jan 28, 2012 at 9:52 PM, Shrewd Investor <cltung@gmail.com> wrote:
> Hi,
>
> I have a very large Adobe PDF file.  I was hoping to use a script to
> extract the information from it.  Is there a way to loop through a PDF
> file using Python?

Haven't used it myself, but:
http://www.unixuser.org/~euske/python/pdfminer/

> Or do I need to find a way to convert a PDF file into a text file?  If
> so, how?

The pdf2txt.py script from the same package happens to do exactly this.

Cheers,
Chris
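
For readers who want something concrete, here is a minimal sketch of the
convert-then-loop approach suggested above. It assumes that PDFMiner is
installed and that its pdf2txt.py script is on your PATH and accepts
"-o <output file>" followed by the input PDF. The function names
(build_pdf2txt_command, convert_pdf_to_text, iter_nonblank_lines) are
hypothetical helpers made up for this illustration and are not part of
PDFMiner. Treat it as a starting point, not a tested recipe for every PDF.

```python
import subprocess


def build_pdf2txt_command(pdf_path, txt_path):
    """Return the argument list for running pdf2txt.py on one PDF.

    Assumption: pdf2txt.py is on PATH and takes "-o <output file>"
    followed by the input PDF as a positional argument.
    """
    return ["pdf2txt.py", "-o", txt_path, pdf_path]


def convert_pdf_to_text(pdf_path, txt_path):
    """Run pdf2txt.py; raises CalledProcessError if the script fails."""
    subprocess.run(build_pdf2txt_command(pdf_path, txt_path), check=True)


def iter_nonblank_lines(txt_path):
    """Yield stripped, non-empty lines one at a time.

    Reading lazily keeps memory use flat even when the extracted text
    from a very large PDF is itself very large.
    """
    with open(txt_path, encoding="utf-8", errors="replace") as handle:
        for line in handle:
            line = line.strip()
            if line:
                yield line


# Example usage (file names are placeholders):
#
#   convert_pdf_to_text("big.pdf", "big.txt")
#   for line in iter_nonblank_lines("big.txt"):
#       print(line)
```

Going through a text file first means the loop itself is ordinary Python
file handling, so the extraction step and the parsing step can be debugged
separately. On systems where pdf2txt.py is not directly executable, the
command may need to be prefixed with the Python interpreter.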


Thread

Reading Adobe PDF File Shrewd Investor <cltung@gmail.com> - 2012-01-28 21:52 -0800
  Re: Reading Adobe PDF File Shrewd Investor <cltung@gmail.com> - 2012-01-28 21:52 -0800
  Re: Reading Adobe PDF File Chris Rebert <clp2@rebertia.com> - 2012-01-28 21:59 -0800
  Re: Reading Adobe PDF File Matej Cepl <mcepl@redhat.com> - 2012-01-30 09:09 +0100
  Re: Reading Adobe PDF File Adam Tauno Williams <awilliam@whitemice.org> - 2012-01-30 08:22 -0500
