Pdf Parsing Challenge

From Felipe Espinoza <fespinozacast@gmail.com>
Newsgroups comp.lang.ruby
Subject Pdf Parsing Challenge
Date 2011-05-17 16:04 -0500
Organization Service de news de lacave.net
Message-ID <b3e54e146d346d393b16b935800076bb@ruby-forum.com>

Hi Everyone,

I'm trying to use the pdf-reader gem, but I'm having trouble
understanding how the gem works.

If someone can help me with this, I'll be really grateful.

The Problem:

I have to extract some data from a paper in PDF format. I only need
data from page 1: the title of the paper, the list of authors, the
authors' universities, their email addresses, the abstract, and the
keywords.

How can I extract this data from this paper?
http://dl.dropbox.com/u/6928078/CLEI_2008_002.pdf

A simple string containing the full contents of each field
(keywords, abstract, etc.) would be enough.

I don't have to use this gem, but I do need one string per field
with this information. How can I do that?
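
To make the goal concrete, here is a rough sketch of the kind of result I mean, not a working solution for this particular paper. It assumes the text of page 1 can be read with pdf-reader's PDF::Reader.new(path).page(1).text, that the page contains lines starting with "Abstract" and "Keywords", and that the title and the author list are the first two blank-line-separated blocks above the abstract. The method names first_page_text and extract_fields are made up for illustration, and the real layout of the PDF may not match any of these assumptions.

```ruby
# Sketch only: the "Abstract"/"Keywords" labels and the blank-line layout
# are assumptions about the page text, not facts about the linked paper.

# Read the plain text of page 1 with the pdf-reader gem (must be installed).
def first_page_text(path)
  require 'pdf-reader' # loaded lazily so extract_fields works without the gem
  PDF::Reader.new(path).page(1).text
end

# Collapse runs of whitespace (including line breaks) into single spaces.
def one_line(str)
  str.to_s.gsub(/\s+/, ' ').strip
end

# Split the page text into one string per field (emails come back as an array).
def extract_fields(text)
  # Everything before the first "Abstract" line is treated as the header.
  header, rest = text.split(/^[ \t]*Abstract\b[:.]?\s*/i, 2)
  # Everything after the first "Keywords" line is treated as the keywords.
  abstract, keywords = rest.to_s.split(/^[ \t]*Keywords\b[:.]?\s*/i, 2)
  # Header blocks are assumed to be separated by blank lines.
  blocks = header.to_s.strip.split(/\n\s*\n/)
  {
    title:    one_line(blocks[0]),
    authors:  one_line(blocks[1]),
    emails:   text.scan(/[\w.+-]+@[\w-]+(?:\.[\w-]+)+/),
    abstract: one_line(abstract),
    keywords: one_line(keywords)
  }
end
```

With something like this, extract_fields(first_page_text("CLEI_2008_002.pdf")) would give a hash with one entry per field. The universities are left out because I don't see a simple pattern that would pick them out reliably.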

-- 
Posted via http://www.ruby-forum.com/.
