Re: Magnifying pdf cleans irregularities?

From Helge Blischke <h.blischke@acm.org>
Newsgroups comp.lang.postscript
Subject Re: Magnifying pdf cleans irregularities?
Followup-To comp.lang.postscript
Date 2011-10-23 10:29 +0200
Message-ID <9gi1itF8diU1@mid.individual.net>
References <j7vda5$2kk$1@dont-email.me>


no.top.post@gmail.com wrote:

> By using gocr on:
> http://www.cogsci.rpi.edu/~rsun/sun.clarion2005.pdf
> I've been trying to extract the ASCII.
> 
> So far, using:
> pdftoppm -f 13 -l 13 -r 300 sun.clarion2005.pdf | gocr -o ppm13.300
> gives the best Optical Character recognition results.
> But it sees "k" as "h".
> 
> What confuses me, is that when I view with xpdf, the text
> looks as if it was printed by a bad-condition 1950 typewriter.
> 
> I especially remember "2004" where the 'bottoms' were
> badly un-aligned. But if I set xpdf to 'magnify' a section of
> the text, it looks clean, and of course gocr decodes perfectly.
> 
> I don't know exactly how the rendering works, but imagine
> that if the 'normal size' uses a bad quality font, and the
> magnified version uses a good quality font, that could
> explain what I'm seeing.
> 
> Since the information that the char IS a "k" and not
> an "h" is in the *.pdf file, and quite independent of ANY
> rendering, and gocr can correctly decode BIG font,
> should I not expect to be able to get gocr to decode
> correctly, by <filtering it through a suitable font>?
> 
> Thanks,
> 
> == Chris Glur.

If you look at the PDF's properties, you'll see that the fonts used are 
bitmapped Type 3 fonts (at a fairly high resolution, though). That leads to 
degraded rendering whenever the glyph bitmaps have to be resampled because 
the resolution of the canvas differs from that of the bitmaps.
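
A rough illustration of that effect, as a sketch only: it is not how xpdf 
or any other viewer actually scales glyphs. The nearest-neighbour rule, the 
24-pixel row and the function names are all assumptions made up for the 
example. It resamples one row of a bitmap glyph with three equally wide 
stems, once by a non-integer factor and once by an exact factor of two.

```python
def resample_nearest(row, dst_len):
    """Resample a 1-D bitmap row to dst_len pixels by nearest-neighbour
    sampling at pixel centres (integer arithmetic, no float rounding)."""
    src_len = len(row)
    return [row[((2 * i + 1) * src_len) // (2 * dst_len)]
            for i in range(dst_len)]


def stroke_widths(row):
    """Return the lengths of the consecutive runs of set pixels."""
    widths, run = [], 0
    for pixel in row:
        if pixel:
            run += 1
        elif run:
            widths.append(run)
            run = 0
    if run:
        widths.append(run)
    return widths


def make_glyph_row():
    """Hypothetical glyph row: 24 pixels, three stems of 3 pixels each."""
    row = [0] * 24
    for start in (2, 10, 18):
        for x in range(start, start + 3):
            row[x] = 1
    return row


glyph_row = make_glyph_row()
shrunk = resample_nearest(glyph_row, 10)    # scale 10/24, non-integer
doubled = resample_nearest(glyph_row, 48)   # scale 2, exact

print(stroke_widths(glyph_row))  # [3, 3, 3]
print(stroke_widths(shrunk))     # [1, 1, 2]
print(stroke_widths(doubled))    # [6, 6, 6]
```

In this toy case the stems come out with unequal widths after the 
non-integer shrink, but stay uniform when the target grid is an exact 
multiple of the bitmap's. That is at least consistent with the text 
looking ragged at normal size and clean when magnified.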

Helge

Thread

Magnifying pdf cleans irregularities? no.top.post@gmail.com - 2011-10-22 21:41 +0000
  Re: Magnifying pdf cleans irregularities? luser- -droog <mijoryx@yahoo.com> - 2011-10-22 23:19 -0700
  Re: Magnifying pdf cleans irregularities? Helge Blischke <h.blischke@acm.org> - 2011-10-23 10:29 +0200