pdf & O.C.R ?

Path csiph.com!v102.xanadu-bbs.net!xanadu-bbs.net!feeder.erje.net!1.eu.feeder.erje.net!eternal-september.org!feeder.eternal-september.org!mx02.eternal-september.org!.POSTED!not-for-mail
From Unknown <dog@gmail.com>
Newsgroups comp.os.linux.misc
Subject pdf & O.C.R ?
Date Sat, 23 May 2015 07:49:37 +0000 (UTC)
Organization A noiseless patient Spider
Lines 18
Message-ID <pan.2015.05.23.07.50.46@gmail.com> (permalink)
Mime-Version 1.0
Content-Type text/plain; charset=UTF-8
Content-Transfer-Encoding 8bit
Injection-Date Sat, 23 May 2015 07:49:37 +0000 (UTC)
Injection-Info mx02.eternal-september.org; posting-host="9d0025e7ac33c81a717f76a77067729a"; logging-data="10093"; mail-complaints-to="abuse@eternal-september.org"; posting-account="U2FsdGVkX189Q2kgxK5yCK4PKAnSNW9uLsm6Ahz8wt4="
User-Agent Pan/0.133 (House of Butterflies)
Cancel-Lock sha1:Z+Yq1sN15Wq74LBuowUHAFoP+6E=
Xref csiph.com comp.os.linux.misc:14841

I'm confused and disturbed that xpdf's rendering of:
http://www.inf.ethz.ch/personal/wirth/ProjectOberon/PO.Computer.pdf
  is perfect to the pixel, even at maximum magnification [400%],
  which is expected, since it's generated from computer fonts, whereas:
http://www.northernlaw.co.za/images/stories/files/actsandbills/COMPANY%20LAW%20ACT.pdf
  looks blotchy and shows paper fibers, as if it's a photo of a paper copy.

Scanned copies of paper documents are apparently common.

BUT!! How is it that xpdf lets me extract the text from
COMPANY%20LAW%20ACT.pdf by copying it with the mouse?
That would mean that the mouse driver is doing O.C.R. ?!
mc's viewer [which uses <pdftotext>] reads this text too.
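
A note on the mechanics, since both tools are named above: pdftotext does no O.C.R. of its own. It only reads text objects already stored in the PDF. A scanned PDF can carry an invisible text layer that was produced by O.C.R. when the scan was made, and that may be what is happening here (an assumption; the file itself has not been inspected). A minimal Python sketch for checking whether a PDF has such a layer follows. The helper names and the 20-character threshold are hypothetical choices, and it assumes pdftotext is installed and on the PATH.

```python
import shutil
import subprocess


def count_text_chars(text: str) -> int:
    """Count non-whitespace characters (form feeds between pages count as whitespace)."""
    return sum(1 for ch in text if not ch.isspace())


def has_text_layer(text: str, min_chars: int = 20) -> bool:
    """Guess whether extracted text is a real text layer.

    min_chars is an arbitrary threshold chosen for this sketch, not a standard value.
    """
    return count_text_chars(text) >= min_chars


def extract_text(pdf_path: str) -> str:
    """Run pdftotext on pdf_path and return whatever text is stored in the file."""
    if shutil.which("pdftotext") is None:
        raise RuntimeError("pdftotext not found on PATH")
    # Passing "-" as the output file makes pdftotext write to stdout.
    result = subprocess.run(
        ["pdftotext", pdf_path, "-"],
        capture_output=True,
        check=True,
    )
    return result.stdout.decode("utf-8", errors="replace")
```

With a local copy of the file (the name here is a placeholder), `has_text_layer(extract_text("COMPANY LAW ACT.pdf"))` would report whether any stored text came back. A pure image scan with no text layer should yield little more than page-break characters.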

Is this some new O.C.R. that I could also use on JPEG images of text pages?

Thanks for any answers.

Thread

pdf & O.C.R ? Unknown <dog@gmail.com> - 2015-05-23 07:49 +0000
  Re: pdf & O.C.R ? Bob Tennent <BobT@cs.queensu.ca> - 2015-05-23 11:13 +0000
    Re: pdf & O.C.R ? Unknown <dog@gmail.com> - 2015-05-27 17:11 +0000
  Re: pdf & O.C.R ? John-Paul Stewart <jpstewart@sympatico.ca> - 2015-05-23 20:46 -0400
    Re: pdf & O.C.R ? Joe Beanfish <joebeanfish@nospam.duh> - 2015-05-26 13:26 +0000
      Re: pdf & O.C.R ? Unknown <dog@gmail.com> - 2015-06-13 13:29 +0000
        Re: pdf & O.C.R ? Robert Heller <heller@deepsoft.com> - 2015-06-13 12:52 -0500
    Re: pdf & O.C.R ? Unknown <dog@gmail.com> - 2015-05-27 17:10 +0000
      Re: pdf & O.C.R ? John-Paul Stewart <jpstewart@sympatico.ca> - 2015-05-29 20:31 -0400