techniques of extracting the original ASCII?

From	no.top.post@gmail.com
Newsgroups	comp.lang.postscript
Subject	techniques of extracting the original ASCII?
Date	2011-08-26 18:46 +0000
Organization	A noiseless patient Spider
Message-ID	<j38pmg$mvp$1@dont-email.me>


I previously asked here:
Why is ps/pdf quirky with "ff"?

and got the answer that "it's rendered by a glyph".
Well, of course, but WHY? Why then isn't "a"
"rendered by a glyph"?

Since char("f") was originally entered at a keyboard as
ASCII, why should *IT*, and not other chars, be transformed?
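
For text that does come out of an extractor, one possible workaround is sketched below. It assumes the extractor emits the Unicode ligature code points (U+FB00 for "ff", U+FB03 for "ffi", and so on), which is only a guess; a given tool may emit something else, or nothing at all. The function name fold_ligatures and the sample string are made up for illustration.

```python
import unicodedata


def fold_ligatures(text: str) -> str:
    """Apply Unicode compatibility normalization (NFKC).

    NFKC replaces compatibility characters, including the Latin
    ligatures U+FB00..U+FB06, with their plain-letter equivalents.
    """
    return unicodedata.normalize("NFKC", text)


# Hypothetical extractor output containing the "ff" and "ffi" ligatures.
sample = "di\ufb00erent o\ufb03ce"
print(fold_ligatures(sample))  # different office
```

Note that NFKC also folds other compatibility characters (superscript digits, full-width letters), so it may change more than just the ligatures.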
------------
I'm trying to absorb the contents of this PDF [230069 bytes]:
http://www.cogsci.rpi.edu/~rsun/sun.clarion2005.pdf
which is frustrating, since I can't extract any of the text to
my own notes.

I'm surprised that its header says %PDF-1.2,
because it's a newish document.
And what I can't understand is this: unless the original
'typed script' was given to a Chinese wood carver who
treated each char as an individual piece of art, why can
neither the Linux tools nor Adobe on Win7 extract the original text
[except for that of one diagram]?!

And although close examination of the rendering does
show that the 'same ascii-wise chars' DO have slightly
different appearances, if the commonality of e.g. all
char("N")s had not been factored out, the file would
be massively increased in size.
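
A back-of-envelope sketch of that size argument follows. Every number in it is an assumption picked for illustration, not measured from the file; only the 230069-byte total comes from the actual document. The function names are likewise invented.

```python
def unshared_size(occurrences: int, bytes_per_shape: int) -> int:
    """Size if every character occurrence carried its own shape data."""
    return occurrences * bytes_per_shape


def shared_size(occurrences: int, distinct_glyphs: int,
                bytes_per_shape: int, bytes_per_reference: int = 1) -> int:
    """Size if each distinct shape is stored once and merely referenced."""
    return distinct_glyphs * bytes_per_shape + occurrences * bytes_per_reference


# Assumed values, for illustration only.
occurrences = 50_000      # characters in the whole paper
distinct_glyphs = 100     # different shapes actually used
bytes_per_shape = 200     # data needed to describe one shape

print(unshared_size(occurrences, bytes_per_shape))                 # 10000000
print(shared_size(occurrences, distinct_glyphs, bytes_per_shape))  # 70000
```

Under these guesses the unshared layout would be roughly 10 MB, far above the 230069 bytes of the real file, while the shared layout fits comfortably inside it. That is consistent with the shapes being stored once and reused, though it proves nothing about how this particular file was produced.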

How do you solve this problem of not being able to get
an ASCII version of such 'texts'?

Do PS and PDF render characters sequentially, or pixels,
or columns, or glyphs; and if glyphs, do
they have variable positions on the screen?

== TIA.


