Re: text messes up in ps2pdf converted pdf

From Mirek <pisz_na.mirek@dionizos.zind.ikem.pwr.wroc.pl>
Newsgroups comp.lang.postscript
Subject Re: text messes up in ps2pdf converted pdf
Date 2011-04-28 21:46 +0000
Organization Wroclaw University of Technology, Poland
Message-ID <ipcn7f$3a$1@z-news.wcss.wroc.pl>
References <0ee8b9ed-146a-403a-8b34-f37497f2b5df@n10g2000yqf.googlegroups.com> <91nv22F5p3U1@mid.individual.net>


On Tue, 26 Apr 2011 15:18:57 in article news:<91nv22F5p3U1@mid.individual.net>
Helge Blischke wrote:
> Peng Yu wrote:
> 
>> Hi,
>> 
>> I convert the ps file to a pdf file. But the text in the pdf is messed
>> up. I'm wondering how to convert a ps file to a pdf file so that the
>> text can be copied? Thanks and look forward to hearing from you!
>> 
>> $ps2pdf pcfg-notes.ps
>> 
>> http://www.cs.cmu.edu/~roni/11761-s01/PreviousYearsHandouts/pcfg-notes.ps
 
> The PS file has been generated by dvips, the default TeX PostScript 
> generation method. This method, by default, positions each character 
> individually, therefore there is no trivial method to extract text from it; 
> it requires more intelligent algorithms (e.g. to compare the distance 
> between rendered characters with the normal interword spacing to determine 
> word boundaries etc.).

I've tried the ps2ascii, pstotext and pdftotext tools on this example, with
quite good results.
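
For illustration only, the spacing heuristic Helge describes can be sketched
roughly as follows. This is not how any of the tools above is actually
implemented; the glyph tuples, the space_width value and the 0.3 threshold
are all made-up assumptions for the example.

```python
from typing import List, Tuple

# Hypothetical glyph record: (character, x of left edge, advance width),
# all in the same units. Real extractors get this from the page content.
Glyph = Tuple[str, float, float]


def group_into_words(glyphs: List[Glyph], space_width: float,
                     threshold: float = 0.3) -> str:
    """Rebuild one line of text from individually positioned glyphs.

    A space is inserted whenever the gap between the end of the previous
    glyph and the start of the next one exceeds threshold * space_width.
    The threshold is an arbitrary assumption, not a value from any tool.
    """
    out = []
    prev_end = None
    for ch, x, width in sorted(glyphs, key=lambda g: g[1]):
        if prev_end is not None and x - prev_end > threshold * space_width:
            out.append(" ")
        out.append(ch)
        prev_end = x + width
    return "".join(out)


# Six glyphs placed one by one, with small kerning offsets and one wide gap.
line = [("t", 0.0, 3.0), ("h", 3.0, 5.0), ("e", 8.1, 4.0),
        ("c", 15.5, 4.0), ("a", 19.4, 5.0), ("t", 24.5, 3.0)]
print(group_into_words(line, space_width=3.3))  # prints "the cat"
```

Small gaps and overlaps (kerning) stay inside a word; only the gap before
"c", which is about one interword space wide, becomes a word boundary.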

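But I think the OP meant a PDF whose text can be selected and copied with
the mouse. For that, a PS file made by an ancient dvips has to be repaired
first. pkfix together with pkfix-helper (they replace the bitmap PK fonts
with Type 1 versions) solved this task: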

  pkfix-helper pcfg-notes.ps | pkfix - pcfg-notes.repaired.ps
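
If one wants to script the whole thing, including the ps2pdf step from the
original question, a minimal sketch could look like the one below. It assumes
pkfix-helper, pkfix and ps2pdf are on the PATH; the function names and the
output PDF name are my own choices for the example, not something the tools
prescribe.

```python
import subprocess
from typing import List


def build_commands(ps_in: str, ps_fixed: str, pdf_out: str) -> List[List[str]]:
    """Return the three command lines: repair pipeline, then PDF conversion."""
    return [
        ["pkfix-helper", ps_in],         # writes the annotated PS to stdout
        ["pkfix", "-", ps_fixed],        # reads stdin, writes the repaired PS
        ["ps2pdf", ps_fixed, pdf_out],   # converts the repaired PS to PDF
    ]


def repair_and_convert(ps_in: str, ps_fixed: str, pdf_out: str) -> None:
    """Run `pkfix-helper ps_in | pkfix - ps_fixed`, then ps2pdf."""
    helper_cmd, pkfix_cmd, ps2pdf_cmd = build_commands(ps_in, ps_fixed, pdf_out)
    helper = subprocess.Popen(helper_cmd, stdout=subprocess.PIPE)
    pkfix = subprocess.Popen(pkfix_cmd, stdin=helper.stdout)
    helper.stdout.close()  # so pkfix-helper notices if pkfix exits early
    pkfix_rc = pkfix.wait()
    helper_rc = helper.wait()
    if helper_rc != 0 or pkfix_rc != 0:
        raise RuntimeError("repair step failed")
    subprocess.run(ps2pdf_cmd, check=True)


# Example call (assumed file names, needs the tools installed):
# repair_and_convert("pcfg-notes.ps", "pcfg-notes.repaired.ps",
#                    "pcfg-notes.repaired.pdf")
```

Whether the text in the resulting PDF copies cleanly still depends on
pkfix-helper guessing the right fonts, so check the result by hand.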
