style of PostScript generated by pdf2ps?

Started by: bugbear <bugbear@trim_papermule.co.uk_trim>
First post: 2012-07-16 10:29 +0100
Last post: 2012-07-16 11:15 +0100
Articles: 8 (3 participants)


Contents

  style of PostScript generated by pdf2ps? bugbear <bugbear@trim_papermule.co.uk_trim> - 2012-07-16 10:29 +0100
    Re: style of PostScript generated by pdf2ps? Helge Blischke <h.blischke@acm.org> - 2012-07-16 11:55 +0200
      Re: style of PostScript generated by pdf2ps? ken <ken@spamcop.net> - 2012-07-16 11:24 +0100
        Re: style of PostScript generated by pdf2ps? ken <ken@spamcop.net> - 2012-07-16 11:43 +0100
        Re: style of PostScript generated by pdf2ps? Helge Blischke <h.blischke@acm.org> - 2012-07-16 12:50 +0200
          Re: style of PostScript generated by pdf2ps? ken <ken@spamcop.net> - 2012-07-16 11:58 +0100
            Re: style of PostScript generated by pdf2ps? ken <ken@spamcop.net> - 2012-07-16 13:05 +0100
    Re: style of PostScript generated by pdf2ps? ken <ken@spamcop.net> - 2012-07-16 11:15 +0100

#802 — style of PostScript generated by pdf2ps?

From: bugbear <bugbear@trim_papermule.co.uk_trim>
Date: 2012-07-16 10:29 +0100
Subject: style of PostScript generated by pdf2ps?
Message-ID: <CL-dne2qwa15Q57NnZ2dnUVZ7o-dnZ2d@brightview.co.uk>

This is (in fact) just the output of the pswrite driver in GS.

This has changed:

The older driver generated stuff like this

%!PS-Adobe-3.0
%%Pages: (atend)
%%BoundingBox: 0 0 90 85
%%HiResBoundingBox: 0.000000 0.000000 90.000000 85.000000
%.............................................
%%Creator: GPL Ghostscript  900 (pswrite)
%%CreationDate: 2012/07/16 10:18:24
%%DocumentData: Clean7Bit
%%LanguageLevel: 2
%%EndComments

The newer driver:

%!PS-Adobe-3.0
%%BoundingBox: 0 0 612 792
%%Creator: GPL Ghostscript 904 (ps2write)
%%LanguageLevel: 2
%%CreationDate: D:20120713171249+01'00'
%%Pages: 1
%%EndComments

The %%BoundingBox comment in the new output is incorrect,
or at least deferred in a very unhelpful way; *this*
occurs later on...

%%Page: 1 1
%%PageBoundingBox: 0 0 198 198

Which is the "true" size. Indeed, every box (including media
and crop) in the driving PDF is 198x198, so the 612x792
numbers in the BoundingBox comment are just defaults.

This has caused me trouble, since I had a script that was
using the BoundingBox comments as the size
of the image represented by the PostScript
(which was always a single page in my context).

    BugBear


#804

From: Helge Blischke <h.blischke@acm.org>
Date: 2012-07-16 11:55 +0200
Message-ID: <a6i6p7Fn13U1@mid.individual.net>
In reply to: #802

Ghostscript's ps2write device in essence outputs a linearized version of PDF
prepended by a procset that permits an ordinary level2 interpreter to
successfully render the stream. That implies reducing the input PDF or PS to
level2 compatible objects.
One (weird?) feature of this driver is the attempt to tailor the output to
specific printer features such as paper size. Therefore the global
BoundingBox reflects Ghostscript's default page size or, if specified, the
value from the "-sPAPERSIZE=..." command-line switch.
Specifying "-dSetPageSize=true" for the ps2write device forces the media or
crop (if present) box for every page to be reflected both in a setpagedevice
dictionary and in a PageBoundingBox comment.
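
As a rough illustration of passing that switch, here is a small Python sketch that only builds the argument list. It assumes the Ghostscript executable is called "gs" and is on the PATH; the helper name and the file names are made up for the example.

```python
def ps2write_command(pdf_path, ps_path, set_page_size=True, gs="gs"):
    """Build a Ghostscript argument list for PDF -> PostScript via ps2write.

    The helper name and defaults are illustrative, not part of Ghostscript.
    """
    args = [gs, "-dBATCH", "-dNOPAUSE", "-sDEVICE=ps2write"]
    if set_page_size:
        # Ask ps2write to reflect each page's box in the output.
        args.append("-dSetPageSize=true")
    args += ["-sOutputFile=" + ps_path, pdf_path]
    return args


if __name__ == "__main__":
    # Print the command instead of running it, so no Ghostscript is needed here.
    print(" ".join(ps2write_command("in.pdf", "out.ps")))
```

The list can be handed to subprocess.run(cmd, check=True) to perform the actual conversion.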

I'd recommend modifying your script to grep through the generated PS stream
for the PageBoundingBox comments and use those values if specified.
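
A minimal sketch of that approach in Python, assuming a single-page file as in the original post and integer box values as in the samples quoted in this thread. The function name is hypothetical. It prefers the first %%PageBoundingBox and falls back to %%BoundingBox only if no page-level comment is found.

```python
import re

# Matches "%%BoundingBox: llx lly urx ury" and "%%PageBoundingBox: ...".
# "%%BoundingBox: (atend)" and "%%HiResBoundingBox" lines do not match.
_BOX = re.compile(
    r"^%%(PageBoundingBox|BoundingBox):\s*(-?\d+)\s+(-?\d+)\s+(-?\d+)\s+(-?\d+)\s*$"
)


def page_size_from_dsc(ps_text):
    """Return (width, height) in points, or None if no usable comment exists."""
    doc_size = None
    for line in ps_text.splitlines():
        m = _BOX.match(line)
        if not m:
            continue
        llx, lly, urx, ury = (int(g) for g in m.groups()[1:])
        size = (urx - llx, ury - lly)
        if m.group(1) == "PageBoundingBox":
            return size          # page-level comment wins
        if doc_size is None:
            doc_size = size      # remember the document-level fallback
    return doc_size
```

For a multi-page file the same loop could collect every %%PageBoundingBox instead of returning at the first one.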

Helge


#806

From: ken <ken@spamcop.net>
Date: 2012-07-16 11:24 +0100
Message-ID: <MPG.2a6dfc3bcb8435ef989891@usenet.plus.net>
In reply to: #804

In article <a6i6p7Fn13U1@mid.individual.net>, h.blischke@acm.org says...
 
> Ghostscript's ps2write device in essence outputs a linearized version of PDF
> prepended by a procset that permits an ordinary level2 interpreter to
> successfully render the stream.

That's only partially true these days, though it is still true up to a 
point.


> That implies reducing the input PDF or PS to 
> level2 compatible objects.

The conversion to level 2 PostScript is what requires us to use level 2
compatible objects, not the way the file is written.

If we ever do a ps3write then it will be able (for example) to do
shading dictionaries and CIDFonts. Currently these are converted to
images and type 3 fonts respectively.


> One (weird?) feature of this driver is the attempt to tailor the output to 
> specific printer features such as paper size. 

Err, the paper size isn't printer-specific. We take the MediaBox from
the PDF file and emit a PageSize media request in the PostScript. That's
not specific to the printer, it's specific to the PDF file. What the
printer does with the request is up to the printer: it may select media
from different trays, scale the file etc.

I'm not sure what is weird about this, it makes sense to me.
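
To make the idea of a media request concrete, here is a small Python sketch that turns a MediaBox into the kind of PageSize request described above. It is an illustration only, not the text ps2write actually writes, and the function name is invented for the example.

```python
def page_size_request(media_box):
    """Format a PostScript PageSize request from a PDF MediaBox.

    media_box is (llx, lly, urx, ury) in points; PageSize takes
    the width and height of that box.
    """
    llx, lly, urx, ury = media_box
    width, height = urx - llx, ury - lly
    return "<< /PageSize [%g %g] >> setpagedevice" % (width, height)


if __name__ == "__main__":
    # The 198x198 page from the original post.
    print(page_size_request((0, 0, 198, 198)))
```

What a printer does with such a request (tray selection, scaling and so on) is still decided by the printer.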


> Therefore the global
> BoundingBox reflects Ghostscript's default page size or, if specified, the
> value from the "-sPAPERSIZE=..." command-line switch.

The document BoundingBox ought to be the MediaBox from the PDF file or, 
as you correctly say, any overriding value such as the CropBox or 
PAPERSIZE, if specified. That's because these override what's in the PDF 
file.

You really shouldn't ever be seeing the GS default media size.

 
> I'd recommend modifying your script to grep through the generated PS stream
> for the PageBoundingBox comments and use those values if specified.

Yes I agree with this, but I'm willing to look into the document level 
BoundingBox. I only ask for a bug to be entered so I have something to 
track.



				Ken


#807

From: ken <ken@spamcop.net>
Date: 2012-07-16 11:43 +0100
Message-ID: <MPG.2a6e00c23e2f53cc989892@usenet.plus.net>
In reply to: #806

In article <MPG.2a6dfc3bcb8435ef989891@usenet.plus.net>, ken@spamcop.net
says...
 
> Yes I agree with this, but I'm willing to look into the document level 
> BoundingBox. I only ask for a bug to be entered so I have something to 
> track.

In fact it looks like we probably *can* emit the intersection of all the 
page bounding boxes as the document BoundingBox. It doesn't look too 
hard, though it would be nice if someone would open a bug report, 
otherwise I'll have to do it myself ;-)


			Ken


#808

From: Helge Blischke <h.blischke@acm.org>
Date: 2012-07-16 12:50 +0200
Message-ID: <a6ia0iFc5lU1@mid.individual.net>
In reply to: #806

ken wrote:

> You really shouldn't ever be seeing the GS default media size.

Well, if I convert a PDF which contains no document level media box (which
is OK) and all of the pages specify a media box of A4 size, where does
the letter-size document level bounding box come from (the default paper
size of GS is letter)?
The gs version used is 9.05.

Helge


#809

From: ken <ken@spamcop.net>
Date: 2012-07-16 11:58 +0100
Message-ID: <MPG.2a6e041eb1b7ba6a989893@usenet.plus.net>
In reply to: #808

In article <a6ia0iFc5lU1@mid.individual.net>, h.blischke@acm.org says...

> Well, if I convert a PDF which contains no document level media box (which
> is OK) and all of the pages specify a media box of A4 size, where does
> the letter-size document level bounding box come from (the default paper
> size of GS is letter)?

I don't have such a file, but I wouldn't have expected that. Anyway,
I've implemented the intersection code now. I should note, however, that
this is *not* a true BoundingBox; it's the media sizes for each page.

In order to find the true BoundingBox for the document (and for each
page) we would need to run the bbox device to determine the real
bounding box (and intersect it with the media request anyway, presumably
in case there are off-page objects).
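
For anyone who wants the real ink extents today, a rough Python sketch of driving the bbox device from a script follows. It assumes the executable is called "gs" and that the bbox device reports one %%BoundingBox line per page on stderr; both function names are hypothetical.

```python
import re
import subprocess

_BBOX_LINE = re.compile(
    r"^%%BoundingBox:\s*(-?\d+)\s+(-?\d+)\s+(-?\d+)\s+(-?\d+)", re.MULTILINE
)


def parse_bbox_output(text):
    """Return a list of (llx, lly, urx, ury) tuples, one per %%BoundingBox line.

    %%HiResBoundingBox lines are ignored by the pattern.
    """
    return [tuple(int(g) for g in m.groups()) for m in _BBOX_LINE.finditer(text)]


def true_bounding_boxes(path, gs="gs"):
    """Run Ghostscript's bbox device on a file and parse what it reports."""
    result = subprocess.run(
        [gs, "-q", "-dBATCH", "-dNOPAUSE", "-sDEVICE=bbox", path],
        stdout=subprocess.DEVNULL,
        stderr=subprocess.PIPE,  # assumption: bbox output arrives on stderr
        text=True,
        check=True,
    )
    return parse_bbox_output(result.stderr)
```

The boxes returned this way describe marks on the page, so they can be smaller than the media size that the document-level comment carries.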



				Ken


#810

From: ken <ken@spamcop.net>
Date: 2012-07-16 13:05 +0100
Message-ID: <MPG.2a6e13dfe8836e33989894@usenet.plus.net>
In reply to: #809

In article <MPG.2a6e041eb1b7ba6a989893@usenet.plus.net>, ken@spamcop.net
says...

 
> I don't have such a file, but I wouldn't have expected that. Anyway,
> I've implemented the intersection code now. I should note, however, that
> this is *not* a true BoundingBox; it's the media sizes for each page.
>
> In order to find the true BoundingBox for the document (and for each
> page) we would need to run the bbox device to determine the real
> bounding box (and intersect it with the media request anyway, presumably
> in case there are off-page objects).

The fix (to use the media from all pages, not the device) was raised as 
bug #693181:

http://bugs.ghostscript.com/show_bug.cgi?id=693181

and fixed with Git commit b49d3c75a70cbdcdb2214f22ad1a1f62f1bb90fc

http://git.ghostscript.com/?p=ghostpdl.git;a=commit;h=b49d3c75a70cbdcdb2214f22ad1a1f62f1bb90fc

The point about this not being a true BoundingBox still applies.

			Ken


#805

From: ken <ken@spamcop.net>
Date: 2012-07-16 11:15 +0100
Message-ID: <MPG.2a6dfa2e26fb85ed989890@usenet.plus.net>
In reply to: #802

In article <CL-dne2qwa15Q57NnZ2dnUVZ7o-dnZ2d@brightview.co.uk>, 
bugbear@trim_papermule.co.uk_trim says...
 
> This is (in fact) just the output of the pswrite driver in GS.

That's because pdf2ps is just a script that calls Ghostscript.

> The older driver generated stuff like this
> 
> %!PS-Adobe-3.0
> %%Pages: (atend)
> %%BoundingBox: 0 0 90 85
> %%HiResBoundingBox: 0.000000 0.000000 90.000000 85.000000
> %.............................................
> %%Creator: GPL Ghostscript  900 (pswrite)
> %%CreationDate: 2012/07/16 10:18:24
> %%DocumentData: Clean7Bit
> %%LanguageLevel: 2
> %%EndComments
> 
> The newer driver:
> 
> %!PS-Adobe-3.0
> %%BoundingBox: 0 0 612 792
> %%Creator: GPL Ghostscript 904 (ps2write)
> %%LanguageLevel: 2
> %%CreationDate: D:20120713171249+01'00'
> %%Pages: 1
> %%EndComments

We now use ps2write instead of pswrite because it produces better output 
(smaller, less bitmapped content etc). In the long term pswrite will be 
removed, but for now you can still use it if you really want to.

 
> The %%BoundingBox comment in the new output is incorrect,
> or at least deferred in a very unhelpful way; *this*
> occurs later on...

Well %%BoundingBox is a high water mark for *all* the pages, and since
that value is probably the media size, it's certainly legitimate, if
perhaps inaccurate. I'm not certain there is anything we can do about
this, as I think the header may be generated before we know all the page
sizes.
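
One way to read "high water mark" is the smallest box that encloses every page box. The Python sketch below works under that assumption; it is an interpretation for illustration, not the Ghostscript implementation, and the function name is made up.

```python
def enclosing_box(page_boxes):
    """Return the smallest (llx, lly, urx, ury) containing every page box.

    Returns None when there are no pages to combine.
    """
    boxes = list(page_boxes)
    if not boxes:
        return None
    return (
        min(b[0] for b in boxes),
        min(b[1] for b in boxes),
        max(b[2] for b in boxes),
        max(b[3] for b in boxes),
    )


if __name__ == "__main__":
    # A 198x198 page followed by an A4 page: the document box covers both.
    print(enclosing_box([(0, 0, 198, 198), (0, 0, 595, 842)]))
```

For a single-page file the result is simply that page's box, which is what a script reading the document-level comment would expect.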

That said I'm willing to look at the problem if you raise a bug report 
at http://bugs.ghostscript.com


			Ken
