Newsgroup: comp.lang.ruby

Zlib::GzipReader doesn't work as expected

Started by: Thomas Wolf <thomasw@viacanale.de>
First post: 2012-04-25 10:57 +0200
Last post: 2012-04-26 22:02 +0200
Articles: 6 (3 participants)

Contents

  Zlib::GzipReader doesn't work as expected Thomas Wolf <thomasw@viacanale.de> - 2012-04-25 10:57 +0200
    Re: Zlib::GzipReader doesn't work as expected Robert Klemme <shortcutter@googlemail.com> - 2012-04-25 21:03 +0200
      Re: Zlib::GzipReader doesn't work as expected Simon Krahnke <overlord@gmx.li> - 2012-04-25 21:55 +0200
      Re: Zlib::GzipReader doesn't work as expected Thomas Wolf <thomasw@viacanale.de> - 2012-04-26 11:54 +0200
        Re: Zlib::GzipReader doesn't work as expected Simon Krahnke <overlord@gmx.li> - 2012-04-26 22:02 +0200
    Re: Zlib::GzipReader doesn't work as expected Simon Krahnke <overlord@gmx.li> - 2012-04-25 21:53 +0200

#6517 — Zlib::GzipReader doesn't work as expected

From: Thomas Wolf <thomasw@viacanale.de>
Date: 2012-04-25 10:57 +0200
Subject: Zlib::GzipReader doesn't work as expected
Message-ID: <jn8e9q$sg3$1@online.de>

Hi,
given 2 files:
cat 5lines.txt
5 lines
5 lines
5 lines
5 lines
5 lines

cat more5lines.txt
More 5 lines
More 5 lines
More 5 lines
More 5 lines
More 5 lines

These files are "gzip"ed as follows:
gzip < 5lines.txt > foo.gz
gzip < more5lines.txt >> foo.gz

zcat foo.gz:
5 lines
5 lines
5 lines
5 lines
5 lines
More 5 lines
More 5 lines
More 5 lines
More 5 lines
More 5 lines

This ruby code only reads the first 5 lines:
#!/usr/bin/ruby
require "zlib"
filename  = ARGV[0]

Zlib::GzipReader.open(filename) {|gz|
     print gz.read
}

./test.rb foo.gz
5 lines
5 lines
5 lines
5 lines
5 lines

How do I force Zlib::GzipReader to read the whole file?

ruby versions: 1.8.7 and 1.9.0

Thanks and regards,
Thomas Wolf
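
For anyone who wants to reproduce this without the shell commands, here is a minimal sketch in pure Ruby. It builds the same layout as foo.gz (two gzip members back to back) in memory and reads it with a single GzipReader. The helper names `gzip_member` and `read_with_one_reader` are made up for this illustration; only the `Zlib::GzipWriter`, `Zlib::GzipReader`, and `StringIO` calls come from the standard library.

```ruby
require "zlib"
require "stringio"

# Hypothetical helper: compress +text+ into one complete gzip member.
def gzip_member(text)
  buffer = StringIO.new(String.new)   # binary, unfrozen backing string
  gz = Zlib::GzipWriter.new(buffer)
  gz.write(text)
  gz.finish                           # write the gzip trailer, keep buffer open
  buffer.string
end

# Hypothetical helper: what a single GzipReader#read returns for +data+.
def read_with_one_reader(data)
  gz = Zlib::GzipReader.new(StringIO.new(data))
  gz.read
ensure
  gz.close if gz
end

# Same layout as foo.gz: two gzip members concatenated.
data = gzip_member("5 lines\n" * 5) + gzip_member("More 5 lines\n" * 5)
print read_with_one_reader(data)      # only the "5 lines" rows come back
```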

#6518

From: Robert Klemme <shortcutter@googlemail.com>
Date: 2012-04-25 21:03 +0200
Message-ID: <9vr04hFh0cU1@mid.individual.net>
In reply to: #6517

On 04/25/2012 10:57 AM, Thomas Wolf wrote:
> How do I force Zlib::GzipReader to read the whole file?

That's a fairly common limitation of gzip libs (Java's standard lib also
has this limitation, or at least it had the last time I checked).

You might get away with wrapping the GzipReader around an open IO object 
and wrapping another GzipReader when the first finishes.

Kind regards

	robert

#6520

From: Simon Krahnke <overlord@gmx.li>
Date: 2012-04-25 21:55 +0200
Message-ID: <87ipgn7fjf.fsf@xts.gnuu.de>
In reply to: #6518

* Robert Klemme <shortcutter@googlemail.com> (21:03) wrote:

> You might get away with wrapping the GzipReader around an open IO object
> and wrapping another GzipReader when the first finishes.

Like this:

,----[ gz.rb ]
| #!/usr/bin/env ruby
| 
| require 'zlib'
| require 'pp'
| 
| filename  = ARGV[0]
| 
| File.open filename do | f |
|   gz1 = Zlib::GzipReader.new(f)
|   pp gz1.read
|   pp Zlib::GzipReader.new(f).read
| end
`----

Doesn't work.

Regards,           simon .... l
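
A likely reason the naive second reader fails: GzipReader pulls data from the underlying IO in chunks, so by the time the first member is done the IO position is already past the start of the second one. The sketch below makes that visible through `GzipReader#unused`. The helpers `gzip_member` and `read_ahead_report` are hypothetical names for this illustration, and the exact chunk size is an implementation detail I am not relying on.

```ruby
require "zlib"
require "stringio"

# Hypothetical helper: compress +text+ into one complete gzip member.
def gzip_member(text)
  buffer = StringIO.new(String.new)
  gz = Zlib::GzipWriter.new(buffer)
  gz.write(text)
  gz.finish
  buffer.string
end

# Hypothetical helper: read the first member, then report where the
# underlying IO ended up and how many bytes the reader took without using.
def read_ahead_report(first, second)
  io = StringIO.new(first + second)
  gz = Zlib::GzipReader.new(io)
  gz.read
  report = { pos: io.pos, unused: gz.unused.to_s.bytesize }
  gz.finish                           # release the reader, keep io open
  report
end

first  = gzip_member("5 lines\n" * 5)
second = gzip_member("More 5 lines\n" * 5)
report = read_ahead_report(first, second)

puts "first member: #{first.bytesize} bytes"
puts "io.pos:       #{report[:pos]}"
puts "unused bytes: #{report[:unused]} (second member: #{second.bytesize})"
```

If `io.pos` is larger than the size of the first member, a fresh GzipReader on the same IO starts in the wrong place, which is why the position has to be rewound by the length of `unused` first.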

#6521

From: Thomas Wolf <thomasw@viacanale.de>
Date: 2012-04-26 11:54 +0200
Message-ID: <jnb61g$7ka$1@online.de>
In reply to: #6518

On 25.04.2012 21:03, Robert Klemme wrote:
>> How do I force Zlib::GzipReader to read the whole file?
>
> That's a fairly common limitation of gzip libs (Java's standard lib also
> has this limitation, or at least it had the last time I checked).
>
> You might get away with wrapping the GzipReader around an open IO object
> and wrapping another GzipReader when the first finishes.

Thank you.

I found the following thread:
http://www.velocityreviews.com/forums/t866074-zlib-gzipreader-and-multiple-compressed-blobs-in-a-single-stream.html

and that code works with ruby 1.9.3p0:

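require 'stringio'
require 'zlib'

def inflate(filename)
   File.open(filename) do |file|
     zio = file
     loop do
       # Decompress one gzip member starting at the current file position.
       io = Zlib::GzipReader.new zio
       puts io.read
       # Bytes the reader pulled from the file beyond the end of this
       # member (nil if there were none).
       unused = io.unused
       # Finish the reader without closing the underlying file.
       io.finish
       break if unused.nil?
       # Rewind to the start of the next member.
       zio.pos -= unused.length
     end
   end
end

inflate "foo.gz"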

Regards,
Thomas
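
On newer Rubies there may be no need for the manual loop: as far as I know, `Zlib::GzipReader.zcat` (added around Ruby 2.7) decompresses every gzip member in an IO that supports `pos`, `pos=` and `size`. The sketch below guards the call in case the method is missing. `gzip_member` and `zcat_all` are hypothetical helper names for this illustration.

```ruby
require "zlib"
require "stringio"

# Hypothetical helper: compress +text+ into one complete gzip member.
def gzip_member(text)
  buffer = StringIO.new(String.new)
  gz = Zlib::GzipWriter.new(buffer)
  gz.write(text)
  gz.finish
  buffer.string
end

# Hypothetical helper: decompress all members of +io+ in one call.
# Returns nil on Rubies that do not provide GzipReader.zcat.
def zcat_all(io)
  return nil unless Zlib::GzipReader.respond_to?(:zcat)
  Zlib::GzipReader.zcat(io)
end

data = gzip_member("5 lines\n" * 5) + gzip_member("More 5 lines\n" * 5)
result = zcat_all(StringIO.new(data))
if result
  print result
else
  warn "Zlib::GzipReader.zcat is not available on this Ruby"
end
```

For a file on disk the equivalent would be `File.open("foo.gz", "rb") { |f| print Zlib::GzipReader.zcat(f) }`.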

#6522

From: Simon Krahnke <overlord@gmx.li>
Date: 2012-04-26 22:02 +0200
Message-ID: <87aa1y6z40.fsf@xts.gnuu.de>
In reply to: #6521

* Thomas Wolf <thomasw@viacanale.de> (11:54) wrote:

> require 'stringio'

This is unneeded.

>require 'zlib'
>
>def inflate(filename)
>  File.open(filename) do |file|
>    zio = file

You could just use |zio| as the block parameter instead of |file| and
get rid of the assignment.

>    loop do
>      io = Zlib::GzipReader.new zio
>      puts io.read

puts appends a "\n" whenever the string doesn't already end in one, so
it can change the output between members; use print instead.

>      unused = io.unused
>      io.finish
>      break if unused.nil?
>      zio.pos -= unused.length
>    end
>  end
>end
>
>inflate "foo.gz"

Note that, as said in the linked thread, this works only for files and
other seekable sources, because it rewinds with pos=.

So "(seq 1 5 | gzip; seq 6 10 | gzip) | yourscript.rb" won't work.

Regards,                simon .... hth
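
One way around the seekability requirement, sketched under the assumption that the whole input fits in memory: read the pipe into a StringIO first, which is seekable, and run the same rewind loop on that. `gzip_member` and `gunzip_all` are hypothetical names for this illustration, and the IO.pipe at the end only stands in for something like STDIN.

```ruby
require "zlib"
require "stringio"

# Hypothetical helper: compress +text+ into one complete gzip member.
def gzip_member(text)
  buffer = StringIO.new(String.new)
  gz = Zlib::GzipWriter.new(buffer)
  gz.write(text)
  gz.finish
  buffer.string
end

# Hypothetical helper: decompress every gzip member from +source+, which
# only needs to support #read (a pipe or STDIN is fine).
def gunzip_all(source)
  source.binmode if source.respond_to?(:binmode)
  buffered = StringIO.new(source.read)   # seekable in-memory copy
  output = String.new
  loop do
    gz = Zlib::GzipReader.new(buffered)
    output << gz.read
    unused = gz.unused                   # bytes read past this member
    gz.finish                            # keep the StringIO open
    break if unused.nil? || unused.empty?
    buffered.pos -= unused.bytesize      # rewind to the next member
  end
  output
end

# A pipe is not seekable, much like "... | yourscript.rb".
reader, writer = IO.pipe
writer.binmode
writer.write(gzip_member("1\n2\n") + gzip_member("3\n4\n"))
writer.close
print gunzip_all(reader)
reader.close
```

In a script fed from a shell pipeline, the call would be `print gunzip_all($stdin)`.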

#6519

From: Simon Krahnke <overlord@gmx.li>
Date: 2012-04-25 21:53 +0200
Message-ID: <87mx5z7fmc.fsf@xts.gnuu.de>
In reply to: #6517

* Thomas Wolf <thomasw@viacanale.de> (10:57) wrote:

> These files are "gzip"ed as follows:
> gzip < 5lines.txt > foo.gz
> gzip < more5lines.txt >> foo.gz

So you have two separate gzip streams (members) concatenated in foo.gz.

And the Ruby library reads only the first one.

> How do I force Zlib::GzipReader to read the whole file?

I don't know, read the source.

Regards,            simon .... l
