| From | Eric Sosman <esosman@ieee-dot-org.invalid> |
|---|---|
| Newsgroups | comp.lang.java.programmer |
| Subject | Re: Piggypack Encoding/Decoding on RandomAccessFile |
| Date | 2011-11-03 20:40 -0400 |
| Organization | A noiseless patient Spider |
| Message-ID | <j8vcaa$tnj$1@dont-email.me> |
| References | <j8ulue$or4$1@news.albasani.net> <j8uq9l$17l$1@dont-email.me> <j8urao$53e$1@news.albasani.net> |
On 11/3/2011 3:50 PM, Jan Burse wrote:
> Joshua Cranmer schrieb:
>> The "standard way" (at least, all of the use cases I've ever had for
>> RandomAccessFile) effectively uses the methods that are associated with
>> java.io.DataInput to read data: read(byte[]), and read*().
>
> I would like to use an arbitrary encoding/decoding on top of the
> byte stream to get a character stream. But since RandomAccessFile
> does not implement InputStream/OutputStream, I cannot create
> an InputStreamReader/OutputStreamWriter on top.

For a completely "arbitrary" encoding, I think you're out of luck.
Stateful encodings (where the encoding of byte B[n] is a function of
B[n-1],B[n-2],...) make it difficult to begin in medias res: You cannot
know how to decode the first byte you read without already having seen
all its predecessors.

To support random access, where you'd like to jump directly to B[n]
without plowing through all that goes before, one usually addresses the
problem by restricting the valid n to multiples of some "block size,"
and encoding each "block" independently. You seek to the next lower
multiple of 32K or whatever, set your decryptor/compressor/decoder to
its initial state, and roll merrily along.

There's a problem if the encoding does not always map K input bytes
to f(K) output bytes: compressors, for example, output different amounts
of data depending on the values of the bytes compressed. There are two
principal methods for dealing with this difficulty:

1) Encode the original in blocks of 32K (say), and store each
encoded block in a file region that's sure to be large enough -- 40K,
perhaps. Pad with nulls or other junk values as needed, so long as
your decompressor can recognize and ignore the padding. Then original
byte N is in block number N/32K (integer division), whose encoding
starts at (N/32K)*40K in the file; seek to that spot, start decoding,
and discard the first N mod 32K decoded bytes.
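
The offset arithmetic for method 1 can be sketched in a few lines of Java. This is only an illustration under stated assumptions: the class name PaddedSlots, its method names, and the 32K/40K constants (taken from the "say" and "perhaps" figures above) are hypothetical, and the decoder itself is left to the caller.

```java
import java.io.IOException;
import java.io.RandomAccessFile;

/** Hypothetical helper for method 1: every encoded block gets a fixed-size slot. */
final class PaddedSlots {
    static final int BLOCK = 32 * 1024; // original bytes per block (assumed)
    static final int SLOT = 40 * 1024;  // file bytes reserved per encoded block (assumed)

    /** File offset where the encoding of the block holding original byte n begins. */
    static long slotOffset(long n) {
        return (n / BLOCK) * SLOT; // integer division yields the block number
    }

    /** Number of decoded bytes to discard before original byte n appears. */
    static int skipWithinBlock(long n) {
        return (int) (n % BLOCK);
    }

    /** Moves the file pointer to the slot for byte n; decoding is up to the caller. */
    static void seekToSlot(RandomAccessFile raf, long n) throws IOException {
        raf.seek(slotOffset(n));
    }
}
```

After seekToSlot, the caller would reset its decoder to the initial state, decode, and throw away skipWithinBlock(n) bytes before using the output.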

2) As before, encode the original in fixed-size blocks, but write
them cheek by jowl to the file. As you do so, also write an index file
that's essentially Map<OriginalByteNumber,EncodedByteNumber> for each
block boundary. Then original byte N is in the block whose original
start is (N/32K)*32K, and its encoding begins at
theMap.get((N/32K)*32K); seek to that spot, start decoding, and discard
the first N mod 32K decoded bytes.
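
One possible in-memory shape for the index of method 2 is a sorted map keyed by the original offset of each block start. The sketch below is an assumption, not a prescribed design: the class name BlockIndex and its methods are hypothetical, and how the index is persisted to its own file is not shown. Using TreeMap.floorEntry finds the block boundary at or below N without assuming a particular block size.

```java
import java.util.Map;
import java.util.TreeMap;

/** Hypothetical index for method 2: original block start -> encoded block start. */
final class BlockIndex {
    private final TreeMap<Long, Long> starts = new TreeMap<>();

    /** Records one block boundary while the encoded file is being written. */
    void add(long originalOffset, long encodedOffset) {
        starts.put(originalOffset, encodedOffset);
    }

    /** Entry for the block that contains original byte n. */
    private Map.Entry<Long, Long> blockFor(long n) {
        Map.Entry<Long, Long> e = starts.floorEntry(n); // greatest key <= n
        if (e == null) {
            throw new IllegalArgumentException("no block covers byte " + n);
        }
        return e;
    }

    /** File offset to seek to before decoding. */
    long encodedStart(long n) {
        return blockFor(n).getValue();
    }

    /** Number of decoded bytes to discard before original byte n appears. */
    long skipWithinBlock(long n) {
        return n - blockFor(n).getKey();
    }
}
```

The encoded offsets in any real index depend entirely on how well each block compressed; the map only records where each one happened to land.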

Elsethread you mention that RandomAccessFile provides neither
InputStream nor OutputStream. If you think about this a bit, you'll
see it's a natural consequence of the "Random" part: a Stream provides
the abstraction of a linear sequence of things, and does not admit of
leaping forward or backward to unrelated positions. Yes, there are
skip() and mark() and reset(), but I think you'll agree these are of
a different character than "read bytes 3000-3999, then 10000-10999,
then 936-22728." Streams are sequential; Random isn't.
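
That said, a sequential view that starts at the current file position is available through the file's channel, which may be enough when the encoding is stateless (a single-byte charset such as ISO-8859-1, for instance). The following is a minimal sketch, not a general solution: the class name ReaderFromPosition and its methods are hypothetical, while RandomAccessFile.getChannel() and Channels.newInputStream() are standard library calls.

```java
import java.io.File;
import java.io.IOException;
import java.io.InputStream;
import java.io.InputStreamReader;
import java.io.RandomAccessFile;
import java.io.Reader;
import java.io.UncheckedIOException;
import java.nio.channels.Channels;
import java.nio.charset.Charset;
import java.nio.charset.StandardCharsets;

/** Hypothetical helper: a Reader that decodes sequentially from a chosen byte offset. */
final class ReaderFromPosition {
    static Reader readerAt(RandomAccessFile raf, long pos, Charset cs) throws IOException {
        raf.seek(pos); // the channel shares the file pointer with raf
        InputStream in = Channels.newInputStream(raf.getChannel());
        return new InputStreamReader(in, cs);
    }

    /** Writes "hello world" to a temp file and reads five characters from byte 6. */
    static String demo() {
        try {
            File f = File.createTempFile("raf-demo", ".txt");
            f.deleteOnExit();
            try (RandomAccessFile raf = new RandomAccessFile(f, "rw")) {
                raf.write("hello world".getBytes(StandardCharsets.ISO_8859_1));
                Reader r = readerAt(raf, 6, StandardCharsets.ISO_8859_1);
                char[] buf = new char[5];
                int got = 0;
                while (got < buf.length) {
                    int k = r.read(buf, got, buf.length - got);
                    if (k < 0) break; // end of file
                    got += k;
                }
                return new String(buf, 0, got);
            }
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }
}
```

Two caveats: InputStreamReader reads ahead, so the file pointer after a read need not match the characters consumed, and a fresh Reader is needed after every seek. Closing the stream also closes the channel, per the Channels.newInputStream documentation.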
--
Eric Sosman
esosman@ieee-dot-org.invalid