| From | BGB <cr88192@hotmail.com> |
|---|---|
| Newsgroups | comp.lang.java.programmer |
| Subject | Re: Interplatform (interprocess, interlanguage) communication |
| Date | 2012-02-08 00:55 -0700 |
| Organization | albasani.net |
| Message-ID | <jgt9s7$f7i$1@news.albasani.net> (permalink) |
| References | <IPC-20120203200443@ram.dialup.fu-berlin.de> <aM6dnWFo75_W9KzSnZ2dnUVZ_sqdnZ2d@giganews.com> <jgscnm$1td$1@news.albasani.net> <jgsj9c$sl6$2@localhost.localdomain> |
On 2/7/2012 6:31 PM, Martin Gregorie wrote:
> On Tue, 07 Feb 2012 16:38:31 -0700, BGB wrote:
>
>> in general, I agree (sockets generally make the most sense), although
>> there are cases where file-based communications can make sense, although
>> probably not in the form as described in the OP.
>>
> Yes, for small amounts of data or message passing between processes I
> tend to like sockets - as others have said, the fact that they are
> agnostic about the location of the communicating processes is often very
> useful.
>
yep.
>> usually, for passing messages over sockets, I have used "compact"
>> specialized binary formats,
>>
> Yep. ASN.1 has to be about the most compact way of encoding structured,
> multi-field messages with XML occupying the other end of the scale.
>
I disagree partly WRT ASN.1:
a disadvantage of ASN.1 is that a lot of times it tends to use
fixed-width integer encodings (and often sends structures in a
"reasonably raw" form), whereas one can shave more bytes using a
variable-length-integer scheme (why encode an integer in 4 bytes if you
only need 1 byte in a given case?). it is also possible to shave more
bytes if one makes the format use an adaptive/context-sensitive encoding
scheme and maybe a variant of Huffman coding or similar (and possibly
encode integer values using a similar scheme to that used in Deflate).
it is in fact not particularly difficult to outperform ASN.1 in these
regards.
granted, yes, custom Huffman-based data encodings are probably not "the
norm" for network protocols (though some programs, such as the Quake 3
engine, have used Huffman-compressed network protocols).
there is also "arithmetic coding" and "range coding", but with these it
is a lot harder to make the codec be acceptably fast (whereas there are
some tricks to allow optimizing Huffman codecs).
in cases where I have used XML, I have typically used a custom binary
XML variant, which can greatly reduce the overhead vs textual XML. in
terms of saving bytes, my encoding can be more compact than WBXML or
XML+Deflate, but is arguably more "esoteric", and as-is doesn't make use
of schemas (it is instead a basic adaptive coding, and is vaguely
similar to an LZ-Markov coding, attempting to exploit repeating patterns
in tag-structure and similar via prediction, but like most adaptive
codings initially transmits the data in a less dense form as it needs to
build up a new context for each message). the coding in question doesn't
use Huffman coding (for sake of simplicity, and because I don't always
particularly need "maximum compactness"), but a Huffman-based variant
could be created if needed.
there is also EXI, but I don't know how my encoding compares (EXI
probably does better though, given that IIRC it uses binary universal
codes and schemas).
for something else of mine I am using S-Expression based messages
(currently between components within the same process), and had
considered using a vaguely similar binary coding if/when I get around to it.
> That said, for short, list of fields messages I often use a CSV string
> preceded by an unsigned binary byte value containing the string length:
> this type of message is both easy to transfer, even if the connection
> wants to fragment it during transmission, and by having a printable text
> payload, its also convenient for trouble shooting.
>
yes, this is possible.
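as a rough sketch of that kind of framing (Python here, purely for illustration; the function names are made up, and the naive comma join assumes fields never contain commas or non-ASCII text):

```python
import io

def frame_csv(fields):
    """Join fields into a CSV string and prefix it with a 1-byte length."""
    payload = ",".join(fields).encode("ascii")
    if len(payload) > 255:
        raise ValueError("payload does not fit a 1-byte length prefix")
    return bytes([len(payload)]) + payload

def read_exact(stream, n):
    """Read exactly n bytes, looping because the transport may fragment."""
    buf = b""
    while len(buf) < n:
        chunk = stream.read(n - len(buf))
        if not chunk:
            raise EOFError("stream closed mid-message")
        buf += chunk
    return buf

def read_csv_message(stream):
    """Read one length-prefixed message and split it back into fields."""
    length = read_exact(stream, 1)[0]
    return read_exact(stream, length).decode("ascii").split(",")

msg = frame_csv(["LOGIN", "alice", "42"])
fields = read_csv_message(io.BytesIO(msg))
```

the length byte is what makes fragmentation harmless: the reader always knows how many more bytes to wait for before it starts parsing.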
also possible would be a TLV encoding (say, possibly doing something
similar to the Matroska MKV file-format).
say, the integer values are encoded something like (range, encoding):
  0-127            0xxxxxxx
  128-16383        10xxxxxx xxxxxxxx
  16384-2097151    110xxxxx xxxxxxxx xxxxxxxx
  2097152-...      ...
likewise, one can get a signed variant by folding the sign into the LSB,
forming a pattern like: 0, -1, 1, -2, 2, ...
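a minimal sketch of such a scheme (in Python, for illustration only). the first three sizes follow the list above; carrying the same prefix pattern on up to 8 bytes is my assumption, as are the function names:

```python
def encode_vli(value):
    """Encode a non-negative integer: n-1 one bits, a zero bit, then 7*n value bits."""
    if value < 0:
        raise ValueError("use fold_sign() first for signed values")
    for n in range(1, 9):
        if value < (1 << (7 * n)):
            prefix = ((1 << (n - 1)) - 1) << 1      # n=1 -> 0, n=2 -> 0b10, n=3 -> 0b110
            return ((prefix << (7 * n)) | value).to_bytes(n, "big")
    raise ValueError("value too large for this sketch")

def decode_vli(data, pos=0):
    """Decode one integer starting at pos; returns (value, next_pos)."""
    ones = 0
    while ones < 8 and data[pos] & (0x80 >> ones):  # count leading one bits
        ones += 1
    n = ones + 1
    if ones == 8 or pos + n > len(data):
        raise ValueError("invalid or truncated integer")
    word = int.from_bytes(data[pos:pos + n], "big")
    return word & ((1 << (7 * n)) - 1), pos + n

def fold_sign(v):
    """Map 0, -1, 1, -2, 2, ... onto 0, 1, 2, 3, 4, ..."""
    return 2 * v if v >= 0 else -2 * v - 1

def unfold_sign(u):
    """Inverse of fold_sign()."""
    return u >> 1 if u % 2 == 0 else -((u + 1) >> 1)

encoded = encode_vli(fold_sign(-300))   # 599 fits in the 2-byte form
value, _ = decode_vli(encoded)
restored = unfold_sign(value)
```

with the sign folded into the low bit, small negative numbers stay as short as small positive ones, which a plain two's-complement value would not.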
then, one defines tags as:
{
VLI tag;
VLI length;
byte data[length];
}
where tags can hold either data or messages (and, the smallest tag size
needs 2 bytes, or 3 bytes if one has 1 byte of payload for the tag).
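a small sketch of that layout (Python, illustrative only). the VLI helpers repeat the prefix scheme from above so the example runs on its own; the tag numbers 1, 2 and 3 are invented for the example:

```python
def encode_vli(value):
    """n-1 one bits, a zero bit, then 7*n value bits (same scheme as the list above)."""
    for n in range(1, 9):
        if 0 <= value < (1 << (7 * n)):
            prefix = ((1 << (n - 1)) - 1) << 1
            return ((prefix << (7 * n)) | value).to_bytes(n, "big")
    raise ValueError("value out of range for this sketch")

def decode_vli(data, pos):
    ones = 0
    while ones < 8 and data[pos] & (0x80 >> ones):
        ones += 1
    n = ones + 1
    if ones == 8 or pos + n > len(data):
        raise ValueError("invalid or truncated integer")
    return int.from_bytes(data[pos:pos + n], "big") & ((1 << (7 * n)) - 1), pos + n

def encode_tag(tag, payload=b""):
    """{ VLI tag; VLI length; byte data[length]; }"""
    return encode_vli(tag) + encode_vli(len(payload)) + payload

def decode_tags(data):
    """Split a buffer into (tag, payload) pairs; payloads may hold sub-tags."""
    pos, tags = 0, []
    while pos < len(data):
        tag, pos = decode_vli(data, pos)
        length, pos = decode_vli(data, pos)
        if pos + length > len(data):
            raise ValueError("truncated payload")
        tags.append((tag, data[pos:pos + length]))
        pos += length
    return tags

# hypothetical tags: 1 = "point" message, 2 = x, 3 = y
point = encode_tag(1, encode_tag(2, b"\x05") + encode_tag(3, b"\x07"))
outer = decode_tags(point)
inner = decode_tags(outer[0][1])
```

an unknown tag can simply be skipped by its length, which is most of the appeal of TLV.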
if the length is optional (presence depends on tag), one can reduce the
typical tag size to 1 byte. likewise, tags can be combined with an
MTF/MRU scheme such that any recently used tags have a small value (and
can thus be encoded in a single byte). (many of my formats define tags
inline, rather than relying on some large hard-coded tag-list).
more bytes can be saved if more of the message structure is known, say
that not only does the tag encode a particular tag-type, but also may
carry information about what follows after it (various combinations of
attributes, and if it contains sub-tags and what they might be, ...).
if a new tag is defined, it is added to the MRU, but if not used
frequently may move "backwards" (towards higher index numbers) or
eventually be forgotten (falls off the end of the list).
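roughly, the MRU part could look like the following (a Python sketch; the class name, the capacity and the ("def", ...) / ("ref", ...) tokens are all made up here, and a real stream would send a hard-coded control tag for "def" and a VLI index for "ref"):

```python
class MruTags:
    """Move-to-front list of tag names, kept in sync by encoder and decoder."""

    def __init__(self, capacity=64):
        self.capacity = capacity
        self.tags = []                      # index 0 = most recently used

    def _define(self, name):
        self.tags.insert(0, name)
        del self.tags[self.capacity:]       # the oldest entries fall off the end

    def encode(self, name):
        if name in self.tags:
            index = self.tags.index(name)
            self.tags.insert(0, self.tags.pop(index))   # move to front
            return ("ref", index)           # small index for recently used tags
        self._define(name)
        return ("def", name)                # new/unfamiliar tag sent inline

    def decode(self, token):
        kind, arg = token
        if kind == "ref":
            name = self.tags.pop(arg)
            self.tags.insert(0, name)
            return name
        self._define(arg)
        return arg

enc, dec = MruTags(capacity=2), MruTags(capacity=2)
names = ["a", "b", "a", "a", "c", "b"]
tokens = [enc.encode(n) for n in names]
decoded = [dec.decode(t) for t in tokens]
```

both sides apply the same list updates in the same order, so no table ever needs to be transmitted; here "b" is forgotten once "c" arrives and has to be defined again.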
note that some hard-coded tag-numbers will be needed for basic control
purposes (encoding new/unfamiliar tags, ...).
a Huffman-based variant could be similar, just one may encode integers
differently. an example scheme is to use a prefix value (Huffman coded)
and a suffix bit pattern (similar to Deflate). a simpler (but less
compact) scheme was used in JPEG, and IIRC I had before "compromised"
between them by having the Huffman table be stored using Rice codes.
example (prefix range, value range, suffix bits):
  0-15     0-15         0
  16-23    16-31        1
  24-31    32-63        2
  32-39    64-127       3
  40-47    128-255      4
  48-55    256-511      5
  56-63    512-1023     6
  64-71    1024-2047    7
  72-79    2048-4095    8
  80-87    4096-8191    9
  ...
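a sketch of the prefix/suffix split (Python, illustrative; function names are mine). it assumes the pattern of the first rows continues: after the 16 direct prefixes, each group has 8 prefixes with n suffix bits, so it covers 8 * 2^n values starting at 8 * 2^n. only the prefix would be Huffman coded, the suffix bits are sent raw:

```python
def split_value(value):
    """Return (prefix, suffix_bits, suffix) for a non-negative integer."""
    if value < 16:
        return value, 0, 0                       # prefixes 0-15 carry the value directly
    n = value.bit_length() - 4                   # values 2^(n+3) .. 2^(n+4)-1 use n suffix bits
    offset = value - (8 << n)                    # position inside the group
    prefix = 16 + 8 * (n - 1) + (offset >> n)    # 8 prefixes per group
    return prefix, n, offset & ((1 << n) - 1)

def join_value(prefix, suffix):
    """Inverse of split_value(); the suffix width follows from the prefix."""
    if prefix < 16:
        return prefix
    n = (prefix - 16) // 8 + 1
    return (8 << n) + (((prefix - 16) % 8) << n) + suffix

def suffix_bits(prefix):
    """How many raw bits the decoder must read after this prefix."""
    return 0 if prefix < 16 else (prefix - 16) // 8 + 1

examples = [split_value(v) for v in (15, 16, 31, 255, 256)]
```

the point of the split is that the Huffman alphabet stays small (a few dozen prefixes) while still covering a large integer range.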
also note that a nifty thing (also used in Deflate) is to compress the
Huffman table itself using Huffman coding.
likewise, one can save a few bytes if the encoder is smart enough to
recognize when tags encode numeric data (mostly specific to XML, with
S-Expressions or similar one knows when they are dealing with numeric data).
likewise, one can encode floats as a pair of integer values (although
floats present a few of their own complexities). one can also devise
special encodings for things like numeric vectors, quaternions, ... if
needed as well.
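one possible way to do the float-as-two-integers split (a Python sketch under my own assumptions: doubles, a mantissa/exponent pair, no NaN or infinity handling, and the sign of -0.0 is dropped):

```python
import math

def float_to_pair(x):
    """Split a finite double into (mantissa, exponent) with x == mantissa * 2**exponent."""
    if not math.isfinite(x):
        raise ValueError("NaN and infinities would need their own encoding")
    if x == 0.0:
        return 0, 0                      # note: -0.0 also ends up here
    frac, exp = math.frexp(x)            # x == frac * 2**exp, 0.5 <= |frac| < 1
    mant = int(frac * (1 << 53))         # exact, a double has 53 mantissa bits
    exp -= 53
    while mant % 2 == 0:                 # strip trailing zero bits so simple values stay small
        mant //= 2
        exp += 1
    return mant, exp

def pair_to_float(mant, exp):
    """Rebuild the double from the integer pair."""
    return math.ldexp(mant, exp)

pairs = [float_to_pair(v) for v in (0.5, 3.0, -6.25)]
```

both integers can then go through the signed variable-length coding, so values like 0.5 or 3.0 cost two bytes instead of eight, while "ugly" values like 0.1 cost more than a raw double.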
likewise, either an LZ77 or LZ-Markov scheme can be used for encoding
strings (an example would be to use a fixed-size rotating window like
in Deflate, and essentially using the same basic encoding for strings,
albeit likely with the use of an "End-Of-String" marker).
say (range, meaning):
0-255: literal byte values
258: End Of String
259-321: LZ77 Run (encodes length, followed by window offset).
String encoding would be used, say, for encoding both literal text, and
also for escaping things like tag and attribute names.
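a toy version of the string coding (Python, illustrative only). the End-Of-String number is the one from the list above; runs are left as ("run", length, offset) tuples rather than mapped onto symbol numbers, and the window size, minimum run length and brute-force search are my assumptions:

```python
EOS = 258          # End Of String symbol, per the list above
WINDOW = 4096      # assumed window size
MIN_RUN = 3        # assumed shortest run worth encoding

def tokenize(data, window=WINDOW):
    """Turn bytes into literals (0-255), ("run", length, offset) tuples, and EOS."""
    tokens, pos = [], 0
    while pos < len(data):
        best_len, best_off = 0, 0
        for cand in range(max(0, pos - window), pos):   # naive O(n^2) match search
            length = 0
            while pos + length < len(data) and data[cand + length] == data[pos + length]:
                length += 1
            if length > best_len:
                best_len, best_off = length, pos - cand
        if best_len >= MIN_RUN:
            tokens.append(("run", best_len, best_off))
            pos += best_len
        else:
            tokens.append(data[pos])
            pos += 1
    tokens.append(EOS)
    return tokens

def detokenize(tokens):
    """Rebuild the bytes; runs copy from earlier output and may overlap themselves."""
    out = bytearray()
    for tok in tokens:
        if tok == EOS:
            break
        if isinstance(tok, tuple):
            _, length, offset = tok
            for _ in range(length):
                out.append(out[-offset])
        else:
            out.append(tok)
    return bytes(out)

tokens = tokenize(b"tag tag tag")
```

in a full codec each token would then go through the Huffman (or other) stage; a real encoder would also use a hash chain or similar instead of the brute-force search.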
...
the main variability is mostly in terms of the type of payload being
transmitted:
be it XML-based, S-Expression based, or potentially object-based
(similar to either JSON, or a sort of "heap pickling" style system).
for most structured data, it shouldn't be necessary to change the
"fundamentals" too much. the main difference is between tree-structured
and heap-like / graph-structured data, as graph-structured data is often
better sent as a flat list of objects with a certain entry being a "root
node" than as a tree (this can be accomplished either by building a
list, or using an algorithm to detect and break-up cycles when needed).
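a sketch of the "flat list with a root node" approach (Python; dicts and lists stand in for heap objects, and the record layout with ("ref", index) / ("val", value) pairs is invented for the example):

```python
def flatten(root):
    """Turn a graph of dicts/lists into a flat list of records; entry 0 is the root."""
    index_of = {}                      # id(object) -> slot in the flat list
    flat = []

    def visit(obj):
        if not isinstance(obj, (dict, list)):
            return ("val", obj)        # plain values are stored inline
        key = id(obj)
        if key not in index_of:
            index_of[key] = len(flat)
            flat.append(None)          # reserve the slot first, so cycles terminate
            if isinstance(obj, dict):
                record = {"kind": "dict",
                          "fields": {k: visit(v) for k, v in obj.items()}}
            else:
                record = {"kind": "list", "items": [visit(v) for v in obj]}
            flat[index_of[key]] = record
        return ("ref", index_of[key])  # every object is referenced by its index

    visit(root)
    return flat

a = {"name": "a"}
b = {"name": "b", "peer": a}
a["peer"] = b                          # a <-> b form a cycle
flat = flatten(a)
```

since references are just indices, shared objects are sent once and cycles need no special casing; the recursion here is only suitable for small graphs.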
granted, for most use-cases something like this is likely to be overkill.
or such...