Re: Interplatform (interprocess, interlanguage) communication

From BGB <cr88192@hotmail.com>
Newsgroups comp.lang.java.programmer
Subject Re: Interplatform (interprocess, interlanguage) communication
Date 2012-02-08 00:55 -0700
Organization albasani.net
Message-ID <jgt9s7$f7i$1@news.albasani.net>
References <IPC-20120203200443@ram.dialup.fu-berlin.de> <aM6dnWFo75_W9KzSnZ2dnUVZ_sqdnZ2d@giganews.com> <jgscnm$1td$1@news.albasani.net> <jgsj9c$sl6$2@localhost.localdomain>


On 2/7/2012 6:31 PM, Martin Gregorie wrote:
> On Tue, 07 Feb 2012 16:38:31 -0700, BGB wrote:
>
>> in general, I agree (sockets generally make the most sense), although
>> there are cases where file-based communications can make sense, although
>> probably not in the form as described in the OP.
>>
> Yes, for small amounts of data or message passing between processes I
> tend to like sockets - as others have said, the fact that they are
> agnostic about the location of the communicating processes is often very
> useful.
>

yep.


>> usually, for passing messages over sockets, I have used "compact"
>> specialized binary formats,
>>
> Yep. ASN.1 has to be about the most compact way of encoding structured,
> multi-field messages with XML occupying the other end of the scale.
>

I disagree partly WRT ASN.1:
a disadvantage of ASN.1 is that, with the common BER/DER encodings, every 
value carries its own tag and length bytes (so even a small integer 
costs 3 bytes), and structures are often sent in a "reasonably raw" 
form, whereas one can shave more bytes using a bare 
variable-length-integer scheme (why spend 3 or 4 bytes on an integer if 
you only need 1 byte in a given case?). it is also possible to shave more 
bytes if one makes the format use an adaptive/context-sensitive encoding 
scheme and maybe a variant of Huffman coding or similar (and possibly 
encode integer values using a similar scheme to that used in Deflate). 
it is in fact not particularly difficult to outperform ASN.1 in these 
regards.


granted, yes, custom Huffman-based data encodings are probably not "the 
norm" for network protocols (though some programs, such as the Quake 3 
engine, have used Huffman-compressed network protocols).

there is also "arithmetic coding" and "range coding", but with these it 
is a lot harder to make the codec be acceptably fast (whereas there are 
some tricks to allow optimizing Huffman codecs).


in cases where I have used XML, I have typically used a custom binary 
XML variant, which can greatly reduce the overhead vs textual XML. in 
terms of saving bytes, my encoding can be more compact than WBXML or 
XML+Deflate, but is arguably more "esoteric", and as-is doesn't make use 
of schemas. it is instead a basic adaptive coding, vaguely similar to an 
LZ-Markov coding: it attempts to exploit repeating patterns in 
tag-structure and similar via prediction, but, like most adaptive 
codings, it initially transmits the data in a less dense form, as it 
needs to build up a new context for each message. the coding in question 
doesn't use Huffman coding (for sake of simplicity, and because I don't 
always particularly need "maximum compactness"), but a Huffman-based 
variant could be created if needed.

there is also EXI, but I don't know how my encoding compares (EXI 
probably does better though, given that IIRC it uses binary universal 
codes and schemas).


for something else of mine I am using S-Expression based messages 
(currently between components within the same process), and had 
considered using a vaguely similar binary coding if/when I get around to it.


> That said, for short, list of fields messages I often use a CSV string
> preceded by an unsigned binary byte value containing the string length:
> this type of message is both easy to transfer, even if the connection
> wants to fragment it during transmission, and by having a printable text
> payload, its also convenient for trouble shooting.
>

yes, this is possible.
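
a rough Java sketch of that kind of framing (the class and method names here are made up for illustration, and it assumes plain ASCII fields with no embedded commas or quoting):

```java
import java.io.DataInputStream;
import java.io.DataOutputStream;
import java.io.IOException;
import java.nio.charset.StandardCharsets;

// Hypothetical helper: one unsigned length byte followed by a CSV payload.
final class CsvFrame {
    // Writes one message; the 1-byte length limits the payload to 255 bytes.
    static void write(DataOutputStream out, String... fields) throws IOException {
        byte[] payload = String.join(",", fields).getBytes(StandardCharsets.US_ASCII);
        if (payload.length > 255) {
            throw new IOException("message too long for a 1-byte length");
        }
        out.writeByte(payload.length);
        out.write(payload);
    }

    // Reads one message. readFully() keeps reading until the whole payload has
    // arrived, so it does not matter if the connection delivered it in fragments.
    static String[] read(DataInputStream in) throws IOException {
        int length = in.readUnsignedByte();
        byte[] payload = new byte[length];
        in.readFully(payload);
        return new String(payload, StandardCharsets.US_ASCII).split(",", -1);
    }
}
```

the same two methods work over a socket's streams or over a byte array, and the payload stays readable in a packet dump.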

also possible would be a TLV encoding (say, doing something similar to 
the Matroska MKV file-format).


say, the integer values are encoded something like (range, encoding):
0-127		0xxxxxxx
128-16383	10xxxxxx xxxxxxxx
16384-2097151	110xxxxx xxxxxxxx xxxxxxxx
2097152-...	...

likewise, one can get a signed variant by folding the sign into the LSB, 
forming a pattern like: 0, -1, 1, -2, 2, ...
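
as a minimal Java sketch of the above (names are made up; it assumes big-endian byte order within a value, and only handles the three forms actually listed, so anything past 2097151 is rejected rather than guessed at):

```java
import java.io.ByteArrayOutputStream;

// Hypothetical helper for the 1-, 2- and 3-byte integer forms.
final class Vli {
    // Appends an unsigned value in the shortest listed form.
    static void writeUnsigned(ByteArrayOutputStream out, int value) {
        if (value < 0 || value > 0x1FFFFF) {
            throw new IllegalArgumentException("longer forms are not covered by this sketch");
        }
        if (value < 0x80) {                 // 0xxxxxxx
            out.write(value);
        } else if (value < 0x4000) {        // 10xxxxxx xxxxxxxx
            out.write(0x80 | (value >> 8));
            out.write(value & 0xFF);
        } else {                            // 110xxxxx xxxxxxxx xxxxxxxx
            out.write(0xC0 | (value >> 16));
            out.write((value >> 8) & 0xFF);
            out.write(value & 0xFF);
        }
    }

    // Decodes one value starting at buf[pos[0]] and advances pos[0] past it.
    static int readUnsigned(byte[] buf, int[] pos) {
        int first = buf[pos[0]++] & 0xFF;
        if ((first & 0x80) == 0) {
            return first;
        }
        if ((first & 0x40) == 0) {
            return ((first & 0x3F) << 8) | (buf[pos[0]++] & 0xFF);
        }
        if ((first & 0x20) == 0) {
            int mid = buf[pos[0]++] & 0xFF;
            int low = buf[pos[0]++] & 0xFF;
            return ((first & 0x1F) << 16) | (mid << 8) | low;
        }
        throw new IllegalArgumentException("longer forms are not covered by this sketch");
    }

    // Folds the sign into the lowest bit: 0, -1, 1, -2, 2 become 0, 1, 2, 3, 4.
    static int foldSign(int n) {
        return (n << 1) ^ (n >> 31);
    }

    // Inverse of foldSign().
    static int unfoldSign(int u) {
        return (u >>> 1) ^ -(u & 1);
    }
}
```

a signed value would then be sent as writeUnsigned(out, foldSign(n)), so that small negative numbers also stay at 1 byte.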

then, one defines tags as:
{
VLI tag;
VLI length;
byte data[length];
}

where tags can hold either data or messages (and, the smallest tag size 
needs 2 bytes, or 3 bytes if one has 1 byte of payload for the tag).
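
a small Java sketch of such a tag (hypothetical class name; to keep it short the VLI here only handles the 1- and 2-byte forms, and a "message" is simply assumed to be a tag whose data is itself a run of encoded tags):

```java
import java.io.ByteArrayOutputStream;
import java.util.Arrays;

// Hypothetical TLV record: VLI tag, VLI length, then `length` bytes of data.
final class Tlv {
    final int tag;
    final byte[] data;   // raw payload, or nested encoded tags for a message

    Tlv(int tag, byte[] data) {
        this.tag = tag;
        this.data = data;
    }

    // VLI restricted to values below 16384 (1- and 2-byte forms only).
    private static void writeVli(ByteArrayOutputStream out, int v) {
        if (v < 0 || v >= 0x4000) {
            throw new IllegalArgumentException("value out of range for this sketch");
        }
        if (v < 0x80) {
            out.write(v);
        } else {
            out.write(0x80 | (v >> 8));
            out.write(v & 0xFF);
        }
    }

    private static int readVli(byte[] buf, int[] pos) {
        int first = buf[pos[0]++] & 0xFF;
        if ((first & 0x80) == 0) {
            return first;
        }
        if ((first & 0x40) != 0) {
            throw new IllegalArgumentException("longer forms are not handled here");
        }
        return ((first & 0x3F) << 8) | (buf[pos[0]++] & 0xFF);
    }

    void writeTo(ByteArrayOutputStream out) {
        writeVli(out, tag);
        writeVli(out, data.length);
        out.write(data, 0, data.length);
    }

    // Reads one tag starting at buf[pos[0]] and advances pos[0] past it.
    static Tlv readFrom(byte[] buf, int[] pos) {
        int tag = readVli(buf, pos);
        int length = readVli(buf, pos);
        if (length > buf.length - pos[0]) {
            throw new IllegalArgumentException("truncated tag");
        }
        byte[] data = Arrays.copyOfRange(buf, pos[0], pos[0] + length);
        pos[0] += length;
        return new Tlv(tag, data);
    }
}
```

an unknown tag can be skipped by a reader without understanding it, since the length says how far to jump.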


if the length is optional (presence depends on tag), one can reduce the 
typical tag size to 1 byte. likewise, tags can be combined with an 
MTF/MRU scheme such that any recently used tags have a small value (and 
can thus be encoded in a single byte). (many of my formats define tags 
inline, rather than relying on some large hard-coded tag-list).

more bytes can be saved if more of the message structure is known, say 
that not only does the tag encode a particular tag-type, but also may 
carry information about what follows after it (various combinations of 
attributes, and if it contains sub-tags and what they might be, ...).

if a new tag is defined, it is added to the MRU, but if not used 
frequently may move "backwards" (towards higher index numbers) or 
eventually be forgotten (falls off the end of the list).

note that some hard-coded tag-numbers will be needed for basic control 
purposes (encoding new/unfamiliar tags, ...).
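
the MRU part could be sketched roughly like this in Java (a plain move-to-front list with a fixed capacity; the class name and the use of -1 to mean "not in the list, define it inline" are assumptions made for the example, not part of any particular format):

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical move-to-front list of recently used tag names.
final class MruTags {
    private final List<String> recent = new ArrayList<>();
    private final int capacity;   // assumed to be at least 1

    MruTags(int capacity) {
        this.capacity = capacity;
    }

    // Returns the tag's current index (small for recently used tags), or -1 if
    // the tag is unknown and has to be defined inline via a control code.
    // Either way the tag sits at index 0 afterwards.
    int use(String tag) {
        int index = recent.indexOf(tag);
        if (index >= 0) {
            recent.remove(index);
        } else if (recent.size() == capacity) {
            recent.remove(recent.size() - 1);   // least recently used tag is forgotten
        }
        recent.add(0, tag);
        return index;
    }
}
```

the decoder has to apply exactly the same updates after each tag it reads, otherwise the two lists drift apart and the indices stop meaning the same thing.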


a Huffman-based variant could be similar, just one may encode integers 
differently. an example scheme is to use a prefix value (Huffman coded) 
and a suffix bit pattern (similar to Deflate). a simpler (but less 
compact) scheme was used in JPEG, and IIRC I had previously "compromised" 
between them by storing the Huffman table using Rice codes.


example (prefix range, value range, suffix bits):
0-15	0-15		0
16-23	16-31		1
24-31	32-63		2
32-39	64-127		3
40-47	128-255		4
48-55	256-511		5
56-63	512-1023	6
64-71	1024-2047	7
72-79	2048-4095	8
80-87	4096-8191	9
...
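
as a sketch of the prefix/suffix split in Java (names are made up; it assumes the pattern continues with each group of 8 prefixes covering twice the span of the previous group, so prefixes 48-55 cover 256-511 with 5 suffix bits, and so on). the prefix is what would get Huffman coded, and the suffix is sent as raw bits:

```java
// Hypothetical helper mapping a value to (prefix, suffix bits, suffix) and back.
final class PrefixSuffix {
    // Returns {prefix, suffixBits, suffix} for a non-negative value.
    static int[] split(int value) {
        if (value < 0) {
            throw new IllegalArgumentException("negative value");
        }
        if (value < 16) {
            return new int[] { value, 0, 0 };
        }
        int top = 31 - Integer.numberOfLeadingZeros(value);   // highest set bit, >= 4
        int bits = top - 3;                                    // suffix width
        // The 3 bits below the top bit select one of the 8 prefixes in the group.
        int prefix = 16 + (top - 4) * 8 + ((value >> bits) & 7);
        int suffix = value & ((1 << bits) - 1);
        return new int[] { prefix, bits, suffix };
    }

    // Rebuilds the value from a prefix and the suffix bits read after it.
    static int join(int prefix, int suffix) {
        if (prefix < 16) {
            return prefix;
        }
        int bits = (prefix - 16) / 8 + 1;
        int base = (8 + (prefix - 16) % 8) << bits;
        return base | suffix;
    }
}
```

the decoder only needs the prefix to know how many suffix bits follow, which is what makes this workable with a Huffman-coded prefix.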

also note that a nifty thing (also used in Deflate) is to compress the 
Huffman table itself using Huffman coding.


likewise, one can save a few bytes if the encoder is smart enough to 
recognize when tags encode numeric data (this is mostly specific to XML; 
with S-Expressions or similar one already knows when one is dealing with 
numeric data).

likewise, one can encode floats as a pair of integer values (although 
floats present a few of their own complexities). one can also devise 
special encodings for things like numeric vectors, quaternions, ... if 
needed as well.
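
one possible way to do the float-as-two-integers split in Java (an assumption on my part as to the exact split: a signed mantissa with trailing zero bits stripped, plus a power-of-two exponent; NaN and infinities are simply rejected, and the sign of -0.0 is lost, which is the sort of complexity meant above):

```java
// Hypothetical helper: value == mantissa * 2^exponent, both plain integers.
final class FloatPair {
    // Returns {mantissa, exponent}; "simple" values give small integers,
    // e.g. 6.0 becomes 3 * 2^1.
    static long[] split(double v) {
        if (Double.isNaN(v) || Double.isInfinite(v)) {
            throw new IllegalArgumentException("NaN/Inf need their own encoding");
        }
        if (v == 0.0) {
            return new long[] { 0, 0 };
        }
        long bits = Double.doubleToLongBits(v);
        boolean negative = bits < 0;
        int rawExponent = (int) ((bits >> 52) & 0x7FF);
        long fraction = bits & ((1L << 52) - 1);
        long mantissa;
        int exponent;
        if (rawExponent == 0) {            // subnormal: no implicit leading 1
            mantissa = fraction;
            exponent = -1074;
        } else {
            mantissa = fraction | (1L << 52);
            exponent = rawExponent - 1075;
        }
        int zeros = Long.numberOfTrailingZeros(mantissa);
        mantissa >>>= zeros;               // strip trailing zero bits
        exponent += zeros;
        return new long[] { negative ? -mantissa : mantissa, exponent };
    }

    // Rebuilds the double; exact, since the mantissa fits in 53 bits.
    static double join(long mantissa, int exponent) {
        return Math.scalb((double) mantissa, exponent);
    }
}
```

both integers can then go through the signed variable-length coding, so values like 1.0 or 0.5 cost a couple of bytes rather than 8.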


likewise, either an LZ77 or LZ-Markov scheme can be used for encoding 
strings (an example would be to use a fixed-size rotating window like 
in Deflate, and essentially using the same basic encoding for strings, 
albeit likely with the use of an "End-Of-String" marker).

say (range, meaning):
0-255: literal byte values
258: End Of String
259-321: LZ77 Run (encodes length, followed by window offset).

String encoding would be used, say, for encoding both literal text, and 
also for escaping things like tag and attribute names.
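
a toy Java sketch of this symbol scheme (everything beyond the three ranges given above is an assumption for illustration: run symbol 259 + n is taken to mean a run of length n + 3, the offset follows the run symbol as a plain integer counted back from the current position, the window is 4096 bytes, and the match search is a naive greedy scan). in a real codec the symbols would then be entropy coded rather than kept as a list:

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical LZ77-style string coder producing a list of symbols.
final class LzString {
    static final int END_OF_STRING = 258;
    static final int RUN_BASE = 259;
    static final int MIN_RUN = 3;                             // assumed shortest run
    static final int MAX_RUN = MIN_RUN + (321 - RUN_BASE);    // 65
    static final int WINDOW = 4096;                           // assumed window size

    static List<Integer> encode(byte[] text) {
        List<Integer> out = new ArrayList<>();
        int i = 0;
        while (i < text.length) {
            int bestLen = 0;
            int bestOff = 0;
            // Try every offset in the window and keep the longest match.
            for (int off = 1; off <= Math.min(i, WINDOW); off++) {
                int len = 0;
                while (len < MAX_RUN && i + len < text.length
                        && text[i + len - off] == text[i + len]) {
                    len++;
                }
                if (len > bestLen) {
                    bestLen = len;
                    bestOff = off;
                }
            }
            if (bestLen >= MIN_RUN) {
                out.add(RUN_BASE + bestLen - MIN_RUN);   // run symbol encodes the length
                out.add(bestOff);                        // followed by the window offset
                i += bestLen;
            } else {
                out.add(text[i] & 0xFF);                 // literal byte
                i++;
            }
        }
        out.add(END_OF_STRING);
        return out;
    }

    static byte[] decode(List<Integer> symbols) {
        List<Byte> out = new ArrayList<>();
        int p = 0;
        while (true) {
            int sym = symbols.get(p++);
            if (sym == END_OF_STRING) {
                break;
            }
            if (sym < 256) {
                out.add((byte) sym);
            } else if (sym >= RUN_BASE && sym <= 321) {
                int len = sym - RUN_BASE + MIN_RUN;
                int off = symbols.get(p++);
                // Byte-by-byte copy, so a run may overlap the bytes it produces.
                for (int k = 0; k < len; k++) {
                    out.add(out.get(out.size() - off));
                }
            } else {
                throw new IllegalArgumentException("unassigned symbol " + sym);
            }
        }
        byte[] result = new byte[out.size()];
        for (int k = 0; k < result.length; k++) {
            result[k] = out.get(k);
        }
        return result;
    }
}
```

with this, "abcabcabcabc" comes out as three literals, one run (length 9, offset 3) and the End Of String symbol.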

...


the main variability is mostly in terms of the type of payload being 
transmitted:
be it XML-based, S-Expression based, or potentially object-based 
(similar to either JSON, or a sort of "heap pickling" style system).


for most structured data, it shouldn't be necessary to change the 
"fundamentals" too much. the main difference is between tree-structured 
and heap-like / graph-structured data, as graph-structured data is often 
better sent as a flat list of objects with a certain entry being a "root 
node" than as a tree (this can be accomplished either by building a 
list, or using an algorithm to detect and break-up cycles when needed).
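
the "build a list" approach might look something like this in Java (Node and Entry are hypothetical stand-ins for whatever objects are actually being sent; references between objects become indices into the list, and entry 0 is taken to be the root):

```java
import java.util.ArrayList;
import java.util.IdentityHashMap;
import java.util.List;
import java.util.Map;

// Hypothetical flattener for a graph of objects that may contain cycles.
final class GraphFlattener {
    // In-memory object: a label plus references to other nodes.
    static final class Node {
        final String label;
        final List<Node> refs = new ArrayList<>();

        Node(String label) {
            this.label = label;
        }
    }

    // Flat form: references are indices into the returned list.
    static final class Entry {
        final String label;
        final List<Integer> refs = new ArrayList<>();

        Entry(String label) {
            this.label = label;
        }
    }

    static List<Entry> flatten(Node root) {
        // Identity-based map: two distinct but equal-looking nodes stay distinct.
        Map<Node, Integer> index = new IdentityHashMap<>();
        List<Node> order = new ArrayList<>();
        index.put(root, 0);
        order.add(root);
        List<Entry> flat = new ArrayList<>();
        // `order` grows while it is walked, so every reachable node is visited once.
        for (int i = 0; i < order.size(); i++) {
            Node node = order.get(i);
            Entry entry = new Entry(node.label);
            for (Node ref : node.refs) {
                Integer id = index.get(ref);
                if (id == null) {            // first time this node is seen
                    id = order.size();
                    index.put(ref, id);
                    order.add(ref);
                }
                entry.refs.add(id);
            }
            flat.add(entry);
        }
        return flat;
    }
}
```

since a node already in the map is only ever referred to by its index, cycles and shared nodes need no special handling.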


granted, for most use-cases something like this is likely to be overkill.


or such...
