Groups > comp.lang.java.programmer > #11821 > unrolled thread
| Started by | jebblue <n@n.nnn> |
|---|---|
| First post | 2012-02-07 12:11 -0600 |
| Last post | 2012-02-08 00:55 -0700 |
| Articles | 10 on this page of 70 — 7 participants |
This discussion starts older than the indexed window; earlier articles aren't shown. The article labeled Started by
below is the oldest one visible, not the original post.
Re: Interplatform (interprocess, interlanguage) communication jebblue <n@n.nnn> - 2012-02-07 12:11 -0600
Re: Interplatform (interprocess, interlanguage) communication BGB <cr88192@hotmail.com> - 2012-02-07 16:38 -0700
Re: Interplatform (interprocess, interlanguage) communication Arved Sandstrom <asandstrom3minus1@eastlink.ca> - 2012-02-07 20:26 -0400
Re: Interplatform (interprocess, interlanguage) communication BGB <cr88192@hotmail.com> - 2012-02-08 01:41 -0700
Re: Interplatform (interprocess, interlanguage) communication Arved Sandstrom <asandstrom3minus1@eastlink.ca> - 2012-02-08 07:19 -0400
Re: Interplatform (interprocess, interlanguage) communication BGB <cr88192@hotmail.com> - 2012-02-08 12:07 -0700
Re: Interplatform (interprocess, interlanguage) communication Arne Vajhøj <arne@vajhoej.dk> - 2012-02-08 21:16 -0500
Re: Interplatform (interprocess, interlanguage) communication BGB <cr88192@hotmail.com> - 2012-02-08 19:50 -0700
Re: Interplatform (interprocess, interlanguage) communication Arved Sandstrom <asandstrom3minus1@eastlink.ca> - 2012-02-09 06:24 -0400
Re: Interplatform (interprocess, interlanguage) communication BGB <cr88192@hotmail.com> - 2012-02-09 09:15 -0700
Re: Interplatform (interprocess, interlanguage) communication Arved Sandstrom <asandstrom3minus1@eastlink.ca> - 2012-02-09 18:58 -0400
Re: Interplatform (interprocess, interlanguage) communication BGB <cr88192@hotmail.com> - 2012-02-09 16:15 -0700
Re: Interplatform (interprocess, interlanguage) communication Arne Vajhøj <arne@vajhoej.dk> - 2012-02-09 18:50 -0500
Re: Interplatform (interprocess, interlanguage) communication BGB <cr88192@hotmail.com> - 2012-02-09 21:40 -0700
Re: Interplatform (interprocess, interlanguage) communication Arne Vajhøj <arne@vajhoej.dk> - 2012-02-11 14:47 -0500
Re: Interplatform (interprocess, interlanguage) communication Lew <lewbloch@gmail.com> - 2012-02-11 12:06 -0800
Re: Interplatform (interprocess, interlanguage) communication Arne Vajhøj <arne@vajhoej.dk> - 2012-02-11 15:18 -0500
Re: Interplatform (interprocess, interlanguage) communication BGB <cr88192@hotmail.com> - 2012-02-11 23:03 -0700
Re: Interplatform (interprocess, interlanguage) communication Arne Vajhøj <arne@vajhoej.dk> - 2012-02-12 09:27 -0500
Re: Interplatform (interprocess, interlanguage) communication BGB <cr88192@hotmail.com> - 2012-02-12 13:33 -0700
Re: Interplatform (interprocess, interlanguage) communication Arne Vajhøj <arne@vajhoej.dk> - 2012-02-12 15:50 -0500
Re: Interplatform (interprocess, interlanguage) communication BGB <cr88192@hotmail.com> - 2012-02-12 14:34 -0700
Re: Interplatform (interprocess, interlanguage) communication Arne Vajhøj <arne@vajhoej.dk> - 2012-02-09 18:48 -0500
Re: Interplatform (interprocess, interlanguage) communication BGB <cr88192@hotmail.com> - 2012-02-09 21:46 -0700
Re: Interplatform (interprocess, interlanguage) communication Lew <lewbloch@gmail.com> - 2012-02-10 08:51 -0800
Re: Interplatform (interprocess, interlanguage) communication BGB <cr88192@hotmail.com> - 2012-02-10 10:43 -0700
Re: Interplatform (interprocess, interlanguage) communication Lew <lewbloch@gmail.com> - 2012-02-10 13:15 -0800
Re: Interplatform (interprocess, interlanguage) communication BGB <cr88192@hotmail.com> - 2012-02-10 14:50 -0700
Re: Interplatform (interprocess, interlanguage) communication Lew <lewbloch@gmail.com> - 2012-02-10 14:32 -0800
Re: Interplatform (interprocess, interlanguage) communication BGB <cr88192@hotmail.com> - 2012-02-10 17:10 -0700
Re: Interplatform (interprocess, interlanguage) communication Arved Sandstrom <asandstrom3minus1@eastlink.ca> - 2012-02-10 22:08 -0400
Re: Interplatform (interprocess, interlanguage) communication BGB <cr88192@hotmail.com> - 2012-02-11 00:49 -0700
Re: Interplatform (interprocess, interlanguage) communication Arved Sandstrom <asandstrom3minus1@eastlink.ca> - 2012-02-11 14:04 -0400
Re: Interplatform (interprocess, interlanguage) communication Arne Vajhøj <arne@vajhoej.dk> - 2012-02-11 14:55 -0500
Re: Interplatform (interprocess, interlanguage) communication Arne Vajhøj <arne@vajhoej.dk> - 2012-02-11 14:52 -0500
Re: Interplatform (interprocess, interlanguage) communication BGB <cr88192@hotmail.com> - 2012-02-11 20:06 -0700
Re: Interplatform (interprocess, interlanguage) communication Arne Vajhøj <arne@vajhoej.dk> - 2012-02-11 22:41 -0500
Re: Interplatform (interprocess, interlanguage) communication BGB <cr88192@hotmail.com> - 2012-02-12 00:46 -0700
Re: Interplatform (interprocess, interlanguage) communication Arne Vajhøj <arne@vajhoej.dk> - 2012-02-12 09:29 -0500
Re: Interplatform (interprocess, interlanguage) communication Arne Vajhøj <arne@vajhoej.dk> - 2012-02-12 09:31 -0500
Re: Interplatform (interprocess, interlanguage) communication Martin Gregorie <martin@address-in-sig.invalid> - 2012-02-12 16:02 +0000
Re: Interplatform (interprocess, interlanguage) communication Arne Vajhøj <arne@vajhoej.dk> - 2012-02-12 11:16 -0500
Re: Interplatform (interprocess, interlanguage) communication Martin Gregorie <martin@address-in-sig.invalid> - 2012-02-12 22:46 +0000
Re: Interplatform (interprocess, interlanguage) communication BGB <cr88192@hotmail.com> - 2012-02-12 11:33 -0700
Re: Interplatform (interprocess, interlanguage) communication Lew <lewbloch@gmail.com> - 2012-02-11 20:18 -0800
Re: Interplatform (interprocess, interlanguage) communication BGB <cr88192@hotmail.com> - 2012-02-12 01:36 -0700
Re: Interplatform (interprocess, interlanguage) communication Joshua Cranmer <Pidgeot18@verizon.invalid> - 2012-02-12 13:52 -0600
Re: Interplatform (interprocess, interlanguage) communication BGB <cr88192@hotmail.com> - 2012-02-12 14:43 -0700
Re: Interplatform (interprocess, interlanguage) communication Arne Vajhøj <arne@vajhoej.dk> - 2012-02-11 14:49 -0500
Re: Interplatform (interprocess, interlanguage) communication Arne Vajhøj <arne@vajhoej.dk> - 2012-02-09 18:46 -0500
Re: Interplatform (interprocess, interlanguage) communication Arne Vajhøj <arne@vajhoej.dk> - 2012-02-09 18:45 -0500
Re: Interplatform (interprocess, interlanguage) communication Lew <lewbloch@gmail.com> - 2012-02-08 14:02 -0800
Re: Interplatform (interprocess, interlanguage) communication BGB <cr88192@hotmail.com> - 2012-02-08 18:49 -0700
Re: Interplatform (interprocess, interlanguage) communication Arne Vajhøj <arne@vajhoej.dk> - 2012-02-08 21:14 -0500
Re: Interplatform (interprocess, interlanguage) communication Lew <lewbloch@gmail.com> - 2012-02-08 20:07 -0800
Re: Interplatform (interprocess, interlanguage) communication BGB <cr88192@hotmail.com> - 2012-02-08 23:29 -0700
Re: Interplatform (interprocess, interlanguage) communication Lew <lewbloch@gmail.com> - 2012-02-09 09:40 -0800
Re: Interplatform (interprocess, interlanguage) communication BGB <cr88192@hotmail.com> - 2012-02-09 17:02 -0700
Re: Interplatform (interprocess, interlanguage) communication BGB <cr88192@hotmail.com> - 2012-02-08 21:10 -0700
Re: Interplatform (interprocess, interlanguage) communication Arne Vajhøj <arne@vajhoej.dk> - 2012-02-09 18:54 -0500
Re: Interplatform (interprocess, interlanguage) communication BGB <cr88192@hotmail.com> - 2012-02-10 10:25 -0700
Re: Interplatform (interprocess, interlanguage) communication Arne Vajhøj <arne@vajhoej.dk> - 2012-02-11 14:45 -0500
Re: Interplatform (interprocess, interlanguage) communication Lew <lewbloch@gmail.com> - 2012-02-11 12:14 -0800
Re: Interplatform (interprocess, interlanguage) communication Arne Vajhøj <arne@vajhoej.dk> - 2012-02-11 15:20 -0500
Re: Interplatform (interprocess, interlanguage) communication BGB <cr88192@hotmail.com> - 2012-02-11 22:20 -0700
Re: Interplatform (interprocess, interlanguage) communication Arne Vajhøj <arne@vajhoej.dk> - 2012-02-12 09:23 -0500
Re: Interplatform (interprocess, interlanguage) communication BGB <cr88192@hotmail.com> - 2012-02-12 12:13 -0700
Re: Interplatform (interprocess, interlanguage) communication Arne Vajhøj <arne@vajhoej.dk> - 2012-02-07 20:24 -0500
Re: Interplatform (interprocess, interlanguage) communication Martin Gregorie <martin@address-in-sig.invalid> - 2012-02-08 01:31 +0000
Re: Interplatform (interprocess, interlanguage) communication BGB <cr88192@hotmail.com> - 2012-02-08 00:55 -0700
| From | BGB <cr88192@hotmail.com> |
|---|---|
| Date | 2012-02-10 10:25 -0700 |
| Message-ID | <jh3k0f$mbj$1@news.albasani.net> |
| In reply to | #11892 |
On 2/9/2012 4:54 PM, Arne Vajhøj wrote:
> On 2/8/2012 11:10 PM, BGB wrote:
>> On 2/8/2012 7:14 PM, Arne Vajhøj wrote:
>>> On 2/8/2012 8:49 PM, BGB wrote:
>>>> as noted, many people neither use schemas nor any sort of schema
>>>> validation. in many use-cases, schemas are overly constraining to the
>>>> ability of using XML to represent free-form data, or using them
>>>> otherwise would offer little particular advantage.
>>>
>>> xsd:any do provide some flexibility in schemas.
>>>
>>
>> yep, but one can wonder what is the gain of using a schema if one is
>> just going to use "xsd:any"?...
>
> You still have some structure.
>

probably.

>> it is also a mystery how well EXI behaves in this case (admittedly, I
>> have not personally looked into EXI in-depth, as I only briefly skimmed
>> over the spec a long time ago).
>
> No idea. But I would assume EXI supports what is valid XML and XSD.
>

yes, it is just that, IIRC, EXI uses the schema to know how to efficiently encode structures (values are directly coded), and falls back to a more naive strategy (describing the encoded tags) if the schema doesn't cover a given case.

admittedly, I am not entirely certain how EXI works, as I have only skimmed over the spec (would have to invest a bit more time in reading over it).

note: even in the worst case, the output will still likely be tiny vs textual XML.

more skimming...

sudden mystery: if the format is a bitstream, why are they apparently using a byte-aligned scheme for storing integers?... (the cost here is that one has to then re-align with the next byte boundary, potentially wasting on average several bits).

>>>> say, if one is using XML for compiler ASTs or similar (say, the XML is
>>>> used to represent a just-parsed glob of source-code), do they really
>>>> need any sort of schema?
>>>
>>> I would expect syntax trees to follow certain rules and not be free
>>> form.
>>>
>>
>> well, there are some rules, but the question is more if a schema or the
>> use of validation would offer much advantage to make using it worth the
>> bother?...
>
> Enforcing correctness of data is usually a good idea.
>

potentially, but checking against schemas isn't free.
depending on the application, it could be hard to justify spending the extra clock cycles (except maybe for debugging purposes or similar).

an issue with ASTs is that they come in several forms:
giant, like in the output of a C compiler, where many tasks tend towards "expensive" (it may take easily anywhere from 250ms-1500ms to shove all this stuff through the various compiler stages);
small, like in a script-language VM, where typically it is desirable that compile times still be fairly fast, since a major strength of scripting languages is trying to keep "eval" and similar fairly close to free.

granted, one could debate the sanity of using XML for ASTs in the first place, but this started originally as a historical accident in my case (I was writing an interpreter, and it was what I had on-hand, actually: I partly hacked an existing XML-RPC implementation into being a script interpreter...).

however, it doesn't seem to actually hurt performance too badly (ironically, in my C compiler, much more time goes into the preprocessor and tokenizer, which are far more efficient and more highly optimized).

side note: the C compiler doesn't use a standard DOM, but rather a highly specialized, but still DOM-like, system (and may still dump ASTs as text-form XML for debugging reasons). it involves, among other things, optimizations for numerical data (attributes may store numeric data directly, vs needing to use a string) and large hash-tables and chaining for look-ups, as well as specialized operations to reduce typing.

my current scripting VM, however, internally uses lists/s-expressions (note: they are neither AST compatible, nor will C code work effectively on my scripting VM). this was due to a later rewrite "switching over" (I was also reusing a lot of parts from a prior Scheme interpreter of mine for this one).

but, anyways, I am more left thinking schema-checking would probably make sense more when either some sort of security is a concern, or maybe when sending data "over the wire" between multiple parties.

inserting a schema check between one's parser and one's bytecode emitter doesn't seem nearly as compelling.

I guess, if a person really wanted, they could write a schema for the ASTs, but it is not clear how useful it would be to do so (since, generally, apart from someone mucking around with the compiler internals, there is little direct reason to know or care what is going on in there...).

or such...
| From | Arne Vajhøj <arne@vajhoej.dk> |
|---|---|
| Date | 2012-02-11 14:45 -0500 |
| Message-ID | <4f36c569$0$294$14726298@news.sunsite.dk> |
| In reply to | #11910 |
On 2/10/2012 12:25 PM, BGB wrote:
> On 2/9/2012 4:54 PM, Arne Vajhøj wrote:
>> On 2/8/2012 11:10 PM, BGB wrote:
>>> On 2/8/2012 7:14 PM, Arne Vajhøj wrote:
>>>> On 2/8/2012 8:49 PM, BGB wrote:
>>>>> say, if one is using XML for compiler ASTs or similar (say, the XML is
>>>>> used to represent a just-parsed glob of source-code), do they really
>>>>> need any sort of schema?
>>>>
>>>> I would expect syntax trees to follow certain rules and not be free
>>>> form.
>>>>
>>>
>>> well, there are some rules, but the question is more if a schema or the
>>> use of validation would offer much advantage to make using it worth the
>>> bother?...
>>
>> Enforcing correctness of data is usually a good idea.
>>
>
> potentially, but checking against schemas isn't free.
> depending on the application, it could be hard to justify spending the
> extra clock cycles (except maybe for debugging purposes or similar).

One of the points is that you can validate during integration test
and if you encounter a problem, but keep validation turned off otherwise.

And besides I would assume the big XML parser libraries to have
optimized the validation quite a bit.

Arne
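The "validate in integration tests, keep it off otherwise" idea can be sketched with the standard `javax.xml.validation` API. This is only an illustration, not anyone's actual setup from the thread: the toy schema, the class name `OptionalValidation`, and the `ast.validate` system property are all made up for the example.

```java
import java.io.IOException;
import java.io.StringReader;
import javax.xml.XMLConstants;
import javax.xml.transform.stream.StreamSource;
import javax.xml.validation.Schema;
import javax.xml.validation.SchemaFactory;
import org.xml.sax.SAXException;

public class OptionalValidation {
    // Hypothetical toy schema: an <ast> element holding one or more integer <num> children.
    static final String XSD =
        "<xs:schema xmlns:xs='http://www.w3.org/2001/XMLSchema'>"
      + "<xs:element name='ast'><xs:complexType><xs:sequence>"
      + "<xs:element name='num' type='xs:int' maxOccurs='unbounded'/>"
      + "</xs:sequence></xs:complexType></xs:element>"
      + "</xs:schema>";

    // Compile the schema once; a Schema object can be reused for many documents.
    static final Schema SCHEMA = loadSchema();

    static Schema loadSchema() {
        try {
            SchemaFactory factory = SchemaFactory.newInstance(XMLConstants.W3C_XML_SCHEMA_NS_URI);
            return factory.newSchema(new StreamSource(new StringReader(XSD)));
        } catch (SAXException e) {
            throw new IllegalStateException("toy schema failed to compile", e);
        }
    }

    // Returns true if the document is accepted. With validate == false the
    // schema check is skipped entirely, so it costs nothing at run time.
    static boolean accept(String xml, boolean validate) {
        if (!validate) {
            return true;
        }
        try {
            SCHEMA.newValidator().validate(new StreamSource(new StringReader(xml)));
            return true;
        } catch (SAXException | IOException e) {
            return false; // the default error handler throws on validation errors
        }
    }

    public static void main(String[] args) {
        // Assumed switch: run with -Dast.validate=true in test builds.
        boolean validate = Boolean.getBoolean("ast.validate");
        System.out.println(accept("<ast><num>1</num><num>2</num></ast>", validate));
        System.out.println(accept("<ast><num>oops</num></ast>", validate));
    }
}
```

Whether the check is cheap enough to leave on all the time is exactly the question being argued here; the sketch only shows that switching it on and off is a one-line decision.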
| From | Lew <lewbloch@gmail.com> |
|---|---|
| Date | 2012-02-11 12:14 -0800 |
| Message-ID | <14291890.498.1328991256328.JavaMail.geo-discussion-forums@pbr7> |
| In reply to | #11936 |
On Saturday, February 11, 2012 11:45:42 AM UTC-8, Arne Vajhøj wrote:
> On 2/10/2012 12:25 PM, BGB wrote:
> > On 2/9/2012 4:54 PM, Arne Vajhøj wrote:
> >> On 2/8/2012 11:10 PM, BGB wrote:
> >>> On 2/8/2012 7:14 PM, Arne Vajhøj wrote:
> >>>> On 2/8/2012 8:49 PM, BGB wrote:
> >>>>> say, if one is using XML for compiler ASTs or similar (say, the XML is
> >>>>> used to represent a just-parsed glob of source-code), do they really
> >>>>> need any sort of schema?
> >>>>
> >>>> I would expect syntax trees to follow certain rules and not be free
> >>>> form.
> >>>>
> >>>
> >>> well, there are some rules, but the question is more if a schema or the
> >>> use of validation would offer much advantage to make using it worth the
> >>> bother?...
> >>
> >> Enforcing correctness of data is usually a good idea.
> >>
> >
> > potentially, but checking against schemas isn't free.

Oh, yeah, micro-optimize that last $0.0000001 of performance.

Great thinking.

Checking against schemas isn't so expensive, either. You spout this drivel,
BGB, about "isn't free", but where are your numbers? Show us reality, dude -
exactly how "not free" is schema validation, under what loads, on what
platforms? Hm?

I thought not.

>> depending on the application, it could be hard to justify spending the
>> extra clock cycles (except maybe for debugging purposes or similar).
>

How many "extra clock cycles", and does it cost less than the damage your
development techniques cause?

> One of the points is that you can validate during integration test
> and if you encounter a problem but keep validation turned off otherwise.
>
> And besides I would assume the big XML parser libraries to have
> optimized the validation quite a bit.

Given that BGB is just spewing dream talk with zero or less than zero facts,
evidence or measurement behind it, it's pretty safe to dismiss his
"conclusions".

or such ...

--
Lew
| From | Arne Vajhøj <arne@vajhoej.dk> |
|---|---|
| Date | 2012-02-11 15:20 -0500 |
| Message-ID | <4f36cd93$0$289$14726298@news.sunsite.dk> |
| In reply to | #11944 |
On 2/11/2012 3:14 PM, Lew wrote:
> On Saturday, February 11, 2012 11:45:42 AM UTC-8, Arne Vajhøj wrote:
>> On 2/10/2012 12:25 PM, BGB wrote:
>>> On 2/9/2012 4:54 PM, Arne Vajhøj wrote:
>>>> On 2/8/2012 11:10 PM, BGB wrote:
>>>>> On 2/8/2012 7:14 PM, Arne Vajhøj wrote:
>>>>>> On 2/8/2012 8:49 PM, BGB wrote:
>>>>>>> say, if one is using XML for compiler ASTs or similar (say, the XML is
>>>>>>> used to represent a just-parsed glob of source-code), do they really
>>>>>>> need any sort of schema?
>>>>>>
>>>>>> I would expect syntax trees to follow certain rules and not be free
>>>>>> form.
>>>>>>
>>>>>
>>>>> well, there are some rules, but the question is more if a schema or the
>>>>> use of validation would offer much advantage to make using it worth the
>>>>> bother?...
>>>>
>>>> Enforcing correctness of data is usually a good idea.
>>>>
>>>
>>> potentially, but checking against schemas isn't free.
>
> Oh, yeah, micro-optimize that last $0.0000001 of performance.
>
> Great thinking.
>
> Checking against schemas isn't so expensive, either. You spout this drivel,
> BGB, about "isn't free", but where are your numbers? Show us reality, dude -
> exactly how "not free" is schema validation, under what loads, on what
> platforms? Hm?
>
> I thought not.
>
>>> depending on the application, it could be hard to justify spending the
>>> extra clock cycles (except maybe for debugging purposes or similar).
>>
>
> How many "extra clock cycles", and does it cost less than the damage your
> development techniques cause?
>
>> One of the points is that you can validate during integration test
>> and if you encounter a problem but keep validation turned off otherwise.
>>
>> And besides I would assume the big XML parser libraries to have
>> optimized the validation quite a bit.
>
> Given that BGB is just spewing dream talk with zero or less than zero facts,
> evidence or measurement behind it, it's pretty safe to dismiss his
> "conclusions".
>
> or such ...

In science you dismiss hypotheses by proving them wrong,
not by noting the lack of proof.

Arne
| From | BGB <cr88192@hotmail.com> |
|---|---|
| Date | 2012-02-11 22:20 -0700 |
| Message-ID | <jh7i91$tl2$1@news.albasani.net> |
| In reply to | #11946 |
On 2/11/2012 1:20 PM, Arne Vajhøj wrote:
> On 2/11/2012 3:14 PM, Lew wrote:
>> On Saturday, February 11, 2012 11:45:42 AM UTC-8, Arne Vajhøj wrote:
>>> On 2/10/2012 12:25 PM, BGB wrote:
>>>> On 2/9/2012 4:54 PM, Arne Vajhøj wrote:
>>>>> On 2/8/2012 11:10 PM, BGB wrote:
>>>>>> On 2/8/2012 7:14 PM, Arne Vajhøj wrote:
>>>>>>> On 2/8/2012 8:49 PM, BGB wrote:
>>>>>>>> say, if one is using XML for compiler ASTs or similar (say, the
>>>>>>>> XML is
>>>>>>>> used to represent a just-parsed glob of source-code), do they
>>>>>>>> really
>>>>>>>> need any sort of schema?
>>>>>>>
>>>>>>> I would expect syntax trees to follow certain rules and not be free
>>>>>>> form.
>>>>>>>
>>>>>>
>>>>>> well, there are some rules, but the question is more if a schema
>>>>>> or the
>>>>>> use of validation would offer much advantage to make using it
>>>>>> worth the
>>>>>> bother?...
>>>>>
>>>>> Enforcing correctness of data is usually a good idea.
>>>>>
>>>>
>>>> potentially, but checking against schemas isn't free.
>>
>> Oh, yeah, micro-optimize that last $0.0000001 of performance.
>>
>> Great thinking.
>>
>> Checking against schemas isn't so expensive, either. You spout this
>> drivel,
>> BGB, about "isn't free", but where are your numbers? Show us reality,
>> dude -
>> exactly how "not free" is schema validation, under what loads, on what
>> platforms? Hm?
>>
>> I thought not.
>>
>>>> depending on the application, it could be hard to justify spending the
>>>> extra clock cycles (except maybe for debugging purposes or similar).
>>>
>>
>> How many "extra clock cycles", and does it cost less than the damage your
>> development techniques cause?
>>
>>> One of the points is that you can validate during integration test
>>> and if you encounter a problem but keep validation turned off otherwise.
>>>
>>> And besides I would assume the big XML parser libraries to have
>>> optimized the validation quite a bit.
>>
>> Given that BGB is just spewing dream talk with zero or less than zero
>> facts,
>> evidence or measurement behind it, it's pretty safe to dismiss his
>> "conclusions".
>>
>> or such ...
>
> In science you dismiss hypothesis's based on proving them wrong
> not by noting the lack of proof.
>

yeah...

and anyways, I am not about "making conclusions" or "decreeing how things should be done" or anything, rather, my view is there may be a time and place for everything (and whatever is or is not the case can be decided on a case-by-case basis or similar, based on whatever may apply in the particular case in question, and whichever options may be cheaper or more expensive, and similar).

IMHO, the idea that a person "should" always do things the same way in every situation is itself arguably questionable. likewise goes for beliefs that something is universally required or universally prohibited, ...

[ decided to leave out most of the rest of what I wrote. ]

basically, it all amounted to the frustration that there is little point in trying to "prove" something which ultimately comes to little more than "hair splitting over a few percentage points...".

the thing is... textual XML is kind of bulky, but doing damn near anything to it (like running it through deflate) will significantly reduce its size (say, to around 10-25% of its original size).

one can outperform this with specialized formats, but at this point it is worrying about a few percentage points +/-.

what is the point of "proving" something which is ultimately of a fairly limited significance and scope?...

maybe one can try to "prove" that people "should" actually give a crap.

or, for that matter, finding a particular claim to disprove (say, that X is always true or always false). this is rarely the case with data compression, as it is typically more about averages, and likewise, there are cases for which the data may actually get bigger (about the only real "absolute" in data compression is something commonly known as the "Shannon limit").

secondarily is the "law of diminishing returns" (itself a natural result of the Shannon limit), where essentially the compressibility of a piece of data will form a sort of curve, and any (lossless) algorithms will fall somewhere along this curve, and typically with a fairly consistent ordering (say, for example, LZMA tends to compress better than BZip2 which tends to compress better than Deflate/GZip).

one can look at how each algorithm works internally, or experiment with how they can use the basic parts to build other things or achieve interesting results (and note mostly that the parts themselves tend to fall along these sorts of curves, reducing "compression" mostly to a matter of "going mix and match" with various parts and making cost/benefit tradeoffs between particular combinations of parts).

note that going further along the curve tends to become increasingly costly, hence why tradeoffs need to be made.

but, ultimately, how much something is relevant will itself tend to depend somewhat on context.
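For anyone who wants numbers rather than estimates, the deflate claim is easy to try with `java.util.zip`. The following is a rough sketch, not a benchmark: the class name `XmlDeflateDemo` and the generated AST-like sample document are invented for the example, and a highly repetitive sample like this one will compress better than typical real-world XML, so the printed ratio says little about the 10-25% figure above.

```java
import java.io.ByteArrayOutputStream;
import java.nio.charset.StandardCharsets;
import java.util.zip.DataFormatException;
import java.util.zip.Deflater;
import java.util.zip.Inflater;

public class XmlDeflateDemo {
    // Builds a made-up, repetitive AST-like XML document for the experiment.
    static String sampleXml() {
        StringBuilder sb = new StringBuilder("<ast>");
        for (int i = 0; i < 200; i++) {
            sb.append("<binary op=\"+\"><num value=\"").append(i)
              .append("\"/><num value=\"").append(i + 1).append("\"/></binary>");
        }
        return sb.append("</ast>").toString();
    }

    // Compresses the input with Deflate at the highest compression level.
    static byte[] deflate(byte[] input) {
        Deflater deflater = new Deflater(Deflater.BEST_COMPRESSION);
        deflater.setInput(input);
        deflater.finish();
        ByteArrayOutputStream out = new ByteArrayOutputStream();
        byte[] buf = new byte[4096];
        while (!deflater.finished()) {
            out.write(buf, 0, deflater.deflate(buf));
        }
        deflater.end();
        return out.toByteArray();
    }

    // Decompresses data whose original length is known in advance.
    static byte[] inflate(byte[] data, int originalLength) {
        Inflater inflater = new Inflater();
        inflater.setInput(data);
        byte[] out = new byte[originalLength];
        int off = 0;
        try {
            while (!inflater.finished() && off < out.length) {
                off += inflater.inflate(out, off, out.length - off);
            }
        } catch (DataFormatException e) {
            throw new IllegalStateException("corrupt deflate stream", e);
        } finally {
            inflater.end();
        }
        return out;
    }

    public static void main(String[] args) {
        byte[] raw = sampleXml().getBytes(StandardCharsets.UTF_8);
        byte[] packed = deflate(raw);
        System.out.printf("raw=%d bytes, deflated=%d bytes (%.1f%%)%n",
                raw.length, packed.length, 100.0 * packed.length / raw.length);
    }
}
```

Swapping in real documents, and comparing against a specialized binary encoding, would be the way to see how many percentage points are actually at stake.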
| From | Arne Vajhøj <arne@vajhoej.dk> |
|---|---|
| Date | 2012-02-12 09:23 -0500 |
| Message-ID | <4f37cb55$0$281$14726298@news.sunsite.dk> |
| In reply to | #11961 |
On 2/12/2012 12:20 AM, BGB wrote:
> On 2/11/2012 1:20 PM, Arne Vajhøj wrote:
>> On 2/11/2012 3:14 PM, Lew wrote:
>>> On Saturday, February 11, 2012 11:45:42 AM UTC-8, Arne Vajhøj wrote:
>>>> One of the points is that you can validate during integration test
>>>> and if you encounter a problem but keep validation turned off
>>>> otherwise.
>>>>
>>>> And besides I would assume the big XML parser libraries to have
>>>> optimized the validation quite a bit.
>>>
>>> Given that BGB is just spewing dream talk with zero or less than zero
>>> facts,
>>> evidence or measurement behind it, it's pretty safe to dismiss his
>>> "conclusions".
>>>
>>> or such ...
>>
>> In science you dismiss hypothesis's based on proving them wrong
>> not by noting the lack of proof.
>>
>
> yeah...
>
> and anyways, I am not about "making conclusions" or "decreeing how
> things should be done" or anything, rather, my view is there may be a
> time and place for everything (and whatever is or is not the case can be
> decided on a case-by-case basis or similar, based on whatever may apply
> in the particular case in question, and whichever options may be cheaper
> or more expensive, and similar).
>
> IMHO, the idea that a person "should" always do things the same way in
> every situation is itself arguably questionable. likewise goes for a
> beliefs that something is universally required or universally
> prohibited, ...

...

> but, ultimately, how much something is relevant will itself tend to
> depend somewhat on context.

The fact that there are exceptions to most rules should not lead to
a perception that rules do not matter.

You should strive to go by the rules and only very reluctantly go
for the exception if it is really needed.

Arne
| From | BGB <cr88192@hotmail.com> |
|---|---|
| Date | 2012-02-12 12:13 -0700 |
| Message-ID | <jh931v$q33$1@news.albasani.net> |
| In reply to | #11974 |
On 2/12/2012 7:23 AM, Arne Vajhøj wrote:
> On 2/12/2012 12:20 AM, BGB wrote:
>> On 2/11/2012 1:20 PM, Arne Vajhøj wrote:
>>> On 2/11/2012 3:14 PM, Lew wrote:
>>>> On Saturday, February 11, 2012 11:45:42 AM UTC-8, Arne Vajhøj wrote:
>>>>> One of the points is that you can validate during integration test
>>>>> and if you encounter a problem but keep validation turned off
>>>>> otherwise.
>>>>>
>>>>> And besides I would assume the big XML parser libraries to have
>>>>> optimized the validation quite a bit.
>>>>
>>>> Given that BGB is just spewing dream talk with zero or less than zero
>>>> facts,
>>>> evidence or measurement behind it, it's pretty safe to dismiss his
>>>> "conclusions".
>>>>
>>>> or such ...
>>>
>>> In science you dismiss hypothesis's based on proving them wrong
>>> not by noting the lack of proof.
>>>
>>
>> yeah...
>>
>> and anyways, I am not about "making conclusions" or "decreeing how
>> things should be done" or anything, rather, my view is there may be a
>> time and place for everything (and whatever is or is not the case can be
>> decided on a case-by-case basis or similar, based on whatever may apply
>> in the particular case in question, and whichever options may be cheaper
>> or more expensive, and similar).
>>
>> IMHO, the idea that a person "should" always do things the same way in
>> every situation is itself arguably questionable. likewise goes for a
>> beliefs that something is universally required or universally
>> prohibited, ...
> ...
>> but, ultimately, how much something is relevant will itself tend to
>> depend somewhat on context.
>
> The fact that there is exceptions to most rules should not lead to
> a perception that rules does not matter.
>
> You should strive to go by the rules and only very reluctant go
> for the exception if it is really needed.
>

possible.

others may go for an "all is allowed in programming, so long as it works ok and gets the job done" mindset.

whether or not rules are followed may in turn depend on an evaluation of whether or not the rules work in one's favor.

so, on one hand: well, I can follow this rule, and get certain desirable effects.
or, it may also work out as: this rule is stupid and inconvenient, I am not going to bother following it.
or maybe: the existing rule is stupid/inconvenient/..., so I am going to make up my own rules and follow them instead.

this does not necessarily mean making a standard of non-standard, as some piece of standardized technology (formally, or de-facto, it really doesn't matter) may itself carry desirable benefits.

as well noted, PNGs and JPEGs are an example of this: they allow compatibility with existing applications which use these formats, etc, ...

so, although one could devise their own graphics format (I have done so before), using it may turn out to be so incredibly inconvenient for everyone involved that using it is ultimately not worth the bother.

likewise, in the everyday world, breaking laws may lead in turn to the police breaking down one's door, and breaking moral and ethical rules may lead to various other consequences (do bad things and bad things may follow in turn).

so, all this doesn't give a person license to do "whatever they want, whenever they want", because the rules of cost/benefit will prevent this (too many costs in these cases, defeating the benefits).

likewise, making a standard of non-standard, though not inherently bad, would likely end up being overly costly (in terms of use or maintenance or whatever else).

but, I am not going to try to list all of the costs and benefits one might encounter or how one may weigh them, as there are too many and how much each may apply in a given situation is itself prone to vary.
| From | Arne Vajhøj <arne@vajhoej.dk> |
|---|---|
| Date | 2012-02-07 20:24 -0500 |
| Message-ID | <4f31ced9$0$282$14726298@news.sunsite.dk> |
| In reply to | #11835 |
On 2/7/2012 6:38 PM, BGB wrote:
> On 2/7/2012 11:11 AM, jebblue wrote:
>> On Fri, 03 Feb 2012 19:52:08 +0000, Stefan Ram wrote:
>>> »X« below is another language than Java, for example,
>>> VBA, C#, or C.
>>>
>>> When an X process and a Java process have to exchange information on
>>> the same computer, what possibilites are there? The Java process
>>> should act as a client, sending commands to the X process and also
>>> wants to read answers from the X process. So, the X process is a kind
>>> of server.
>>>
>>> My criteria are: reliability and it should not be extremely slow (say
>>> exchanging a string should not take more than about 10 ms). The main
>>> criterion is reliability.
>>>
>>
>>> Sockets
>>>
>>> This is slightly less transparent than files, but has the advantage
>>> that it becomes very easy to have the two processes running on
>>> different computers later, if this should ever be required. Debugging
>>> should be possible by a man-in-the-middle proxy that prints all
>>> information it sees or by connecting to the server with a terminal.
>>>
>>
>> I recommend using sockets.
>
> in general, I agree (sockets generally make the most sense),

> another issue (besides how to pass messages), is what sort of form to
> pass messages in.
>
> usually, in my case, if storing data in files, I tend to prefer
> ASCII-based formats.
>
> usually, for passing messages over sockets, I have used "compact"
> specialized binary formats, typically serialized data from some other
> form (such as XML nodes or S-Expressions). although "magic byte value"
> based message formats are initially simpler, they tend to be harder to
> expand later (whereas encoding/decoding some more generic form, though
> initially more effort, can turn out to be easier to maintain and extend
> later).

If you want compact and text go for JSON.

Arne
| From | Martin Gregorie <martin@address-in-sig.invalid> |
|---|---|
| Date | 2012-02-08 01:31 +0000 |
| Message-ID | <jgsj9c$sl6$2@localhost.localdomain> |
| In reply to | #11835 |
On Tue, 07 Feb 2012 16:38:31 -0700, BGB wrote:

> in general, I agree (sockets generally make the most sense), although
> there are cases where file-based communications can make sense, although
> probably not in the form as described in the OP.
>
Yes, for small amounts of data or message passing between processes I
tend to like sockets - as others have said, the fact that they are
agnostic about the location of the communicating processes is often very
useful.

> usually, for passing messages over sockets, I have used "compact"
> specialized binary formats,
>
Yep. ASN.1 has to be about the most compact way of encoding structured,
multi-field messages with XML occupying the other end of the scale.

That said, for short, list of fields messages I often use a CSV string
preceded by an unsigned binary byte value containing the string length:
this type of message is both easy to transfer, even if the connection
wants to fragment it during transmission, and by having a printable text
payload, it's also convenient for troubleshooting.

--
martin@   | Martin Gregorie
gregorie. | Essex, UK
org       |
| From | BGB <cr88192@hotmail.com> |
|---|---|
| Date | 2012-02-08 00:55 -0700 |
| Message-ID | <jgt9s7$f7i$1@news.albasani.net> |
| In reply to | #11839 |
On 2/7/2012 6:31 PM, Martin Gregorie wrote:
> On Tue, 07 Feb 2012 16:38:31 -0700, BGB wrote:
>
>> in general, I agree (sockets generally make the most sense), although
>> there are cases where file-based communications can make sense, although
>> probably not in the form as described in the OP.
>>
> Yes, for small amounts of data or message passing between processes I
> tend to like sockets - as others have said, the fact that they are
> agnostic about the location of the communicating processes is often very
> useful.
>
yep.
>> usually, for passing messages over sockets, I have used "compact"
>> specialized binary formats,
>>
> Yep. ASN.1 has to be about the most compact way of encoding structured,
> multi-field messages with XML occupying the other end of the scale.
>
I disagree partly WRT ASN.1:
a disadvantage of ASN.1 is that a lot of times it tends to use
fixed-width integer encodings (and often sends structures in a
"reasonably raw" form), whereas one can shave more bytes using a
variable-length-integer scheme (why encode an integer in 4 bytes if you
only need 1 byte in a given case?). it is also possible to shave more
bytes if one makes the format use an adaptive/context-sensitive encoding
scheme and maybe a variant of Huffman coding or similar (and possibly
encode integer values using a similar scheme to that used in Deflate).
it is in fact not particularly difficult to outperform ASN.1 in these
regards.
granted, yes, custom Huffman-based data encodings are probably not "the
norm" for network protocols (though some programs, such as the Quake 3
engine, have used Huffman-compressed network protocols).
there is also "arithmetic coding" and "range coding", but with these it
is a lot harder to make the codec be acceptably fast (whereas there are
some tricks to allow optimizing Huffman codecs).
in cases where I have used XML, I have typically used a custom binary
XML variant, which can greatly reduce the overhead vs textual XML. in
terms of saving bytes, my encoding can be more compact than WBXML or
XML+Deflate, but is arguably more "esoteric", and as-is doesn't make use
of schemas (it is instead a basic adaptive coding, and is vaguely
similar to an LZ-Markov coding, attempting to exploit repeating patterns
in tag-structure and similar via prediction, but like most adaptive
codings initially transmits the data in a less dense form as it needs to
build up a new context for each message). the coding in question doesn't
use Huffman coding (for sake of simplicity, and because I don't always
particularly need "maximum compactness"), but a Huffman-based variant
could be created if needed.
there is also EXI, but I don't know how my encoding compares (EXI
probably does better though, given that IIRC it uses binary universal
codes and schemas).
for something else of mine I am using S-Expression based messages
(currently between components within the same process), and had
considered using a vaguely similar binary coding if/when I get around to it.
> That said, for short, list of fields messages I often use a CSV string
> preceded by an unsigned binary byte value containing the string length:
> this type of message is both easy to transfer, even if the connection
> wants to fragment it during transmission, and by having a printable text
> payload, it's also convenient for troubleshooting.
>
yes, this is possible.
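The length-prefixed CSV message quoted above could be sketched roughly as follows. This is only an illustration: the class name `CsvFrame`, the ASCII charset, and the 255-byte limit implied by a single length byte are assumptions, not details given in the thread.

```java
import java.io.DataInputStream;
import java.io.IOException;
import java.io.InputStream;
import java.io.OutputStream;
import java.nio.charset.StandardCharsets;

/** Hypothetical helper: one unsigned length byte followed by a printable CSV payload. */
final class CsvFrame {
    /** Writes one frame; the payload must fit in 255 bytes. */
    static void write(OutputStream out, String csv) throws IOException {
        byte[] payload = csv.getBytes(StandardCharsets.US_ASCII);
        if (payload.length > 255) {
            throw new IllegalArgumentException("payload exceeds 255 bytes");
        }
        out.write(payload.length);   // low 8 bits are written as the length byte
        out.write(payload);
    }

    /** Reads one frame, blocking until the whole payload has arrived. */
    static String read(InputStream in) throws IOException {
        DataInputStream data = new DataInputStream(in);
        int length = data.readUnsignedByte();   // EOFException if the stream has ended
        byte[] payload = new byte[length];
        data.readFully(payload);                // keeps reading across fragmented deliveries
        return new String(payload, StandardCharsets.US_ASCII);
    }
}
```

Because the reader knows the payload length up front, it does not matter whether the socket delivers the frame in one piece or several; `readFully` simply loops until the count is satisfied.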
also possible would be a TLV encoding (say, possibly doing something
similar to the Matroska MKV file-format).
say, the integer values are encoded something like (range, encoding):
0-127            0xxxxxxx
128-16383        10xxxxxx xxxxxxxx
16384-2097151    110xxxxx xxxxxxxx xxxxxxxx
2097152-...      ...
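A minimal sketch of this variable-length integer layout, assuming the pattern simply continues to a 4-byte form (`1110xxxx` plus three bytes) and that the payload bits are stored most significant first. The class name `Vli` and the 4-byte cap are choices made for the example only.

```java
/** Hypothetical codec for the prefix-coded integers in the table above. */
final class Vli {
    /** Encodes 0 <= value < 2^28 in 1 to 4 bytes. */
    static byte[] encode(int value) {
        if (value < 0 || value >= (1 << 28)) {
            throw new IllegalArgumentException("value out of range: " + value);
        }
        // Each extra byte adds 7 usable bits: 7, 14, 21, 28.
        int length = 1;
        while (length < 4 && value >= (1 << (7 * length))) {
            length++;
        }
        byte[] out = new byte[length];
        for (int i = length - 1; i >= 0; i--) {
            out[i] = (byte) value;
            value >>>= 8;
        }
        // Prefix: (length - 1) one bits followed by a zero bit.
        out[0] |= (byte) ((0xFF << (9 - length)) & 0xFF);
        return out;
    }

    /** Decodes one value starting at buf[offset]. */
    static int decode(byte[] buf, int offset) {
        int first = buf[offset] & 0xFF;
        int length = 1;
        while (length < 4 && (first & (0x80 >> (length - 1))) != 0) {
            length++;
        }
        int value = first & (0xFF >> length);   // strip the prefix bits
        for (int i = 1; i < length; i++) {
            value = (value << 8) | (buf[offset + i] & 0xFF);
        }
        return value;
    }

    public static void main(String[] args) {
        for (int v : new int[] {5, 300, 70000}) {
            byte[] enc = encode(v);
            System.out.println(v + " -> " + enc.length + " byte(s), decodes to " + decode(enc, 0));
        }
    }
}
```

The decoder learns the total length from the first byte alone, which is what makes this form convenient for stream parsing.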
likewise, one can get a signed variant by folding the sign into the LSB,
forming a pattern like: 0, -1, 1, -2, 2, ...
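The sign folding can be done with two shifts and an XOR. A small sketch, assuming 32-bit values; the class and method names are made up for the example. Unsigned codes 0, 1, 2, 3, 4 correspond to 0, -1, 1, -2, 2 as in the pattern above.

```java
/** Hypothetical helper that folds the sign of an int into its lowest bit. */
final class SignFold {
    /** Maps 0, -1, 1, -2, 2, ... to 0, 1, 2, 3, 4, ... */
    static int fold(int n) {
        return (n << 1) ^ (n >> 31);   // arithmetic shift yields 0 or all ones
    }

    /** Inverse of fold. */
    static int unfold(int u) {
        return (u >>> 1) ^ -(u & 1);
    }

    public static void main(String[] args) {
        for (int n = -2; n <= 2; n++) {
            System.out.println(n + " <-> " + fold(n));
        }
    }
}
```

The folded result is then written with the unsigned scheme, so values near zero stay short regardless of sign.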
then, one defines tags as:
{
VLI tag;
VLI length;
byte data[length];
}
where tags can hold either data or messages (and, the smallest tag size
needs 2 bytes, or 3 bytes if one has 1 byte of payload for the tag).
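A rough sketch of such a tag record, under a simplifying assumption: only tags and lengths below 128 are handled, so each VLI field takes exactly one byte (the longer forms are left out for brevity). The class name `Tlv` is hypothetical.

```java
import java.io.ByteArrayOutputStream;
import java.util.Arrays;

/** Hypothetical tag/length/value record; a payload may itself hold nested records. */
final class Tlv {
    final int tag;
    final byte[] data;

    Tlv(int tag, byte[] data) {
        this.tag = tag;
        this.data = data;
    }

    /** Serializes as: tag, length, payload. */
    byte[] toBytes() {
        if (tag < 0 || tag > 127 || data.length > 127) {
            throw new IllegalArgumentException("sketch handles only 1-byte VLI fields");
        }
        ByteArrayOutputStream out = new ByteArrayOutputStream();
        out.write(tag);
        out.write(data.length);
        out.write(data, 0, data.length);
        return out.toByteArray();
    }

    /** Parses one record starting at buf[offset]. */
    static Tlv parse(byte[] buf, int offset) {
        int tag = buf[offset];
        int length = buf[offset + 1];
        if (tag < 0 || length < 0) {
            throw new IllegalArgumentException("multi-byte VLI not handled in this sketch");
        }
        int start = offset + 2;
        if (start + length > buf.length) {
            throw new IllegalArgumentException("truncated record");
        }
        return new Tlv(tag, Arrays.copyOfRange(buf, start, start + length));
    }

    public static void main(String[] args) {
        Tlv child = new Tlv(2, new byte[] {42});
        Tlv parent = new Tlv(1, child.toBytes());   // a message tag wrapping a data tag
        System.out.println("child bytes: " + child.toBytes().length
                + ", parent bytes: " + parent.toBytes().length);
    }
}
```

An empty record serializes to 2 bytes and a record with 1 byte of payload to 3, matching the sizes mentioned above.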
if the length is optional (presence depends on tag), one can reduce the
typical tag size to 1 byte. likewise, tags can be combined with an
MTF/MRU scheme such that any recently used tags have a small value (and
can thus be encoded in a single byte). (many of my formats define tags
inline, rather than relying on some large hard-coded tag-list).
more bytes can be saved if more of the message structure is known, say
that not only does the tag encode a particular tag-type, but also may
carry information about what follows after it (various combinations of
attributes, and if it contains sub-tags and what they might be, ...).
if a new tag is defined, it is added to the MRU, but if not used
frequently may move "backwards" (towards higher index numbers) or
eventually be forgotten (falls off the end of the list).
note that some hard-coded tag-numbers will be needed for basic control
purposes (encoding new/unfamiliar tags, ...).
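One way the MRU tag list might look, as a sketch only. It assumes plain move-to-front ordering and a fixed capacity; `TagMru`, `use`, and the `NOT_FOUND` value are names invented here. A `NOT_FOUND` result is where an encoder would emit one of the hard-coded control tags followed by the tag's definition.

```java
import java.util.ArrayList;
import java.util.List;

/** Hypothetical move-to-front list of recently used tag names. */
final class TagMru {
    static final int NOT_FOUND = -1;

    private final List<String> tags = new ArrayList<String>();
    private final int capacity;

    TagMru(int capacity) {
        if (capacity < 1) {
            throw new IllegalArgumentException("capacity must be at least 1");
        }
        this.capacity = capacity;
    }

    /**
     * Returns the index the name had before this call (0 = most recent),
     * or NOT_FOUND if it was unknown, then moves the name to the front.
     */
    int use(String name) {
        int index = tags.indexOf(name);
        if (index >= 0) {
            tags.remove(index);
        } else if (tags.size() == capacity) {
            tags.remove(tags.size() - 1);   // the least recently used tag is forgotten
        }
        tags.add(0, name);
        return index;
    }

    public static void main(String[] args) {
        TagMru mru = new TagMru(4);
        for (String tag : new String[] {"point", "x", "y", "point", "x"}) {
            System.out.println(tag + " -> " + mru.use(tag));
        }
    }
}
```

Encoder and decoder have to apply exactly the same updates in the same order, otherwise their index numbers drift apart.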
a Huffman-based variant could be similar, just one may encode integers
differently. an example scheme is to use a prefix value (Huffman coded)
and a suffix bit pattern (similar to Deflate). a simpler (but less
compact) scheme was used in JPEG, and IIRC I had before "compromised"
between them by having the Huffman table be stored using Rice codes.
example (prefix range, value range, suffix bits):
0-15     0-15         0
16-23    16-31        1
24-31    32-63        2
32-39    64-127       3
40-47    128-255      4
48-55    256-511      5
56-63    512-1023     6
64-71    1024-2047    7
72-79    2048-4095    8
80-87    4096-8191    9
...
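The split into prefix and suffix can be computed directly. The sketch below assumes the rule implied by the first rows: prefixes 0-15 stand for themselves, and after that each block of 8 prefixes carries one more suffix bit than the block before it, so it spans twice as many values. Only the split is shown; Huffman coding of the prefix symbol is not. The class name `PrefixSuffix` is hypothetical.

```java
/** Hypothetical helper that splits an integer into a prefix symbol and raw suffix bits. */
final class PrefixSuffix {
    /** Returns {prefix, suffixBitCount, suffixValue} for value >= 0. */
    static int[] split(int value) {
        if (value < 0) {
            throw new IllegalArgumentException("negative value: " + value);
        }
        if (value < 16) {
            return new int[] {value, 0, 0};
        }
        int k = 31 - Integer.numberOfLeadingZeros(value);   // position of the top set bit
        int bits = k - 3;                                    // 16-31 has 1 bit, 32-63 has 2, ...
        int offset = value - (1 << k);                       // position inside the block
        int prefix = 8 * (k - 2) + (offset >> bits);
        return new int[] {prefix, bits, offset & ((1 << bits) - 1)};
    }

    /** Inverse of split: rebuilds the value from a prefix and its suffix bits. */
    static int join(int prefix, int suffix) {
        if (prefix < 16) {
            return prefix;
        }
        int bits = ((prefix - 16) >> 3) + 1;
        return (1 << (bits + 3)) + ((prefix & 7) << bits) + suffix;
    }

    public static void main(String[] args) {
        for (int v : new int[] {7, 16, 31, 100, 200}) {
            int[] p = split(v);
            System.out.println(v + " -> prefix " + p[0] + ", " + p[1] + " suffix bit(s), suffix " + p[2]);
        }
    }
}
```

Under this rule the value 128 maps to prefix 40 with 4 suffix bits, and 31 maps to prefix 23 with 1 suffix bit.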
also note that a nifty thing (also used in Deflate) is to compress the
Huffman table itself using Huffman coding.
likewise, one can save a few bytes if the encoder is smart enough to
recognize when tags encode numeric data (mostly specific to XML, with
S-Expressions or similar one knows when they are dealing with numeric data).
likewise, one can encode floats as a pair of integer values (although
floats present a few of their own complexities). one can also devise
special encodings for things like numeric vectors, quaternions, ... if
needed as well.
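One possible reading of "a pair of integer values" is an integer mantissa plus a power-of-two exponent. The sketch below assumes exactly that for Java doubles; NaN, infinities, and the sign of negative zero are among the complexities it does not cover. The class name `FloatPair` is made up for the example.

```java
/** Hypothetical helper: represents a finite double as mantissa * 2^exponent. */
final class FloatPair {
    /** Returns {mantissa, exponent}; trailing zero bits are moved into the exponent. */
    static long[] split(double value) {
        if (Double.isNaN(value) || Double.isInfinite(value)) {
            throw new IllegalArgumentException("only finite values are handled");
        }
        long bits = Double.doubleToLongBits(value);
        long fraction = bits & ((1L << 52) - 1);
        int biased = (int) ((bits >>> 52) & 0x7FF);
        // Normal numbers have an implicit leading 1 bit; subnormals do not.
        long mantissa = (biased == 0) ? fraction : (fraction | (1L << 52));
        int exponent = (biased == 0 ? 1 : biased) - 1075;
        if (mantissa == 0) {
            return new long[] {0, 0};   // the sign of -0.0 is lost here
        }
        int shift = Long.numberOfTrailingZeros(mantissa);
        mantissa >>= shift;
        exponent += shift;
        if (bits < 0) {
            mantissa = -mantissa;
        }
        return new long[] {mantissa, exponent};
    }

    /** Rebuilds the double from the pair. */
    static double join(long mantissa, long exponent) {
        return Math.scalb((double) mantissa, (int) exponent);
    }

    public static void main(String[] args) {
        long[] pair = split(0.75);
        System.out.println("0.75 = " + pair[0] + " * 2^" + pair[1]);
    }
}
```

Both members of the pair are small signed integers for "simple" values such as 0.75 or 2.5, so they fit the variable-length integer forms well; values like 0.1 still need the full 53-bit mantissa.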
likewise, either an LZ77 or LZ-Markov scheme can be used for encoding
strings (an example would be to use a fixed-size rotating window like
in Deflate, and essentially using the same basic encoding for strings,
albeit likely with the use of an "End-Of-String" marker).
say (range, meaning):
0-255: literal byte values
258: End Of String
259-321: LZ77 Run (encodes length, followed by window offset).
String encoding would be used, say, for encoding both literal text, and
also for escaping things like tag and attribute names.
...
the main variability is mostly in terms of the type of payload being
transmitted:
be it XML-based, S-Expression based, or potentially object-based
(similar to either JSON, or a sort of "heap pickling" style system).
for most structured data, it shouldn't be needed to change the
"fundamentals" too much. the main difference is between tree-structured
and heap-like / graph-structured data, as graph-structured data is often
better sent as a flat list of objects with a certain entry being a "root
node" than as a tree (this can be accomplished either by building a
list, or using an algorithm to detect and break-up cycles when needed).
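The "flat list with a root entry" approach could look something like this. It is a sketch with an invented `Node` type; objects are numbered in breadth-first order by identity, so cycles and shared references turn into plain index numbers and the root is always entry 0.

```java
import java.util.ArrayList;
import java.util.IdentityHashMap;
import java.util.List;
import java.util.Map;

/** Hypothetical flattener for a possibly cyclic object graph. */
final class GraphFlattener {
    /** Illustrative node type: a label plus references that may form cycles. */
    static final class Node {
        final String label;
        final List<Node> refs = new ArrayList<Node>();
        Node(String label) { this.label = label; }
    }

    /** One flat entry: the label and the list indices of the referenced nodes. */
    static final class Entry {
        final String label;
        final List<Integer> refIndices = new ArrayList<Integer>();
        Entry(String label) { this.label = label; }
    }

    /** Flattens everything reachable from root; the root becomes entry 0. */
    static List<Entry> flatten(Node root) {
        Map<Node, Integer> indexOf = new IdentityHashMap<Node, Integer>();
        List<Node> order = new ArrayList<Node>();
        indexOf.put(root, 0);
        order.add(root);
        // 'order' grows while it is walked, which gives a breadth-first traversal.
        for (int i = 0; i < order.size(); i++) {
            for (Node ref : order.get(i).refs) {
                if (!indexOf.containsKey(ref)) {
                    indexOf.put(ref, order.size());
                    order.add(ref);
                }
            }
        }
        List<Entry> entries = new ArrayList<Entry>();
        for (Node node : order) {
            Entry entry = new Entry(node.label);
            for (Node ref : node.refs) {
                entry.refIndices.add(indexOf.get(ref));
            }
            entries.add(entry);
        }
        return entries;
    }
}
```

Each entry can then be written as an ordinary record, with the reference indices encoded as small integers.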
granted, for most use-cases something like this is likely to be overkill.
or such...