Groups > comp.lang.java.programmer > #20864 > unrolled thread
| Started by | "Aryeh M. Friedman" <Aryeh.Friedman@gmail.com> |
|---|---|
| First post | 2013-01-02 00:20 -0800 |
| Last post | 2013-01-02 19:54 -0500 |
| Articles | 20 on this page of 41 — 7 participants |
question on java lang spec chapter 3.3 (unicode char lexing) "Aryeh M. Friedman" <Aryeh.Friedman@gmail.com> - 2013-01-02 00:20 -0800
Re: question on java lang spec chapter 3.3 (unicode char lexing) "Aryeh M. Friedman" <Aryeh.Friedman@gmail.com> - 2013-01-02 00:24 -0800
Re: question on java lang spec chapter 3.3 (unicode char lexing) Patricia Shanahan <pats@acm.org> - 2013-01-02 12:24 -0800
Re: question on java lang spec chapter 3.3 (unicode char lexing) Lew <lewbloch@gmail.com> - 2013-01-02 11:16 -0800
Re: question on java lang spec chapter 3.3 (unicode char lexing) Arne Vajhøj <arne@vajhoej.dk> - 2013-01-02 19:55 -0500
Re: question on java lang spec chapter 3.3 (unicode char lexing) Lew <lewbloch@gmail.com> - 2013-01-02 17:21 -0800
Re: question on java lang spec chapter 3.3 (unicode char lexing) Arne Vajhøj <arne@vajhoej.dk> - 2013-01-02 20:40 -0500
Re: question on java lang spec chapter 3.3 (unicode char lexing) Roedy Green <see_website@mindprod.com.invalid> - 2013-01-02 11:17 -0800
Re: question on java lang spec chapter 3.3 (unicode char lexing) Arne Vajhøj <arne@vajhoej.dk> - 2013-01-02 19:56 -0500
Re: question on java lang spec chapter 3.3 (unicode char lexing) "Aryeh M. Friedman" <Aryeh.Friedman@gmail.com> - 2013-01-02 17:27 -0800
Re: question on java lang spec chapter 3.3 (unicode char lexing) "Aryeh M. Friedman" <Aryeh.Friedman@gmail.com> - 2013-01-02 17:32 -0800
Re: question on java lang spec chapter 3.3 (unicode char lexing) Lew <lewbloch@gmail.com> - 2013-01-02 17:42 -0800
Re: question on java lang spec chapter 3.3 (unicode char lexing) "Aryeh M. Friedman" <Aryeh.Friedman@gmail.com> - 2013-01-02 17:55 -0800
Re: question on java lang spec chapter 3.3 (unicode char lexing) "Aryeh M. Friedman" <Aryeh.Friedman@gmail.com> - 2013-01-02 18:02 -0800
Re: question on java lang spec chapter 3.3 (unicode char lexing) Arne Vajhøj <arne@vajhoej.dk> - 2013-01-02 21:12 -0500
Re: question on java lang spec chapter 3.3 (unicode char lexing) "Aryeh M. Friedman" <Aryeh.Friedman@gmail.com> - 2013-01-02 18:16 -0800
Re: question on java lang spec chapter 3.3 (unicode char lexing) Arne Vajhøj <arne@vajhoej.dk> - 2013-01-02 21:20 -0500
Re: question on java lang spec chapter 3.3 (unicode char lexing) "Aryeh M. Friedman" <Aryeh.Friedman@gmail.com> - 2013-01-02 18:22 -0800
Re: question on java lang spec chapter 3.3 (unicode char lexing) Arne Vajhøj <arne@vajhoej.dk> - 2013-01-02 21:26 -0500
Re: question on java lang spec chapter 3.3 (unicode char lexing) "Aryeh M. Friedman" <Aryeh.Friedman@gmail.com> - 2013-01-02 18:27 -0800
Re: question on java lang spec chapter 3.3 (unicode char lexing) Arne Vajhøj <arne@vajhoej.dk> - 2013-01-02 21:46 -0500
Re: question on java lang spec chapter 3.3 (unicode char lexing) "Aryeh M. Friedman" <Aryeh.Friedman@gmail.com> - 2013-01-02 20:41 -0800
Re: question on java lang spec chapter 3.3 (unicode char lexing) Arne Vajhøj <arne@vajhoej.dk> - 2013-01-06 21:54 -0500
Re: question on java lang spec chapter 3.3 (unicode char lexing) Arne Vajhøj <arne@vajhoej.dk> - 2013-01-02 21:15 -0500
Re: question on java lang spec chapter 3.3 (unicode char lexing) "Aryeh M. Friedman" <Aryeh.Friedman@gmail.com> - 2013-01-02 18:20 -0800
Re: question on java lang spec chapter 3.3 (unicode char lexing) Arne Vajhøj <arne@vajhoej.dk> - 2013-01-02 21:17 -0500
Re: question on java lang spec chapter 3.3 (unicode char lexing) Patricia Shanahan <pats@acm.org> - 2013-01-02 22:33 -0800
Re: question on java lang spec chapter 3.3 (unicode char lexing) "Chris Uppal" <chris.uppal@metagnostic.REMOVE-THIS.org> - 2013-01-05 12:58 +0000
Re: question on java lang spec chapter 3.3 (unicode char lexing) "Aryeh M. Friedman" <Aryeh.Friedman@gmail.com> - 2013-01-05 05:34 -0800
Re: question on java lang spec chapter 3.3 (unicode char lexing) "Aryeh M. Friedman" <Aryeh.Friedman@gmail.com> - 2013-01-05 05:40 -0800
Re: question on java lang spec chapter 3.3 (unicode char lexing) Arne Vajhøj <arne@vajhoej.dk> - 2013-01-06 21:56 -0500
Re: question on java lang spec chapter 3.3 (unicode char lexing) Martin Gregorie <martin@address-in-sig.invalid> - 2013-01-03 21:14 +0000
Re: question on java lang spec chapter 3.3 (unicode char lexing) Arne Vajhøj <arne@vajhoej.dk> - 2013-01-03 17:51 -0500
Re: question on java lang spec chapter 3.3 (unicode char lexing) "Aryeh M. Friedman" <Aryeh.Friedman@gmail.com> - 2013-01-03 20:54 -0800
Re: question on java lang spec chapter 3.3 (unicode char lexing) Martin Gregorie <martin@address-in-sig.invalid> - 2013-01-05 00:15 +0000
Re: question on java lang spec chapter 3.3 (unicode char lexing) "Chris Uppal" <chris.uppal@metagnostic.REMOVE-THIS.org> - 2013-01-05 13:03 +0000
Re: question on java lang spec chapter 3.3 (unicode char lexing) "Aryeh M. Friedman" <Aryeh.Friedman@gmail.com> - 2013-01-05 05:25 -0800
Re: question on java lang spec chapter 3.3 (unicode char lexing) Arne Vajhøj <arne@vajhoej.dk> - 2013-01-06 21:49 -0500
Re: question on java lang spec chapter 3.3 (unicode char lexing) "Aryeh M. Friedman" <Aryeh.Friedman@gmail.com> - 2013-01-06 23:26 -0800
Re: question on java lang spec chapter 3.3 (unicode char lexing) Arne Vajhøj <arne@vajhoej.dk> - 2013-01-06 21:44 -0500
Re: question on java lang spec chapter 3.3 (unicode char lexing) Arne Vajhøj <arne@vajhoej.dk> - 2013-01-02 19:54 -0500
| From | Arne Vajhøj <arne@vajhoej.dk> |
|---|---|
| Date | 2013-01-02 21:46 -0500 |
| Message-ID | <50e4f0f7$0$284$14726298@news.sunsite.dk> |
| In reply to | #20907 |
On 1/2/2013 9:27 PM, Aryeh M. Friedman wrote:
> On Wednesday, January 2, 2013 9:26:03 PM UTC-5, Arne Vajhøj wrote:
>> On 1/2/2013 9:22 PM, Aryeh M. Friedman wrote:
>>> an other requirement not satisfied by any IDE we have found is
>>> the ability to lay the source tree out in such a way that it can be
>>> compiled without the IDE (a requirement for almost all our projects
>>> because none of our clients have IDE's and in almost all cases there
>>> are minor changes needed to make the code happy on their site that
>>> make testing impossible on the development machine)
>>
>> The Java IDE's I know put code in a structure that fits
>> java tools, ant and maven.
>
> And in almost any non-trivial case this is completely incorrect...

Given that a big part (my estimate: 80-90%!) of all Java applications
are built like this:
- developers use an IDE and check in to a VCS
- the build process checks out from the VCS and uses ant/maven to build
then it has to be correct.

> even though I love Java as a lang I have a serious issue with some of
> the attitudes/assumptions made by tools... namely the universe does
> not revolve around the JVM

I find it natural that tools developed for Java development are the
best for Java development and tools developed for C development are
the best for C development and ... PHP ... Python ... etc.

Arne
| From | "Aryeh M. Friedman" <Aryeh.Friedman@gmail.com> |
|---|---|
| Date | 2013-01-02 20:41 -0800 |
| Message-ID | <d4cd0273-9399-4353-8e44-53e03f59162a@googlegroups.com> |
| In reply to | #20909 |
On Wednesday, January 2, 2013 9:46:12 PM UTC-5, Arne Vajhøj wrote:
> On 1/2/2013 9:27 PM, Aryeh M. Friedman wrote:
>> On Wednesday, January 2, 2013 9:26:03 PM UTC-5, Arne Vajhøj wrote:
>>> On 1/2/2013 9:22 PM, Aryeh M. Friedman wrote:
>>>> an other requirement not satisfied by any IDE we have found is
>>>> the ability to lay the source tree out in such a way that it can be
>>>> compiled without the IDE (a requirement for almost all our projects
>>>> because none of our clients have IDE's and in almost all cases there
>>>> are minor changes needed to make the code happy on their site that
>>>> make testing impossible on the development machine)
>>>
>>> The Java IDE's I know put code in a structure that fits
>>> java tools, ant and maven.
>>
>> And in almost any non-trivial case this is completely incorrect...
>
> Given that a big part (my estimate: 80-90%!) of all Java applications
> are built like this:
> - developers use an IDE and check in to a VCS
> - the build process checks out from the VCS and uses ant/maven to build
> then it has to be correct.

Correct in what sense? Passing its own tests? If that is the case, aegis is the *ONLY* VCS that actually requires this before a checkin. The idea there is that the baseline (repo in most other VCS's jargon) is guaranteed to be working (as defined above). Namely, every modification is *NEW* [see note], atomic in regard to new functionality, and *MUST* be accompanied by automated tests (it is possible to turn this off, but for obvious reasons that is not recommended unless the change is essentially untestable, like documentation updates).

>> even though I love Java as a lang I have a serious issue with some of
>> the attitudes/assumptions made by tools... namely the universe does
>> not revolve around the JVM
>
> I find it natural that tools developed for Java development are the
> best for Java development and tools developed for C development are
> the best for C development and ... PHP ... Python ... etc.

Most real world projects (unless they are part of a larger effort) have several components/languages (for us, for example, it is typical to have an HTML/CSS/JS component and a Java/"JSP" component [I am defining "JSP" a little loosely because we often need to support more than just web front-ends]... it is also common for us to have some native code accessed via a JNLP wrapper)...

Note: There is a slight mismatch between aegis's requirements in this regard and how xUnit-like frameworks work. We typically solve this by reusing the same test script but requiring that the total number of passes be at least one larger than in the previous change.
| From | Arne Vajhøj <arne@vajhoej.dk> |
|---|---|
| Date | 2013-01-06 21:54 -0500 |
| Message-ID | <50ea38f0$0$282$14726298@news.sunsite.dk> |
| In reply to | #20912 |
On 1/2/2013 11:41 PM, Aryeh M. Friedman wrote:
> On Wednesday, January 2, 2013 9:46:12 PM UTC-5, Arne Vajhøj wrote:
>> On 1/2/2013 9:27 PM, Aryeh M. Friedman wrote:
>>> On Wednesday, January 2, 2013 9:26:03 PM UTC-5, Arne Vajhøj wrote:
>>>> On 1/2/2013 9:22 PM, Aryeh M. Friedman wrote:
>>>>> an other requirement not satisfied by any IDE we have found is
>>>>> the ability to lay the source tree out in such a way that it can be
>>>>> compiled without the IDE (a requirement for almost all our projects
>>>>> because none of our clients have IDE's and in almost all cases there
>>>>> are minor changes needed to make the code happy on their site that
>>>>> make testing impossible on the development machine)
>>>> The Java IDE's I know put code in a structure that fits
>>>> java tools, ant and maven.
>>> And in almost any non-trivial case this is completely incorrect...
>>
>> Given that a big part (my estimate: 80-90%!) of all Java applications
>> are built like this:
>> - developers use an IDE and check in to a VCS
>> - the build process checks out from the VCS and uses ant/maven to build
>> then it has to be correct.
>
> Correct in what sense?

Same sense as you used incorrect!

>>> even though I love Java as a lang I have a serious issue with some of
>>> the attitudes/assumptions made by tools... namely the universe does
>>> not revolve around the JVM
>>
>> I find it natural that tools developed for Java development are the
>> best for Java development and tools developed for C development are
>> the best for C development and ... PHP ... Python ... etc.
>
> Most real world projects (unless they are part of a larger effort) have
> several components/languages (for us, for example, it is typical to
> have an HTML/CSS/JS component and a Java/"JSP" component [I am
> defining "JSP" a little loosely because we often need to support more
> than just web front-ends]... it is also common for us to have some
> native code accessed via a JNLP wrapper)...

(JNI wrapper??)

Eclipse and NetBeans can support all those languages.

But if you have enough work in each language, then one IDE for the
HTML/CSS/JS and another for the C/C++ could make sense.

You may want to use ant for the Java stuff and make for the C/C++
stuff. But ant can call make and make can call ant, so they can be
integrated.

Arne
| From | Arne Vajhøj <arne@vajhoej.dk> |
|---|---|
| Date | 2013-01-02 21:15 -0500 |
| Message-ID | <50e4e9d0$0$287$14726298@news.sunsite.dk> |
| In reply to | #20897 |
On 1/2/2013 8:55 PM, Aryeh M. Friedman wrote:
> On Wednesday, January 2, 2013 8:42:57 PM UTC-5, Lew wrote:
>> On Wednesday, January 2, 2013 5:27:21 PM UTC-8, Aryeh M. Friedman
>>> A long term personal project of mine is to write a OS completely
>>> from the ground up in a super set of Java (the only addition I
>>> see that is needed is some type of "safe" pointer type)... in
>>> this case safe being defined as you can assign a literal address
>>> to it but your not allowed to do ptr math on it
>>
>> Also known as "a JVM"?
>
> As far I know the JVM can not be directly booted (as in if I turn on
> my PC it can not boot into the JVM)...

The JVM was not created for writing OS'es but for writing
applications so that is correct.

> neither for performance
> reasons does it make sense to run a VM at the bottom layer...

Performance should not be a problem.

> an
> other reason is there is a lot of junk in the JRE (like how do you do
> garbage collection if you do not have some way of the OS allocating
> mem to a process in the first place)

Again. Java was designed to write applications not OS'es.

The "junk" you are talking about is what makes it useful
for the big majority.

Arne
| From | "Aryeh M. Friedman" <Aryeh.Friedman@gmail.com> |
|---|---|
| Date | 2013-01-02 18:20 -0800 |
| Message-ID | <1c8ea3fc-155c-471c-b7ba-6e3fb6f6c71d@googlegroups.com> |
| In reply to | #20900 |
> The JVM was not created for writing OS'es but for writing
> applications so that is correct.

In my professional life that's how I use Java; the comment only pertained to the motivation for writing a native compiler (which is for fun).
| From | Arne Vajhøj <arne@vajhoej.dk> |
|---|---|
| Date | 2013-01-02 21:17 -0500 |
| Message-ID | <50e4ea1f$0$287$14726298@news.sunsite.dk> |
| In reply to | #20891 |
On 1/2/2013 8:27 PM, Aryeh M. Friedman wrote:
> the ground up in a super set of Java (the only addition I see that is
> needed is some type of "safe" pointer type)... in this case safe
> being defined as you can assign a literal address to it but your not
> allowed to do ptr math on it

See

http://en.wikipedia.org/wiki/Singularity_%28operating_system%29

for an example of an OS written in a managed language.

Arne
| From | Patricia Shanahan <pats@acm.org> |
|---|---|
| Date | 2013-01-02 22:33 -0800 |
| Message-ID | <NYqdnSF1y-Gvu3jNnZ2dnUVZ_qmdnZ2d@earthlink.com> |
| In reply to | #20891 |
On 1/2/2013 5:27 PM, Aryeh M. Friedman wrote:
>
>>
>> Well - since he is writing a lexer for Java then ...
>
> A little more on the project... while the over all project *IS* for
> fun a few components may find there way into more serious work
> related projects but only to be used on code written by me or others
> on my team... specifically we may use the lexing/parsing component to
> make the following tools (the actual code generation/etc. of the
> compilation is currently purely fun [see note]):

If you intend the lexing/parsing component for production work, you
should deal with the escapes. You may someday want to import code that
was, for example, edited using a keyboard that did not support all the
characters that were needed, or where the programmer wanted to be
absolutely sure that a particular Unicode character was used.

You would at least need to detect the escapes to get a usable error
message. Once you have done that, it is so easy to replace each escape
with the equivalent Unicode character that it is not worth doing
anything else.

Patricia
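The detect-and-replace step Patricia describes can be sketched roughly as below. This is a minimal sketch, assuming the JLS 3.3 rules read as follows: a backslash starts an escape only when it is preceded by an even number of contiguous raw backslashes, any number of u characters may follow it, exactly four hex digits are then required, and the character an escape produces is never re-scanned. The names UnicodeEscapes and translate are hypothetical; this is not the lexer discussed in the thread and not javac's implementation.

```java
/**
 * Sketch of the JLS 3.3 translation step: each Unicode escape (a backslash,
 * one or more 'u', four hex digits) is replaced by the UTF-16 code unit it
 * denotes. Class and method names are illustrative only.
 */
public final class UnicodeEscapes {
    private UnicodeEscapes() {}

    /** Returns raw with every eligible Unicode escape replaced. */
    public static String translate(String raw) {
        StringBuilder out = new StringBuilder(raw.length());
        int run = 0; // contiguous raw backslashes immediately before index i
        int i = 0;
        while (i < raw.length()) {
            char c = raw.charAt(i);
            if (c != '\\') {
                out.append(c);
                run = 0;
                i++;
            } else if (run % 2 == 0 && i + 1 < raw.length() && raw.charAt(i + 1) == 'u') {
                int j = i + 1;
                while (j < raw.length() && raw.charAt(j) == 'u') {
                    j++; // any number of u characters is allowed
                }
                if (j + 4 > raw.length()) {
                    throw new IllegalArgumentException("truncated Unicode escape at index " + i);
                }
                int value = 0;
                for (int k = j; k < j + 4; k++) {
                    value = value * 16 + hexValue(raw.charAt(k), i);
                }
                out.append((char) value); // not re-scanned, even if it is a backslash
                run = 0; // the last raw character consumed was a hex digit
                i = j + 4;
            } else {
                out.append(c); // ordinary backslash, kept as-is
                run++;
                i++;
            }
        }
        return out.toString();
    }

    /** ASCII-only hex digit value; anything else is reported as an error. */
    private static int hexValue(char c, int escapeStart) {
        if (c >= '0' && c <= '9') return c - '0';
        if (c >= 'a' && c <= 'f') return c - 'a' + 10;
        if (c >= 'A' && c <= 'F') return c - 'A' + 10;
        throw new IllegalArgumentException("malformed Unicode escape at index " + escapeStart);
    }
}
```

Throwing on a malformed escape is one way to get the "usable error message" mentioned above; a real lexer would more likely record a diagnostic with a line and column and carry on.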
| From | "Chris Uppal" <chris.uppal@metagnostic.REMOVE-THIS.org> |
|---|---|
| Date | 2013-01-05 12:58 +0000 |
| Message-ID | <WZ6dnS6Bc7o4vnXNnZ2dnUVZ8kOdnZ2d@bt.com> |
| In reply to | #20915 |
Patricia Shanahan wrote:
> You would at least need to detect the escapes to get a usable error
> message. Once you have done that, it is so easy to replace each escape
> with the equivalent Unicode character that it is not worth doing
> anything else.
I'm not so sure about that. IIRC the rules about interpreting Unicode escapes
have some seriously weird convolutions. Something to do with protecting against
multiply-encoded files, I think. It badly fails the Principle of Least WTF.
It's in the spec, but I'm too lazy to go find the exact reference :-(
-- chris
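One convolution Chris may be recalling: JLS 3.3 allows more than one u in an escape so that source can be pushed through ASCII-only tools and restored afterwards. Existing escapes gain an extra u, and each non-ASCII character becomes a new escape with a single u. Below is a minimal sketch of that forward transformation; the names AsciiTransform and toAscii are hypothetical, and the sketch assumes a non-ASCII character never directly follows an odd run of backslashes (in that case the new escape would not be eligible and the round trip would break).

```java
/**
 * Sketch of the "Unicode source to ASCII" transformation described in
 * JLS 3.3. Names are illustrative only.
 */
public final class AsciiTransform {
    private AsciiTransform() {}

    public static String toAscii(String source) {
        StringBuilder out = new StringBuilder(source.length());
        int run = 0; // contiguous backslashes immediately before index i
        for (int i = 0; i < source.length(); i++) {
            char c = source.charAt(i);
            if (c == '\\') {
                out.append(c);
                boolean startsEscape = run % 2 == 0
                        && i + 1 < source.length()
                        && source.charAt(i + 1) == 'u';
                if (startsEscape) {
                    out.append('u'); // extra u marks an escape that was already there
                }
                run++;
            } else {
                run = 0;
                if (c > 0x7F) {
                    // non-ASCII code unit: new escape with a single u
                    out.append(String.format("\\u%04x", (int) c));
                } else {
                    out.append(c);
                }
            }
        }
        return out.toString();
    }
}
```

The reverse direction would drop one u from every escape that has several and turn single-u escapes back into characters, which is why the spec has to tolerate runs of u in the first place.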
| From | "Aryeh M. Friedman" <Aryeh.Friedman@gmail.com> |
|---|---|
| Date | 2013-01-05 05:34 -0800 |
| Message-ID | <70744efd-9848-42ef-944f-dcd667f75045@googlegroups.com> |
| In reply to | #20979 |
On Saturday, January 5, 2013 7:58:57 AM UTC-5, Chris Uppal wrote:
> Patricia Shanahan wrote:
>
>> You would at least need to detect the escapes to get a usable error
>> message. Once you have done that, it is so easy to replace each escape
>> with the equivalent Unicode character that it is not worth doing
>> anything else.
>
> I'm not so sure about that. IIRC the rules about interpreting Unicode escapes
> have some seriously weird convolutions. Something to do with protecting against
> multiply-encoded files, I think. It badly fails the Principle of Least WTF.
>
> It's in the spec, but I'm too lazy to go find the exact reference :-(
>
> -- chris

Agreed. For example, the following is just ugly but perfectly valid Java code:

Foo.java:
\u0070\u0075\u0062\u006C\u0069\u0063\u0020\u0063\u006C\u0061\u0073\u0073\u0020\u0046\u006F\u006F\u000A\u007B\u000A\u0009\u0070\u0075\u0062\u006C\u0069\u0063\u0020\u0073\u0074\u0061\u0074\u0069\u0063\u0020\u0076\u006F\u0069\u0064\u0020\u006D\u0061\u0069\u006E\u0028\u0053\u0074\u0072\u0069\u006E\u0067\u005B\u005D\u0020\u0061\u0072\u0067\u0073\u0029\u000A\u0009\u007B\u000A\u0009\u0009\u0053\u0079\u0073\u0074\u0065\u006D\u002E\u006F\u0075\u0074\u002E\u0070\u0072\u0069\u006E\u0074\u006C\u006E\u0028\u0022\u0068\u0065\u006C\u006C\u006F\u002C\u0020\u0077\u006F\u0072\u006C\u0064\u0022\u0029\u003B\u000A\u0009\u007D\u000A\u007D\u000A

% javac Foo.java
% java Foo
hello, world
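A file like the Foo.java above can be produced mechanically. The sketch below shows one way, assuming each UTF-16 code unit is written as a backslash, one u, and four uppercase hex digits, as in the post. The names EscapeEverything and escapeAll are hypothetical and are not claimed to be how the original file was made.

```java
/**
 * Sketch: rewrite a piece of Java source so that every character is a
 * Unicode escape. Names are illustrative only.
 */
public final class EscapeEverything {
    private EscapeEverything() {}

    /** Encodes every UTF-16 code unit of source as a backslash-u escape. */
    public static String escapeAll(String source) {
        StringBuilder out = new StringBuilder(source.length() * 6);
        for (int i = 0; i < source.length(); i++) {
            // %04X: four uppercase hex digits, zero padded
            out.append(String.format("\\u%04X", (int) source.charAt(i)));
        }
        return out.toString();
    }
}
```

Applied to the word public, this yields the first six escapes of the posted file; line breaks and tabs become escapes too, which is why the whole program fits on a single physical line.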
| From | "Aryeh M. Friedman" <Aryeh.Friedman@gmail.com> |
|---|---|
| Date | 2013-01-05 05:40 -0800 |
| Message-ID | <aaabd6d4-87ba-4522-8953-25bca99d1ccd@googlegroups.com> |
| In reply to | #20985 |
On Saturday, January 5, 2013 8:34:38 AM UTC-5, Aryeh M. Friedman wrote:
> On Saturday, January 5, 2013 7:58:57 AM UTC-5, Chris Uppal wrote:
>> Patricia Shanahan wrote:
>>
>>> You would at least need to detect the escapes to get a usable error
>>> message. Once you have done that, it is so easy to replace each escape
>>> with the equivalent Unicode character that it is not worth doing
>>> anything else.
>>
>> I'm not so sure about that. IIRC the rules about interpreting Unicode escapes
>> have some seriously weird convolutions. Something to do with protecting against
>> multiply-encoded files, I think. It badly fails the Principle of Least WTF.
>>
>> It's in the spec, but I'm too lazy to go find the exact reference :-(
>>
>> -- chris
>
> Agreed. For example, the following is just ugly but perfectly valid Java code:
>
> Foo.java:
> \u0070\u0075\u0062\u006C\u0069\u0063\u0020\u0063\u006C\u0061\u0073\u0073\u0020\u0046\u006F\u006F\u000A\u007B\u000A\u0009\u0070\u0075\u0062\u006C\u0069\u0063\u0020\u0073\u0074\u0061\u0074\u0069\u0063\u0020\u0076\u006F\u0069\u0064\u0020\u006D\u0061\u0069\u006E\u0028\u0053\u0074\u0072\u0069\u006E\u0067\u005B\u005D\u0020\u0061\u0072\u0067\u0073\u0029\u000A\u0009\u007B\u000A\u0009\u0009\u0053\u0079\u0073\u0074\u0065\u006D\u002E\u006F\u0075\u0074\u002E\u0070\u0072\u0069\u006E\u0074\u006C\u006E\u0028\u0022\u0068\u0065\u006C\u006C\u006F\u002C\u0020\u0077\u006F\u0072\u006C\u0064\u0022\u0029\u003B\u000A\u0009\u007D\u000A\u007D\u000A
>
> % javac Foo.java
> % java Foo
> hello, world

Just a quick note: I did end up implementing Unicode escapes the way JLSv3 says to, and the above is one of our test inputs...
| From | Arne Vajhøj <arne@vajhoej.dk> |
|---|---|
| Date | 2013-01-06 21:56 -0500 |
| Message-ID | <50ea395e$0$282$14726298@news.sunsite.dk> |
| In reply to | #20985 |
On 1/5/2013 8:34 AM, Aryeh M. Friedman wrote:
> Agreed. For example, the following is just ugly but perfectly valid Java code:
>
> Foo.java:
> \u0070\u0075\u0062\u006C\u0069\u0063\u0020\u0063\u006C\u0061\u0073\u0073\u0020\u0046\u006F\u006F\u000A\u007B\u000A\u0009\u0070\u0075\u0062\u006C\u0069\u0063\u0020\u0073\u0074\u0061\u0074\u0069\u0063\u0020\u0076\u006F\u0069\u0064\u0020\u006D\u0061\u0069\u006E\u0028\u0053\u0074\u0072\u0069\u006E\u0067\u005B\u005D\u0020\u0061\u0072\u0067\u0073\u0029\u000A\u0009\u007B\u000A\u0009\u0009\u0053\u0079\u0073\u0074\u0065\u006D\u002E\u006F\u0075\u0074\u002E\u0070\u0072\u0069\u006E\u0074\u006C\u006E\u0028\u0022\u0068\u0065\u006C\u006C\u006F\u002C\u0020\u0077\u006F\u0072\u006C\u0064\u0022\u0029\u003B\u000A\u0009\u007D\u000A\u007D\u000A
>
> % javac Foo.java
> % java Foo
> hello, world

:-)

It is one of those features that can certainly be misused.

Arne
| From | Martin Gregorie <martin@address-in-sig.invalid> |
|---|---|
| Date | 2013-01-03 21:14 +0000 |
| Message-ID | <kc4sbi$4at$2@localhost.localdomain> |
| In reply to | #20888 |
On Wed, 02 Jan 2013 19:56:13 -0500, Arne Vajhøj wrote:
> On 1/2/2013 2:17 PM, Roedy Green wrote:
>> On Wed, 2 Jan 2013 00:20:12 -0800 (PST), "Aryeh M. Friedman"
>> <Aryeh.Friedman@gmail.com> wrote, quoted or indirectly quoted someone
>> who said :
>>
>>> (\uXXXX)
>>
>> The only places you encounter such escapes are in Java source and
>> possibly resource bundles.
>
> Well - since he is writing a lexer for Java then ...
>
...which, being lazy, I would not do from scratch.

Instead, I'd use the Java version of the Coco/R package, which generates
the lexer and parser as Java source within a framework. Unlike some
similar tools, you're almost encouraged to rewrite the framework to suit
your requirements. This is quite short and written in standard Java, so
modifying it is very easy.

--
martin@   | Martin Gregorie
gregorie. | Essex, UK
org       |
| From | Arne Vajhøj <arne@vajhoej.dk> |
|---|---|
| Date | 2013-01-03 17:51 -0500 |
| Message-ID | <50e60b8f$0$282$14726298@news.sunsite.dk> |
| In reply to | #20929 |
On 1/3/2013 4:14 PM, Martin Gregorie wrote:
> On Wed, 02 Jan 2013 19:56:13 -0500, Arne Vajhøj wrote:
>
>> On 1/2/2013 2:17 PM, Roedy Green wrote:
>>> On Wed, 2 Jan 2013 00:20:12 -0800 (PST), "Aryeh M. Friedman"
>>> <Aryeh.Friedman@gmail.com> wrote, quoted or indirectly quoted someone
>>> who said :
>>>
>>>> (\uXXXX)
>>>
>>> The only places you encounter such escapes are in Java source and
>>> possibly resource bundles.
>>
>> Well - since he is writing a lexer for Java then ...
>>
> ...which, being lazy, I would not do from scratch.
>
> Instead, I'd use the Java version of the Coco/R package, which generates
> the lexer and parser as Java source within a framework. Unlike some
> similar tools, you're almost encouraged to rewrite the framework to suit
> your requirements. This is quite short and written in standard Java, so
> modifying it is very easy.

Good point.

Arne
| From | "Aryeh M. Friedman" <Aryeh.Friedman@gmail.com> |
|---|---|
| Date | 2013-01-03 20:54 -0800 |
| Message-ID | <f3060a14-b802-48f2-8251-a50c77fd7e67@googlegroups.com> |
| In reply to | #20933 |
On Thursday, January 3, 2013 5:51:55 PM UTC-5, Arne Vajhøj wrote:
> On 1/3/2013 4:14 PM, Martin Gregorie wrote:
>> On Wed, 02 Jan 2013 19:56:13 -0500, Arne Vajhøj wrote:
>>
>>> On 1/2/2013 2:17 PM, Roedy Green wrote:
>>>> On Wed, 2 Jan 2013 00:20:12 -0800 (PST), "Aryeh M. Friedman"
>>>> <Aryeh.Friedman@gmail.com> wrote, quoted or indirectly quoted someone
>>>> who said :
>>>>
>>>>> (\uXXXX)
>>>>
>>>> The only places you encounter such escapes are in Java source and
>>>> possibly resource bundles.
>>>
>>> Well - since he is writing a lexer for Java then ...
>>>
>> ...which, being lazy, I would not do from scratch.
>>
>> Instead, I'd use the Java version of the Coco/R package, which generates
>> the lexer and parser as Java source within a framework. Unlike some
>> similar tools, you're almost encouraged to rewrite the framework to suit
>> your requirements. This is quite short and written in standard Java, so
>> modifying it is very easy.
>
> Good point.
>
> Arne

The only issue is likely a philosophical one, in that I have *NEVER* trusted code generators of any kind: they either produce impossible to follow/debug code or have all kinds of fluff in them (the classic example in my mind [HTML, which is not really a programming lang ;-)] is Dreamweaver, which produces 75 lines of HTML for "hello, world").
| From | Martin Gregorie <martin@address-in-sig.invalid> |
|---|---|
| Date | 2013-01-05 00:15 +0000 |
| Message-ID | <kc7rb5$uah$1@localhost.localdomain> |
| In reply to | #20940 |
On Thu, 03 Jan 2013 20:54:09 -0800, Aryeh M. Friedman wrote:
> The only issue is likely a philosophical one in that I have *NEVER*
> trusted code generators of any kind they either produce impossible to
> follow/debug code or have all kinds of fluff in them (the classic
> example in my mind [html which is not really a programming lang ;-)] is
> Dreamweaver that produces 75 lines of HTML for "hello, world").
>
Just saying. Try it. Look at the generated code. Use it or not. Your choice.

If you've used Lex and YACC (or Flex and Bison) the learning curve is short.

--
martin@   | Martin Gregorie
gregorie. | Essex, UK
org       |
| From | "Chris Uppal" <chris.uppal@metagnostic.REMOVE-THIS.org> |
|---|---|
| Date | 2013-01-05 13:03 +0000 |
| Message-ID | <lYKdnXBDefeeu3XNnZ2dnUVZ8r6dnZ2d@bt.com> |
| In reply to | #20940 |
Aryeh M. Friedman wrote:
> The only issue is likely a philosophical one in that I have *NEVER*
> trusted code generators of any kind
So you don't care for compilers ?
;-)
-- chris
P.S. Seriously: the point of classic compiler generators (or
"compiler-compilers" as they were often called) is to produce code that works
and that runs fast in little space. It is not /AT ALL/ a design principle that
the code should be comprehensible to humans -- in fact for the kinds of
algorithms they use, there is no way the resulting code and tables could be
remotely comprehensible (to an ordinary programmer), that is /why/ we use code
generators.
| From | "Aryeh M. Friedman" <Aryeh.Friedman@gmail.com> |
|---|---|
| Date | 2013-01-05 05:25 -0800 |
| Message-ID | <af9ec6de-5eb4-49e3-a2b0-99805fcc48ab@googlegroups.com> |
| In reply to | #20981 |
On Saturday, January 5, 2013 8:03:00 AM UTC-5, Chris Uppal wrote:
> Aryeh M. Friedman wrote:
>
>> The only issue is likely a philosophical one in that I have *NEVER*
>> trusted code generators of any kind
>
> So you don't care for compilers ?
>
> ;-)
>
> -- chris
>
> P.S. Seriously: the point of classic compiler generators (or
> "compiler-compilers" as they were often called) are to produce code that works
> and that runs fast in little space. It is not /AT ALL/ a design principle that
> the code should be comprehensible to humans -- in fact for the kinds of
> algorithms they use, there is no way the resulting code and tables could be
> remotely comprehensible (to an ordinary programmer), that is /why/ we use code
> generators.

Machine code was never meant to be readable, but high level languages can and should be ;-).... On the serious side of the debate, there are reasons for shying away from code generators in my case that are currently proprietary (some of the lesser results will likely be FOSS'ed though)... the main reason is we need to (in some cases) deal with multiple languages in the same compilation unit and have developed fairly good (at least in theory, and my "fun work" is really nothing more than a proof of concept, without the pressure of deadlines and such, with Java as a typical non-trivial language to work with from the compiler POV)... due to the above, using a parser generator would make it very inefficient to create the needed parsers, since they are (by their very nature) very non-OO in how they deal with more than one grammar at once... namely, they are designed to deal with single languages at a time and not "families" of them.
| From | Arne Vajhøj <arne@vajhoej.dk> |
|---|---|
| Date | 2013-01-06 21:49 -0500 |
| Message-ID | <50ea37b0$0$282$14726298@news.sunsite.dk> |
| In reply to | #20982 |
On 1/5/2013 8:25 AM, Aryeh M. Friedman wrote:
> Machine code was never meant to be readable but high level languages
> can and should be ;-).... on the serious side of the debate there are
> reasons for shying away from code generators in my case that are
> currently proprietary (some of the lesser results will likely be
> FOSS'ed though)... the main reason is we need to (in some cases) deal
> with multiple languages in the same compilation unit and have
> developed fairly good (at least in theory and my "fun work" is really
> nothing more then a proof of concept, without the pressure of
> deadlines and such, with Java as a typical non-trivial language to
> work with from the compiler POV)... due to the above using a parse
> generator would make it very inefficient to create the needed parsers
> since they are (by there very nature) very non-OO in how they deal
> with more then one grammar at once... namely they are designed to
> deal with single languages at a time and not "families" of them

????

You have:

1 handwritten lexer + 1 handwritten parser vs 1 generated lexer + 1
generated parser

and:

N handwritten lexers + N handwritten parsers vs N generated lexers + N
generated parsers

If it is cheaper to generate for 1 then I would expect it to be cheaper
to generate for N as well.

That the generated lexers and parsers may be more procedural than
object oriented should not be a show stopper.

Common languages like C++ and Java can call different
functions from different classes just fine.

Arne
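Arne's last point, that N parsers of whatever origin can sit behind ordinary method calls, might look something like the sketch below. Everything in it is hypothetical: ParserRegistry, the language keys, and the use of a plain function from fragment text to a result string stand in for whatever entry points real generated or handwritten parsers expose. It says nothing about how the proprietary design mentioned in the thread works.

```java
import java.util.LinkedHashMap;
import java.util.Map;
import java.util.function.Function;

/**
 * Sketch: route fragments of a mixed-language compilation unit to one
 * parser per language. Each parser is just a function here; in practice
 * it could wrap a generated parser class or a handwritten one.
 */
public final class ParserRegistry {
    private final Map<String, Function<String, String>> parsers = new LinkedHashMap<>();

    /** Registers (or replaces) the parser used for the given language key. */
    public void register(String language, Function<String, String> parser) {
        parsers.put(language, parser);
    }

    /** Parses one fragment with the parser registered for its language. */
    public String parse(String language, String fragment) {
        Function<String, String> parser = parsers.get(language);
        if (parser == null) {
            throw new IllegalArgumentException("no parser registered for " + language);
        }
        return parser.apply(fragment);
    }
}
```

Adding a language to a "family" then amounts to one more register call, independent of whether that parser's code was generated or written by hand.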
| From | "Aryeh M. Friedman" <Aryeh.Friedman@gmail.com> |
|---|---|
| Date | 2013-01-06 23:26 -0800 |
| Message-ID | <1217f831-3562-4c62-856f-8a32f8ff2e38@googlegroups.com> |
| In reply to | #21122 |
On Sunday, January 6, 2013 9:49:17 PM UTC-5, Arne Vajhøj wrote:
> On 1/5/2013 8:25 AM, Aryeh M. Friedman wrote:
>> Machine code was never meant to be readable but high level languages
>> can and should be ;-).... on the serious side of the debate there are
>> reasons for shying away from code generators in my case that are
>> currently proprietary (some of the lesser results will likely be
>> FOSS'ed though)... the main reason is we need to (in some cases) deal
>> with multiple languages in the same compilation unit and have
>> developed fairly good (at least in theory and my "fun work" is really
>> nothing more then a proof of concept, without the pressure of
>> deadlines and such, with Java as a typical non-trivial language to
>> work with from the compiler POV)... due to the above using a parse
>> generator would make it very inefficient to create the needed parsers
>> since they are (by there very nature) very non-OO in how they deal
>> with more then one grammar at once... namely they are designed to
>> deal with single languages at a time and not "families" of them
>
> ????
>
> You have:
>
> 1 handwritten lexer + 1 handwritten parser vs 1 generated lexer + 1
> generated parser
>
> and:
>
> N handwritten lexers + N handwritten parsers vs N generated lexers + N
> generated parsers
>
> If it is cheaper to generate for 1 then I would expect it to be cheaper
> to generate for N as well.
>
> That the generated lexers and parsers may be more procedural than
> object oriented should not be a show stopper.
>
> Common languages like C++ and Java can call different
> functions from different classes just fine.
>
> Arne

Don't forget domain specific langs, some of which may rewrite the actual content of the other embedded langs... bottom line: a well designed version of this is cheaper in the long run if one of the goals is to quickly add new langs to each family. Besides which, I compared my hand written code to that produced by yacc/lex (and antlr, to make sure I was not seeing things) and mine 1) is a fraction of the line count [about 90% smaller], 2) has a much lower big-O (O(n) vs. O(n^2)), 3) is trivial to hand trace (why I would want to is another point ;-)), 4) is easier to test with unit testing because you can actually get under the hood, unlike the above, which is totally opaque.
| From | Arne Vajhøj <arne@vajhoej.dk> |
|---|---|
| Date | 2013-01-06 21:44 -0500 |
| Message-ID | <50ea3696$0$282$14726298@news.sunsite.dk> |
| In reply to | #20940 |
On 1/3/2013 11:54 PM, Aryeh M. Friedman wrote:
> The only issue is likely a philosophical one in that I have *NEVER*
> trusted code generators of any kind they either produce impossible to
> follow/debug code or have all kinds of fluff in them (the classic
> example in my mind [html which is not really a programming lang ;-)]
> is Dreamweaver that produces 75 lines of HTML for "hello, world").

Sounds like NIH.

The generated code may be hard to follow, but will be more
well tested.

Arne