From: qwertmonkey@syberianoutpost.ru
Newsgroups: comp.lang.java.programmer
Subject: regexp(ing) Backus-Naurish expressions ...
Date: Wed, 13 Mar 2013 21:54:22 +0000 (UTC)
Organization: Aioe.org NNTP Server

Arne Vajhøj wrote:
> I would do it as:
> - switch from properties to XML
> - define a schema for the XML with strict restrictions on data
> - let the application parse that with a validating parser and
>   read it into some config object, this will ensure that required
>   information is there and that the data types are correct
> - let the application apply business validation rules in Java code
>   on the config objects - this will ensure that the various
>   information is consistent
~
 Arne, what do you specifically mean when you say "read it into some config object"? Using JAXB? AFAIK JAXB needs source (re)compilation in Android:
~
 http://code.google.com/p/android/issues/detail?id=314
~
 Also, I am trying to deal with it in a general "named-value" pair way, so that different schema files can be parsed and the result (as I see it) should be some String[*][2] with the names and values of parameters/properties
~
Leif Roar Moldskred wrote:
> When working with regular expressions you should always remember that
> you don't need to do everything in a single expression. There's no law
> against splitting things up into sub-expressions or using "boring old
> code" for parts of the match.
> You should also bear in mind that some parsing tasks are just not
> suited to regular expressions and if the regular expression starts
> getting complicated you should consider if the task might be solved
> more easily with another approach.
> Here, assuming I've understood the problem right, I might do something
> as below (I'm not on my development computer, so note that this has
> not been checked for errors):
~
 Yeah, I would agree with you, but the switch-case block is really awful and totally useless to me. While doing NLP work you would go mad with code full of switch-case sections for every single one of a virtually endless number of cases
~
lipska the kat wrote:
> Not sure if this is what you are after as I've never used it myself but
> http://commons.apache.org/proper/commons-cli/
~
 well, no. It wasn't helpful because I need to do my work at the parsing stage
 http://commons.apache.org/proper/commons-cli/usage.html
~
Roedy Green wrote:
> > Any ideas you would share?
> Regexes are quite limited. When you bang into their limits you can write a finite state machine or use a parser.
~
 and I have been constantly banging against their limits ;-) in fact I find regexes quite limited for what I do
~
markspace wrote:
> Based on your syntax example and your title, why bother with
> "Backus-Naurish?" Java has full parser generators.
> http://www.antlr.org/

 for my needs ANTLR is overkill
~
Martin Gregorie wrote:
> This is implemented as the ArgParser class in my environ.jar library and
> can be found at:
> http://sourceforge.net/projects/cdocumenter/files/cdocumenter/environment/

 your ArgParser:

 Constructor Detail
 public ArgParser(java.lang.String progName, java.lang.String[] args, java.lang.String optlist)

 must be passed an optlist and, similar to commons-cli, must be navigated/parsed. All I make my users do is:

 1) set up everything in .properties files for default settings, and
 2) let users set specific (protocolled) parameters on the command line if they so decide

 my ArgParser-like constructor looks like this:

 SysEnvCtxt(){
 ...
 }

 public void setCtxt(String aKNm, String[] aKLnArgs, File ODir, String aPropsMetaMD5Sign, long lTm00Start) throws IOException

 where:

 aKNm: class name (passed from calling env)
 aKLnArgs: command line args (automatically passed from calling env)
 ODir: output dir (set and passed from calling env)
 aPropsMetaMD5Sign: MD5 Signature of properties definitions and names (passed from calling env and set for some type of running context/properties)
 lTm00Start: start time (automatically passed from calling env)

 and then the user sets up a system and logical context running env properties (or xml) files which look like this:

# fully explicit and declaratively defined running properties written in a Backus-Naur(ish) form
# all system property names must start with (*nix standard) double hyphen
# metadata names are prefixed and suffixed as system__values_def
# options are explicitly piped (with "|") "true[|false]" means it must [|not] be defined
# the last of the existing options after closing square bracket is the default
# if default option is not listed, it must be retrievable via java.lang.System.getProperty(