| From | qwertmonkey@syberianoutpost.ru |
|---|---|
| Newsgroups | comp.lang.java.programmer |
| Subject | regexp(ing) Backus-Naurish expressions ... |
| Date | 2013-03-13 21:54 +0000 |
| Organization | Aioe.org NNTP Server |
| Message-ID | <khqsie$jee$1@speranza.aioe.org> |
Arne Vajhøj wrote:
> I would do it as:
> - switch from properties to XML
> - define a schema for the XML with strict restrictions on data
> - let the application parse that with a validating parser and
> read it into some config object, this will ensure that required
> information is there and that the data types are correct
> - let the application apply business validation rules in Java code
> on the config objects - this will ensure that the various
> information is consistent
~
Arne, what do you specifically mean when you say "read it into some
config object"? Using JAXB? AFAIK JAXB needs source (re)compilation in
Android:
~
http://code.google.com/p/android/issues/detail?id=314
~
Also, I am trying to handle this in a general name-value pair way, so that
different schema files can be parsed and the result (as I see it) is
some String[*][2] holding the names and values of the parameters/properties.
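As a rough illustration of that String[*][2] idea, here is a minimal sketch that assumes one "name: value" pair per line and "#" for comments, as in the properties file further down. The class name NameValuePairs and the method parse are hypothetical, made up for this example only.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

// Hypothetical helper: turns "name: value" lines into a String[n][2] array.
class NameValuePairs {

    static String[][] parse(List<String> lines) {
        List<String[]> pairs = new ArrayList<String[]>();
        for (String line : lines) {
            String trimmed = line.trim();
            // Skip blank lines and comments.
            if (trimmed.isEmpty() || trimmed.startsWith("#")) {
                continue;
            }
            // Assumption: the first colon separates the name from the value.
            int colon = trimmed.indexOf(':');
            if (colon < 0) {
                continue; // not a name/value line
            }
            String name = trimmed.substring(0, colon).trim();
            String value = trimmed.substring(colon + 1).trim();
            pairs.add(new String[] { name, value });
        }
        return pairs.toArray(new String[0][]);
    }

    public static void main(String[] args) {
        String[][] pairs = parse(Arrays.asList(
                "# a comment",
                "--redirect-err: n",
                "--char-encoding: UTF-8",
                "--end-of-line:"));
        for (String[] pair : pairs) {
            System.out.println(pair[0] + " = '" + pair[1] + "'");
        }
    }
}
```

An empty value (as in "--end-of-line:") comes back as an empty string, so the caller can decide whether a default applies.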
~
Leif Roar Moldskred wrote:
> When working with regular expressions you should always remember that
> you don't need to do everything in a single expression. There's no law
> against splitting things up into sub-expressions or using "boring old
> code" for parts of the match.
> You should also bear in mind that some parsing tasks are just not
> suited to regular expressions and if the regular expression starts
> getting complicated you should consider if the task might be solved
> more easily with another approach.
> Here, assuming I've understood the problem right, I might do something
> as below (I'm not on my development computer, so note that this has
> not been checked for errors):
~
Yeah, I would agree with you, but the switch-case block is really awful
and totally useless to me. When doing NLP work you would go mad with
code full of switch-case sections for each of the virtually endless
cases.
~
lipska the kat wrote:
> Not sure if this is what you are after as I've never used it myself but
> http://commons.apache.org/proper/commons-cli/
~
Well, no. It wasn't helpful, because I need to do my work at the parsing stage:
http://commons.apache.org/proper/commons-cli/usage.html
~
Roedy Green wrote:
> > Any ideas you would share?
> Regexes are quite limited. When you bang into their limits you can write a finite state machine or use a parser.
~
And I have been constantly banging against their limits ;-) In fact, I find regexes quite limited for what I do.
~
markspace wrote:
> Based on your syntax example and your title, why bother with
> "Backus-Naurish?" Java has full parser generators.
> http://www.antlr.org/
~
For my needs ANTLR is overkill.
~
Martin Gregorie wrote:
> This is implemented as the ArgParser class in my environ.jar library and
> can be found at:
> http://sourceforge.net/projects/cdocumenter/files/cdocumenter/environment/
~
Your ArgParser constructor:
public ArgParser(java.lang.String progName,
                 java.lang.String[] args,
                 java.lang.String optlist)
must be passed an optlist and, as with commons-cli, the result must then be navigated/parsed.
All I ask of my users is:
1) set up everything in <program_name>.properties files as default settings, and
2) optionally override specific (protocolled) parameters on the command line, if they so decide.
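A minimal sketch of that two-step lookup, using java.util.Properties with the file values as the fallback layer. The "--name=value" command-line syntax, the class name DefaultsThenArgs and the method resolve are assumptions made for this illustration; the actual command-line format is not spelled out in this post.

```java
import java.io.IOException;
import java.io.StringReader;
import java.util.Properties;

// Hypothetical sketch: defaults come from properties text, command line overrides them.
class DefaultsThenArgs {

    static Properties resolve(String defaultsText, String[] args) {
        Properties defaults = new Properties();
        try {
            // Properties.load accepts ':' as well as '=' between key and value.
            defaults.load(new StringReader(defaultsText));
        } catch (IOException e) {
            throw new IllegalStateException("Could not read defaults", e);
        }
        // Lookups on 'effective' fall back to 'defaults' when a key was not overridden.
        Properties effective = new Properties(defaults);
        for (String arg : args) {
            int eq = arg.indexOf('=');
            if (!arg.startsWith("--") || eq < 3) {
                continue; // assumption: only "--name=value" arguments are overrides
            }
            String name = arg.substring(0, eq);
            if (!defaults.containsKey(name)) {
                throw new IllegalArgumentException("Unknown parameter: " + name);
            }
            effective.setProperty(name, arg.substring(eq + 1));
        }
        return effective;
    }

    public static void main(String[] args) {
        String defaults = "--redirect-err: n\n--char-encoding: UTF-8\n";
        Properties p = resolve(defaults, new String[] { "--redirect-err=y" });
        System.out.println(p.getProperty("--redirect-err"));
        System.out.println(p.getProperty("--char-encoding"));
    }
}
```

Rejecting names that are not in the defaults file is one way to keep the command line limited to the "protocolled" parameters; whether that is the right policy is a design choice.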
my ArgParser-like constructor looks like this:
SysEnvCtxt(){ ... }
public void setCtxt(String aKNm, String[] aKLnArgs, File ODir, String aPropsMetaMD5Sign, long lTm00Start) throws IOException
where:
aKNm: class name (passed from calling env)
aKLnArgs: command line args (automatically passed from calling env)
ODir: output dir (set and passed from calling env)
aPropsMetaMD5Sign: MD5 Signature of properties definitions and names (passed from calling env and set for some type of running context/properties)
lTm00Start: start time (automatically passed from calling env)
The user then sets up properties (or XML) files describing the system and logical running context, which look like this:
# fully explicit and declaratively defined running properties written in a Backus-Naur(ish) form
# all system property names must start with (*nix standard) double hyphen
# metadata names are prefixed and suffixed as system_<property name>_values_def
# options are explicitly piped (with "|") "true[|false]" means it must [|not] be defined
# the last of the existing options after closing square bracket is the default
# if default option is not listed, it must be retrievable via java.lang.System.getProperty(<option>)
# ~ ~ ~ ~ ~ ~ ~ ~ java system level settings
# y: prints to standard error all java system and current process properties, as well as OS-level env variables the JVM has access to
--print-env-context: n
# y: redirects standard error file to <output directory>/yyyyMMddHHmmss.SSSS_err.log
--redirect-err: n
# y: redirects standard output file ...
--redirect-out: q
# file encoding used for file (it must be UTF-8 like)
--char-encoding: UTF-8
# version: <release>.<update[even:finished|prime:editing]>_<date +%Y-%m-%d>_<girl name>_phase
--version: 0.3_2013-03-08_kerala_pre-alpha
# code points are read off files line by line
--end-of-line:
# ~ ~ ~ ~ ~ ~ METADATA ~ DO NOT EDIT! ~ ~ ~ ~ ~ ~ ~ ~
system_print-env-context_values_def: true[y|n]n
system_redirect-err_values_def: true[y|n]n
system_redirect-out_values_def: true[y|n]n
system_char-encoding_values_def: true[UTF-8|UTF8|UTF-7|US-ASCII|ISO-8859-1|ISO-LATIN-1|ISO646-US|ANSI X3.4-1968]UTF-8
system_version_values_def: true[0.3_2013-03-08_kerala_pre-alpha]
system_end-of-line_values_def: false[nix|windows|mac]line.separator
# ~ ~ ~ ~ ~ logical context for java running instance ~ ~ ~ ~ ~
--input-files-list:
# ~ ~ ~ ~ ~ ~ METADATA ~ DO NOT EDIT! ~ ~ ~ ~ ~ ~ ~ ~
# file containing one liner of input files must be defined
input-files-list_values_def: true
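One possible way to read the *_values_def metadata lines above, sketched with a single small regular expression plus plain Java. The reading assumed here is: "true"/"false" says whether a value must be defined, the bracketed part lists the allowed options, and the trailing token is the default (looked up via System.getProperty when it is not one of the listed options). The class ValuesDef and its methods are hypothetical names for this illustration.

```java
import java.util.Arrays;
import java.util.Collections;
import java.util.List;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hypothetical parser for definitions such as "true[y|n]n" or "false[nix|windows|mac]line.separator".
class ValuesDef {
    // group 1: required flag, group 2: piped options (optional), group 3: default token (optional)
    private static final Pattern DEF =
            Pattern.compile("(true|false)(?:\\[([^\\]]*)\\](.*))?");

    final boolean required;
    final List<String> options;
    final String defaultToken;

    private ValuesDef(boolean required, List<String> options, String defaultToken) {
        this.required = required;
        this.options = options;
        this.defaultToken = defaultToken;
    }

    static ValuesDef parse(String text) {
        Matcher m = DEF.matcher(text.trim());
        if (!m.matches()) {
            throw new IllegalArgumentException("Malformed definition: " + text);
        }
        boolean required = Boolean.parseBoolean(m.group(1));
        List<String> options = m.group(2) == null
                ? Collections.<String>emptyList()
                : Arrays.asList(m.group(2).split("\\|"));
        String defaultToken = m.group(3) == null ? "" : m.group(3).trim();
        return new ValuesDef(required, options, defaultToken);
    }

    // A missing value is fine only when the property is not required;
    // otherwise the value must be one of the listed options (if any are listed).
    boolean accepts(String value) {
        if (value == null || value.isEmpty()) {
            return !required;
        }
        return options.isEmpty() || options.contains(value);
    }

    // Listed default is used as-is; an unlisted one is treated as a system property name.
    String defaultValue() {
        if (defaultToken.isEmpty()) {
            return null;
        }
        return options.contains(defaultToken) ? defaultToken : System.getProperty(defaultToken);
    }

    public static void main(String[] args) {
        ValuesDef def = ValuesDef.parse("true[y|n]n");
        System.out.println(def.options + " default=" + def.defaultValue());
    }
}
```

Under this reading, ValuesDef.parse("true[y|n]n").accepts("q") is false, so a setting such as "--redirect-out: q" would be flagged at load time.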
Thank you, guys. I think I will go ahead and do the parsing myself.
lbrtchx
regexp(ing) Backus-Naurish expressions ... qwertmonkey@syberianoutpost.ru - 2013-03-13 21:54 +0000
Re: regexp(ing) Backus-Naurish expressions ... markspace <markspace@nospam.nospam> - 2013-03-13 15:12 -0700
Re: regexp(ing) Backus-Naurish expressions ... Arne Vajhøj <arne@vajhoej.dk> - 2013-03-13 18:20 -0400
Re: regexp(ing) Backus-Naurish expressions ... Arved Sandstrom <asandstrom2@eastlink.ca> - 2013-03-13 20:00 -0300
Re: regexp(ing) Backus-Naurish expressions ... Arne Vajhøj <arne@vajhoej.dk> - 2013-03-14 21:16 -0400
Re: regexp(ing) Backus-Naurish expressions ... Arved Sandstrom <asandstrom2@eastlink.ca> - 2013-03-15 06:31 -0300
Re: regexp(ing) Backus-Naurish expressions ... Lew <lewbloch@gmail.com> - 2013-03-15 11:34 -0700
Re: regexp(ing) Backus-Naurish expressions ... Arved Sandstrom <asandstrom2@eastlink.ca> - 2013-03-15 20:17 -0300
Re: regexp(ing) Backus-Naurish expressions ... Leif Roar Moldskred <leifm@dimnakorr.com> - 2013-03-13 17:47 -0500