regexp(ing) Backus-Naurish expressions ...

From qwertmonkey@syberianoutpost.ru
Newsgroups comp.lang.java.programmer
Subject regexp(ing) Backus-Naurish expressions ...
Date 2013-03-13 21:54 +0000
Organization Aioe.org NNTP Server
Message-ID <khqsie$jee$1@speranza.aioe.org>


Arne Vajhøj wrote:

> I would do it as:
> - switch from properties to XML
> - define a schema for the XML with strict restrictions on data
> - let the application parse that with a validating parser and
>    read it into some config object, this will ensure that required
>    information is there and that the data types are correct
> - let the application apply business validation rules in Java code
>    on the config objects - this will ensure that the various
>    information is consistent
~
 Arne, what specifically do you mean by "read it into some
config object"? Using JAXB? AFAIK, JAXB needs source (re)compilation on
Android:
~
 http://code.google.com/p/android/issues/detail?id=314
~
 Also, I am trying to handle this in a general name-value pair way, so that
different schema files can be parsed and the result (as I see it) is
some String[*][2] holding the names and values of the parameters/properties
~
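
 A minimal sketch of the name-value idea above, assuming the settings are read with java.util.Properties. The class name NameValuePairs and the method toPairs are made up for illustration, and sorting the rows by name is just one way to get a stable order:

```java
import java.io.IOException;
import java.io.StringReader;
import java.util.Properties;
import java.util.TreeSet;

// Hypothetical helper: flattens properties text into {name, value} rows.
public class NameValuePairs {

    // Parses java.util.Properties syntax and returns one String[2] per entry,
    // sorted by name (Properties itself does not keep any order).
    public static String[][] toPairs(String propertiesText) {
        Properties props = new Properties();
        try {
            props.load(new StringReader(propertiesText));
        } catch (IOException e) {
            // A StringReader should not fail; rethrow unchecked just in case.
            throw new IllegalStateException(e);
        }
        TreeSet<String> names = new TreeSet<>(props.stringPropertyNames());
        String[][] pairs = new String[names.size()][2];
        int row = 0;
        for (String name : names) {
            pairs[row][0] = name;
            pairs[row][1] = props.getProperty(name);
            row++;
        }
        return pairs;
    }

    public static void main(String[] args) {
        String text = "--redirect-err: n\n--char-encoding: UTF-8\n";
        for (String[] pair : toPairs(text)) {
            System.out.println(pair[0] + " -> " + pair[1]);
        }
    }
}
```

 Note that Properties.load keeps trailing whitespace in values, so a real reader would probably trim them.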

Leif Roar Moldskred wrote:

> When working with regular expressions you should always remember that
> you don't need to do everything in a single expression. There's no law
> against splitting things up into sub-expressions or using "boring old
> code" for parts of the match.

> You should also bear in mind that some parsing tasks are just not
> suited to regular expressions and if the regular expression starts
> getting complicated you should consider if the task might be solved
> more easily with another approach.

> Here, assuming I've understood the problem right, I might do something
> as below (I'm not on my development computer, so note that this has
> not been checked for errors):
~
 Yeah, I would agree with you, but the switch-case block is really awful
and totally useless to me. In NLP work you would go mad with code full of
switch-case sections for every one of a virtually endless number of
cases.
~

lipska the kat wrote:

> Not sure if this is what you are after as I've never used it myself but

> http://commons.apache.org/proper/commons-cli/
~
 Well, no. It wasn't helpful, because I need to do my work at the parsing stage:

 http://commons.apache.org/proper/commons-cli/usage.html
~

Roedy Green wrote:

> > Any ideas you would share?

> Regexes are quite limited. When you bang into their limits you can write a finite state machine or use a parser.
~
 and I have been constantly banging against their limits ;-) In fact, I find regexes quite limited for what I do.
~

markspace wrote:

>  Based on your syntax example and your title, why bother with
> "Backus-Naurish?"  Java has full parser generators.

> http://www.antlr.org/


 For my needs, ANTLR is overkill.

~


Martin Gregorie wrote:

> This is implemented as the ArgParser class in my environ.jar library and
> can be found at:

> http://sourceforge.net/projects/cdocumenter/files/cdocumenter/environment/

 your ArgParser:

Constructor Detail
public ArgParser(java.lang.String progName,
                 java.lang.String[] args,
                 java.lang.String optlist)


 must be passed an optlist and, similar to commons-cli, must be navigated/parsed.

 All I make my users do is:

 1) set up everything in <program_name>.properties files for default settings, and

 2) optionally override specific (protocolled) parameters on the command line, if they so decide
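
 The two steps above could be sketched roughly as follows. The "--name=value" form of the command-line arguments is an assumption (the exact syntax is not stated here), and CliOverrides and merge are hypothetical names:

```java
import java.util.LinkedHashMap;
import java.util.Map;

// Hypothetical helper: defaults from a properties file, overridden by
// command-line arguments written as --name=value (assumed syntax).
public class CliOverrides {

    // Returns a copy of the defaults with every override applied on top.
    // Names that are not among the defaults are rejected, so that a typo
    // on the command line does not pass silently.
    public static Map<String, String> merge(Map<String, String> defaults, String[] args) {
        Map<String, String> merged = new LinkedHashMap<>(defaults);
        for (String arg : args) {
            int eq = arg.indexOf('=');
            if (!arg.startsWith("--") || eq < 0) {
                throw new IllegalArgumentException("expected --name=value, got: " + arg);
            }
            String name = arg.substring(0, eq);
            if (!merged.containsKey(name)) {
                throw new IllegalArgumentException("unknown parameter: " + name);
            }
            merged.put(name, arg.substring(eq + 1));
        }
        return merged;
    }

    public static void main(String[] args) {
        Map<String, String> defaults = new LinkedHashMap<>();
        defaults.put("--redirect-err", "n");
        defaults.put("--char-encoding", "UTF-8");
        // The caller's defaults stay untouched; only the returned copy changes.
        System.out.println(merge(defaults, new String[] {"--redirect-err=y"}));
    }
}
```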


 my ArgParser-like constructor and setup method look like this:

 SysEnvCtxt(){ ... }

 public void setCtxt(String aKNm, String[] aKLnArgs, File ODir, String aPropsMetaMD5Sign, long lTm00Start) throws IOException

 where:
 aKNm: class name (passed from calling env)
 aKLnArgs: command line args (automatically passed from calling env)
 ODir: output dir (set and passed from calling env)
 aPropsMetaMD5Sign: MD5 Signature of properties definitions and names (passed from calling env and set for some type of running context/properties)
 lTm00Start: start time (automatically passed from calling env)

 and then the user sets up the system and logical-context properties (or XML) files for the running environment, which look like this:

# fully explicit and declaratively defined running properties written in a Backus-Naur(ish) form

# all system property names must start with (*nix standard) double hyphen
# metadata names are prefixed and suffixed as system_<property name>_values_def
# options are explicitly piped (with "|"); "true[|false]" means it must [|not] be defined
# the last of the existing options after closing square bracket is the default
# if default option is not listed, it must be retrievable via java.lang.System.getProperty(<option>)


# ~ ~ ~ ~ ~ ~ ~ ~ java system level settings

# y: prints to standard error all java system and current process properties, as well as OS-level env variables the JVM has access to
--print-env-context: n  

# y: redirects standard error to <output directory>/yyyyMMddHHmmss.SSSS_err.log
--redirect-err: n

# y: redirects standard output file ... 
--redirect-out: n

# character encoding used for files (it must be UTF-8-like)
--char-encoding: UTF-8

# version: <release>.<update[even:finished|prime:editing]>_<date +%Y-%m-%d>_<girl name>_phase
--version: 0.3_2013-03-08_kerala_pre-alpha

# code points are read off files line by line
--end-of-line:

# ~ ~ ~ ~ ~ ~ METADATA ~ DO NOT EDIT! ~ ~ ~ ~ ~ ~ ~ ~

system_print-env-context_values_def: true[y|n]n

system_redirect-err_values_def: true[y|n]n

system_redirect-out_values_def: true[y|n]n

system_char-encoding_values_def: true[UTF-8|UTF8|UTF-7|US-ASCII|ISO-8859-1|ISO-LATIN-1|ISO646-US|ANSI X3.4-1968]UTF-8

system_version_values_def: true[0.3_2013-03-08_kerala_pre-alpha]

system_end-of-line_values_def: false[nix|windows|mac]line.separator





# ~ ~ ~ ~ ~ logical context for java running instance ~ ~ ~ ~ ~  

--input-files-list:


# ~ ~ ~ ~ ~ ~ METADATA ~ DO NOT EDIT! ~ ~ ~ ~ ~ ~ ~ ~

# file containing one liner of input files must be defined
input-files-list_values_def: true
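
 One possible way to read the "_values_def" rules above with a single small regex plus plain code. This is only a sketch: the class ValuesDef is hypothetical, and it assumes that "true" means a value is required and "false" means it is optional, which is one reading of the comments in the file:

```java
import java.util.Arrays;
import java.util.Collections;
import java.util.List;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hypothetical parser for rules such as "true[y|n]n" or plain "true".
public class ValuesDef {
    // <true|false>, optionally followed by [opt1|opt2|...] and a default
    private static final Pattern RULE =
            Pattern.compile("(true|false)(?:\\[([^\\]]*)\\](.*))?");

    public final boolean required;      // assumption: true = value is required
    public final List<String> options;  // empty when the rule lists none
    public final String defaultValue;   // empty when the rule gives none

    public ValuesDef(String rule) {
        Matcher m = RULE.matcher(rule.trim());
        if (!m.matches()) {
            throw new IllegalArgumentException("malformed rule: " + rule);
        }
        required = Boolean.parseBoolean(m.group(1));
        options = m.group(2) == null
                ? Collections.<String>emptyList()
                : Arrays.asList(m.group(2).split("\\|"));
        defaultValue = m.group(3) == null ? "" : m.group(3).trim();
    }

    // Checks a raw value against the rule and returns the effective value.
    public String resolve(String rawValue) {
        String value = rawValue == null ? "" : rawValue.trim();
        if (!value.isEmpty()) {
            if (!options.isEmpty() && !options.contains(value)) {
                throw new IllegalArgumentException("'" + value + "' is not one of " + options);
            }
            return value;
        }
        if (defaultValue.isEmpty()) {
            if (required) {
                throw new IllegalArgumentException("a value is required and there is no default");
            }
            return "";
        }
        if (options.contains(defaultValue)) {
            return defaultValue;
        }
        // A default that is not a listed option is taken as a system property name.
        String fromSystem = System.getProperty(defaultValue);
        if (fromSystem == null) {
            throw new IllegalArgumentException("no system property named " + defaultValue);
        }
        return fromSystem;
    }

    public static void main(String[] args) {
        ValuesDef yesNo = new ValuesDef("true[y|n]n");
        System.out.println(yesNo.resolve(""));   // empty value falls back to the default
        System.out.println(yesNo.resolve("y"));  // listed option is accepted
        try {
            yesNo.resolve("maybe");              // not a listed option
        } catch (IllegalArgumentException e) {
            System.out.println("rejected: " + e.getMessage());
        }
    }
}
```

 The regex only splits the rule into its three parts; the checks themselves are ordinary code, so new kinds of rules would not need a bigger expression.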


 Thank you, guys. I think I will go ahead and do the parsing myself.
 lbrtchx


Thread

regexp(ing) Backus-Naurish expressions ... qwertmonkey@syberianoutpost.ru - 2013-03-13 21:54 +0000
  Re: regexp(ing) Backus-Naurish expressions ... markspace <markspace@nospam.nospam> - 2013-03-13 15:12 -0700
  Re: regexp(ing) Backus-Naurish expressions ... Arne Vajhøj <arne@vajhoej.dk> - 2013-03-13 18:20 -0400
    Re: regexp(ing) Backus-Naurish expressions ... Arved Sandstrom <asandstrom2@eastlink.ca> - 2013-03-13 20:00 -0300
      Re: regexp(ing) Backus-Naurish expressions ... Arne Vajhøj <arne@vajhoej.dk> - 2013-03-14 21:16 -0400
        Re: regexp(ing) Backus-Naurish expressions ... Arved Sandstrom <asandstrom2@eastlink.ca> - 2013-03-15 06:31 -0300
          Re: regexp(ing) Backus-Naurish expressions ... Lew <lewbloch@gmail.com> - 2013-03-15 11:34 -0700
            Re: regexp(ing) Backus-Naurish expressions ... Arved Sandstrom <asandstrom2@eastlink.ca> - 2013-03-15 20:17 -0300
  Re: regexp(ing) Backus-Naurish expressions ... Leif Roar Moldskred <leifm@dimnakorr.com> - 2013-03-13 17:47 -0500