From: blmblm@myrealbox.com <blmblm.myrealbox@gmail.com>
Newsgroups: comp.lang.java.programmer
Subject: Re: Passing a Method Name to a Method, Redux
Date: 4 Jul 2011 03:26:26 GMT
Organization: None
Lines: 91
Message-ID: <97cq72FbqnU1@mid.individual.net>
References: <fpg7079ca2dtgipdphr8rm234kgmkd1t3l@4ax.com> <iu0lgf$ec8$1@dont-email.me> <jcu707d1fb592i91m40fsjtfg0784knkd1@4ax.com>

In article <jcu707d1fb592i91m40fsjtfg0784knkd1@4ax.com>,
Gene Wirchenko  <genew@ocis.net> wrote:
> On Thu, 23 Jun 2011 17:24:43 -0700, markspace <-@.> wrote:
> 
> >On 6/23/2011 4:03 PM, Gene Wirchenko wrote:
> >>
> >>       So how would you have written this benchmark?
> 
> >Um, realistically?  Is this really what you want to do?
> >
> >  static boolean TreesetSearch( char CurrChar )  {
> >         return IdentCharsSet.contains( CurrChar );
> >     }
> 
>      Yes.  I wanted a simple method call in the parser so I could
> cut-and-paste.  I did not know if I would need more than one call.  I
> am going to go with a Treeset so I will not have a separate method in
> the implementation.

Another "no separate method" approach would be to use the String
class's indexOf method:

    static boolean StringLibSearch
        (
         char CurrChar
        )
        {
            return IdentChars.indexOf(CurrChar) >= 0;
        }

I added this to your benchmark suite and found its performance
comparable to that of the TreeSet implementation (indeed, usually it
was a bit faster).  The overhead of building the TreeSet probably
doesn't matter in the grand scheme of things, and it probably also
doesn't matter much that every call to the TreeSet's "contains" method
(AFAIK) has to box the char primitive into a Character object,
but -- <shrug>.

But if you're going to use a Set, why a TreeSet?  As best I can tell,
you don't use/need the sorted-ness it provides.  Just out of curiosity,
I also added to your benchmark suite something that declares the set
as a Set and creates it as an instance of HashSet, and the resulting
code was noticeably faster than any of the other alternatives.
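
For concreteness, the HashSet variant might look something like the
sketch below.  The contents of IdentChars and the class name are
assumptions made up for illustration; this is not the code that was
actually benchmarked.

```java
import java.util.HashSet;
import java.util.Set;

class HashSetSearchSketch {
    // Assumed stand-in for the benchmark's IdentChars string; the real
    // contents are not shown in this thread.
    static final String IdentChars =
        "ABCDEFGHIJKLMNOPQRSTUVWXYZabcdefghijklmnopqrstuvwxyz0123456789_";

    // Declared as a Set and created as a HashSet, so the implementation
    // can be swapped (e.g., for a TreeSet) by changing one line.
    static final Set<Character> IdentCharsSet = new HashSet<Character>();
    static {
        for (int i = 0; i < IdentChars.length(); i++) {
            IdentCharsSet.add(IdentChars.charAt(i));
        }
    }

    static boolean HashSetSearch(char CurrChar) {
        // CurrChar is autoboxed to a Character for the lookup.
        return IdentCharsSet.contains(CurrChar);
    }

    public static void main(String[] args) {
        System.out.println(HashSetSearch('x'));   // prints true
        System.out.println(HashSetSearch('+'));   // prints false
    }
}
```

Since nothing here depends on ordering, the only thing given up
relative to TreeSet is sorted iteration.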


And finally, I wondered how all of these methods compared to
something using regular expressions (the java.util.regex classes), so
I tried that too, replacing your whole parse code with the following:

    import java.util.regex.*;
   
    // .... 

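    // Note:  this assumes IdentChars contains no characters that are
    // special inside a regex character class (such as ']', '\', '^', '-').
    static Pattern IdentRegexPattern = Pattern.compile("[" + IdentChars + "]+");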

    // .... 

    // code to be called repeatedly from timing loop
    static void ParseRegex()
    {
        Matcher IdentMatcher = IdentRegexPattern.matcher(cParseString);
        String sIdent;
        while (IdentMatcher.find())
        {
            sIdent = IdentMatcher.group();
            if (nRepetitions==1)
                System.out.println(sIdent);
        }
    }

    // .... 

This was a clear winner (with regard to performance) on the system
where I ran the tests, *unless* I ran them with the "-server" flag,
in which case it took second place, behind the HashSet-based
approach.  As I understand things, though, the "-server" flag makes
the JIT compiler work harder at optimizing the code, including being
more aggressive about eliminating dead code, so I'm not entirely
confident that the results I'm getting are meaningful.
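
One common way to reduce the dead-code worry is to make each parse
return a value and have the timing loop consume it, so the work can't
simply be thrown away.  A minimal sketch, under assumptions:  the
contents of IdentChars and cParseString, the ParseRegexChecked name,
and the repetition count below are all invented for illustration and
are not from the benchmark suite.

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

class RegexChecksumSketch {
    // Assumed stand-ins for the benchmark's data; the real values are
    // not shown in this thread.
    static final String IdentChars =
        "ABCDEFGHIJKLMNOPQRSTUVWXYZabcdefghijklmnopqrstuvwxyz0123456789_";
    static final String cParseString = "foo = bar_1 + baz42;";

    static final Pattern IdentRegexPattern =
        Pattern.compile("[" + IdentChars + "]+");

    // Hypothetical variant of ParseRegex:  returns the total length of
    // all identifiers found, so the caller has a result to consume.
    static int ParseRegexChecked() {
        Matcher IdentMatcher = IdentRegexPattern.matcher(cParseString);
        int nTotalLength = 0;
        while (IdentMatcher.find()) {
            nTotalLength += IdentMatcher.group().length();
        }
        return nTotalLength;
    }

    public static void main(String[] args) {
        int nRepetitions = 100000;   // arbitrary count for the sketch
        long nChecksum = 0;
        long nStart = System.nanoTime();
        for (int i = 0; i < nRepetitions; i++) {
            // Accumulating the result keeps the parse from being dead code.
            nChecksum += ParseRegexChecked();
        }
        long nElapsed = System.nanoTime() - nStart;
        // Printing the checksum makes the accumulated value observable.
        System.out.println("checksum=" + nChecksum + " ns=" + nElapsed);
    }
}
```

This addresses only the dead-code question; it does nothing about
warm-up or other JIT effects, so the numbers still deserve some
suspicion.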

(Probably your actual code needs to do something other than
finding and printing identifiers, so the above code would need
some adjustment.  Still, if you like regular expressions, it's
another possibility, maybe .... )

[ snip ]

-- 
B. L. Massingill
ObDisclaimer:  I don't speak for my employers; they return the favor.