Groups > comp.lang.java.programmer > #5610 > unrolled thread
| Started by | Gene Wirchenko <genew@ocis.net> |
|---|---|
| First post | 2011-06-23 16:03 -0700 |
| Last post | 2011-06-24 15:50 -0700 |
| Articles | 20 on this page of 50 — 10 participants |
Passing a Method Name to a Method, Redux Gene Wirchenko <genew@ocis.net> - 2011-06-23 16:03 -0700
Re: Passing a Method Name to a Method, Redux Gene Wirchenko <genew@ocis.net> - 2011-06-23 16:26 -0700
Re: Passing a Method Name to a Method, Redux blmblm@myrealbox.com <blmblm.myrealbox@gmail.com> - 2011-06-27 21:41 +0000
Re: Passing a Method Name to a Method, Redux markspace <-@.> - 2011-06-23 17:24 -0700
Re: Passing a Method Name to a Method, Redux Gene Wirchenko <genew@ocis.net> - 2011-06-23 19:46 -0700
Re: Passing a Method Name to a Method, Redux blmblm@myrealbox.com <blmblm.myrealbox@gmail.com> - 2011-07-04 03:26 +0000
Re: Passing a Method Name to a Method, Redux lewbloch <lewbloch@gmail.com> - 2011-07-04 03:41 -0700
Re: Passing a Method Name to a Method, Redux blmblm@myrealbox.com <blmblm.myrealbox@gmail.com> - 2011-07-05 19:07 +0000
Re: Passing a Method Name to a Method, Redux markspace <-@.> - 2011-06-23 17:34 -0700
Re: Passing a Method Name to a Method, Redux Gene Wirchenko <genew@ocis.net> - 2011-06-23 19:42 -0700
Re: Passing a Method Name to a Method, Redux markspace <-@.> - 2011-06-23 18:30 -0700
Re: Passing a Method Name to a Method, Redux Gene Wirchenko <genew@ocis.net> - 2011-06-23 19:48 -0700
Re: Passing a Method Name to a Method, Redux markspace <-@.> - 2011-06-23 21:02 -0700
Re: Passing a Method Name to a Method, Redux lewbloch <lewbloch@gmail.com> - 2011-06-24 08:38 -0700
Re: Passing a Method Name to a Method, Redux markspace <-@.> - 2011-06-24 09:04 -0700
Re: Passing a Method Name to a Method, Redux Lew <noone@lewscanon.com> - 2011-06-26 13:43 -0400
Re: Passing a Method Name to a Method, Redux Lew <noone@lewscanon.com> - 2011-06-26 14:31 -0400
Re: Passing a Method Name to a Method, Redux Gene Wirchenko <genew@ocis.net> - 2011-06-24 11:45 -0700
Re: Passing a Method Name to a Method, Redux markspace <-@.> - 2011-06-24 12:19 -0700
Re: Passing a Method Name to a Method, Redux Gene Wirchenko <genew@ocis.net> - 2011-06-26 20:39 -0700
Re: Passing a Method Name to a Method, Redux markspace <-@.> - 2011-06-26 23:33 -0700
Re: Passing a Method Name to a Method, Redux Gene Wirchenko <genew@ocis.net> - 2011-06-27 13:53 -0700
Re: Passing a Method Name to a Method, Redux markspace <-@.> - 2011-06-27 18:03 -0700
Re: Passing a Method Name to a Method, Redux Gene Wirchenko <genew@ocis.net> - 2011-06-28 11:41 -0700
Re: Passing a Method Name to a Method, Redux blmblm@myrealbox.com <blmblm.myrealbox@gmail.com> - 2011-06-24 19:19 +0000
Re: Passing a Method Name to a Method, Redux Joshua Cranmer <Pidgeot18@verizon.invalid> - 2011-06-23 19:36 -0700
Re: Passing a Method Name to a Method, Redux Gene Wirchenko <genew@ocis.net> - 2011-06-24 11:50 -0700
Re: Passing a Method Name to a Method, Redux Jeff Higgins <jeff@invalid.invalid> - 2011-06-24 17:25 -0400
Re: Passing a Method Name to a Method, Redux Gene Wirchenko <genew@ocis.net> - 2011-06-26 20:42 -0700
Re: Passing a Method Name to a Method, Redux markspace <-@.> - 2011-06-26 23:27 -0700
Re: Passing a Method Name to a Method, Redux Joshua Cranmer <Pidgeot18@verizon.invalid> - 2011-06-27 03:04 -0400
Re: Passing a Method Name to a Method, Redux Gene Wirchenko <genew@ocis.net> - 2011-06-27 13:12 -0700
Re: Passing a Method Name to a Method, Redux Joshua Cranmer <Pidgeot18@verizon.invalid> - 2011-06-27 13:36 -0700
Re: Passing a Method Name to a Method, Redux Arne Vajhøj <arne@vajhoej.dk> - 2011-07-21 20:43 -0400
Re: Passing a Method Name to a Method, Redux Gene Wirchenko <genew@ocis.net> - 2011-07-22 12:35 -0700
Re: Passing a Method Name to a Method, Redux Patricia Shanahan <pats@acm.org> - 2011-07-22 13:09 -0700
Re: Passing a Method Name to a Method, Redux lewbloch <lewbloch@gmail.com> - 2011-07-22 13:35 -0700
Re: Passing a Method Name to a Method, Redux Arne Vajhøj <arne@vajhoej.dk> - 2011-07-22 16:53 -0400
Re: Passing a Method Name to a Method, Redux Martin Gregorie <martin@address-in-sig.invalid> - 2011-07-23 13:19 +0000
Re: Passing a Method Name to a Method, Redux Arne Vajhøj <arne@vajhoej.dk> - 2011-07-23 11:19 -0400
Re: Passing a Method Name to a Method, Redux lewbloch <lewbloch@gmail.com> - 2011-07-23 09:20 -0700
Re: Passing a Method Name to a Method, Redux Arne Vajhøj <arne@vajhoej.dk> - 2011-07-23 13:33 -0400
Re: Passing a Method Name to a Method, Redux lewbloch <lewbloch@gmail.com> - 2011-07-23 11:43 -0700
Re: Passing a Method Name to a Method, Redux lewbloch <lewbloch@gmail.com> - 2011-07-23 12:14 -0700
Re: Passing a Method Name to a Method, Redux Arne Vajhøj <arne@vajhoej.dk> - 2011-07-23 19:19 -0400
Re: Passing a Method Name to a Method, Redux Arne Vajhøj <arne@vajhoej.dk> - 2011-07-23 19:12 -0400
Re: Passing a Method Name to a Method, Redux Martin Gregorie <martin@address-in-sig.invalid> - 2011-07-23 17:24 +0000
Re: Passing a Method Name to a Method, Redux Arne Vajhøj <arne@vajhoej.dk> - 2011-07-23 19:22 -0400
Re: Passing a Method Name to a Method, Redux Martin Gregorie <martin@address-in-sig.invalid> - 2011-07-24 10:02 +0000
Re: Passing a Method Name to a Method, Redux Joshua Cranmer <Pidgeot18@verizon.invalid> - 2011-06-24 15:50 -0700
| From | Gene Wirchenko <genew@ocis.net> |
|---|---|
| Date | 2011-06-23 16:03 -0700 |
| Subject | Passing a Method Name to a Method, Redux |
| Message-ID | <fpg7079ca2dtgipdphr8rm234kgmkd1t3l@4ax.com> |
Dear Java'ers:
I have completed my benchmarking. The code is below. Note that
the difference in the bodies between ParseSequentialSearch(),
ParseBinarySearch(), and ParseTreesetSearch() is but one line. I
really would have preferred not having to duplicate the code.
Oddly, the timings have a LOT of noise in them. In some runs, a
sequential search has out-performed a binary search. Occasionally, a
sequential search has beaten both a binary search and a Treeset
search. The times for sequential searching are only a bit worse than
for binary searching. Treeset searching is about 20% faster. Any
explanations?
I had to kludge this:
cIdent=""+CurrChar;
cIdent is a String, CurrChar is a char.
cIdent=CurrChar;
does not compile.
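The plain assignment is rejected because a char is not assignment-compatible with String, while the concatenation form compiles because the `+` operator converts its char operand to a String. A minimal sketch of the explicit standard-library conversions follows; the class and method names are illustrative and not part of the benchmark.

```java
// Illustrative only: CharToStringDemo and asString are hypothetical names.
class CharToStringDemo {
    // Explicit conversion; gives the same result as ""+c.
    static String asString(char c) {
        return String.valueOf(c);
    }

    public static void main(String[] args) {
        char currChar = 'q';
        String viaConcat = "" + currChar;                   // form used in the post
        String viaValueOf = asString(currChar);             // String.valueOf(char)
        String viaCharacter = Character.toString(currChar); // Character.toString(char)
        // All three hold the same one-character string.
        System.out.println(viaConcat.equals(viaValueOf)
            && viaValueOf.equals(viaCharacter));
    }
}
```

Which form to use is largely a matter of taste; the explicit calls only make the conversion visible at the call site.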
So how would you have written this benchmark?
***** Start of Code *****
// TimingTesting
// Timing Testing of Character Searching
// Last Modification: 2011-06-23
import java.util.*;
class TimingTesting
{
static String cParseString=
"//identifier//IDENTIFIER//a_b_c abc123 "+
"4b5%$__dbl;one;two;three;END";
static String IdentChars=
"0123456789"+
"ABCDEFGHIJKLMNOPQRSTUVWXYZ"+
"_"+
"abcdefghijklmnopqrstuvwxyz"; // sorted order!
static SortedSet<Character> IdentCharsSet=new TreeSet<Character>();
static int nRepetitions=1000000;
static boolean SequentialSearch
(
char CurrChar
)
{
boolean fFound=false;
for (int i=0; i<IdentChars.length() && !fFound; i++)
fFound=IdentChars.charAt(i)==CurrChar;
return fFound;
}
static boolean BinarySearch
(
char CurrChar
)
{
int xLow=0;
int xHigh=IdentChars.length()-1;
int xTry;
boolean fFound=false;
while (xLow<=xHigh)
{
xTry=(xLow+xHigh)/2;
if (CurrChar==IdentChars.charAt(xTry))
return true;
if (CurrChar<IdentChars.charAt(xTry))
xHigh=xTry-1;
else
xLow=xTry+1;
}
return false;
}
static boolean TreesetSearch
(
char CurrChar
)
{
return IdentCharsSet.contains(CurrChar);
}
static void ParseSequentialSearch()
{
int xScan=0;
boolean fBuildingIdent=false;
boolean fInIdentChars;
String cIdent=""; // fussy init
while (xScan<cParseString.length())
{
char CurrChar=cParseString.charAt(xScan);
fInIdentChars=SequentialSearch(CurrChar);
if (SequentialSearch(CurrChar)) // different code
if (fBuildingIdent)
cIdent+=CurrChar;
else
{
fBuildingIdent=true;
cIdent=""+CurrChar;
}
else
if (fBuildingIdent)
{
fBuildingIdent=false;
if (nRepetitions==1)
System.out.println(cIdent);
}
else
{}
xScan++;
}
if (fBuildingIdent)
if (nRepetitions==1)
System.out.println(cIdent);
}
static void ParseBinarySearch()
{
int xScan=0;
boolean fBuildingIdent=false;
boolean fInIdentChars;
String cIdent=""; // fussy init
while (xScan<cParseString.length())
{
char CurrChar=cParseString.charAt(xScan);
fInIdentChars=SequentialSearch(CurrChar);
if (SequentialSearch(CurrChar)) // different code
if (fBuildingIdent)
cIdent+=CurrChar;
else
{
fBuildingIdent=true;
cIdent=""+CurrChar;
}
else
if (fBuildingIdent)
{
fBuildingIdent=false;
if (nRepetitions==1)
System.out.println(cIdent);
}
else
{}
xScan++;
}
if (fBuildingIdent)
if (nRepetitions==1)
System.out.println(cIdent);
}
static void ParseTreesetSearch()
{
int xScan=0;
boolean fBuildingIdent=false;
boolean fInIdentChars;
String cIdent=""; // fussy init
while (xScan<cParseString.length())
{
char CurrChar=cParseString.charAt(xScan);
fInIdentChars=SequentialSearch(CurrChar);
if (TreesetSearch(CurrChar)) // different code
if (fBuildingIdent)
cIdent+=CurrChar;
else
{
fBuildingIdent=true;
cIdent=""+CurrChar;
}
else
if (fBuildingIdent)
{
fBuildingIdent=false;
if (nRepetitions==1)
System.out.println(cIdent);
}
else
{}
xScan++;
}
if (fBuildingIdent)
if (nRepetitions==1)
System.out.println(cIdent);
}
public static void main(String[] args)
{
int i;
long StartTime;
long EndTime;
long Duration;
System.out.println("Timing Testing of Character Searching");
System.out.println();
// Initialise Set.
for (i=0; i<IdentChars.length(); i++)
IdentCharsSet.add(IdentChars.charAt(i));
// Character Sequential
System.out.print("Character Sequential Search");
StartTime=System.nanoTime();
for (i=1; i<=nRepetitions; i++)
ParseSequentialSearch();
EndTime=System.nanoTime();
Duration=EndTime-StartTime;
System.out.println(" Duration="+Duration);
// Character Binary Search
System.out.print("Character Binary Search ");
StartTime=System.nanoTime();
for (i=1; i<=nRepetitions; i++)
ParseBinarySearch();
EndTime=System.nanoTime();
Duration=EndTime-StartTime;
System.out.println(" Duration="+Duration);
// Character Treeset
System.out.print("Character Treeset Search ");
StartTime=System.nanoTime();
for (i=1; i<=nRepetitions; i++)
ParseTreesetSearch();
EndTime=System.nanoTime();
Duration=EndTime-StartTime;
System.out.println(" Duration="+Duration);
}
}
***** End of Code *****
Sincerely,
Gene Wirchenko
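One way to avoid the three near-identical Parse...() bodies is to pass the membership test in as a parameter. The sketch below is not taken from the thread: the `CharTest` interface, the `IdentifierScanner` class, and the lower-case method names are assumptions made for illustration. It uses method references, which need Java 8 or later; with the Java of 2011 the same idea would be written with anonymous classes. Keeping a single copy of the loop also removes the cut-and-paste hazard visible in the posted code, where ParseBinarySearch() still tests SequentialSearch(CurrChar).

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Set;
import java.util.TreeSet;

// Hypothetical callback type: one membership test per search strategy.
interface CharTest {
    boolean accepts(char c);
}

class IdentifierScanner {
    static final String IDENT_CHARS =
        "0123456789" + "ABCDEFGHIJKLMNOPQRSTUVWXYZ" + "_"
        + "abcdefghijklmnopqrstuvwxyz"; // sorted order, needed by binarySearch

    static final Set<Character> IDENT_CHARS_SET = new TreeSet<>();
    static {
        for (int i = 0; i < IDENT_CHARS.length(); i++) {
            IDENT_CHARS_SET.add(IDENT_CHARS.charAt(i));
        }
    }

    static boolean sequentialSearch(char c) {
        for (int i = 0; i < IDENT_CHARS.length(); i++) {
            if (IDENT_CHARS.charAt(i) == c) {
                return true;
            }
        }
        return false;
    }

    static boolean binarySearch(char c) {
        int low = 0;
        int high = IDENT_CHARS.length() - 1;
        while (low <= high) {
            int mid = (low + high) >>> 1;
            char probe = IDENT_CHARS.charAt(mid);
            if (c == probe) {
                return true;
            }
            if (c < probe) {
                high = mid - 1;
            } else {
                low = mid + 1;
            }
        }
        return false;
    }

    static boolean treeSetSearch(char c) {
        return IDENT_CHARS_SET.contains(c); // char is autoboxed to Character
    }

    // The only copy of the scanning loop; the search to use is a parameter.
    static List<String> parse(String text, CharTest test) {
        List<String> identifiers = new ArrayList<>();
        StringBuilder current = new StringBuilder();
        for (int i = 0; i < text.length(); i++) {
            char c = text.charAt(i);
            if (test.accepts(c)) {
                current.append(c);
            } else if (current.length() > 0) {
                identifiers.add(current.toString());
                current.setLength(0);
            }
        }
        if (current.length() > 0) {
            identifiers.add(current.toString());
        }
        return identifiers;
    }

    public static void main(String[] args) {
        String text = "//identifier//a_b_c abc123;END";
        System.out.println(parse(text, IdentifierScanner::sequentialSearch));
        System.out.println(parse(text, IdentifierScanner::binarySearch));
        System.out.println(parse(text, IdentifierScanner::treeSetSearch));
    }
}
```

The sketch collects identifiers in a list rather than printing them when nRepetitions is 1, purely so that the three strategies can be compared for equal output.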
| From | Gene Wirchenko <genew@ocis.net> |
|---|---|
| Date | 2011-06-23 16:26 -0700 |
| Message-ID | <8pi707547ufe2klfl36fqnqplpiaq3g8dq@4ax.com> |
| In reply to | #5610 |
On 23 Jun 2011 23:11:50 GMT, ram@zedat.fu-berlin.de (Stefan Ram)
wrote:
>Gene Wirchenko <genew@ocis.net> writes:
>>So how would you have written this benchmark?
>
>http://stackoverflow.com/questions/504103/how-do-i-write-a-correct-micro-benchmark-in-java
>http://www.kdgregory.com/index.php?page=java.microBenchmark
>http://www.ibm.com/developerworks/java/library/j-jtp02225/index.html
>...
Good links. Thank you.
But my question was really about method calling to
SequentialSearch(), BinarySearch(), and TreesetSearch().
Sincerely,
Gene Wirchenko
| From | blmblm@myrealbox.com <blmblm.myrealbox@gmail.com> |
|---|---|
| Date | 2011-06-27 21:41 +0000 |
| Message-ID | <96sbnvF4fuU2@mid.individual.net> |
| In reply to | #5613 |
In article <8pi707547ufe2klfl36fqnqplpiaq3g8dq@4ax.com>,
Gene Wirchenko <genew@ocis.net> wrote:
> On 23 Jun 2011 23:11:50 GMT, ram@zedat.fu-berlin.de (Stefan Ram)
> wrote:
>
> >Gene Wirchenko <genew@ocis.net> writes:
> >>So how would you have written this benchmark?
> >
> >http://stackoverflow.com/questions/504103/how-do-i-write-a-correct-micro-benchmark-in-java
> >http://www.kdgregory.com/index.php?page=java.microBenchmark
> >http://www.ibm.com/developerworks/java/library/j-jtp02225/index.html
> >...
>
> Good links. Thank you.
>
> But my question was really about method calling to
> SequentialSearch(), BinarySearch(), and TreesetSearch().
>
Interesting how the discussion has veered off onto other subjects --
but hey, this is Usenet!
But you did get at least two replies focusing on reducing the amount
of code duplication, one from markspace and one from me [*]. It would
be nice to know whether you found them useful. ?
[*] Message-ID: <96k6atFt60U2@mid.individual.net>
--
B. L. Massingill
ObDisclaimer: I don't speak for my employers; they return the favor.
| From | markspace <-@.> |
|---|---|
| Date | 2011-06-23 17:24 -0700 |
| Message-ID | <iu0lgf$ec8$1@dont-email.me> |
| In reply to | #5610 |
On 6/23/2011 4:03 PM, Gene Wirchenko wrote:
>
> So how would you have written this benchmark?
Um, realistically? Is this really what you want to do?
static boolean TreesetSearch( char CurrChar ) {
return IdentCharsSet.contains( CurrChar );
}
All of the identifiers in your "language" are single characters?
I would have used actual strings, preferably from existing code so you
could test performance. Although I appreciate you making a
self-contained example for us'm here on usenet.
| From | Gene Wirchenko <genew@ocis.net> |
|---|---|
| Date | 2011-06-23 19:46 -0700 |
| Message-ID | <jcu707d1fb592i91m40fsjtfg0784knkd1@4ax.com> |
| In reply to | #5616 |
On Thu, 23 Jun 2011 17:24:43 -0700, markspace <-@.> wrote:
>On 6/23/2011 4:03 PM, Gene Wirchenko wrote:
>>
>> So how would you have written this benchmark?
>Um, realistically? Is this really what you want to do?
>
> static boolean TreesetSearch( char CurrChar ) {
> return IdentCharsSet.contains( CurrChar );
> }
Yes. I wanted a simple method call in the parser so I could
cut-and-paste. I did not know if I would need more than one call. I
am going to go with a Treeset so I will not have a separate method in
the implementation.
>All of the identifiers in your "language" are single characters?
No. An identifier is a sequence of one or more characters that
are in IdentChars (or IdentCharsSet).
>I would have used actual strings, preferably from existing code so you
>could test performance. Although I appreciate you making a
>self-contained example for us'm here on usenet.
I wanted an example. I have test files for when I have this
worked out.
Sincerely,
Gene Wirchenko
| From | blmblm@myrealbox.com <blmblm.myrealbox@gmail.com> |
|---|---|
| Date | 2011-07-04 03:26 +0000 |
| Message-ID | <97cq72FbqnU1@mid.individual.net> |
| In reply to | #5624 |
In article <jcu707d1fb592i91m40fsjtfg0784knkd1@4ax.com>,
Gene Wirchenko <genew@ocis.net> wrote:
> On Thu, 23 Jun 2011 17:24:43 -0700, markspace <-@.> wrote:
>
> >On 6/23/2011 4:03 PM, Gene Wirchenko wrote:
> >>
> >> So how would you have written this benchmark?
>
> >Um, realistically? Is this really what you want to do?
> >
> > static boolean TreesetSearch( char CurrChar ) {
> > return IdentCharsSet.contains( CurrChar );
> > }
>
> Yes. I wanted a simple method call in the parser so I could
> cut-and-paste. I did not know if I would need more than one call. I
> am going to go with a Treeset so I will not have a separate method in
> the implementation.
Another "no separate method" approach would be to use the String
class's indexOf method:
static boolean StringLibSearch
(
char CurrChar
)
{
return IdentChars.indexOf(CurrChar) >= 0;
}
I added this to your benchmark suite and found it to give performance
comparable to the TreeSet implementation (indeed, usually it was a
bit faster). The overhead of building the TreeSet probably doesn't
matter in the grand scheme of things, and probably it also doesn't
matter a lot that every call to the TreeSet's "contains" method
(AFAIK) has to convert a character primitive to a Character object,
but -- <shrug>.
But if you're going to use a Set, why a TreeSet? As best I can tell,
you don't use/need the sorted-ness it provides. Just out of curiosity,
I also added to your benchmark suite something that declares the set
as a Set and creates it as an instance of HashSet, and the resulting
code was noticeably faster than any of the other alternatives.
And finally, I wondered how all of these methods compared to
something using regular expressions (the java.util.regex classes), so
I tried that too, replacing your whole parse code with the following:
import java.util.regex.*;
// ....
static Pattern IdentRegexPattern=Pattern.compile("[" + IdentChars + "]+");
// ....
// code to be called repeatedly from timing loop
static void ParseRegex()
{
Matcher IdentMatcher = IdentRegexPattern.matcher(cParseString);
String sIdent;
while (IdentMatcher.find())
{
sIdent = IdentMatcher.group();
if (nRepetitions==1)
System.out.println(sIdent);
}
}
// ....
This was a clear winner (with regard to performance) on the system
where I measured performance, *unless* I ran the tests with the
"-server" flag, in which case it took second place, behind the
HashSet-based approach. As I understand things, though, the
"-server" flag results in the compiler doing more to try to optimize
the code, including being more aggressive about eliminating dead
code, so I'm not entirely confident about the results I'm getting
being meaningful.
(Probably your actual code needs to do something other than
finding and printing identifiers, so the above code would need
some adjustment. Still, if you like regular expressions, it's
another possibility, maybe .... )
[ snip ]
--
B. L. Massingill
ObDisclaimer: I don't speak for my employers; they return the favor.
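A minimal sketch of the two library-based membership tests described above, `String.indexOf` and a `Set` backed by `HashSet`. The class and method names are assumptions for illustration; the sketch only checks that the two tests agree and makes no claim about their relative speed.

```java
import java.util.HashSet;
import java.util.Set;

// Illustrative only: MembershipVariants and its members are hypothetical names.
class MembershipVariants {
    static final String IDENT_CHARS =
        "0123456789" + "ABCDEFGHIJKLMNOPQRSTUVWXYZ" + "_"
        + "abcdefghijklmnopqrstuvwxyz";

    // Declared as Set, created as HashSet: contains() does not need ordering.
    static final Set<Character> IDENT_SET = new HashSet<>();
    static {
        for (int i = 0; i < IDENT_CHARS.length(); i++) {
            IDENT_SET.add(IDENT_CHARS.charAt(i));
        }
    }

    static boolean viaIndexOf(char c) {
        return IDENT_CHARS.indexOf(c) >= 0;
    }

    static boolean viaHashSet(char c) {
        return IDENT_SET.contains(c); // char is autoboxed to Character here
    }

    public static void main(String[] args) {
        // Confirm that both tests give the same answer for every ASCII character.
        boolean agree = true;
        for (char c = 0; c < 128; c++) {
            agree &= viaIndexOf(c) == viaHashSet(c);
        }
        System.out.println("indexOf and HashSet agree: " + agree);
    }
}
```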
| From | lewbloch <lewbloch@gmail.com> |
|---|---|
| Date | 2011-07-04 03:41 -0700 |
| Message-ID | <20c411d9-8955-4f4c-9a78-fea7540ed9a1@p10g2000prf.googlegroups.com> |
| In reply to | #5839 |
On Jul 3, 8:26 pm, blm...@myrealbox.com <blmblm.myreal...@gmail.com>
wrote:
> In article <jcu707d1fb592i91m40fsjtfg0784kn...@4ax.com>,
> Gene Wirchenko <ge...@ocis.net> wrote:
>
> > On Thu, 23 Jun 2011 17:24:43 -0700, markspace <-@.> wrote:
>
> > >On 6/23/2011 4:03 PM, Gene Wirchenko wrote:
>
> > >> So how would you have written this benchmark?
>
> > >Um, realistically? Is this really what you want to do?
>
> > > static boolean TreesetSearch( char CurrChar ) {
> > > return IdentCharsSet.contains( CurrChar );
> > > }
>
> > Yes. I wanted a simple method call in the parser so I could
> > cut-and-paste. I did not know if I would need more than one call. I
> > am going to go with a Treeset so I will not have a separate method in
> > the implementation.
>
> Another "no separate method" approach would be to use the String
> class's indexOf method:
>
> static boolean StringLibSearch
> (
> char CurrChar
> )
> {
> return IdentChars.indexOf(CurrChar) >= 0;
> }
>
> I added this to your benchmark suite and found it to give performance
> comparable to the TreeSet implementation (indeed, usually it was a
> bit faster). The overhead of building the TreeSet probably doesn't
> matter in the grand scheme of things, and probably it also doesn't
> matter a lot that every call to the TreeSet's "contains" method
> (AFAIK) has to convert a character primitive to a Character object,
> but -- <shrug>.
>
> But if you're going to use a Set, why a TreeSet? As best I can tell,
> you don't use/need the sorted-ness it provides. Just out of curiosity,
> I also added to your benchmark suite something that declares the set
> as a Set and creates it as an instance of HashSet, and the resulting
> code was noticeably faster than any of the other alternatives.
>
> And finally, I wondered how all of these methods compared to
> something using regular expressions (the java.util.regex classes), so
> I tried that too, replacing your whole parse code with the following:
>
> import java.util.regex.*;
>
> // ....
>
> static Pattern IdentRegexPattern=Pattern.compile("[" + IdentChars + "]+");
>
> // ....
>
> // code to be called repeatedly from timing loop
> static void ParseRegex()
> {
> Matcher IdentMatcher = IdentRegexPattern.matcher(cParseString);
> String sIdent;
> while (IdentMatcher.find())
> {
> sIdent = IdentMatcher.group();
> if (nRepetitions==1)
> System.out.println(sIdent);
> }
> }
>
> // ....
>
> This was a clear winner (with regard to performance) on the system
> where I measured performance, *unless* I ran the tests with the
> "-server" flag, in which case it took second place, behind the
> HashSet-based approach. As I understand things, though, the
> "-server" flag results in the compiler doing more to try to optimize
> the code, including being more aggressive about eliminating dead
> code, so I'm not entirely confident about the results I'm getting
> being meaningful.
>
> (Probably your actual code needs to do something other than
> finding and printing identifiers, so the above code would need
> some adjustment. Still, if you like regular expressions, it's
> another possibility, maybe .... )
>
> [ snip ]
>
Your points are excellent, but the ongoing violations of the naming
conventions are making my brain hurt. Can't we please revert to
conformant names in our replies at least?
--
Lew
| From | blmblm@myrealbox.com <blmblm.myrealbox@gmail.com> |
|---|---|
| Date | 2011-07-05 19:07 +0000 |
| Message-ID | <97h5npFhaeU1@mid.individual.net> |
| In reply to | #5843 |
In article <20c411d9-8955-4f4c-9a78-fea7540ed9a1@p10g2000prf.googlegroups.com>,
lewbloch <lewbloch@gmail.com> wrote:
> On Jul 3, 8:26 pm, blm...@myrealbox.com <blmblm.myreal...@gmail.com>
> wrote:
> > In article <jcu707d1fb592i91m40fsjtfg0784kn...@4ax.com>,
> > Gene Wirchenko <ge...@ocis.net> wrote:
[ snip ]
> > Another "no separate method" approach would be to use the String
> > class's indexOf method:
> >
> > static boolean StringLibSearch
> > (
> > char CurrChar
> > )
> > {
> > return IdentChars.indexOf(CurrChar) >= 0;
> > }
> >
> > I added this to your benchmark suite and found it to give performance
> > comparable to the TreeSet implementation (indeed, usually it was a
> > bit faster). The overhead of building the TreeSet probably doesn't
> > matter in the grand scheme of things, and probably it also doesn't
> > matter a lot that every call to the TreeSet's "contains" method
> > (AFAIK) has to convert a character primitive to a Character object,
> > but -- <shrug>.
> >
> > But if you're going to use a Set, why a TreeSet? As best I can tell,
> > you don't use/need the sorted-ness it provides. Just out of curiosity,
> > I also added to your benchmark suite something that declares the set
> > as a Set and creates it as an instance of HashSet, and the resulting
> > code was noticeably faster than any of the other alternatives.
> >
> > And finally, I wondered how all of these methods compared to
> > something using regular expressions (the java.util.regex classes), so
> > I tried that too, replacing your whole parse code with the following:
> >
> > import java.util.regex.*;
> >
> > // ....
> >
> > static Pattern IdentRegexPattern=Pattern.compile("[" + IdentChars + "]+");
> >
> > // ....
> >
> > // code to be called repeatedly from timing loop
> > static void ParseRegex()
> > {
> > Matcher IdentMatcher = IdentRegexPattern.matcher(cParseString);
> > String sIdent;
> > while (IdentMatcher.find())
> > {
> > sIdent = IdentMatcher.group();
> > if (nRepetitions==1)
> > System.out.println(sIdent);
> > }
> > }
> >
> > // ....
> >
> > This was a clear winner (with regard to performance) on the system
> > where I measured performance, *unless* I ran the tests with the
> > "-server" flag, in which case it took second place, behind the
> > HashSet-based approach. As I understand things, though, the
> > "-server" flag results in the compiler doing more to try to optimize
> > the code, including being more aggressive about eliminating dead
> > code, so I'm not entirely confident about the results I'm getting
> > being meaningful.
> >
> > (Probably your actual code needs to do something other than
> > finding and printing identifiers, so the above code would need
> > some adjustment. Still, if you like regular expressions, it's
> > another possibility, maybe .... )
> >
> > [ snip ]
> >
>
> Your points are excellent, but the ongoing violations of the naming
> conventions are making my brain hurt. Can't we please revert to
> conformant names in our replies at least?
Well .... I guess I figure it's a choice between two things that
seem desirable -- (1) working in with the conventions of the code
I'm modifying and (2) applying the conventions used by most Java
programmers. I chose the former, though it rather makes my brain
hurt as well. :-)?
--
B. L. Massingill
ObDisclaimer: I don't speak for my employers; they return the favor.
| From | markspace <-@.> |
|---|---|
| Date | 2011-06-23 17:34 -0700 |
| Message-ID | <iu0m3c$glc$1@dont-email.me> |
| In reply to | #5610 |
On 6/23/2011 4:03 PM, Gene Wirchenko wrote:
> search. The times for sequential searching are only a bit worse than
> for binary searching. Treeset searching is about 20% faster. Any
> explanations?
Why does your TreeSetSearch() call SequentialSearch? Doesn't that
defeat the purpose of your timing comparisons? I'm also not following
the parsing you are doing at all. What is the goal of that method?
static void ParseTreesetSearch()
{
int xScan = 0;
boolean fBuildingIdent = false;
boolean fInIdentChars;
String cIdent = ""; // fussy init
while( xScan < cParseString.length() ) {
char CurrChar = cParseString.charAt( xScan );
fInIdentChars = SequentialSearch( CurrChar );
^^^^^^^^^^^^^^^^
if( TreesetSearch( CurrChar ) ) // different code
Odd call indicated above. Doesn't that just do the exact same thing
that TreesetSearch does?
| From | Gene Wirchenko <genew@ocis.net> |
|---|---|
| Date | 2011-06-23 19:42 -0700 |
| Message-ID | <v0u7075lmbu8u87fsbumau5blrln9vac2a@4ax.com> |
| In reply to | #5618 |
On Thu, 23 Jun 2011 17:34:50 -0700, markspace <-@.> wrote:
>On 6/23/2011 4:03 PM, Gene Wirchenko wrote:
>> search. The times for sequential searching are only a bit worse than
>> for binary searching. Treeset searching is about 20% faster. Any
>> explanations?
>
>
>Why does your TreeSetSearch() call SequentialSearch? Doesn't that
I goofed.
>defeat the purpose of your timing comparisons? I'm also not following
>the parsing you are doing at all. What is the goal of that method?
I am parsing for identifiers. In this test, I just throw them
away.
[snip]
>Odd call indicated above. Doesn't that just do the exact same thing
>that TreesetSearch does?
Cut and paste did me in. Thank you for catching this.
Sincerely,
Gene Wirchenko
| From | markspace <-@.> |
|---|---|
| Date | 2011-06-23 18:30 -0700 |
| Message-ID | <iu0pbg$26n$1@dont-email.me> |
| In reply to | #5610 |
Some other things I'd do:
1. Make life easier on the Mark I eyeball and print out your timing in
seconds:
System.out.println( " Duration=" + (Duration/1e9) );
2. Specify your command line arguments. I found this affected the
timing greatly (over 100% faster than in the IDE).
C:\Users\Brenden\Dev\Test2\src>javac -g:none test\CallingTest.java
C:\Users\Brenden\Dev\Test2\src>
C:\Users\Brenden\Dev\Test2\src>java -server test.CallingTest
Timing Testing of Character Searching
Character Sequential Search Duration=22.251718931
Character Binary Search Duration=21.492850347
Character Treeset Search Duration=19.472623928
Hash Test Duration=0.098741719
C:\Users\Brenden\Dev\Test2\src>
| From | Gene Wirchenko <genew@ocis.net> |
|---|---|
| Date | 2011-06-23 19:48 -0700 |
| Message-ID | <1lu707p0cpr9vhrpv51d7hmst6bt1qdbcv@4ax.com> |
| In reply to | #5619 |
On Thu, 23 Jun 2011 18:30:21 -0700, markspace <-@.> wrote:
>Some other things I'd do:
>1. Make life easier on the Mark I eyeball and print out your timing in
>seconds:
>
> System.out.println( " Duration=" + (Duration/1e9) );
Yes, if this were not just test code.
>2. Specify your command line arguments. I found this affected the
>timing greatly (over 100% faster than in the IDE).
There are none.
>C:\Users\Brenden\Dev\Test2\src>javac -g:none test\CallingTest.java
>
>C:\Users\Brenden\Dev\Test2\src>
>C:\Users\Brenden\Dev\Test2\src>java -server test.CallingTest
>Timing Testing of Character Searching
>
>Character Sequential Search Duration=22.251718931
>Character Binary Search Duration=21.492850347
>Character Treeset Search Duration=19.472623928
>Hash Test Duration=0.098741719
^^^^^^^^^
What is this one, please?
Sincerely,
Gene Wirchenko
| From | markspace <-@.> |
|---|---|
| Date | 2011-06-23 21:02 -0700 |
| Message-ID | <iu129n$d0q$1@dont-email.me> |
| In reply to | #5625 |
On 6/23/2011 7:48 PM, Gene Wirchenko wrote:
> On Thu, 23 Jun 2011 18:30:21 -0700, markspace<-@.> wrote:
>
>> Some other things I'd do:
>
>> 1. Make life easier on the Mark I eyeball and print out your timing in
>> seconds:
>>
>> System.out.println( " Duration=" + (Duration/1e9) );
>
> Yes, if this were not just test code.
Test code is probably more important to get right than the actual
production code. You'll be using the test code more than the production. ;)
I have another question. This code here, most of this is debugging?
And you take the entire string and concatenate the characters together,
but you don't actually parse this into tokens:
while( xScan < cParseString.length() ) {
char CurrChar = cParseString.charAt( xScan );
fInIdentChars = SequentialSearch( CurrChar );
if( TreesetSearch( CurrChar ) ) // different code
{
if( fBuildingIdent ) {
cIdent += CurrChar;
} else {
fBuildingIdent = true;
cIdent = "" + CurrChar;
}
} else if( fBuildingIdent ) {
fBuildingIdent = false;
if( nRepetitions == 1 ) {
System.out.println( cIdent );
}
} else {
}
xScan++;
}
You're just double-checking that SequentialSearch and TreesetSearch return
the same thing? And you concat the whole cParseString into cIdent, you
don't look for white space or anything and break cParseString into tokens?
Don't worry about Hash Test, I have something faster now.
BTW I refactored your test that you were copy-and-pasting around into
one method. Using techniques I mentioned in my first post to you on
this subject.
private static void time( TestCase r ) {
long StartTime = System.nanoTime();
for( int i = 1; i <= nRepetitions; i++ ) {
r.parse();
}
long EndTime = System.nanoTime();
long Duration = EndTime - StartTime;
System.out.println( " Duration=" + (Duration/1e9) );
}
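The `TestCase` type used by time() is not shown in the post. A minimal sketch of how such a harness might fit together follows, assuming `TestCase` is a one-method interface; the `TimingHarness` class, the label parameter, and the returned value are likewise assumptions added for illustration. The anonymous class matches what the Java of 2011 offered.

```java
// Assumed shape of the callback; the thread does not show the real TestCase.
interface TestCase {
    void parse();
}

// Illustrative only: TimingHarness is a hypothetical name.
class TimingHarness {
    static int nRepetitions = 1000;

    // Runs the test case nRepetitions times and returns the elapsed seconds.
    static double time(String label, TestCase r) {
        long startTime = System.nanoTime();
        for (int i = 1; i <= nRepetitions; i++) {
            r.parse();
        }
        long duration = System.nanoTime() - startTime;
        double seconds = duration / 1e9;
        System.out.println(label + " Duration=" + seconds);
        return seconds;
    }

    public static void main(String[] args) {
        // A stub test case that only counts its invocations.
        final int[] calls = {0};
        time("Counting stub", new TestCase() {
            public void parse() {
                calls[0]++;
            }
        });
        System.out.println("calls=" + calls[0]);
    }
}
```

Each search variant would then be wrapped in its own `TestCase` and handed to time(), so the timing loop exists only once.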
| From | lewbloch <lewbloch@gmail.com> |
|---|---|
| Date | 2011-06-24 08:38 -0700 |
| Message-ID | <b2d9149c-fd3a-4c6e-b32c-b95e961867e8@r33g2000prh.googlegroups.com> |
| In reply to | #5626 |
markspace wrote:
> ...
> BTW I refactored your test that you were copy-and-pasting around into
> one method. Using techniques I mentioned in my first post to you on
> this subject.
>
> private static void time( TestCase r ) {
> long StartTime = System.nanoTime();
> for( int i = 1; i <= nRepetitions; i++ ) {
> r.parse();
> }
> long EndTime = System.nanoTime();
> long Duration = EndTime - StartTime;
> System.out.println( " Duration=" + (Duration/1e9) );
> }
You'll want to run each loop a bunch of times (10,000? 100,000? >=
1M?) before starting the timing loop in order to cancel the effects of
HotSpot warmup.
Unless your real-world scenario pretty much guarantees that HotSpot
won't be a factor.
Micro-benchmarks in Java are, at best, a dicey basis for any
performance conclusions.
--
Lew
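A rough sketch of the warm-up idea: run the code under test untimed first, then time a second batch of runs. Everything named here (`WarmupTiming`, `countIdentChars`, `run`) is hypothetical, and the repetition counts are placeholders rather than recommended values. The result is accumulated and printed so that the measured work is not obviously dead code, though that alone does not make the numbers trustworthy.

```java
// Illustrative only: all names and repetition counts are assumptions.
class WarmupTiming {
    static final String IDENT_CHARS =
        "0123456789" + "ABCDEFGHIJKLMNOPQRSTUVWXYZ" + "_"
        + "abcdefghijklmnopqrstuvwxyz";
    static final String TEXT = "ab_1;cd%END";

    // Work under test: counts the identifier characters in TEXT.
    static int countIdentChars() {
        int count = 0;
        for (int i = 0; i < TEXT.length(); i++) {
            if (IDENT_CHARS.indexOf(TEXT.charAt(i)) >= 0) {
                count++;
            }
        }
        return count;
    }

    // Repeats the work and returns a checksum so the result is consumed.
    static long run(int repetitions) {
        long sink = 0;
        for (int i = 0; i < repetitions; i++) {
            sink += countIdentChars();
        }
        return sink;
    }

    public static void main(String[] args) {
        long sink = run(100000);              // warm-up pass, not timed
        long startTime = System.nanoTime();
        sink += run(1000000);                 // timed pass
        long duration = System.nanoTime() - startTime;
        System.out.println("Duration=" + (duration / 1e9)
            + " (checksum " + sink + ")");
    }
}
```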
| From | markspace <-@.> |
|---|---|
| Date | 2011-06-24 09:04 -0700 |
| Message-ID | <iu2cj9$4o4$1@dont-email.me> |
| In reply to | #5644 |
On 6/24/2011 8:38 AM, lewbloch wrote:
> You'll want to run each loop a bunch of times (10,000? 100,000? >=
> 1M?) before starting the timing loop in order to cancel the effects of
> HotSpot warmup.
I was hoping that the -server flag would obviate most or all warm-up
(this is from another post in this thread):
C:\Users\Brenden\Dev\Test2\src>java -server test.CallingTest
No?
| From | Lew <noone@lewscanon.com> |
|---|---|
| Date | 2011-06-26 13:43 -0400 |
| Message-ID | <iu7r41$hhr$2@news.albasani.net> |
| In reply to | #5647 |
On 06/24/2011 12:04 PM, markspace wrote:
> On 6/24/2011 8:38 AM, lewbloch wrote:
>
>> You'll want to run each loop a bunch of times (10,000? 100,000? >=
>> 1M?) before starting the timing loop in order to cancel the effects of
>> HotSpot warmup.
>
> I was hoping that the -server flag would obviate most or all warm-up (this is
> from another post in this thread):
>
> C:\Users\Brenden\Dev\Test2\src>java -server test.CallingTest
>
> No?
No. How could it?
--
Lew
Honi soit qui mal y pense.
http://upload.wikimedia.org/wikipedia/commons/c/cf/Friz.jpg
| From | Lew <noone@lewscanon.com> |
|---|---|
| Date | 2011-06-26 14:31 -0400 |
| Message-ID | <iu7tt5$nkr$1@news.albasani.net> |
| In reply to | #5680 |
On 06/26/2011 01:43 PM, Lew wrote:
> On 06/24/2011 12:04 PM, markspace wrote:
>> On 6/24/2011 8:38 AM, lewbloch wrote:
>>
>>> You'll want to run each loop a bunch of times (10,000? 100,000? >=
>>> 1M?) before starting the timing loop in order to cancel the effects of
>>> HotSpot warmup.
>>
>> I was hoping that the -server flag would obviate most or all warm-up (this is
>> from another post in this thread):
>>
>> C:\Users\Brenden\Dev\Test2\src>java -server test.CallingTest
>>
>> No?
>
> No. How could it?
"-server
"JVMs based on Sun's Hotspot technology initially compile class methods with
a low optimization level. These JVMs use a simple complier [sic] and an
optimizing JIT compiler. Normally the simple JIT compiler is used. However you
can use this option to make the optimizing compiler the one that is used. This
change will significantly increases [sic] the performance of the server but the
server takes longer to warm up when the optimizing compiler is used."
<http://publib.boulder.ibm.com/infocenter/wasinfo/v6r0/index.jsp?topic=/com.ibm.websphere.express.doc/info/exp/ae/tprf_tunejvm.html>
--
Lew
Honi soit qui mal y pense.
http://upload.wikimedia.org/wikipedia/commons/c/cf/Friz.jpg
| From | Gene Wirchenko <genew@ocis.net> |
|---|---|
| Date | 2011-06-24 11:45 -0700 |
| Message-ID | <aim907llv5vnl8910jb7bim7htjm4a4ben@4ax.com> |
| In reply to | #5626 |
On Thu, 23 Jun 2011 21:02:59 -0700, markspace <-@.> wrote:
>On 6/23/2011 7:48 PM, Gene Wirchenko wrote:
>> On Thu, 23 Jun 2011 18:30:21 -0700, markspace<-@.> wrote:
>>
>>> Some other things I'd do:
>>
>>> 1. Make life easier on the Mark I eyeball and print out your timing in
>>> seconds:
>>>
>>> System.out.println( " Duration=" + (Duration/1e9) );
>>
>> Yes, if this were not just test code.
>
>Test code is probably more important to get right than the actual
>production code. You'll be using the test code more than the production. ;)
No. I expect that I will be using the resulting preprocessor for
years. The test code will be tossed shortly.
>I have another question. This code here, most of this is debugging?
>And you take the entire string and concatenate the characters together,
>but you don't actually parse this into tokens:
Yes, I do. The only tokens that I am interested in are
identifiers.
[snip]
>You're just double-checking that SequentialSearch and TreesetSearch return
>the same thing? And you concat the whole cParseString into cIdent, you
>don't look for white space or anything and break cParseString into tokens?
No. I am just after the timing.
The only tokens that I am interested in are the identifiers. The
identifiers are defined as sequences of characters in IdentChars or
IdentCharsSet. The rest of the input will be echoed.
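A minimal sketch of the scan described above, assuming an identifier character set of ASCII letters, digits, and underscore (the thread's actual IdentChars contents are not shown). The class and method names are hypothetical. It collects maximal runs of identifier characters; a real preprocessor would echo everything else unchanged.

```java
import java.util.ArrayList;
import java.util.List;

public class IdentScanner {
    // Assumed identifier characters; the thread's IdentChars is not shown.
    static final String IDENT_CHARS =
        "ABCDEFGHIJKLMNOPQRSTUVWXYZabcdefghijklmnopqrstuvwxyz0123456789_";

    /** Returns each maximal run of identifier characters found in the input. */
    public static List<String> identifiers(String input) {
        List<String> found = new ArrayList<>();
        StringBuilder current = new StringBuilder();
        for (int i = 0; i < input.length(); i++) {
            char c = input.charAt(i);
            if (IDENT_CHARS.indexOf(c) >= 0) {
                current.append(c);              // still inside an identifier
            } else if (current.length() > 0) {
                found.add(current.toString());  // a non-identifier char ends the run
                current.setLength(0);
            }
            // Non-identifier characters would be echoed to the output here.
        }
        if (current.length() > 0) {
            found.add(current.toString());      // identifier that runs to end of input
        }
        return found;
    }

    public static void main(String[] args) {
        System.out.println(identifiers("for i=FROM to TO"));
    }
}
```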
>Don't worry about Hash Test, I have something faster now.
But *I* want something faster. That was the whole point of this
testing: to find out which way was fastest.
[snip]
Sincerely,
Gene Wirchenko
| From | markspace <-@.> |
|---|---|
| Date | 2011-06-24 12:19 -0700 |
| Message-ID | <iu2o12$li3$1@dont-email.me> |
| In reply to | #5651 |
On 6/24/2011 11:45 AM, Gene Wirchenko wrote:
> No. I expect that I will be using the resulting preprocessor for
> years. The test code will be tossed shortly.
You have a couple of problems with your code, one organizational and the
other in understanding the efficiencies.
The organizational one relates to the idea that you'll just toss your
tests away. Don't ever do that! The test code is part of the project,
and should remain with it. Test code is also put under code control,
and managed along with the projects. It's important because every time
you want to change your parser, you'll need to re-run the tests to make
sure everything is working.
Are you using an IDE? Most will auto-generate a test framework for you.
You should be writing tests regardless of how you write code; the IDE
just makes it very handy.
> No. I am just after the timing.
The other thing, efficiency, I'll show you right now. The
organizational stuff is actually probably a bigger deal, but I think
you'll be happy to see how to make code faster.
This line here is the biggest offender.
> cIdent += CurrChar;
This is super inefficient inside a loop. To do this, the system has to
create a new string with one extra character, and then toss away the old
string. Making a new object and tossing an old one is bound to slow you
down.
final public void parse() {
    StringBuilder sb1 = new StringBuilder( 255 );
    for( int xScan = 0; xScan < TimingTesting.cParseString.length(); xScan++ ) {
        char c = TimingTesting.cParseString.charAt( xScan );
        if( find( c ) ) {
            sb1.append( c );
        }
    }
    String ... = sb1.toString();
Here's my adaptation of your loop. Notice I make a StringBuilder once,
outside the loop, and call append() inside the loop, which is much much
faster. Then I call toString once outside the loop again, so I only
create a new String once, not each time inside the loop. Try to
refactor your code to do this, it will make it much faster.
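A self-contained sketch of the two approaches side by side, for comparison. `Character.isLetterOrDigit` stands in for the thread's `find( c )`, which is not shown, and the class and method names are hypothetical. Both methods return the same string; only the number of intermediate objects differs.

```java
public class ConcatVsBuilder {
    /** Builds the result with +=, creating a new String on every append. */
    public static String withConcat(String input) {
        String out = "";
        for (int i = 0; i < input.length(); i++) {
            char c = input.charAt(i);
            if (Character.isLetterOrDigit(c)) { // stand-in for find( c )
                out += c;
            }
        }
        return out;
    }

    /** Builds the result in one StringBuilder and converts to String once. */
    public static String withBuilder(String input) {
        StringBuilder sb = new StringBuilder(input.length());
        for (int i = 0; i < input.length(); i++) {
            char c = input.charAt(i);
            if (Character.isLetterOrDigit(c)) { // stand-in for find( c )
                sb.append(c);
            }
        }
        return sb.toString();
    }

    public static void main(String[] args) {
        String sample = "for i=FROM to TO";
        System.out.println(withConcat(sample).equals(withBuilder(sample)));
    }
}
```

Sizing the builder from the input length avoids internal array growth, since the result can never be longer than the input.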
One last thing for now: on splitting a string into tokens, look at this:
String[] tok = TimingTesting.cParseString.split( "[^a-zA-Z0-9]+" );
System.out.println( Arrays.toString( tok ) );
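A small runnable version of the split call above, using sample lines rather than the thread's cParseString (which is not shown). The class name is hypothetical. One detail to watch: when the input begins with a delimiter, `split` returns an empty string as the first element.

```java
import java.util.Arrays;

public class SplitDemo {
    /** Splits on runs of characters that are not ASCII letters or digits. */
    public static String[] tokens(String line) {
        return line.split("[^a-zA-Z0-9]+");
    }

    public static void main(String[] args) {
        // prints [for, i, FROM, to, TO]
        System.out.println(Arrays.toString(tokens("for i=FROM to TO")));
        // The line starts with a delimiter, so the first token is empty:
        // prints [, i]
        System.out.println(Arrays.toString(tokens("? i")));
    }
}
```

Note that split discards the delimiters, so it suits extracting tokens but not a tool that must echo the rest of the input unchanged.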
| From | Gene Wirchenko <genew@ocis.net> |
|---|---|
| Date | 2011-06-26 20:39 -0700 |
| Message-ID | <mguf07h7vl3aju927nfj67h4gh6naullgu@4ax.com> |
| In reply to | #5655 |
On Fri, 24 Jun 2011 12:19:58 -0700, markspace <-@.> wrote:
>On 6/24/2011 11:45 AM, Gene Wirchenko wrote:
>
>> No. I expect that I will be using the resulting preprocessor for
>> years. The test code will be tossed shortly.
>
>You have a couple of problems with your code, one organizational and the
>other in understanding the efficiencies.
>
>The organizational one relates to the idea that you'll just toss your
>tests away. Don't ever do that! The test code is part of the project,
You are misunderstanding. The test code that I am referring to
is proof-of-concept code to test ideas, *not* my test cases.
>and should remain with it. Test code is also put under code control,
>and managed along with the projects. It's important because every time
>you want to change your parser, you'll need to re-run the tests to make
>sure everything is working.
>
>Are you using an IDE? Most will auto generate a test framework for you.
> It's very handy and you should be doing this regardless how you write
>code. The IDE just makes it very handy.
>
>> No. I am just after the timing.
>
>The other thing, efficiency, I'll show you right now. The
>organizational stuff is actually probably a bigger deal, but I think
>you'll be happy to see how to make code faster.
>
>This line here is the biggest offender.
>
>> cIdent += CurrChar;
>
>This is super inefficient inside a loop. To do this, the system has to
>create a new string with one extra character, and then toss away the old
>string. Making a new object and tossing an old one is bound to slow you
>down.
>
>final public void parse() {
> StringBuilder sb1 = new StringBuilder( 255 );
> for( int xScan = 0; xScan <
>TimingTesting.cParseString.length(); xScan++ ) {
> char c = TimingTesting.cParseString.charAt( xScan );
> if( find( c ) ) {
> sb1.append( c );
> }
> }
> String ... = sb1.toString()
>
>Here's my adaptation of your loop. Notice I make a StringBuilder once,
>outside the loop, and call append() inside the loop, which is much much
>faster. Then I call toString once outside the loop again, so I only
>create a new String once, not each time inside the loop. Try to
>refactor your code to do this, it will make it much faster.
I have heard about the String/StringBuilder dichotomy. I will be
addressing it.
>One last thing for now: on splitting a string into tokens, look at this:
>
> String[] tok = TimingTesting.cParseString.split( "[^a-zA-Z0-9]+" );
> System.out.println( Arrays.toString( tok ) );
But I do not want to do that. I am writing a preprocessor to
process files like:
***** Start of Test File *****
* testin.dat
* Test Input File for Preprocessor
* Last Modification: 2011-06-16
*
* This is VFP code.
$idchars ABC 1 2 A
$idchars
$quotes "" '' [] ~
$rem testin2.dat contains the definitions of STARTTEXT and ENDTEXT.
$rem
$include "testin2.dat"
$include testin2.dat
$include ~Atestin.datA
$include "testin2.dat"X
$include ~Atestin.datAX
$define FROM 1
$define TO 10
set talk off
? "STARTTEXT"
for i=FROM to TO
? i
endfor
? ENDTEXT
return
$undef FROM
$undef TO
***** End of Test File *****
Sincerely,
Gene Wirchenko