Groups | Search | Server Info | Keyboard shortcuts | Login | Register [http] [https] [nntp] [nntps]
Groups > comp.lang.java.programmer > #8832 > unrolled thread
| Started by | Aéris <aeris@imirhil.fr> |
|---|---|
| First post | 2011-10-15 23:03 +0200 |
| Last post | 2011-10-22 21:11 +0200 |
| Articles | 20 on this page of 25 — 15 participants |
Back to article view | Back to comp.lang.java.programmer
Getter performance Aéris <aeris@imirhil.fr> - 2011-10-15 23:03 +0200
Re: Getter performance Arne Vajhøj <arne@vajhoej.dk> - 2011-10-15 17:36 -0400
Re: Getter performance Arne Vajhøj <arne@vajhoej.dk> - 2011-10-15 17:42 -0400
Re: Getter performance BGB <cr88192@hotmail.com> - 2011-10-15 15:00 -0700
Re: Getter performance markspace <-@.> - 2011-10-15 15:20 -0700
Re: Getter performance David Lamb <dalamb@cs.queensu.ca> - 2011-10-20 12:45 -0400
Re: Getter performance Roedy Green <see_website@mindprod.com.invalid> - 2011-10-21 14:27 -0700
Re: Getter performance Daniel Pitts <newsgroup.nospam@virtualinfinity.net> - 2011-10-21 18:57 -0700
Re: Getter performance Patricia Shanahan <pats@acm.org> - 2011-10-22 07:27 +0100
Re: Getter performance Arved Sandstrom <asandstrom3minus1@eastlink.ca> - 2011-10-22 09:57 -0300
Re: Getter performance Eric Sosman <esosman@ieee-dot-org.invalid> - 2011-10-21 22:12 -0400
Re: Getter performance BGB <cr88192@hotmail.com> - 2011-11-13 10:14 -0700
Re: Getter performance Lew <lewbloch@gmail.com> - 2011-11-13 11:00 -0800
Re: Getter performance Arne Vajhøj <arne@vajhoej.dk> - 2011-11-06 16:09 -0500
Re: Getter performance Aéris <aeris@imirhil.fr> - 2011-10-15 23:59 +0200
Re: Getter performance Arne Vajhøj <arne@vajhoej.dk> - 2011-10-15 19:44 -0400
Re: Getter performance Aéris <aeris@imirhil.fr> - 2011-10-16 13:14 +0200
Re: Getter performance Lars Enderin <lars.enderin@telia.com> - 2011-10-16 16:28 +0200
Re: Getter performance Eric Sosman <esosman@ieee-dot-org.invalid> - 2011-10-16 09:47 -0400
Re: Getter performance Jaap Droogers <JaapDroogers@unusable.meel.homelinux.net> - 2011-10-16 22:12 +0200
Re: Getter performance BGB <cr88192@hotmail.com> - 2011-10-16 13:58 -0700
Re: Getter performance David Lamb <dalamb@cs.queensu.ca> - 2011-10-20 12:51 -0400
Re: Getter performance Paul Cager <paul.cager@googlemail.com> - 2011-10-21 08:49 -0700
Re: Getter performance Roedy Green <see_website@mindprod.com.invalid> - 2011-10-21 08:02 -0700
Re: Getter performance Wanja Gayk <brixomatic@yahoo.com> - 2011-10-22 21:11 +0200
Page 1 of 2 [1] 2 Next page →
| From | Aéris <aeris@imirhil.fr> |
|---|---|
| Date | 2011-10-15 23:03 +0200 |
| Subject | Getter performance |
| Message-ID | <4e99f537$0$623$426a74cc@news.free.fr> |
-----BEGIN PGP SIGNED MESSAGE----- Hash: SHA1 Hello everybody. I work on an application where performances are important. To optimize it, I thought a direct access to a variable (foo.bar) would be more efficient than a getter call (foo.getBar()). I thought by avoiding the call to a method and all that goes with it (context switching, stacking, return value…), I can save time, but this code sample prove the contrary : http://pastebin.com/bP1nqxce Direct variable access : 1041 ms Getter call : 556 ms The difference is even more important if I don't modify the variable value (lines 29 and 36 commented) : Direct variable access : 95 ms Getter call : 4 ms How can we explain this not obvious huge difference ( 50 and 95% ) ? - -- Aeris -----BEGIN PGP SIGNATURE----- Version: GnuPG v1.4.11 (GNU/Linux) Comment: Using GnuPG with Mozilla - http://enigmail.mozdev.org/ iQEcBAEBAgAGBQJOmfUwAAoJEK8zQvxDY4P97f4H/2qXiwLnDwuKxsHEkGYtk0va HSASqi5Dj5g0RxD6h4G65UveBI1ALLFZAJmjrsklOl2OgN5YSvmudxpLBzcdk53S 4ans2g5lLzc21Mw5UtdW5oz3TeZZ6Z0Cbna9MloDopw6wJM06VzblMPbNr+XQfwt S49rnd0RckKToR8lzveIJXNWuPmxKmh+JsV85ZFWepSLXB4xhsGCxuNdWq/Opjq5 s1nPvCUL5R6vKEbspyV4BRqrDEXwNKE4EPTHqgTNkklc9IMNTDjM0bPM2PhJwKlW iX8xFqMeyizSAOPmNdaRq907LLSZhFo5jSc2JsKBj7AYb4oy8ewP0ozYnDMaFt8= =AztU -----END PGP SIGNATURE-----
[toc] | [next] | [standalone]
| From | Arne Vajhøj <arne@vajhoej.dk> |
|---|---|
| Date | 2011-10-15 17:36 -0400 |
| Message-ID | <4e99fcdf$0$295$14726298@news.sunsite.dk> |
| In reply to | #8832 |
On 10/15/2011 5:03 PM, Aéris wrote: > I work on an application where performances are important. > > To optimize it, I thought a direct access to a variable (foo.bar) would > be more efficient than a getter call (foo.getBar()). Wrong approach. Write nice clean code. If it is fast enough then fine. If not then measure where the bottleneck is. It seems highly unlikely to be in getters. > I thought by avoiding the call to a method and all that goes with it > (context switching, stacking, return value…), I can save time, but this > code sample prove the contrary : > http://pastebin.com/bP1nqxce > Direct variable access : 1041 ms > Getter call : 556 ms > > The difference is even more important if I don't modify the variable > value (lines 29 and 36 commented) : > Direct variable access : 95 ms > Getter call : 4 ms > > How can we explain this not obvious huge difference ( 50 and 95% ) ? First thing would be to run a lot more than 100000 times. Such small intervals will be very random on a multi tasking OS. Arne
[toc] | [prev] | [next] | [standalone]
| From | Arne Vajhøj <arne@vajhoej.dk> |
|---|---|
| Date | 2011-10-15 17:42 -0400 |
| Message-ID | <4e99fe53$0$281$14726298@news.sunsite.dk> |
| In reply to | #8834 |
On 10/15/2011 5:36 PM, Arne Vajhøj wrote: > On 10/15/2011 5:03 PM, Aéris wrote: >> I work on an application where performances are important. >> >> To optimize it, I thought a direct access to a variable (foo.bar) would >> be more efficient than a getter call (foo.getBar()). > > Wrong approach. Write nice clean code. If it is fast enough then > fine. If not then measure where the bottleneck is. It seems highly > unlikely to be in getters. > >> I thought by avoiding the call to a method and all that goes with it >> (context switching, stacking, return value…), I can save time, but this >> code sample prove the contrary : >> http://pastebin.com/bP1nqxce >> Direct variable access : 1041 ms >> Getter call : 556 ms >> >> The difference is even more important if I don't modify the variable >> value (lines 29 and 36 commented) : >> Direct variable access : 95 ms >> Getter call : 4 ms >> >> How can we explain this not obvious huge difference ( 50 and 95% ) ? > > First thing would be to run a lot more than 100000 times. Such > small intervals will be very random on a multi tasking OS. With 1000000 and -server I get: 1763 1608 which is rather close. Arne
[toc] | [prev] | [next] | [standalone]
| From | BGB <cr88192@hotmail.com> |
|---|---|
| Date | 2011-10-15 15:00 -0700 |
| Message-ID | <j7cvqo$tcp$1@news.albasani.net> |
| In reply to | #8835 |
On 10/15/2011 2:42 PM, Arne Vajhøj wrote: > On 10/15/2011 5:36 PM, Arne Vajhøj wrote: >> On 10/15/2011 5:03 PM, Aéris wrote: >>> I work on an application where performances are important. >>> >>> To optimize it, I thought a direct access to a variable (foo.bar) would >>> be more efficient than a getter call (foo.getBar()). >> >> Wrong approach. Write nice clean code. If it is fast enough then >> fine. If not then measure where the bottleneck is. It seems highly >> unlikely to be in getters. >> >>> I thought by avoiding the call to a method and all that goes with it >>> (context switching, stacking, return value…), I can save time, but this >>> code sample prove the contrary : >>> http://pastebin.com/bP1nqxce >>> Direct variable access : 1041 ms >>> Getter call : 556 ms >>> >>> The difference is even more important if I don't modify the variable >>> value (lines 29 and 36 commented) : >>> Direct variable access : 95 ms >>> Getter call : 4 ms >>> >>> How can we explain this not obvious huge difference ( 50 and 95% ) ? >> >> First thing would be to run a lot more than 100000 times. Such >> small intervals will be very random on a multi tasking OS. > > With 1000000 and -server I get: > > 1763 > 1608 > > which is rather close. > this is an interesting result... I was aware that the JVM optimized getters, but unclear is why they are slightly faster than direct field access, rather than exactly the same speed (or sometimes slightly faster and sometimes slightly slower).
[toc] | [prev] | [next] | [standalone]
| From | markspace <-@.> |
|---|---|
| Date | 2011-10-15 15:20 -0700 |
| Message-ID | <j7d0v7$eo5$1@dont-email.me> |
| In reply to | #8837 |
On 10/15/2011 3:00 PM, BGB wrote: > On 10/15/2011 2:42 PM, Arne Vajhøj wrote: >> With 1000000 and -server I get: >> >> 1763 >> 1608 >> >> which is rather close. > this is an interesting result... > > I was aware that the JVM optimized getters, but unclear is why they are > slightly faster than direct field access, rather than exactly the same > speed (or sometimes slightly faster and sometimes slightly slower). I don't think the result indicates that getters are slightly faster, or vice versa. It's just to close too call, and there's isn't really anyway to say one is universally better, speed-wise. Software engineering-wise, and best practice-wise, the getter is undoubtedly the better option.
[toc] | [prev] | [next] | [standalone]
| From | David Lamb <dalamb@cs.queensu.ca> |
|---|---|
| Date | 2011-10-20 12:45 -0400 |
| Message-ID | <KeYnq.9220$wa5.7836@newsfe17.iad> |
| In reply to | #8838 |
On 15/10/2011 6:20 PM, markspace wrote: > On 10/15/2011 3:00 PM, BGB wrote: > >> On 10/15/2011 2:42 PM, Arne Vajhøj wrote: >>> With 1000000 and -server I get: >>> >>> 1763 >>> 1608 >>> >>> which is rather close. > >> this is an interesting result... >> >> I was aware that the JVM optimized getters, but unclear is why they are >> slightly faster than direct field access, rather than exactly the same >> speed (or sometimes slightly faster and sometimes slightly slower). > > > I don't think the result indicates that getters are slightly faster, or > vice versa. It's just to close too call, and there's isn't really anyway > to say one is universally better, speed-wise. A really thorough investigation would have to take into account a lot of performance factors; on multitasking systems (that is, most systems) a single expensive process swap that happened on one loop and not the other would make a huge difference. You'd need to run the whole experiment several times (not just lots of times in one program run). But markspace' comment is almost certainly correct: too close to call. > Software engineering-wise, and best practice-wise, the getter is > undoubtedly the better option. Exactly.
[toc] | [prev] | [next] | [standalone]
| From | Roedy Green <see_website@mindprod.com.invalid> |
|---|---|
| Date | 2011-10-21 14:27 -0700 |
| Message-ID | <lfo3a75gk7h46q1csvk6k6jrks9qce80ni@4ax.com> |
| In reply to | #9035 |
On Thu, 20 Oct 2011 12:45:30 -0400, David Lamb <dalamb@cs.queensu.ca> wrote, quoted or indirectly quoted someone who said : > You'd need to run the whole >experiment several times (not just lots of times in one program run). >But markspace' comment is almost certainly correct: too close to call One thing that bedevils such testing is garbage collection which happens at "random" times and makes things look slower than they really are. That is why you need very long tests to average them out. See http://mindprod.com/jgloss/benchmark.html for some hints. You to avoid the initial class load time or the Hotspot time to optimise if you are after just the speed of the final code. Knuth has frightened people from even investigating speed out of curiosity. He was trying to stop people from writing needlessly optimised unreadable code. If there are two ways of doing things, both equally readable and maintainable, it strikes me as silly to blind yourself to knowing which is faster. You might as well do it that way whether it is necessary or not. The approach I see advocated could be parodied by saying you should select a random sort algorithm including insertion sort, and bubble sort and only putting in a clever algorithm after benchmarking and analysis prove it is necessary. I think you should know without benchmarking each individual case which Map or Collection would be best for a given task. -- Roedy Green Canadian Mind Products http://mindprod.com It should not be considered an error when the user starts something already started or stops something already stopped. This applies to browsers, services, editors... It is inexcusable to punish the user by requiring some elaborate sequence to atone, e.g. open the task editor, find and kill some processes.
[toc] | [prev] | [next] | [standalone]
| From | Daniel Pitts <newsgroup.nospam@virtualinfinity.net> |
|---|---|
| Date | 2011-10-21 18:57 -0700 |
| Message-ID | <rqpoq.6270$UK6.64@newsfe06.iad> |
| In reply to | #9079 |
On 10/21/11 2:27 PM, Roedy Green wrote: > I think you should know without benchmarking each individual case > which Map or Collection would be best for a given task. I nearly agree. An experienced programmer should know which Map or Collection would work sufficiently well for a given task. "Best", by some definitions, would almost always be a custom class (perhaps not even implementing Map or Collection). The thing is that best may be much better than needed.
[toc] | [prev] | [next] | [standalone]
| From | Patricia Shanahan <pats@acm.org> |
|---|---|
| Date | 2011-10-22 07:27 +0100 |
| Message-ID | <pemdnXtu9MKv_z_TnZ2dnUVZ_uadnZ2d@earthlink.com> |
| In reply to | #9083 |
Daniel Pitts wrote: > On 10/21/11 2:27 PM, Roedy Green wrote: >> I think you should know without benchmarking each individual case >> which Map or Collection would be best for a given task. > I nearly agree. > > An experienced programmer should know which Map or Collection would work > sufficiently well for a given task. "Best", by some definitions, would > almost always be a custom class (perhaps not even implementing Map or > Collection). The thing is that best may be much better than needed. Also, even between two algorithms, "best" may require a lot of experiments, and be machine dependent. For example, consider binary vs. linear search of a sequential array. Binary search does O(log n) comparisons, but with unpredictable conditional branches leading to a high cost per branch. Linear search does O(n) comparisons, but most of its conditional branches go the same way, making them relatively cheap. They are of about equal coding cost, because Arrays has methods to do binary search, and linear search is extremely simple. For very short arrays I would definitely use linear search. For very large arrays I would definitely use binary search. How about an array length 100? Which is faster? I think either would be "sufficiently good" except in very rare conditions. If I had a program with a performance bottleneck on a search of a length 100 array, I would measure in the context of that program. Patricia
[toc] | [prev] | [next] | [standalone]
| From | Arved Sandstrom <asandstrom3minus1@eastlink.ca> |
|---|---|
| Date | 2011-10-22 09:57 -0300 |
| Message-ID | <X4zoq.10182$eY3.4818@newsfe15.iad> |
| In reply to | #9083 |
On 11-10-21 10:57 PM, Daniel Pitts wrote: > On 10/21/11 2:27 PM, Roedy Green wrote: >> I think you should know without benchmarking each individual case >> which Map or Collection would be best for a given task. > I nearly agree. > > An experienced programmer should know which Map or Collection would work > sufficiently well for a given task. "Best", by some definitions, would > almost always be a custom class (perhaps not even implementing Map or > Collection). The thing is that best may be much better than needed. Roedy may have meant "best" in this sense. It's a word with lots of definitions, one of which is "most suitable". For a software developer/engineer I'd think that "most suitable" would always be the reigning definition of "best"; another definition (like the one you posit as a possibility, leading perhaps to a custom class) might be more of a computer science thing. Roedy's right, though, with his overall comments, although I wouldn't blame anything on The Donald [1]. For starters, way too many people - having never read anything by Knuth except various quotes, have only ever seen "premature optimization is the root of all evil" in isolation. They've never read the important sentences surrounding that one, that supply essential context. I've seen it myself, that folks seize on this one sentence and disparage reasonable design-time efforts to think about performance. And I've heard often enough from other developers who should know better that we should "write it first and we'll profile it later if it's too slow". Or a memory hog. These folks forget that Knuth talked about _premature_ optimization: there's a lot of design and implementation-time optimization that ain't. c2.com on their page about "UniformlySlowCode" (http://c2.com/cgi/wiki?UniformlySlowCode) discusses a situation that can happen a lot if you don't practice experience-seasoned non-premature optimization. I've seen this happen more than once, that an app is sluggish and a memory beast, and yet when you analyze it you find that you have to fix, well, everything. AHS 1. Man, how can Trump be The Donald? It's just not right. :-) I've set matters straight here. -- I tend to watch a little TV... Court TV, once in a while. Some of the cases I get interested in. -- O. J. Simpson
[toc] | [prev] | [next] | [standalone]
| From | Eric Sosman <esosman@ieee-dot-org.invalid> |
|---|---|
| Date | 2011-10-21 22:12 -0400 |
| Message-ID | <j7t8pu$1gk$2@dont-email.me> |
| In reply to | #9079 |
On 10/21/2011 5:27 PM, Roedy Green wrote:
>[...]
> Knuth has frightened people from even investigating speed out of
> curiosity.
Oh, yeah, right. As in "Ri-i-i-i-ght."
That's why he didn't write three entire volumes of TAOCP (and
more in the works) in an effort to reach beyond O() notation to the
constant factors that lie behind. That's why he didn't devote a
whole lot of text and a whole lot of subtle mathematics to winkling
out differences between O(n log n) Quicksort and O(n log n) Mergesort
and O(n log n) Heapsort and O(n log n(?)) Shellsort. That's why he
never explored threaded trees (they improve some operations by a
factor of only O(log n) so who cares?). That's why he never bothered
with Marsaglia's rectangle-wedge-tail method for generating normally-
distributed random numbers; it's just a trifling micro-optimization.
Oh, sorry, wait: By "frightened" you don't mean "dissuaded," but
literally "scared." Well, I confess that some of Knuth's math goes
over my head. But I feel at the same time that someone who shrinks
from even trying to understand it does not merit the title of
"engineer."
--
Eric Sosman
esosman@ieee-dot-org.invalid
[toc] | [prev] | [next] | [standalone]
| From | BGB <cr88192@hotmail.com> |
|---|---|
| Date | 2011-11-13 10:14 -0700 |
| Message-ID | <j9otv4$7j0$1@news.albasani.net> |
| In reply to | #9084 |
On 10/21/2011 7:12 PM, Eric Sosman wrote:
> On 10/21/2011 5:27 PM, Roedy Green wrote:
>> [...]
>> Knuth has frightened people from even investigating speed out of
>> curiosity.
>
> Oh, yeah, right. As in "Ri-i-i-i-ght."
>
> That's why he didn't write three entire volumes of TAOCP (and
> more in the works) in an effort to reach beyond O() notation to the
> constant factors that lie behind. That's why he didn't devote a
> whole lot of text and a whole lot of subtle mathematics to winkling
> out differences between O(n log n) Quicksort and O(n log n) Mergesort
> and O(n log n) Heapsort and O(n log n(?)) Shellsort. That's why he
> never explored threaded trees (they improve some operations by a
> factor of only O(log n) so who cares?). That's why he never bothered
> with Marsaglia's rectangle-wedge-tail method for generating normally-
> distributed random numbers; it's just a trifling micro-optimization.
>
> Oh, sorry, wait: By "frightened" you don't mean "dissuaded," but
> literally "scared." Well, I confess that some of Knuth's math goes
> over my head. But I feel at the same time that someone who shrinks
> from even trying to understand it does not merit the title of
> "engineer."
>
I tend to distinguish between what would be called "micro optimizations"
and "macro optimization".
worrying about the cost of an if/else block, a switch, or performing a
function/method call, is a micro optimization.
worrying about the choice of an algorithm or program architecture is a
macro-optimization. that is more about what most of the examples were about.
generally, macro-optimizations can be reasoned about, and improved upon
from experience.
OTOH, most micro-optimizations are worrying about small constant
factors, and tend to make code look nasty if used (and so are better off
used sparingly).
the main issue is that many people, when faced with poor performance (or
worry about possible poor performance), look right to trying to
micro-optimize whatever they perceive "might be" slow (which tends to be
bottom-level costs, like the costs of some operation X, ...), rather
than evaluating the larger scale architectural issues ("is there some
way I can avoid this operation altogether?").
but, there can be some overlap:
for example, I recently migrated an interpreter of mine from using a
"loop-and-switch" strategy to using threaded code.
this could be argued to be a micro-optimization (me afraid of the small
constant overhead of the switch), but actually it was more motivated by
flexibility: there are some things which can be done with threaded code
which can't be so readily done using fixed-form logic within a switch.
so, debatably, it wasn't even really an optimization in the first place,
it only had the side-effect of slightly improving performance.
OTOH, I recently also made an optimization to my type-checking code
which reduced the cost of a type-lookup operation by feeding the pointer
through a hash (seeing if the pointer recently had its type looked up).
however, even though it did effect behavior (it skipped the more
expensive operation of looking up an object in the heap), I still
classify this as a micro-optimization (however, to my defense, the
profiler was showing a lot of time going into type-lookups in this case).
similarly, it exhibited the usual micro-optimization property in that it
added some hair to the effected code (some logic to worry about a hash,
and a few calls added elsewhere to flush the hash).
more so, it is not likely to be a uniformly distributed optimization: it
will not improve the performance of type-checking things like fixnums,
which don't involve such a heap lookup in the first place.
or such...
[toc] | [prev] | [next] | [standalone]
| From | Lew <lewbloch@gmail.com> |
|---|---|
| Date | 2011-11-13 11:00 -0800 |
| Message-ID | <4525243.79.1321210833177.JavaMail.geo-discussion-forums@prgt40> |
| In reply to | #9916 |
BGB wrote:
> Eric Sosman wrote:
>> Roedy Green wrote:
>>> [...]
>>> Knuth has frightened people from even investigating speed out of
>>> curiosity.
>>
>> Oh, yeah, right. As in "Ri-i-i-i-ght."
+1
Blaming Knuth is about as accurate as saying Jesus caused people to wage war.
>> That's why he didn't write three entire volumes of TAOCP (and
>> more in the works) in an effort to reach beyond O() notation to the
>> constant factors that lie behind. That's why he didn't devote a
>> whole lot of text and a whole lot of subtle mathematics to winkling
>> out differences between O(n log n) Quicksort and O(n log n) Mergesort
>> and O(n log n) Heapsort and O(n log n(?)) Shellsort. That's why he
>> never explored threaded trees (they improve some operations by a
>> factor of only O(log n) so who cares?). That's why he never bothered
>> with Marsaglia's rectangle-wedge-tail method for generating normally-
>> distributed random numbers; it's just a trifling micro-optimization.
>>
>> Oh, sorry, wait: By "frightened" you don't mean "dissuaded," but
>> literally "scared." Well, I confess that some of Knuth's math goes
Again, Knuth did not cause some low-browed programmer wannabe to feel fear; that's entirely their own responsibility.
>> over my head. But I feel at the same time that someone who shrinks
>> from even trying to understand it does not merit the title of
>> "engineer."
We all feel fear. It's what we do in the face of it that counts.
> I tend to distinguish between what would be called "micro optimizations"
> and "macro optimization".
>
> worrying about the cost of an if/else block, a switch, or performing a
> function/method call, is a micro optimization.
>
> worrying about the choice of an algorithm or program architecture is a
> macro-optimization. that is more about what most of the examples were about.
>
> generally, macro-optimizations can be reasoned about, and improved upon
> from experience.
>
> OTOH, most micro-optimizations are worrying about small constant
> factors, and tend to make code look nasty if used (and so are better off
> used sparingly).
>
>
> the main issue is that many people, when faced with poor performance (or
> worry about possible poor performance), look right to trying to
> micro-optimize whatever they perceive "might be" slow (which tends to be
> bottom-level costs, like the costs of some operation X, ...), rather
> than evaluating the larger scale architectural issues ("is there some
> way I can avoid this operation altogether?").
>
>
> but, there can be some overlap:
> for example, I recently migrated an interpreter of mine from using a
> "loop-and-switch" strategy to using threaded code.
>
> this could be argued to be a micro-optimization (me afraid of the small
> constant overhead of the switch), but actually it was more motivated by
> flexibility: there are some things which can be done with threaded code
> which can't be so readily done using fixed-form logic within a switch.
>
> so, debatably, it wasn't even really an optimization in the first place,
> it only had the side-effect of slightly improving performance.
>
>
> OTOH, I recently also made an optimization to my type-checking code
> which reduced the cost of a type-lookup operation by feeding the pointer
> through a hash (seeing if the pointer recently had its type looked up).
>
> however, even though it did effect behavior (it skipped the more
> expensive operation of looking up an object in the heap), I still
> classify this as a micro-optimization (however, to my defense, the
> profiler was showing a lot of time going into type-lookups in this case).
>
> similarly, it exhibited the usual micro-optimization property in that it
> added some hair to the effected code (some logic to worry about a hash,
> and a few calls added elsewhere to flush the hash).
>
> more so, it is not likely to be a uniformly distributed optimization: it
> will not improve the performance of type-checking things like fixnums,
> which don't involve such a heap lookup in the first place.
The problem isn't micro-optimization, as your experience supports. The problem, as Knuth said, is _premature_ optimization. That's optimization when you don't have enough data to support that what you do actually optimizes anything.
Your "macro-optimization" is optimization for which you have enough evidence early on. Choice of algorithm falls into this category.
The scenario you describe of incidental optimization, where a change made for algorithmic or modeling reasons improved performance, is not "optimization" in the sense of change intended to produce better performance. Unless you've done extensive measurement on multiple platforms in varied usage patterns, you actually don't have enough data to reliably aver improved performance, but since that's irrelevant to the change's purpose that's all right.
The scenario where you intentionally changed a detail of the code to improve performance, based on profiling and evidence that the work actually had the desired effect, may be "micro-optimization", but so what? It's not premature.
You actually demonstrated beautifully in your post what is and is not proper for optimization.
--
Lew
[toc] | [prev] | [next] | [standalone]
| From | Arne Vajhøj <arne@vajhoej.dk> |
|---|---|
| Date | 2011-11-06 16:09 -0500 |
| Message-ID | <4eb6f7a3$0$290$14726298@news.sunsite.dk> |
| In reply to | #9079 |
On 10/21/2011 5:27 PM, Roedy Green wrote: > Knuth has frightened people from even investigating speed out of > curiosity. He was trying to stop people from writing needlessly > optimised unreadable code. True - we never see questions about micro optimizing here on usenet. Hey wait a minute ! Arne
[toc] | [prev] | [next] | [standalone]
| From | Aéris <aeris@imirhil.fr> |
|---|---|
| Date | 2011-10-15 23:59 +0200 |
| Message-ID | <4e9a0253$0$27973$426a74cc@news.free.fr> |
| In reply to | #8834 |
-----BEGIN PGP SIGNED MESSAGE-----
Hash: SHA1
Le 15/10/2011 23:36, Arne Vajhøj a écrit :
> Wrong approach. Write nice clean code. If it is fast enough then
> fine. If not then measure where the bottleneck is. It seems highly
> unlikely to be in getters.
I agree with you, my code is clean and fast enough for the moment but
this question of getter is more general and nag me since few time
> First thing would be to run a lot more than 100000 times. Such
> small intervals will be very random on a multi tasking OS.
Tested on Integer.MAX_VALUE, same result.
But I notice that if I read « tmp » value, results are differents.
Adding
if (tmp.isEmpty()) {
System.out.println("Empty");
}
leads to more expected result :
Direct access : 106 ms
Getter call : 2223 ms
Seems Java compiler is very efficient and don't really call the getter
if the return value is not read and the getter has no side-effect.
- --
Aeris
-----BEGIN PGP SIGNATURE-----
Version: GnuPG v1.4.11 (GNU/Linux)
Comment: Using GnuPG with Mozilla - http://enigmail.mozdev.org/
iQEcBAEBAgAGBQJOmgJOAAoJEK8zQvxDY4P9wqIIAJnfxjQgsoUMmKr7CxIamTBr
KeQNH7gCGX709/d2ZAb3PCRQ9UTKiuteEU8JHal6735ziyifIBtwxHBtQTB6jduC
74uW1aMzMUYJbRpBL0wViL7SYW7Ob57a1Oz8/bYl1+htDEVb2FWMMmfgbhDf9HeT
MXMilvcCVlDs4OxG3NnFL14Y7Nv1He6YzV3BaMOIz1serH9pZy1/NCgdazfhwrZH
6bGwtYET2hbzabda9o2CrX8jRjenyLWz1ad9QPanLvHDWJRf+Y316lNXLmvhbR++
+cQzQJ8hmj9nlyFGJuQ1tG0VzTE/E/qgcdN/sCOWiApC4Nhsp3tUD+68H02bFTU=
=UnUI
-----END PGP SIGNATURE-----
[toc] | [prev] | [next] | [standalone]
| From | Arne Vajhøj <arne@vajhoej.dk> |
|---|---|
| Date | 2011-10-15 19:44 -0400 |
| Message-ID | <4e9a1ae6$0$282$14726298@news.sunsite.dk> |
| In reply to | #8836 |
On 10/15/2011 5:59 PM, Aéris wrote:
> Le 15/10/2011 23:36, Arne Vajhøj a écrit :
>> Wrong approach. Write nice clean code. If it is fast enough then
>> fine. If not then measure where the bottleneck is. It seems highly
>> unlikely to be in getters.
>
> I agree with you, my code is clean and fast enough for the moment but
> this question of getter is more general and nag me since few time
If this is for fun, then it is OK. 25 years ago I spent a lot
of time trying to optimize Sieve of Eratosthenes in VAX
assembler. Fun.
But this is work, then stop wasting your employers money and
focus on real problems.
>> First thing would be to run a lot more than 100000 times. Such
>> small intervals will be very random on a multi tasking OS.
>
> Tested on Integer.MAX_VALUE, same result.
>
> But I notice that if I read « tmp » value, results are differents.
> Adding
> if (tmp.isEmpty()) {
> System.out.println("Empty");
> }
> leads to more expected result :
> Direct access : 106 ms
> Getter call : 2223 ms
> Seems Java compiler is very efficient and don't really call the getter
> if the return value is not read and the getter has no side-effect.
The JIT compiler can do many things.
Another problem with microbenchmarks is that they typical
measure scenario X1 vs Y1 and there are no guarantee that
X1 faster than Y1 implies that sum(Wi*Xi) is faster
that sum(Wi*Yi) for i=1..n, which is the real world.
Small differences can mean a lot of difference for a
single scenario, which is not relevant for the real world
with hundreds or thousands of almost identical but still
different scenarios.
Arne
[toc] | [prev] | [next] | [standalone]
| From | Aéris <aeris@imirhil.fr> |
|---|---|
| Date | 2011-10-16 13:14 +0200 |
| Message-ID | <4e9abc79$0$3198$426a34cc@news.free.fr> |
| In reply to | #8839 |
-----BEGIN PGP SIGNED MESSAGE----- Hash: SHA1 Le 16/10/2011 01:44, Arne Vajhøj a écrit : > But this is work, then stop wasting your employers money and > focus on real problems. This is not my employer's money, but my free time =) - -- Aeris -----BEGIN PGP SIGNATURE----- Version: GnuPG v1.4.11 (GNU/Linux) Comment: Using GnuPG with Mozilla - http://enigmail.mozdev.org/ iQEcBAEBAgAGBQJOmrx0AAoJEK8zQvxDY4P9zscIAL+ERs3ds9+Qc5dbsq/pGduN zAu7LmCOU37+pVuPKZuiwV7YQOwM4TDHutVLX3vfocYsTkxCZSz6LyqBrWBxMAOx NIc6c608ndRjre44aHX3m2Fk5lZFqJ+8pIRzDVIPCKImBMgsdO/EuYZdn9a+BiD7 q9vcGquiL/zMVUJNuKas8ahiqsquL1l1t0Mi0s1ooeqwsBp0Thw1Xg6qNoAQZ2/c 1u2/mo80Ixb2976BlkQir0enjYQV0D3lHobc4SKZ+Dc8vhS7i+MkjcHBqifPxYMK b6rERWKaJXVdCerE/QKyAtnmEG9Pc5q7dQNSjgSLvg6EJ4deVKuU/2iUtpzE8Wo= =WQdS -----END PGP SIGNATURE-----
[toc] | [prev] | [next] | [standalone]
| From | Lars Enderin <lars.enderin@telia.com> |
|---|---|
| Date | 2011-10-16 16:28 +0200 |
| Message-ID | <4E9AE9F4.8070406@telia.com> |
| In reply to | #8859 |
2011-10-16 13:14, Aéris skrev: > -----BEGIN PGP SIGNED MESSAGE----- > Hash: SHA1 > > Le 16/10/2011 01:44, Arne Vajhøj a écrit : >> But this is work, then stop wasting your employers money and >> focus on real problems. > > This is not my employer's money, but my free time =) > > - -- > Aeris > -----BEGIN PGP SIGNATURE----- > Version: GnuPG v1.4.11 (GNU/Linux) > Comment: Using GnuPG with Mozilla - http://enigmail.mozdev.org/ > > iQEcBAEBAgAGBQJOmrx0AAoJEK8zQvxDY4P9zscIAL+ERs3ds9+Qc5dbsq/pGduN > zAu7LmCOU37+pVuPKZuiwV7YQOwM4TDHutVLX3vfocYsTkxCZSz6LyqBrWBxMAOx > NIc6c608ndRjre44aHX3m2Fk5lZFqJ+8pIRzDVIPCKImBMgsdO/EuYZdn9a+BiD7 > q9vcGquiL/zMVUJNuKas8ahiqsquL1l1t0Mi0s1ooeqwsBp0Thw1Xg6qNoAQZ2/c > 1u2/mo80Ixb2976BlkQir0enjYQV0D3lHobc4SKZ+Dc8vhS7i+MkjcHBqifPxYMK > b6rERWKaJXVdCerE/QKyAtnmEG9Pc5q7dQNSjgSLvg6EJ4deVKuU/2iUtpzE8Wo= > =WQdS > -----END PGP SIGNATURE----- Why are you wasting bandwidth by including PGP signatures? Your contributions are not all that important.
[toc] | [prev] | [next] | [standalone]
| From | Eric Sosman <esosman@ieee-dot-org.invalid> |
|---|---|
| Date | 2011-10-16 09:47 -0400 |
| Message-ID | <j7en9t$r5o$1@dont-email.me> |
| In reply to | #8832 |
On 10/15/2011 5:03 PM, Aéris wrote:
>
> I work on an application where performances are important.
>
> To optimize it, I thought a direct access to a variable (foo.bar) would
> be more efficient than a getter call (foo.getBar()).
> I thought by avoiding the call to a method and all that goes with it
> (context switching, stacking, return value…), I can save time, but this
> code sample prove the contrary :
> http://pastebin.com/bP1nqxce
> Direct variable access : 1041 ms
> Getter call : 556 ms
>
> The difference is even more important if I don't modify the variable
> value (lines 29 and 36 commented) :
> Direct variable access : 95 ms
> Getter call : 4 ms
>
> How can we explain this not obvious huge difference ( 50 and 95% ) ?
Insufficiently precise measurement. You are trying to tell
whether two nickels are heavier than four pennies by piling each
stack of coins in turn atop a bulldozer and weighing the bulldozer.
If you want to measure tiny effects, seek a measurement framework
that doesn't involve huge effects.
Fundamentally, though, you're going about the larger task in
the wrong way: You are simply guessing that the speed of access to
a variable is important to the performance of your application. Do
you have even the slightest shred of evidence that this is so? Or
are you trying to improve your car's fuel economy by scraping off
all the paint to reduce the vehicle weight?
1: Decide what performance criteria the application must satisfy.
Make this quantifiable: "Throughput of at least X doodads per eyeblink"
or "95% of responses in no more than Y milliquivers," not undecidable
like "As fast as possible."
2: (Optional) Make back-of-the-envelope calculations about data
rates, data set sizes, and so on, just to estimate the resources
that will be required. Is the load within reach of a laptop, or do
you need a clustered network of supercomputers? Can you handle the
traffic with WiFi, or do you need sixteen 10Gb/s fibers in parallel?
3: Write the application, as cleanly and simply as you can in
light of [1] and [2]. During this stage, bear in mind that even the
most expert of programmers is usually surprised about where his code
spends the majority of its time (this is not opinion; it's been found
over and over again in actual experiment). In other words, favor
"clean and simple" over "clever and reputedly fast."
4: Measure the performance, and compare to the thresholds of [1].
If you satisfy [1], stop!
5: (Only if [1] isn't satisfied) Apply profilers, debuggers, and
other tools to discover which parts of the program are too slow. Apply
your efforts to speeding up those parts only; pay no attention to the
others. Then return to [4]. In very rare cases (usually when [2] was
skipped or when some external agency forces a change in [1]), return
to [3] and start over.
Do not begin by trying to micro-optimize; that way lies madness.
Trust me; I know: Been there, done that, bear the scars.
--
Eric Sosman
esosman@ieee-dot-org.invalid
[toc] | [prev] | [next] | [standalone]
| From | Jaap Droogers <JaapDroogers@unusable.meel.homelinux.net> |
|---|---|
| Date | 2011-10-16 22:12 +0200 |
| Message-ID | <4e9b3abe$0$31993$e4fe514c@dreader24.news.xs4all.nl> |
| In reply to | #8832 |
On 15-10-11 23:03, schreef Aéris: > Hello everybody. > > I work on an application where performances are important. > > To optimize it, I thought a direct access to a variable (foo.bar) would > be more efficient than a getter call (foo.getBar()). > I thought by avoiding the call to a method and all that goes with it > (context switching, stacking, return value…), I can save time, but this > code sample prove the contrary : > http://pastebin.com/bP1nqxce > Direct variable access : 1041 ms > Getter call : 556 ms > > The difference is even more important if I don't modify the variable > value (lines 29 and 36 commented) : > Direct variable access : 95 ms > Getter call : 4 ms > > How can we explain this not obvious huge difference ( 50 and 95% ) ? > You forgot the JIT compiler. All the nice clean code you have written is optimized by the compiler. You don't write C code, remember. Why is Java so fast: it uses the JIT compiler which optimizes the code for the machine the program is running on. if you write int a = myObject.getA(); the compiler compiles it probably as: int a = myObject.a; There is also a change that you try to optimize your code and the JIT compiler does not now how to optimize it, so your code runs slower. Jaap
[toc] | [prev] | [next] | [standalone]
Page 1 of 2 [1] 2 Next page →
Back to top | Article view | comp.lang.java.programmer
csiph-web