Groups | Search | Server Info | Keyboard shortcuts | Login | Register [http] [https] [nntp] [nntps]
Groups > comp.lang.python > #61127 > unrolled thread
| Started by | Robert Voigtländer <r.voigtlaender@gmail.com> |
|---|---|
| First post | 2013-12-06 00:47 -0800 |
| Last post | 2013-12-10 04:14 -0800 |
| Articles | 17 — 9 participants |
Back to article view | Back to comp.lang.python
squeeze out some performance Robert Voigtländer <r.voigtlaender@gmail.com> - 2013-12-06 00:47 -0800
Re: squeeze out some performance Jeremy Sanders <jeremy@jeremysanders.net> - 2013-12-06 10:46 +0100
Re: squeeze out some performance Chris Angelico <rosuav@gmail.com> - 2013-12-06 23:13 +1100
Re: squeeze out some performance Robert Voigtländer <r.voigtlaender@gmail.com> - 2013-12-06 08:29 -0800
Re: squeeze out some performance Mark Lawrence <breamoreboy@yahoo.co.uk> - 2013-12-06 16:36 +0000
Re: squeeze out some performance Robert Voigtländer <r.voigtlaender@gmail.com> - 2013-12-06 08:44 -0800
Re: squeeze out some performance John Ladasky <john_ladasky@sbcglobal.net> - 2013-12-06 08:52 -0800
Re: squeeze out some performance Joel Goldstick <joel.goldstick@gmail.com> - 2013-12-06 17:07 -0500
Re: squeeze out some performance Mark Lawrence <breamoreboy@yahoo.co.uk> - 2013-12-06 22:38 +0000
Re: squeeze out some performance Dan Stromberg <drsalists@gmail.com> - 2013-12-06 15:01 -0800
Re: squeeze out some performance Robert Voigtländer <r.voigtlaender@gmail.com> - 2013-12-09 06:19 -0800
Re: squeeze out some performance Mark Lawrence <breamoreboy@yahoo.co.uk> - 2013-12-09 14:52 +0000
Re: squeeze out some performance Robin Becker <robin@reportlab.com> - 2013-12-09 15:54 +0000
Re: squeeze out some performance Dave Angel <davea@davea.name> - 2013-12-09 15:46 -0500
Re: squeeze out some performance Robin Becker <robin@reportlab.com> - 2013-12-10 12:54 +0000
Re: squeeze out some performance Mark Lawrence <breamoreboy@yahoo.co.uk> - 2013-12-09 23:57 +0000
Re: squeeze out some performance Robert Voigtländer <r.voigtlaender@gmail.com> - 2013-12-10 04:14 -0800
| From | Robert Voigtländer <r.voigtlaender@gmail.com> |
|---|---|
| Date | 2013-12-06 00:47 -0800 |
| Subject | squeeze out some performance |
| Message-ID | <ce6477c1-29c0-43b3-9e56-336a5825869b@googlegroups.com> |
Hi, I try to squeeze out some performance of the code pasted on the link below. http://pastebin.com/gMnqprST The code will be used to continuously analyze sonar sensor data. I set this up to calculate all coordinates in a sonar cone without heavy use of trigonometry (assuming that this way is faster in the end). I optimized as much as I could. Maybe one of you has another bright idea to squeeze out a bit more? Thanks Robert
[toc] | [next] | [standalone]
| From | Jeremy Sanders <jeremy@jeremysanders.net> |
|---|---|
| Date | 2013-12-06 10:46 +0100 |
| Message-ID | <mailman.3633.1386323183.18130.python-list@python.org> |
| In reply to | #61127 |
Robert Voigtländer wrote: > I try to squeeze out some performance of the code pasted on the link > below. http://pastebin.com/gMnqprST > > The code will be used to continuously analyze sonar sensor data. I set > this up to calculate all coordinates in a sonar cone without heavy use of > trigonometry (assuming that this way is faster in the end). > > I optimized as much as I could. Maybe one of you has another bright idea > to squeeze out a bit more? This sort of code is probably harder to make faster in pure python. You could try profiling it to see where the hot spots are. Perhaps the choice of arrays or sets might have some speed impact. One idea would be to use something like cython to compile your python code to an extension module, with some hints to the types of the various values. I would go down the geometry route. If you can restate your problem in terms of geometry, it might be possible to replace all that code with a few numpy array operations. e.g. for finding pixels in a circle of radius 50 import numpy as np radiussqd = np.fromfunction(lambda y,x: (y-50)**2+(x-50)**2, (100,100) ) all_y, all_x = np.indices((100,100)) yvals = all_y[radiussqd < 50**2] Jeremy
[toc] | [prev] | [next] | [standalone]
| From | Chris Angelico <rosuav@gmail.com> |
|---|---|
| Date | 2013-12-06 23:13 +1100 |
| Message-ID | <mailman.3642.1386332017.18130.python-list@python.org> |
| In reply to | #61127 |
On Fri, Dec 6, 2013 at 8:46 PM, Jeremy Sanders <jeremy@jeremysanders.net> wrote:
> This sort of code is probably harder to make faster in pure python. You
> could try profiling it to see where the hot spots are. Perhaps the choice of
> arrays or sets might have some speed impact.
I'd make this recommendation MUCH stronger.
Rule 1 of optimization: Don't.
Rule 2 (for experts only): Don't yet.
Once you find that your program actually is running too slowly, then
AND ONLY THEN do you start looking at tightening something up. You'll
be amazed how little you need to change; start with good clean
idiomatic code, and then if it takes too long, you tweak just a couple
of things and it's fast enough. And when you do come to the
tweaking...
Rule 3: Measure twice, cut once.
Rule 4: Actually, measure twenty times, cut once.
Profile your code to find out what's actually slow. This is very
important. Here's an example from a real application (not in Python,
it's in a semantically-similar language called Pike):
https://github.com/Rosuav/Gypsum/blob/d9907e1507c52189c83ae25f5d7be85235b616fa/window.pike
I noticed that I could saturate one CPU core by typing commands very
quickly. Okay. That gets us past the first two rules (it's a MUD
client, it should not be able to saturate one core of an i5). The code
looks roughly like this:
paint():
for line in lines:
if line_is_visible:
paint_line(line)
paint_line():
for piece_of_text in text:
if highlighted: draw_highlighted()
else: draw_not_highlighted()
My first guess was that the actual drawing was taking the time, since
that's a whole lot of GTK calls. But no; the actual problem was the
iteration across all lines and then finding out if they're visible or
not (possibly because it obliterates the CPU caches). Once the
scrollback got to a million lines or so, that was prohibitively
expensive. I didn't realize that until I actually profiled the code
and _measured_ where the time was being spent.
How fast does your code run? How fast do you need it to run? Lots of
optimization questions are answered by "Yaknow what, it don't even
matter", unless you're running in a tight loop, or on a
microcontroller, or something. Halving the time taken sounds great
until you see that it's currently taking 0.0001 seconds and happens in
response to user action.
ChrisA
[toc] | [prev] | [next] | [standalone]
| From | Robert Voigtländer <r.voigtlaender@gmail.com> |
|---|---|
| Date | 2013-12-06 08:29 -0800 |
| Message-ID | <35814163-d0a5-4927-b61d-7d3b4ee0989b@googlegroups.com> |
| In reply to | #61137 |
Thanks for your replies. I already did some basic profiling and optimized a lot. Especially with help of a goof python performance tips list I found. I think I'll follow the cython path. The geometry approach also sound good. But it's way above my math/geometry knowledge. Thanks for your input!
[toc] | [prev] | [next] | [standalone]
| From | Mark Lawrence <breamoreboy@yahoo.co.uk> |
|---|---|
| Date | 2013-12-06 16:36 +0000 |
| Message-ID | <mailman.3656.1386347783.18130.python-list@python.org> |
| In reply to | #61164 |
On 06/12/2013 16:29, Robert Voigtländer wrote: > Thanks for your replies. > > I already did some basic profiling and optimized a lot. Especially > with help of a goof python performance tips list I found. > Wonderful typo -----^ :) > I think I'll follow the cython path. > The geometry approach also sound good. But it's way above my math/geometry knowledge. > > Thanks for your input! > -- My fellow Pythonistas, ask not what our language can do for you, ask what you can do for our language. Mark Lawrence
[toc] | [prev] | [next] | [standalone]
| From | Robert Voigtländer <r.voigtlaender@gmail.com> |
|---|---|
| Date | 2013-12-06 08:44 -0800 |
| Message-ID | <dae19b90-f141-422f-b5fb-bd2631944fe5@googlegroups.com> |
| In reply to | #61165 |
Am Freitag, 6. Dezember 2013 17:36:03 UTC+1 schrieb Mark Lawrence: > > I already did some basic profiling and optimized a lot. Especially > with help of a goof python performance tips list I found. > > Wonderful typo -----^ :) > Oh well :-) ... it was a good one. Just had a quick look at Cython. Looks great. Thanks for the tip.
[toc] | [prev] | [next] | [standalone]
| From | John Ladasky <john_ladasky@sbcglobal.net> |
|---|---|
| Date | 2013-12-06 08:52 -0800 |
| Message-ID | <9df6ccd7-828d-43be-ac49-fe1c6a38bae7@googlegroups.com> |
| In reply to | #61127 |
On Friday, December 6, 2013 12:47:54 AM UTC-8, Robert Voigtländer wrote:
> I try to squeeze out some performance of the code pasted on the link below.
> http://pastebin.com/gMnqprST
Several comments:
1) I find this program to be very difficult to read, largely because there's a whole LOT of duplicated code. Look at lines 53-80, and lines 108-287, and lines 294-311. It makes it harder to see what this algorithm actually does. Is there a way to refactor some of this code to use some shared function calls?
2) I looked up the "Bresenham algorithm", and found two references which may be relevant. The original algorithm was one which computed good raster approximations to straight lines. The second algorithm described may be more pertinent to you, because it draws arcs of circles.
http://en.wikipedia.org/wiki/Bresenham's_line_algorithm
http://en.wikipedia.org/wiki/Midpoint_circle_algorithm
Both of these algorithms are old, from the 1960's, and can be implemented using very simple CPU register operations and minimal memory. Both of the web pages I referenced have extensive example code and pseudocode, and discuss optimization. If you need speed, is this really a job for Python?
3) I THINK that I see some code -- those duplicated parts -- which might benefit from the use of multiprocessing (assuming that you have a multi-core CPU). But I would have to read more deeply to be sure. I need to understand the algorithm more completely, and exactly how you have modified it for your needs.
[toc] | [prev] | [next] | [standalone]
| From | Joel Goldstick <joel.goldstick@gmail.com> |
|---|---|
| Date | 2013-12-06 17:07 -0500 |
| Message-ID | <mailman.3667.1386368018.18130.python-list@python.org> |
| In reply to | #61170 |
[Multipart message — attachments visible in raw view] — view raw
On Fri, Dec 6, 2013 at 11:52 AM, John Ladasky <john_ladasky@sbcglobal.net>wrote:
> On Friday, December 6, 2013 12:47:54 AM UTC-8, Robert Voigtländer wrote:
>
> > I try to squeeze out some performance of the code pasted on the link
> below.
> > http://pastebin.com/gMnqprST
>
Not that this will speed up your code but you have this:
if not clockwise:
s = start
start = end
end = s
Python people would write:
end, start = start, end
You have quite a few if statements that involve multiple comparisons of the
same variable. Did you know you can do things like this in python:
>>> x = 4
>>> 2 < x < 7
True
>>> x = 55
>>> 2 < x < 7
False
> Several comments:
>
> 1) I find this program to be very difficult to read, largely because
> there's a whole LOT of duplicated code. Look at lines 53-80, and lines
> 108-287, and lines 294-311. It makes it harder to see what this algorithm
> actually does. Is there a way to refactor some of this code to use some
> shared function calls?
>
> 2) I looked up the "Bresenham algorithm", and found two references which
> may be relevant. The original algorithm was one which computed good raster
> approximations to straight lines. The second algorithm described may be
> more pertinent to you, because it draws arcs of circles.
>
> http://en.wikipedia.org/wiki/Bresenham's_line_algorithm
> http://en.wikipedia.org/wiki/Midpoint_circle_algorithm
>
> Both of these algorithms are old, from the 1960's, and can be implemented
> using very simple CPU register operations and minimal memory. Both of the
> web pages I referenced have extensive example code and pseudocode, and
> discuss optimization. If you need speed, is this really a job for Python?
>
> 3) I THINK that I see some code -- those duplicated parts -- which might
> benefit from the use of multiprocessing (assuming that you have a
> multi-core CPU). But I would have to read more deeply to be sure. I need
> to understand the algorithm more completely, and exactly how you have
> modified it for your needs.
> --
> https://mail.python.org/mailman/listinfo/python-list
>
--
Joel Goldstick
http://joelgoldstick.com
[toc] | [prev] | [next] | [standalone]
| From | Mark Lawrence <breamoreboy@yahoo.co.uk> |
|---|---|
| Date | 2013-12-06 22:38 +0000 |
| Message-ID | <mailman.3668.1386369561.18130.python-list@python.org> |
| In reply to | #61170 |
On 06/12/2013 16:52, John Ladasky wrote: > On Friday, December 6, 2013 12:47:54 AM UTC-8, Robert Voigtländer wrote: > >> I try to squeeze out some performance of the code pasted on the link below. >> http://pastebin.com/gMnqprST > > Several comments: > > 1) I find this program to be very difficult to read, largely because there's a whole LOT of duplicated code. Look at lines 53-80, and lines 108-287, and lines 294-311. It makes it harder to see what this algorithm actually does. Is there a way to refactor some of this code to use some shared function calls? > A handy tool for detecting duplicated code here http://clonedigger.sourceforge.net/ for anyone who's interested. -- My fellow Pythonistas, ask not what our language can do for you, ask what you can do for our language. Mark Lawrence
[toc] | [prev] | [next] | [standalone]
| From | Dan Stromberg <drsalists@gmail.com> |
|---|---|
| Date | 2013-12-06 15:01 -0800 |
| Message-ID | <mailman.3669.1386370917.18130.python-list@python.org> |
| In reply to | #61170 |
[Multipart message — attachments visible in raw view] — view raw
On Fri, Dec 6, 2013 at 2:38 PM, Mark Lawrence <breamoreboy@yahoo.co.uk>wrote: > On 06/12/2013 16:52, John Ladasky wrote: > >> On Friday, December 6, 2013 12:47:54 AM UTC-8, Robert Voigtländer wrote: >> >> I try to squeeze out some performance of the code pasted on the link >>> below. >>> http://pastebin.com/gMnqprST >>> >> >> Several comments: >> >> 1) I find this program to be very difficult to read, largely because >> there's a whole LOT of duplicated code. Look at lines 53-80, and lines >> 108-287, and lines 294-311. It makes it harder to see what this algorithm >> actually does. Is there a way to refactor some of this code to use some >> shared function calls? >> >> > A handy tool for detecting duplicated code here > http://clonedigger.sourceforge.net/ for anyone who's interested. > Pylint does this too...
[toc] | [prev] | [next] | [standalone]
| From | Robert Voigtländer <r.voigtlaender@gmail.com> |
|---|---|
| Date | 2013-12-09 06:19 -0800 |
| Message-ID | <5320aa30-8ea5-4680-8796-ef2c7dac3984@googlegroups.com> |
| In reply to | #61189 |
Am Samstag, 7. Dezember 2013 00:01:49 UTC+1 schrieb Dan Stromberg: > On Fri, Dec 6, 2013 at 2:38 PM, Mark Lawrence <bream...@yahoo.co.uk> wrote: > > > On 06/12/2013 16:52, John Ladasky wrote: > > > On Friday, December 6, 2013 12:47:54 AM UTC-8, Robert Voigtländer wrote: > > > > > I try to squeeze out some performance of the code pasted on the link below. > > http://pastebin.com/gMnqprST > > > > > Several comments: > > > > 1) I find this program to be very difficult to read, largely because there's a whole LOT of duplicated code. Look at lines 53-80, and lines 108-287, and lines 294-311. It makes it harder to see what this algorithm actually does. Is there a way to refactor some of this code to use some shared function calls? > > > > > > > > A handy tool for detecting duplicated code here http://clonedigger.sourceforge.net/ for anyone who's interested. > > > > Pylint does this too... Thanks again. I'll try to compress the code and have a look at the "multiple comparisons" topic. Robert
[toc] | [prev] | [next] | [standalone]
| From | Mark Lawrence <breamoreboy@yahoo.co.uk> |
|---|---|
| Date | 2013-12-09 14:52 +0000 |
| Message-ID | <mailman.3780.1386600779.18130.python-list@python.org> |
| In reply to | #61388 |
On 09/12/2013 14:19, Robert Voigtländer wrote: > Am Samstag, 7. Dezember 2013 00:01:49 UTC+1 schrieb Dan Stromberg: >> On Fri, Dec 6, 2013 at 2:38 PM, Mark Lawrence <bream...@yahoo.co.uk> wrote: >> >> >> On 06/12/2013 16:52, John Ladasky wrote: >> >> >> On Friday, December 6, 2013 12:47:54 AM UTC-8, Robert Voigtländer wrote: >> >> >> >> >> I try to squeeze out some performance of the code pasted on the link below. >> >> http://pastebin.com/gMnqprST >> >> >> >> >> Several comments: >> >> >> >> 1) I find this program to be very difficult to read, largely because there's a whole LOT of duplicated code. Look at lines 53-80, and lines 108-287, and lines 294-311. It makes it harder to see what this algorithm actually does. Is there a way to refactor some of this code to use some shared function calls? >> >> >> >> >> >> >> >> A handy tool for detecting duplicated code here http://clonedigger.sourceforge.net/ for anyone who's interested. >> >> >> >> Pylint does this too... > > Thanks again. I'll try to compress the code and have a look at the "multiple comparisons" topic. > > Robert > Would you be kind enough to compress your messages while you're at it by reading and actioning https://wiki.python.org/moin/GoogleGroupsPython thanks, a quick glance above will tell you why :) -- My fellow Pythonistas, ask not what our language can do for you, ask what you can do for our language. Mark Lawrence
[toc] | [prev] | [next] | [standalone]
| From | Robin Becker <robin@reportlab.com> |
|---|---|
| Date | 2013-12-09 15:54 +0000 |
| Message-ID | <mailman.3783.1386604487.18130.python-list@python.org> |
| In reply to | #61170 |
On 06/12/2013 22:07, Joel Goldstick wrote: .......... >> > > Not that this will speed up your code but you have this: > > if not clockwise: > s = start > start = end > end = s > > Python people would write: > end, start = start, end this works for some small number of variables, but on my machine with python 2.7 I start losing with 4 variables eg > C:\code\optichrome\74663>python -mtimeit -s"a=1;b=2;c=3;d=4" "a,b,c,d=b,c,d,a" > 1000000 loops, best of 3: 0.206 usec per loop > > C:\code\optichrome\74663>python -mtimeit -s"a=1;b=2;c=3;d=4" "t=a;a=b;b=c;c=d;d=t" > 10000000 loops, best of 3: 0.118 usec per loop It doesn't seem to make much difference that the variables are related as I see a similar behaviour for simple assignments > C:\code\optichrome\74663>python -mtimeit -s"a=1;b=2;c=3;d=4;e=5;f=6;g=7;h=8" "a,b,c,d=e,f,g,h" > 1000000 loops, best of 3: 0.204 usec per loop > > C:\code\optichrome\74663>python -mtimeit -s"a=1;b=2;c=3;d=4;e=5;f=6;g=7;h=8" "a=e;b=f;c=g;d=h" > 10000000 loops, best of 3: 0.103 usec per loop for less than 4 variables the tuple method is faster. -- Robin Becker
[toc] | [prev] | [next] | [standalone]
| From | Dave Angel <davea@davea.name> |
|---|---|
| Date | 2013-12-09 15:46 -0500 |
| Message-ID | <mailman.3795.1386621919.18130.python-list@python.org> |
| In reply to | #61170 |
On Mon, 09 Dec 2013 15:54:36 +0000, Robin Becker <robin@reportlab.com> wrote: > On 06/12/2013 22:07, Joel Goldstick wrote: > > end, start = start, end > a similar behaviour for simple assignments > for less than 4 variables the tuple method is faster. What does speed have to do with it? When you want to swap two variables, the tuple assignment reads better. -- DaveA
[toc] | [prev] | [next] | [standalone]
| From | Robin Becker <robin@reportlab.com> |
|---|---|
| Date | 2013-12-10 12:54 +0000 |
| Message-ID | <mailman.3823.1386680060.18130.python-list@python.org> |
| In reply to | #61170 |
On 09/12/2013 20:46, Dave Angel wrote: > On Mon, 09 Dec 2013 15:54:36 +0000, Robin Becker <robin@reportlab.com> wrote: >> On 06/12/2013 22:07, Joel Goldstick wrote: >> > end, start = start, end > >> a similar behaviour for simple assignments > >> for less than 4 variables the tuple method is faster. > > What does speed have to do with it? When you want to swap two variables, the > tuple assignment reads better. > Well the OP is asking about performance so I guess the look and feel might be sacrificed for speed in some circumstances. The tuple approach is more appealing when the lhs & rhs are connected, but it seems that even for more general assignments the tuple assignment may be faster for small numbers of variables. Looking at the output of dis for this case d,e,f=c,b,a it seems that we get code like this LOAD_NAME 3 (c) LOAD_NAME 2 (b) LOAD_NAME 1 (a) ROT_THREE ROT_TWO STORE_NAME 4 (d) STORE_NAME 5 (e) STORE_NAME 6 (f) for d = c e = b f = a we get this LOAD_NAME 3 (c) STORE_NAME 4 (d) LOAD_NAME 2 (b) STORE_NAME 5 (e) LOAD_NAME 1 (a) STORE_NAME 6 (f) which is not obviously slower, but I consistently get the former to be faster. I suppose it's a cache issue or something. However, for this particular case when the variables are not connected I find the tuple assignment less pythonic, but perhaps faster (even though in this case no tuples are built). -- Robin Becker
[toc] | [prev] | [next] | [standalone]
| From | Mark Lawrence <breamoreboy@yahoo.co.uk> |
|---|---|
| Date | 2013-12-09 23:57 +0000 |
| Message-ID | <mailman.3803.1386633504.18130.python-list@python.org> |
| In reply to | #61127 |
On 06/12/2013 08:47, Robert Voigtländer wrote: > Hi, > > I try to squeeze out some performance of the code pasted on the link below. > http://pastebin.com/gMnqprST > > The code will be used to continuously analyze sonar sensor data. I set this up to calculate all coordinates in a sonar cone without heavy use of trigonometry (assuming that this way is faster in the end). > > I optimized as much as I could. Maybe one of you has another bright idea to squeeze out a bit more? > > Thanks > Robert > Actually for optimised code it looks very similar to some code posted here http://www.daniweb.com/software-development/python/threads/321181/python-bresenham-circle-arc-algorithm over three years ago. -- My fellow Pythonistas, ask not what our language can do for you, ask what you can do for our language. Mark Lawrence
[toc] | [prev] | [next] | [standalone]
| From | Robert Voigtländer <r.voigtlaender@gmail.com> |
|---|---|
| Date | 2013-12-10 04:14 -0800 |
| Message-ID | <4162ca51-a334-471e-84c7-8478411d41f7@googlegroups.com> |
| In reply to | #61425 |
> Actually for optimised code it looks very similar to some code posted > > here > > http://www.daniweb.com/software-development/python/threads/321181/python-bresenham-circle-arc-algorithm > > over three years ago. > This is where it origins from. I just extended it for my needs and now want to optimize it. List comprehensions instead of some for loops brought another 25%. And made the code shorter.
[toc] | [prev] | [standalone]
Back to top | Article view | comp.lang.python
csiph-web