Groups | Search | Server Info | Keyboard shortcuts | Login | Register [http] [https] [nntp] [nntps]


Groups > comp.lang.python > #89608 > unrolled thread

Python is not bad ;-)

Started byCecil Westerhof <Cecil@decebal.nl>
First post2015-04-30 09:07 +0200
Last post2015-04-30 22:05 +0200
Articles 17 on this page of 37 — 12 participants

Back to article view | Back to comp.lang.python


Contents

  Python is not bad ;-) Cecil Westerhof <Cecil@decebal.nl> - 2015-04-30 09:07 +0200
    Re: Python is not bad ;-) Ben Finney <ben+python@benfinney.id.au> - 2015-04-30 19:10 +1000
      Re: Python is not bad ;-) Marko Rauhamaa <marko@pacujo.net> - 2015-04-30 13:16 +0300
        Re: Python is not bad ;-) Chris Angelico <rosuav@gmail.com> - 2015-04-30 20:52 +1000
        Re: Python is not bad ;-) Cecil Westerhof <Cecil@decebal.nl> - 2015-04-30 13:30 +0200
          Re: Python is not bad ;-) Steven D'Aprano <steve+comp.lang.python@pearwood.info> - 2015-05-01 17:03 +1000
            Re: Python is not bad ;-) Cecil Westerhof <Cecil@decebal.nl> - 2015-05-01 09:47 +0200
            Re: Python is not bad ;-) Christian Gollwitzer <auriocus@gmx.de> - 2015-05-01 19:56 +0200
            Re: Python is not bad ;-) Gregory Ewing <greg.ewing@canterbury.ac.nz> - 2015-05-02 19:44 +1200
            Re: Python is not bad ;-) Cecil Westerhof <Cecil@decebal.nl> - 2015-05-02 10:26 +0200
              Re: Python is not bad ;-) Marko Rauhamaa <marko@pacujo.net> - 2015-05-02 12:10 +0300
                Re: Python is not bad ;-) Chris Angelico <rosuav@gmail.com> - 2015-05-02 19:25 +1000
                  Re: Python is not bad ;-) Marko Rauhamaa <marko@pacujo.net> - 2015-05-02 12:58 +0300
                    Re: Python is not bad ;-) Dave Angel <davea@davea.name> - 2015-05-02 06:22 -0400
                    Re: Python is not bad ;-) Chris Angelico <rosuav@gmail.com> - 2015-05-02 20:42 +1000
                    Re: Python is not bad ;-) Christian Gollwitzer <auriocus@gmx.de> - 2015-05-02 13:07 +0200
                      Re: Python is not bad ;-) Chris Angelico <rosuav@gmail.com> - 2015-05-02 21:21 +1000
                        Re: Python is not bad ;-) Christian Gollwitzer <auriocus@gmx.de> - 2015-05-02 13:32 +0200
                          Re: Python is not bad ;-) Marko Rauhamaa <marko@pacujo.net> - 2015-05-02 14:42 +0300
                            Re: Python is not bad ;-) Ian Kelly <ian.g.kelly@gmail.com> - 2015-05-02 09:45 -0600
                            Re: Python is not bad ;-) Chris Angelico <rosuav@gmail.com> - 2015-05-03 01:55 +1000
                            Re: Python is not bad ;-) Joonas Liik <liik.joonas@gmail.com> - 2015-05-02 16:50 +0300
                            Re: Python is not bad ;-) Joonas Liik <liik.joonas@gmail.com> - 2015-05-02 18:53 +0300
                            Re: Python is not bad ;-) Ian Kelly <ian.g.kelly@gmail.com> - 2015-05-02 11:00 -0600
                            Re: Python is not bad ;-) Ian Kelly <ian.g.kelly@gmail.com> - 2015-05-02 11:17 -0600
                            Re: Python is not bad ;-) Mark Lawrence <breamoreboy@yahoo.co.uk> - 2015-05-02 18:22 +0100
                Re: Python is not bad ;-) Cecil Westerhof <Cecil@decebal.nl> - 2015-05-02 12:29 +0200
              Re: Python is not bad ;-) Cecil Westerhof <Cecil@decebal.nl> - 2015-05-02 11:33 +0200
                Re: Python is not bad ;-) Dave Angel <davea@davea.name> - 2015-05-02 06:35 -0400
                  Re: Python is not bad ;-) Cecil Westerhof <Cecil@decebal.nl> - 2015-05-02 13:12 +0200
                Re: Python is not bad ;-) Ian Kelly <ian.g.kelly@gmail.com> - 2015-05-02 09:31 -0600
        Re: Python is not bad ;-) Steven D'Aprano <steve+comp.lang.python@pearwood.info> - 2015-05-01 15:56 +1000
      Re: Python is not bad ;-) Cecil Westerhof <Cecil@decebal.nl> - 2015-04-30 13:10 +0200
    Re: Python is not bad ;-) Michael Torrie <torriem@gmail.com> - 2015-04-30 08:03 -0600
      Re: Python is not bad ;-) Cecil Westerhof <Cecil@decebal.nl> - 2015-04-30 18:11 +0200
        Re: Python is not bad ;-) Christian Gollwitzer <auriocus@gmx.de> - 2015-04-30 19:59 +0200
          Re: Python is not bad ;-) Cecil Westerhof <Cecil@decebal.nl> - 2015-04-30 22:05 +0200

Page 2 of 2 — ← Prev page 1 [2]


#89776

FromChris Angelico <rosuav@gmail.com>
Date2015-05-03 01:55 +1000
Message-ID<mailman.14.1430582153.12865.python-list@python.org>
In reply to#89767
On Sun, May 3, 2015 at 1:45 AM, Ian Kelly <ian.g.kelly@gmail.com> wrote:
> On Sat, May 2, 2015 at 5:42 AM, Marko Rauhamaa <marko@pacujo.net> wrote:
>> Christian Gollwitzer <auriocus@gmx.de>:
>>
>>> That's why I still think it is a microoptimization, which helps only
>>> in some specific cases.
>>
>> It isn't done for performance. It's done to avoid a stack overflow
>> exception.
>
> If your tree is balanced, then the number of items you would need to
> have to get a stack overflow exception would be approximately 2 **
> 1000, which you can't possibly hope to fit into memory.
>
> If your tree is unbalanced and you're getting a stack overflow
> exception, then maybe you should think about balancing it.

That's assuming it's a search tree, where you *can* just "think about
balancing it". What if it's a parse tree? Let's say you're walking the
AST of a Python module, looking for all functions that contain 'yield'
or 'yield from' (ie generator functions). To do that, you need to walk
the entire depth of the tree, no matter how far that goes. I'm not
sure how complex a piece of code would have to be to hit 1000, but it
wouldn't be hard to have each level of tree cost you two or three
stack entries, so that could come down to just a few hundred.

However, while that _is_ more likely to produce an unbalanced tree,
it's also that much less capable of being converted into tail
recursion.

ChrisA

[toc] | [prev] | [next] | [standalone]


#89779

FromJoonas Liik <liik.joonas@gmail.com>
Date2015-05-02 16:50 +0300
Message-ID<mailman.16.1430585619.12865.python-list@python.org>
In reply to#89767

[Multipart message — attachments visible in raw view] — view raw

I agree, stack overflow is literally the main issue that ive run in to
(tree traversal)
I've yet to refactor recursion to iterative for speed, but i have done so
to avoid hitting the stack size limit.

Tree traversal and similar problems in particular lend themselves well to
recursion and are not quite as trivial to do in an iterative fashion.

I've also found myself constructing an ad-hoc trampoline at least once just
to sneak past the stack limit.
(probably very evil but it worked and it wasn't nearly as ugly as it sounds
like so ill live with it..)


On 2 May 2015 at 14:42, Marko Rauhamaa <marko@pacujo.net> wrote:

> Christian Gollwitzer <auriocus@gmx.de>:
>
> > That's why I still think it is a microoptimization, which helps only
> > in some specific cases.
>
> It isn't done for performance. It's done to avoid a stack overflow
> exception.
>
>
> Marko
> --
> https://mail.python.org/mailman/listinfo/python-list
>

[toc] | [prev] | [next] | [standalone]


#89780

FromJoonas Liik <liik.joonas@gmail.com>
Date2015-05-02 18:53 +0300
Message-ID<mailman.17.1430585619.12865.python-list@python.org>
In reply to#89767

[Multipart message — attachments visible in raw view] — view raw

Balancing of trees is kind of irrelevant when "tree" means "search space"
no?
And you definitely dont need to keep the entire tree in memory at the same
time.

By cropping unsuitable branches early (and not keeping the entire tree in
memory)
it is quite easy to have more than 1000 of call stack and still have
reasonable
preformance. (some/many nodes have 0 or 1 children)

Also should not-running-out-of-call-stack really be the main reason to
balance trees?
That sounds like an optimisation to me ..

On 2 May 2015 at 18:45, Ian Kelly <ian.g.kelly@gmail.com> wrote:

> On Sat, May 2, 2015 at 5:42 AM, Marko Rauhamaa <marko@pacujo.net> wrote:
> > Christian Gollwitzer <auriocus@gmx.de>:
> >
> >> That's why I still think it is a microoptimization, which helps only
> >> in some specific cases.
> >
> > It isn't done for performance. It's done to avoid a stack overflow
> > exception.
>
> If your tree is balanced, then the number of items you would need to
> have to get a stack overflow exception would be approximately 2 **
> 1000, which you can't possibly hope to fit into memory.
>
> If your tree is unbalanced and you're getting a stack overflow
> exception, then maybe you should think about balancing it.
> --
> https://mail.python.org/mailman/listinfo/python-list
>

[toc] | [prev] | [next] | [standalone]


#89782

FromIan Kelly <ian.g.kelly@gmail.com>
Date2015-05-02 11:00 -0600
Message-ID<mailman.19.1430586047.12865.python-list@python.org>
In reply to#89767
On Sat, May 2, 2015 at 9:55 AM, Chris Angelico <rosuav@gmail.com> wrote:
> On Sun, May 3, 2015 at 1:45 AM, Ian Kelly <ian.g.kelly@gmail.com> wrote:
>> On Sat, May 2, 2015 at 5:42 AM, Marko Rauhamaa <marko@pacujo.net> wrote:
>>> Christian Gollwitzer <auriocus@gmx.de>:
>>>
>>>> That's why I still think it is a microoptimization, which helps only
>>>> in some specific cases.
>>>
>>> It isn't done for performance. It's done to avoid a stack overflow
>>> exception.
>>
>> If your tree is balanced, then the number of items you would need to
>> have to get a stack overflow exception would be approximately 2 **
>> 1000, which you can't possibly hope to fit into memory.
>>
>> If your tree is unbalanced and you're getting a stack overflow
>> exception, then maybe you should think about balancing it.
>
> That's assuming it's a search tree, where you *can* just "think about
> balancing it". What if it's a parse tree? Let's say you're walking the
> AST of a Python module, looking for all functions that contain 'yield'
> or 'yield from' (ie generator functions). To do that, you need to walk
> the entire depth of the tree, no matter how far that goes. I'm not
> sure how complex a piece of code would have to be to hit 1000, but it
> wouldn't be hard to have each level of tree cost you two or three
> stack entries, so that could come down to just a few hundred.

Or you just iterate over the ast.walk generator, which uses a deque
rather than recursion.

[toc] | [prev] | [next] | [standalone]


#89783

FromIan Kelly <ian.g.kelly@gmail.com>
Date2015-05-02 11:17 -0600
Message-ID<mailman.20.1430587080.12865.python-list@python.org>
In reply to#89767
On Sat, May 2, 2015 at 9:53 AM, Joonas Liik <liik.joonas@gmail.com> wrote:

Top-posting is heavily frowned at on this list, so please don't do it.

> Balancing of trees is kind of irrelevant when "tree" means "search space"
> no?

I think it's relatively rare that DFS is truly the best algorithm for
such a search. For one thing, "search space" often means "graph", not
"tree". And for any other type of search, you'll want/need to
implement it iteratively rather than recursively anyway.

> And you definitely dont need to keep the entire tree in memory at the same
> time.

You could harness every single storage device on the planet and you
would still not have nearly enough capacity to fill a balanced search
tree to a depth of 1000.

> Also should not-running-out-of-call-stack really be the main reason to
> balance trees?
> That sounds like an optimisation to me ..

It is. My point was that if your unbalanced search tree is getting to
a depth of 1000, then it's probably long past time for you to start
thinking about optimizing it *anyway*.

[toc] | [prev] | [next] | [standalone]


#89784

FromMark Lawrence <breamoreboy@yahoo.co.uk>
Date2015-05-02 18:22 +0100
Message-ID<mailman.21.1430587348.12865.python-list@python.org>
In reply to#89767
On 02/05/2015 14:50, Joonas Liik wrote:

[top posting fixed]

>
> On 2 May 2015 at 14:42, Marko Rauhamaa <marko@pacujo.net> wrote:
>
>> Christian Gollwitzer <auriocus@gmx.de>:
>>
>>> That's why I still think it is a microoptimization, which helps only
>>> in some specific cases.
>>
>> It isn't done for performance. It's done to avoid a stack overflow
>> exception.
>>
>>
>> Marko
>> --
>> https://mail.python.org/mailman/listinfo/python-list
>>
> I agree, stack overflow is literally the main issue that ive run in to
> (tree traversal)
> I've yet to refactor recursion to iterative for speed, but i have done so
> to avoid hitting the stack size limit.
>
> Tree traversal and similar problems in particular lend themselves well to
> recursion and are not quite as trivial to do in an iterative fashion.
>
> I've also found myself constructing an ad-hoc trampoline at least once just
> to sneak past the stack limit.
> (probably very evil but it worked and it wasn't nearly as ugly as it sounds
> like so ill live with it..)
>
>

Please do not top post on this list.  Some of the threads get very long 
and are difficult if not impossible to follow when replies are top 
posted.  Interspersing your replies is the preferred method here, 
failing that simply reply at the bottom, thanks.

-- 
My fellow Pythonistas, ask not what our language can do for you, ask
what you can do for our language.

Mark Lawrence

[toc] | [prev] | [next] | [standalone]


#89763

FromCecil Westerhof <Cecil@decebal.nl>
Date2015-05-02 12:29 +0200
Message-ID<87pp6jmfwd.fsf@Equus.decebal.nl>
In reply to#89749
Op Saturday 2 May 2015 11:10 CEST schreef Marko Rauhamaa:

> Cecil Westerhof <Cecil@decebal.nl>:
>> I find factorial a lot cleaner code as factorial_iterative, so here
>> tail recursion would be beneficial.
>
> I would just call math.factorial() and be done with it.

You missed the point. I did not need a factorial function, I wanted to
show that tail optimisation could be beneficial.

-- 
Cecil Westerhof
Senior Software Engineer
LinkedIn: http://www.linkedin.com/in/cecilwesterhof

[toc] | [prev] | [next] | [standalone]


#89755

FromCecil Westerhof <Cecil@decebal.nl>
Date2015-05-02 11:33 +0200
Message-ID<87twvvmigr.fsf@Equus.decebal.nl>
In reply to#89747
Op Saturday 2 May 2015 10:26 CEST schreef Cecil Westerhof:

> Op Friday 1 May 2015 09:03 CEST schreef Steven D'Aprano:
>
>> On Thu, 30 Apr 2015 09:30 pm, Cecil Westerhof wrote:
>>
>>> Tail recursion would nice to have also.
>>
>> People coming from functional languages like Lisp and Haskell often
>> say that, but how many recursive algorithms naturally take a
>> tail-call form? Not that many.
>
> One example:
> def factorial(x, y = 1):
> return y if x == 1 else factorial(x - 1, x * y)
>
> def factorial_iterative(x):
> result = 1
> for i in range(2, x + 1):
> result *= i
> return result
>
> Executing factorial(985) 100.000 times takes 54 seconds.
> While executing factorial_iterative(985) takes 34 seconds.
> Also you can not call factorial with a value that is much bigger
> because of recursion depth. You can call factorial_iterative with
> 1.000.000.
>
> I made also a version that simulates tail recursion:
> def factorial_tail_recursion(x):
> y = 1
> while True:
> if x == 1:
> return y
> y *= x
> x -= 1
>
> This is that a lot less efficient as the iterative version. It takes
> 43 seconds. But it is a lot better as the normal recursive version:
> about 25%. The iterative version is about 25% more efficient as the
> tail recursion version.
>
> With larger values it decreases. Calculating onetime for 5.000.000
> takes 117 and 131 seconds. Just 10% faster.
>
> That is mostly because the tail recursion version starts multiplying
> at the high end. I wrote a second version:
> def factorial_tail_recursion2(x):
> y = 1
> z = 1
> while True:
> if x == z:
> return y
> y *= z
> z += 1
>
> This has almost the performance of the iterative version: 34 and 121
> seconds.
>
> So I made a new recursive version:
> def factorial_recursive(x, y = 1, z = 1):
> return y if x == z else factorial_recursive(x, x * y, z + 1)

Stupid me 'x == z' should be 'z > x'

-- 
Cecil Westerhof
Senior Software Engineer
LinkedIn: http://www.linkedin.com/in/cecilwesterhof

[toc] | [prev] | [next] | [standalone]


#89757

FromDave Angel <davea@davea.name>
Date2015-05-02 06:35 -0400
Message-ID<mailman.6.1430562930.12865.python-list@python.org>
In reply to#89755
On 05/02/2015 05:33 AM, Cecil Westerhof wrote:

Please check your email settings.  Your messages that you type seem to 
be indented properly, but those that are quoting earlier messages (even 
your own) are not.  See below.  I suspect there's some problem with how 
your email program processes html messages.

> Op Saturday 2 May 2015 10:26 CEST schreef Cecil Westerhof:
>>
>> That is mostly because the tail recursion version starts multiplying
>> at the high end. I wrote a second version:
>> def factorial_tail_recursion2(x):
>> y = 1
>> z = 1
>> while True:
>> if x == z:
>> return y
>> y *= z
>> z += 1
>>
>> This has almost the performance of the iterative version: 34 and 121
>> seconds.
>>
>> So I made a new recursive version:
>> def factorial_recursive(x, y = 1, z = 1):
>> return y if x == z else factorial_recursive(x, x * y, z + 1)
>
> Stupid me 'x == z' should be 'z > x'
>

I can't see how that is worth doing. The recursive version is already a 
distortion of the definition of factorial that I learned.  And to force 
it to be recursive and also contort it so it does the operations in the 
same order as the iterative version, just to gain performance?

If you want performance on factorial, write it iteratively, in as 
straightforward a way as possible.  Or just call the library function.

Recursion is a very useful tool in a developer's toolbox.  But the only 
reason I would use it for factorial is to provide a simple demonstration 
to introduce the concept to a beginner.


-- 
DaveA

[toc] | [prev] | [next] | [standalone]


#89765

FromCecil Westerhof <Cecil@decebal.nl>
Date2015-05-02 13:12 +0200
Message-ID<87lhh7mdwt.fsf@Equus.decebal.nl>
In reply to#89757
Op Saturday 2 May 2015 12:35 CEST schreef Dave Angel:

> On 05/02/2015 05:33 AM, Cecil Westerhof wrote:
>
> Please check your email settings. Your messages that you type seem
> to be indented properly, but those that are quoting earlier messages
> (even your own) are not. See below. I suspect there's some problem
> with how your email program processes html messages.
>
>> Op Saturday 2 May 2015 10:26 CEST schreef Cecil Westerhof:
>>>
>>> That is mostly because the tail recursion version starts
>>> multiplying at the high end. I wrote a second version: def
>>> factorial_tail_recursion2(x): y = 1 z = 1 while True: if x == z:
>>> return y y *= z z += 1
>>>
>>> This has almost the performance of the iterative version: 34 and
>>> 121 seconds.
>>>
>>> So I made a new recursive version:
>>> def factorial_recursive(x, y = 1, z = 1):
>>> return y if x == z else factorial_recursive(x, x * y, z + 1)
>>
>> Stupid me 'x == z' should be 'z > x'
>>
>
> I can't see how that is worth doing. The recursive version is
> already a distortion of the definition of factorial that I learned.
> And to force it to be recursive and also contort it so it does the
> operations in the same order as the iterative version, just to gain
> performance?
>
> If you want performance on factorial, write it iteratively, in as
> straightforward a way as possible. Or just call the library
> function.
>
> Recursion is a very useful tool in a developer's toolbox. But the
> only reason I would use it for factorial is to provide a simple
> demonstration to introduce the concept to a beginner.

And that was what I was doing here: showing that tail recursion can
have benefits.

By the way I have seen factorial mostly implemented recursive.

Also I am now testing with very large values. The version where I use
both tail recursion versions are faster as the iterative version.
Calculating 500.000:
iterative:                  137 seconds
tail recursion optimised:   120 seconds
tail recursion simple:      130 seconds

You have to be careful that you do not fall into the trap that you are
sure your way is the best way. I have had the same on the Scala list.
I told there that for a certain function it was better to use an
iterative version as a tail recursive function. They did not want to
believe it at first.

-- 
Cecil Westerhof
Senior Software Engineer
LinkedIn: http://www.linkedin.com/in/cecilwesterhof

[toc] | [prev] | [next] | [standalone]


#89772

FromIan Kelly <ian.g.kelly@gmail.com>
Date2015-05-02 09:31 -0600
Message-ID<mailman.11.1430580760.12865.python-list@python.org>
In reply to#89755
On Sat, May 2, 2015 at 4:35 AM, Dave Angel <davea@davea.name> wrote:
> I can't see how that is worth doing. The recursive version is already a
> distortion of the definition of factorial that I learned.  And to force it
> to be recursive and also contort it so it does the operations in the same
> order as the iterative version, just to gain performance?
>
> If you want performance on factorial, write it iteratively, in as
> straightforward a way as possible.  Or just call the library function.

Or if you really want to write it functionally:

from functools import reduce
from operator import mul

def product(iterable):
    return reduce(mul, iterable, 1)

def factorial(n):
    return product(range(1, n+1))

For Python 2, delete the first import and replace range with xrange.

[toc] | [prev] | [next] | [standalone]


#89699

FromSteven D'Aprano <steve+comp.lang.python@pearwood.info>
Date2015-05-01 15:56 +1000
Message-ID<554315a9$0$13003$c3e8da3$5496439d@news.astraweb.com>
In reply to#89619
On Thu, 30 Apr 2015 08:16 pm, Marko Rauhamaa wrote:

> Still, Python has features that defy effective optimization. Most
> notably, Python's dot notation translates into a hash table lookup -- or
> worse.


Effective optimization may be difficult, but it isn't impossible. PyPy has a
very effective Just In Time optimizer.

http://speed.pypy.org/

PyPy currently ranges from approximately 1.25 times to 50 times faster than
CPython, with an average of over 7 times faster.

And now PyPy also has a version that runs without the GIL:

http://morepypy.blogspot.com.au/2015/03/pypy-stm-251-released.html


-- 
Steven

[toc] | [prev] | [next] | [standalone]


#89621

FromCecil Westerhof <Cecil@decebal.nl>
Date2015-04-30 13:10 +0200
Message-ID<87lhh9x46m.fsf@Equus.decebal.nl>
In reply to#89614
Op Thursday 30 Apr 2015 11:10 CEST schreef Ben Finney:

>> I like it that the code is concise and clear. But also that the
>> performance is not bad.
>
> The former is a property of Python, which is a programming language.
> I agree with your assessment :-)
>
> The latter is not a property of Python; a programming language
> doesn't have runtime performance. Rather, runtime performance is a
> property of some specific *implementation* — that is, the runtime
> Python machine.
>
> There are numerous Python runtimes, and they have different
> performance characteristics on different hardware and operating
> systems.
>
> <URL:https://wiki.python.org/moin/PythonImplementations>
>
> You might be talking about performance on CPython. But you might
> not! I don't know.

I understood when you did not explicitly mention it, it normally is
CPython. And in my case it is.

I should try it also in Jython, because Clojure runs in de JVM.


> Have you looked at PyPy – Python implemented in Python – and
> compared its performance?

Not yet. Busy hacking away. :-D

-- 
Cecil Westerhof
Senior Software Engineer
LinkedIn: http://www.linkedin.com/in/cecilwesterhof

[toc] | [prev] | [next] | [standalone]


#89632

FromMichael Torrie <torriem@gmail.com>
Date2015-04-30 08:03 -0600
Message-ID<mailman.130.1430402596.3680.python-list@python.org>
In reply to#89608
On 04/30/2015 01:07 AM, Cecil Westerhof wrote:
> When I do that the computer is freezed a few times. That is a little
> less nice. Does not happen with Clojure when it gets an out of memory.

A system freeze is probably due to thrashing by your operating system as
a process (in this case Python) uses more and more memory, causing
massive swapping.  Clojure's heap, being a JVM-based language, is based
on JVM settings, so it may be maxing out at just a couple of GB.
Whereas Python will happily max out your swap if your program demands
the memory.

[toc] | [prev] | [next] | [standalone]


#89639

FromCecil Westerhof <Cecil@decebal.nl>
Date2015-04-30 18:11 +0200
Message-ID<87d22lvbnj.fsf@Equus.decebal.nl>
In reply to#89632
Op Thursday 30 Apr 2015 16:03 CEST schreef Michael Torrie:

> On 04/30/2015 01:07 AM, Cecil Westerhof wrote:
>> When I do that the computer is freezed a few times. That is a
>> little less nice. Does not happen with Clojure when it gets an out
>> of memory.
>
> A system freeze is probably due to thrashing by your operating
> system as a process (in this case Python) uses more and more memory,
> causing massive swapping. Clojure's heap, being a JVM-based
> language, is based on JVM settings, so it may be maxing out at just
> a couple of GB. Whereas Python will happily max out your swap if
> your program demands the memory.

I just posted a message about that. This gets the problem also:
    l = range(int(1E9))

The problem is that after this other processes are swapped out and
have a bad performance. There is nothing that can be done about it?

So there is a positive point for working with the JVM. :-D

-- 
Cecil Westerhof
Senior Software Engineer
LinkedIn: http://www.linkedin.com/in/cecilwesterhof

[toc] | [prev] | [next] | [standalone]


#89654

FromChristian Gollwitzer <auriocus@gmx.de>
Date2015-04-30 19:59 +0200
Message-ID<mhtqfg$gcj$1@dont-email.me>
In reply to#89639
Am 30.04.15 um 18:11 schrieb Cecil Westerhof:
> Op Thursday 30 Apr 2015 16:03 CEST schreef Michael Torrie:
>
>> On 04/30/2015 01:07 AM, Cecil Westerhof wrote:
>>> When I do that the computer is freezed a few times. That is a
>>> little less nice. Does not happen with Clojure when it gets an out
>>> of memory.
>>
>> A system freeze is probably due to thrashing by your operating
>> system as a process (in this case Python) uses more and more memory,
>> causing massive swapping. Clojure's heap, being a JVM-based
>> language, is based on JVM settings, so it may be maxing out at just
>> a couple of GB. Whereas Python will happily max out your swap if
>> your program demands the memory.
>
> I just posted a message about that. This gets the problem also:
>      l = range(int(1E9))
>
> The problem is that after this other processes are swapped out and
> have a bad performance. There is nothing that can be done about it?
>
> So there is a positive point for working with the JVM. :-D
>

No, you understood this the wrong way. The JVM has a limit on the max 
memory size on startup, which you can give by the -Xmx option on the 
Oracle machine, so for instance

java -Xmx512M -jar myjar.jar

limits the memory that your program may consume to 512 megs, until the 
JVM kills it. The standard limit is usually fairly low and probably 
below your real memory, so the java program does not have the chance to 
max out your memory and make your computer swap.

CPython, au contraire, uses all memory it can get from the OS. The OS 
kills it if it uses too much. On Linux, you can set this limit yourself 
using ulimit. The analogue to the java call would therefore be something 
like

ulimit -m 512M
python mypython.py

If you set the ulimit to something smaller than your physical memory 
size, you also cause the program to be killed before it can use up all 
the memory and introduce swapping. For a fair comparison, you should set 
both limits to the same value and see how far you can get.

	Christian

[toc] | [prev] | [next] | [standalone]


#89670

FromCecil Westerhof <Cecil@decebal.nl>
Date2015-04-30 22:05 +0200
Message-ID<874mnxtmaa.fsf@Equus.decebal.nl>
In reply to#89654
Op Thursday 30 Apr 2015 19:59 CEST schreef Christian Gollwitzer:

> Am 30.04.15 um 18:11 schrieb Cecil Westerhof:
>> Op Thursday 30 Apr 2015 16:03 CEST schreef Michael Torrie:
>>
>>> On 04/30/2015 01:07 AM, Cecil Westerhof wrote:
>>>> When I do that the computer is freezed a few times. That is a
>>>> little less nice. Does not happen with Clojure when it gets an
>>>> out of memory.
>>>
>>> A system freeze is probably due to thrashing by your operating
>>> system as a process (in this case Python) uses more and more
>>> memory, causing massive swapping. Clojure's heap, being a
>>> JVM-based language, is based on JVM settings, so it may be maxing
>>> out at just a couple of GB. Whereas Python will happily max out
>>> your swap if your program demands the memory.
>>
>> I just posted a message about that. This gets the problem also:
>> l = range(int(1E9))
>>
>> The problem is that after this other processes are swapped out and
>> have a bad performance. There is nothing that can be done about it?
>>
>> So there is a positive point for working with the JVM. :-D
>>
>
> No, you understood this the wrong way. The JVM has a limit on the
> max memory size on startup, which you can give by the -Xmx option on
> the Oracle machine, so for instance
>
> java -Xmx512M -jar myjar.jar
>
> limits the memory that your program may consume to 512 megs, until
> the JVM kills it. The standard limit is usually fairly low and
> probably below your real memory, so the java program does not have
> the chance to max out your memory and make your computer swap.

Well, I did play a little with that. My experience is that this is to
high. For example by lowering it, a certain program did not swap
anymore. Sound counter intuitive, but not that strange if you know the
inner workings of the JVM.


> CPython, au contraire, uses all memory it can get from the OS. The
> OS kills it if it uses too much. On Linux, you can set this limit
> yourself using ulimit. The analogue to the java call would therefore
> be something like
>
> ulimit -m 512M
> python mypython.py

I use startup scripts for the different Java (or Scala, or Clojure)
programs to set the memory size. I should do that also for my (or
anothers) Python scripts.


> If you set the ulimit to something smaller than your physical memory
> size, you also cause the program to be killed before it can use up

Nope, the program gets a MemoryError and will not be killed. Much
better I think.


> should set both limits to the same value and see how far you can
> get.

You are right.

-- 
Cecil Westerhof
Senior Software Engineer
LinkedIn: http://www.linkedin.com/in/cecilwesterhof

[toc] | [prev] | [standalone]


Page 2 of 2 — ← Prev page 1 [2]

Back to top | Article view | comp.lang.python


csiph-web