Groups | Search | Server Info | Keyboard shortcuts | Login | Register [http] [https] [nntp] [nntps]


Groups > comp.lang.python > #83890 > unrolled thread

recursive function: use a global or pass a parameter?

Started byTim <jtim.arnold@gmail.com>
First post2015-01-16 09:49 -0800
Last post2015-01-18 06:27 +1100
Articles 15 — 9 participants

Back to article view | Back to comp.lang.python


Contents

  recursive function: use a global or pass a parameter? Tim <jtim.arnold@gmail.com> - 2015-01-16 09:49 -0800
    Re: recursive function: use a global or pass a parameter? Chris Angelico <rosuav@gmail.com> - 2015-01-17 04:56 +1100
      Re: recursive function: use a global or pass a parameter? Rustom Mody <rustompmody@gmail.com> - 2015-01-16 10:22 -0800
    Re: recursive function: use a global or pass a parameter? Peter Otten <__peter__@web.de> - 2015-01-16 19:34 +0100
      Re: recursive function: use a global or pass a parameter? Tim <jtim.arnold@gmail.com> - 2015-01-16 10:48 -0800
      Re: recursive function: use a global or pass a parameter? Yawar Amin <yawar.amin@gmail.com> - 2015-01-16 18:23 -0800
        Re: recursive function: use a global or pass a parameter? Yawar Amin <yawar.amin@gmail.com> - 2015-01-16 21:29 -0800
        Re: recursive function: use a global or pass a parameter? Gregory Ewing <greg.ewing@canterbury.ac.nz> - 2015-01-18 10:25 +1300
    Re: recursive function: use a global or pass a parameter? Gregory Ewing <greg.ewing@canterbury.ac.nz> - 2015-01-17 11:20 +1300
      Re: recursive function: use a global or pass a parameter? Chris Angelico <rosuav@gmail.com> - 2015-01-17 10:49 +1100
    Re: recursive function: use a global or pass a parameter? Steven D'Aprano <steve+comp.lang.python@pearwood.info> - 2015-01-17 21:15 +1100
      Re: recursive function: use a global or pass a parameter? Roy Smith <roy@panix.com> - 2015-01-17 10:20 -0500
        Re: recursive function: use a global or pass a parameter? Gregory Ewing <greg.ewing@canterbury.ac.nz> - 2015-01-18 10:07 +1300
    Re: recursive function: use a global or pass a parameter? albert@spenarnc.xs4all.nl (Albert van der Horst) - 2015-01-17 17:30 +0000
      Re: recursive function: use a global or pass a parameter? Chris Angelico <rosuav@gmail.com> - 2015-01-18 06:27 +1100

#83890 — recursive function: use a global or pass a parameter?

FromTim <jtim.arnold@gmail.com>
Date2015-01-16 09:49 -0800
Subjectrecursive function: use a global or pass a parameter?
Message-ID<5e4ccec6-7a00-467d-8cf6-258ab0421c90@googlegroups.com>
I have this type of situation and wonder if I should use a global variable outside the recursive function instead of passing the updated parameter through. 

I want to get a union of all the values that any 'things' key may have, even in a nested dictionary (and I do not know beforehand how deep the nesting might go):

d = {'things':1, 'two':{'things':2}}

def walk(obj, res):
    if not hasattr(obj, 'keys'):
        return set(), set()
    
    if 'things' in obj:
        res.add(obj['things'])
        
    for k in obj:
        walk(obj[k], res)
        
    return res

walk(d, set()) # returns {1, 2}

Is it better to use a global to keep track of the values or does it even matter?

thanks,
--Tim

[toc] | [next] | [standalone]


#83891

FromChris Angelico <rosuav@gmail.com>
Date2015-01-17 04:56 +1100
Message-ID<mailman.17800.1421430971.18130.python-list@python.org>
In reply to#83890
On Sat, Jan 17, 2015 at 4:49 AM, Tim <jtim.arnold@gmail.com> wrote:
> I want to get a union of all the values that any 'things' key may have, even in a nested dictionary (and I do not know beforehand how deep the nesting might go):
>
> d = {'things':1, 'two':{'things':2}}
>
> def walk(obj, res):
>     if not hasattr(obj, 'keys'):
>         return set(), set()
>
>     if 'things' in obj:
>         res.add(obj['things'])
>
>     for k in obj:
>         walk(obj[k], res)
>
>     return res
>
> walk(d, set()) # returns {1, 2}
>
> Is it better to use a global to keep track of the values or does it even matter?

I would use a parameter rather than a global, but I'd make the
parameter optional:

def all_keys(obj, accum=None):
    if accum is None: accum=set()
    if 'things' in obj:
        res.add(obj['things'])
    for val in obj.values():
        all_keys(val, accum)
    return all_keys

ChrisA

[toc] | [prev] | [next] | [standalone]


#83895

FromRustom Mody <rustompmody@gmail.com>
Date2015-01-16 10:22 -0800
Message-ID<1d553d12-a79d-4917-9494-577bd8391d41@googlegroups.com>
In reply to#83891
On Friday, January 16, 2015 at 11:26:46 PM UTC+5:30, Chris Angelico wrote:
> On Sat, Jan 17, 2015 at 4:49 AM, Tim  wrote:
> > I want to get a union of all the values that any 'things' key may have, even in a nested dictionary (and I do not know beforehand how deep the nesting might go):
> >
> > d = {'things':1, 'two':{'things':2}}
> >
> > def walk(obj, res):
> >     if not hasattr(obj, 'keys'):
> >         return set(), set()
> >
> >     if 'things' in obj:
> >         res.add(obj['things'])
> >
> >     for k in obj:
> >         walk(obj[k], res)
> >
> >     return res
> >
> > walk(d, set()) # returns {1, 2}
> >
> > Is it better to use a global to keep track of the values or does it even matter?
> 
> I would use a parameter rather than a global, but I'd make the
> parameter optional:
> 
> def all_keys(obj, accum=None):
>     if accum is None: accum=set()
>     if 'things' in obj:
>         res.add(obj['things'])
>     for val in obj.values():
>         all_keys(val, accum)
>     return all_keys
> 
> ChrisA

I dont like the hardwired this. However keeping that...

def all_keys(obj):
    if   not isinstance(obj, type({})): return set()

    return ((set([obj['things']]) if 'things' in obj else set())   |
               set(v for s in obj.values() for v in all_keys(s)))

[toc] | [prev] | [next] | [standalone]


#83896

FromPeter Otten <__peter__@web.de>
Date2015-01-16 19:34 +0100
Message-ID<mailman.17804.1421433277.18130.python-list@python.org>
In reply to#83890
Tim wrote:

> I have this type of situation and wonder if I should use a global variable
> outside the recursive function instead of passing the updated parameter
> through.
> 
> I want to get a union of all the values that any 'things' key may have,
> even in a nested dictionary (and I do not know beforehand how deep the
> nesting might go):
> 
> d = {'things':1, 'two':{'things':2}}
> 
> def walk(obj, res):
>     if not hasattr(obj, 'keys'):
>         return set(), set()
>     
>     if 'things' in obj:
>         res.add(obj['things'])
>         
>     for k in obj:
>         walk(obj[k], res)
>         
>     return res
> 
> walk(d, set()) # returns {1, 2}
> 
> Is it better to use a global to keep track of the values or does it even
> matter?

Globals are generally bad as they make code non-reentrant; when two calls of 
the function run simultaneously the data will be messed up.

I recommend that you use a generator:

>>> def walk(obj):
...     if not hasattr(obj, "keys"):
...         return
...     if "things" in obj:
...         yield obj["things"]
...     for v in obj.values():
...         yield from walk(v)
... 
>>> d = {'things':1, 'two':{'things':2}}
>>> set(walk(d))
{1, 2}

In Python before 3.3 you have to replace

yield from walk(v)

with a loop:

for t in walk(v):
    yield t

[toc] | [prev] | [next] | [standalone]


#83897

FromTim <jtim.arnold@gmail.com>
Date2015-01-16 10:48 -0800
Message-ID<50dd6ef1-b52b-4516-ad15-9b866ca25fd7@googlegroups.com>
In reply to#83896
On Friday, January 16, 2015 at 1:34:51 PM UTC-5, Peter Otten wrote:
>> Tim wrote:
>> 
> Globals are generally bad as they make code non-reentrant; when two calls of 
> the function run simultaneously the data will be messed up.
> 
> I recommend that you use a generator:
> 
> >>> def walk(obj):
> ...     if not hasattr(obj, "keys"):
> ...         return
> ...     if "things" in obj:
> ...         yield obj["things"]
> ...     for v in obj.values():
> ...         yield from walk(v)
> ... 
> >>> d = {'things':1, 'two':{'things':2}}
> >>> set(walk(d))
> {1, 2}
> 
> In Python before 3.3 you have to replace
> 
> yield from walk(v)
> 
> with a loop:
> 
> for t in walk(v):
>     yield t

Ah, a generator, I wouldn't have seen the problem in this way, but with your example, it looks so natural.

thanks,
--Tim

[toc] | [prev] | [next] | [standalone]


#83904

FromYawar Amin <yawar.amin@gmail.com>
Date2015-01-16 18:23 -0800
Message-ID<e8cf4ecf-90f0-430d-843c-d6f3ec3b3aef@googlegroups.com>
In reply to#83896
On Friday, January 16, 2015 at 1:34:51 PM UTC-5, Peter Otten wrote:
> [...]
> I recommend that you use a generator:
> 
> >>> def walk(obj):
> ...     if not hasattr(obj, "keys"):
> ...         return
> ...     if "things" in obj:
> ...         yield obj["things"]
> ...     for v in obj.values():
> ...         yield from walk(v)

Cool ... but it looks like this can still potentially hit the max
recursion limit? Perhaps better to convert to an iterative style:

    def walk(obj):
      """
      Yield any value(s) contained within `obj` that is (are) indexed by
      the key 'things'. `obj` must be dict-like.
      """
      from collections import deque
      vals = deque()
      vals.append(obj)
    
      while True:
        try: curr_obj = vals.popleft()
        except IndexError: return
        if not hasattr(curr_obj, "keys"): continue
    
        if "things" in curr_obj: yield curr_obj["things"]
        vals.extend(curr_obj.values())
    
    # Examples

    d1 = list(walk({ "things": 1, "two": { "things": 2 } }))
    
    d2 = list(walk({
      "things": 1,
      "two": { "things": 2 },
      "three":
        { "four": 4,
          "things":
            { "five": 5,
              "six": 6,
              "things":
                { "seven": 7,
                  "things": 8 } } } }))

So this effectively 'flattens' a dictionary at each level into a queue
made up of the dictionary's values, and meanwhile yields the values one
by one if they are indexed by the key 'things'.

The output for `d1` should be the same as Peter Otten's example, except
I'm using a list instead of a set because I think the yielded objects
could themselves be dictionaries or other non-hashable values.

When you're looking at the output for `d2`, remember that `walk` here
will yield _any_ object that's indexed by the key, and as I mentioned,
that could be an entire dictionary object contained within the main one.

Regards,

Yawar

[toc] | [prev] | [next] | [standalone]


#83906

FromYawar Amin <yawar.amin@gmail.com>
Date2015-01-16 21:29 -0800
Message-ID<36e4ce9f-348a-44a3-99c0-e3efd903bd74@googlegroups.com>
In reply to#83904
On Friday, January 16, 2015 at 9:24:15 PM UTC-5, Yawar Amin wrote:
> [...]
>         vals.extend(curr_obj.values())

Ah, I should mention that the above will do a breadth-first search. If
we want to do a depth-first search we simply replace the above line
with:

    vals.extendleft(curr_obj.values())

Regards,

Yawar

[toc] | [prev] | [next] | [standalone]


#83951

FromGregory Ewing <greg.ewing@canterbury.ac.nz>
Date2015-01-18 10:25 +1300
Message-ID<ci02aoFo8tkU1@mid.individual.net>
In reply to#83904
Yawar Amin wrote:

> Cool ... but it looks like this can still potentially hit the max
> recursion limit?

It depends on the nature of your data. If the data is
a tree, it's very unlikely you'll reach the recursion
limit unless the tree is massively unbalanced.

If there's a chance of that, or if the data structure
is really a graph that could contain long linear
chains, then a non-recursive solution is safer.

Incidentally, Python's pickle suffers from this
problem -- it's possible to create a structure that
can't be pickled due to excessive recursion depth:

 >>> x = []
 >>> for i in range(5000):
...  x = [x]
...
 >>> import pickle
 >>> s = pickle.dumps(x)
Traceback (most recent call last):
   File "<stdin>", line 1, in <module>
RuntimeError: maximum recursion depth exceeded while pickling an object

Fortunately, the kinds of structures that cause this
tend not to be used in Python, e.g. we normally use
Python list objects instead of linked lists. So most
of the time, pickle gets away with using recursion.

-- 
Greg

[toc] | [prev] | [next] | [standalone]


#83902

FromGregory Ewing <greg.ewing@canterbury.ac.nz>
Date2015-01-17 11:20 +1300
Message-ID<chth6bF4cv1U1@mid.individual.net>
In reply to#83890
Tim wrote:
> I have this type of situation and wonder if I should use a global variable
> outside the recursive function instead of passing the updated parameter
> through.

No! Globals are evil, at least for that sort of thing.
The way you're doing it is fine.

The only thing I would change is to wrap it all up
in a top-level function that takes care of creating
the result set and returning it.

def walk(obj):
   res = set()
   _walk(obj, res)
   return res

def _walk(obj, res):
   ...

-- 
Greg

[toc] | [prev] | [next] | [standalone]


#83903

FromChris Angelico <rosuav@gmail.com>
Date2015-01-17 10:49 +1100
Message-ID<mailman.17808.1421452183.18130.python-list@python.org>
In reply to#83902
On Sat, Jan 17, 2015 at 9:20 AM, Gregory Ewing
<greg.ewing@canterbury.ac.nz> wrote:
> The only thing I would change is to wrap it all up
> in a top-level function that takes care of creating
> the result set and returning it.
>
> def walk(obj):
>   res = set()
>   _walk(obj, res)
>   return res
>
> def _walk(obj, res):
>   ...

Point of style: I like to put these kinds of helpers _above_ the
corresponding public functions, to maintain a general policy of
Define-Before-Use. Tends to make code easier to read; the first
reference to a function name is its definition, then all usage comes
after that. It's not always possible, of course (eg mutual recursion),
but in simple cases like this, it's easy enough.

ChrisA

[toc] | [prev] | [next] | [standalone]


#83907

FromSteven D'Aprano <steve+comp.lang.python@pearwood.info>
Date2015-01-17 21:15 +1100
Message-ID<54ba3654$0$13008$c3e8da3$5496439d@news.astraweb.com>
In reply to#83890
Tim wrote:

> I have this type of situation and wonder if I should use a global variable
> outside the recursive function instead of passing the updated parameter
> through.

To a first approximation, the answer to:

"I have a X, should I use a global variable or a parameter?"

is *always* "use a parameter", no matter what X is.

To a second and third approximation, the answer is still "use a parameter".

Good reasons for using global variables are few and far between. Just about
the only good reason for using global variables that I can think of is if
you have one or more settings/preference that get set once at the start of
the program and then apply to the entire program.

Actually, there is one other reason... if your program is a simple script
written in imperative style:

name = "steve"
print "Hello", name
do_this()
do_that()
do_something_else()
print "Good bye", name


sort of thing.



-- 
Steven

[toc] | [prev] | [next] | [standalone]


#83928

FromRoy Smith <roy@panix.com>
Date2015-01-17 10:20 -0500
Message-ID<roy-F5161B.10193317012015@news.panix.com>
In reply to#83907
In article <54ba3654$0$13008$c3e8da3$5496439d@news.astraweb.com>,
 Steven D'Aprano <steve+comp.lang.python@pearwood.info> wrote:

> Good reasons for using global variables are few and far between. Just about
> the only good reason for using global variables that I can think of is if
> you have one or more settings/preference that get set once at the start of
> the program and then apply to the entire program.

I will commonly put something like:

import logging
logger = logging.getLogger("logger-name-for-my-module")

in every source file in a big project.  Now I've got a global logger I 
can use anywhere in the module (even in static functions).  But, maybe 
that's just a subset of your "setting that gets set once at the start of 
the program" example.

[toc] | [prev] | [next] | [standalone]


#83950

FromGregory Ewing <greg.ewing@canterbury.ac.nz>
Date2015-01-18 10:07 +1300
Message-ID<ci018uFo04jU1@mid.individual.net>
In reply to#83928
Roy Smith wrote:
> I will commonly put something like:
> 
> import logging
> logger = logging.getLogger("logger-name-for-my-module")

But that's not really a global variable, it's a
global constant. There's nothing wrong with those,
we use them all the time -- classes, functions, etc.

If you were to rebind logger to a different one from
time to time in your code, *then* it would be a global
variable, subject to the usual cautions.

-- 
Greg

[toc] | [prev] | [next] | [standalone]


#83936

Fromalbert@spenarnc.xs4all.nl (Albert van der Horst)
Date2015-01-17 17:30 +0000
Message-ID<54ba9c1c$0$15938$e4fe514c@dreader35.news.xs4all.nl>
In reply to#83890
In article <5e4ccec6-7a00-467d-8cf6-258ab0421c90@googlegroups.com>,
Tim  <jtim.arnold@gmail.com> wrote:
>I have this type of situation and wonder if I should use a global
>variable outside the recursive function instead of passing the updated
>parameter through.
>
>I want to get a union of all the values that any 'things' key may have,
>even in a nested dictionary (and I do not know beforehand how deep the
>nesting might go):
>
>d = {'things':1, 'two':{'things':2}}
>
>def walk(obj, res):
>    if not hasattr(obj, 'keys'):
>        return set(), set()
>
>    if 'things' in obj:
>        res.add(obj['things'])
>
>    for k in obj:
>        walk(obj[k], res)
>
>    return res
>
>walk(d, set()) # returns {1, 2}
>
>Is it better to use a global to keep track of the values or does it even matter?

Neither. You shouldn't pass superfluous parameters, and you shouldn't
abuse globals.

The proper technique is make the global local to the normal subroutine,
then make the subroutine with those parameters you don't want to see
also local to that subroutine.
E.g.

def fib(n):
    ' return the n-th Fibonacci number '
    a,b = 0,1
    def fib1(ap,bp):
       ' for f_n,f_n+1, return f_n+1,f_n+2 '
       return bp,ap+b
    for i in xrange(n):
       a,b = fib1(a,b)
    return a

>
>thanks,
>--Tim

Groetjes Albert
-- 
Albert van der Horst, UTRECHT,THE NETHERLANDS
Economic growth -- being exponential -- ultimately falters.
albert@spe&ar&c.xs4all.nl &=n http://home.hccnet.nl/a.w.m.van.der.horst

[toc] | [prev] | [next] | [standalone]


#83945

FromChris Angelico <rosuav@gmail.com>
Date2015-01-18 06:27 +1100
Message-ID<mailman.17817.1421522845.18130.python-list@python.org>
In reply to#83936
On Sun, Jan 18, 2015 at 4:30 AM, Albert van der Horst
<albert@spenarnc.xs4all.nl> wrote:
> The proper technique is make the global local to the normal subroutine,
> then make the subroutine with those parameters you don't want to see
> also local to that subroutine.
> E.g.
>
> def fib(n):
>     ' return the n-th Fibonacci number '
>     a,b = 0,1
>     def fib1(ap,bp):
>        ' for f_n,f_n+1, return f_n+1,f_n+2 '
>        return bp,ap+b
>     for i in xrange(n):
>        a,b = fib1(a,b)
>     return a

That's a fairly useless use of a nested function... you could in-line
it without any effort at all:

def fib(n):
    a,b = 0,1
    for i in xrange(n):
        a,b = b,a+b
    return a

Even in the OP's example, where the inner function needs to call
itself recursively, it doesn't need to be a closure; a simple
out-of-line helper function does work (and has already been suggested;
Gregory was, I think, first). In my opinion, it's not materially
different from having an argument with a default.

ChrisA

[toc] | [prev] | [standalone]


Back to top | Article view | comp.lang.python


csiph-web