Is it possible to get string from function?

Started by	Roy Smith <roy@panix.com>
First post	2014-01-15 22:46 -0500
Last post	2014-01-16 09:52 +0100
Articles	9 — 5 participants

  Is it possible to get string from function? Roy Smith <roy@panix.com> - 2014-01-15 22:46 -0500
    Dynamic generation of test cases for each input datum (was: Is it possible to get string from function?) Ben Finney <ben+python@benfinney.id.au> - 2014-01-16 16:02 +1100
      Re: Dynamic generation of test cases for each input datum (was: Is it possible to get string from function?) Roy Smith <roy@panix.com> - 2014-01-16 00:09 -0500
    Re: Is it possible to get string from function? Chris Angelico <rosuav@gmail.com> - 2014-01-16 16:25 +1100
      Re: Is it possible to get string from function? Roy Smith <roy@panix.com> - 2014-01-16 00:40 -0500
        Re: Is it possible to get string from function? Chris Angelico <rosuav@gmail.com> - 2014-01-16 16:47 +1100
    Re: Is it possible to get string from function? Steven D'Aprano <steve@pearwood.info> - 2014-01-16 07:16 +0000
      Re: Is it possible to get string from function? Roy Smith <roy@panix.com> - 2014-01-16 09:30 -0500
    Re: Is it possible to get string from function? Peter Otten <__peter__@web.de> - 2014-01-16 09:52 +0100

#64044 — Is it possible to get string from function?

From	Roy Smith <roy@panix.com>
Date	2014-01-15 22:46 -0500
Subject	Is it possible to get string from function?
Message-ID	<roy-406DCF.22465415012014@news.panix.com>

I realize the subject line is kind of meaningless, so let me explain :-)

I've got some unit tests that look like:

class Foo(TestCase):
  def test_t1(self):
    RECEIPT = "some string"

  def test_t2(self):
    RECEIPT = "some other string"

  def test_t3(self):
    RECEIPT = "yet a third string"

and so on.  It's important that the strings be mutually unique.  In the 
example above, it's trivial to look at them and observe that they're all 
different, but in real life, the strings are about 2500 characters long, 
hex-encoded.  It even turns out that a couple of the strings are 
identical in the first 1000 or so characters, so it's not trivial to do 
by visual inspection.

So, I figured I would write a meta-test, which used introspection to 
find all the methods in the class, extract the strings from them (they 
are all assigned to a variable named RECEIPT), and check to make sure 
they're all different.

Is it possible to do that?  It is straight-forward using the inspect 
module to discover the methods, but I don't see any way to find what 
strings are assigned to a variable with a given name.  Of course, that 
assignment doesn't even happen until the function is executed, so 
perhaps what I want just isn't possible?

It turns out, I solved the problem with more mundane tools:

grep 'RECEIPT = ' test.py | sort | uniq -c

and I could have also solved the problem by putting all the strings in a 
dict and having the functions pull them out of there.  But, I'm still 
interested in exploring if there is any way to do this with 
introspection, as an academic exercise.

#64052 — Dynamic generation of test cases for each input datum (was: Is it possible to get string from function?)

From	Ben Finney <ben+python@benfinney.id.au>
Date	2014-01-16 16:02 +1100
Subject	Dynamic generation of test cases for each input datum (was: Is it possible to get string from function?)
Message-ID	<mailman.5569.1389848543.18130.python-list@python.org>
In reply to	#64044

Roy Smith <roy@panix.com> writes:

> I've got some unit tests that look like:
>
> class Foo(TestCase):
>   def test_t1(self):
>     RECEIPT = "some string"
>
>   def test_t2(self):
>     RECEIPT = "some other string"
>
>   def test_t3(self):
>     RECEIPT = "yet a third string"
>
> and so on.

That looks like a poorly defined class.

Are the test cases pretty much identical other than the data in those
strings? If so, use a collection of strings and generate separate tests
for each one dynamically.

In Python 2 and 3, you can use the ‘testscenarios’ library for that
purpose <URL:https://pypi.python.org/pypi/testscenarios>.

In Python 3, the ‘unittest’ module has “subtests” for the same purpose
<URL:http://docs.python.org/3.4/library/unittest.html#distinguishing-test-iterations-using-subtests>.
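A rough sketch of the subtest approach, assuming the per-receipt checks really are the same for every string. The RECEIPTS dict, its keys and the class name are made up for illustration; the real strings would be the roughly 2500-character hex values.

```python
import unittest

# Hypothetical receipt strings; the real ones are ~2500 hex characters each.
RECEIPTS = {
    "t1": "some string",
    "t2": "some other string",
    "t3": "yet a third string",
}


class ReceiptTests(unittest.TestCase):
    def test_receipts_are_unique(self):
        # A set drops duplicates, so equal sizes mean no repeated string.
        values = list(RECEIPTS.values())
        self.assertEqual(len(values), len(set(values)))

    def test_each_receipt(self):
        for name, receipt in RECEIPTS.items():
            # Each iteration is reported separately if it fails.
            with self.subTest(receipt=name):
                self.assertIsInstance(receipt, str)
                self.assertTrue(receipt)
```

With the data kept in one collection, the uniqueness check becomes an ordinary test rather than something that needs introspection.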

> and I could have also solved the problem by putting all the strings in
> a dict and having the functions pull them out of there. But, I'm still
> interested in exploring if there is any way to do this with
> introspection, as an academic exercise.

Since I don't think your use case is best solved this way, I'll leave
the academic exercise to someone else.

-- 
 \            “… Nature … is seen to do all things Herself and through |
  `\         herself of own accord, rid of all gods.” —Titus Lucretius |
_o__)                                                 Carus, c. 40 BCE |
Ben Finney

#64054 — Re: Dynamic generation of test cases for each input datum (was: Is it possible to get string from function?)

From	Roy Smith <roy@panix.com>
Date	2014-01-16 00:09 -0500
Subject	Re: Dynamic generation of test cases for each input datum (was: Is it possible to get string from function?)
Message-ID	<roy-59A9EB.00090816012014@news.panix.com>
In reply to	#64052

In article <mailman.5569.1389848543.18130.python-list@python.org>,
 Ben Finney <ben+python@benfinney.id.au> wrote:

> Roy Smith <roy@panix.com> writes:
> 
> > I've got some unit tests that look like:
> >
> > class Foo(TestCase):
> >   def test_t1(self):
> >     RECEIPT = "some string"
> >
> >   def test_t2(self):
> >     RECEIPT = "some other string"
> >
> >   def test_t3(self):
> >     RECEIPT = "yet a third string"
> >
> > and so on.
> 
> That looks like a poorly defined class.
> 
> Are the test cases pretty much identical other than the data in those
> strings?

No, each test is quite different.  The only thing they have in common is 
they all involve a string representation of a transaction receipt.  I 
elided the actual test code in my example above because it wasn't 
relevant to my question.

#64055

From	Chris Angelico <rosuav@gmail.com>
Date	2014-01-16 16:25 +1100
Message-ID	<mailman.5570.1389849928.18130.python-list@python.org>
In reply to	#64044

On Thu, Jan 16, 2014 at 2:46 PM, Roy Smith <roy@panix.com> wrote:
> So, I figured I would write a meta-test, which used introspection to
> find all the methods in the class, extract the strings from them (they
> are all assigned to a variable named RECEIPT), and check to make sure
> they're all different.

In theory, it should be. You can disassemble the function and find the
assignment. Check out Lib/dis.py - or just call it and process its
output. Names of local variables are found in
test_t1.__code__.co_varnames (co_names holds globals and attributes),
the constants themselves are in test_t1.__code__.co_consts, and then
it's just a matter of matching up which constant got assigned to the
slot represented by the name RECEIPT.
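One way the matching step might look, as a sketch only. It assumes CPython, where the names of a function's locals are listed in __code__.co_varnames and a plain "RECEIPT = <literal>" compiles to a LOAD_CONST immediately followed by a STORE_FAST. Bytecode is an implementation detail and opcode names vary between versions, so treat this as an academic exercise. The helper name receipt_constant is invented here.

```python
import dis


def receipt_constant(func, name="RECEIPT"):
    """Return the first constant stored directly into local `name`, or None."""
    prev = None
    for ins in dis.get_instructions(func):
        # Look for STORE_FAST <name> whose preceding instruction loaded a constant.
        if (ins.opname == "STORE_FAST" and ins.argval == name
                and prev is not None and prev.opname == "LOAD_CONST"):
            return prev.argval
        prev = ins
    return None


class Foo:
    def test_t1(self):
        RECEIPT = "some string"

    def test_t2(self):
        RECEIPT = "some other string"


print(receipt_constant(Foo.test_t1))
print("RECEIPT" in Foo.test_t1.__code__.co_varnames)
```

Collecting receipt_constant() for every test_* method and comparing the results would give the meta-test from the original question.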

But you might be able to shortcut it enormously. You say the strings
are "about 2500 characters long, hex-encoded". What are the chances of
having another constant, somewhere in the test function, that also
happens to be roughly that long and hex-encoded? If the answer is
"practically zero", then skip the code, skip co_names, and just look
through co_consts.

class TestCase:
  pass # not running this in the full environment

class Foo(TestCase):
  def test_t1(self):
    RECEIPT = "some string"

  def test_t2(self):
    RECEIPT = "some other string"

  def test_t3(self):
    RECEIPT = "yet a third string"

  def test_oops(self):
    RECEIPT = "some other string"

unique = {}
for funcname in dir(Foo):
    if funcname.startswith("test_"):
        for const in getattr(Foo,funcname).__code__.co_consts:
            if isinstance(const, str) and const.endswith("string"):
                if const in unique:
                    print("Collision!", unique[const], "and", funcname)
                unique[const] = funcname

This depends on your RECEIPT strings ending with the word "string" -
change the .endswith() check to be whatever it takes to distinguish
your critical constants from everything else you might have. Maybe:

# or use lower-case letters, or both, according to your hex encoding
CHARSET = set("0123456789ABCDEF")

if isinstance(const, str) and len(const)>2048 and set(const)<=CHARSET:

Anything over 2KB with no characters outside of that set is highly
likely to be what you want. Of course, this whole theory goes out the
window if your test functions can reference another test's RECEIPT;
though if you can guarantee that this is the *first* such literal (if
RECEIPT="..." is the first thing the function does), then you could
just add a 'break' after the unique[const]=funcname assignment and
it'll check only the first - co_consts is ordered.
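Putting the pieces above together might look something like the following sketch. The function name find_duplicate_receipts and its min_len and prefix parameters are invented for illustration, and the toy strings are short, so the demo lowers the length threshold; with the real data the 2048 default would apply.

```python
HEX_DIGITS = set("0123456789abcdefABCDEF")


def find_duplicate_receipts(cls, min_len=2048, prefix="test_"):
    """Map each repeated long hex constant to the test methods containing it."""
    seen = {}
    for funcname in sorted(dir(cls)):
        if not funcname.startswith(prefix):
            continue
        code = getattr(getattr(cls, funcname), "__code__", None)
        if code is None:
            continue  # not a plain function or method
        for const in code.co_consts:
            if (isinstance(const, str) and len(const) > min_len
                    and set(const) <= HEX_DIGITS):
                seen.setdefault(const, []).append(funcname)
                break  # only the first matching literal per method
    return {c: names for c, names in seen.items() if len(names) > 1}


class Foo:
    def test_t1(self):
        RECEIPT = "deadbeef00"

    def test_t2(self):
        RECEIPT = "cafebabe11"

    def test_oops(self):
        RECEIPT = "deadbeef00"


# min_len is lowered only so that the toy strings above qualify.
for const, names in find_duplicate_receipts(Foo, min_len=8).items():
    print("Collision!", " and ".join(names))
```

Returning the collisions instead of printing them inside the loop makes it easy to turn the check into an assertion in a meta-test.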

An interesting little problem!

ChrisA

#64057

From	Roy Smith <roy@panix.com>
Date	2014-01-16 00:40 -0500
Message-ID	<roy-09327D.00403916012014@news.panix.com>
In reply to	#64055

In article <mailman.5570.1389849928.18130.python-list@python.org>,
 Chris Angelico <rosuav@gmail.com> wrote:

> On Thu, Jan 16, 2014 at 2:46 PM, Roy Smith <roy@panix.com> wrote:
> > So, I figured I would write a meta-test, which used introspection to
> > find all the methods in the class, extract the strings from them (they
> > are all assigned to a variable named RECEIPT), and check to make sure
> > they're all different.
> [...]
> But you might be able to shortcut it enormously. You say the strings
> are "about 2500 characters long, hex-encoded". What are the chances of
> having another constant, somewhere in the test function, that also
> happens to be roughly that long and hex-encoded?

The chances are exactly zero.

> If the answer is "practically zero", then skip the code, skip 
> co_names, and just look through co_consts.

That sounds like it should work, thanks!

> Of course, this whole theory goes out the
> window if your test functions can reference another test's RECEIPT;

No, they don't do that.

#64058

From	Chris Angelico <rosuav@gmail.com>
Date	2014-01-16 16:47 +1100
Message-ID	<mailman.5572.1389851283.18130.python-list@python.org>
In reply to	#64057

On Thu, Jan 16, 2014 at 4:40 PM, Roy Smith <roy@panix.com> wrote:
>> But you might be able to shortcut it enormously. You say the strings
>> are "about 2500 characters long, hex-encoded". What are the chances of
>> having another constant, somewhere in the test function, that also
>> happens to be roughly that long and hex-encoded?
>
> The chances are exactly zero.
>
>> If the answer is "practically zero", then skip the code, skip
>> co_names, and just look through co_consts.
>
> That sounds like it should work, thanks!
>
>> Of course, this whole theory goes out the
>> window if your test functions can reference another test's RECEIPT;
>
> No, they don't do that.

Awesome! Makes it easy then.

ChrisA

#64064

From	Steven D'Aprano <steve@pearwood.info>
Date	2014-01-16 07:16 +0000
Message-ID	<52d7874d$0$6599$c3e8da3$5496439d@news.astraweb.com>
In reply to	#64044

On Wed, 15 Jan 2014 22:46:54 -0500, Roy Smith wrote:

> I've got some unit tests that look like:
> 
> class Foo(TestCase):
>   def test_t1(self):
>     RECEIPT = "some string"
> 
>   def test_t2(self):
>     RECEIPT = "some other string"
> 
>   def test_t3(self):
>     RECEIPT = "yet a third string"
> 
> and so on.  It's important that the strings be mutually unique.  In the
> example above, it's trivial to look at them and observe that they're all
> different, but in real life, the strings are about 2500 characters long,
> hex-encoded.  It even turns out that a couple of the strings are
> identical in the first 1000 or so characters, so it's not trivial to do
> by visual inspection.

Is the mapping of receipt string to test fixed? That is, is it important 
that test_t1 *always* runs with "some string", test_t2 "some other 
string", and so forth?

If not, I'd start by pushing all those strings into a global list (or 
possibly a class attribute). Then:

LIST_OF_GIANT_STRINGS = [blah blah blah]  # Read it from a file perhaps?
assert len(LIST_OF_GIANT_STRINGS) == len(set(LIST_OF_GIANT_STRINGS))

Then, change each test case to:

    def test_t1(self):
        RECEIPT = random.choice(LIST_OF_GIANT_STRINGS)

Even if two tests happen to pick the same string on this run, they are 
unlikely to pick the same string on the next run.

If that's not good enough, if the strings *must* be unique, you can use a 
helper like this:

def choose_without_replacement(alist):
    random.shuffle(alist)
    return alist.pop()

class Foo(TestCase):
    def test_t1(self):
        RECEIPT = choose_without_replacement(LIST_OF_GIANT_STRINGS)

All this assumes that you don't care which giant string matches which 
test method. If you do, then:

DICT_OF_GIANT_STRINGS = {
    'test_t1': ..., 
    'test_t2': ..., 
    }  # Again, maybe read them from a file.

assert len(list(DICT_OF_GIANT_STRINGS.values())) == \
       len(set(DICT_OF_GIANT_STRINGS.values()))
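The assert only says that some duplicate exists. If it helps to see which tests share a string, the dict can be inverted, roughly as in this sketch. The helper name shared_strings and the placeholder values are made up for illustration.

```python
from collections import defaultdict

# Placeholder data; the real values would be the ~2500-character receipts.
DICT_OF_GIANT_STRINGS = {
    "test_t1": "some string",
    "test_t2": "some other string",
    "test_t3": "some string",  # deliberate duplicate for the demo
}


def shared_strings(mapping):
    """Return {string: [test names]} for every string used more than once."""
    users = defaultdict(list)
    for test_name, value in mapping.items():
        users[value].append(test_name)
    return {v: sorted(names) for v, names in users.items() if len(names) > 1}


for value, names in shared_strings(DICT_OF_GIANT_STRINGS).items():
    print("Duplicate receipt shared by:", ", ".join(names))
```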

You can probably build up the dict from the test class by inspection, 
e.g.:

DICT_OF_GIANT_STRINGS = {}
for name in Foo.__dict__:
    if name.startswith("test_"):
        key = name[5:]
        if key.startswith("t"):
            DICT_OF_GIANT_STRINGS[name] = get_giant_string(key)

I'm sure you get the picture. Then each method just needs to know its 
own name:

class Foo(TestCase):
    def test_t1(self):
        RECEIPT = DICT_OF_GIANT_STRINGS["test_t1"]

which I must admit is much easier to read than 

        RECEIPT = "...2500 hex encoded characters..."

-- 
Steven

#64082

From	Roy Smith <roy@panix.com>
Date	2014-01-16 09:30 -0500
Message-ID	<roy-B7317E.09303016012014@news.panix.com>
In reply to	#64064

In article <52d7874d$0$6599$c3e8da3$5496439d@news.astraweb.com>,
 Steven D'Aprano <steve@pearwood.info> wrote:

> Is the mapping of receipt string to test fixed? That is, is it important 
> that test_t1 *always* runs with "some string", test_t2 "some other 
> string", and so forth?

Yes.

#64066

From	Peter Otten <__peter__@web.de>
Date	2014-01-16 09:52 +0100
Message-ID	<mailman.5577.1389862382.18130.python-list@python.org>
In reply to	#64044

Roy Smith wrote:

> I realize the subject line is kind of meaningless, so let me explain :-)
> 
> I've got some unit tests that look like:
> 
> class Foo(TestCase):
>   def test_t1(self):
>     RECEIPT = "some string"
> 
>   def test_t2(self):
>     RECEIPT = "some other string"
> 
>   def test_t3(self):
>     RECEIPT = "yet a third string"
> 
> and so on.  It's important that the strings be mutually unique.  In the
> example above, it's trivial to look at them and observe that they're all
> different, but in real life, the strings are about 2500 characters long,
> hex-encoded.  It even turns out that a couple of the strings are
> identical in the first 1000 or so characters, so it's not trivial to do
> by visual inspection.
> 
> So, I figured I would write a meta-test, which used introspection to
> find all the methods in the class, extract the strings from them (they
> are all assigned to a variable named RECEIPT), and check to make sure
> they're all different.
> 
> Is it possible to do that?  It is straight-forward using the inspect
> module to discover the methods, but I don't see any way to find what
> strings are assigned to a variable with a given name.  Of course, that
> assignment doesn't even happen until the function is executed, so
> perhaps what I want just isn't possible?
> 
> It turns out, I solved the problem with more mundane tools:
> 
> grep 'RECEIPT = ' test.py | sort | uniq -c
> 
> and I could have also solved the problem by putting all the strings in a
> dict and having the functions pull them out of there.  But, I'm still
> interested in exploring if there is any way to do this with
> introspection, as an academic exercise.

Instead of using introspection you could make it explicit with a decorator:

$ cat unique_receipt.py 
import functools
import sys
import unittest

_receipts = {}
def unique_receipt(receipt):
    def deco(f):
        if receipt in _receipts:
            raise ValueError(
                "Duplicate receipt {!r} in \n    {} and \n    {}".format(
                    receipt, _receipts[receipt], f))
        _receipts[receipt] = f
        @functools.wraps(f)
        def g(self):
            return f(self, receipt)
        return g
    return deco

class Foo(unittest.TestCase):
    @unique_receipt("foo")
    def test_t1(self, RECEIPT):
        pass

    @unique_receipt("bar")
    def test_t2(self, RECEIPT):
        pass

    @unique_receipt("foo")
    def test_t3(self, RECEIPT):
        pass

if __name__ == "__main__":
    unittest.main()
$ python unique_receipt.py 
Traceback (most recent call last):
  File "unique_receipt.py", line 19, in <module>
    class Foo(unittest.TestCase):
  File "unique_receipt.py", line 28, in Foo
    @unique_receipt("foo")
  File "unique_receipt.py", line 11, in deco
    receipt, _receipts[receipt], f))
ValueError: Duplicate receipt 'foo' in 
    <function test_t1 at 0x7fc8714af5f0> and 
    <function test_t3 at 0x7fc8714af7d0>
