

Groups > comp.lang.python > #89556 > unrolled thread

ImportError with pickle (Python 2.7.9), possibly platform dependent

Started by: Ben Sizer <kylotan@gmail.com>
First post: 2015-04-29 09:01 -0700
Last post: 2015-05-13 03:27 -0700
Articles: 7 (3 participants)



Contents

  ImportError with pickle (Python 2.7.9), possibly platform dependent Ben Sizer <kylotan@gmail.com> - 2015-04-29 09:01 -0700
    Re: ImportError with pickle (Python 2.7.9), possibly platform dependent Chris Angelico <rosuav@gmail.com> - 2015-04-30 10:44 +1000
      Re: ImportError with pickle (Python 2.7.9), possibly platform dependent Ben Sizer <kylotan@gmail.com> - 2015-05-01 04:01 -0700
        Re: ImportError with pickle (Python 2.7.9), possibly platform dependent Chris Angelico <rosuav@gmail.com> - 2015-05-01 22:09 +1000
          Re: ImportError with pickle (Python 2.7.9), possibly platform dependent Ben Sizer <kylotan@gmail.com> - 2015-05-13 03:23 -0700
        Re: ImportError with pickle (Python 2.7.9), possibly platform dependent Peter Otten <__peter__@web.de> - 2015-05-01 14:34 +0200
          Re: ImportError with pickle (Python 2.7.9), possibly platform dependent Ben Sizer <kylotan@gmail.com> - 2015-05-13 03:27 -0700

#89556 — ImportError with pickle (Python 2.7.9), possibly platform dependent

From: Ben Sizer <kylotan@gmail.com>
Date: 2015-04-29 09:01 -0700
Subject: ImportError with pickle (Python 2.7.9), possibly platform dependent
Message-ID: <494551ca-532f-4d4d-aff0-a3932416c8f4@googlegroups.com>

I'm saving some data via pickle, and loading it in is proving tricky. 

Traceback (most recent call last):
  [...some lines removed...]
  File "/home/kylotan/OMDBSetup.py", line 44, in get_omdb_map
    __omdb_map = OMDBMap.OMDBMap.load_from_binary(full_path)
  File "/home/kylotan/OMDBMap.py", line 87, in load_from_binary
    d = pickle.load(binary_file)
  File "/usr/local/lib/python2.7/pickle.py", line 1378, in load
    return Unpickler(file).load()
  File "/usr/local/lib/python2.7/pickle.py", line 858, in load
    dispatch[key](self)
  File "/usr/local/lib/python2.7/pickle.py", line 1090, in load_global
    klass = self.find_class(module, name)
  File "/usr/local/lib/python2.7/pickle.py", line 1124, in find_class
    __import__(module)
ImportError: No module named OMDBMap

Here are the 2 weird things:

1) There clearly is a module named OMDBMap, and it's importable - it's there in the 2nd line of the traceback.
2) This error only arises on Linux. Exactly the same file loads in properly on MacOSX, and on Windows 8.


What I've done to try and debug this:

a) I've run an MD5 on the file to make sure the file is identical on all platforms, and that nothing is changing the line endings, and I'm also making sure to both open and save the pickle with 'rb'/'wb'.
b) I tried both pickle and cPickle - they seem to produce slightly different output but the error is exactly the same in each case.
c) I pickle and unpickle from exactly the same script: a file called OMDBSetup.py does 'import OMDBMap' and then calls methods in there to save and load the data (including the OMDBMap.OMDBMap.load_from_binary which appears in the above callstack). The intention here was to avoid the common "No module named __main__" error, and hopefully to have exactly the same modules imported into the namespace at both save and load time.

So my hypothesis is that I've either found some edge case which only acts weird on Linux (or only succeeds on the other platforms, whichever way you look at it), or there's something wrong with the Linux configuration that means it somehow cannot find this module (despite it already having found it to get this far).

Does anybody have any suggestions on how I can go about debugging this? Or refactoring it to avoid whatever is happening here?

-- 
Ben Sizer
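
The check described in step (a) above can be sketched roughly as follows. The function name file_md5 and the chunk size are illustrative assumptions, not taken from the original code; the only point carried over from the post is that the file is opened in binary mode so that no line-ending translation can alter the bytes.

```python
import hashlib


def file_md5(path, chunk_size=65536):
    """Return the hex MD5 digest of a file, read in binary mode.

    file_md5 and chunk_size are hypothetical names chosen for this sketch.
    """
    digest = hashlib.md5()
    # "rb" matters: text mode could translate line endings on some platforms.
    with open(path, "rb") as f:
        # Read fixed-size chunks until read() returns an empty bytes object.
        for chunk in iter(lambda: f.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()
```

Running this on each platform and comparing the resulting strings confirms whether the pickle file itself is byte-for-byte identical, which rules out the file as the source of the difference.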



#89592

From: Chris Angelico <rosuav@gmail.com>
Date: 2015-04-30 10:44 +1000
Message-ID: <mailman.108.1430354695.3680.python-list@python.org>
In reply to: #89556

On Thu, Apr 30, 2015 at 2:01 AM, Ben Sizer <kylotan@gmail.com> wrote:
> 1) There clearly is a module named OMDBMap, and it's importable - it's there in the 2nd line of the traceback.
>
> Does anybody have any suggestions on how I can go about debugging this? Or refactoring it to avoid whatever is happening here?

Are you half way through importing it when this load() call happens?
That might cause some issues.

Has your current directory been changed anywhere in there?

What happens if you catch this exception and print out sys.modules at
that point?

ChrisA
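
One way to act on the last suggestion is sketched below. The helper name load_with_diagnostics and its name_fragment parameter are assumptions made for illustration; nothing like it appears in the thread. The sketch is written against Python 3, where the failure surfaces as ModuleNotFoundError, a subclass of ImportError, so the same except clause applies.

```python
import pickle
import sys


def load_with_diagnostics(binary_file, name_fragment):
    """Unpickle from binary_file; on ImportError, show related sys.modules keys.

    load_with_diagnostics and name_fragment are hypothetical names.
    """
    try:
        return pickle.load(binary_file)
    except ImportError as exc:
        # Collect every loaded module whose name mentions the fragment, so a
        # module imported under a longer dotted name becomes visible.
        matches = sorted(name for name in sys.modules if name_fragment in name)
        print("unpickling failed: %s" % exc)
        print("modules matching %r: %s" % (name_fragment, matches))
        raise  # re-raise so the caller still sees the original error
```

Calling it as load_with_diagnostics(binary_file, "OMDBMap") in place of the plain pickle.load call would print any module whose name contains "OMDBMap" at the moment of failure.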



#89722

From: Ben Sizer <kylotan@gmail.com>
Date: 2015-05-01 04:01 -0700
Message-ID: <20a5c7bf-2163-4b7a-8495-30ce23239903@googlegroups.com>
In reply to: #89592

On Thursday, 30 April 2015 01:45:05 UTC+1, Chris Angelico  wrote:
> On Thu, Apr 30, 2015 at 2:01 AM, Ben Sizer wrote:
> > 1) There clearly is a module named OMDBMap, and it's importable - it's there in the 2nd line of the traceback.
> >
> > Does anybody have any suggestions on how I can go about debugging this? Or refactoring it to avoid whatever is happening here?
> 
> Are you half way through importing it when this load() call happens?
> That might cause some issues.

No, we already imported OMDBMap at the top of OMDBSetup.

> Has your current directory been changed anywhere in there?

Good question. It turns out that the current directory seems to be $HOME when loading, but is the script directory during saving. This will be because the Linux server is running under mod_wsgi, whereas we run the save script in-place. Our Windows and Mac tests run via Flask's built-in server so the working directory is likely to be the same whether we're running the script that does pickle.dump or the whole app that does pickle.load.

> What happens if you catch this exception and print out sys.modules at
> that point?

Another good question, and this gives us the answer. The module lists are quite different, as I'd expect because the load happens in the context of the full application whereas the dump happens as a standalone script. But literally every module that is in the 'before dump' list is in the 'before load' list - except OMDBMap. Like the error says! What /is/ in the 'before load' list however is "my_wsgi_app.OMDBMap". The module has been imported, but the pickle algorithm is unable to reconcile the module in the WSGI app's namespace with the module referenced in the pickle file.

So... I don't know how to fix this, but I do now know why it fails, and I have a satisfactory answer for why it is acting differently on the Linux server (and that is just because that is the only one running under WSGI). Two out of three isn't bad!

Thanks,
-- 
Ben Sizer
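
The mismatch described above can be reproduced in isolation. The following is a sketch under stated assumptions: the class body is a stand-in, the two module names are registered by hand in sys.modules instead of coming from real files or from mod_wsgi, and no importable file named OMDBMap.py is assumed to exist on sys.path. It is written against Python 3, where ModuleNotFoundError is a subclass of ImportError.

```python
import pickle
import sys
import types


class OMDBMap(object):
    """Stand-in for the real class; it only exists to be pickled."""

    def __init__(self, data):
        self.data = data


def simulate_mismatch():
    """Dump under the name 'OMDBMap', then load when only
    'my_wsgi_app.OMDBMap' is registered. Returns (payload, error)."""
    module = types.ModuleType("OMDBMap")
    module.OMDBMap = OMDBMap
    OMDBMap.__module__ = "OMDBMap"  # what the class reports as its home

    # Dump side: the module is known by its short, top-level name.
    sys.modules["OMDBMap"] = module
    try:
        payload = pickle.dumps(OMDBMap({"a": 1}), protocol=2)
    finally:
        del sys.modules["OMDBMap"]

    # Load side: the same module object, but only under the dotted name.
    sys.modules["my_wsgi_app.OMDBMap"] = module
    try:
        pickle.loads(payload)
    except ImportError as exc:
        return payload, exc
    finally:
        del sys.modules["my_wsgi_app.OMDBMap"]
    return payload, None
```

The payload contains the module and class names as text, and unpickling imports the module by exactly that name. A module that is loaded under a different dotted name is therefore not found, even though its code is already in memory.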



#89725

From: Chris Angelico <rosuav@gmail.com>
Date: 2015-05-01 22:09 +1000
Message-ID: <mailman.9.1430482176.3347.python-list@python.org>
In reply to: #89722

On Fri, May 1, 2015 at 9:01 PM, Ben Sizer <kylotan@gmail.com> wrote:
> Another good question, and this gives us the answer. The module lists are quite different, as I'd expect because the load happens in the context of the full application whereas the dump happens as a standalone script. But literally every module that is in the 'before dump' list is in the 'before load' list - except OMDBMap. Like the error says! What /is/ in the 'before load' list however is "my_wsgi_app.OMDBMap". The module has been imported, but the pickle algorithm is unable to reconcile the module in the WSGI app's namespace with the module referenced in the pickle file.
>
> So... I don't know how to fix this, but I do now know why it fails, and I have a satisfactory answer for why it is acting differently on the Linux server (and that is just because that is the only one running under WSGI). Two out of three isn't bad!
>

Cool! That's part way. So, can you simply stuff OMDBMap into
sys.modules prior to loading? It might be a bit of a hack, but it
should work for testing, at least. Conversely, you could change the
dump script to import via the name my_wsgi_app to make it consistent.

ChrisA
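
The first suggestion might look something like the sketch below. The helper alias_module is a hypothetical name, and, as the post says, this is a hack suited to testing more than a recommended fix.

```python
import sys


def alias_module(short_name, loaded_name):
    """Make an already-imported module reachable under a second name.

    alias_module is a hypothetical helper. It leaves any existing entry
    for short_name untouched and returns whatever is registered there.
    """
    if short_name not in sys.modules:
        sys.modules[short_name] = sys.modules[loaded_name]
    return sys.modules[short_name]
```

In the situation from the thread, the call would be alias_module("OMDBMap", "my_wsgi_app.OMDBMap") just before pickle.load. The import performed during unpickling then finds the entry in sys.modules and never searches the filesystem.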



#90542

From: Ben Sizer <kylotan@gmail.com>
Date: 2015-05-13 03:23 -0700
Message-ID: <c46f6933-6a3f-4e13-8396-946e675c35e7@googlegroups.com>
In reply to: #89725

On Friday, 1 May 2015 13:09:48 UTC+1, Chris Angelico  wrote:
> 
> Cool! That's part way. So, can you simply stuff OMDBMap into
> sys.modules prior to loading? It might be a bit of a hack, but it
> should work for testing, at least. Conversely, you could change the
> dump script to import via the name my_wsgi_app to make it consistent.
> 
> ChrisA

Sorry for not coming back to this sooner.

In our case we are probably just going to work with a different file format because this was something like the 3rd time that the way pickle works caused our loading to fail for some reason or other. I think the hoops we need to jump through to ensure that the dumping and loading environments match up are too high relative to the cost of switching to JSON or similar, especially if we might need to load the data from something other than Python in future (god forbid).

-- 
Ben Sizer
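
For comparison, a JSON round trip could be sketched as follows. The class MapRecord, its fields, and the helper functions are invented for illustration; the thread does not show the real data layout. The sketch also assumes the data consists only of types JSON can represent (strings, numbers, lists, dicts, booleans, None).

```python
import json


class MapRecord(object):
    """Hypothetical stand-in for the object that was being pickled."""

    def __init__(self, name, entries):
        self.name = name
        self.entries = entries

    def to_dict(self):
        # Only plain data is written; no module or class name is stored.
        return {"name": self.name, "entries": self.entries}

    @classmethod
    def from_dict(cls, data):
        return cls(data["name"], data["entries"])


def save_json(record, path):
    with open(path, "w") as f:
        json.dump(record.to_dict(), f)


def load_json(path):
    with open(path) as f:
        return MapRecord.from_dict(json.load(f))
```

Because the file holds only data, the loading side decides which class to build. The name under which a module was imported no longer matters, and the file can be read from languages other than Python.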



#89729

From: Peter Otten <__peter__@web.de>
Date: 2015-05-01 14:34 +0200
Message-ID: <mailman.11.1430483669.3347.python-list@python.org>
In reply to: #89722

Ben Sizer wrote:

> On Thursday, 30 April 2015 01:45:05 UTC+1, Chris Angelico  wrote:
>> On Thu, Apr 30, 2015 at 2:01 AM, Ben Sizer wrote:
>> > 1) There clearly is a module named OMDBMap, and it's importable - it's
>> > there in the 2nd line of the traceback.
>> >
>> > Does anybody have any suggestions on how I can go about debugging this?
>> > Or refactoring it to avoid whatever is happening here?
>> 
>> Are you half way through importing it when this load() call happens?
>> That might cause some issues.
> 
> No, we already imported OMDBMap at the top of OMDBSetup.
> 
>> Has your current directory been changed anywhere in there?
> 
> Good question. It turns out that the current directory seems to be $HOME
> when loading, but is the script directory during saving. This will be
> because the Linux server is running under mod_wsgi, whereas we run the
> save script in-place. Our Windows and Mac tests run via Flask's built-in
> server so the working directory is likely to be the same whether we're
> running the script that does pickle.dump or the whole app that does
> pickle.load.
> 
>> What happens if you catch this exception and print out sys.modules at
>> that point?
> 
> Another good question, and this gives us the answer. The module lists are
> quite different, as I'd expect because the load happens in the context of
> the full application whereas the dump happens as a standalone script. But
> literally every module that is in the 'before dump' list is in the 'before
> load' list - except OMDBMap. Like the error says! What /is/ in the 'before
> load' list however is "my_wsgi_app.OMDBMap". The module has been imported,
> but the pickle algorithm is unable to reconcile the module in the WSGI
> app's namespace with the module referenced in the pickle file.
> 
> So... I don't know how to fix this, but I do now know why it fails, and I
> have a satisfactory answer for why it is acting differently on the Linux
> server (and that is just because that is the only one running under WSGI).
> Two out of three isn't bad!

How about moving OMDBMap.py into the parent folder of my_wsgi_app.__file__ 
or any other folder in sys.path?
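
A variation on the second option is to leave the file where it is and add its directory to sys.path instead. This is a sketch, not what the post proposes verbatim; ensure_on_path is a hypothetical helper name.

```python
import os
import sys


def ensure_on_path(directory):
    """Append directory to sys.path (once) so that modules inside it can be
    imported by their short, top-level names.

    ensure_on_path is a hypothetical helper written for this sketch.
    """
    directory = os.path.abspath(directory)
    if directory not in sys.path:
        sys.path.append(directory)
    return directory
```

One caveat with this approach: if the same file is also imported under a dotted name such as my_wsgi_app.OMDBMap, Python treats the two names as separate modules and executes the file twice, producing two distinct class objects, so isinstance checks across the two can fail.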



#90543

From: Ben Sizer <kylotan@gmail.com>
Date: 2015-05-13 03:27 -0700
Message-ID: <a4cc8273-0e0f-4eec-adc3-4edd321343dc@googlegroups.com>
In reply to: #89729

On Friday, 1 May 2015 13:34:41 UTC+1, Peter Otten  wrote:
> Ben Sizer wrote:
> 
> > So... I don't know how to fix this, but I do now know why it fails, and I
> > have a satisfactory answer for why it is acting differently on the Linux
> > server (and that is just because that is the only one running under WSGI).
> > Two out of three isn't bad!
> 
> How about moving OMDBMap.py into the parent folder of my_wsgi_app.__file__ 
> or any other folder in sys.path?

That might work, but wouldn't be practical for us because in some configurations my_wsgi_app doesn't exist at all (as it is an artifact of running under the mod_wsgi environment) and when it does, it is at the top of the hierarchy - so the rest of the app wouldn't be able to access modules there.

It could be put into sys.path somewhere else... but that is starting to break the project structure just to satisfy pickle. Instead, we'll just use a different format in future.

-- 
Ben Sizer


