Groups > comp.lang.python > #51641 > unrolled thread
| Started by | "Frank Millman" <frank@chagford.com> |
|---|---|
| First post | 2013-07-31 13:43 +0200 |
| Last post | 2013-08-01 10:03 +0200 |
| Articles | 3 — 2 participants |
This discussion starts older than the indexed window; earlier articles aren't shown. The article labeled Started by
below is the oldest one visible, not the original post.
Re: Problem with psycopg2, bytea, and memoryview "Frank Millman" <frank@chagford.com> - 2013-07-31 13:43 +0200
Re: Problem with psycopg2, bytea, and memoryview Neil Cerutti <neilc@norwich.edu> - 2013-07-31 14:08 +0000
Re: Problem with psycopg2, bytea, and memoryview "Frank Millman" <frank@chagford.com> - 2013-08-01 10:03 +0200
| From | "Frank Millman" <frank@chagford.com> |
|---|---|
| Date | 2013-07-31 13:43 +0200 |
| Subject | Re: Problem with psycopg2, bytea, and memoryview |
| Message-ID | <mailman.10.1375270989.1251.python-list@python.org> |
"Antoine Pitrou" <solipsis@pitrou.net> wrote in message news:loom.20130731T114936-455@post.gmane.org... > Frank Millman <frank <at> chagford.com> writes: >> >> I have some binary data (a gzipped xml object) that I want to store in a >> database. For PostgreSQL I use a column with datatype 'bytea', which is >> their recommended way of storing binary strings. >> >> I use psycopg2 to access the database. It returns binary data in the form >> of >> a python 'memoryview'. >> > [...] >> >> Using MS SQL Server and pyodbc, it returns a byte string, not a >> memoryview, >> and it does compare equal with the original. >> >> I can hack my program to use tobytes(), but it would add complication, >> and >> it would be database-specific. I would prefer a cleaner solution. > > Just cast the result to bytes (`bytes(row[1])`). It will work both with > bytes > and memoryview objcts. > > Regards > > Antoine. > Thanks for that, Antoine. It is an improvement over tobytes(), but i am afraid it is still not ideal for my purposes. At present, I loop over a range of columns, comparing 'before' and 'after' values, without worrying about their types. Strings are returned as str, integers are returned as int, etc. Now I will have to check the type of each column before deciding whether to cast to 'bytes'. Can anyone explain *why* the results do not compare equal? If I understood the problem, I might be able to find a workaround. Frank
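Frank's concern above is that casting forces a per-column type check inside his comparison loop. One possible way to keep that loop type-agnostic is to route every value through a small helper that converts only `memoryview` objects and passes everything else through untouched. The sketch below is an illustration, not code from the thread: the names `normalize` and `changed_columns` and the sample rows are hypothetical, and the `cast("c")` call only stands in for a driver-specific view.

```python
def normalize(value):
    """Return bytes for a memoryview; return any other value unchanged."""
    if isinstance(value, memoryview):
        return value.tobytes()
    return value


def changed_columns(before, after):
    """Return the indexes of columns whose values differ between two rows."""
    return [
        i
        for i, (old, new) in enumerate(zip(before, after))
        if normalize(old) != normalize(new)
    ]


# Hypothetical rows: 'before' holds what the program wrote, 'after' holds
# what a driver handed back (the binary column arriving as a memoryview).
payload = b"\x1f\x8b\x08gzipped"
before = ("widget", 42, payload)
after = ("widget", 43, memoryview(payload).cast("c"))

print(changed_columns(before, after))
```

The type check still exists, but it lives in one place, so the loop itself does not need to know which columns are binary or which database produced them.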
| From | Neil Cerutti <neilc@norwich.edu> |
|---|---|
| Date | 2013-07-31 14:08 +0000 |
| Message-ID | <b5sk3cFkiq8U1@mid.individual.net> |
| In reply to | #51641 |
On 2013-07-31, Frank Millman <frank@chagford.com> wrote:
>
> "Antoine Pitrou" <solipsis@pitrou.net> wrote in message
> news:loom.20130731T114936-455@post.gmane.org...
>> Frank Millman <frank <at> chagford.com> writes:
>>>
>>> I have some binary data (a gzipped xml object) that I want to store in a
>>> database. For PostgreSQL I use a column with datatype 'bytea', which is
>>> their recommended way of storing binary strings.
>>>
>>> I use psycopg2 to access the database. It returns binary data
>>> in the form of a python 'memoryview'.
>>>
>> [...]
>>>
>>> Using MS SQL Server and pyodbc, it returns a byte string, not
>>> a memoryview, and it does compare equal with the original.
>>>
>>> I can hack my program to use tobytes(), but it would add
>>> complication, and it would be database-specific. I would
>>> prefer a cleaner solution.
>>
>> Just cast the result to bytes (`bytes(row[1])`). It will work
>> both with bytes and memoryview objects.
>
> Thanks for that, Antoine. It is an improvement over tobytes(),
> but I am afraid it is still not ideal for my purposes.
>
> At present, I loop over a range of columns, comparing 'before'
> and 'after' values, without worrying about their types. Strings
> are returned as str, integers are returned as int, etc. Now I
> will have to check the type of each column before deciding
> whether to cast to 'bytes'.
>
> Can anyone explain *why* the results do not compare equal? If I
> understood the problem, I might be able to find a workaround.

A memoryview will compare equal to another object that supports the buffer protocol when the format and shape are also equal. The database must be returning chunks of binary data in a different shape or format than you are writing it.

Perhaps psycopg2 is returning a chunk of ints when you have written a chunk of bytes. Check the .format and .shape members of the return value to see.

    >>> x = memoryview(b"12345")
    >>> x.format
    'B'
    >>> x.shape
    (5,)
    >>> x == b"12345"
    True

My guess is you're getting format "I" from psycopg2. Hopefully there's a way to coerce your desired "B" format interpretation of the raw data using psycopg2's API.

--
Neil Cerutti
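Neil's suggestion to inspect `.format` and `.shape` can be tried without a database. The sketch below is a standalone illustration under one assumption: a view with a non-`'B'` format is simulated with `memoryview.cast("c")`, since no psycopg2 connection is available here. The helper name `describe` is hypothetical. On Python 3.3 and later, items of a `'c'` view unpack to length-1 `bytes` objects while items of a `'B'` view unpack to ints, so the second comparison is expected to come out unequal even though the underlying buffer is identical.

```python
def describe(view):
    """Return the (format, shape) pair that Neil suggests inspecting."""
    return view.format, view.shape


data = b"12345"

plain = memoryview(data)      # format 'B': items are ints
as_chars = plain.cast("c")    # format 'c': items are length-1 bytes

print(describe(plain), plain == data)
print(describe(as_chars), as_chars == data)

# Two ways back to something that compares equal to the original bytes:
print(as_chars.cast("B") == data)   # reinterpret the same buffer as 'B'
print(bytes(as_chars) == data)      # or copy the contents into bytes
```

Reinterpreting with `cast("B")` avoids a copy, while `bytes(...)` produces an independent object that behaves like the value returned by drivers that hand back a byte string.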
| From | "Frank Millman" <frank@chagford.com> |
|---|---|
| Date | 2013-08-01 10:03 +0200 |
| Message-ID | <mailman.70.1375344202.1251.python-list@python.org> |
| In reply to | #51659 |
"Neil Cerutti" <neilc@norwich.edu> wrote in message news:b5sk3cFkiq8U1@mid.individual.net... > On 2013-07-31, Frank Millman <frank@chagford.com> wrote: >> >> >> Can anyone explain *why* the results do not compare equal? If I >> understood the problem, I might be able to find a workaround. > > A memoryview will compare equal to another object that supports > the buffer protocol when the format and shape are also equal. The > database must be returning chunks of binary data in a different > shape or format than you are writing it. > > Perhaps psycopg2 is returning a chunk of ints when you have > written a chunk of bytes. Check the .format and .shape members of > the return value to see. > >>>> x = memoryview(b"12345") >>>> x.format > 'B' >>>> x.shape > (5,) >>>> x == b"12345" > True > > My guess is you're getting format "I" from psycopg2. Hopefully > there's a way to coerce your desired "B" format interpretation of > the raw data using psycopg2's API. > Thanks very much for the explanation, Neil. I tried what you suggested, and the object returned by psycopg2 has a format of 'c' and a shape of (5,). I don't know what it means, but luckily I have found a workaround. I enquired on the psycopg2 list, and someone explained how I can create an extension that forces it to return 'bytes' instead of a 'memoryview'. I tested it and it works. Problem solved :-) For the record, I passed on the suggestion from Antoine and Terry that they change their program to return 'bytes'. It will be interesting to see if anyone responds. Thanks again to all for your help. Frank
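Frank does not show the extension he was given, so the following is only a guess at its general shape, built on psycopg2's typecaster API (`psycopg2.extensions.new_type`, `psycopg2.extensions.register_type`, and the stock `psycopg2.BINARY` caster). The type name `"BYTEA2BYTES"` and the helper names `make_bytea_caster` and `register_bytea_as_bytes` are hypothetical. The wrapper is kept separate from the registration step so that it can be exercised without psycopg2 installed.

```python
def make_bytea_caster(binary_caster):
    """Wrap a typecaster so it returns bytes instead of a memoryview.

    `binary_caster` is any callable taking (value, cursor) and returning a
    buffer object, or None for SQL NULL.
    """
    def bytea_to_bytes(value, cursor):
        view = binary_caster(value, cursor)
        return None if view is None else bytes(view)

    return bytea_to_bytes


def register_bytea_as_bytes():
    """Register the wrapper globally. Requires psycopg2 to be installed."""
    import psycopg2
    import psycopg2.extensions

    caster = psycopg2.extensions.new_type(
        psycopg2.BINARY.values,              # OIDs handled by the stock caster
        "BYTEA2BYTES",                       # hypothetical name for the new type
        make_bytea_caster(psycopg2.BINARY),  # delegate parsing, then copy to bytes
    )
    psycopg2.extensions.register_type(caster)
```

Calling `register_bytea_as_bytes()` once at start-up would apply the conversion to every connection, which matches Frank's wish to keep the database-specific handling out of his comparison loop. The cost is one extra copy per binary value.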