| Started by | nilsbunger@gmail.com |
|---|---|
| First post | 2013-09-25 09:38 -0700 |
| Last post | 2013-09-26 09:35 -0700 |
| Articles | 8 — 5 participants |
Newline interpretation issue with MIMEApplication with binary data, Python 3.3.2 nilsbunger@gmail.com - 2013-09-25 09:38 -0700
Re: Newline interpretation issue with MIMEApplication with binary data, Python 3.3.2 Chris Angelico <rosuav@gmail.com> - 2013-09-26 14:11 +1000
Re: Newline interpretation issue with MIMEApplication with binary data, Python 3.3.2 Nils Bunger <nilsbunger@gmail.com> - 2013-09-25 21:23 -0700
Re: Newline interpretation issue with MIMEApplication with binary data, Python 3.3.2 Chris Angelico <rosuav@gmail.com> - 2013-09-26 14:32 +1000
Re: Newline interpretation issue with MIMEApplication with binary data, Python 3.3.2 Neil Cerutti <neilc@norwich.edu> - 2013-09-26 13:41 +0000
Re: Newline interpretation issue with MIMEApplication with binary data, Python 3.3.2 Nils Bunger <nilsbunger@gmail.com> - 2013-09-26 08:56 -0700
Re: Newline interpretation issue with MIMEApplication with binary data, Python 3.3.2 Piet van Oostrum <piet@vanoostrum.org> - 2013-09-26 14:44 -0400
Re: Newline interpretation issue with MIMEApplication with binary data, Python 3.3.2 Nils Bunger <nilsbunger@gmail.com> - 2013-09-26 09:35 -0700
| From | nilsbunger@gmail.com |
|---|---|
| Date | 2013-09-25 09:38 -0700 |
| Subject | Newline interpretation issue with MIMEApplication with binary data, Python 3.3.2 |
| Message-ID | <14063249-6159-48ff-bfe2-8e8d6e3cd7a4@googlegroups.com> |
Hi,
I'm having trouble encoding a MIME message with a binary file. Newline characters are being interpreted even though the content is supposed to be binary. This is using Python 3.3.2.
Small test case:
app = MIMEApplication(b'Q\x0dQ', _encoder=encode_noop)
b = io.BytesIO()
g = BytesGenerator(b)
g.flatten(app)
for i in b.getvalue()[-3:]:
    print("%02x " % i, end="")
print()
This prints 51 0a 51, meaning the 0x0d character got reinterpreted as a newline.
I've tried using the HTTP email policy, but that goes even further, converting \r to \r\n.
This is for HTTP transport, so binary encoding is normal.
Any thoughts on how I can do this properly?
| From | Chris Angelico <rosuav@gmail.com> |
|---|---|
| Date | 2013-09-26 14:11 +1000 |
| Message-ID | <mailman.335.1380168701.18130.python-list@python.org> |
| In reply to | #54752 |
On Thu, Sep 26, 2013 at 2:38 AM, <nilsbunger@gmail.com> wrote:
> app = MIMEApplication(b'Q\x0dQ', _encoder=encode_noop)
What is MIMEApplication? It's not a builtin, so your test case is missing an import, at least. Is this email.mime.MIMEApplication?
ChrisA
| From | Nils Bunger <nilsbunger@gmail.com> |
|---|---|
| Date | 2013-09-25 21:23 -0700 |
| Message-ID | <870d6a97-613d-401f-ad03-0bbd3b088538@googlegroups.com> |
| In reply to | #54780 |
Chris,
Thanks for answering.
Yes, it's email.mime.MIMEApplication. I've pasted a snippet with the imports below.
I'm trying to use this to build a multi-part MIME message, with this as one part.
I really can't figure out any way to attach a binary part like this to a multi-part MIME message without the encoding issue... any help would be greatly appreciated!
Nils
---------
import io
from email.mime.application import MIMEApplication
from email.generator import BytesGenerator
from email.encoders import encode_noop
app = MIMEApplication(b'Q\x0dQ', _encoder=encode_noop)
b = io.BytesIO()
g = BytesGenerator(b)
g.flatten(app)
for i in b.getvalue()[-3:]:
    print("%02x " % i, end="")
print()
On Wednesday, September 25, 2013 9:11:31 PM UTC-7, Chris Angelico wrote:
> On Thu, Sep 26, 2013 at 2:38 AM, <nilsbunger@gmail.com> wrote:
> > app = MIMEApplication(b'Q\x0dQ', _encoder=encode_noop)
>
> What is MIMEApplication? It's not a builtin, so your test case is
> missing an import, at least. Is this email.mime.MIMEApplication?
>
> ChrisA
| From | Chris Angelico <rosuav@gmail.com> |
|---|---|
| Date | 2013-09-26 14:32 +1000 |
| Message-ID | <mailman.336.1380170351.18130.python-list@python.org> |
| In reply to | #54781 |
On Thu, Sep 26, 2013 at 2:23 PM, Nils Bunger <nilsbunger@gmail.com> wrote:
> Yes, it's email.mime.MIMEApplication. I've pasted a snippet with the imports below.
>
> I'm trying to use this to build a multi-part MIME message, with this as one part.
>
> I really can't figure out any way to attach a binary part like this to a multi-part MIME message without the encoding issue... any help would be greatly appreciated!
I partly responded just to ping your thread, as I'm not particularly familiar with the email.mime module. But a glance at the docs suggests that MIMEApplication is a "subclass of MIMENonMultipart", so might it be a problem to use that for multipart?
It's designed to handle text, so you may want to use an encoder (like the default base64 one) rather than trying to push binary data through it.
Random ideas; hopefully someone who actually knows the module can respond.
ChrisA
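The base64 suggestion above can be checked with a small round trip. This is only a sketch of that idea, assuming the receiving side is willing to decode base64 (which the original poster's API may not be); the helper name `roundtrip_base64` is made up for illustration.

```python
import io
from email import message_from_bytes
from email.generator import BytesGenerator
from email.mime.application import MIMEApplication


def roundtrip_base64(data: bytes) -> bytes:
    """Serialize data as a base64-encoded MIME part, parse it back, and decode it."""
    # With no _encoder argument, MIMEApplication uses base64 encoding.
    part = MIMEApplication(data)
    buf = io.BytesIO()
    BytesGenerator(buf).flatten(part)
    # Re-parse the flattened bytes and decode the transfer encoding.
    parsed = message_from_bytes(buf.getvalue())
    return parsed.get_payload(decode=True)


if __name__ == "__main__":
    # The lone CR survives because only base64 text passes through the generator.
    print(roundtrip_base64(b"Q\x0dQ"))
```

Because the generator only ever sees ASCII base64 text, its newline handling never touches the original bytes. The cost is that the part is no longer raw binary on the wire.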
| From | Neil Cerutti <neilc@norwich.edu> |
|---|---|
| Date | 2013-09-26 13:41 +0000 |
| Message-ID | <bairt0FcltgU1@mid.individual.net> |
| In reply to | #54782 |
On 2013-09-26, Chris Angelico <rosuav@gmail.com> wrote:
> On Thu, Sep 26, 2013 at 2:23 PM, Nils Bunger <nilsbunger@gmail.com> wrote:
>> Yes, it's email.mime.MIMEApplication. I've pasted a snippet
>> with the imports below.
>>
>> I'm trying to use this to build a multi-part MIME message,
>> with this as one part.
>>
>> I really can't figure out any way to attach a binary part like
>> this to a multi-part MIME message without the encoding
>> issue... any help would be greatly appreciated!
>
> I partly responded just to ping your thread, as I'm not
> particularly familiar with the email.mime module. But a glance
> at the docs suggests that MIMEApplication is a "subclass of
> MIMENonMultipart", so might it be a problem to use that for
> multipart??
>
> It's designed to handle text, so you may want to use an encoder
> (like the default base64 one) rather than trying to push binary
> data through it.
>
> Random ideas, hopefully someone who actually knows the module
> can respond.
I got interested in it since I have never used any of the
modules. So I played with it enough to discover that the part of
the code above that converts the \r to \n is the flatten call.
I got to here and RFC 2049 and gave up.
    The following guidelines may be useful to anyone devising a data
    format (media type) that is supposed to survive the widest range of
    networking technologies and known broken MTAs unscathed. Note that
    anything encoded in the base64 encoding will satisfy these rules, but
    that some well-known mechanisms, notably the UNIX uuencode facility,
    will not. Note also that anything encoded in the Quoted-Printable
    encoding will survive most gateways intact, but possibly not some
    gateways to systems that use the EBCDIC character set.

    (1) Under some circumstances the encoding used for data may
        change as part of normal gateway or user agent
        operation. In particular, conversion from base64 to
        quoted-printable and vice versa may be necessary. This
        may result in the confusion of CRLF sequences with line
        breaks in text bodies. As such, the persistence of
        CRLF as something other than a line break must not be
        relied on.

    (2) Many systems may elect to represent and store text data
        using local newline conventions. Local newline
        conventions may not match the RFC822 CRLF convention --
        systems are known that use plain CR, plain LF, CRLF, or
        counted records. The result is that isolated CR and LF
        characters are not well tolerated in general; they may
        be lost or converted to delimiters on some systems, and
        hence must not be relied on.
So putting a raw CR in a binary chunk may be intolerable, and
you need to use a different encoder. But I'm out of my element.
--
Neil Cerutti
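The behaviour reported in this thread (a lone \r coming out of flatten as \n, or as \r\n under the HTTP policy) is what you would expect if the payload is split on any newline style and rejoined with the generator's line separator. The following is a simplified model of that idea, not the library's actual code; `normalize_newlines` and `NEWLINE` are hypothetical names.

```python
import re

# Assumption for this sketch: any of CRLF, lone CR, or lone LF counts as a line break.
NEWLINE = re.compile(r"\r\n|\r|\n")


def normalize_newlines(text: str, linesep: str = "\n") -> str:
    """Split text on every newline style and rejoin it with a single separator."""
    return linesep.join(NEWLINE.split(text))


if __name__ == "__main__":
    for sep in ("\n", "\r\n"):
        out = normalize_newlines("Q\rQ", sep)
        # Print the result as hex bytes, like the test case in the first post.
        print(" ".join("%02x" % b for b in out.encode("ascii")))
```

Under this model, any binary payload that happens to contain 0x0d or 0x0a bytes is altered, which is why raw binary and a text-oriented line writer do not mix.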
| From | Nils Bunger <nilsbunger@gmail.com> |
|---|---|
| Date | 2013-09-26 08:56 -0700 |
| Message-ID | <defb0c5f-705a-427c-84c9-83e3bab573e3@googlegroups.com> |
| In reply to | #54821 |
Hi Neil,
Thanks for looking at this.
I'm trying to create a multipart MIME for an HTTP POST request, not an email. This is for a third-party API that requires a multipart POST with a binary file, so I don't have the option to just use a different encoding.
Multipart HTTP is standardized in HTTP 1.0 and supports binary parts. Also, no one will re-interpret contents of HTTP on the wire, as binary is quite normal in HTTP.
The issue seems to be that some parts of the Python MIME encoder still assume it's for email only, where everything would be b64 encoded.
Maybe I have to roll my own to create a multipart msg with a binary file? I was hoping to avoid that.
Nils
P.S. You probably know this, but in case anyone else reads this thread, HTTP requires all headers to have CRLF, not native line endings. The Python MIME modules can do that properly as of Python 3.2 (fixed as of this bug http://hg.python.org/cpython/rev/ebf6741a8d6e/)
> I got interested in it since I have never used any of the
> modules. So I played with it enough to discover that the part of
> the code above that converts the \r to \n is the flatten call.
>
> I got to here and RFC 2049 and gave up.
>
> The following guidelines may be useful to anyone devising a data
> format (media type) that is supposed to survive the widest range of
> networking technologies and known broken MTAs unscathed. Note that
> anything encoded in the base64 encoding will satisfy these rules, but
> that some well-known mechanisms, notably the UNIX uuencode facility,
> will not. Note also that anything encoded in the Quoted-Printable
> encoding will survive most gateways intact, but possibly not some
> gateways to systems that use the EBCDIC character set.
>
> (1) Under some circumstances the encoding used for data may
> change as part of normal gateway or user agent
> operation. In particular, conversion from base64 to
> quoted-printable and vice versa may be necessary. This
> may result in the confusion of CRLF sequences with line
> breaks in text bodies. As such, the persistence of
> CRLF as something other than a line break must not be
> relied on.
>
> (2) Many systems may elect to represent and store text data
> using local newline conventions. Local newline
> conventions may not match the RFC822 CRLF convention --
> systems are known that use plain CR, plain LF, CRLF, or
> counted records. The result is that isolated CR and LF
> characters are not well tolerated in general; they may
> be lost or converted to delimiters on some systems, and
> hence must not be relied on.
>
> So putting a raw CR in a binary chunk may be intolerable, and
> you need to use a different encoder. But I'm out of my element.
>
> --
> Neil Cerutti
| From | Piet van Oostrum <piet@vanoostrum.org> |
|---|---|
| Date | 2013-09-26 14:44 -0400 |
| Message-ID | <m2eh8bifsh.fsf@cochabamba.vanoostrum.org> |
| In reply to | #54836 |
Nils Bunger <nilsbunger@gmail.com> writes:
> Hi Neil,
>
> Thanks for looking at this.
>
> I'm trying to create a multipart MIME for an HTTP POST request, not an
> email. This is for a third-party API that requires a multipart POST
> with a binary file, so I don't have the option to just use a different
> encoding.
>
> Multipart HTTP is standardized in HTTP 1.0 and supports binary parts.
> Also, no one will re-interpret contents of HTTP on the wire, as binary
> is quite normal in HTTP.
>
> The issue seems to be some parts of the python MIME encoder still
> assume it's for email only, where everything would be b64 encoded.
>
> Maybe I have to roll my own to create a multipart msg with a binary
> file? I was hoping to avoid that.
The email MIME stuff is not really adapted for HTTP. I would advise using the Requests package (http://docs.python-requests.org/en/latest/) or the Uploading Files part from Doug Hellmann's page (http://doughellmann.com/2009/07/pymotw-urllib2-library-for-opening-urls.html). This is for Python 2; I can send you a Python 3 version if you want.
--
Piet van Oostrum <piet@vanoostrum.org>
WWW: http://pietvanoostrum.com/
PGP key: [8DAE142BE17999C4]
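For the "roll my own" route, a multipart/form-data body can be assembled directly as bytes, so nothing gets a chance to reinterpret newlines in the binary part. This is a minimal sketch and not the code from either link above; `encode_multipart`, the field names, and the file name are made up, and it assumes the random boundary does not occur in the data and that names need no escaping.

```python
import uuid


def encode_multipart(fields, files):
    """Build a multipart/form-data body by hand.

    fields: dict mapping field name -> str value
    files:  dict mapping field name -> (filename, data bytes, content type)
    Returns (content_type_header_value, body_bytes).
    """
    boundary = uuid.uuid4().hex
    delimiter = b"--" + boundary.encode("ascii")
    chunks = []
    for name, value in fields.items():
        chunks += [
            delimiter,
            f'Content-Disposition: form-data; name="{name}"'.encode("utf-8"),
            b"",
            value.encode("utf-8"),
        ]
    for name, (filename, data, ctype) in files.items():
        chunks += [
            delimiter,
            f'Content-Disposition: form-data; name="{name}"; '
            f'filename="{filename}"'.encode("utf-8"),
            f"Content-Type: {ctype}".encode("ascii"),
            b"",
            data,  # raw bytes, inserted untouched
        ]
    chunks += [delimiter + b"--", b""]
    # Lines are joined with CRLF; the binary data itself is never split or rewritten.
    body = b"\r\n".join(chunks)
    return f"multipart/form-data; boundary={boundary}", body


if __name__ == "__main__":
    content_type, body = encode_multipart(
        {"title": "demo"},
        {"file": ("blob.bin", b"Q\x0dQ", "application/octet-stream")},
    )
    print(content_type)
    print(body)
```

The returned content type goes into the request's Content-Type header and the body is sent as-is. A more careful version would check that the boundary does not appear in any part and would escape quotes in names.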
| From | Nils Bunger <nilsbunger@gmail.com> |
|---|---|
| Date | 2013-09-26 09:35 -0700 |
| Message-ID | <a675d865-8961-49a5-b70f-0d795b353aa2@googlegroups.com> |
| In reply to | #54752 |
Hi all,
I was able to work around this problem by encoding a unique 'marker' in the binary part, then replacing the marker with the actual binary content after generating the MIME message.
See my answer on Stack Overflow http://stackoverflow.com/a/19033750/526098 for the code.
Thanks, your suggestions helped me think of this.
Nils
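The marker approach described above might look roughly like the following. This is a sketch reconstructed from the description in this post, not the code from the Stack Overflow answer; `build_multipart`, the field name, and the file name are hypothetical, and it assumes the marker is plain ASCII with no newlines and that the binary data does not contain the generated boundary.

```python
import io
import uuid
from email.encoders import encode_noop
from email.generator import BytesGenerator
from email.mime.application import MIMEApplication
from email.mime.multipart import MIMEMultipart


def build_multipart(binary_data: bytes) -> bytes:
    """Generate the MIME structure around a placeholder, then swap in the real bytes."""
    # A random hex string: ASCII only and newline-free, so the generator leaves it alone.
    marker = uuid.uuid4().hex.encode("ascii")

    outer = MIMEMultipart("form-data")
    part = MIMEApplication(marker, _encoder=encode_noop)
    part.add_header("Content-Disposition", "form-data",
                    name="file", filename="blob.bin")
    outer.attach(part)

    buf = io.BytesIO()
    BytesGenerator(buf, mangle_from_=False).flatten(outer)

    # Replace the placeholder after flattening, so the binary data never
    # passes through the generator's line handling.
    return buf.getvalue().replace(marker, binary_data)


if __name__ == "__main__":
    print(build_multipart(b"Q\x0dQ"))
```

With the default policy the header lines end in \n; for HTTP you would still need CRLF line endings, and you would need to separate the top-level headers from the body before sending.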