
Re: How do I encode and decode this data to write to a file?

From Peter Otten <__peter__@web.de>
Subject Re: How do I encode and decode this data to write to a file?
Date Mon, 29 Apr 2013 12:33:16 +0200
Newsgroups comp.lang.python
Message-ID <mailman.1149.1367231585.3114.python-list@python.org>



cl@isbd.net wrote:

> I am debugging some code that creates a static HTML gallery from a
> directory hierarchy full of images. It's this package:-
>     https://pypi.python.org/pypi/Gallery2.py/2.0
> 
> 
> It's basically working and does pretty much what I want so I'm happy to
> put some effort into it and fix things.
> 
> The problem I'm currently chasing is that it can't cope with directory
> names that have accented characters in them, it fails when it tries to
> write the HTML that creates the page with the thumbnails on.
> 
> The code that's failing is:-
> 
>         raw = os.path.join(directory, self.getNameNoExtension()) + ".html"
>         file = open(raw, "w")
>         file.write("".join(html).encode('utf-8'))
>         file.close()
> 
> The variable html is a list containing the lines of HTML to write to the
> file.  It fails when it contains accented characters (an é in this
> case).  Here's the traceback:-
> 
> Traceback (most recent call last):
>   File "/usr/local/lib/python2.7/dist-packages/gallery/galleries.py", line
>   41, in run self._recurse() File
>   "/usr/local/lib/python2.7/dist-packages/gallery/galleries.py", line 272,
>   in _recurse os.path.walk(self.props["sourcedir"], self.processDir, None)
>   File "/usr/lib/python2.7/posixpath.py", line 246, in walk walk(name,
>   func, arg) File "/usr/lib/python2.7/posixpath.py", line 246, in walk
>   walk(name, func, arg) File "/usr/lib/python2.7/posixpath.py", line 246,
>   in walk walk(name, func, arg) File "/usr/lib/python2.7/posixpath.py",
>   line 238, in walk func(arg, top, names) File
>   "/usr/local/lib/python2.7/dist-packages/gallery/galleries.py", line 263,
>   in processDir self.createGallery() File
>   "/usr/local/lib/python2.7/dist-packages/gallery/galleries.py", line 215,
>   in createGallery self.picturemanager.createPictureHTMLs(self.footer)
>   File "/usr/local/lib/python2.7/dist-packages/gallery/picturemanager.py",
>   line 84, in createPictureHTMLs
>   curPic.createPictureHTML(self.galleryDirectory, self.getStylesheet(),
>   self.fullsize, footer) File
>   "/usr/local/lib/python2.7/dist-packages/gallery/picture.py", line 361,
>   in createPictureHTML file.write("".join(html).encode('utf-8'))
>   UnicodeDecodeError: 'ascii' codec can't decode byte 0xc3 in position
>   783: ordinal not in range(128)
> 
> 
> 
> If I understand correctly the encode() is saying that it can't
> understand the data in the html because there's a character 0xc3 in it.
> I *think* this means that the é is encoded in UTF-8 already in the
> incoming data stream (should be as my system is wholly UTF-8 as far as I
> know and I created the directory name).
> 
> So how do I change the code so I don't get the error?  Do I just
> decode() the data first and then encode() it?
> 

Note that you are getting a *UnicodeDecodeError*, not a UnicodeEncodeError.
Try omitting the encode() step, i.e. instead of

>         file.write("".join(html).encode('utf-8'))

use

file.write("".join(html))
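
For readers on Python 3, where the byte-string type is called bytes, the same idea can be sketched roughly as follows. This is only an illustration under assumptions: the html list and the file name page.html are made-up stand-ins, and the data is assumed to be UTF-8 encoded bytes already, as in the original report.

```python
import os
import tempfile

# Hypothetical stand-in for the html list: byte strings that are
# already UTF-8 encoded, like the directory names in the report.
html = [b"<h1>", "caf\u00e9".encode("utf-8"), b"</h1>\n"]

with tempfile.TemporaryDirectory() as tmp:
    path = os.path.join(tmp, "page.html")  # made-up file name

    # Binary mode: the bytes are written unchanged, with no encode() step.
    with open(path, "wb") as f:
        f.write(b"".join(html))

    # Read the file back to check what ended up on disk.
    with open(path, "rb") as f:
        written = f.read()
```

The point is the same as in the Python 2 one-liner: data that is already encoded should be written as-is rather than encoded a second time.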

Background (applies to Python 2 only): the str type deals with bytes, not
code points. The right thing to do is to use .decode(...) to convert from
str to unicode and .encode(...) to convert from unicode to str. In Python 2,
however, the str type also has an encode(...) method, which is basically
equivalent to

class str:
   # imaginary python implementation of python2's str
   ...
   def encode(self, encoding):
       return self.decode("ascii").encode(encoding)

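and is almost never called intentionally. The implicit decode("ascii") step
is what raises your UnicodeDecodeError: byte 0xc3, the first byte of the
UTF-8 encoded é, is outside the ASCII range.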
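
The hidden decode step can be reproduced on Python 3 with a small helper that mimics the imaginary implementation above. This is a sketch, not the real Python 2 source; the function name py2_style_encode is made up for illustration, and bytes plays the role of Python 2's str.

```python
def py2_style_encode(data: bytes, encoding: str) -> bytes:
    """Mimic Python 2's str.encode(): implicit ASCII decode, then encode."""
    return data.decode("ascii").encode(encoding)


# Pure ASCII data survives the round trip unchanged.
ascii_only = py2_style_encode(b"thumbnail", "utf-8")

# An e-acute that is already UTF-8 encoded starts with byte 0xc3.
accented = "\u00e9".encode("utf-8")

try:
    py2_style_encode(accented, "utf-8")
    error = None
except UnicodeDecodeError as exc:
    # The failure comes from the decode step, not from the encode step.
    error = exc
```

Here error ends up holding a UnicodeDecodeError raised by the 'ascii' codec, which is the same kind of failure as in the traceback.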

PS: Python 3 has renamed unicode to str and thus uses Unicode by default.
The old str was renamed to bytes, and the annoying bytes.encode() method is gone.
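
A minimal Python 3 sketch of the same task, assuming the HTML lines are text (str) rather than bytes. The lines list and the file name gallery.html are invented for the example; the encoding is passed to open() so no manual encode() call is needed.

```python
import os
import tempfile

# Hypothetical HTML lines as Python 3 text strings.
lines = ["<html>\n", "<title>caf\u00e9</title>\n", "</html>\n"]

with tempfile.TemporaryDirectory() as tmp:
    path = os.path.join(tmp, "gallery.html")  # made-up file name

    # Text mode with an explicit encoding: open() encodes on write.
    with open(path, "w", encoding="utf-8") as f:
        f.write("".join(lines))

    # Read the raw bytes back to see how the text was encoded.
    with open(path, "rb") as f:
        raw = f.read()

# bytes only offers decode() in Python 3; there is no bytes.encode().
has_encode = hasattr(bytes, "encode")
```

With the types separated like this, encoding an already-encoded byte string by accident is no longer possible, because the method simply does not exist.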


