Getting ASCII encoding where unicode wanted under Py3k

Date Mon, 13 May 2013 10:59:33 -0500
Subject Getting ASCII encoding where unicode wanted under Py3k
From Jonathan Hayward <jonathan.hayward@pobox.com>
To The Chicago Python Users Group <chicago@python.org>, python-list@python.org
Newsgroups comp.lang.python
Message-ID <mailman.1629.1368460785.3114.python-list@python.org>

I have a Py3k script, pasted below. When I run it, I get an error saying
that the ASCII codec can't encode a character outside the 7-bit range.

The error that I am getting is:

UnicodeEncodeError: 'ascii' codec can't encode character '\u0161' in
position 1442: ordinal not in range(128)
      args = ('ascii', "Content-Type: text/html\n\n<!DOCTYPE
html>\n<html>\n...ype='submit'>\n </form>\n </body>\n</html>", 1442,
1443, 'ordinal not in range(128)')
      encoding = 'ascii'
      end = 1443
      object = "Content-Type: text/html\n\n<!DOCTYPE
html>\n<html>\n...ype='submit'>\n </form>\n </body>\n</html>"
      reason = 'ordinal not in range(128)'
      start = 1442
      with_traceback = <built-in method with_traceback of
UnicodeEncodeError object>

(I also posted this to StackOverflow; so far there has been one
shot-in-the-dark answer.)
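
A minimal sketch of what seems to be going on, outside the CGI environment.
It assumes the cause is that stdout is using the ASCII codec (for instance
because the web server runs the script without a UTF-8 locale); that is a
guess, not something the traceback confirms. The name ascii_out is
hypothetical and stands in for sys.stdout. Building the string with % works
fine; it is the print to an ASCII stream that raises.

```python
import io

# Hypothetical stand-in for sys.stdout as the CGI script might see it:
# a text stream whose encoding is ASCII (an assumption about the server).
ascii_out = io.TextIOWrapper(io.BytesIO(), encoding='ascii')

message = '\u0161'  # the character named in the traceback
page = 'Content-Type: text/html\n\n<p>%(message)s</p>' % locals()  # no error here

try:
    print(page, file=ascii_out)  # encoding to ASCII happens on write
    error = None
except UnicodeEncodeError as exc:
    error = exc

# ascii() keeps this line printable even on an ASCII-only terminal.
print(type(error).__name__, error.encoding,
      ascii(error.object[error.start:error.end]))
```

If this reproduces the same exception, the % locals() formatting is probably
not the culprit and the stream encoding is.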

My code is below. What should I be doing differently? In the most
immediate sense, the question is about the calls that print
'''%(foo)s''' % locals().

#!/usr/bin/python3

import cgi
import cgitb;cgitb.enable()
import os
import pickle
import sys

cgi_form = cgi.FieldStorage()

message = ''

def get_cgi(field, default = ""):
    return cgi_form.getfirst(field, default)

try:
    sys.stderr.write('abc: 1')
    input_file = open(
      os.path.join(os.path.dirname(__file__), '../../../russian/pickled'), 'rb')
    sys.stderr.write('abc: 2')
    state = pickle.load(input_file)
    sys.stderr.write('abc: 3')
    state['changed'] = False
    sys.stderr.write('abc: 4')
    state['loaded'] = True
    sys.stderr.write('abc: 5')
except IOError:
    state = {}
    state['phrases'] = []
    state['changed'] = True
    state['loaded'] = False
#except UnicodeDecodeError:
    #state = {}
    #state['phrases'] = []
    #state['changed'] = True
    #state['loaded'] = False

if get_cgi('russian') and get_cgi('english'):
    state['phrases'].append([get_cgi('russian'), get_cgi('english')])
    message = 'Your changes have been saved.'
    state['changed'] = True
elif get_cgi('english'):
    state['phrases'].append([None, get_cgi('english')])
    message = 'Your change has been saved.'
    state['changed'] = True

if get_cgi('mode') == 'edit':
    to_delete = []
    for index in range(len(state['phrases'])):
        if get_cgi('russian_' + str(index), None) is not None:
            state['phrases'][index][0] = get_cgi('russian_' + str(index))
        if get_cgi('english_' + str(index), None) is not None:
            state['phrases'][index][1] = get_cgi('english_' + str(index))
        if get_cgi('delete_' + str(index), None) is not None:
            to_delete.insert(0, index)
            #to_delete.append(index)
    #to_delete.sort(lambda a, b: -cmp(a, b))
    for element in to_delete:
        del state['phrases'][element]
    state['changed'] = True

sys.stderr.write('abc: ' + repr(state))

if state['changed']:
    output_file = open(
      os.path.join(os.path.dirname(__file__), '../../../russian/pickled_new.' +
        str(os.getpid())), 'wb')
    pickle.dump(state, output_file)
    output_file.close()
    os.rename(
      os.path.join(os.path.dirname(__file__), '../../../russian/pickled_new.' +
        str(os.getpid())),
      os.path.join(os.path.dirname(__file__), '../../../russian/pickled'))

if get_cgi('mode') == 'add':
    print('''Content-type: text/html

<!DOCTYPE html>
<html>
    <head>
        <meta charset='UTF-8' />
        <style type="text/css">
            body
                {
                font-family: Verdana, Arial, sans-serif;
                }
            input[type=text]
                {
                width: 100%%;
                }
            div.message
                {
                background-color: silver;
                }
        </style>
    </head>
    <body>
        <div class='message'>
            %(message)s
        </div>
        <form action='' method='POST'>
            <p><strong>Russian:</strong><br />
            <input type='text' id='russian' name='russian'></p>

            <p><strong>English:</strong><br />
            <input type='text' id='english' name='english'></p>

            <p><input type='submit'>
        </form>
        <script src='/include/jquery.js'></script>
        <script>
            jQuery('#russian').focus();
        </script>
    </body>
</html>''' % locals())
elif get_cgi('mode') == 'edit':
    edit_table = '<table>'
    edit_table += '<thead>'
    edit_table += '<th>Russian</th>'
    edit_table += '<th>English</th>'
    edit_table += '<th>Delete</th>'
    edit_table += '</thead>'
    edit_table += '<tbody>'
    for index in range(len(state['phrases'])):
        if state['phrases'][index][0]:
            russian = state['phrases'][index][0].replace('"', "''")
        else:
            russian = None
        english = state['phrases'][index][1].replace('"', "''")
        edit_table += '<tr>'
        edit_table += '<td>'
        if russian is not None:
            edit_table += '''<input type='text' name='russian_%(index)d'
            id='russian_%(index)d' value="%(russian)s">''' % locals()
        edit_table += '</td>\n'
        edit_table += '<td>'
        edit_table += '''<input type='text' name='english_%(index)d'
        id='english_%(index)d' value="%(english)s">''' % locals()
        edit_table += '</td>\n'
        edit_table += '<td>'
        edit_table += '''<input type='checkbox' name='delete_%(index)d'
        id='delete_%(index)d'>''' % locals()
        edit_table += '</td>\n'
        edit_table += '</tr>'
    edit_table += '</tbody>'
    edit_table += '</table>'
    print ('''Content-Type: text/html

<!DOCTYPE html>
<html>
    <head>
        <title>Edit Phrases</title>
        <meta charset='utf-8' />
    </head>
    <body>
        <form action='' name='edit' id='edit'>
            <input type='hidden' name='mode' value='edit' />
            %(edit_table)s
            <input type='submit'>
        </form>
    </body>
</html>''' % locals())
else:
    text = ''
    for phrase in state['phrases']:
        if phrase[0]:
            text += ('<p title="' + phrase[1].replace('"', "''") + '">' +
              phrase[0] + '</p>')
        else:
            text += '<h2>' + phrase[1] + '</h2>'

    print ('''Content-type: text/html

<!DOCTYPE html>
<html>
    <head>
        <title>___________</title>
        <meta charset='utf-8'>
        <style type='text/css'>
            body
                {
                font-family: Verdana, Arial, sans-serif;
                }
            div
                {
                background-color: #ffff00;
                }
        </style>
        <link rel='stylesheet' type='text/css'
        href='/js/jquery-ui-1.10.2.custom/css/smoothness/jquery-ui-1.10.2.custom.css'
        />
    </head>
    <body>
        %(text)s
        <script src='/js/vendor/jquery-1.8.2-min.js'></script>
        <script
        src='/js/jquery-ui-1.10.2.custom/js/jquery-ui-1.10.2.custom.min.js'></script>
        <script>
            jQuery(document).tooltip();
        </script>
    </body>
</html>''' % locals())
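
One possible direction, offered as a sketch under assumptions rather than a
confirmed fix: build the page as a str exactly as above, then write it to
the binary stdout stream as UTF-8 bytes, so that whatever encoding stdout
was given no longer matters. The helper name emit_page and the sample body
are hypothetical, and this has not been tried on the server in question.

```python
import io
import sys

def emit_page(body, out=None):
    """Write the CGI header plus an HTML body as UTF-8 bytes.

    out is any binary stream; it defaults to sys.stdout.buffer.
    """
    if out is None:
        out = sys.stdout.buffer
    header = 'Content-Type: text/html; charset=utf-8\n\n'
    out.write((header + body).encode('utf-8'))
    out.flush()

# Demonstration against an in-memory stream instead of the real stdout.
demo = io.BytesIO()
text = '\u0161'
emit_page('<p>%(text)s</p>' % locals(), out=demo)
result = demo.getvalue()
```

In the script itself, each print('''...''' % locals()) would become a call
like emit_page('''...''' % locals()), with the Content-Type line dropped
from the template since the helper writes it.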


