


Re: Unicode in cgi-script with apache2

Date Sun, 17 Aug 2014 00:49:47 +0200
From Dominique Ramaekers <dominique@ramaekers-stassart.be>
Subject Re: Unicode in cgi-script with apache2
References <53EE4D11.7040604@ramaekers-stassart.be> <lsnejt$fa$1@ger.gmane.org>
In-Reply-To <lsnejt$fa$1@ger.gmane.org>
Newsgroups comp.lang.python
Message-ID <mailman.13056.1408229389.18130.python-list@python.org>



Hi Peter,

Your code seems interesting.

I've tried using sys.stdout (in a slightly different form) but it gave 
the same error.

I also read about people who fixed the error by changing the server's
locale to en_US.UTF-8. The people who posted these fixes also said that
you can only use en_US.UTF-8 (and not, e.g., nl_BE.UTF8)... Anyway, it
didn't work for me. And I find this a dirty fix because I don't want to
use a US locale...
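
A quick way to see which encodings Python actually picks up under Apache, as opposed to in the terminal, is a small diagnostic like the sketch below. The function name describe_encodings is made up for illustration. It assumes that whatever the script writes to stderr ends up in the Apache error log, which is the usual CGI behaviour but depends on the server setup.

```python
import locale
import sys


def describe_encodings():
    """Return the encodings Python chose for default file and stdout I/O."""
    return {
        # Default used by open() when no encoding argument is given.
        "preferred": locale.getpreferredencoding(False),
        # Encoding used by print() when writing to stdout.
        "stdout": sys.stdout.encoding,
        # Encoding used for file names.
        "filesystem": sys.getfilesystemencoding(),
    }


if __name__ == "__main__":
    # Write to stderr so the CGI response itself is not affected.
    for name, value in describe_encodings().items():
        print(name, value, sep=": ", file=sys.stderr)
```

If the "preferred" value differs between the terminal and the web server (for example an ASCII-based value under Apache), that would explain why the same script works in one place and raises UnicodeDecodeError in the other.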

Please excuse me for not trying out your specific solutions. I've already
started to implement WSGI instead of CGI. See my previous message...
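
For readers following along, a minimal sketch of what serving the same file through WSGI might look like. This is not the code from the previous message; the factory name make_app is hypothetical, and the charset in the Content-Type header assumes index.html is stored as UTF-8 (the 0xc3 byte in the error suggests so, but that is a guess).

```python
def make_app(html_path):
    """Build a WSGI application that serves html_path as raw bytes."""

    def application(environ, start_response):
        # Read bytes, not text: nothing is decoded, so the locale is irrelevant.
        with open(html_path, "rb") as f:
            body = f.read()
        headers = [
            # Assumption: the file on disk is UTF-8 encoded.
            ("Content-Type", "text/html; charset=utf-8"),
            ("Cache-Control", "no-cache, must-revalidate"),
            ("Content-Length", str(len(body))),
        ]
        start_response("200 OK", headers)
        # WSGI expects an iterable of byte strings.
        return [body]

    return application


# WSGI servers conventionally look for a callable named "application".
application = make_app("/var/www/cgi-data/index.html")
```

Because a WSGI response body is bytes by definition, the decode/encode round trip that triggers the error in the CGI script never happens here.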

grz

On 16-08-14 at 13:17, Peter Otten wrote:
> Dominique Ramaekers wrote:
>
>> I've got a little script:
>>
>> #!/usr/bin/env python3
>> print("Content-Type: text/html")
>> print("Cache-Control: no-cache, must-revalidate")    # HTTP/1.1
>> print("Expires: Sat, 26 Jul 1997 05:00:00 GMT") # Date in the past
>> print("")
>> f = open("/var/www/cgi-data/index.html", "r")
>> for line in f:
>>       print(line,end='')
>>
>> If I run the script in the terminal, it nicely prints the webpage
>> 'index.html'.
>>
>> If I access the script through a web browser, Apache gives an error:
>> UnicodeDecodeError: 'ascii' codec can't decode byte 0xc3 in position
>> 1791: ordinal not in range(128)
>>
>> I've spent a whole afternoon reading fora and blogs, but I don't have a
>> solution.
>>
>> Can anyone help me?
> If the input and output encoding are the same you can avoid the byte-to-text
> (and subsequent text-to-byte) conversion and serve the binary contents of
> the index.html file directly:
>
> #!/usr/bin/env python3
> import sys
>
> print("Content-Type: text/html")
> print("Cache-Control: no-cache, must-revalidate")    # HTTP/1.1
> print("Expires: Sat, 26 Jul 1997 05:00:00 GMT") # Date in the past
> print("")
> sys.stdout.flush()
> with open("/var/www/cgi-data/index.html", "rb") as f:
>      for line in f:
>          sys.stdout.buffer.write(line)
>
> The flush() is necessary to write pending data before accessing the low-level
> stdout.buffer. Instead of the loop you can use any of these:
>
> sys.stdout.buffer.write(f.read()) # not for huge files, but should be OK for
>                                    # typical html file sizes
> sys.stdout.buffer.writelines(f)
> shutil.copyfileobj(f, sys.stdout.buffer) # needs "import shutil"; show off
>                                           # your knowledge of the stdlib ;)
>
>
> Alternatively you could choose an encoding via the locale:
>
> #!/usr/bin/env python3
> import locale
> locale.setlocale(locale.LC_ALL, "en_US.UTF-8")
>
> print("Content-Type: text/html")
> print("Cache-Control: no-cache, must-revalidate")    # HTTP/1.1
> print("Expires: Sat, 26 Jul 1997 05:00:00 GMT") # Date in the past
> print("")
> with open("/var/www/cgi-data/index.html") as f:
>      for line in f:
>          print(line, end='')
>
> Python should then use UTF-8 as the default for i/o and the resulting
> script looks more familiar.
>
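
A third option, not taken from the quoted reply, is to name the encoding explicitly on both the reading and the writing side, so that the result no longer depends on the server's locale at all. The sketch below is only an illustration: the helper name copy_text is made up, and UTF-8 is an assumption about how index.html is stored.

```python
import io


def copy_text(src_path, out_bytes, encoding="utf-8"):
    """Decode src_path and re-encode it onto the binary stream out_bytes.

    The encoding is given explicitly for both directions, so the locale
    of the process plays no role.
    """
    # newline="" disables newline translation on both sides.
    out = io.TextIOWrapper(out_bytes, encoding=encoding, newline="")
    with open(src_path, encoding=encoding, newline="") as f:
        for line in f:
            out.write(line)
    out.flush()
    # Detach so that discarding the wrapper does not close out_bytes.
    out.detach()
```

In the CGI script this would be called as copy_text("/var/www/cgi-data/index.html", sys.stdout.buffer) after printing the headers and calling sys.stdout.flush(). The headers themselves are plain ASCII, so print() can write them regardless of the locale.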


