Re: Unicode in cgi-script with apache2

Date Sun, 17 Aug 2014 07:32:07 +0200
From Dominique Ramaekers <dominique@ramaekers-stassart.be>
Subject Re: Unicode in cgi-script with apache2
Newsgroups comp.lang.python
Message-ID <mailman.13058.1408253857.18130.python-list@python.org>


* My system is a Linux box.

* I've tried using encoding="utf-8". It didn't fix things.

* That print() writes to sys.stdout would explain why writing to 
sys.stdout directly doesn't work any better.

* My locale and the system-wide locale are both UTF-8. Using SetEnv 
PYTHONIOENCODING utf-8 didn't fix things.

* The file is encoded as UTF-8...
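
To check what the CGI process actually sees, as opposed to what the terminal session sees, a small diagnostic script could report the relevant settings. This is only a sketch: the function names encoding_report and render_cgi_response are made up for illustration, and the response is kept ASCII-only on the assumption that stdout under Apache may be limited to ASCII.

```python
import locale
import os
import sys


def encoding_report(environ=None):
    """Collect the settings that decide how print() encodes its output."""
    if environ is None:
        environ = os.environ
    return {
        "sys.stdout.encoding": getattr(sys.stdout, "encoding", None),
        "locale.getpreferredencoding()": locale.getpreferredencoding(),
        "PYTHONIOENCODING": environ.get("PYTHONIOENCODING"),
        "LANG": environ.get("LANG"),
        "LC_ALL": environ.get("LC_ALL"),
    }


def render_cgi_response(report):
    """Build an ASCII-only CGI response so it survives an ASCII stdout."""
    lines = ["Content-Type: text/plain; charset=us-ascii", ""]
    for name in sorted(report):
        # ascii() escapes any non-ASCII character in the value.
        lines.append("%s = %s" % (name, ascii(report[name])))
    return "\n".join(lines) + "\n"


if __name__ == "__main__":
    sys.stdout.write(render_cgi_response(encoding_report()))
```

Running the same script once from the terminal and once through Apache would show whether the two environments differ in locale or in PYTHONIOENCODING.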

I can't speak for anybody else, but in my search I don't believe I 
read about anyone who had the problem on a Windows system. They all 
used Linux (various flavors) or OS X... This is the first time I've 
encountered a situation where Windows handles encoding issues better 
:P +1 for Microsoft...

I think that Apache (*nix versions) doesn't tell Python that it accepts 
UTF-8. Or Python doesn't listen properly... Maybe I should file a bug 
report with both projects?
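
If the CGI process does end up with an ASCII stdout, one possible workaround is to bypass the locale-derived encoding and wrap the underlying byte stream explicitly. The sketch below assumes Python 3; utf8_text_stream and emit_page are illustrative names, and the HTML payload is a placeholder.

```python
import io


def utf8_text_stream(binary_stream):
    """Wrap a binary stream in a text layer that always encodes as UTF-8."""
    return io.TextIOWrapper(binary_stream, encoding="utf-8", newline="\n")


def emit_page(binary_stream):
    """Write a small UTF-8 CGI response to the given binary stream."""
    out = utf8_text_stream(binary_stream)
    out.write("Content-Type: text/html; charset=utf-8\n\n")
    out.write("<p>caf\u00e9</p>\n")
    out.flush()
    # Detach so the wrapper does not close the underlying stream
    # when it is garbage-collected.
    out.detach()
```

In a CGI script this would be called as emit_page(sys.stdout.buffer), since sys.stdout.buffer is the byte stream underneath sys.stdout. The output encoding then no longer depends on the environment Apache passes to the script.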


On 17-08-14 at 04:50, Denis McMahon wrote:
> On Sun, 17 Aug 2014 00:36:14 +0200, Dominique Ramaekers wrote:
>
>> What seems to be the problem:
>> My Script was ok. I know this because in the terminal I got my expected
>> output. Python3 uses UTF-8 coding as a standard. The problem is, when
>> python 'prints' to the apache interface, it translates the string to
>> ascii. (Why, I never found an answer).
> Is the apache server running on a linux or a windows platform?
>
> The problem may not be python, it may be the underlying OS. I wonder if
> apache is spawning a process for python though, and if so whether it is
> in some way constraining the character set available to stdout of the
> spawned process.
>
> From your other message, the error appears to be a python error on
> reading the input file. For some reason python seems to be trying to
> interpret the file it is reading as ascii.
>
> I wonder if specifying the binary data parameter and / or utf-8 encoding
> when opening the file might help.
>
> eg:
>
> f = open( "/var/www/cgi-data/index.html", "rb" )  # binary mode: read() returns bytes
> # open( ..., "rb", encoding="utf-8" ) raises ValueError: binary mode takes no encoding
> f = open( "/var/www/cgi-data/index.html", "r", encoding="utf-8" )  # text mode, decoded as UTF-8
>
> I've managed to drill down a bit further into the problem:
>
> print() goes to sys.stdout
>
> This is part of what the docs say about sys.stdout:
>
> """
> The character encoding is platform-dependent. Under Windows, if the
> stream is interactive (that is, if its isatty() method returns True), the
> console codepage is used, otherwise the ANSI code page. Under other
> platforms, the locale encoding is used (see
> locale.getpreferredencoding()).
>
> Under all platforms though, you can override this value by setting the
> PYTHONIOENCODING environment variable before starting Python.
> """
>
> At this point, details of the OS become very significant. If your server
> is running on a windows platform you may need to figure out how to make
> apache set the PYTHONIOENCODING environment variable to "utf-8" (or
> whatever else is appropriate) before calling the python script.
>
> I believe that the following line in your httpd.conf may have the
> required effect.
>
> SetEnv PYTHONIOENCODING utf-8
>
> Of course, if the file is not encoded as utf-8, but rather something
> else, then use that as the encoding in the above suggestions. If the
> server is not running windows, then I'm not sure where the problem might
> be.
>
