| Started by | dieter <dieter@handshake.de> |
|---|---|
| First post | 2015-02-12 07:48 +0100 |
| Last post | 2015-02-12 07:48 +0100 |
| Articles | 1 — 1 participant |
This discussion starts older than the indexed window; earlier articles aren't shown. The article labeled Started by
below is the oldest one visible, not the original post.
Re: urllib.parser.quote() and RFC 2396: unreserved characters get encoded dieter <dieter@handshake.de> - 2015-02-12 07:48 +0100
| From | dieter <dieter@handshake.de> |
|---|---|
| Date | 2015-02-12 07:48 +0100 |
| Subject | Re: urllib.parser.quote() and RFC 2396: unreserved characters get encoded |
| Message-ID | <mailman.18682.1423723696.18130.python-list@python.org> |
Bruno Cauet <brunocauet@gmail.com> writes:
> Unicode characters outside the ASCII range also get encoded when they
> have no reason to, e.g.
> >>> pathlib.PurePath("/home/싸이/").as_uri()
> 'file:///home/%EC%8B%B8%EC%9D%B4'
Non-ASCII characters are not legal URI characters.
Look at section 2.3 of "http://www.faqs.org/rfcs/rfc2396.html".
There you see "unreserved = alphanum | mark", with "alphanum"
defined in section 1.6 as the ASCII letters and digits.
See also section 2.1 ("URI and non-ASCII characters"). It says
that non-ASCII characters should be UTF-8 encoded and then URI-escaped.
Thus, the handling (by "urllib")
of non-ASCII Unicode characters seems to be correct.
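
To illustrate the two steps (UTF-8 encode, then URI-escape), here is a minimal sketch. The helper "escape_non_ascii" is a hypothetical name made up for this illustration, not part of "urllib"; it only escapes bytes outside the ASCII range and is compared against "urllib.parse.quote" with its default arguments.

```python
from urllib.parse import quote, unquote

path = "/home/싸이/"


def escape_non_ascii(text):
    """Hypothetical helper: UTF-8 encode, then percent-escape non-ASCII bytes.

    ASCII bytes are passed through untouched, so this is NOT a full
    replacement for quote(); it only shows what happens to non-ASCII input.
    """
    pieces = []
    for byte in text.encode("utf-8"):
        if byte < 0x80:
            pieces.append(chr(byte))             # plain ASCII, kept as-is
        else:
            pieces.append("%{:02X}".format(byte))  # one escape per UTF-8 byte
    return "".join(pieces)


manual = escape_non_ascii(path)
quoted = quote(path)  # defaults: safe="/", encoding="utf-8"

print(manual)
print(quoted)
print(unquote(quoted) == path)
```

For this input both approaches should give the same escaped string, and "unquote" recovers the original path, so nothing is lost by the escaping.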