| Started by | howmuchistoday@gmail.com |
|---|---|
| First post | 2012-06-27 18:14 -0700 |
| Last post | 2012-06-28 19:18 +0200 |
| Articles | 6 — 4 participants |
Is there any way to decode String using unknown codec? howmuchistoday@gmail.com - 2012-06-27 18:14 -0700
Re: Is there any way to decode String using unknown codec? Benjamin Kaplan <benjamin.kaplan@case.edu> - 2012-06-27 19:20 -0700
Re: Is there any way to decode String using unknown codec? howmuchistoday@gmail.com - 2012-06-28 14:27 -0700
Re: Is there any way to decode String using unknown codec? howmuchistoday@gmail.com - 2012-06-28 14:27 -0700
Re: Is there any way to decode String using unknown codec? MRAB <python@mrabarnett.plus.com> - 2012-06-28 12:28 +0100
Re: Is there any way to decode String using unknown codec? Dieter Maurer <dieter@handshake.de> - 2012-06-28 19:18 +0200
| From | howmuchistoday@gmail.com |
|---|---|
| Date | 2012-06-27 18:14 -0700 |
| Subject | Is there any way to decode String using unknown codec? |
| Message-ID | <c67686b6-4f98-4408-a89c-edc0a6030c24@googlegroups.com> |
Hi,
I'm Korean, and when I use modules like sys, os, etc.,
the interpreter sometimes shows me broken strings like
'\x13\xb3\x12\xc8'.
It must be the Korean "alphabet", but I can't decode it the right way.
I tried decoding it with codecs like cp949, mbcs, and utf-8,
but it failed.
The only way I found is eval('\x13\xb3\x12\xc8').
It raises an error that shows the correct Korean.
Is there any way to handle it so that it is not broken?
| From | Benjamin Kaplan <benjamin.kaplan@case.edu> |
|---|---|
| Date | 2012-06-27 19:20 -0700 |
| Message-ID | <mailman.1580.1340850037.4697.python-list@python.org> |
| In reply to | #24571 |
On Wed, Jun 27, 2012 at 6:14 PM, <howmuchistoday@gmail.com> wrote:
> Hi,
> I'm Korean, and when I use modules like sys, os, etc.,
> the interpreter sometimes shows me broken strings like
> '\x13\xb3\x12\xc8'.
> It must be the Korean "alphabet", but I can't decode it the right way.
> I tried decoding it with codecs like cp949, mbcs, and utf-8,
> but it failed.
> The only way I found is eval('\x13\xb3\x12\xc8').
> It raises an error that shows the correct Korean.
> Is there any way to handle it so that it is not broken?
> --
It's not broken. You're just using the wrong encodings. Try utf-16le.
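
A minimal sketch of that suggestion, assuming Python 3 and assuming the four bytes from the original post are the complete value (in Python 2 the same literal would be a plain `str` rather than `bytes`):

```python
# Bytes copied from the original post.
raw = b'\x13\xb3\x12\xc8'

# UTF-16 little-endian: every two bytes form one code unit, low byte first,
# so b'\x13\xb3' is U+B313 and b'\x12\xc8' is U+C812.
text = raw.decode('utf-16-le')

print(text)                           # 댓젒
print([hex(ord(ch)) for ch in text])  # ['0xb313', '0xc812']
```

Both code points fall in the Hangul Syllables block (U+AC00 to U+D7A3), which is consistent with the data being Korean text.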
| From | howmuchistoday@gmail.com |
|---|---|
| Date | 2012-06-28 14:27 -0700 |
| Message-ID | <mailman.1622.1340918855.4697.python-list@python.org> |
| In reply to | #24574 |
On Thursday, June 28, 2012 at 11:20:28 AM UTC+9, Benjamin Kaplan wrote:
> On Wed, Jun 27, 2012 at 6:14 PM, <howmuchistoday@gmail.com> wrote:
> > Hi,
> > I'm Korean, and when I use modules like sys, os, etc.,
> > the interpreter sometimes shows me broken strings like
> > '\x13\xb3\x12\xc8'.
> > It must be the Korean "alphabet", but I can't decode it the right way.
> > I tried decoding it with codecs like cp949, mbcs, and utf-8,
> > but it failed.
> > The only way I found is eval('\x13\xb3\x12\xc8').
> > It raises an error that shows the correct Korean.
> > Is there any way to handle it so that it is not broken?
> > --
>
> It's not broken. You're just using the wrong encodings. Try utf-16le.
Thank you guys. The problem is solved!
| From | MRAB <python@mrabarnett.plus.com> |
|---|---|
| Date | 2012-06-28 12:28 +0100 |
| Message-ID | <mailman.1596.1340882887.4697.python-list@python.org> |
| In reply to | #24571 |
On 28/06/2012 02:14, howmuchistoday@gmail.com wrote:
> Hi,
> I'm Korean, and when I use modules like sys, os, etc.,
> the interpreter sometimes shows me broken strings like
> '\x13\xb3\x12\xc8'.
> It must be the Korean "alphabet", but I can't decode it the right way.
> I tried decoding it with codecs like cp949, mbcs, and utf-8,
> but it failed.
> The only way I found is eval('\x13\xb3\x12\xc8').
> It raises an error that shows the correct Korean.
> Is there any way to handle it so that it is not broken?
>
It might be UTF-16:
>>> b'\x13\xb3\x12\xc8'.decode("utf16")
'댓젒'
I don't know Korean, but that looks reasonable!
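
When the codec really is unknown, one option is simply to try several candidates and inspect the results. The sketch below assumes Python 3; the helper name `try_decodings` and the candidate list are hypothetical choices for illustration, not part of any library. As far as I know, the plain "utf16" codec falls back to the platform's native byte order when the data has no byte order mark, so the byte order is spelled out explicitly here.

```python
def try_decodings(raw, candidates=('utf-8', 'cp949', 'utf-16-le', 'utf-16-be')):
    """Return {codec_name: decoded_text} for each candidate that decodes cleanly."""
    results = {}
    for name in candidates:
        try:
            results[name] = raw.decode(name)
        except UnicodeDecodeError:
            # This codec cannot represent these bytes; skip it.
            pass
    return results


raw = b'\x13\xb3\x12\xc8'
for name, text in try_decodings(raw).items():
    print(name, repr(text))
```

A clean decode does not prove the codec is the right one: both UTF-16 byte orders accept these four bytes, and only the little-endian result is Hangul. A human (or a script check on the resulting code points) still has to pick the plausible candidate.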
| From | Dieter Maurer <dieter@handshake.de> |
|---|---|
| Date | 2012-06-28 19:18 +0200 |
| Message-ID | <mailman.1613.1340903908.4697.python-list@python.org> |
| In reply to | #24571 |
howmuchistoday@gmail.com writes:
> I'm Korean, and when I use modules like sys, os, etc.,
> the interpreter sometimes shows me broken strings like
> '\x13\xb3\x12\xc8'.
> It must be the Korean "alphabet", but I can't decode it the right way.
> I tried decoding it with codecs like cp949, mbcs, and utf-8,
> but it failed.
> The only way I found is eval('\x13\xb3\x12\xc8').
This looks as if "sys.stdout/sys.stderr" knew the correct encoding.
Check it like this:
import sys
sys.stdout.encoding
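
A slightly fuller version of that check, as a sketch assuming Python 3. The helper name `report_encodings` is hypothetical; the calls it makes are standard-library ones. It gathers the encodings the interpreter uses for console output, for file names, and as the locale default, since any of them may differ from the encoding of the bytes being decoded.

```python
import locale
import sys


def report_encodings():
    """Collect the encodings Python is currently using in this process."""
    return {
        # getattr guards against stdout/stderr being replaced by an object
        # that has no .encoding attribute.
        'stdout': getattr(sys.stdout, 'encoding', None),
        'stderr': getattr(sys.stderr, 'encoding', None),
        'filesystem': sys.getfilesystemencoding(),
        'locale_preferred': locale.getpreferredencoding(False),
    }


for name, encoding in report_encodings().items():
    print(name, encoding)
```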