Groups > comp.lang.python > #65206 > unrolled thread
| Started by | Terry Reedy <tjreedy@udel.edu> |
|---|---|
| First post | 2014-02-01 03:51 -0500 |
| Last post | 2014-02-01 03:51 -0500 |
| Articles | 1 — 1 participant |
This discussion starts older than the indexed window; earlier articles aren't shown. The article labeled Started by
below is the oldest one visible, not the original post.
Re: Python shell wont open idle or an exisiting py file Terry Reedy <tjreedy@udel.edu> - 2014-02-01 03:51 -0500
| From | Terry Reedy <tjreedy@udel.edu> |
|---|---|
| Date | 2014-02-01 03:51 -0500 |
| Subject | Re: Python shell wont open idle or an exisiting py file |
| Message-ID | <mailman.6272.1391244731.18130.python-list@python.org> |
On 2/1/2014 2:26 AM, Chris Angelico wrote:
> On Sat, Feb 1, 2014 at 4:46 PM, Terry Reedy <tjreedy@udel.edu> wrote:
>> On 1/31/2014 10:36 PM, Chris Angelico wrote:
>>>
>>> On Sat, Feb 1, 2014 at 1:54 PM, MRAB <python@mrabarnett.plus.com> wrote:
>>>>
>>>> I think that some years ago I heard about a variation on UTF-8
>>>> (Microsoft?) where codepoint U+0000 is encoded as 0xC0 0x80 so that the
>>>> null byte can be used as the string terminator.
>>>>
>>>> I had a look on Wikipedia found this:
>>>>
>>>> http://en.wikipedia.org/wiki/Null-terminated_string
>>>
>>>
>>> Yeah, it's a common abuse of UTF-8. It's a violation of spec, but an
>>> understandable one. However, I don't understand why the first part -
>>> why should \0 become U+0000 but (presumably) the \a later on
>>> (...cs\accel...) doesn't become U+0007, etc?
>>
>>
>> Because only \0 has a special meaning in a C string,

I should have added 'to C itself', as the string terminator.

>> and Tk is written in C and uses C strings.
>
> Eh? I've used \a in C programs (not often but I have used it).
>
> It's possible that \0 is the only one that actually bombs anything
> (because of C0 80 representation).

\0 can bomb C byte processing by terminating it sooner than it should.
Its unexpected replacement bombs utf-8 decoding.

> But since \7 and \a both represent
> 0x07 in a C string, I would expect there to be other problems, if it's
> interpreting it as source. Ah well! Weird weird.

While other control codes may have special meaning to a terminal or
other device, they do not have special meaning to the operation of C
string functions themselves (except possibly for a 'getline' function
looking for \n -- but I do not remember if the C stdlib has any such
functions).

I am speaking from my memory of C. I have not looked at the Tk C code to
see just what it did where to create the exception. I am just happy that
Serhiy was able to fix tkinter without causing another test to fail.

-- 
Terry Jan Reedy
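
For anyone who wants to see the two failure modes side by side, here is a minimal Python 3 sketch. It is not the tkinter fix and not Tk's code; the helper names `encode_modified_utf8` and `decode_modified_utf8` are made up for illustration, and the sketch only handles the U+0000 substitution (other differences that some "modified UTF-8" variants have are ignored).

```python
# Overlong two-byte form that "modified UTF-8" uses in place of a zero byte.
MODIFIED_NUL = b"\xc0\x80"


def encode_modified_utf8(text):
    """Hypothetical helper: encode as UTF-8, but write U+0000 as C0 80."""
    # In valid UTF-8 a 0x00 byte can only come from U+0000, so a plain
    # byte replacement is enough here.
    return text.encode("utf-8").replace(b"\x00", MODIFIED_NUL)


def decode_modified_utf8(data):
    """Hypothetical helper: undo the substitution, then decode normally."""
    # The byte 0xC0 never occurs in valid UTF-8, so this cannot hit real text.
    return data.replace(MODIFIED_NUL, b"\x00").decode("utf-8")


sample = "abc\0def\a"

# Plain UTF-8 keeps the zero byte, so anything that treats it as a
# terminator (mimicked here with split) stops after "abc".
plain = sample.encode("utf-8")
seen_by_c = plain.split(b"\x00", 1)[0]

# The modified form has no zero byte; \a (0x07) is left alone.
encoded = encode_modified_utf8(sample)

# A strict UTF-8 decoder refuses the C0 80 pair.
try:
    encoded.decode("utf-8")
    strict_ok = True
except UnicodeDecodeError:
    strict_ok = False

print(seen_by_c, encoded, strict_ok, decode_modified_utf8(encoded) == sample)
```

The point of the sketch is only that \0 is the one control code that gets special treatment in this scheme: \a passes through as an ordinary byte in both encodings.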