| Started by | Charles Hixson <charleshixsn@earthlink.net> |
|---|---|
| First post | 2011-12-26 14:23 -0800 |
| Last post | 2011-12-27 16:38 -0500 |
| Articles | 11 — 7 participants |
Possible bug in string handling (with kludgy work-around) Charles Hixson <charleshixsn@earthlink.net> - 2011-12-26 14:23 -0800
Re: Possible bug in string handling (with kludgy work-around) Rick Johnson <rantingrickjohnson@gmail.com> - 2011-12-26 14:48 -0800
Re: Possible bug in string handling (with kludgy work-around) Chris Angelico <rosuav@gmail.com> - 2011-12-27 10:05 +1100
Re: Possible bug in string handling (with kludgy work-around) Steven D'Aprano <steve+comp.lang.python@pearwood.info> - 2011-12-27 01:10 +0000
Re: Possible bug in string handling (with kludgy work-around) Dennis Lee Bieber <wlfraed@ix.netcom.com> - 2011-12-27 09:53 -0500
Re: Possible bug in string handling (with kludgy work-around) Rick Johnson <rantingrickjohnson@gmail.com> - 2011-12-27 10:04 -0800
Re: Possible bug in string handling (with kludgy work-around) Lie Ryan <lie.1296@gmail.com> - 2011-12-28 08:23 +1100
Re: Possible bug in string handling (with kludgy work-around) Terry Reedy <tjreedy@udel.edu> - 2011-12-27 16:38 -0500
Re: Possible bug in string handling (with kludgy work-around) Rick Johnson <rantingrickjohnson@gmail.com> - 2011-12-27 16:57 -0800
Re: Possible bug in string handling (with kludgy work-around) Lie Ryan <lie.1296@gmail.com> - 2011-12-29 04:54 +1100
| From | Charles Hixson <charleshixsn@earthlink.net> |
|---|---|
| Date | 2011-12-26 14:23 -0800 |
| Subject | Possible bug in string handling (with kludgy work-around) |
| Message-ID | <mailman.4112.1324938867.27778.python-list@python.org> |
This doesn't cause a crash, but rather incorrect results.
self.wordList = ["The", "quick", "brown", "fox", "carefully",
                 "jumps", "over", "the", "lazy", "dog", "as", "it",
                 "stealthily", "wends", "its", "way", "homewards", '\b.']
for i in range (len (self.wordList) ):
    if not isinstance(self.wordList[i], str):
        self.wordList = ""
    elif self.wordList[i] != "" and self.wordList[i][0] == "\b":
        print ("0: wordList[", i, "] = \"", self.wordList[i], "\"", sep = "")
        print ("0a: wordList[", i, "][1] = \"", self.wordList[i][1], "\"", sep = "")
        tmp = self.wordList[i][1]  ## !! Kludge -- remove tmp to see the error
        self.wordList[i] = tmp + self.wordList[i][1:-1]  ## !! Kludge -- remove tmp + to see the error
        print ("1: wordList[", i, "] = \"", self.wordList[i], "\"", sep = "")
        print ("len(wordList[", i, "]) = ", len(self.wordList[i]) )
--
Charles Hixson
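A minimal sketch of what the slice on the kludge line evaluates to, assuming Python 3 and a bare variable `word` standing in for `self.wordList[i]` (both names are illustrative, not from the post). It suggests the result is ordinary slicing behaviour rather than a string-handling bug: for a two-character string, `[1:-1]` selects nothing.

```python
# Hypothetical stand-in for self.wordList[i]: a backspace followed by a period.
word = "\b."

# The string has exactly two characters, so index 1 is also the last one.
print(len(word))          # 2
print(repr(word[1]))      # '.'

# [1:-1] starts after the backspace but stops BEFORE the last character,
# so nothing is left for a two-character string.
print(repr(word[1:-1]))   # ''

# [1:] drops only the leading backspace and keeps the rest.
print(repr(word[1:]))     # '.'

# The kludge re-adds the character that [1:-1] cut off.
kludge = word[1] + word[1:-1]
print(repr(kludge))       # '.'
```

If the intent was only to strip the leading backspace (an assumption, since the post does not say), `word[1:]` would do that without the temporary variable.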
| From | Rick Johnson <rantingrickjohnson@gmail.com> |
|---|---|
| Date | 2011-12-26 14:48 -0800 |
| Message-ID | <0063a1a0-0cbc-4352-90ce-13ab2eda6884@v24g2000yqk.googlegroups.com> |
| In reply to | #17986 |
On Dec 26, 4:23 pm, Charles Hixson <charleshi...@earthlink.net> wrote:
> This doesn't cause a crash, but rather incorrect results.
>
> self.wordList = ["The", "quick", "brown", "fox", "carefully",
> "jumps", "over", "the", "lazy", "dog", "as", "it",
> "stealthily", "wends", "its", "way", "homewards", '\b.']
> for i in range (len (self.wordList) ):
> if not isinstance(self.wordList[i], str):
> self.wordList = ""
> elif self.wordList[i] != "" and self.wordList[i][0] == "\b":
> print ("0: wordList[", i, "] = \"", self.wordList[i], "\"", sep
> = "")
> print ("0a: wordList[", i, "][1] = \"", self.wordList[i][1],
> "\"", sep = "")
> tmp = self.wordList[i][1] ## !! Kludge --
> remove tmp to see the error
> self.wordList[i] = tmp + self.wordList[i][1:-1] ## !!
> Kludge -- remove tmp + to see the error
> print ("1: wordList[", i, "] = \"", self.wordList[i], "\"", sep
> = "")
> print ("len(wordList[", i, "]) = ", len(self.wordList[i]) )
>
> --
> Charles Hixson
Handy rules for reporting bugs:
1. Always format code properly.
2. Always trim excess fat from code.
3. Always include relative dependencies ("self.wordlist" is only valid
inside a class. In this case, change the code to a state that is NOT
dependent on a class definition.)
Most times after following these simple rules, you'll find egg on your
face BEFORE someone else has a chance to see it and ridicule you.
| From | Chris Angelico <rosuav@gmail.com> |
|---|---|
| Date | 2011-12-27 10:05 +1100 |
| Message-ID | <mailman.4115.1324940726.27778.python-list@python.org> |
| In reply to | #17990 |
On Tue, Dec 27, 2011 at 9:48 AM, Rick Johnson
<rantingrickjohnson@gmail.com> wrote:
> Handy rules for reporting bugs:
>
> 1. Always format code properly.
> 2. Always trim excess fat from code.
> 3. Always include relative dependencies ("self.wordlist" is only valid
> inside a class. In this case, change the code to a state that is NOT
> dependent on a class definition.)
>
> Most times after following these simple rules, you'll find egg on your
> face BEFORE someone else has a chance to see it and ridicule you.
4. Don't take it personally when a known troll insults you. His
advice, in this case, is valid; but don't feel that you're going to be
ridiculed. We don't work that way on this list.
ChrisA
| From | Steven D'Aprano <steve+comp.lang.python@pearwood.info> |
|---|---|
| Date | 2011-12-27 01:10 +0000 |
| Message-ID | <4ef91afb$0$29973$c3e8da3$5496439d@news.astraweb.com> |
| In reply to | #17986 |
On Mon, 26 Dec 2011 14:23:03 -0800, Charles Hixson wrote:
> This doesn't cause a crash, but rather incorrect results.
Charles, your code is badly formatted and virtually unreadable. You have
four spaces between some tokens, and lines are too long to fit in an email or
News post without word-wrapping. It is a mess of unidiomatic code filled
with repeated indexing and unnecessary backslash escapes.
You also don't tell us what result you expect, or what result you
actually get. What is the intention of the code? What are you trying to
do, and what happens instead?
The code as given doesn't run -- what's self?
Despite all these problems, I can see one obvious problem in your code:
you test to see if self.wordList[i] is a string, and if not, you replace
the *entire* wordList with the empty string. That is unlikely to do what
you want, although I admit I'm guessing what you are trying to do (since
you don't tell us).
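A minimal sketch of that problem, assuming Python 3 and a plain list passed to a function instead of `self.wordList` (the function names are hypothetical). The first function rebinds the whole name to `""`, as the posted code does; the second replaces only the offending element.

```python
def blank_non_strings_buggy(words):
    """Mirror the posted logic: rebinding replaces the entire list."""
    for i in range(len(words)):
        if not isinstance(words[i], str):
            words = ""          # the whole list is gone, not just item i
    return words


def blank_non_strings(words):
    """Replace only the non-string items with an empty string."""
    return [w if isinstance(w, str) else "" for w in words]


print(blank_non_strings(["The", 42, "fox"]))        # ['The', '', 'fox']
print(repr(blank_non_strings_buggy(["The", 42])))   # ''

# With more items after the bad one, the next loop step indexes into the
# empty string and fails.
try:
    blank_non_strings_buggy(["The", 42, "fox"])
except IndexError as exc:
    print("IndexError:", exc)
```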
Some hints for you:
(1) Python has two string delimiters, " and ' and you should use them
both. Instead of hard-to-read backslash escapes, just swap delimiters:
print "A string including a \" quote mark." # No!
print 'A string including a " quote mark.' # Yes, much easier to read.
The only time you should backslash-escape a quotation mark is if you need
to include both sorts in a single string:
print "Python has both single ' and double \" quotation marks."
print 'Python has both single \' and double " quotation marks.'
(2) Python is not Pascal, or whatever language you seem to be writing in
the style of. You almost never should write for-loops like this:
    for i in range(len(something)):
        print something[i]
Instead, you should just iterate over "something" directly:
    for obj in something:
        print obj
If you also need the index, use the enumerate function:
    for i, obj in enumerate(something):
        print obj, i
If you are forced to use an ancient version of Python without enumerate,
do yourself a favour and write your loops like this:
    for i in range(len(something)):
        obj = something[i]
        print obj, i
instead of repeatedly indexing the list over and over and over and over
again, as you do in your own code. The use of a temporary variable makes
the code much easier to read and understand.
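The loops above use the Python 2 print statement, while the original post calls print() with sep=, which points to Python 3. A sketch of the same idioms under Python 3, with a made-up list standing in for "something":

```python
# Illustrative data; any sequence works the same way.
something = ["The", "quick", "brown"]

# Iterate over the items directly when the index is not needed.
for obj in something:
    print(obj)

# Use enumerate() when both the index and the item are needed.
for i, obj in enumerate(something):
    print(obj, i)

# enumerate() yields (index, item) pairs, which is handy for in-place updates.
pairs = list(enumerate(something))
upper = list(something)
for i, obj in enumerate(upper):
    upper[i] = obj.upper()
print(upper)
```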
--
Steven
| From | Dennis Lee Bieber <wlfraed@ix.netcom.com> |
|---|---|
| Date | 2011-12-27 09:53 -0500 |
| Message-ID | <mailman.4133.1324997611.27778.python-list@python.org> |
| In reply to | #18001 |
On 27 Dec 2011 01:10:19 GMT, Steven D'Aprano
<steve+comp.lang.python@pearwood.info> wrote:
>The only time you should backslash-escape a quotation mark is if you need
>to include both sorts in a single string:
>
>print "Python has both single ' and double \" quotation marks."
>print 'Python has both single \' and double " quotation marks.'
>
You can get by without the backslash in this situation too, by using
triple quoting:
print """Python has both single ' and double " quotation marks."""
(substitute ''' for """ if it looks better to you, as long as you use
the same marker at both ends. I find """ clearer, ''' could be a " and '
packed tightly in some fonts, "', whereas """ can only be one construct)
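A short sketch, assuming Python 3, showing that all four spellings produce the same string object value; the variable names are only for illustration.

```python
# Backslash-escaped forms from the quoted post.
escaped_double = "Python has both single ' and double \" quotation marks."
escaped_single = 'Python has both single \' and double " quotation marks.'

# Triple-quoted forms need no escapes for either quote character here.
triple_double = """Python has both single ' and double " quotation marks."""
triple_single = '''Python has both single ' and double " quotation marks.'''

# The delimiter style is purely a source-code choice; the values are equal.
print(escaped_double == escaped_single == triple_double == triple_single)
print(triple_double)
```

One caveat: a triple-quoted literal whose text ends with the same quote character as its delimiter still needs an escape (or the other delimiter) for that last character.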
--
Wulfraed Dennis Lee Bieber AF6VN
wlfraed@ix.netcom.com HTTP://wlfraed.home.netcom.com/
| From | Rick Johnson <rantingrickjohnson@gmail.com> |
|---|---|
| Date | 2011-12-27 10:04 -0800 |
| Message-ID | <cce5eb97-8e7f-48c6-8f0d-ec6442871196@k28g2000yqn.googlegroups.com> |
| In reply to | #18023 |
--
Note: superfluous indention removed for clarity!
--
On Dec 27, 8:53 am, Dennis Lee Bieber <wlfr...@ix.netcom.com> wrote:
> You can get by without the backslash in this situation too, by using
> triple quoting:
I would not do that because:
1. Because Python already has TWO string literal delimiters (' and ")
2. Because triple quote string literals are SPECIFICALLY created to
solve the "multi-line issue"
3. Because you can confuse the hell out of someone who is reading
Python code and they may miss the true purpose of triple quotes in
Python
But this brings up a very important topic. Why do we even need triple
quote string literals to span multiple lines? Good question, and one i
have never really mused on until now. It's amazing how much BS we just
accept blindly! WE DON'T NEED TRIPLE QUOTE STRINGS! What we need is
single quote strings that span multiple lines and triple quotes then
become superfluous! For the problem of embedding quotes in string
literals, we should be using markup. A SIMPLISTIC MARKUP!
" This is a multi line
string with a single quote --> <SQ>
and a double quote --> <DQ>. Here is an
embedded newline --> <NL>. And a backspace <BS>.
Now we can dispense with all the BS!
"
> I find """ clearer, ''' could be a " and '
> packed tightly in some fonts, "', whereas """ can only be one construct)
Another reason to ONLY use fixed width font when viewing code! Why
would you use ANY font that would obscure chars SO ubiquitous as " and
'?
| From | Lie Ryan <lie.1296@gmail.com> |
|---|---|
| Date | 2011-12-28 08:23 +1100 |
| Message-ID | <mailman.4152.1325021001.27778.python-list@python.org> |
| In reply to | #18038 |
On 12/28/2011 05:04 AM, Rick Johnson wrote:
> --
> Note: superfluous indention removed for clarity!
> --
>
> On Dec 27, 8:53 am, Dennis Lee Bieber<wlfr...@ix.netcom.com> wrote:
>> You can get by without the backslash in this situation too, by using
>> triple quoting:
>
> I would not do that because:
> 1. Because Python already has TWO string literal delimiters (' and ")
> 2. Because triple quote string literals are SPECIFICALLY created to
> solve the "multi-line issue"
> 3. Because you can confuse the hell out of someone who is reading
> Python code and they may miss the true purpose of triple quotes in
> Python
>
> But this brings up a very important topic. Why do we even need triple
> quote string literals to span multiple lines? Good question, and one i
> have never really mused on until now. It's amazing how much BS we just
> accept blindly! WE DON'T NEED TRIPLE QUOTE STRINGS! What we need is
> single quote strings that span multiple lines and triple quotes then
> become superfluous! For the problem of embedding quotes in string
> literals, we should be using markup. A SIMPLISTIC MARKUP!
>
> " This is a multi line
> string with a single quote --> <SQ>
> and a double quote --> <DQ>. Here is an
> embedded newline --> <NL>. And a backspace<BS>.
>
> Now we can dispense with all the BS!
> "
Ok, you're trolling.
| From | Terry Reedy <tjreedy@udel.edu> |
|---|---|
| Date | 2011-12-27 16:38 -0500 |
| Message-ID | <mailman.4155.1325021956.27778.python-list@python.org> |
| In reply to | #18038 |
On 12/27/2011 1:04 PM, Rick Johnson wrote:
> But this brings up a very important topic. Why do we even need triple
> quote string literals to span multiple lines? Good question, and one i
> have never really mused on until now.
I have, and the reason I thought of is that people, including me, too
often forget or accidentally fail to properly close a string literal,
and type something like
'this is a fairly long single line string"
and wonder why they get a syntax error lines later, or, in interactive
mode, why the interpreter does not respond to a newline.
Color coding editors make it easier to catch such errors, but they were
less common in 1991. And there is still uncolored interactive mode.
There may also be a technical reason as to how the lexer works.
--
Terry Jan Reedy
| From | Rick Johnson <rantingrickjohnson@gmail.com> |
|---|---|
| Date | 2011-12-27 16:57 -0800 |
| Message-ID | <b67bf3de-05ad-4aec-9e0c-63ddc398f2a1@p13g2000yqd.googlegroups.com> |
| In reply to | #18059 |
On Dec 27, 3:38 pm, Terry Reedy <tjre...@udel.edu> wrote:
> On 12/27/2011 1:04 PM, Rick Johnson wrote:
>
> > But this brings up a very important topic. Why do we even need triple
> > quote string literals to span multiple lines? Good question, and one i
> > have never really mused on until now.
>
> I have, and the reason I thought of is that people, including me, too
> often forget or accidentally fail to properly close a string literal,
Yes, agreed.
> Color coding editors make it easier to catch such errors, but they were
> less common in 1991.
I would say the need for triple quote strings has passed long ago.
Like you say, since color lexers are ubiquitous now we don't need
them.
> And there is still uncolored interactive mode.
I don't see interactive command line programming as a problem. I mean,
who drops into a cmd line and starts writing paragraphs of string
literals? Typically, one would just make a few one-liner calls here or
there. Also, un-terminated string literal errors can be very
aggravating. Not because they are difficult to fix, no, but because
they are difficult to find! -- and sending me an error message
like...
"Exception: Un-terminated string literal meets EOF! line: 50,466,638"
... is about as helpful as a bullet in my head!
If the interpreter finds itself at EOF BEFORE a string closes, don't
you think it would be more helpful to include the currently "opened"
string's START POSITION also? Heck, it would be wonderful to only have
the start position since the likelihood of a string ending at EOF is
astronomically small! As an intelligent lad must know, the distance
from any given string's start position to its end position is far more
likely to be shorter than the distance from the string's beginning to
the freaking EOF!
Ruby and Python are both guilty of this atrocity.
| From | Lie Ryan <lie.1296@gmail.com> |
|---|---|
| Date | 2011-12-29 04:54 +1100 |
| Message-ID | <mailman.4187.1325094872.27778.python-list@python.org> |
| In reply to | #18079 |
On 12/28/2011 11:57 AM, Rick Johnson wrote:
> On Dec 27, 3:38 pm, Terry Reedy<tjre...@udel.edu> wrote:
>> On 12/27/2011 1:04 PM, Rick Johnson wrote:
>>
>>> But this brings up a very important topic. Why do we even need triple
>>> quote string literals to span multiple lines? Good question, and one i
>>> have never really mused on until now.
>>
>> I have, and the reason I thought of is that people, including me, too
>> often forget or accidentally fail to properly close a string literal,
>
> Yes, agreed.
>
>> Color coding editors make it easier to catch such errors, but they were
>> less common in 1991.
>
> I would say the need for triple quote strings has passed long ago.
> Like you say, since color lexers are ubiquitous now we don't need
> them.
>
>> And there is still uncolored interactive mode.
>
> I don't see interactive command line programming as a problem. I mean,
> who drops into a cmd line and starts writing paragraphs of string
> literals? Typically, one would just make a few one-liner calls here or
> there. Also, un-terminated string literal errors can be very
> aggravating. Not because they are difficult to fix, no, but because
> they are difficult to find! -- and sending me an error message
> like...
>
> "Exception: Un-terminated string literal meets EOF! line: 50,466,638"
>
> ... is about as helpful as a bullet in my head!
>
> If the interpreter finds itself at EOF BEFORE a string closes, don't
> you think it would be more helpful to include the currently "opened"
> strings START POSITION also?
No, it wouldn't. Once you get an unterminated string literal, the string
would terminate at the next string opening. Then it would fuck the
parser, since it would try to parse what was supposed to be a string
literal as code. For example:
hello = 'bar'
s = "boo, I missed a quote here
print 'hello = ', hello, "; s = ", s
the parser would misleadingly show that you have an unclosed string
literal here:
                                vvv
print 'hello = ', hello, "; s = ", s
                                ^^^
instead of on line 2. While an experienced programmer should be able to
figure out what's wrong, I can see a beginner programmer trying to "fix"
the problem like this:
print 'hello = ', hello, "; s = ", s"
and then complaining that print doesn't print.
Limiting string literals to one line limits the possibility of damage to
a single line. You will still have the same problem if you forget to
close a triple-quoted string, but since triple-quoted strings are much
rarer and pretty eye-catching, this sort of error is much harder to
make.
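A rough sketch of the single-line limit in practice, assuming Python 3 and using the built-in compile() on a source string (the helper name and the "<sketch>" filename are made up for illustration). Because an ordinary quoted string cannot cross a newline, the error is reported on the line where the quote was left open, not further down.

```python
def first_error_line(source):
    """Return the line number reported for a SyntaxError, or None if it compiles."""
    try:
        compile(source, "<sketch>", "exec")
    except SyntaxError as exc:
        return exc.lineno
    return None


# Same shape as the example above, rewritten with the Python 3 print function.
source = (
    "hello = 'bar'\n"
    's = "boo, I missed a quote here\n'
    "print('hello = ', hello, '; s = ', s)\n"
)

# The missing quote is on line 2, and that is the line that gets blamed.
print(first_error_line(source))
```

The exact wording of the error message differs between Python versions, so the sketch looks only at the line number.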