Is It Bug?

Started by	Mahan Marwat <mahanmarwat@gmail.com>
First post	2013-12-07 16:59 -0800
Last post	2013-12-08 11:43 +0100
Articles	10 — 8 participants

  Is It Bug? Mahan Marwat <mahanmarwat@gmail.com> - 2013-12-07 16:59 -0800
    Re: Is It Bug? Chris Angelico <rosuav@gmail.com> - 2013-12-08 12:05 +1100
      Re: Is It Bug? Roy Smith <roy@panix.com> - 2013-12-07 20:22 -0500
        Re: Is It Bug? Chris “Kwpolska” Warrick <kwpolska@gmail.com> - 2013-12-08 11:01 +0100
        Re: Is It Bug? Chris Angelico <rosuav@gmail.com> - 2013-12-08 21:04 +1100
        Re: Is It Bug? Chris Angelico <rosuav@gmail.com> - 2013-12-08 21:26 +1100
    Re: Is It Bug? Mark Lawrence <breamoreboy@yahoo.co.uk> - 2013-12-08 01:06 +0000
    Re: Is It Bug? MRAB <python@mrabarnett.plus.com> - 2013-12-08 01:12 +0000
    Re: Is It Bug? Tim Roberts <timr@probo.com> - 2013-12-07 22:53 -0800
    Re: Is It Bug? Peter Otten <__peter__@web.de> - 2013-12-08 11:43 +0100

#61261 — Is It Bug?

From	Mahan Marwat <mahanmarwat@gmail.com>
Date	2013-12-07 16:59 -0800
Subject	Is It Bug?
Message-ID	<27c0b454-62e9-410f-b05c-7c5fe306f8aa@googlegroups.com>

Why is this not working?

>>> 'Hello, \\\\World'.replace('\\', '\\')

To me, Python will interpret '\\\\' as '\\', and the replace method will replace '\\' with '\'. So the result should be 'Hello, \World'. But it gives me 'Hello, \\\\World'.

The result I want from the code is 'Hello, \World'.

#61264

From	Chris Angelico <rosuav@gmail.com>
Date	2013-12-08 12:05 +1100
Message-ID	<mailman.3713.1386464744.18130.python-list@python.org>
In reply to	#61261

On Sun, Dec 8, 2013 at 11:59 AM, Mahan Marwat <mahanmarwat@gmail.com> wrote:
> Why this is not working.
>
>>>> 'Hello, \\\\World'.replace('\\', '\\')
>
> To me, Python will interpret '\\\\' to '\\'. And the replace method will replace '\\' with '\'. So, the result will be 'Hello, \World'. But it's give me 'Hello, \\\\World'.
>
> The result I want form the code is 'Hello, \World'.

You're replacing with the same as the source string. That's not going
to change anything.

The first thing to get your head around is Python string literals.
You'll find them well described in the online tutorial, or poke around
in the interactive interpreter. Once you master that, you should be
able to understand what you're trying to do here.

ChrisA

#61267

From	Roy Smith <roy@panix.com>
Date	2013-12-07 20:22 -0500
Message-ID	<roy-756493.20224107122013@news.panix.com>
In reply to	#61264

In article <mailman.3713.1386464744.18130.python-list@python.org>,
 Chris Angelico <rosuav@gmail.com> wrote:

> The first thing to get your head around is Python string literals.
> You'll find them well described in the online tutorial, or poke around
> in the interactive interpreter. 

A couple of ideas to explore along those lines:

1) Read up on raw strings, i.e.

r'Hello, \World'

instead of 

'Hello, \\World'

There's nothing you can do with raw strings that you can't do with 
regular strings, but they're easier to read when you start to use 
backslashes.

2) When in doubt about what I'm looking at in a string, I turn it into a 
list.  So, if I do:

>>> s = 'Hello, \\World'
>>> print s
Hello, \World

What is that character after the space?  Is it a backslash, or is it 
something that Python is printing as \W?  Not sure?  Just do:

>>> print list(s)
['H', 'e', 'l', 'l', 'o', ',', ' ', '\\', 'W', 'o', 'r', 'l', 'd']

and it's immediately obvious which it is.
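
For readers on Python 3, where print is a function, the same check might look like the sketch below. The variable names are illustrative only and do not come from the post above.

```python
# Python 3 sketch of the list trick; `s` and `chars` are illustrative names.
s = 'Hello, \\World'   # the literal spells one backslash as two characters
chars = list(s)        # one list element per actual character in the string

print(s)               # shows the contents of the string
print(chars)           # shows the repr of each individual character

# Counting the elements that are a backslash settles the question.
backslash_count = chars.count('\\')
print(backslash_count)
```

The list view works because each element is displayed with its own repr, so a single backslash character always appears as '\\' in its own slot.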

#61281

From	Chris “Kwpolska” Warrick <kwpolska@gmail.com>
Date	2013-12-08 11:01 +0100
Message-ID	<mailman.3720.1386496916.18130.python-list@python.org>
In reply to	#61267

On Sun, Dec 8, 2013 at 2:22 AM, Roy Smith <roy@panix.com> wrote:
> There's nothing you can do with raw strings that you can't do with
> regular strings, but they're easier to read when you start to use
> backslashes.

Unfortunately, there is one.  A raw string cannot end with a backslash.

>>> r'a\a'
'a\\a'
>>> r'a\'
  File "<stdin>", line 1
    r'a\'
        ^
SyntaxError: EOL while scanning string literal
>>> r'\'
  File "<stdin>", line 1
    r'\'
       ^
SyntaxError: EOL while scanning string literal
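
A few common ways around that limitation, sketched here for illustration (the variable names are made up for this example), all produce the two-character string consisting of 'a' followed by one backslash:

```python
# Three spellings of the same string: 'a' followed by a single backslash.
regular = 'a\\'              # regular literal with the backslash escaped
concatenated = r'a' '\\'     # raw literal next to an adjacent regular literal
sliced = r'a\ '[:-1]         # raw literal with a padding space sliced off

print(regular == concatenated == sliced)
print(len(regular))

# An even number of trailing backslashes is accepted in a raw literal,
# but every backslash typed is kept in the resulting string.
doubled = r'a\\'
print(len(doubled))
```

So the restriction is really on an odd number of trailing backslashes: the last one would pair up with the closing quote and leave the literal unterminated.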

-- 
Chris “Kwpolska” Warrick <http://kwpolska.tk>
PGP: 5EAAEA16
stop html mail | always bottom-post | only UTF-8 makes sense

#61282

From	Chris Angelico <rosuav@gmail.com>
Date	2013-12-08 21:04 +1100
Message-ID	<mailman.3721.1386497055.18130.python-list@python.org>
In reply to	#61267

On Sun, Dec 8, 2013 at 9:01 PM, Chris “Kwpolska” Warrick
<kwpolska@gmail.com> wrote:
> On Sun, Dec 8, 2013 at 2:22 AM, Roy Smith <roy@panix.com> wrote:
>> There's nothing you can do with raw strings that you can't do with
>> regular strings, but they're easier to read when you start to use
>> backslashes.
>
> Unfortunately, there is one.  A raw string cannot end with a backslash.

That's the other way around. There's something you can't do with a raw
string that you can do with a regular. But there's nothing you can do
with a raw that you can't do with a regular, as can be easily proven
by looking at the repr handling - nothing will ever have a repr that's
a raw string.
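
A small sketch of that repr argument, assuming Python 3 and using an arbitrary example path that is not from the thread: the repr of a string is always a regular literal, and evaluating that literal gives the original string back.

```python
import ast

raw = r'C:\new\table'         # raw literal: backslashes are taken literally
regular = 'C:\\new\\table'    # regular literal spelling the same characters
print(raw == regular)

# repr() produces a regular (non-raw) literal for the string...
shown = repr(raw)
print(shown)

# ...and evaluating that literal reproduces the original string exactly.
round_tripped = ast.literal_eval(shown)
print(round_tripped == raw)
```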

ChrisA

#61283

From	Chris Angelico <rosuav@gmail.com>
Date	2013-12-08 21:26 +1100
Message-ID	<mailman.3722.1386498404.18130.python-list@python.org>
In reply to	#61267

On Sun, Dec 8, 2013 at 9:01 PM, Chris “Kwpolska” Warrick
<kwpolska@gmail.com> wrote:
> A raw string cannot end with a backslash.
>
>>>> r'a\a'
> 'a\\a'
>>>> r'a\'
>   File "<stdin>", line 1
>     r'a\'
>         ^
> SyntaxError: EOL while scanning string literal

Incidentally, the solution to this would be to not use the backslash
to escape the quote. That's what introduces the ambiguity. Instead, a
raw literal could do as REXX does and double the quote to escape it.
(Any whitespace and it's still concatenation as normal. I'm not
advocating REXX's handling there.)

>>> r"asdf""qwer"
'asdfqwer'

If we had a new "pure string" that worked thus:
>>> p"asdf""qwer"
'asdf"qwer'

>>> p"\b""\d+""\b"
'\\b"\\d+"\\b'

which would be a regex matching quoted strings of digits. The only
potential ambiguity would be in that closing the quote and opening
another would normally revert to a regular string literal, where by
this model it's still a pure string. Editor lexers would have to
understand that.

ChrisA

#61265

From	Mark Lawrence <breamoreboy@yahoo.co.uk>
Date	2013-12-08 01:06 +0000
Message-ID	<mailman.3714.1386464822.18130.python-list@python.org>
In reply to	#61261

On 08/12/2013 00:59, Mahan Marwat wrote:
> Why this is not working.
>
>>>> 'Hello, \\\\World'.replace('\\', '\\')

Whoops a daisy!!! --------------^^^^--^^^^ ???

>
> To me, Python will interpret '\\\\' to '\\'. And the replace method will replace '\\' with '\'. So, the result will be 'Hello, \World'. But it's give me 'Hello, \\\\World'.
>
> The result I want form the code is 'Hello, \World'.
>


-- 
My fellow Pythonistas, ask not what our language can do for you, ask 
what you can do for our language.

Mark Lawrence

#61266

From	MRAB <python@mrabarnett.plus.com>
Date	2013-12-08 01:12 +0000
Message-ID	<mailman.3715.1386465166.18130.python-list@python.org>
In reply to	#61261

On 08/12/2013 00:59, Mahan Marwat wrote:
> Why this is not working.
>
>>>> 'Hello, \\\\World'.replace('\\', '\\')
>
> To me, Python will interpret '\\\\' to '\\'. And the replace method
> will replace '\\' with '\'. So, the result will be 'Hello, \World'.
> But it's give me 'Hello, \\\\World'.
>
> The result I want form the code is 'Hello, \World'.
>
The original string contains 2 actual backslashes:

>>> print('Hello, \\\\World')
Hello, \\World

Both the search and replacement strings contain 1 backslash:

>>> print('\\')
\

You're asking it to replace every backslash with a backslash!

If you want to replace 2 consecutive backslashes with a single
backslash:

>>> 'Hello, \\\\World'.replace('\\\\', '\\')
'Hello, \\World'

Maybe it's clearer if you print it:

>>> print('Hello, \\\\World'.replace('\\\\', '\\'))
Hello, \World

#61274

From	Tim Roberts <timr@probo.com>
Date	2013-12-07 22:53 -0800
Message-ID	<6k58a9do25ftf4b9h33ff96gs046eruej8@4ax.com>
In reply to	#61261

Mahan Marwat <mahanmarwat@gmail.com> wrote:
>
>Why this is not working.
>
>>>> 'Hello, \\\\World'.replace('\\', '\\')
>
>To me, Python will interpret '\\\\' to '\\'. 

It's really important that you think about the difference between the way
string literals are written in Python code, and the way the strings
actually look in memory.

The Python literal 'Hello, \\\\World' contains exactly 2 backslashes.  We
have to spell it with 4 backslashes to get that result, but in memory there
are only two.

Similarly, the Python literal '\\' contains exactly one character.

So, if your goal is to change 2 backslashes to 1, you would need
    'Hello, \\\\World'.replace('\\\\','\\')
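
One way to check those counts without relying on how the interpreter displays the string is to ask for lengths and occurrence counts directly. This is a minimal sketch with illustrative variable names:

```python
original = 'Hello, \\\\World'   # four backslashes typed, two stored in memory
single = '\\'                   # two characters typed, one stored

print(len(original), original.count(single))
print(len(single))

# Replace two stored backslashes with one stored backslash.
fixed = original.replace('\\\\', '\\')
print(fixed.count(single))

# The raw literal spells the wanted result exactly as it looks when printed.
print(fixed == r'Hello, \World')
```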

However, REMEMBER that if you just have the command-line interpreter echo
the result of that, it's going to show you the string representation, in
which each backslash is shown as TWO characters.  Observe:

    Python 2.7.1 (r271:86832, Nov 27 2010, 18:30:46) [MSC v.1500 32 bit (Intel)] on win32
    Type "help", "copyright", "credits" or "license" for more information.
    >>> s = 'Hello, \\\\World'
    >>> s
    'Hello, \\\\World'
    >>> print s
    Hello, \\World
    >>> s = s.replace('\\\\','\\')
    >>> s
    'Hello, \\World'
    >>> print s
    Hello, \World
    >>>
-- 
Tim Roberts, timr@probo.com
Providenza & Boekelheide, Inc.

#61285

From	Peter Otten <__peter__@web.de>
Date	2013-12-08 11:43 +0100
Message-ID	<mailman.3723.1386499439.18130.python-list@python.org>
In reply to	#61261

Mahan Marwat wrote:

> Why this is not working.
> 
>>>> 'Hello, \\\\World'.replace('\\', '\\')
> 
> To me, Python will interpret '\\\\' to '\\'. And the replace method will
> replace '\\' with '\'. So, the result will be 'Hello, \World'. But it's
> give me 'Hello, \\\\World'.
> 
> The result I want form the code is 'Hello, \World'.

Let's forget about backslashes for the moment and use 'a' instead. We can 
replace an 'a' with an 'a'

>>> "Hello, aaWorld".replace("a", "a")
'Hello, aaWorld'

That changes nothing. Or we can replace two 'a's with one 'a'

>>> "Hello, aaWorld".replace("aa", "a")
'Hello, aWorld'

This does the obvious thing. Finally we can replace an 'a' with the empty 
string '':

>>> "Hello, aaWorld".replace("a", "")
'Hello, World'

This effectively removes all 'a's. 

Now let's replace the "a" with a backslash. Because the backslash has a 
special meaning, it has to be "escaped", i.e. preceded by another backslash. 
The examples then become

>>> "Hello, \\\\World".replace("\\", "\\")
'Hello, \\\\World'
>>> "Hello, \\\\World".replace("\\\\", "\\")
'Hello, \\World'
>>> "Hello, \\\\World".replace("\\", "")
'Hello, World'

While the doubling of backslashes in the source is required by Python, the 
doubling of backslashes in the output occurs because the interactive 
interpreter applies repr() to the string before it is shown. You can avoid 
that with an explicit print statement in Python 2 or a print() function call 
in Python 3:

>>> print "Hello, \\\\World".replace("\\", "\\")
Hello, \\World
>>> print "Hello, \\\\World".replace("\\\\", "\\")
Hello, \World
>>> print "Hello, \\\\World".replace("\\", "")
Hello, World
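
The transcript above uses the Python 2 print statement. As a rough Python 3 sketch (the names `echoed` and `printed` are illustrative), the two views of the same string can be compared side by side:

```python
s = "Hello, \\\\World".replace("\\\\", "\\")

echoed = repr(s)    # what the interactive prompt displays when you enter `s`
printed = str(s)    # what print(s) writes

print(echoed)
print(printed)

# The repr is longer only because of the two quotes and the extra
# backslash used to escape the one real backslash in the string.
print(len(echoed) - len(printed))
```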
