Groups > comp.lang.python > #92048 > unrolled thread
| Started by | Palpandi <palpandi111@gmail.com> |
|---|---|
| First post | 2015-06-04 06:36 -0700 |
| Last post | 2015-06-04 16:21 +0200 |
| Articles | 6 — 6 participants |
Regular Expression Palpandi <palpandi111@gmail.com> - 2015-06-04 06:36 -0700
Re: Regular Expression Larry Martell <larry.martell@gmail.com> - 2015-06-04 09:43 -0400
Re: Regular Expression Steven D'Aprano <steve@pearwood.info> - 2015-06-04 23:54 +1000
Re: Regular Expression Tim Chase <python.list@tim.thechases.com> - 2015-06-04 08:48 -0500
Re: Regular Expression Peter Otten <__peter__@web.de> - 2015-06-04 16:00 +0200
Re: Regular Expression Laura Creighton <lac@openend.se> - 2015-06-04 16:21 +0200
| From | Palpandi <palpandi111@gmail.com> |
|---|---|
| Date | 2015-06-04 06:36 -0700 |
| Subject | Regular Expression |
| Message-ID | <c85bc324-37fe-469e-b3b1-b1d4e51bf7d8@googlegroups.com> |
Hi All,
This is the case. To split "string2" from "string1_string2" I am using
re.split('_', "string1_string2", 1)[1].
It works fine for the string "string1_string2" and gives "string2". The problem is that if the string is "__string1_string2", the output is "_string1_string2", which is wrong.
How to fix this issue?
| From | Larry Martell <larry.martell@gmail.com> |
|---|---|
| Date | 2015-06-04 09:43 -0400 |
| Message-ID | <mailman.160.1433425441.13271.python-list@python.org> |
| In reply to | #92048 |
On Thu, Jun 4, 2015 at 9:36 AM, Palpandi <palpandi111@gmail.com> wrote:
>
> Hi All,
>
> This is the case. To split "string2" from "string1_string2" I am using
> re.split('_', "string1_string2", 1)[1].
>
> It is working fine for string "string1_string2" and output as "string2". But actually the problem is that if a sting is "__string1_string2" and the output is "_string1_string2". It is wrong.
>
> How to fix this issue?
"__string1_string2".split('_')[-1]
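A quick check of the split('_')[-1] one-liner against both inputs from the question, as a minimal sketch. The helper name last_field is made up for illustration and is not part of the original reply. Note that a trailing underscore would make the last piece an empty string.

```python
def last_field(s, sep="_"):
    # Hypothetical helper: split on every separator and keep the final piece.
    return s.split(sep)[-1]

print(last_field("string1_string2"))         # string2
print(last_field("__string1_string2"))       # string2
print(repr(last_field("string1_string2_")))  # '' (trailing separator)
```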
| From | Steven D'Aprano <steve@pearwood.info> |
|---|---|
| Date | 2015-06-04 23:54 +1000 |
| Message-ID | <55705890$0$13004$c3e8da3$5496439d@news.astraweb.com> |
| In reply to | #92048 |
On Thu, 4 Jun 2015 11:36 pm, Palpandi wrote:
> Hi All,
>
> This is the case. To split "string2" from "string1_string2" I am using
> re.split('_', "string1_string2", 1)[1].
There is absolutely no need to use the nuclear-powered bulldozer of regular
expressions to crack that tiny peanut. Strings have a perfectly useful
split method:
py> "string1_string2".split("_")
['string1', 'string2']
> It is working fine for string "string1_string2" and output as "string2".
> But actually the problem is that if a sting is "__string1_string2" and the
> output is "_string1_string2". It is wrong.
No, the output is correct. You tell Python to split on the *first*
underscore only, which is exactly what Python does:
py> re.split('_', "__string1_string2", 1)
['', '_string1_string2']
> How to fix this issue?
Again, this is a small problem, and regular expressions are not needed. Just
strip the underscores off the left, then split:
py> s = "__string1_string2"
py> s.lstrip("_").split("_")
['string1', 'string2']
--
Steven
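Stripping on the left only handles leading underscores; trailing or doubled underscores would still leave empty strings in the result. If that matters, one option is to filter the empty pieces out, sketched here with a hypothetical helper named fields (an assumption for illustration, not something from the post above):

```python
def fields(s, sep="_"):
    # Hypothetical helper: split on every separator, then drop the empty
    # strings produced by leading, trailing, or repeated separators.
    return [part for part in s.split(sep) if part]

print(fields("__string1_string2"))      # ['string1', 'string2']
print(fields("__string1__string2__"))   # ['string1', 'string2']
```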
| From | Tim Chase <python.list@tim.thechases.com> |
|---|---|
| Date | 2015-06-04 08:48 -0500 |
| Message-ID | <mailman.161.1433426138.13271.python-list@python.org> |
| In reply to | #92048 |
On 2015-06-04 06:36, Palpandi wrote:
> This is the case. To split "string2" from "string1_string2" I am
> using re.split('_', "string1_string2", 1)[1].
>
> It is working fine for string "string1_string2" and output as
> "string2". But actually the problem is that if a sting is
> "__string1_string2" and the output is "_string1_string2". It is
> wrong.
Why use regular expressions to split a string on a constant?
Try
for input in [
        "string1_string2",
        "__string1_string2",
        ]:
    value = input.rsplit('_', 1)[-1]
    assert value == "string2"
-tkc
| From | Peter Otten <__peter__@web.de> |
|---|---|
| Date | 2015-06-04 16:00 +0200 |
| Message-ID | <mailman.162.1433426457.13271.python-list@python.org> |
| In reply to | #92048 |
Palpandi wrote:
> This is the case. To split "string2" from "string1_string2" I am using
> re.split('_', "string1_string2", 1)[1].
>
> It is working fine for string "string1_string2" and output as "string2".
> But actually the problem is that if a sting is "__string1_string2" and the
> output is "_string1_string2". It is wrong.
>
> How to fix this issue?
Use str.rpartition():
>>> "one_two__three".rpartition("_")[-1]
'three'
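For context, str.rpartition() always returns a 3-tuple (head, separator, tail) split at the last occurrence of the separator, which is why [-1] picks the final field. A small sketch of what the other two elements look like, and of the case where no separator is present (the second input string is made up for illustration):

```python
s = "one_two__three"

# Split once, at the last underscore.
head, sep, tail = s.rpartition("_")
print((head, sep, tail))   # ('one_two_', '_', 'three')

# With no separator in the string, the whole string ends up in the tail.
print("nounderscore".rpartition("_"))   # ('', '', 'nounderscore')
```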
| From | Laura Creighton <lac@openend.se> |
|---|---|
| Date | 2015-06-04 16:21 +0200 |
| Message-ID | <mailman.166.1433427684.13271.python-list@python.org> |
| In reply to | #92048 |
In a message of Thu, 04 Jun 2015 06:36:29 -0700, Palpandi writes:
>Hi All,
>
>This is the case. To split "string2" from "string1_string2" I am using
>re.split('_', "string1_string2", 1)
And you shouldn't be. The 3rd argument, 1, says to stop after one split.
>It is working fine for string "string1_string2" and output as "string2". But actually the problem is that if a sting is "__string1_string2" and the output is "_string1_string2". It is wrong.
>
>How to fix this issue?
Depends on what you want.
Approach #1 - just use the string method, forget re, because you do not
need it.
>>> "__string1_string2".split("_")
['', '', 'string1', 'string2']
>>> "_string1_string2__".split("_")
['', 'string1', 'string2', '', '']
Approach #2 -- use re but with a fixed string (probably a bad idea,
you should be using approach 1 instead if you have a fixed string)
>>> re.split('_', "__string1_string2")
['', '', 'string1', 'string2']
>>> re.split('_', "__string1_string2__")
['', '', 'string1', 'string2', '', '']
Approach #3 - there is a real pattern here I want to use, the example
I posted to the list is a lot simpler than what I really want to do.
Ok, in this case we will match 'any number of underscores' for an
example.
>>> p = re.compile('_*')
>>> p.split("__string1_string2")
['', 'string1', 'string2']
>>> p.split("__string1__string2__")
['', 'string1', 'string2', '']
Laura
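One caveat for readers on newer interpreters, offered as a hedged note rather than as part of the reply above: since Python 3.7, re.split() also splits on empty matches, so a pattern that can match the empty string, such as '_*', splits between every character as well. Requiring at least one underscore with '_+' avoids that. A minimal sketch:

```python
import re

# '_+' needs at least one underscore, so it never matches the empty string.
p = re.compile("_+")

print(p.split("__string1_string2"))      # ['', 'string1', 'string2']
print(p.split("__string1__string2__"))   # ['', 'string1', 'string2', '']
```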