Groups > comp.lang.python > #46916 > unrolled thread

lstrip problem - beginner question

Started by: mstagliamonte <madmaxthc@yahoo.it>
First post: 2013-06-04 08:21 -0700
Last post: 2013-06-04 21:53 -0700
Articles: 15 (8 participants)


Contents

  lstrip problem - beginner question mstagliamonte <madmaxthc@yahoo.it> - 2013-06-04 08:21 -0700
    Re: lstrip problem - beginner question mstagliamonte <madmaxthc@yahoo.it> - 2013-06-04 08:24 -0700
    Re: lstrip problem - beginner question mstagliamonte <madmaxthc@yahoo.it> - 2013-06-04 08:25 -0700
      Re: lstrip problem - beginner question Fábio Santos <fabiosantosart@gmail.com> - 2013-06-04 16:41 +0100
        Re: lstrip problem - beginner question mstagliamonte <madmaxthc@yahoo.it> - 2013-06-04 08:49 -0700
          Re: lstrip problem - beginner question Mark Lawrence <breamoreboy@yahoo.co.uk> - 2013-06-04 17:01 +0100
          Re: lstrip problem - beginner question Dave Angel <davea@davea.name> - 2013-06-04 17:48 -0400
    Re: lstrip problem - beginner question mstagliamonte <madmaxthc@yahoo.it> - 2013-06-04 08:28 -0700
    Re: lstrip problem - beginner question mstagliamonte <madmaxthc@yahoo.it> - 2013-06-04 08:29 -0700
    Re: lstrip problem - beginner question MRAB <python@mrabarnett.plus.com> - 2013-06-04 16:48 +0100
      Re: lstrip problem - beginner question mstagliamonte <madmaxthc@yahoo.it> - 2013-06-04 08:53 -0700
    Re: lstrip problem - beginner question Peter Otten <__peter__@web.de> - 2013-06-04 17:52 +0200
    Re: lstrip problem - beginner question John Gordon <gordon@panix.com> - 2013-06-04 15:55 +0000
      Re: lstrip problem - beginner question mstagliamonte <madmaxthc@yahoo.it> - 2013-06-04 09:06 -0700
    Re: lstrip problem - beginner question Larry Hudson <orgnut@yahoo.com> - 2013-06-04 21:53 -0700

#46916 — lstrip problem - beginner question

From: mstagliamonte <madmaxthc@yahoo.it>
Date: 2013-06-04 08:21 -0700
Subject: lstrip problem - beginner question
Message-ID: <1829efca-935d-4049-ba61-7138015a2806@googlegroups.com>
Hi everyone,

I am a beginner in python and trying to find my way through... :)

I am writing a script to get numbers from the headers of a text file.

If the header is something like:
h01 = ('>scaffold_1')
I just use:
h01.lstrip('>scaffold_')
and this returns '1'.

But if the header is:
h02: ('>contig-100_0')
and I use:
h02.lstrip('>contig-100_')
this returns: ''
...basically nothing. What surprises me is that if I do it this other way:
h02b = h02.lstrip('>contig-100')
I get h02b = ('_1')
and subsequently:
h02b.lstrip('_')
returns '1', which is what I wanted!

Why is this happening? What am I missing?

Thanks for your help and attention
Max



#46917

From: mstagliamonte <madmaxthc@yahoo.it>
Date: 2013-06-04 08:24 -0700
Message-ID: <2bb8cd50-4e6e-4929-96c5-9d825ad592a2@googlegroups.com>
In reply to: #46916
On Tuesday, June 4, 2013 11:21:53 AM UTC-4, mstagliamonte wrote:
> h02: ('>contig-100_1')


#46918

From: mstagliamonte <madmaxthc@yahoo.it>
Date: 2013-06-04 08:25 -0700
Message-ID: <1fb7c07f-d974-43d0-815b-2739f7a4b965@googlegroups.com>
In reply to: #46916
edit: h02: ('>contig-100_1')



#46922

From: Fábio Santos <fabiosantosart@gmail.com>
Date: 2013-06-04 16:41 +0100
Message-ID: <mailman.2657.1370360513.3114.python-list@python.org>
In reply to: #46918


You don't have to use ('..') to declare a string. Just 'your string' will
do.

You can use str.split to split your string by a character.

(Not tested)

string_on_left, numbers = '>contig-100_01'.split('-')
left_number, right_number = numbers.split('_')
left_number, right_number = int(left_number), int(right_number)

Of course, you will want to replace the variable names.

If you have more advanced parsing needs, you will want to look at regular
expressions or blobs.
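
As a rough illustration of the regular-expression route mentioned above, here is a minimal sketch. The pattern and the helper name trailing_number are assumptions made for this example, not something from the original post; it assumes the number of interest always follows the last underscore.

```python
import re

def trailing_number(header):
    """Return the integer after the final underscore, or None if there is none."""
    match = re.search(r'_(\d+)$', header)
    return int(match.group(1)) if match else None

print(trailing_number('>scaffold_1'))      # 1
print(trailing_number('>contig-100_1'))    # 1
print(trailing_number('>no_number_here'))  # None
```

Anchoring the pattern with $ means digits earlier in the header, such as the 100 in '>contig-100_1', are ignored.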



#46925

From: mstagliamonte <madmaxthc@yahoo.it>
Date: 2013-06-04 08:49 -0700
Message-ID: <1c5227af-75a2-4a78-881d-ff16074b556a@googlegroups.com>
In reply to: #46922
Thanks, I will try it straight away. Still, I don't understand why the original command returns nothing. Do you have any idea?
I am trying to understand the 'nuts and bolts' of what I am doing, and this result does not make any sense to me.

Regards
Max



#46931

From: Mark Lawrence <breamoreboy@yahoo.co.uk>
Date: 2013-06-04 17:01 +0100
Message-ID: <mailman.2662.1370361684.3114.python-list@python.org>
In reply to: #46925
On 04/06/2013 16:49, mstagliamonte wrote:

[strip the double line spaced nonsense]

Can you please check your email settings.  It's bad enough being plagued 
with double line spaced mail from google, having it come from yahoo is 
just adding insult to injury, thanks :)

-- 
"Steve is going for the pink ball - and for those of you who are 
watching in black and white, the pink is next to the green." Snooker 
commentator 'Whispering' Ted Lowe.

Mark Lawrence



#46973

From: Dave Angel <davea@davea.name>
Date: 2013-06-04 17:48 -0400
Message-ID: <mailman.2685.1370382496.3114.python-list@python.org>
In reply to: #46925
On 06/04/2013 12:01 PM, Mark Lawrence wrote:
> On 04/06/2013 16:49, mstagliamonte wrote:
>
> [strip the double line spaced nonsense]
>
> Can you please check your email settings.  It's bad enough being plagued
> with double line spaced mail from google, having it come from yahoo is
> just adding insult to injury, thanks :)
>

Mark:
The OP is posting from googlegroups, just using a yahoo return address. 
  So you just have one buggy provider to hate, not two.

 >>
 >>> If the header is something like:
 >>
 >>> h01 = ('>scaffold_1')
 >>
 >>> I just use:
 >>
 >>> h01.lstrip('>scaffold_')
 >>
 >>> and this returns me '1'
 >>
 >>>
 >>
 >>> But, if the header is:
 >>

madmaxthc@yahoo.it:

If you must use googlegroups, at least fix the double-posting and 
double-spacing bugs it has.  Start by reading:

  http://wiki.python.org/moin/GoogleGroupsPython




-- 
DaveA



#46919

From: mstagliamonte <madmaxthc@yahoo.it>
Date: 2013-06-04 08:28 -0700
Message-ID: <78065652-a363-4e9e-a9c5-913b71459444@googlegroups.com>
In reply to: #46916
edit: h02= ('>contig-100_1') 



#46921

From: mstagliamonte <madmaxthc@yahoo.it>
Date: 2013-06-04 08:29 -0700
Message-ID: <df3b4fb0-1d45-4f7a-9607-caa3a631c2c7@googlegroups.com>
In reply to: #46916
edit:
h02= ('>contig-100_1')



#46924

From: MRAB <python@mrabarnett.plus.com>
Date: 2013-06-04 16:48 +0100
Message-ID: <mailman.2658.1370360923.3114.python-list@python.org>
In reply to: #46916
The methods 'lstrip', 'rstrip' and 'strip' don't strip a string, they
strip characters.

You should think of the argument as a set of characters to be removed.

This code:

h01.lstrip('>scaffold_')

will return the result of stripping the characters '>', '_', 'a', 'c',
'd', 'f', 'l', 'o' and 's' from the left-hand end of h01.

A simpler example:

 >>> 'xyyxyabc'.lstrip('xy')
'abc'

It strips the characters 'x' and 'y' from the string, not the string
'xy' as such.

They are that way because they have been in Python for a long time,
long before sets and such like were added to the language.
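
To make the "set of characters" reading concrete, here is a small sketch using the headers from the original question (with the corrected value '>contig-100_1' for the second one). The variable names are only illustrative.

```python
h01 = '>scaffold_1'
h02 = '>contig-100_1'

# The argument is treated as a set of characters, so its order is irrelevant.
stripped_h01 = h01.lstrip('>scaffold_')
stripped_h01_reordered = h01.lstrip('_dlofacs>')

# '1', '0' and '_' are all in the argument, so the whole header is consumed.
stripped_h02 = h02.lstrip('>contig-100_')

# Without '_' in the argument, stripping stops at the underscore.
stripped_h02b = h02.lstrip('>contig-100')

print(repr(stripped_h01))            # '1'
print(repr(stripped_h01_reordered))  # '1'
print(repr(stripped_h02))            # ''
print(repr(stripped_h02b))           # '_1'
```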



#46928

From: mstagliamonte <madmaxthc@yahoo.it>
Date: 2013-06-04 08:53 -0700
Message-ID: <ce8c9f23-d6a7-40b3-a613-00cc596ad480@googlegroups.com>
In reply to: #46924
Hey,

Great! Now I understand!
So, basically, it is also stripping the numbers after the '_' !!

Thank you, I know a bit more now!

Have a nice day everyone :)
Max



#46927

From: Peter Otten <__peter__@web.de>
Date: 2013-06-04 17:52 +0200
Message-ID: <mailman.2660.1370361157.3114.python-list@python.org>
In reply to: #46916
"abba".lstrip("ab")

does not remove the prefix "ab" from the string "abba". Instead it removes 
chars from the beginning until it encounters one that is not in "ab". So

t = s.lstrip(chars_to_be_removed)

is roughly equivalent to

t = s
while len(t) > 0 and t[0] in chars_to_be_removed:
    t = t[1:]
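
The loop above can be wrapped in a function and compared against the built-in method. This is only a sketch: the name lstrip_chars is made up for the example, and it covers only the case where a string of characters is passed explicitly.

```python
def lstrip_chars(s, chars_to_be_removed):
    """Pure-Python rendering of s.lstrip(chars_to_be_removed)."""
    t = s
    while len(t) > 0 and t[0] in chars_to_be_removed:
        t = t[1:]
    return t

# Compare with the built-in on a few sample inputs.
samples = [
    ('abba', 'ab'),
    ('xyyxyabc', 'xy'),
    ('>scaffold_1', '>scaffold_'),
    ('>contig-100_1', '>contig-100_'),
]
for s, chars in samples:
    assert lstrip_chars(s, chars) == s.lstrip(chars)

print(repr(lstrip_chars('abba', 'ab')))  # ''
```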

If you want to remove a prefix use

s = "abba"
prefix = "ab"
if s.startswith(prefix):
    s = s[len(prefix):]
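
Applied to the headers from the question, the same idea might look like the sketch below. The helper name remove_prefix is an assumption for this example. Python 3.9 and later also provide str.removeprefix, which does the same thing; the check at the end only runs where that method exists.

```python
def remove_prefix(s, prefix):
    """Drop prefix from s if present; otherwise return s unchanged."""
    if s.startswith(prefix):
        return s[len(prefix):]
    return s

header = '>contig-100_1'
number = remove_prefix(header, '>contig-100_')
unchanged = remove_prefix(header, '>scaffold_')

print(number)     # 1
print(unchanged)  # >contig-100_1

# On Python 3.9+, the built-in method gives the same result.
if hasattr(str, 'removeprefix'):
    assert header.removeprefix('>contig-100_') == number
```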




#46929

From: John Gordon <gordon@panix.com>
Date: 2013-06-04 15:55 +0000
Message-ID: <kol2kq$7lh$1@reader1.panix.com>
In reply to: #46916
It's happening because the argument you pass to lstrip() isn't an exact
string to be removed; it's a set of individual characters, all of which
will be stripped out.

So, when you make this call:

    h02.lstrip('>contig-100_')

You're telling Python to remove every leading character that appears in
'>contig-100_'. Every character of the header, including the trailing digit,
is in that set, so nothing remains.

The reason it "worked" on your first example was that the character '1'
doesn't occur in the argument '>scaffold_'.

If the underscore character is always the separating point in your headers,
a better way might be to use the split() method instead of lstrip().
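
A minimal sketch of that suggestion follows. It uses rsplit with a limit of one instead of a plain split, which is a substitution made here so that any underscores earlier in the name are left alone; the function name split_header is hypothetical, and the sketch assumes every header contains at least one underscore followed by digits.

```python
def split_header(header):
    """Split a header at its last underscore into (name, number)."""
    name, number = header.rsplit('_', 1)
    return name, int(number)

print(split_header('>scaffold_1'))    # ('>scaffold', 1)
print(split_header('>contig-100_1'))  # ('>contig-100', 1)
```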

-- 
John Gordon                   A is for Amy, who fell down the stairs
gordon@panix.com              B is for Basil, assaulted by bears
                                -- Edward Gorey, "The Gashlycrumb Tinies"



#46932

From: mstagliamonte <madmaxthc@yahoo.it>
Date: 2013-06-04 09:06 -0700
Message-ID: <2e917225-8598-45a6-9c3f-e49b9cbc5b47@googlegroups.com>
In reply to: #46929
Thanks to everyone! I didn't expect so many replies in such a short time!

Regards,
Max



#47010

From: Larry Hudson <orgnut@yahoo.com>
Date: 2013-06-04 21:53 -0700
Message-ID: <orydnZl-2In_WTPMnZ2dnUVZ_jKdnZ2d@giganews.com>
In reply to: #46916
The lstrip() method is the wrong one to use here.  The command help(str.lstrip) gives:

lstrip(...)
     S.lstrip([chars]) -> str

     Return a copy of the string S with leading whitespace removed.
     If chars is given and not None, remove characters in chars instead.

IOW, it does NOT strip the given string, but all the characters in the given string.
So in your second example it (correctly) removes everything and gives you an empty string as the 
result.

One possible alternative is to use slicing:

h02 = '>contig-100_0'
h03 = '>contig-100_'
result = h02[len(h03):]

Or some similar variation, possibly adding a startswith() call for some simple error
checking.  Of course, other approaches are possible as well.
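
One way the slicing approach with a startswith() check might look is sketched below. The function name text_after_prefix and the choice to raise ValueError on a mismatch are assumptions for this example, not part of the original suggestion.

```python
def text_after_prefix(header, prefix):
    """Return the part of header after prefix; complain if prefix is absent."""
    if not header.startswith(prefix):
        raise ValueError('%r does not start with %r' % (header, prefix))
    return header[len(prefix):]

print(text_after_prefix('>contig-100_1', '>contig-100_'))  # 1

try:
    text_after_prefix('>scaffold_1', '>contig-100_')
except ValueError as exc:
    print('rejected:', exc)
```

Raising on a mismatch, instead of silently returning the header unchanged, makes an unexpected header format show up early.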

      -=- Larry -=-
