Need Help with the BeautifulSoup problem, please

Started by: seaspeak@gmail.com
First post: 2013-12-15 22:41 -0800
Last post: 2013-12-16 10:32 +0100
Articles: 6 (4 participants)

Contents

  Need Help with the BeautifulSoup problem, please seaspeak@gmail.com - 2013-12-15 22:41 -0800
    Re: Need Help with the BeautifulSoup problem, please 88888 Dihedral <dihedral88888@gmail.com> - 2013-12-16 00:02 -0800
      Re: Need Help with the BeautifulSoup problem, please seaspeak@gmail.com - 2013-12-16 01:26 -0800
    Re: Need Help with the BeautifulSoup problem, please seaspeak@gmail.com - 2013-12-16 00:39 -0800
    Re: Need Help with the BeautifulSoup problem, please Peter Otten <__peter__@web.de> - 2013-12-16 10:33 +0100
    Re: Need Help with the BeautifulSoup problem, please Andreas Perstinger <andipersti@gmail.com> - 2013-12-16 10:32 +0100

#62011 — Need Help with the BeautifulSoup problem, please

From: seaspeak@gmail.com
Date: 2013-12-15 22:41 -0800
Subject: Need Help with the BeautifulSoup problem, please
Message-ID: <89f67420-0b59-4f45-bf33-cd4e17467852@googlegroups.com>

I need to replace every <b> tag after ■ with a <span>. But the result of the code below is '■   <span style="REPLACE">D</span> / <font></font>'
Can you explain what I did wrong, please?

    from bs4 import BeautifulSoup

    s = '■<b>A</b> <b>B</b> <b>C</b> <b>D</b> / <font></font>'
    soup = BeautifulSoup(s)
    for i in soup.find_all(text='■'):
        tag = soup.new_tag('span')
        tag['style'] = 'REPLACE'
        for ii in i.find_next_siblings():
            if ii.name=='font' or str(ii).lstrip('')[0:1]=='/':
                break
            else:
                if ii.name=='b':
                    tag.string=ii.string
                    print(ii.replace_with(tag))
    print(soup)



#62021

From: 88888 Dihedral <dihedral88888@gmail.com>
Date: 2013-12-16 00:02 -0800
Message-ID: <d038c329-d48f-4a70-8d8a-19e8bac2b57a@googlegroups.com>
In reply to: #62011

On Monday, December 16, 2013 2:41:08 PM UTC+8, seas...@gmail.com wrote:
> I need to replace all tag <b> with <span> after ■. But the result from below is '■   <span style="REPLACE">D</span> / <font></font>'
> Can you explain what I did wrong, please.
>
>     s = '■<b>A</b> <b>B</b> <b>C</b> <b>D</b> / <font></font>'
>     soup = BeautifulSoup(s)
>     for i in soup.find_all(text='■'):
>         tag = soup.new_tag('span')
>         tag['style'] = 'REPLACE'
>         for ii in i.find_next_siblings():
>             if ii.name=='font' or str(ii).lstrip('')[0:1]=='/':
>                 break
>             else:
>                 if ii.name=='b':
>                     tag.string=ii.string
>                     print(ii.replace_with(tag))
>     print(soup)

I think you should try a decent free editor such as Notepad++
on your source code to replace trivial strings without
programming.



#62027

From: seaspeak@gmail.com
Date: 2013-12-16 01:26 -0800
Message-ID: <c17ba96a-9d3c-4f3b-83b5-fe1f747e03c2@googlegroups.com>
In reply to: #62021

On Monday, December 16, 2013 at 4:02:42 PM UTC+8, 88888 Dihedral wrote:
> On Monday, December 16, 2013 2:41:08 PM UTC+8, seas...@gmail.com wrote:
> > I need to replace all tag <b> with <span> after ■. But the result from below is '■   <span style="REPLACE">D</span> / <font></font>'
> > Can you explain what I did wrong, please.
> >
> >     s = '■<b>A</b> <b>B</b> <b>C</b> <b>D</b> / <font></font>'
> >     soup = BeautifulSoup(s)
> >     for i in soup.find_all(text='■'):
> >         tag = soup.new_tag('span')
> >         tag['style'] = 'REPLACE'
> >         for ii in i.find_next_siblings():
> >             if ii.name=='font' or str(ii).lstrip('')[0:1]=='/':
> >                 break
> >             else:
> >                 if ii.name=='b':
> >                     tag.string=ii.string
> >                     print(ii.replace_with(tag))
> >     print(soup)
>
> I think you should try some descent
> free editors such as notepad++
> for your source codes to
> replace trivial strings without
> programmig.

I think it's my fault, thanks



#62024

From: seaspeak@gmail.com
Date: 2013-12-16 00:39 -0800
Message-ID: <5237630f-1c98-4094-96e6-bdc42404e8ba@googlegroups.com>
In reply to: #62011

On Monday, December 16, 2013 at 2:41:08 PM UTC+8, seas...@gmail.com wrote:
> I need to replace all tag <b> with <span> after ■. But the result from below is '■   <span style="REPLACE">D</span> / <font></font>'
> Can you explain what I did wrong, please.
>
>     s = '■<b>A</b> <b>B</b> <b>C</b> <b>D</b> / <font></font>'
>     soup = BeautifulSoup(s)
>     for i in soup.find_all(text='■'):
>         tag = soup.new_tag('span')
>         tag['style'] = 'REPLACE'
>         for ii in i.find_next_siblings():
>             if ii.name=='font' or str(ii).lstrip('')[0:1]=='/':
>                 break
>             else:
>                 if ii.name=='b':
>                     tag.string=ii.string
>                     print(ii.replace_with(tag))
>     print(soup)

The point is that the result seems wrong, and I don't know whether the problem is on my side.
I simplified the code to highlight the problem; there's no way an editor can do what I want to do.



#62029

From: Peter Otten <__peter__@web.de>
Date: 2013-12-16 10:33 +0100
Message-ID: <mailman.4192.1387186374.18130.python-list@python.org>
In reply to: #62011

seaspeak@gmail.com wrote:

>     I need to replace all tag <b> with <span> after ■. But the result from
>     below is '■   <span style="REPLACE">D</span> / <font></font>'
> Can you explain what I did wrong, please.
> 
>     s = '■<b>A</b> <b>B</b> <b>C</b> <b>D</b> / <font></font>'
>     soup = BeautifulSoup(s)
>     for i in soup.find_all(text='■'):
>         tag = soup.new_tag('span')
>         tag['style'] = 'REPLACE'
>         for ii in i.find_next_siblings():
>             if ii.name=='font' or str(ii).lstrip('')[0:1]=='/':
>                 break
>             else:
>                 if ii.name=='b':
>                     tag.string=ii.string
>                     print(ii.replace_with(tag))
>     print(soup)

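It looks like you cannot reuse a tag: every replace_with() call moves the
same span object to the new position, so only the last replacement
survives. Try creating a new tag for each <b>: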

    from bs4 import BeautifulSoup

    s = '■<b>A</b> <b>B</b> <b>C</b> <b>D</b> / <font></font>'
    soup = BeautifulSoup(s)
    for i in soup.find_all(text='■'):
        for ii in i.find_next_siblings():
            if ii.name=='font' or str(ii).lstrip('')[0:1]=='/':
                break
            else:
                if ii.name=='b':
                    tag = soup.new_tag('span')
                    tag['style'] = 'REPLACE'
                    tag.string=ii.string
                    print(ii.replace_with(tag))
    print(soup)
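
A minimal sketch of why the shared tag ends up only once in the output. It is not from the original posts: the two-element markup and the variable names are made up for the demonstration, and it assumes bs4 with Python's built-in "html.parser". The same Tag object is used for both replacements, and the document still contains a single <span> afterwards.

```python
from bs4 import BeautifulSoup

# Hypothetical two-element document, just for the demonstration.
soup = BeautifulSoup("<p><b>A</b> <b>B</b></p>", "html.parser")
first_b, second_b = soup.find_all("b")

# One shared Tag object, created once as in the original question.
span = soup.new_tag("span")

span.string = first_b.string
first_b.replace_with(span)
spans_after_first = len(soup.find_all("span"))

span.string = second_b.string
second_b.replace_with(span)  # the same object is moved, not copied
spans_after_second = len(soup.find_all("span"))

print(spans_after_first, spans_after_second)  # 1 1
print(soup)
```

Both <b> elements are gone at the end, but only one <span> remains, which matches the behaviour reported in the question.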



#62030

From: Andreas Perstinger <andipersti@gmail.com>
Date: 2013-12-16 10:32 +0100
Message-ID: <mailman.4191.1387186340.18130.python-list@python.org>
In reply to: #62011

On 16.12.2013 07:41, seaspeak@gmail.com wrote:
> I need to replace all tag <b> with <span> after ■. But the result
> from below is '■ <span style="REPLACE">D</span> / <font></font>'
> Can you explain what I did wrong, please.
>
>      s = '■<b>A</b> <b>B</b> <b>C</b> <b>D</b> / <font></font>'
>      soup = BeautifulSoup(s)
>      for i in soup.find_all(text='■'):
>          tag = soup.new_tag('span')
>          tag['style'] = 'REPLACE'
>          for ii in i.find_next_siblings():
>              if ii.name=='font' or str(ii).lstrip('')[0:1]=='/':
>                  break
>              else:
>                  if ii.name=='b':
>                      tag.string=ii.string
>                      print(ii.replace_with(tag))
>      print(soup)
>

You are only creating one new tag, but as I understand your problem, you
want to replace each b-element with a new tag. Simply move the
tag-creating part into the inner loop:

for i in soup.find_all(text='■'):
    for ii in i.find_next_siblings():
        if ii.name=='font' or str(ii).lstrip('')[0:1]=='/':
            break
        else:
            if ii.name=='b':
                tag = soup.new_tag('span')
                tag['style'] = 'REPLACE'
                tag.string=ii.string
                print(ii.replace_with(tag))
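
As an alternative to creating replacement tags, a possible sketch (not what was posted in this thread) is to rename each <b> in place by assigning to its name and setting the attribute on it. This assumes bs4 with the built-in "html.parser" and a version that accepts the string= argument; the variable names are illustrative only.

```python
from bs4 import BeautifulSoup

s = '■<b>A</b> <b>B</b> <b>C</b> <b>D</b> / <font></font>'
soup = BeautifulSoup(s, "html.parser")

# Locate the marker text node, then walk the tags that follow it.
marker = soup.find(string='■')
for sibling in marker.find_next_siblings():
    if sibling.name == 'font':
        break  # stop at the first <font>, as in the original loop
    if sibling.name == 'b':
        sibling.name = 'span'         # rename the element in place
        sibling['style'] = 'REPLACE'  # add the attribute to the same element

print(soup)
```

Because no element is moved or replaced, the children of each <b> stay where they are, and there is no shared object to worry about.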

And please read
https://wiki.python.org/moin/GoogleGroupsPython
if you want to continue using Google Groups for accessing this list.

Bye, Andreas


