Groups | Search | Server Info | Keyboard shortcuts | Login | Register [http] [https] [nntp] [nntps]


Groups > comp.lang.python > #99508

Re: Screen scraper to get all 'a title' elements

From Grobu <snailcoder@retrosite.invalid>
Newsgroups comp.lang.python
Subject Re: Screen scraper to get all 'a title' elements
Date 2015-11-26 00:44 +0100
Organization A noiseless patient Spider
Message-ID <n35h0v$stn$1@dont-email.me> (permalink)
References <23ed6f4b-0ef2-4c9e-ade6-e597e7e03ca2@googlegroups.com> <mailman.96.1448484959.20593.python-list@python.org> <n35ckk$9q0$1@dont-email.me> <c1e43997-0da3-4b93-b9af-98a2568eff9d@googlegroups.com> <mailman.103.1448492791.20593.python-list@python.org>

Show all headers | View raw


On 26/11/15 00:06, Chris Angelico wrote:
> On Thu, Nov 26, 2015 at 9:48 AM, ryguy7272 <ryanshuell@gmail.com> wrote:
>> Thanks!!  Is that regex?  Can you explain exactly what it is doing?
>> Also, it seems to pick up a lot more than just the list I wanted, but that's ok, I can see why it does that.
>>
>> Can you just please explain what it's doing???
>
> It's a trap!
>
> Don't use a regex to parse HTML, unless you're deliberately trying to
> entice young and innocent programmers to the dark side.
>
> ChrisA
>

Sorry, I wasn't aware of regex being on the dark side :-)
Now that you mention it, I suppose that their being complex and 
error-inducing could lead to broken code all too easily when there is a 
reliable, ready-made solution like BeautifulSoup.

Back to comp.lang.python | Previous | NextPrevious in thread | Next in thread | Find similar | Unroll thread


Thread

Screen scraper to get all 'a title' elements ryguy7272 <ryanshuell@gmail.com> - 2015-11-25 12:42 -0800
  Re: Screen scraper to get all 'a title' elements MRAB <python@mrabarnett.plus.com> - 2015-11-25 20:55 +0000
    Re: Screen scraper to get all 'a title' elements Grobu <snailcoder@retrosite.invalid> - 2015-11-25 23:30 +0100
      Re: Screen scraper to get all 'a title' elements ryguy7272 <ryanshuell@gmail.com> - 2015-11-25 14:48 -0800
        Re: Screen scraper to get all 'a title' elements Chris Angelico <rosuav@gmail.com> - 2015-11-26 10:06 +1100
          Re: Screen scraper to get all 'a title' elements Grobu <snailcoder@retrosite.invalid> - 2015-11-26 00:44 +0100
            Re: Screen scraper to get all 'a title' elements Marko Rauhamaa <marko@pacujo.net> - 2015-11-26 01:53 +0200
              Re: Screen scraper to get all 'a title' elements Chris Angelico <rosuav@gmail.com> - 2015-11-26 10:59 +1100
            Re: Screen scraper to get all 'a title' elements Chris Angelico <rosuav@gmail.com> - 2015-11-26 10:54 +1100
            Re: Screen scraper to get all 'a title' elements Grobu <snailcoder@retrosite.invalid> - 2015-11-26 02:05 +0100
        Re: Screen scraper to get all 'a title' elements Grobu <snailcoder@retrosite.invalid> - 2015-11-26 00:33 +0100
          Re: Screen scraper to get all 'a title' elements ryguy7272 <ryanshuell@gmail.com> - 2015-11-25 15:37 -0800
            Re: Screen scraper to get all 'a title' elements Chris Angelico <rosuav@gmail.com> - 2015-11-26 10:42 +1100
  Re: Screen scraper to get all 'a title' elements ryguy7272 <ryanshuell@gmail.com> - 2015-11-25 14:04 -0800
    Re: Screen scraper to get all 'a title' elements Chris Angelico <rosuav@gmail.com> - 2015-11-26 09:10 +1100
  Re: Screen scraper to get all 'a title' elements TP <wingusr@gmail.com> - 2015-11-25 17:15 -0800
  Re: Screen scraper to get all 'a title' elements Denis McMahon <denismfmcmahon@gmail.com> - 2015-11-26 14:49 +0000

csiph-web