


Re: Screen scraper to get all 'a title' elements

From Marko Rauhamaa <marko@pacujo.net>
Newsgroups comp.lang.python
Subject Re: Screen scraper to get all 'a title' elements
Date Thu, 26 Nov 2015 01:53:26 +0200
Message-ID <87y4dl3abt.fsf@elektro.pacujo.net>



Grobu <snailcoder@retrosite.invalid>:

> Sorry, I wasn't aware of regex being on the dark side :-)

No, regular expressions are great for many purposes. Parsing
context-free syntax isn't one of them.

See:

  <URL: https://en.wikipedia.org/wiki/Chomsky_hierarchy#The_hierarchy>

Most modern programming languages, and markup languages such as HTML,
have a (roughly) context-free syntax. Their nested structure is too rich
for regular expressions to capture.
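
To make the nesting problem concrete, here is a minimal sketch using only
the standard library. The sample markup, the TitleCollector class and the
collect_titles helper are made up for illustration; they are not from the
thread. A non-greedy regex pairs the first <div> with the first </div> it
meets, while a parser that tracks depth sees the real structure and can
pull out the 'a title' attributes along the way.

```python
import re
from html.parser import HTMLParser

# Hypothetical sample: an outer <div> containing a nested <div>.
SAMPLE = (
    '<div><div><a title="inner" href="#">x</a></div>'
    '<a title="outer" href="#">y</a></div>'
)

# The regex has no notion of nesting depth, so it stops at the first
# closing tag and returns an unbalanced fragment.
naive = re.search(r"<div>(.*?)</div>", SAMPLE).group(1)


class TitleCollector(HTMLParser):
    """Collect (div depth, title) pairs for every <a> with a title."""

    def __init__(self):
        super().__init__()
        self.depth = 0
        self.titles = []

    def handle_starttag(self, tag, attrs):
        if tag == "div":
            self.depth += 1
        elif tag == "a":
            title = dict(attrs).get("title")
            if title is not None:
                self.titles.append((self.depth, title))

    def handle_endtag(self, tag):
        if tag == "div":
            self.depth -= 1


def collect_titles(html):
    parser = TitleCollector()
    parser.feed(html)
    parser.close()
    return parser.titles


if __name__ == "__main__":
    print(naive)
    print(collect_titles(SAMPLE))
```

The depth counter is the part a regular expression cannot express: it
amounts to unbounded memory, which is exactly what separates the
context-free level of the hierarchy from the regular one.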

Regular expressions can handle any regular language just fine. They are
commonly used to define the lexical tokens of a language.
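
As a rough illustration of that lexical use, the sketch below splits a
small arithmetic expression into tokens with a single compiled pattern.
The token names and the tokenize function are assumptions chosen for the
example, not part of any particular language definition.

```python
import re

# One named alternative per token class; whitespace is matched and skipped.
TOKEN_RE = re.compile(
    r"""
    (?P<NUMBER>\d+)
  | (?P<NAME>[A-Za-z_]\w*)
  | (?P<OP>[-+*/()=])
  | (?P<SKIP>\s+)
    """,
    re.VERBOSE,
)


def tokenize(text):
    """Return a list of (kind, lexeme) pairs, or raise on unknown input."""
    tokens = []
    pos = 0
    while pos < len(text):
        match = TOKEN_RE.match(text, pos)
        if match is None:
            raise ValueError(f"unexpected character {text[pos]!r} at {pos}")
        if match.lastgroup != "SKIP":
            tokens.append((match.lastgroup, match.group()))
        pos = match.end()
    return tokens


if __name__ == "__main__":
    print(tokenize("x = (3 + 41) * y"))
```

Note that the tokenizer happily accepts unbalanced parentheses. Checking
that they match up is the parser's job, one level up the hierarchy.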


Marko



