From: Matej Cepl
To: python-list@python.org
Date: Sat, 4 Jan 2014 23:57:40 +0100
Subject: [ANN] gg_scrapper -- scrapping of the Google Groups
Organization: Red Hat Czech, s.r.o.
Newsgroups: comp.lang.python

Have you ever tried to archive a mailing list hosted on Google Groups?
Were you endlessly frustrated by the black hole that is Google
Groups, conspicuous by its absence from the Data Liberation Front
website? Yes, I was too_.

So I have written gg_scrapper_, a script that scrapes a Google
Group. Thanks to `Sean Hogan`_ for the first inspiration for the
script. Any comments are welcome via email (I am sure you can find
my addresses somewhere on the Web).

Best,

Matěj

.. _too: http://matej.ceplovi.cz/blog/2013/09/we-should-stop-even-pretending-google-is-trying-to-do-the-right-thing/
.. _gg_scrapper: https://pypi.python.org/pypi/gg_scrapper
.. _`Sean Hogan`: http://matej.ceplovi.cz/blog/2013/09/we-should-stop-even-pretending-google-is-trying-to-do-the-right-thing/#comment-482

-- 
http://www.ceplovi.cz/matej/, Jabber: mceplceplovi.cz
GPG Finger: 89EF 4BC6 288A BF43 1BAB 25C3 E09F EF25 D964 84AC

<"}}}><