From: Dave Angel
Newsgroups: comp.lang.python
To: python-list@python.org
Subject: Re: Total Beginner - Extracting Data from a Database Online (Screenshot)
Date: Fri, 24 May 2013 21:16:37 -0400

On 05/24/2013 07:36 PM, Carlos Nepomuceno wrote:
>
> page = urllib2.urlopen("http://example.com/page.html").read().strip()
>
> #to create the tables list
> tables=[[re.findall('<td>(.*?)</td>',r,re.S) for r in re.findall('<tr>(.*?)</tr>',t,re.S)] for t in re.findall('<table>(.*?)</table>',page,re.S)]
>
> Pretty simple. Good luck!

Only if the page is HTML, which the OP's was not. It was an image. Try parsing that with regex.

-- 
DaveA
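
To make the point above concrete, here is a minimal sketch (Python 3, not code from the thread) that checks whether the downloaded bytes are an image before attempting the regex extraction. The helper names `sniff_image` and `extract_tables`, the list of signatures, and the `<table>`/`<tr>`/`<td>` patterns are all assumptions for illustration; nested tables or unusual markup would likely need a real HTML parser instead.

```python
import re

# Leading "magic" bytes of a few common image formats (illustrative, not exhaustive).
IMAGE_SIGNATURES = {
    b"\x89PNG\r\n\x1a\n": "png",
    b"\xff\xd8\xff": "jpeg",
    b"GIF87a": "gif",
    b"GIF89a": "gif",
}


def sniff_image(data):
    """Return an image type name if data starts with a known signature, else None."""
    for magic, name in IMAGE_SIGNATURES.items():
        if data.startswith(magic):
            return name
    return None


def extract_tables(data):
    """Return a tables -> rows -> cells list from HTML bytes.

    Raises ValueError if the payload looks like an image, since there is
    no markup for a regex to match in that case.
    """
    kind = sniff_image(data)
    if kind is not None:
        raise ValueError("payload is a %s image, not HTML" % kind)
    page = data.decode("utf-8", errors="replace")
    flags = re.S | re.I
    return [
        [re.findall(r"<td[^>]*>(.*?)</td>", row, flags)
         for row in re.findall(r"<tr[^>]*>(.*?)</tr>", table, flags)]
        for table in re.findall(r"<table[^>]*>(.*?)</table>", page, flags)
    ]
```

With real input, `data` would be whatever bytes the download returned; on a screenshot the function raises `ValueError` rather than quietly returning an empty list, which is all the regex approach could produce.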