From: Philipp Kraus
Newsgroups: comp.lang.python
Subject: string encoding regex problem
Date: Sat, 16 Aug 2014 02:27:57 +0200

Hello,


I have defined a function with:


def URLReader(url):
    try:
        f = urllib2.urlopen(url)
        data = f.read()
        f.close()
    except Exception as e:
        raise MyError.StopError(e)
    return data


which gets the HTML source code from a URL. I use this to extract part of an HTML document without any HTML parsing. I would like to get the download link of the Boost library, so I call:


found = re.search(r'<a href="/projects/boost/files/latest/download\?source=files" title="/boost/(.*)', Utilities.URLReader("http://sourceforge.net/projects/boost/files/boost/"))
if found is None:
    raise MyError.StopError("Boost Download URL not found")


But found is always None, so I never get a match. I cannot find the error in my code.
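
One way to narrow this down is to test the pattern against a fixed string instead of the live page. The sketch below is only an illustration under assumptions: SAMPLE_HTML is a made-up anchor tag (not copied from the SourceForge page), the file name in it is a placeholder, and find_boost_title is a hypothetical helper name. It also uses ([^"]*) instead of (.*) so that the capture stops at the closing quote of the title attribute.

```python
import re

# Made-up markup for illustration only; the real page may look different.
SAMPLE_HTML = (
    '<a href="/projects/boost/files/latest/download?source=files" '
    'title="/boost/X.Y.Z/example.tar.gz">Download</a>'
)

# Raw string, so that \? reaches the regex engine as an escaped question mark.
PATTERN = re.compile(
    r'<a href="/projects/boost/files/latest/download\?source=files" '
    r'title="/boost/([^"]*)"'
)


def find_boost_title(html):
    """Return the part of the title attribute after /boost/, or None."""
    match = PATTERN.search(html)
    return match.group(1) if match else None


if __name__ == "__main__":
    print(find_boost_title(SAMPLE_HTML))
```

If the pattern matches here but not on the downloaded data, the markup sent by the server probably differs from what the pattern expects (for example in attribute order, whitespace, or the href value), so printing the downloaded data and comparing it with the pattern would be the next step.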


Thanks for any help


Phil
