| Path | csiph.com!usenet.pasdenom.info!weretis.net!feeder1.news.weretis.net!feeder.erje.net!eu.feeder.erje.net!newsfeed.xs4all.nl!newsfeed3.news.xs4all.nl!xs4all!post.news.xs4all.nl!not-for-mail |
|---|---|
| Return-Path | <joel.goldstick@gmail.com> |
| X-Original-To | python-list@python.org |
| Delivered-To | python-list@mail.python.org |
| In-Reply-To | <VjMcu.4720$J14.740@fx31.am4> |
| References | <7fb9c035-c663-4874-9597-ac47d1c30da7@googlegroups.com> <VjMcu.4720$J14.740@fx31.am4> |
| Date | Fri, 1 Nov 2013 10:14:16 -0400 |
| Subject | Re: how to extract page-URL using BeautifulSoup |
| From | Joel Goldstick <joel.goldstick@gmail.com> |
| To | Alister <alister.ware@ntlworld.com> |
| Content-Type | text/plain; charset=UTF-8 |
| Cc | "python-list@python.org" <python-list@python.org> |
| Newsgroups | comp.lang.python |
| Message-ID | <mailman.1929.1383315259.18130.python-list@python.org> |
This is nearly the same question you asked under another name yesterday. It's not clear what you really want to do. You are asking for the URL of a page that you retrieved by supplying that same URL.

On Fri, Nov 1, 2013 at 7:33 AM, Alister <alister.ware@ntlworld.com> wrote:
> On Thu, 31 Oct 2013 08:59:00 -0700, bhaktanishant wrote:
>
>> I want to extract the page-url. for example:
>> if i have this code
>>
>> import urllib2
>> from bs4 import BeautifulSoup
>> link = "http://www.google.com"
>> page = urllib2.urlopen(link).read()
>> soup = BeautifulSoup(page)
>>
>> then i can extract title of page by:
>>
>> title = soup.title
>>
>> but i want to know that how to extract page-URL from "soup" that will be
>> "http://www.google.com"
>
> I must be missing something here, the page url is what you use to open
> the page in the first place in your case link.
>
> --
> May a Misguided Platypus lay its Eggs in your Jockey Shorts.
> --
> https://mail.python.org/mailman/listinfo/python-list

--
Joel Goldstick
http://joelgoldstick.com
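The point made in the replies can be sketched in code: the parsed document is built from the markup alone, so the address has to be kept by the caller, unless the page happens to declare its own address in a `<link rel="canonical">` tag. The sketch below is only an illustration under stated assumptions. It uses Python 3's standard library rather than the `urllib2` and `bs4` imports in the quoted post, so that it runs without third-party packages, and the names `CanonicalLinkFinder` and `page_url` are hypothetical, not part of any library.

```python
from html.parser import HTMLParser
from urllib.parse import urljoin


class CanonicalLinkFinder(HTMLParser):
    """Remember the href of the first <link rel="canonical"> tag, if any."""

    def __init__(self):
        super().__init__()
        self.canonical = None

    def handle_starttag(self, tag, attrs):
        # Only <link> tags matter, and only the first canonical one is kept.
        if tag != "link" or self.canonical is not None:
            return
        attributes = dict(attrs)
        # rel may hold several space-separated values, or be missing.
        rel_values = (attributes.get("rel") or "").lower().split()
        if "canonical" in rel_values and attributes.get("href"):
            self.canonical = attributes["href"]


def page_url(html, fetched_url):
    """Return the address the page declares for itself, else the one we fetched.

    html        -- the page source as a string
    fetched_url -- the URL the caller used to retrieve the page
    """
    finder = CanonicalLinkFinder()
    finder.feed(html)
    if finder.canonical:
        # A canonical href may be relative, so resolve it against the fetched URL.
        return urljoin(fetched_url, finder.canonical)
    return fetched_url
```

Calling `page_url(page, link)` with the variables from the quoted code simply returns `link` when the page declares no canonical address, which is the answer already given in the thread. If redirects matter, the response object returned by `urlopen` also reports the final address through its `geturl()` method, so that value can be passed as `fetched_url` instead.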
how to extract page-URL using BeautifulSoup bhaktanishant@gmail.com - 2013-10-31 08:59 -0700
Re: how to extract page-URL using BeautifulSoup MRAB <python@mrabarnett.plus.com> - 2013-10-31 17:36 +0000
Re: how to extract page-URL using BeautifulSoup Alister <alister.ware@ntlworld.com> - 2013-11-01 11:33 +0000
Re: how to extract page-URL using BeautifulSoup Joel Goldstick <joel.goldstick@gmail.com> - 2013-11-01 10:14 -0400