BLANK PAGE when i try Filtering Adsense with abpy

Started by	em rexhepi <em.rexhepi@gmail.com>
First post	2013-12-22 09:20 -0800
Last post	2013-12-22 16:13 -0500
Articles	6 — 6 participants

  BLANK PAGE when i try Filtering Adsense with abpy em rexhepi <em.rexhepi@gmail.com> - 2013-12-22 09:20 -0800
    Re: BLANK PAGE when i try Filtering Adsense with abpy Chris Angelico <rosuav@gmail.com> - 2013-12-23 04:58 +1100
    Re: BLANK PAGE when i try Filtering Adsense with abpy Michael Torrie <torriem@gmail.com> - 2013-12-22 11:08 -0700
    Re: BLANK PAGE when i try Filtering Adsense with abpy Mark Lawrence <breamoreboy@yahoo.co.uk> - 2013-12-22 18:25 +0000
    Re: BLANK PAGE when i try Filtering Adsense with abpy MRAB <python@mrabarnett.plus.com> - 2013-12-22 20:28 +0000
    Re: BLANK PAGE when i try Filtering Adsense with abpy Terry Reedy <tjreedy@udel.edu> - 2013-12-22 16:13 -0500

#62535 — BLANK PAGE when i try Filtering Adsense with abpy

From	em rexhepi <em.rexhepi@gmail.com>
Date	2013-12-22 09:20 -0800
Subject	BLANK PAGE when i try Filtering Adsense with abpy
Message-ID	<bf7b5838-9f24-4a94-9c14-0976a21578ff@googlegroups.com>

I know it's my fault; I'm not a good programmer. I'm a beginner, which is why I need your help.

I have a Python 3.3 project to finish. I did what I could, but there is not much help on Google about this topic.

The project is to load a webpage from any website and filter the ads.
I'm using ABPY library to filter, here is the link:
https://github.com/atereshkin/abpy <- needs to be converted in python 3.x it is on 2.x
easylist.txt link: https://easylist-downloads.adblockplus.org/easylist.txt


When I use my code it just displays nothing

My code:
#!/usr/local/bin/python3.1

import cgitb;cgitb.enable()

import urllib.request
response = urllib.request.build_opener()
response.addheaders = [('User-agent', 'Mozilla/5.0')]
response = urllib.request.urlopen("www.youtube.com";)

html = response.read()

from abpy import Filter
with open("easylist.txt") as f:
f = Filter(file('easylist.txt'))
f.match(html)


print("Content-type: text/html")
print()
print (html)


#62536

From	Chris Angelico <rosuav@gmail.com>
Date	2013-12-23 04:58 +1100
Message-ID	<mailman.4495.1387735103.18130.python-list@python.org>
In reply to	#62535

On Mon, Dec 23, 2013 at 4:20 AM, em rexhepi <em.rexhepi@gmail.com> wrote:
> I have a python 3.3 project to be finished. ...
>
> My code:
> #!/usr/local/bin/python3.1

Your shebang says 3.1, are you sure that's correct? Maybe it's not
finding the right interpreter.

If this is running as CGI, which it seems to be, check your server
error logs. It's quite possible you're getting back a blank page
because something's bombing, in which case - if you're lucky -
there'll be a full exception traceback in the log.

ChrisA
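
To follow up on the shebang question above, here is a minimal diagnostic sketch that can be dropped in as a separate CGI script to see which interpreter the server really launches. The `/usr/bin/env python3` shebang and the function name `interpreter_report` are assumptions for illustration; use whatever path your server actually has.

```python
#!/usr/bin/env python3
# Diagnostic sketch: report which interpreter actually runs this script.
import sys


def interpreter_report():
    """Return a one-line description of the running interpreter."""
    major, minor, micro = sys.version_info[:3]
    return "Python {}.{}.{} at {}".format(major, minor, micro, sys.executable)


if __name__ == "__main__":
    # A CGI response needs a header block, then a blank line, then the body.
    print("Content-type: text/plain")
    print()
    print(interpreter_report())
```

If this page also comes back blank, the problem is in the server or shebang setup rather than in the filtering code, and the server error log is the next place to look.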


#62538

From	Michael Torrie <torriem@gmail.com>
Date	2013-12-22 11:08 -0700
Message-ID	<mailman.4497.1387735747.18130.python-list@python.org>
In reply to	#62535

On 12/22/2013 10:20 AM, em rexhepi wrote:
> When I use my code it just displays nothing
> 
> My code:
> #!/usr/local/bin/python3.1
> 
> import cgitb;cgitb.enable()
> 
> import urllib.request
> response = urllib.request.build_opener()
> response.addheaders = [('User-agent', 'Mozilla/5.0')]
> response = urllib.request.urlopen("www.youtube.com";)
> 
> html = response.read()
> 
> from abpy import Filter
> with open("easylist.txt") as f:
> f = Filter(file('easylist.txt'))
> f.match(html)

What happens when you comment out the above four lines?  Does the web
page print without the filtering?  Just as a sanity check.  My hunch is
that html has no data in it.

Also what is "f.match(html)" supposed to return? Is it supposed to
mutate html (seems unlikely) or does it return something? Looking at the
source code, match() does not return anything, but prints to stdout,
which is weird, but at least that tells us that it doesn't actually
change the html object.

> print("Content-type: text/html")
> print()
> print (html)

I'm not sure you're doing this right.  abpy seems a bit goofy, but since
f.match() does not appear to change html at all, you should get the same
html out that urllib grabbed.  So if you're not getting any output, that
means you're not getting the original html somehow.  Also, if f.match()
is doing its thing, I don't think you want to print out html after the
call, because f.match() is itself printing to stdout.

Have you looked over the abpy source code?  I haven't bothered to run
it, but a glance through the code would seem to indicate that it doesn't
actually do the filtering at all, but rather just prints out the rules
that the html code you provide would match.  I bet you could modify it
to do filtering though.  Maybe add a method that uses rule.sub to
replace the bad text with an empty string.
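
The rule.sub idea could look roughly like the sketch below. It assumes each rule is a compiled regular expression, which is what a `sub` method suggests; abpy's real rule objects were not checked. The two patterns in `EXAMPLE_RULES` and the name `strip_matches` are made up for illustration and are not taken from easylist.txt.

```python
import re

# Hypothetical stand-ins for compiled filter rules; abpy's real rule
# objects and attribute names may differ.
EXAMPLE_RULES = [
    re.compile(
        r'<script[^>]*src="[^"]*googlesyndication[^"]*"[^>]*>\s*</script>',
        re.I,
    ),
    re.compile(r'<ins[^>]*class="adsbygoogle"[^>]*>.*?</ins>', re.I | re.S),
]


def strip_matches(html, rules):
    """Return html with every match of every rule replaced by ''."""
    for rule in rules:
        html = rule.sub("", html)
    return html
```

Note that html has to be a str here, so the bytes returned by response.read() would need decoding first. Regular expressions are also a crude way to edit HTML, so treat this as a starting point only.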


#62539

From	Mark Lawrence <breamoreboy@yahoo.co.uk>
Date	2013-12-22 18:25 +0000
Message-ID	<mailman.4498.1387736733.18130.python-list@python.org>
In reply to	#62535

On 22/12/2013 17:20, em rexhepi wrote:
> I know it's my fault; I'm not a good programmer. I'm a beginner, which is why I need your help.
>
> I have a Python 3.3 project to finish. I did what I could, but there is not much help on Google about this topic.
>
> The project is to load a webpage from any website and filter the ads.
> I'm using ABPY library to filter, here is the link:
> https://github.com/atereshkin/abpy <- needs to be converted in python 3.x it is on 2.x
> easylist.txt link: https://easylist-downloads.adblockplus.org/easylist.txt
>
>
> When I use my code it just displays nothing
>
> My code:
> #!/usr/local/bin/python3.1
>
> import cgitb;cgitb.enable()
>
> import urllib.request
> response = urllib.request.build_opener()
> response.addheaders = [('User-agent', 'Mozilla/5.0')]
> response = urllib.request.urlopen("www.youtube.com";)
>
> html = response.read()
>
> from abpy import Filter
> with open("easylist.txt") as f:
> f = Filter(file('easylist.txt'))
> f.match(html)

What's the above meant to be doing?  You've opened easylist.txt as f and
then reassigned f, passing 'easylist.txt' to file(), which doesn't exist
in Python 3.

>
>
> print("Content-type: text/html")
> print()
> print (html)
>


-- 
My fellow Pythonistas, ask not what our language can do for you, ask 
what you can do for our language.

Mark Lawrence
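
One way to untangle the two uses of f is sketched below: open the file once with open(), give the file object its own name, and pass it to the filter class. The helper name `build_filter` is hypothetical, and the sketch assumes the filter class accepts a readable text file object, as the original Filter(file(...)) call suggests. Reading the rules as UTF-8 is also an assumption.

```python
def build_filter(path, filter_cls):
    """Open the rules file once and hand the open file to filter_cls.

    filter_cls is assumed to accept a readable text file object, as the
    original Filter(file(...)) call suggests; that is not verified here.
    """
    # open() replaces Python 2's file(); the with block closes the file
    # as soon as the filter has been constructed.
    with open(path, encoding="utf-8") as rules_file:
        return filter_cls(rules_file)
```

With abpy converted to Python 3, the call would be something like build_filter("easylist.txt", Filter). If the filter class reads the file lazily rather than in its constructor, the file would already be closed by then, and the later work would have to move inside the with block.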


#62551

From	MRAB <python@mrabarnett.plus.com>
Date	2013-12-22 20:28 +0000
Message-ID	<mailman.4508.1387744110.18130.python-list@python.org>
In reply to	#62535

On 22/12/2013 18:08, Michael Torrie wrote:
> On 12/22/2013 10:20 AM, em rexhepi wrote:
>> When I use my code it just displays nothing
>>
>> My code:
>> #!/usr/local/bin/python3.1
>>
>> import cgitb;cgitb.enable()
>>
>> import urllib.request
>> response = urllib.request.build_opener()
>> response.addheaders = [('User-agent', 'Mozilla/5.0')]
>> response = urllib.request.urlopen("www.youtube.com";)
>>
>> html = response.read()
>>
>> from abpy import Filter
>> with open("easylist.txt") as f:
>> f = Filter(file('easylist.txt'))
>> f.match(html)
>
> What happens when you comment out the above four lines?  Does the web
> page print without the filtering?  Just as a sanity check.  My hunch is
> that html has no data in it.
>
> Also what is "f.match(html)" supposed to return? Is it supposed to
> mutate html (seems unlikely) or does it return something? Looking at the
> source code, match() does not return anything, but prints to stdout,
> which is weird, but at least that tells us that it doesn't actually
> change the html object.
>
>> print("Content-type: text/html")
>> print()
>> print (html)
>
> I'm not sure you're doing this right.  abpy seems a bit goofy, but since
> f.match() does not appear to change html at all, you should get the same
> html out that urllib grabbed.  So if you're not getting any output, that
> means you're not getting the original html somehow.  Also, if f.match()
> is doing its thing, I don't think you want to print out html after the
> call, because f.match() is itself printing to stdout.
>
> Have you looked over the abpy source code?  I haven't bothered to run
> it, but a glance through the code would seem to indicate that it doesn't
> actually do the filtering at all, but rather just prints out the rules
> that the html code you provide would match.  I bet you could modify it
> to do filtering though.  Maybe add a method that uses rule.sub to
> replace the bad text with an empty string.
>
The urlopen call also contains a stray semicolon.


#62553

From	Terry Reedy <tjreedy@udel.edu>
Date	2013-12-22 16:13 -0500
Message-ID	<mailman.4510.1387746820.18130.python-list@python.org>
In reply to	#62535

On 12/22/2013 12:20 PM, em rexhepi wrote:
> I know it's my fault; I'm not a good programmer. I'm a beginner, which is why I need your help.
>
> I have a Python 3.3 project to finish. I did what I could, but there is not much help on Google about this topic.
>
> The project is to load a webpage from any website and filter the ads.
> I'm using ABPY library to filter, here is the link:
> https://github.com/atereshkin/abpy <- needs to be converted in python 3.x it is on 2.x
> easylist.txt link: https://easylist-downloads.adblockplus.org/easylist.txt
>
>
> When I use my code it just displays nothing
>
> My code:
> #!/usr/local/bin/python3.1

Please update your Python 3 if you are not in a straightjacket 
preventing you from doing so.

> import cgitb;cgitb.enable()

I suggest commenting this out and running normally in a console or Idle 
so you are guaranteed to see output, including error tracebacks. Only 
use cgi when this runs successfully in normal mode.

> import urllib.request
> response = urllib.request.build_opener()
> response.addheaders = [('User-agent', 'Mozilla/5.0')]
> response = urllib.request.urlopen("www.youtube.com";)

The ; inside the call is a SyntaxError, so Python exits before running
any line of the script, including cgitb.enable(). See above.


> html = response.read()
>
> from abpy import Filter
> with open("easylist.txt") as f:
> f = Filter(file('easylist.txt'))
> f.match(html)
>
>
> print("Content-type: text/html")
> print()
> print (html)
>


-- 
Terry Jan Reedy
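
For the fetching part, here is a sketch of what the urllib lines could look like once the stray semicolon is gone. It also fixes two other things in the quoted code: the opener was built and then bypassed by calling urllib.request.urlopen() directly, and the URL had no scheme. The function name `fetch` is hypothetical, and decoding the body as UTF-8 is an assumption.

```python
import urllib.request


def fetch(url, user_agent="Mozilla/5.0"):
    """Fetch url through an opener that sends the given User-agent."""
    opener = urllib.request.build_opener()
    opener.addheaders = [("User-agent", user_agent)]
    # Call open() on the opener itself; urllib.request.urlopen() would
    # ignore the header set above.
    with opener.open(url) as response:
        body = response.read()  # bytes, not str
    # Decoding as UTF-8 is an assumption; a real page may declare
    # another charset in its headers.
    return body.decode("utf-8", errors="replace")
```

Try it from a console first, for example print(fetch("http://www.youtube.com/")[:200]). A bare "www.youtube.com" without the http:// prefix raises ValueError. Printing the undecoded bytes would show a b'...' repr rather than the page itself, which is why the sketch decodes before returning.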
