Re: Antispam measures circumventing

Date Sat, 21 Sep 2013 01:44:17 +1000
From Chris Angelico <rosuav@gmail.com>

On Sat, Sep 21, 2013 at 1:04 AM, Jugurtha Hadjar
<jugurtha.hadjar@gmail.com> wrote:
> Supposing my name is John Doe and the e-mail is john.doe@hotmail.com, my
> e-mail was written like this:
>
> 'REMOVEMEjohn.doSPAMeSPAM@REMOVEMEhotmail.com'
>
> With a note saying to remove the capital letters.
>
> Now, I wrote this :
>
> for character in my_string:
> ...     if (character == character.upper()) and (character !='@') and (character != '.'):
> ...             my_string = my_string.replace(character,'')
>
>
> And the end result was john.doe@hotmail.com.
>
> Is there a better way to do that ?

Instead of matching the ones that are the same as their uppercase
version, why not keep the ones that are the same as their lowercase
version?

>>> email = 'REMOVEMEjohn.doSPAMeSPAM@REMOVEMEhotmail.com'
>>> ''.join(filter(lambda x: x==x.lower(),email))
'john.doe@hotmail.com'

This could be a neat introduction to a functional style of code, if
you haven't already met it; use of filter and lambda expressions can
make for some beautifully expressive code.
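
If filter and lambda feel unfamiliar, the same idea can be spelled a
couple of other ways. What follows is only a sketch: the function names
strip_uppercase and strip_uppercase_re are made up for illustration, and
the regex version assumes the padding is plain ASCII capitals (A-Z).
Note that digits and punctuation survive the first version, since they
are unchanged by lower().

```python
import re


def strip_uppercase(obfuscated):
    """Keep every character that lower() leaves unchanged."""
    # Generator-expression equivalent of filter(lambda x: x == x.lower(), ...)
    return ''.join(ch for ch in obfuscated if ch == ch.lower())


def strip_uppercase_re(obfuscated):
    """Delete ASCII capital letters (assumes the padding is A-Z only)."""
    return re.sub(r'[A-Z]', '', obfuscated)


email = 'REMOVEMEjohn.doSPAMeSPAM@REMOVEMEhotmail.com'
print(strip_uppercase(email))
print(strip_uppercase_re(email))
```

Both calls should print the cleaned address. Which spelling to prefer is
mostly taste; the comprehension stays closest to the filter version
above.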

> Also, what would in your opinion make it *harder* for a non-human
> to retrieve the original e-mail address? Maybe a function with no
> inverse function ? Generating an image that can't be converted back
> to text, etc..

Ah, now you're getting into the realm of CAPTCHAs. I'll be quite frank
with you: Don't bother. Many MANY experts are already looking into it
- with various levels of success. Spammers are getting better and
better at harvesting addresses and solving CAPTCHAs, and your legit
users aren't getting that benefit, so you make it harder for the
humans while still possible for the bots. (And some CAPTCHAs are
solved by simply farming the jobs off to actual human beings (in
China, I think I heard) for a pittance each. There's fundamentally no
way to prevent that.) So your options are:

1) Call on someone else's code. Search the internet for ways of
concealing email addresses, pick one that isn't too much hassle to
legit users, and use it. I've seen quite a few that put the email
address in an image, one way or another; they tend to be a bit
annoying, but some aren't too bad.

2) Give up on protecting your address, and protect your inbox instead.
Get some good spam filtering, and let 'em send it all at you. I run a
local mail server for a few domains, and even with the filter set
conservatively enough to all but eliminate false positives, we see
only a handful of false negatives (according to my logs, 182 emails
reported as spam this week, across all domains and all accounts - most
accounts see <10 a week, a couple of them see maybe 20-30). And again,
you can call on someone else to do the work for you - sending all your
mail to gmail lets you take advantage of their filtering, for
instance.

But hey. If you want to play around with text processing, Python's a
good choice for it!

ChrisA
