


Antispam measures circumventing

Started by: Jugurtha Hadjar <jugurtha.hadjar@gmail.com>
First post: 2013-09-20 16:04 +0100
Last post: 2013-09-21 02:31 +1000
Articles: 4 (3 participants)




#54476 — Antispam measures circumventing

From: Jugurtha Hadjar <jugurtha.hadjar@gmail.com>
Date: 2013-09-20 16:04 +0100
Subject: Antispam measures circumventing
Message-ID: <mailman.184.1379689820.18130.python-list@python.org>

Hello,

# I posted this on the tutor list, but my message wasn't displayed


I shared some assembly code (microcontrollers) and I had a comment with 
my e-mail address for contact purposes.

Supposing my name is John Doe and the e-mail is john.doe@hotmail.com, my 
e-mail was written like this:

REMOVEMEjohn.doSPAMeSPAM@REMOVEMEhotmail.com

With a note saying to remove the capital letters.

Now, I wrote this :

for character in my_string:
    if (character == character.upper()) and (character != '@') and (character != '.'):
        my_string = my_string.replace(character, '')


And the end result was john.doe@hotmail.com.

Is there a better way to do that ? Without using regular expressions 
(Looked *really* ugly and it doesn't really make sense, unlike the few 
lines I've written, which are obvious even to a beginner like me).

I obviously don't like SPAM, but I just thought "If I were a spammer, 
how would I go about it".

Presumably, some algorithm for detecting the 
john<dot>doe<at>hotmail<dot>com pattern must exist.


Also, what would in your opinion make it *harder* for a non-human to 
retrieve the original e-mail address? Maybe a function with no inverse 
function ? Generating an image that can't be converted back to text, etc..

If this is off-topic, you can just answer the "what is a better way to 
do that" part.

Thanks,



-- 
~Jugurtha Hadjar,



#54487

From: Jussi Piitulainen <jpiitula@ling.helsinki.fi>
Date: 2013-09-20 19:23 +0300
Message-ID: <qotbo3ncvi6.fsf@ruuvi.it.helsinki.fi>
In reply to: #54476

Jugurtha Hadjar writes:

> Supposing my name is John Doe and the e-mail is john.doe@hotmail.com,
> my e-mail was written like this:
> 
> REMOVEMEjohn.doSPAMeSPAM@REMOVEMEhotmail.com'
> 
> With a note saying to remove the capital letters.
> 
> Now, I wrote this :
> 
> for character in my_string:
>     if (character == character.upper()) and (character != '@') and (character != '.'):
>         my_string = my_string.replace(character, '')

That does a lot of needless work, but I'll suggest other things
instead of expanding on this remark.

First, there's character.isupper() that will replace your entire
condition.

Second, there's ''.join(c for c in my_string if not c.isupper()).
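
A minimal sketch of these two suggestions, applied to the example address from the original post. The names `obfuscated` and `strip_uppercase` are made up for illustration and do not appear in the thread. Note that `isupper()` is false for digits and punctuation, so no special cases for '@' and '.' are needed.

```python
# The obfuscated example address from the original post.
obfuscated = "REMOVEMEjohn.doSPAMeSPAM@REMOVEMEhotmail.com"

def strip_uppercase(text):
    """Return text with every uppercase letter removed."""
    # str.isupper() is True only for cased uppercase characters,
    # so '@', '.', and digits pass through untouched.
    return "".join(c for c in text if not c.isupper())

print(strip_uppercase(obfuscated))  # john.doe@hotmail.com
```

Unlike the original loop, this builds the result in a single pass instead of calling `replace` on the whole string once per matching character.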

> And the end result was john.doe@hotmail.com.
> 
> Is there a better way to do that ? Without using regular expressions
> (Looked *really* ugly and it doesn't really make sense, unlike the few
> lines I've written, which are obvious even to a beginner like me).

I don't see how you get to consider '[A-Z]' ugly. (Python doesn't seem
to have the named character classes like '[[:upper:]]' that would do
more than ASCII in some regexp systems. I only looked very briefly.)

Third, here's a way - try help(str.translate) and help(str.maketrans)
or python.org for some details:

 >>> from string import ascii_uppercase
 >>> 'Ooh, CamelCase!'.translate(str.maketrans('', '', ascii_uppercase))
 'oh, amelase!'
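
The same idea applied to the address from the original post, as a sketch assuming Python 3 (where `str.maketrans` is available). `DROP_UPPER` and `deobfuscate` are hypothetical names chosen here; the table is built once and can be reused for many strings.

```python
from string import ascii_uppercase

# Translation table whose third argument lists characters to delete:
# here, every ASCII uppercase letter.
DROP_UPPER = str.maketrans("", "", ascii_uppercase)

def deobfuscate(text):
    """Return text with ASCII uppercase letters deleted."""
    return text.translate(DROP_UPPER)

print(deobfuscate("REMOVEMEjohn.doSPAMeSPAM@REMOVEMEhotmail.com"))
# john.doe@hotmail.com
```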

> I obviously don't like SPAM, but I just thought "If I were a spammer,
> how would I go about it".
> 
> Eventually, some algorithm of detecting the
> john<dot>doe<at>hotmail<dot>com must exist.
> 
> Also, what would in your opinion make it *harder* for a non-human to
> retrieve the original e-mail address? Maybe a function with no
> inverse function ? Generating an image that can't be converted back
> to text, etc..

Something meaningful: make it john.doeray@hotmail.com with a note to
"remove the female deer" for john.ray@hotmail.com, or "remove the drop
of golden sun" for "john.doe@hotmail.com". You may get a cease and
desist letter - much uglier than a simple regex - if you do literally
this, but you get the idea. I've seen people using "remove the animal"
or "remove the roman numeral".

(Put .invalid at the end, maybe. But I wish spam was against the law,
effectively.)



#54492

From: Chris Angelico <rosuav@gmail.com>
Date: 2013-09-21 02:30 +1000
Message-ID: <mailman.190.1379694610.18130.python-list@python.org>
In reply to: #54487

On Sat, Sep 21, 2013 at 2:23 AM, Jussi Piitulainen
<jpiitula@ling.helsinki.fi> wrote:
> (Put .invalid at the end, maybe. But I wish spam was against the law,
> effectively.)

Against what law, exactly? In what jurisdiction will you seek to
charge spammers? And who will track them down?

ChrisA



#54493

From: Chris Angelico <rosuav@gmail.com>
Date: 2013-09-21 02:31 +1000
Message-ID: <mailman.191.1379694695.18130.python-list@python.org>
In reply to: #54487

On Sat, Sep 21, 2013 at 2:23 AM, Jussi Piitulainen
<jpiitula@ling.helsinki.fi> wrote:
> Something meaningful: make it john.doeray@hotmail.com with a note to
> "remove the female deer" for john.ray@hotmail.com, or "remove the drop
> of golden sun" for "john.doe@hotmail.com".

This method can be quite effective. In fact, of all the suggestions
made so far, I'd say these are a few of my favorite techniques...

*ducks the rotten tomatoes*

ChrisA
