Re: Antispam measures circumventing

References <523C6402.7090501@gmail.com>
Date 2013-09-20 17:47 +0200
Subject Re: Antispam measures circumventing
From Vlastimil Brom <vlastimil.brom@gmail.com>
Newsgroups comp.lang.python
Message-ID <mailman.187.1379692075.18130.python-list@python.org>

2013/9/20 Jugurtha Hadjar <jugurtha.hadjar@gmail.com>:
> Hello,
> # I posted this on the tutor list, but my message wasn't displayed
> I shared some assembly code (microcontrollers) and I had a comment with my
> e-mail address for contact purposes.
> Supposing my name is John Doe and the e-mail is john.doe@hotmail.com, my
> e-mail was written like this:
> REMOVEMEjohn.doSPAMeSPAM@REMOVEMEhotmail.com
> With a note saying to remove the capital letters.
> Now, I wrote this :
> >>> for character in my_string:
> ...     if (character == character.upper()) and (character != '@') and (character != '.'):
> ...         my_string = my_string.replace(character, '')
> And the end result was john.doe@hotmail.com.
> Is there a better way to do that ? Without using regular expressions (Looked
> *really* ugly and it doesn't really make sense, unlike the few lines I've
> written, which are obvious even to a beginner like me).
> I obviously don't like SPAM, but I just thought "If I were a spammer, how
> would I go about it".
> Eventually, some algorithm of detecting the john<dot>doe<at>hotmail<dot>com
> must exist.
> retrieve the original e-mail address? Maybe a function with no inverse
> function ? Generating an image that can't be converted back to text, etc..
> If this is off-topic, you can just answer the "what is a better way to do
> that" part.
>
> Thanks,
> --
> ~Jugurtha Hadjar,


Hi,
Is the regex really that bad for such a simple replacement?

>>> import re
>>> re.sub(r"[A-Z]", "", "REMOVEMEjohn.doSPAMeSPAM@REMOVEMEhotmail.com")
'john.doe@hotmail.com'

Alternatively, you can use a check with the string method isupper():
>>> "".join(char for char in "REMOVEMEjohn.doSPAMeSPAM@REMOVEMEhotmail.com" if not char.isupper())
'john.doe@hotmail.com'
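
One difference between the two approaches above is worth keeping in mind: the regex class [A-Z] only matches ASCII capitals, while isupper() also reports True for non-ASCII uppercase letters. A minimal sketch, assuming Python 3 strings (the helper names and the sample address are made up for illustration):

```python
import re

def strip_ascii_capitals(text):
    # Removes only the 26 ASCII capitals A-Z.
    return re.sub(r"[A-Z]", "", text)

def strip_any_uppercase(text):
    # Removes every character that str.isupper() reports as uppercase,
    # including non-ASCII letters such as "\u00c9" (E with acute accent).
    return "".join(ch for ch in text if not ch.isupper())

# Hypothetical input with one non-ASCII capital at the front.
sample = "\u00c9COLEjohn@example.com"

print(strip_ascii_capitals(sample))   # the accented capital survives
print(strip_any_uppercase(sample))    # the accented capital is removed too
```

For plain ASCII addresses like the one in the question, both give the same result.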

or, on Python 2, use the two-argument form of str.translate(), passing None as the table and the characters to delete as the second argument:
>>> "REMOVEMEjohn.doSPAMeSPAM@REMOVEMEhotmail.com".translate(None, "ABCDEFGHIJKLMNOPQRSTUVWXYZ")
'john.doe@hotmail.com'

which is the same as:
>>> import string
>>> "REMOVEMEjohn.doSPAMeSPAM@REMOVEMEhotmail.com".translate(None, string.ascii_uppercase)
'john.doe@hotmail.com'
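
On Python 3, str.translate() accepts only a single table argument, so the translate(None, ...) calls do not work there. A rough equivalent, offered as a sketch rather than as part of the original answer, builds a deletion table with str.maketrans() (the function name strip_ascii_uppercase is made up for illustration):

```python
import string

def strip_ascii_uppercase(text):
    # The third argument of str.maketrans lists characters to delete;
    # the first two (empty here) would map characters one-to-one.
    table = str.maketrans("", "", string.ascii_uppercase)
    return text.translate(table)

print(strip_ascii_uppercase("REMOVEMEjohn.doSPAMeSPAM@REMOVEMEhotmail.com"))
# prints: john.doe@hotmail.com
```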

Another possibility would be to use ord(...), since the code points 65 to 90 are the ASCII capitals A to Z:
>>> "".join(char for char in "REMOVEMEjohn.doSPAMeSPAM@REMOVEMEhotmail.com" if ord(char) not in range(65, 91))
'john.doe@hotmail.com'
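
If the bare numbers 65 and 91 look too cryptic, the same ord() idea can be sketched with the bounds spelled out via ord("A") and ord("Z"). This is only an illustrative variant; the function name is hypothetical:

```python
def strip_by_code_point(text):
    # ord("A") is 65 and ord("Z") is 90, the same bounds as range(65, 91).
    first, last = ord("A"), ord("Z")
    # Keep every character whose code point falls outside A..Z.
    return "".join(ch for ch in text if not first <= ord(ch) <= last)

print(strip_by_code_point("REMOVEMEjohn.doSPAMeSPAM@REMOVEMEhotmail.com"))
# prints: john.doe@hotmail.com
```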

Well, there may be other possibilities; the ones above are listed
roughly in the order of my personal preference. Of course, others may
differ...

hth,
   vbr
