Strategy to Verify Python Program is POST'ing to a web server.

Started by	"mzagursk@gmail.com" <mzagursk@gmail.com>
First post	2011-06-18 04:34 -0700
Last post	2011-06-19 05:18 -0700
Articles	14 — 10 participants

  Strategy to Verify Python Program is POST'ing to a web server. "mzagursk@gmail.com" <mzagursk@gmail.com> - 2011-06-18 04:34 -0700
    Re: Strategy to Verify Python Program is POST'ing to a web server. Eden Kirin <eden@bicikl.> - 2011-06-18 14:32 +0200
    Re: Strategy to Verify Python Program is POST'ing to a web server. Michael Hrivnak <mhrivnak@hrivnak.org> - 2011-06-18 13:05 -0400
    Re: Strategy to Verify Python Program is POST'ing to a web server. Chris Angelico <rosuav@gmail.com> - 2011-06-19 03:26 +1000
    Re: Strategy to Verify Python Program is POST'ing to a web server. Tim Roberts <timr@probo.com> - 2011-06-18 12:37 -0700
    Re: Strategy to Verify Python Program is POST'ing to a web server. Michael Hrivnak <mhrivnak@hrivnak.org> - 2011-06-18 16:40 -0400
      Re: Strategy to Verify Python Program is POST'ing to a web server. Gregory Ewing <greg.ewing@canterbury.ac.nz> - 2011-06-19 12:38 +1200
        Re: Strategy to Verify Python Program is POST'ing to a web server. Chris Angelico <rosuav@gmail.com> - 2011-06-19 10:54 +1000
    Re: Strategy to Verify Python Program is POST'ing to a web server. Paul Rubin <no.email@nospam.invalid> - 2011-06-18 14:03 -0700
    Re: Strategy to Verify Python Program is POST'ing to a web server. Terry Reedy <tjreedy@udel.edu> - 2011-06-18 17:17 -0400
    Re: Strategy to Verify Python Program is POST'ing to a web server. Chris Angelico <rosuav@gmail.com> - 2011-06-19 09:12 +1000
    Re: Strategy to Verify Python Program is POST'ing to a web server. Nobody <nobody@nowhere.com> - 2011-06-19 05:47 +0100
      Re: Strategy to Verify Python Program is POST'ing to a web server. Steven D'Aprano <steve+comp.lang.python@pearwood.info> - 2011-06-19 12:03 +0000
        Re: Strategy to Verify Python Program is POST'ing to a web server. Paul Rubin <no.email@nospam.invalid> - 2011-06-19 05:18 -0700

#7900 — Strategy to Verify Python Program is POST'ing to a web server.

From	"mzagursk@gmail.com" <mzagursk@gmail.com>
Date	2011-06-18 04:34 -0700
Subject	Strategy to Verify Python Program is POST'ing to a web server.
Message-ID	<d8c7dc52-0c54-4b29-a7b6-bcd833686611@q12g2000prb.googlegroups.com>

Hello Folks,

I am wondering what your strategies are for ensuring that data
transmitted to a website via a python program is indeed from that
program, and not from someone submitting POST data using some other
means.  I find it likely that there is no solution, in which case what
is the best solution for sending data to a remote server from a python
program and ensuring that it is from that program?

For example, if I create a website that tracks some sort of
statistical information and don't ensure that my program is the one
that is uploading it, the statistics can be thrown off by people
entering false POST data onto the data upload page.  Any remedy?

Thanks

#7902

From	Eden Kirin <eden@bicikl.>
Date	2011-06-18 14:32 +0200
Message-ID	<iti5t8$ffl$1@nntp.amis.hr>
In reply to	#7900

On 18.06.2011 13:34, mzagursk@gmail.com wrote:
> Hello Folks,
>
> I am wondering what your strategies are for ensuring that data
> transmitted to a website via a python program is indeed from that
> program, and not from someone submitting POST data using some other
> means.  I find it likely that there is no solution, in which case what
> is the best solution for sending data to a remote server from a python
> program and ensuring that it is from that program?
>
> For example, if I create a website that tracks some sort of
> statistical information and don't ensure that my program is the one
> that is uploading it, the statistics can be thrown off by people
> entering false POST data onto the data upload page.  Any remedy?

Include a hash check in a hidden field.

For example, from your python program you will include hidden fields 
random_number and hash:

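import hashlib
import random

my_secret_key = "MySecretKey"  # shared with the server
random_number = "%f" % random.random()
# sha1() takes bytes on Python 3, so encode the string before hashing
hash = hashlib.sha1(("%s %s" % (my_secret_key, random_number)).encode("utf-8")).hexdigest()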

On the server side check hash with random_number and secret key to 
ensure the data is POSTed from your application.
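
A minimal sketch of what that server-side check might look like, assuming the submitted form fields arrive as a dict. The helper names make_hash and is_valid_post are hypothetical and not part of any framework; the hash construction mirrors the client code above.

```python
import hashlib
import hmac
import random

SECRET_KEY = "MySecretKey"  # shared secret from the post; placeholder value


def make_hash(secret_key, random_number):
    """Recompute the hash the same way the client does."""
    text = "%s %s" % (secret_key, random_number)
    return hashlib.sha1(text.encode("utf-8")).hexdigest()


def is_valid_post(fields, secret_key=SECRET_KEY):
    """Return True if the hidden fields are consistent with the secret.

    `fields` is assumed to be a dict of the submitted form fields.
    """
    random_number = fields.get("random_number", "")
    submitted = fields.get("hash", "")
    expected = make_hash(secret_key, random_number)
    # compare_digest avoids leaking how much of the value matched.
    return hmac.compare_digest(expected.encode("utf-8"),
                               submitted.encode("utf-8"))


# Client side, as in the post:
random_number = "%f" % random.random()
good = {"random_number": random_number,
        "hash": make_hash(SECRET_KEY, random_number)}
forged = {"random_number": random_number, "hash": "0" * 40}
```

Note that this only shows the sender knows the secret key. Anyone who extracts the key from the client can produce valid fields, and a captured random_number/hash pair can be replayed unless the server remembers the numbers it has already seen.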

-- 
www.vikendi.com -/- www.svimi.net

#7915

From	Michael Hrivnak <mhrivnak@hrivnak.org>
Date	2011-06-18 13:05 -0400
Message-ID	<mailman.125.1308416728.1164.python-list@python.org>
In reply to	#7900

Authentication by client SSL certificate is best.

You should also look into restricting access on the server side by IP address.

Michael

On Sat, Jun 18, 2011 at 7:34 AM, mzagursk@gmail.com <mzagursk@gmail.com> wrote:
> Hello Folks,
>
> I am wondering what your strategies are for ensuring that data
> transmitted to a website via a python program is indeed from that
> program, and not from someone submitting POST data using some other
> means.  I find it likely that there is no solution, in which case what
> is the best solution for sending data to a remote server from a python
> program and ensuring that it is from that program?
>
> For example, if I create a website that tracks some sort of
> statistical information and don't ensure that my program is the one
> that is uploading it, the statistics can be thrown off by people
> entering false POST data onto the data upload page.  Any remedy?
>
> Thanks

#7917

From	Chris Angelico <rosuav@gmail.com>
Date	2011-06-19 03:26 +1000
Message-ID	<mailman.127.1308417979.1164.python-list@python.org>
In reply to	#7900

On Sat, Jun 18, 2011 at 9:34 PM, mzagursk@gmail.com <mzagursk@gmail.com> wrote:
> I am wondering what your strategies are for ensuring that data
> transmitted to a website via a python program is indeed from that
> program, and not from someone submitting POST data using some other
> means.  I find it likely that there is no solution, in which case what
> is the best solution for sending data to a remote server from a python
> program and ensuring that it is from that program?

You're correct there: there is no solution. Everything on the other
side of your network cable should be treated as hostile and spoofed.
But the real question is, how much effort are people likely to go to
to avoid using your program?

SSL certificates are good, but they can be stolen (very easily if the
client is open source). Anything algorithmic suffers from the same
issue.

In the example you gave, there's no solution. Someone could easily
spoof it and stuff the ballot. But if you make that more difficult
than the survey is worth, then you can largely trust your data.

The other common reason for wanting to be sure that the far end really
is your script is when you're trusting the client to do data
validation. There's a solution to that one: repeat the validation on
the server, and then it doesn't matter if they use your program or
not. (And before you cry "Isn't that obvious?", a lot of people have
completely missed that point.) In neither case can you prove what
program was on the far end. You're working with network packets, so
anything can be spoofed. You could go a long way toward it, though, by
using something ridiculously complex, such as:

* Client connects via SSL to host, using a known certificate.
* Server verifies certificate, and sends client some Python code to execute.
* Client verifies the server's certificate (vital!).
* Client executes the code it's given, and based on the result, plus
some other data, sends the server a hash value.
* Server executes the same code it gave the client, knows the data it
was working with, and calculates the equivalent hash.
* If the two hashes match, the client is deemed to be valid.

This is a variant of the usual nonce-based hashing systems, where the
nonce in question is actually executable code. By randomizing the
code, you can make it difficult for any non-Python program to
duplicate the hash algorithm. But it still won't provide certainty, by
any means.
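
For comparison, here is a rough sketch of the usual nonce-based scheme this is a variant of, with a plain random nonce instead of executable code. It assumes a pre-shared key and uses HMAC-SHA256 from the standard library; the Server class, client_tag function and SHARED_KEY value are hypothetical names made up for illustration.

```python
import hashlib
import hmac
import secrets

SHARED_KEY = b"hypothetical-shared-key"  # placeholder, known to both ends


def client_tag(key, nonce, payload):
    """Client side: bind the payload to the server's nonce with an HMAC."""
    message = nonce.encode("ascii") + b"\n" + payload
    return hmac.new(key, message, hashlib.sha256).digest()


class Server:
    def __init__(self, key):
        self._key = key
        self._pending = set()  # nonces issued but not yet used

    def issue_nonce(self):
        nonce = secrets.token_hex(16)
        self._pending.add(nonce)
        return nonce

    def verify(self, nonce, payload, tag):
        if nonce not in self._pending:
            return False  # unknown or already-used nonce
        self._pending.discard(nonce)  # each nonce is good for one attempt
        expected = client_tag(self._key, nonce, payload)
        return hmac.compare_digest(expected, tag)
```

As with the scheme above, this proves only that the far end holds the key, not which program it is running.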

I've spent quite a bit of time this past fortnight explaining some of
these concepts to my boss and one of my coworkers; they were building
a rather elaborate system but didn't realise that, apart from
requiring about three times as much data from /dev/random, it wasn't
materially different from a simple SSL cert check...

Chris Angelico

#7920

From	Tim Roberts <timr@probo.com>
Date	2011-06-18 12:37 -0700
Message-ID	<bivpv6letvb71oa37tpip6679b0a6ntr9f@4ax.com>
In reply to	#7900

"mzagursk@gmail.com" <mzagursk@gmail.com> wrote:
>
>For example, if I create a website that tracks some sort of
>statistical information and don't ensure that my program is the one
>that is uploading it, the statistics can be thrown off by people
>entering false POST data onto the data upload page.  Any remedy?

The amount of protection you need to take depends on what the cost of
interference will be, and how likely it is to be spoofed.  How will people
find out about your interface?  If they found out about it, what would they
gain by spoofing it?
-- 
Tim Roberts, timr@probo.com
Providenza & Boekelheide, Inc.

#7923

From	Michael Hrivnak <mhrivnak@hrivnak.org>
Date	2011-06-18 16:40 -0400
Message-ID	<mailman.129.1308429616.1164.python-list@python.org>
In reply to	#7900

On Sat, Jun 18, 2011 at 1:26 PM, Chris Angelico <rosuav@gmail.com> wrote:
> SSL certificates are good, but they can be stolen (very easily if the
> client is open source). Anything algorithmic suffers from the same
> issue.

This is only true if you distribute your app with one built-in
certificate, which does indeed seem like a bad idea.  When you know
your user base though, especially if this is a situation with a small
number of deployments, then you can distribute a unique certificate to
each client, signed by your CA.  Not knowing what kind of statistics
the OP is trying to collect, we really don't know if this client will
be running in one place or thousands.

Even if there will be thousands of deployments, you could generate an
RSA key-pair on the client similar to how an ssh client does, and use
that to sign the data.  Then you can at least track which client each
submission came from (storing the public key and IP address), and then
remove submissions as necessary if you detect abuse.
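
A rough sketch of the bookkeeping side of that idea. The standard library has no RSA signing, so this stand-in gives each client its own HMAC secret instead of an RSA key pair; the tracking and purging logic would look much the same with real signatures. SubmissionStore and its method names are hypothetical.

```python
import hashlib
import hmac
import secrets
from collections import defaultdict


class SubmissionStore:
    def __init__(self):
        self._keys = {}                 # client_id -> per-client secret
        self._rows = defaultdict(list)  # client_id -> [(ip, data), ...]

    def register(self):
        """Enroll a new client and hand back its id and secret."""
        client_id = secrets.token_hex(8)
        key = secrets.token_bytes(32)
        self._keys[client_id] = key
        return client_id, key

    def submit(self, client_id, ip, data, tag):
        """Store the data only if the tag matches this client's secret."""
        key = self._keys.get(client_id)
        if key is None:
            return False
        expected = hmac.new(key, data, hashlib.sha256).digest()
        if not hmac.compare_digest(expected, tag):
            return False
        self._rows[client_id].append((ip, data))
        return True

    def purge(self, client_id):
        """Drop an abusive client's key and everything it submitted."""
        self._keys.pop(client_id, None)
        return len(self._rows.pop(client_id, []))

    def all_data(self):
        return [data for rows in self._rows.values() for _ip, data in rows]
```

The point of keying everything by client is that abuse from one deployment can be removed without discarding everyone else's submissions.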

> In the example you gave, there's no solution. Someone could easily
> spoof it and stuff the ballot. But if you make that more difficult
> than the survey is worth, then you can largely trust your data.
>
> You could go a long way toward it, though, by
> using something ridiculously complex, such as:
>
> * Client connects via SSL to host, using a known certificate.
> * Server verifies certificate, and sends client some Python code to execute.
> * Client verifies the server's certificate (vital!).
> * Client executes the code it's given, and based on the result, plus
> some other data, sends the server a hash value.
> * Server executes the same code it gave the client, knows the data it
> was working with, and calculates the equivalent hash.
> * If the two hashes match, the client is deemed to be valid.

An authentication process that involves the client executing code
supplied by the server opens up one single point of failure (server is
compromised or man-in-the-middle attack is happening) by which
arbitrary code could get executed on the client.  Yikes!  It's ok to
execute server-supplied code in a sandbox (e.g. JavaScript), but I
would never want to use software that sends me code over the network
to be executed directly on my system (unless that's the express
purpose of the software, like celery).  Besides, it seems that all
you've accomplished is verifying that the client can execute python
code and you've made it a bit less convenient to attack.

The TLS handshake really does verify that the client has a certificate
which has been previously signed by the CA.  If you can get signed
certs to each deployment, that is spectacular security that will serve
you well.  The above sounds a bit like you're trying to create a new
cipher based on exchanged code that gets executed.  I encourage you to
not reinvent the wheel, and stick with the ciphers that are already
standard in the SSL/TLS handshake.

If you cannot uniquely authenticate each client (either through a
signed cert or by having the user supply credentials interactively),
then you'll have to accept that you cannot trust the submitted data
100%, and just take measures to mitigate abuse.

Michael

#7934

From	Gregory Ewing <greg.ewing@canterbury.ac.nz>
Date	2011-06-19 12:38 +1200
Message-ID	<964unqFhfjU1@mid.individual.net>
In reply to	#7923

Michael Hrivnak wrote:
> Besides, it seems that all
> you've accomplished is verifying that the client can execute python
> code and you've made it a bit less convenient to attack.

And that only if the attacker isn't a Python programmer.
If he is, he's probably writing his attack program in
Python anyway. :-)

Although if you were devious, and you detected that such
an attack was in progress, you could lull him into a sense
of security and then send him some Python code to pwn his
machine...

-- 
Greg

#7935

From	Chris Angelico <rosuav@gmail.com>
Date	2011-06-19 10:54 +1000
Message-ID	<mailman.137.1308444874.1164.python-list@python.org>
In reply to	#7934

On Sun, Jun 19, 2011 at 10:38 AM, Gregory Ewing
<greg.ewing@canterbury.ac.nz> wrote:
> And that only if the attacker isn't a Python programmer.
> If he is, he's probably writing his attack program in
> Python anyway. :-)
>

I was thinking you'd have it call on various functions defined
elsewhere in the program, forcing him to pretty much have the whole
original code in there. :)

ChrisA

#7925

From	Paul Rubin <no.email@nospam.invalid>
Date	2011-06-18 14:03 -0700
Message-ID	<7xaadej1ig.fsf@ruckus.brouhaha.com>
In reply to	#7900

"mzagursk@gmail.com" <mzagursk@gmail.com> writes:
> For example, if I create a website that tracks some sort of
> statistical information and don't ensure that my program is the one
> that is uploading it, the statistics can be thrown off by people
> entering false POST data onto the data upload page.  Any remedy?

If you're concerned about unauthorized users posting random crap, the
obvious solution is to configure your web server to put password protection
on the page.

If you're saying AUTHORIZED users (those allowed to use the program to
post stuff) aren't trusted to not bypass the program, you've basically
got a DRM problem, especially if you think the users might
reverse-engineer the program to figure out the protocol.  The most
effective approaches generally involve delivering the program in the
form of a hardware product that's difficult to tamper with.  That's what
cable TV boxes amount to, for example.

What is the application, if you can say?  That might help get better
answers.

#7926

From	Terry Reedy <tjreedy@udel.edu>
Date	2011-06-18 17:17 -0400
Message-ID	<mailman.131.1308431848.1164.python-list@python.org>
In reply to	#7900

On 6/18/2011 7:34 AM, mzagursk@gmail.com wrote:
> Hello Folks,
>
> I am wondering what your strategies are for ensuring that data
> transmitted to a website via a python program is indeed from that
> program, and not from someone submitting POST data using some other
> means.  I find it likely that there is no solution, in which case what
> is the best solution for sending data to a remote server from a python
> program and ensuring that it is from that program?
>
> For example, if I create a website that tracks some sort of
> statistical information and don't ensure that my program is the one
> that is uploading it, the statistics can be thrown off by people
> entering false POST data onto the data upload page.  Any remedy?

You have not specified all the parameters of the problem. Are there a 
limited number of copies of your program or are they distributed freely? 
What about multiple votes from one program?

Corporate proxy votes (which are a legally important type of statistical 
information) work as follows. Each shareholder is mailed or emailed a 
'control number'. Attend stockholder meeting in person, mail proxy vote, 
or login with any browser with control number. Repeat votes by the same 
control id supersede previous vote. There should be a 'thank you for 
voting' response for each vote. I suspect IP addr. is recorded with vote 
too. I have not heard of specific problems with electronic proxy voting.
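
A small sketch of that control-number scheme, assuming the numbers have already been issued out of band. ProxyBallot and its method names are made up for illustration; the key behaviour is that a repeat vote replaces the earlier one instead of adding to it.

```python
class ProxyBallot:
    def __init__(self, control_numbers):
        self._valid = set(control_numbers)
        self._votes = {}  # control number -> (choice, ip)

    def cast(self, control_number, choice, ip=None):
        if control_number not in self._valid:
            return "unknown control number"
        # A repeat vote overwrites the previous one for this number.
        self._votes[control_number] = (choice, ip)
        return "thank you for voting"

    def tally(self):
        counts = {}
        for choice, _ip in self._votes.values():
            counts[choice] = counts.get(choice, 0) + 1
        return counts
```

Because each control number holds at most one vote, resubmitting cannot inflate the totals; the worst a holder can do is change their own vote.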

-- 
Terry Jan Reedy

#7929

From	Chris Angelico <rosuav@gmail.com>
Date	2011-06-19 09:12 +1000
Message-ID	<mailman.132.1308438742.1164.python-list@python.org>
In reply to	#7900

On Sun, Jun 19, 2011 at 6:40 AM, Michael Hrivnak <mhrivnak@hrivnak.org> wrote:
> On Sat, Jun 18, 2011 at 1:26 PM, Chris Angelico <rosuav@gmail.com> wrote:
>> SSL certificates are good, but they can be stolen (very easily if the
>> client is open source). Anything algorithmic suffers from the same
>> issue.
>
> This is only true if you distribute your app with one built-in
> certificate, which does indeed seem like a bad idea.  When you know
> your user base though, especially if this is a situation with a small
> number of deployments, then you can distribute a unique certificate to
> each client, signed by your CA.

That changes it from verifying the program to verifying the user. It's
a somewhat different beast, but it still leaves the possibility of
snagging the cert and using it in another program. Same with IP
address checks. You can't prove that the other end is a particular
program.

>> You could go a long way toward it, though, by
>> using something ridiculously complex, such as:
>> ...
>
> An authentication process that involves the client executing code
> supplied by the server opens up one single point of failure (server is
> compromised or man-in-the-middle attack is happening) by which
> arbitrary code could get executed on the client.  Yikes!

Yeah, hence the part of verifying the server's cert too. That one is a
bit safer though; nobody but you will have that certificate, so it's
not as easy to take and put into another program. But this whole
scheme was meant from the start to be ridiculous.

> If ...
> then you'll have to accept that you cannot trust the submitted data
> 100%, and just take measures to mitigate abuse.

I still stand by my original point, namely that the "if" on here is
superfluous, and the "then" is unconditional. But the measures you
describe _do_ reduce the likelihood significantly.

ChrisA

#7941

From	Nobody <nobody@nowhere.com>
Date	2011-06-19 05:47 +0100
Message-ID	<pan.2011.06.19.04.46.38.578000@nowhere.com>
In reply to	#7900

On Sat, 18 Jun 2011 04:34:55 -0700, mzagursk@gmail.com wrote:

> I am wondering what your strategies are for ensuring that data
> transmitted to a website via a python program is indeed from that
> program, and not from someone submitting POST data using some other
> means.

> Any remedy?

Supply the client with tamper-proof hardware containing a private key.

Either that, or just accept that it cannot be done. Compare the amount of
effort game developers put into trying to implement tamper-proofing in
software with how little success they've had.

#7949

From	Steven D'Aprano <steve+comp.lang.python@pearwood.info>
Date	2011-06-19 12:03 +0000
Message-ID	<4dfde576$0$30002$c3e8da3$5496439d@news.astraweb.com>
In reply to	#7941

On Sun, 19 Jun 2011 05:47:30 +0100, Nobody wrote:

> On Sat, 18 Jun 2011 04:34:55 -0700, mzagursk@gmail.com wrote:
> 
>> I am wondering what your strategies are for ensuring that data
>> transmitted to a website via a python program is indeed from that
>> program, and not from someone submitting POST data using some other
>> means.
> 
>> Any remedy?
> 
> Supply the client with tamper-proof hardware containing a private key.

Is that resistant to man-in-the-middle attacks by somebody with a packet 
sniffer watching the traffic between the device and the website?

> Either that, or just accept that it cannot be done. Compare the amount
> of effort game developers put into trying to implement tamper-proofing
> in software with how little success they've had.

Exactly.



-- 
Steven

#7950

From	Paul Rubin <no.email@nospam.invalid>
Date	2011-06-19 05:18 -0700
Message-ID	<7xtybmc8uq.fsf@ruckus.brouhaha.com>
In reply to	#7949

Steven D'Aprano <steve+comp.lang.python@pearwood.info> writes:
>> Supply the client with tamper-proof hardware containing a private key.
>
> Is that resistant to man-in-the-middle attacks by somebody with a packet 
> sniffer watching the traffic between the device and the website?

Sure, why not?  As long as the crypto is done properly, that is.

But, there is also the matter of securing the path from the data to the
hardware.  I don't have the impression that the OP has really thought
this through.
