
Pure Python Data Mangling or Encrypting

Started by: Randall Smith <randall@tnr.cc>
First post: 2015-06-23 14:02 -0500
Last post: 2015-06-25 14:41 -0500
Articles: 17 (6 participants)



Contents

  Pure Python Data Mangling or Encrypting Randall Smith <randall@tnr.cc> - 2015-06-23 14:02 -0500
    Re: Pure Python Data Mangling or Encrypting Steven D'Aprano <steve@pearwood.info> - 2015-06-24 21:36 +1000
      Re: Pure Python Data Mangling or Encrypting Grant Edwards <invalid@invalid.invalid> - 2015-06-24 14:02 +0000
        Re: Pure Python Data Mangling or Encrypting Emile van Sebille <emile@fenx.com> - 2015-06-24 08:52 -0700
          Re: Pure Python Data Mangling or Encrypting Grant Edwards <invalid@invalid.invalid> - 2015-06-24 16:16 +0000
            Re: Pure Python Data Mangling or Encrypting Chris Angelico <rosuav@gmail.com> - 2015-06-25 02:23 +1000
              Re: Pure Python Data Mangling or Encrypting Grant Edwards <invalid@invalid.invalid> - 2015-06-24 18:23 +0000
            Re: Pure Python Data Mangling or Encrypting Dennis Lee Bieber <wlfraed@ix.netcom.com> - 2015-06-24 21:24 -0400
        Re: Pure Python Data Mangling or Encrypting Chris Angelico <rosuav@gmail.com> - 2015-06-25 01:55 +1000
        Re: Pure Python Data Mangling or Encrypting Emile van Sebille <emile@fenx.com> - 2015-06-24 09:09 -0700
      Re: Pure Python Data Mangling or Encrypting Randall Smith <randall@tnr.cc> - 2015-06-24 13:20 -0500
        Re: Pure Python Data Mangling or Encrypting Grant Edwards <invalid@invalid.invalid> - 2015-06-24 18:29 +0000
          Re: Pure Python Data Mangling or Encrypting Randall Smith <randall@tnr.cc> - 2015-06-24 14:00 -0500
            Re: Pure Python Data Mangling or Encrypting Grant Edwards <invalid@invalid.invalid> - 2015-06-24 21:24 +0000
              Re: Pure Python Data Mangling or Encrypting Randall Smith <randall@tnr.cc> - 2015-06-24 18:13 -0500
      Re: Pure Python Data Mangling or Encrypting Dennis Lee Bieber <wlfraed@ix.netcom.com> - 2015-06-24 21:33 -0400
      Re: Pure Python Data Mangling or Encrypting Randall Smith <randall@tnr.cc> - 2015-06-25 14:41 -0500

#93068 — Pure Python Data Mangling or Encrypting

From: Randall Smith <randall@tnr.cc>
Date: 2015-06-23 14:02 -0500
Subject: Pure Python Data Mangling or Encrypting
Message-ID: <mailman.10.1435130890.3674.python-list@python.org>
Chunks of data (about 2MB) are to be stored on machines using a 
peer-to-peer protocol.  The recipient of these chunks can't assume that 
the payload is benign.  While the data senders are supposed to encrypt 
data, that's not guaranteed, and I'd like to protect the recipient 
against exposure to nefarious data by mangling or encrypting the data 
before it is written to disk.

My original idea was for the recipient to encrypt using AES.  But I want 
to keep this software pure Python "batteries included" and not require 
installation of other platform-dependent software.  Pure Python AES and 
even DES are just way too slow.  I don't know that I really need 
encryption here; what I need is some type of fast mangling algorithm 
where a bad actor sending a payload can't guess the output ahead of time.
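
One possible direction, sketched with the standard library only: XOR the chunk with a keystream derived from a per-node secret and a per-chunk nonce. This assumes Python 3.6 or later for `hashlib.shake_256`. The names `xor_mangle`, `node_secret`, and `nonce` are hypothetical, and this is an unvetted obfuscation sketch, not a substitute for a reviewed cipher.

```python
import hashlib
import os

def xor_mangle(chunk, node_secret, nonce):
    """XOR chunk with a SHAKE-256 keystream; applying it twice restores the input."""
    keystream = hashlib.shake_256(node_secret + nonce).digest(len(chunk))
    # Big-integer XOR keeps the per-byte work in C rather than in a Python loop.
    mixed = int.from_bytes(chunk, "big") ^ int.from_bytes(keystream, "big")
    return mixed.to_bytes(len(chunk), "big")

node_secret = os.urandom(32)  # assumption: generated once and kept on the node
nonce = os.urandom(16)        # assumption: stored next to the mangled chunk
stored = xor_mangle(b"example payload", node_secret, nonce)
assert xor_mangle(stored, node_secret, nonce) == b"example payload"
```

Because the sender never sees `node_secret`, the bytes that land on disk cannot be chosen ahead of time by the sender.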

Any ideas are appreciated.  Thanks.

-Randall



#93073

From: Steven D'Aprano <steve@pearwood.info>
Date: 2015-06-24 21:36 +1000
Message-ID: <558a9649$0$1675$c3e8da3$5496439d@news.astraweb.com>
In reply to: #93068
On Wed, 24 Jun 2015 05:02 am, Randall Smith wrote:

> Chunks of data (about 2MB) are to be stored on machines using a
> peer-to-peer protocol.  The recipient of these chunks can't assume that
> the payload is benign.  While the data senders are supposed to encrypt
> data, that's not guaranteed, and I'd like to protect the recipient
> against exposure to nefarious data by mangling or encrypting the data
> before it is written to disk.

I don't understand how mangling the data is supposed to protect the
recipient. Don't they have the ability to unmangle the data, and thus expose
themselves to whatever nasties are in the files?

If not, you can save all that time and effort implementing the peer-to-peer
business and just dump 2MB chunks of random data on their disks.


> My original idea was for the recipient to encrypt using AES.  But I want
> to keep this software pure Python "batteries included" and not require
> installation of other platform-dependent software.  Pure Python AES and
> even DES are just way too slow.  I don't know that I really need
> encryption here, but some type of fast mangling algorithm where a bad
> actor sending a payload can't guess the output ahead of time.

Again, I don't understand your threat model here. Why does the bad actor
need to guess the mangling? Putting on my Black Hat and twirling my
moustache wickedly, I decide to send you a JPG of Goatse. (Don't google
it.) Or, a more serious threat, a zip bomb:

http://www.ghacks.net/2008/07/27/42-kilobytes-unzipped-make-45-petabytes/

or malware of some description. So I P2P you the file. How it gets encrypted
on your disk is irrelevant to me; eventually you're going to decrypt it
and try to access it.

We need to understand what threat you are defending against before we can
advise you.



-- 
Steven



#93079

From: Grant Edwards <invalid@invalid.invalid>
Date: 2015-06-24 14:02 +0000
Message-ID: <mmed90$nho$1@reader1.panix.com>
In reply to: #93073
On 2015-06-24, Steven D'Aprano <steve@pearwood.info> wrote:
> On Wed, 24 Jun 2015 05:02 am, Randall Smith wrote:
>
>> Chunks of data (about 2MB) are to be stored on machines using a
>> peer-to-peer protocol.  The recipient of these chunks can't assume that
>> the payload is benign.  While the data senders are supposed to encrypt
>> data, that's not guaranteed, and I'd like to protect the recipient
>> against exposure to nefarious data by mangling or encrypting the data
>> before it is written to disk.
>
> I don't understand how mangling the data is supposed to protect the
> recipient. Don't they have the ability unmangle the data, and thus expose
> themselves to whatever nasties are in the files?

And how does writing unmangled data to disk expose anybody to
anything?  I've never heard of an exploit where writing an evilly
crafted bit-pattern to disk causes any sort of problem.

-- 
Grant Edwards               grant.b.edwards        Yow! My mind is making
                                  at               ashtrays in Dayton ...
                              gmail.com            



#93082

From: Emile van Sebille <emile@fenx.com>
Date: 2015-06-24 08:52 -0700
Message-ID: <mailman.18.1435161161.3674.python-list@python.org>
In reply to: #93079
On 6/24/2015 7:02 AM, Grant Edwards wrote:
> And how does writing unmangled data to disk expose anybody to
> anything?  I've never heard of an exploit where writing an evilly
> crafted bit-pattern to disk causes a any sort of problem.

Unless that code is executed at boot.  Mangling would at least prevent 
it from executing.

Emile





#93085

From: Grant Edwards <invalid@invalid.invalid>
Date: 2015-06-24 16:16 +0000
Message-ID: <mmel55$ojb$1@reader1.panix.com>
In reply to: #93082
On 2015-06-24, Emile van Sebille <emile@fenx.com> wrote:
> On 6/24/2015 7:02 AM, Grant Edwards wrote:
>> And how does writing unmangled data to disk expose anybody to
>> anything?  I've never heard of an exploit where writing an evilly
>> crafted bit-pattern to disk causes a any sort of problem.
>
> Unless that code is executed at boot.

Don't write it somewhere where that might happen.  [Of course you
don't let a remote user determine where the untrusted data gets
written -- that would be completely beyond the pale.] Or does Windows
pick files at random from the disk and execute them?

> Mangling would at least prevent it from executing.

If you don't want a file to be executed, then don't make it
executable.  Or doesn't Windows have any way to control whether a file
is executable or not?

-- 
Grant Edwards               grant.b.edwards        Yow! You were s'posed
                                  at               to laugh!
                              gmail.com            



#93086

From: Chris Angelico <rosuav@gmail.com>
Date: 2015-06-25 02:23 +1000
Message-ID: <mailman.21.1435163021.3674.python-list@python.org>
In reply to: #93085
On Thu, Jun 25, 2015 at 2:16 AM, Grant Edwards <invalid@invalid.invalid> wrote:
> On 2015-06-24, Emile van Sebille <emile@fenx.com> wrote:
>> Mangling would at least prevent it from executing.
>
> If you don't want a file to be executed, then don't make it
> executable.  Or doesn't Windows have any way to control whether a file
> is executable or not?

Windows doesn't have the Unix file system concept of execute
permission, no. If a file has the .exe extension and the first 512
bytes look like an appropriate header (MZ etc), Windows will happily
run it. With other extensions, similarly - just create a .bat file and
double-click it, it'll run the commands.
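
To make the header point concrete, here is a small sketch that checks whether a blob starts with the two-byte `MZ` signature that DOS and Windows executables begin with. The helper name `looks_like_windows_exe` is made up for illustration, and a real loader validates far more than these two bytes.

```python
MZ_MAGIC = b"MZ"

def looks_like_windows_exe(blob):
    """Return True if the blob starts with the DOS/PE 'MZ' signature."""
    return blob[:2] == MZ_MAGIC

print(looks_like_windows_exe(b"MZ\x90\x00\x03"))  # True
print(looks_like_windows_exe(b"\x89PNG\r\n"))     # False
```

Any mangling that changes the leading bytes makes a check like this fail, whatever the file is called.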

ChrisA



#93093

From: Grant Edwards <invalid@invalid.invalid>
Date: 2015-06-24 18:23 +0000
Message-ID: <mmesil$8a8$2@reader1.panix.com>
In reply to: #93086
On 2015-06-24, Chris Angelico <rosuav@gmail.com> wrote:
> On Thu, Jun 25, 2015 at 2:16 AM, Grant Edwards <invalid@invalid.invalid> wrote:
>> On 2015-06-24, Emile van Sebille <emile@fenx.com> wrote:
>>
>>> Mangling would at least prevent it from executing.
>>
>> If you don't want a file to be executed, then don't make it
>> executable.  Or doesn't Windows have any way to control whether a
>> file is executable or not?
>
> Windows doesn't have the Unix file system concept of execute
> permission, no. If a file has the .exe extension and the first 512
> bytes look like an appropriate header (MZ etc), Windows will happily
> run it. With other extensions, similarly - just create a .bat file
> and double-click it, it'll run the commands.

So you can prevent execution just by changing the filename?  Maybe 30
years using Unix has biased me, but that just seems so wrong...

-- 
Grant Edwards               grant.b.edwards        Yow! Here I am in the
                                  at               POSTERIOR OLFACTORY LOBULE
                              gmail.com            but I don't see CARL SAGAN
                                                   anywhere!!



#93111

From: Dennis Lee Bieber <wlfraed@ix.netcom.com>
Date: 2015-06-24 21:24 -0400
Message-ID: <mailman.38.1435195514.3674.python-list@python.org>
In reply to: #93085
On Wed, 24 Jun 2015 16:16:37 +0000 (UTC), Grant Edwards
<invalid@invalid.invalid> declaimed the following:


>If you don't want a file to be executed, then don't make it
>executable.  Or doesn't Windows have any way to control whether a file
>is executable or not?

	If you can specify a fully qualified path/extension, and if the target
is readable, it likely is executable...

	Text scripts less so if the interpreter is not associated with the
extension.

-- 
	Wulfraed                 Dennis Lee Bieber         AF6VN
    wlfraed@ix.netcom.com    HTTP://wlfraed.home.netcom.com/



#93083

From: Chris Angelico <rosuav@gmail.com>
Date: 2015-06-25 01:55 +1000
Message-ID: <mailman.19.1435161308.3674.python-list@python.org>
In reply to: #93079
On Thu, Jun 25, 2015 at 1:52 AM, Emile van Sebille <emile@fenx.com> wrote:
> On 6/24/2015 7:02 AM, Grant Edwards wrote:
>>
>> And how does writing unmangled data to disk expose anybody to
>> anything?  I've never heard of an exploit where writing an evilly
>> crafted bit-pattern to disk causes a any sort of problem.
>
>
> Unless that code is executed at boot.  Mangling would at least prevent it
> from executing.

Or it's on Windows. It's pretty easy to trick Windows into running
some code somewhere. But you can often disrupt that by simply renaming
the file to have no extension.

ChrisA



#93084

From: Emile van Sebille <emile@fenx.com>
Date: 2015-06-24 09:09 -0700
Message-ID: <mailman.20.1435162195.3674.python-list@python.org>
In reply to: #93079
On 6/24/2015 8:55 AM, Chris Angelico wrote:
> On Thu, Jun 25, 2015 at 1:52 AM, Emile van Sebille <emile@fenx.com> wrote:
>> On 6/24/2015 7:02 AM, Grant Edwards wrote:
>>>
>>> And how does writing unmangled data to disk expose anybody to
>>> anything?  I've never heard of an exploit where writing an evilly
>>> crafted bit-pattern to disk causes a any sort of problem.
>>
>>
>> Unless that code is executed at boot.  Mangling would at least prevent it
>> from executing.
>
> Or it's on Windows. It's pretty easy to trick Windows into running
> some code somewhere. But you can often disrupt that by simply renaming
> the file to have no extension.

ISTR that windows may look into the file to see if it can 'guess' the 
appropriate application, so dropping the extension may not be 
sufficient.  But maybe they've changed that as my windows experience 
doesn't run much past XP.

Emile




#93092

From: Randall Smith <randall@tnr.cc>
Date: 2015-06-24 13:20 -0500
Message-ID: <mailman.25.1435170024.3674.python-list@python.org>
In reply to: #93073
On 06/24/2015 06:36 AM, Steven D'Aprano wrote:
> I don't understand how mangling the data is supposed to protect the
> recipient. Don't they have the ability unmangle the data, and thus expose
> themselves to whatever nasties are in the files?

They never look at the data and wouldn't care to unmangle it.  The 
purpose is primarily to prevent automated software (file indexers, virus 
scanners) from doing bad things to the data.

-Randall



#93095

From: Grant Edwards <invalid@invalid.invalid>
Date: 2015-06-24 18:29 +0000
Message-ID: <mmesv6$8a8$3@reader1.panix.com>
In reply to: #93092
On 2015-06-24, Randall Smith <randall@tnr.cc> wrote:
> On 06/24/2015 06:36 AM, Steven D'Aprano wrote:
>
>> I don't understand how mangling the data is supposed to protect the
>> recipient. Don't they have the ability unmangle the data, and thus
>> expose themselves to whatever nasties are in the files?
>
> They never look at the data and wouldn't care to unmangle it. 

I obviously don't "get it". If the recipient is never going to look at
the data or unmangle it, why not convert every received file to a
single null byte?  That way you save on disk space as well --
especially if you just create links for all files after the initial
one.  ;)

[I suppose next you're going to tell me that Windows filesystems
don't support links.]

> The purpose is primarily to prevent automated software (file
> indexers, virus scanners) from doing bad things to the data.

Life under windows must be more tiresome than I imagined (or could
imagine) if you have to jump through such hoops to keep "automated
software" from doing bad things to your data files.

-- 
Grant Edwards               grant.b.edwards        Yow! My mind is making
                                  at               ashtrays in Dayton ...
                              gmail.com            



#93098

From: Randall Smith <randall@tnr.cc>
Date: 2015-06-24 14:00 -0500
Message-ID: <mailman.32.1435172453.3674.python-list@python.org>
In reply to: #93095
On 06/24/2015 01:29 PM, Grant Edwards wrote:
> On 2015-06-24, Randall Smith <randall@tnr.cc> wrote:
>> On 06/24/2015 06:36 AM, Steven D'Aprano wrote:
>>
>>> I don't understand how mangling the data is supposed to protect the
>>> recipient. Don't they have the ability unmangle the data, and thus
>>> expose themselves to whatever nasties are in the files?
>>
>> They never look at the data and wouldn't care to unmangle it.
>
> I obviously don't "get it". If the recipient is never going look at
> the data or unmangle it, why not convert every received file to a
> single null byte?  That way you save on disk space as well --
> especially if you just create links for all files after the initial
> one.  ;)
>
> [I supposed next you're going to tell me that Windows filesystems
> don't support links.]
>
>> The purpose is primarily to prevent automated software (file
>> indexers, virus scanners) from doing bad things to the data.
>
> Life under windows must be more tiresome than I imagined (or could
> imagine) if you have to jump through such hoops to keep "automated
> software" from doing bad things to your data files.
>

These are machines storing chunks of other people's data.  The data 
owner chunks a file, compresses and encrypts it, then sends it to 
several storage servers.  The storage server might be a Raspberry Pi 
with a USB disk or a Windows XP machine; I can't know which.


I don't use Windows and don't recommend it for this software. 
Nevertheless, many people do use it.

-Randall



#93104

From: Grant Edwards <invalid@invalid.invalid>
Date: 2015-06-24 21:24 +0000
Message-ID: <mmf76l$qg9$1@reader1.panix.com>
In reply to: #93098
On 2015-06-24, Randall Smith <randall@tnr.cc> wrote:
> On 06/24/2015 01:29 PM, Grant Edwards wrote:
>> On 2015-06-24, Randall Smith <randall@tnr.cc> wrote:
>>> On 06/24/2015 06:36 AM, Steven D'Aprano wrote:
>>>
>>>> I don't understand how mangling the data is supposed to protect the
>>>> recipient. Don't they have the ability unmangle the data, and thus
>>>> expose themselves to whatever nasties are in the files?
>>>
>>> They never look at the data and wouldn't care to unmangle it.
>>
>> I obviously don't "get it". If the recipient is never going look at
>> the data or unmangle it, why not convert every received file to a
>> single null byte?  That way you save on disk space as well --
>> especially if you just create links for all files after the initial
>> one.  ;)
>
> These are machines storing chunks of other people's data.  The data 
> owner chunks a file, compresses and encrypts it, then sends it to 
> several storage servers.  The storage server might be a Raspberry PI 
> with a USB disk or a Windows XP machine - I can't know which.

OK.  But if the recipient (the server) mangles the data and then never
unmangles or reads the data, there doesn't seem to be any point in
storing it.  I must be misunderstanding your statement that the data
is never read/unmangled.

-- 
Grant Edwards               grant.b.edwards        Yow! A can of ASPARAGUS,
                                  at               73 pigeons, some LIVE ammo,
                              gmail.com            and a FROZEN DAQUIRI!!



#93107

From: Randall Smith <randall@tnr.cc>
Date: 2015-06-24 18:13 -0500
Message-ID: <mailman.36.1435187595.3674.python-list@python.org>
In reply to: #93104
On 06/24/2015 04:24 PM, Grant Edwards wrote:

>
> OK.  But if the recipient (the server) mangles the data and then never
> unmangles or reads the data, there doesn't seem to be any point in
> storing it.  I must be misunderstanding your statement that the data
> is never read/unmangled.
>

When the storage server sends the data (on request), it decodes the data 
before sending.  I'm currently testing this on a Raspberry Pi, using a 
random substitution built with bytearray.maketrans and applied with 
bytearray.translate, and it is working quite well.
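
A minimal sketch of that kind of substitution, assuming each node generates its own random permutation of the 256 byte values once and keeps it private. The names `make_tables`, `mangle`, and `unmangle` are hypothetical, and a fixed byte substitution is obfuscation, not encryption.

```python
import random

def make_tables():
    """Return (encode, decode) translation tables for a random byte permutation."""
    plain = bytes(range(256))
    shuffled = bytearray(plain)
    # SystemRandom draws from the OS entropy source, so whoever uploads
    # the chunk cannot predict the permutation.
    random.SystemRandom().shuffle(shuffled)
    encode = bytes.maketrans(plain, bytes(shuffled))
    decode = bytes.maketrans(bytes(shuffled), plain)
    return encode, decode

def mangle(chunk, encode):
    # Applied before the chunk is written to disk.
    return chunk.translate(encode)

def unmangle(chunk, decode):
    # Applied before the chunk is sent back on request.
    return chunk.translate(decode)

encode, decode = make_tables()
chunk = b"MZ\x90\x00 example payload"
stored = mangle(chunk, encode)
assert unmangle(stored, decode) == chunk
```

Both directions are a single `translate` call, so the per-byte work happens in C, which is why this stays fast even on small hardware.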

Thanks.

-Randall



#93112

From: Dennis Lee Bieber <wlfraed@ix.netcom.com>
Date: 2015-06-24 21:33 -0400
Message-ID: <mailman.39.1435196032.3674.python-list@python.org>
In reply to: #93073
On Wed, 24 Jun 2015 13:20:07 -0500, Randall Smith <randall@tnr.cc>
declaimed the following:

>On 06/24/2015 06:36 AM, Steven D'Aprano wrote:
>> I don't understand how mangling the data is supposed to protect the
>> recipient. Don't they have the ability unmangle the data, and thus expose
>> themselves to whatever nasties are in the files?
>
>They never look at the data and wouldn't care to unmangle it.  The 
>purpose is primarily to prevent automated software (file indexers, virus 
>scanners) from doing bad things to the data.
>

	Which leads to the question: what is "doing bad things"?

	File indexers and virus scanners just read files. In the case of the
latter, if they detect a virus they tend to quarantine the file (or even
delete it depending upon settings). You want viruses to NOT BE detected?
Indexers read the file looking for strings to incorporate into the index;
they often skip known binary file types (what are the odds of finding
meaningful text in a JPEG file). And encryption isn't going to stop the
scanners from looking at the file anyway.

	If the goal is to avoid an indexer retrieving "classified" string
contents, the main solution is to turn off indexing on that drive.

	I still don't see anything /I/ would call a "recipient" here. "They
never look at the data and wouldn't care to unmangle it" describes junk
tossed into a waste basket -- stuff comes in but is never used again, and
when the basket is full someone has to empty it to make room for more junk.
-- 
	Wulfraed                 Dennis Lee Bieber         AF6VN
    wlfraed@ix.netcom.com    HTTP://wlfraed.home.netcom.com/



#93163

From: Randall Smith <randall@tnr.cc>
Date: 2015-06-25 14:41 -0500
Message-ID: <mailman.82.1435261314.3674.python-list@python.org>
In reply to: #93073
On 06/24/2015 08:33 PM, Dennis Lee Bieber wrote:
> On Wed, 24 Jun 2015 13:20:07 -0500, Randall Smith <randall@tnr.cc>
> declaimed the following:
>
>> On 06/24/2015 06:36 AM, Steven D'Aprano wrote:
>>> I don't understand how mangling the data is supposed to protect the
>>> recipient. Don't they have the ability unmangle the data, and thus expose
>>> themselves to whatever nasties are in the files?
>>
>> They never look at the data and wouldn't care to unmangle it.  The
>> purpose is primarily to prevent automated software (file indexers, virus
>> scanners) from doing bad things to the data.
>>
>
> 	Which leads to the question: what is "doing bad things".

Storage nodes are computers running the software under discussion.  Their 
job (as related to this software) is to accept and store the chunks of 
data they are sent (as recipient) and to send them back upon request.  So 
losing data is a bad thing.

The storage node software is cross-platform and should run on anything 
from a dedicated Raspberry Pi to an old Windows PC.  Data integrity is 
ensured using encryption and hashes generated by the original data 
owners.  Normally, a data chunk would look like random bytes, because it 
is encrypted.  However, the storage node cannot prevent the client 
(uploader) from sending unencrypted data.  The purpose of this 
obfuscation is to protect the storage node, as many potential users have 
expressed hesitation about storing other people's data.
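
As an illustration of the hash check mentioned above, here is a sketch that assumes the data owner publishes a SHA-256 hex digest alongside each chunk. The choice of SHA-256 is an assumption made for the example, and `chunk_digest` and `chunk_matches` are hypothetical helpers.

```python
import hashlib
import hmac

def chunk_digest(chunk):
    """Hex digest the data owner would record for a chunk (SHA-256 assumed)."""
    return hashlib.sha256(chunk).hexdigest()

def chunk_matches(chunk, expected_hex):
    """Check a returned chunk against the digest recorded by the data owner."""
    # compare_digest does a constant-time comparison of the two strings.
    return hmac.compare_digest(chunk_digest(chunk), expected_hex)

recorded = chunk_digest(b"abc")
print(chunk_matches(b"abc", recorded))  # True
print(chunk_matches(b"abd", recorded))  # False
```

The digest is computed over what the owner sent, so the node has to undo its own obfuscation before returning a chunk for the check to pass.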

Example: A storage node runs a Desktop OS with an image indexer. It 
receives an unencrypted nasty image or movie. The indexer picks it up 
and shows it in the person's image or movie "Library".

Does that clear things up?


-Randall


