| Started by | Randall Smith <randall@tnr.cc> |
|---|---|
| First post | 2015-06-24 13:36 -0500 |
| Last post | 2015-06-25 14:09 -0500 |
| Articles | 20 on this page of 97 — 19 participants |
This discussion starts older than the indexed window; earlier articles aren't shown. The article labeled Started by
below is the oldest one visible, not the original post.
Re: Pure Python Data Mangling or Encrypting Randall Smith <randall@tnr.cc> - 2015-06-24 13:36 -0500
Re: Pure Python Data Mangling or Encrypting Steven D'Aprano <steve@pearwood.info> - 2015-06-25 14:07 +1000
Re: Pure Python Data Mangling or Encrypting Devin Jeanpierre <jeanpierreda@gmail.com> - 2015-06-24 21:27 -0700
Re: Pure Python Data Mangling or Encrypting Steven D'Aprano <steve+comp.lang.python@pearwood.info> - 2015-06-25 19:25 +1000
Re: Pure Python Data Mangling or Encrypting Devin Jeanpierre <jeanpierreda@gmail.com> - 2015-06-25 02:41 -0700
Re: Pure Python Data Mangling or Encrypting Chris Angelico <rosuav@gmail.com> - 2015-06-25 19:57 +1000
Re: Pure Python Data Mangling or Encrypting Jon Ribbens <jon+usenet@unequivocal.co.uk> - 2015-06-25 10:03 +0000
Re: Pure Python Data Mangling or Encrypting Steven D'Aprano <steve@pearwood.info> - 2015-06-26 01:13 +1000
Re: Pure Python Data Mangling or Encrypting Jon Ribbens <jon+usenet@unequivocal.co.uk> - 2015-06-25 15:26 +0000
Re: Pure Python Data Mangling or Encrypting Randall Smith <randall@tnr.cc> - 2015-06-25 13:58 -0500
Re: Pure Python Data Mangling or Encrypting Chris Angelico <rosuav@gmail.com> - 2015-06-26 10:33 +1000
Re: Pure Python Data Mangling or Encrypting Jon Ribbens <jon+usenet@unequivocal.co.uk> - 2015-06-26 10:49 +0000
Re: Pure Python Data Mangling or Encrypting Ian Kelly <ian.g.kelly@gmail.com> - 2015-06-25 19:01 -0600
Re: Pure Python Data Mangling or Encrypting Steven D'Aprano <steve@pearwood.info> - 2015-06-27 03:06 +1000
Re: Pure Python Data Mangling or Encrypting Randall Smith <randall@tnr.cc> - 2015-06-26 15:09 -0500
Re: Pure Python Data Mangling or Encrypting Johannes Bauer <dfnsonfsduifb@gmx.de> - 2015-06-26 23:07 +0200
Re: Pure Python Data Mangling or Encrypting Jon Ribbens <jon+usenet@unequivocal.co.uk> - 2015-06-26 21:29 +0000
Re: Pure Python Data Mangling or Encrypting Mark Lawrence <breamoreboy@yahoo.co.uk> - 2015-06-26 22:55 +0100
Re: Pure Python Data Mangling or Encrypting Johannes Bauer <dfnsonfsduifb@gmx.de> - 2015-06-27 00:42 +0200
Re: Pure Python Data Mangling or Encrypting Devin Jeanpierre <jeanpierreda@gmail.com> - 2015-06-26 16:26 -0700
Re: Pure Python Data Mangling or Encrypting Jon Ribbens <jon+usenet@unequivocal.co.uk> - 2015-06-27 00:21 +0000
Re: Pure Python Data Mangling or Encrypting Randall Smith <randall@tnr.cc> - 2015-06-26 19:55 -0500
Re: Pure Python Data Mangling or Encrypting Johannes Bauer <dfnsonfsduifb@gmx.de> - 2015-06-27 07:24 +0200
Re: Pure Python Data Mangling or Encrypting Randall Smith <randall@tnr.cc> - 2015-06-26 19:12 -0500
Re: Pure Python Data Mangling or Encrypting Ian Kelly <ian.g.kelly@gmail.com> - 2015-06-26 15:58 -0600
Re: Pure Python Data Mangling or Encrypting Randall Smith <randall@tnr.cc> - 2015-06-26 19:23 -0500
Re: Pure Python Data Mangling or Encrypting Johannes Bauer <dfnsonfsduifb@gmx.de> - 2015-06-26 23:11 +0200
Re: Pure Python Data Mangling or Encrypting Michael Torrie <torriem@gmail.com> - 2015-06-27 11:02 -0600
Re: Pure Python Data Mangling or Encrypting Paul Rubin <no.email@nospam.invalid> - 2015-06-27 10:45 -0700
Re: Pure Python Data Mangling or Encrypting Steven D'Aprano <steve@pearwood.info> - 2015-06-27 13:38 +1000
Re: Pure Python Data Mangling or Encrypting Devin Jeanpierre <jeanpierreda@gmail.com> - 2015-06-26 21:05 -0700
Re: Pure Python Data Mangling or Encrypting Steven D'Aprano <steve@pearwood.info> - 2015-06-27 16:16 +1000
Re: Pure Python Data Mangling or Encrypting Devin Jeanpierre <jeanpierreda@gmail.com> - 2015-06-27 13:30 -0700
Re: Pure Python Data Mangling or Encrypting Steven D'Aprano <steve@pearwood.info> - 2015-06-28 11:18 +1000
Re: Pure Python Data Mangling or Encrypting Devin Jeanpierre <jeanpierreda@gmail.com> - 2015-06-27 19:11 -0700
Re: Pure Python Data Mangling or Encrypting Ian Kelly <ian.g.kelly@gmail.com> - 2015-06-26 23:47 -0600
Re: Pure Python Data Mangling or Encrypting Steven D'Aprano <steve@pearwood.info> - 2015-06-27 18:38 +1000
Re: Pure Python Data Mangling or Encrypting Chris Angelico <rosuav@gmail.com> - 2015-06-27 18:53 +1000
Re: Pure Python Data Mangling or Encrypting Johannes Bauer <dfnsonfsduifb@gmx.de> - 2015-06-27 11:07 +0200
Re: Pure Python Data Mangling or Encrypting Chris Angelico <rosuav@gmail.com> - 2015-06-27 19:17 +1000
Re: Pure Python Data Mangling or Encrypting Jon Ribbens <jon+usenet@unequivocal.co.uk> - 2015-06-27 09:27 +0000
Re: Pure Python Data Mangling or Encrypting Johannes Bauer <dfnsonfsduifb@gmx.de> - 2015-06-27 12:05 +0200
Re: Pure Python Data Mangling or Encrypting Chris Angelico <rosuav@gmail.com> - 2015-06-27 20:16 +1000
Re: Pure Python Data Mangling or Encrypting Johannes Bauer <dfnsonfsduifb@gmx.de> - 2015-06-27 12:55 +0200
Re: Pure Python Data Mangling or Encrypting Jon Ribbens <jon+usenet@unequivocal.co.uk> - 2015-06-27 10:26 +0000
Re: Pure Python Data Mangling or Encrypting Laura Creighton <lac@openend.se> - 2015-06-27 14:27 +0200
Re: Pure Python Data Mangling or Encrypting Johannes Bauer <dfnsonfsduifb@gmx.de> - 2015-06-27 12:18 +0200
Re: Pure Python Data Mangling or Encrypting Chris Angelico <rosuav@gmail.com> - 2015-06-27 21:33 +1000
Re: Pure Python Data Mangling or Encrypting Ian Kelly <ian.g.kelly@gmail.com> - 2015-06-27 08:59 -0600
Re: Pure Python Data Mangling or Encrypting Laura Creighton <lac@openend.se> - 2015-06-27 13:25 +0200
Re: Pure Python Data Mangling or Encrypting Jussi Piitulainen <jpiitula@ling.helsinki.fi> - 2015-06-27 15:23 +0300
Re: Pure Python Data Mangling or Encrypting Laura Creighton <lac@openend.se> - 2015-06-27 14:48 +0200
Re: Pure Python Data Mangling or Encrypting Johannes Bauer <dfnsonfsduifb@gmx.de> - 2015-06-27 11:12 +0200
Re: Pure Python Data Mangling or Encrypting Ian Kelly <ian.g.kelly@gmail.com> - 2015-06-27 09:09 -0600
Re: Pure Python Data Mangling or Encrypting Steven D'Aprano <steve@pearwood.info> - 2015-06-28 03:35 +1000
Re: Pure Python Data Mangling or Encrypting Steven D'Aprano <steve@pearwood.info> - 2015-06-28 03:58 +1000
Re: Pure Python Data Mangling or Encrypting Ian Kelly <ian.g.kelly@gmail.com> - 2015-06-27 14:16 -0600
Re: Pure Python Data Mangling or Encrypting Jon Ribbens <jon+usenet@unequivocal.co.uk> - 2015-06-28 13:41 +0000
Re: Pure Python Data Mangling or Encrypting Robert Kern <robert.kern@gmail.com> - 2015-06-27 08:58 +0100
Re: Pure Python Data Mangling or Encrypting Robert Kern <robert.kern@gmail.com> - 2015-06-27 09:07 +0100
Re: Pure Python Data Mangling or Encrypting Dennis Lee Bieber <wlfraed@ix.netcom.com> - 2015-06-27 10:39 -0400
Re: Pure Python Data Mangling or Encrypting Grant Edwards <invalid@invalid.invalid> - 2015-06-27 12:38 +0000
Re: Pure Python Data Mangling or Encrypting Randall Smith <randall@tnr.cc> - 2015-06-27 13:22 -0500
Re: Pure Python Data Mangling or Encrypting Steven D'Aprano <steve@pearwood.info> - 2015-06-28 04:51 +1000
Re: Pure Python Data Mangling or Encrypting Chris Angelico <rosuav@gmail.com> - 2015-06-28 09:05 +1000
Re: Pure Python Data Mangling or Encrypting Chris Angelico <rosuav@gmail.com> - 2015-06-27 11:21 +1000
Re: Pure Python Data Mangling or Encrypting Ian Kelly <ian.g.kelly@gmail.com> - 2015-06-26 23:59 -0600
Re: Pure Python Data Mangling or Encrypting Jon Ribbens <jon+usenet@unequivocal.co.uk> - 2015-06-27 09:26 +0000
Re: Pure Python Data Mangling or Encrypting Chris Angelico <rosuav@gmail.com> - 2015-06-27 16:52 +1000
Re: Pure Python Data Mangling or Encrypting Randall Smith <randall@tnr.cc> - 2015-06-27 12:08 -0500
Re: Pure Python Data Mangling or Encrypting Steven D'Aprano <steve@pearwood.info> - 2015-06-28 04:50 +1000
Re: Pure Python Data Mangling or Encrypting Randall Smith <randall@tnr.cc> - 2015-06-29 15:52 -0500
Re: Pure Python Data Mangling or Encrypting Steven D'Aprano <steve@pearwood.info> - 2015-06-30 13:00 +1000
Re: Pure Python Data Mangling or Encrypting Jon Ribbens <jon+usenet@unequivocal.co.uk> - 2015-06-30 12:19 +0000
Re: Pure Python Data Mangling or Encrypting Steven D'Aprano <steve@pearwood.info> - 2015-07-01 04:17 +1000
Re: Pure Python Data Mangling or Encrypting Chris Angelico <rosuav@gmail.com> - 2015-07-01 04:33 +1000
Re: Pure Python Data Mangling or Encrypting Jon Ribbens <jon+usenet@unequivocal.co.uk> - 2015-06-30 18:37 +0000
Re: Pure Python Data Mangling or Encrypting Randall Smith <randall@tnr.cc> - 2015-07-01 09:38 -0500
Re: Pure Python Data Mangling or Encrypting Randall Smith <randall@tnr.cc> - 2015-06-30 12:39 -0500
Re: Pure Python Data Mangling or Encrypting Steven D'Aprano <steve@pearwood.info> - 2015-07-01 04:59 +1000
Re: Pure Python Data Mangling or Encrypting Chris Angelico <rosuav@gmail.com> - 2015-07-01 05:20 +1000
Re: Pure Python Data Mangling or Encrypting Jon Ribbens <jon+usenet@unequivocal.co.uk> - 2015-06-30 23:25 +0000
Re: Pure Python Data Mangling or Encrypting alister <alister.nospam.ware@ntlworld.com> - 2015-07-01 08:06 +0000
Re: Pure Python Data Mangling or Encrypting Jon Ribbens <jon+usenet@unequivocal.co.uk> - 2015-06-28 14:21 +0000
Re: Pure Python Data Mangling or Encrypting Randall Smith <randall@tnr.cc> - 2015-06-29 15:46 -0500
Re: Pure Python Data Mangling or Encrypting Jon Ribbens <jon+usenet@unequivocal.co.uk> - 2015-06-29 20:49 +0000
Re: Pure Python Data Mangling or Encrypting Randall Smith <randall@tnr.cc> - 2015-06-30 12:43 -0500
Re: Pure Python Data Mangling or Encrypting Gregory Ewing <greg.ewing@canterbury.ac.nz> - 2015-07-02 10:31 +1200
Re: Pure Python Data Mangling or Encrypting Mark Lawrence <breamoreboy@yahoo.co.uk> - 2015-06-26 02:17 +0100
Re: Pure Python Data Mangling or Encrypting Chris Angelico <rosuav@gmail.com> - 2015-06-26 12:06 +1000
Re: Pure Python Data Mangling or Encrypting Chris Angelico <rosuav@gmail.com> - 2015-06-26 12:05 +1000
Re: Pure Python Data Mangling or Encrypting Mark Lawrence <breamoreboy@yahoo.co.uk> - 2015-06-26 03:24 +0100
Re: Pure Python Data Mangling or Encrypting Chris Angelico <rosuav@gmail.com> - 2015-06-26 12:29 +1000
Re: Pure Python Data Mangling or Encrypting Joonas Liik <liik.joonas@gmail.com> - 2015-06-25 13:00 +0300
Re: Pure Python Data Mangling or Encrypting Devin Jeanpierre <jeanpierreda@gmail.com> - 2015-06-25 03:18 -0700
Re: Pure Python Data Mangling or Encrypting Steven D'Aprano <steve+comp.lang.python@pearwood.info> - 2015-06-25 17:05 +1000
Re: Pure Python Data Mangling or Encrypting Randall Smith <randall@tnr.cc> - 2015-06-25 14:09 -0500
| From | Randall Smith <randall@tnr.cc> |
|---|---|
| Date | 2015-06-24 13:36 -0500 |
| Subject | Re: Pure Python Data Mangling or Encrypting |
| Message-ID | <mailman.29.1435170987.3674.python-list@python.org> |
On 06/24/2015 07:19 AM, Dennis Lee Bieber wrote:
> Pardon, but that description has me confused. Perhaps I just don't
> understand the full use-case.
>
> Who exactly is supposed to be protected from what? You state "data
> senders are supposed to encrypt" which, if the recipient doesn't have the
> decryption key, implies the recipient -- isn't the real recipient but just
> a transport/storage place until the data is retrieved by the end-user.

You got it. I didn't want to explain any more than necessary. But yes, the recipient just stores the data for the end-user.

> If "you" do the encryption on the storage machine, then you need to
> also do the decryption when returning the data to the end-user -- which
> means the key is available somewhere on the storage machine, and the local
> user might obtain access to it and the stored data.

Right again. A legitimate data owner would encrypt the data. The storage machine is encrypting to protect itself against unwanted exposure to unencrypted malware. Not that they would go looking at the files, but their virus scanner or file indexer might.

> Given the assumptions I'm making, my recommendation is likely to be
> something on the nature of: use an OS designed with security at the core of
> the file system; each sender has their own login UID, and the file system
> is configured to grant r/w access only to the login -- no execute
> permissions, no access by someone not logged in as that user, etc.

Yes. This is done for "imaged" systems, but I don't have control over the storage machines.

I'm leaning towards using a random substitution cipher suggested by Devin Jeanpierre. If you see any weaknesses in that solution, I'd like to hear them.

Thanks for your response.

--Randall
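For readers following along, here is a minimal sketch of what a "random substitution cipher" over bytes could look like in pure Python. It is not the poster's actual code: the function names `make_table` and `mangle` are hypothetical, and the sketch assumes the table is generated on the storage machine and kept there.

```python
import random


def make_table(rng=None):
    """Build a random byte-substitution table and its inverse.

    Assumption: the table is created on the storage machine and never
    shown to senders. SystemRandom draws from the OS entropy source.
    """
    rng = rng or random.SystemRandom()
    permuted = list(range(256))
    rng.shuffle(permuted)
    encode_table = bytes(permuted)
    # maketrans(frm, to) maps frm[i] -> to[i], which inverts the table.
    decode_table = bytes.maketrans(encode_table, bytes(range(256)))
    return encode_table, decode_table


def mangle(data, table):
    """Apply a 256-byte translation table to every byte of data."""
    return data.translate(table)


encode_table, decode_table = make_table()
original = b"MZ\x90\x00 example payload"
stored = mangle(original, encode_table)          # what would be written to disk
assert mangle(stored, decode_table) == original  # round trip on retrieval
```

`bytes.translate` runs in C inside the interpreter, which is presumably why this approach is attractive when a pure-Python cipher would be too slow.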
| From | Steven D'Aprano <steve@pearwood.info> |
|---|---|
| Date | 2015-06-25 14:07 +1000 |
| Message-ID | <558b7e85$0$1648$c3e8da3$5496439d@news.astraweb.com> |
| In reply to | #93096 |
On Thu, 25 Jun 2015 04:36 am, Randall Smith wrote:
> On 06/24/2015 07:19 AM, Dennis Lee Bieber wrote:
>
>> Pardon, but that description has me confused. Perhaps I just don't
>> understand the full use-case.
>>
>> Who exactly is supposed to be protected from what? You state "data
>> senders are supposed to encrypt" which, if the recipient doesn't have the
>> decryption key, implies the recipient -- isn't the real recipient but
>> just a transport/storage place until the data is retrieved by the
>> end-user.
>
> You got it. I didn't want to explain any more than necessary. But yes,
> the recipient just stores the data for the end-user.

Trust me. That's not all they are doing.

>> If "you" do the encryption on the storage machine, then you need to
>> also do the decryption when returning the data to the end-user -- which
>> means the key is available somewhere on the storage machine, and the
>> local user might obtain access to it and the stored data.
>
> Right again. A legitimate data owner would encrypt the data. The
> storage machine is encrypting to protect itself against unwanted
> exposure to unencrypted malware. Not that they would go looking at the
> files, but their virus scanner or file indexer might.

Okay, you're worrying me now. If this is legitimate business, then you shouldn't be worried about the virus scanner or file indexer *scanning* the content of the file.

But giving you the benefit of the doubt, that there's nothing underhanded happening, I don't think you have a good model for the potential threats in your software. I think there are at least three different threats:

Sender of the data versus the storage machine:

- the sender of the data may deliberately send malware, intending to attack the people storing the file;

Storage machine versus the end recipient:

- the storage machine may be infected by malware which corrupts the file;
- the owner of the storage machine may deliberately corrupt the data (this is a special case of the previous);
- the owner of the storage machine may want to spy on the files, that is, read the contents without changing the files (attack on privacy).

There may be other threats as well, e.g. man-in-the-middle attacks. If this is anything like Bittorrent, you have a whole range of threats.

But just sticking to the three above, the first one is partially mitigated by allowing virus scanners to scan the data, but that implies that the owner of the storage machine can spy on the files. So you have a conflict here.

Honestly, the *only* real defence against the spying issue is to encrypt the files. Not obfuscate them with a lousy random substitution cipher. The storage machine can keep the files as long as they like, just by making a copy, and spend hours bruteforcing them. They *will* crack the substitution cipher. In pure Python, that may take a few days or weeks; in C, hours or days. If they have the resources to throw at it, minutes. Substitution ciphers have not been effective encryption since, oh, the 1950s, unless you use a one-time pad. Which you won't be.

That's assuming they don't just look at the Python source code, grab the cipher key, and decrypt in seconds.

If you're serious about protecting your users' privacy and their data integrity, you need to use modern strong encryption, and you need to solve the issue of how to get the key from the trusted source to the untrusted storage machine. I have no idea how to do that -- you need to talk to actual security experts, not random Python programmers.

A pure Python solution for the encryption is likely to be too slow for more than toy files. Bite the bullet and use a library written in C. Python uses C code for all sorts of modules: math, decimal, bisect, pickle, io, etc. all delegate to C code when available. There's no shame in it.

Not to put too fine a point on it, using a substitution cipher because it's easy and fast in pure Python code is like making a boat out of styrofoam because it's light and floats and using aluminium or fibreglass is too expensive. Sure that will work for toy applications, like paddling around the swimming pool in your back yard, but nobody in their right mind would trust it on the deep ocean or a white-water river.

-- Steven
| From | Devin Jeanpierre <jeanpierreda@gmail.com> |
|---|---|
| Date | 2015-06-24 21:27 -0700 |
| Message-ID | <mailman.42.1435206516.3674.python-list@python.org> |
| In reply to | #93115 |
On Wed, Jun 24, 2015 at 9:07 PM, Steven D'Aprano <steve@pearwood.info> wrote:
> But just sticking to the three above, the first one is partially mitigated
> by allowing virus scanners to scan the data, but that implies that the
> owner of the storage machine can spy on the files. So you have a conflict
> here.

If it's encrypted malware, and you can't decrypt it, there's no threat.

> Honestly, the *only* real defence against the spying issue is to encrypt the
> files. Not obfuscate them with a lousy random substitution cipher. The
> storage machine can keep the files as long as they like, just by making a
> copy, and spend hours bruteforcing them. They *will* crack the substitution
> cipher. In pure Python, that may take a few days or weeks; in C, hours or
> days. If they have the resources to throw at it, minutes. Substitution
> ciphers have not been effective encryption since, oh, the 1950s, unless you
> use a one-time pad. Which you won't be.

The original post said that the sender will usually send files they encrypted, unless they are malicious. So if the sender wants them to be encrypted, they already are.

"While the data senders are supposed to encrypt data, that's not guaranteed, and I'd like to protect the recipient against exposure to nefarious data by mangling or encrypting the data before it is written to disk."

The cipher is just to keep the sender from being able to control what is on disk.

I am usually very oppositional when it comes to rolling your own crypto, but am I alone here in thinking the OP very clearly laid out their case?

-- Devin
| From | Steven D'Aprano <steve+comp.lang.python@pearwood.info> |
|---|---|
| Date | 2015-06-25 19:25 +1000 |
| Message-ID | <558bc912$0$2899$c3e8da3$76491128@news.astraweb.com> |
| In reply to | #93116 |
On Thursday 25 June 2015 14:27, Devin Jeanpierre wrote:
> On Wed, Jun 24, 2015 at 9:07 PM, Steven D'Aprano <steve@pearwood.info>
> wrote:
>> But just sticking to the three above, the first one is partially
>> mitigated by allowing virus scanners to scan the data, but that implies
>> that the owner of the storage machine can spy on the files. So you have a
>> conflict here.
>
> If it's encrypted malware, and you can't decrypt it, there's no threat.

If the *only* threat is that the sender will send malware, you can mitigate around that by dropping the file in an unencrypted container. Anything good enough to prevent Windows from executing the code, accidentally or deliberately, say, a tar file with a custom extension.

But encrypting the file is also a good solution, and it prevents the storage machine spying on the file contents too. Provided the encryption is strong.

>> Honestly, the *only* real defence against the spying issue is to encrypt
>> the files. Not obfuscate them with a lousy random substitution cipher.
>> The storage machine can keep the files as long as they like, just by
>> making a copy, and spend hours bruteforcing them. They *will* crack the
>> substitution cipher. In pure Python, that may take a few days or weeks;
>> in C, hours or days. If they have the resources to throw at it, minutes.
>> Substitution ciphers have not been effective encryption since, oh, the
>> 1950s, unless you use a one-time pad. Which you won't be.
>
> The original post said that the sender will usually send files they
> encrypted, unless they are malicious. So if the sender wants them to
> be encrypted, they already are.

The OP *hopes* that the sender will encrypt the files. I think that's a vanishingly faint hope, unless the application itself encrypts the file.

Most people don't have any encryption software beyond password-protecting zip files. Zip 2.0 legacy encryption is crap, and there are plenty of tools available to break it. Winzip has an extension for 128-bit and 256-bit AES encryption, both of which are probably strong enough unless you're targeted by the NSA, but the weak link in the chain is the idea that people will encrypt the software before sending it. Even if they have the tools, laziness being the defining characteristic of most people, they won't use them.

> "While the data senders are supposed to encrypt data, that's not
> guaranteed, and I'd like to protect the recipient against exposure to
> nefarious data by mangling or encrypting the data before it is written
> to disk."
>
> The cipher is just to keep the sender from being able to control what
> is on disk.

The sender has a copy of the application? Then they can see the type of obfuscation used. If they know the key, or can guess it, they can take their malware, *decrypt* it, and send that, so that *encrypting* that file puts the malicious code on the disk.

E.g. suppose I want to send you an insult, but I know your program automatically ROT-13s the strings I send you. Then I send you:

'lbhe sngure fzryyf bs ryqreoreevrf'

and your program ROT-13s it to:

'your father smells of elderberries'

I know that the OP doesn't propose using ROT-13, but a classical substitution cipher isn't that much stronger.

> I am usually very oppositional when it comes to rolling your own
> crypto, but am I alone here in thinking the OP very clearly laid out
> their case?

I don't think any of us *really* understand his use-case or the potential threats, but to my way of thinking, you can never have too strong a cipher or underestimate the risk of users taking short-cuts.

-- Steve
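The ROT-13 example in this post can be reproduced with the standard library. This is only an illustration of the argument: `receiver_rot13` is a hypothetical stand-in for whatever fixed, publicly known transformation a receiver applies.

```python
import codecs


def receiver_rot13(text):
    """Hypothetical receiver that ROT-13s whatever it is sent."""
    return codecs.encode(text, "rot_13")


desired = "your father smells of elderberries"

# ROT-13 is its own inverse, so the sender pre-applies it.
crafted = codecs.encode(desired, "rot_13")
print(crafted)  # lbhe sngure fzryyf bs ryqreoreevrf

# After the receiver's transformation, the sender's chosen text reappears.
assert receiver_rot13(crafted) == desired
```

This works only because the transformation has no secret; the disagreement in the thread is about how much a secret table changes the picture.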
| From | Devin Jeanpierre <jeanpierreda@gmail.com> |
|---|---|
| Date | 2015-06-25 02:41 -0700 |
| Message-ID | <mailman.46.1435225352.3674.python-list@python.org> |
| In reply to | #93122 |
On Thu, Jun 25, 2015 at 2:25 AM, Steven D'Aprano <steve+comp.lang.python@pearwood.info> wrote:
> On Thursday 25 June 2015 14:27, Devin Jeanpierre wrote:
>> The original post said that the sender will usually send files they
>> encrypted, unless they are malicious. So if the sender wants them to
>> be encrypted, they already are.
>
> The OP *hopes* that the sender will encrypt the files. I think that's a
> vanishingly faint hope, unless the application itself encrypts the file.
>
> Most people don't have any encryption software beyond password-protecting
> zip files. Zip 2.0 legacy encryption is crap, and there are plenty of tools
> available to break it. Winzip has an extension for 128-bit and 256-bit AES
> encryption, both of which are probably strong enough unless you're targeted
> by the NSA, but the weak link in the chain is the idea that people will
> encrypt the software before sending it. Even if they have the tools,
> laziness being the defining characteristic of most people, they won't use
> them.

You're right, I was supposing that since they wrote the server, they also wrote the client, and were just protecting from the protocol itself being weak.

> I know that the OP doesn't propose using ROT-13, but a classical
> substitution cipher isn't that much stronger.

Yes, it is. It requires the attacker being able to see something about the ciphertext, unlike ROT13. But it is reasonable to suppose that maybe the attacker can trigger the file getting executed, at which point maybe you can deduce from the behavior what the starting bytes are...?

> I don't think any of us *really* understand his use-case or the potential
> threats, but to my way of thinking, you can never have too strong a cipher
> or underestimate the risk of users taking short-cuts.

This is truth. It would be nice if something like keyczar came in the stdlib. (Otherwise, users of Python take shortcuts and use randomized substitution ciphers instead of AES.)

-- Devin
| From | Chris Angelico <rosuav@gmail.com> |
|---|---|
| Date | 2015-06-25 19:57 +1000 |
| Message-ID | <mailman.51.1435226250.3674.python-list@python.org> |
| In reply to | #93122 |
On Thu, Jun 25, 2015 at 7:41 PM, Devin Jeanpierre <jeanpierreda@gmail.com> wrote:
>> I know that the OP doesn't propose using ROT-13, but a classical
>> substitution cipher isn't that much stronger.
>
> Yes, it is. It requires the attacker being able to see something about
> the ciphertext, unlike ROT13. But it is reasonable to suppose that
> maybe the attacker can trigger the file getting executed, at which
> point maybe you can deduce from the behavior what the starting bytes
> are...?

If a symmetric cipher is being used and the key is known, anyone can simply perform a decryption operation on the desired bytes, get back a pile of meaningless encrypted junk, and submit that. When it's encrypted with the same key, voila! The cleartext will reappear.

Asymmetric ciphers are a bit different, though. AIUI you can't perform a decryption without the private key, whereas you can encrypt with only the public key. So you ought to be safe on that one; the only way someone could deliberately craft input that, when encrypted with your public key, produces a specific set of bytes, would be to brute-force it.

(But I might be wrong on that. I'm no crypto expert.)

ChrisA
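The known-key case described above can be sketched with a byte-substitution table standing in for the symmetric cipher. This is an illustration under stated assumptions, not anyone's real code: the table is generated from a fixed seed purely so the example is repeatable, `receiver_store` is a hypothetical name, and the key point is the assumption that the sender has somehow learned the table.

```python
import random

# Assumption for the sketch: the sender knows the receiver's table.
rng = random.Random(2015)           # fixed seed only for repeatability
permuted = list(range(256))
rng.shuffle(permuted)
table = bytes(permuted)             # receiver's substitution table (the "key")
inverse = bytes.maketrans(table, bytes(range(256)))


def receiver_store(data):
    """Hypothetical receiver: mangle incoming bytes before writing to disk."""
    return data.translate(table)


target = b"bytes the sender wants on disk"

# "Decrypt" the desired bytes first; the result looks like junk.
crafted = target.translate(inverse)

# The receiver's own mangling step then reproduces the target exactly.
assert receiver_store(crafted) == target
```

Without knowledge of the table, the `inverse` step is not available to the sender, which is the distinction the surrounding posts argue over.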
| From | Jon Ribbens <jon+usenet@unequivocal.co.uk> |
|---|---|
| Date | 2015-06-25 10:03 +0000 |
| Message-ID | <slrnmonkip.1nu.jon+usenet@frosty.unequivocal.co.uk> |
| In reply to | #93122 |
On 2015-06-25, Steven D'Aprano <steve+comp.lang.python@pearwood.info> wrote:
> On Thursday 25 June 2015 14:27, Devin Jeanpierre wrote:
>> If it's encrypted malware, and you can't decrypt it, there's no threat.
>
> If the *only* threat is that the sender will send malware, you can mitigate
> around that by dropping the file in an unencrypted container. Anything good
> enough to prevent Windows from executing the code, accidentally or
> deliberately, say, a tar file with a custom extension.

That won't stop virus scanners etc potentially making their own minds up about the file.

> But encrypting the file is also a good solution, and it prevents the storage
> machine spying on the file contents too. Provided the encryption is strong.

How would the receiver encrypting the file after receiving it prevent the receiver from seeing what's in the file?

>> The original post said that the sender will usually send files they
>> encrypted, unless they are malicious. So if the sender wants them to
>> be encrypted, they already are.
>
> The OP *hopes* that the sender will encrypt the files. I think that's a
> vanishingly faint hope, unless the application itself encrypts the file.

Yes, the application itself encrypts the file. Haven't you been reading what he's saying?

> The sender has a copy of the application? Then they can see the type of
> obfuscation used. If they know the key, or can guess it, they can take their
> malware, *decrypt* it, and send that, so that *encrypting* that file puts
> the malicious code on the disk.

Not if they don't know the key they can't.

> E.g. suppose I want to send you an insult, but I know your program
> automatically ROT-13s the strings I send you. Then I send you:
>
> 'lbhe sngure fzryyf bs ryqreoreevrf'
>
> and your program ROT-13s it to:
>
> 'your father smells of elderberries'
>
> I know that the OP doesn't propose using ROT-13, but a classical
> substitution cipher isn't that much stronger.

Replace "ROT-13" with "ROT-n" where 'n' is a secret known only to the receiver, and suddenly it's not such a bad method of obfuscation. Improve it to the random-translation-map method he's actually using and you've got really quite a reasonable system.

>> I am usually very oppositional when it comes to rolling your own
>> crypto, but am I alone here in thinking the OP very clearly laid out
>> their case?
>
> I don't think any of us *really* understand his use-case or the potential
> threats, but to my way of thinking, you can never have too strong a cipher
> or underestimate the risk of users taking short-cuts.

The use case is pretty obvious (a peer-to-peer dropbox type thing) but it does appear to be being misunderstood. This isn't actually a crypto problem at all and "users taking short-cuts" isn't an issue.
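For a sense of scale on the "ROT-n with a secret n" idea, here is a small sketch of the byte-wise version and of how the shift could be recovered. It rests on two assumptions that may not hold in the scenario under discussion: that someone can read the stored bytes, and that they know how the original file begins (a PDF signature is used here purely as an example). The names `rot_n` and `recover_shift` are hypothetical.

```python
def rot_n(data, n):
    """Shift every byte by n, wrapping around at 256."""
    return bytes((b + n) % 256 for b in data)


def recover_shift(stored, known_prefix):
    """Try all 256 shifts and keep those that reproduce the known prefix."""
    head = stored[:len(known_prefix)]
    return [n for n in range(256) if rot_n(head, -n) == known_prefix]


secret_n = 77                                  # known only to the receiver
stored = rot_n(b"%PDF-1.4 rest of the file", secret_n)

# Someone who can read `stored` and expects a PDF needs at most 256 tries.
assert recover_shift(stored, b"%PDF-") == [secret_n]
```

A sender who never sees the stored bytes gets no such feedback and has to guess blind, so the sketch says nothing about that case.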
| From | Steven D'Aprano <steve@pearwood.info> |
|---|---|
| Date | 2015-06-26 01:13 +1000 |
| Message-ID | <558c1a7e$0$1668$c3e8da3$5496439d@news.astraweb.com> |
| In reply to | #93128 |
On Thu, 25 Jun 2015 08:03 pm, Jon Ribbens wrote:

> On 2015-06-25, Steven D'Aprano <steve+comp.lang.python@pearwood.info>
> wrote:
>> On Thursday 25 June 2015 14:27, Devin Jeanpierre wrote:
>>> If it's encrypted malware, and you can't decrypt it, there's no threat.
>>
>> If the *only* threat is that the sender will send malware, you can
>> mitigate around that by dropping the file in an unencrypted container.
>> Anything good enough to prevent Windows from executing the code,
>> accidentally or deliberately, say, a tar file with a custom extension.
>
> That won't stop virus scanners etc potentially making their own minds
> up about the file.

*shrug* Sure, but I was specifically referring to the risk of the malware being executed, not being detected by a virus scanner.

Encrypting the file won't even necessarily stop the virus scanner from finding false positives. It might even increase the chances. But it will prevent the virus scanner from finding actual viruses. You may or may not consider that a problem.

>> But encrypting the file is also a good solution, and it prevents the
>> storage machine spying on the file contents too. Provided the encryption
>> is strong.
>
> How would the receiver encrypting the file after receiving it prevent
> the receiver from seeing what's in the file?

I didn't say it ought to be encrypted by the receiver. Obviously the encryption needs to be done in a way that the recipient doesn't get access to the key.

The obvious way to do that is for the application to encrypt the data before it sends it. Then the receiver just writes the encrypted bytes directly to a file.

That would have the benefit of protecting against man-in-the-middle attacks as well, since the file is never transmitted in the clear.

>>> The original post said that the sender will usually send files they
>>> encrypted, unless they are malicious. So if the sender wants them to
>>> be encrypted, they already are.
>>
>> The OP *hopes* that the sender will encrypt the files. I think that's a
>> vanishingly faint hope, unless the application itself encrypts the file.
>
> Yes, the application itself encrypts the file. Haven't you been
> reading what he's saying?

I have been reading what the OP has been saying. I'm not sure if you have been. The OP doesn't want to encrypt the file, because he wants the application to be pure Python and encryption in pure Python is too slow. So he wants to obfuscate it with some sort of substitution cipher or equivalent, which may be easily crackable by anyone who really wants to.

I've been arguing that the application *should* encrypt the file, and not mess about giving the illusion of security.

>> The sender has a copy of the application? Then they can see the type of
>> obfuscation used. If they know the key, or can guess it, they can take
>> their malware, *decrypt* it, and send that, so that *encrypting* that
>> file puts the malicious code on the disk.
>
> Not if they don't know the key they can't.

"If they know the key, or can guess it, ..."

"Not if they don't know the key they can't."

Really? Glad you're around to point that out to me.

But seriously, they have the application. If the application is using a symmetric substitution cipher, it needs the key (because there is only one), so the receiver will have the cipher.

With the sort of substitution cipher the OP is experimenting with, forcing a particular result is trivially easy. The sender has access to the application, knows the cipher, knows the key, and can easily generate a file which will generate whatever content the sender wants after being obfuscated.

Modern symmetric ciphers like AES are quite resistant to that sort of attack. There is, so far as I know, no way to generate a file which results in a specific content after encryption.

>> E.g. suppose I want to send you an insult, but I know your program
>> automatically ROT-13s the strings I send you. Then I send you:
>>
>> 'lbhe sngure fzryyf bs ryqreoreevrf'
>>
>> and your program ROT-13s it to:
>>
>> 'your father smells of elderberries'
>>
>> I know that the OP doesn't propose using ROT-13, but a classical
>> substitution cipher isn't that much stronger.
>
> Replace "ROT-13" with "ROT-n" where 'n' is a secret known only to the
> receiver, and suddenly it's not such a bad method of obfuscation.

There are only 256 possible values for n, one of which doesn't transform the data at all (ROT-0). If you're thinking of attacking this by pencil and paper, 255 transformations sounds like a lot. For a computer, that's barely harder than a single transformation.

> Improve it to the random-translation-map method he's actually using
> and you've got really quite a reasonable system.

No, truly you haven't. The OP is experimenting with bytearray.translate, which likely makes it a monoalphabetic substitution cipher, and the techniques for cracking those go back to the 9th century AD. That's over a thousand years of experience in cracking these things.

The situation is a bit harder than with traditional ciphers: instead of an alphabet of 26 letters we have one of 256 bytes. But that's only an order of magnitude bigger, and the cipher is still vulnerable to frequency analysis and other attacks.

The only positive to this scheme is that the "encryption" is so weak (it's been effectively obsolete since World War 2, if not before it) that you might find it hard to find ready-made cracking tools for it unless you work for the NSA, CIA or similar. You're relying on security by obscurity: nobody uses this sort of thing any more, because it's so insecure, and that obscurity does give you a *tiny* bit of security against a casual, unmotivated attacker.

But once this system starts getting popular, that obscurity will not last. It won't be difficult to build fast cracking programs that will break the so-called "encryption", if it is based on a classical symmetric monoalphabetic substitution cipher.

Here's an online tool which can be used for cracking "encrypted" English text:

http://www.simonsingh.net/The_Black_Chamber/substitutioncrackingtool.html

You obviously wouldn't use that specific site on arbitrary files, but it demonstrates that these classical ciphers are *not* secure.

>>> I am usually very oppositional when it comes to rolling your own
>>> crypto, but am I alone here in thinking the OP very clearly laid out
>>> their case?
>>
>> I don't think any of us *really* understand his use-case or the potential
>> threats, but to my way of thinking, you can never have too strong a
>> cipher or underestimate the risk of users taking short-cuts.
>
> The use case is pretty obvious (a peer-to-peer dropbox type thing) but
> it does appear to be being misunderstood. This isn't actually a crypto
> problem at all and "users taking short-cuts" isn't an issue.

Yes it is. If users don't properly pre-encrypt their files before sending them out to the cloud, AND THEY WON'T, receivers WILL be able to read those files, half-arsed attempts to "encrypt" them or not.

The solution to all(?) these security problems is for the application to handle the encryption, using a modern crypto library. But the OP doesn't want to do that because it's too slow when written as pure Python.

--
Steven
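The ROT-13 and ROT-n points in this post can be illustrated with a short Python sketch. It is only an illustration under stated assumptions: `rot_n` and `recover_shift` are hypothetical helper names, the shift value `201` and the sample bytes are made up, and none of this is the OP's code. It shows that someone who can read the stored bytes and knows a typical file prefix only has to try 256 shifts.

```python
import codecs


def rot_n(data: bytes, n: int) -> bytes:
    """Rotate every byte value by n, modulo 256 (a byte-wise ROT-n)."""
    return bytes((b + n) % 256 for b in data)


def recover_shift(stored: bytes, known_prefix: bytes) -> list:
    """Try all 256 shifts; keep those that reproduce a known file prefix."""
    return [n for n in range(256)
            if rot_n(stored, -n).startswith(known_prefix)]


# The ROT-13 example from the post, via the standard-library text codec.
insult = codecs.decode("lbhe sngure fzryyf bs ryqreoreevrf", "rot_13")
print(insult)

# Hypothetical receiver-side secret and sample data, for illustration only.
secret_n = 201
stored = rot_n(b"%PDF-1.4 example body", secret_n)
print(recover_shift(stored, b"%PDF-"))
```

Note that this covers the case where the obfuscated bytes are visible to the attacker; whether that case belongs in the threat model is exactly what the thread is arguing about.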
| From | Jon Ribbens <jon+usenet@unequivocal.co.uk> |
|---|---|
| Date | 2015-06-25 15:26 +0000 |
| Message-ID | <slrnmoo7ev.1nu.jon+usenet@frosty.unequivocal.co.uk> |
| In reply to | #93144 |
On 2015-06-25, Steven D'Aprano <steve@pearwood.info> wrote:
> On Thu, 25 Jun 2015 08:03 pm, Jon Ribbens wrote:
>> That won't stop virus scanners etc potentially making their own minds
>> up about the file.
>
> *shrug* Sure, but I was specifically referring to the risk of the malware
> being executed, not being detected by a virus scanner.
>
> Encrypting the file won't even necessarily stop the virus scanner from
> finding false positives. It might even increase the chances.

That seems spectacularly unlikely.

> But it will prevent the virus scanner from finding actual viruses.
> You may or may not consider that a problem.

The OP would consider it a benefit.

> I didn't say it ought to be encrypted by the receiver. Obviously the
> encryption needs to be done in a way that the recipient doesn't get access
> to the key.

No, you're still misunderstanding. The encryption needs to be done in a way that the *sender* doesn't get access to the key. The recipient has access to it by definition because the recipient chooses it.

> The obvious way to do that is for the application to encrypt the
> data before it sends it.

Yes, he already said the application does that. The problem is, what if the sender is not the genuine application but is instead a malicious attacker?

> Then the receiver just writes the encrypted bytes directly to a file.

That's precisely what he's trying to avoid.

> That would have the benefit of protecting against man-in-the-middle
> attacks as well, since the file is never transmitted in the clear.

With what he's talking about, the file after encryption is never transmitted *at all*.

> I've been arguing that the application *should* encrypt the file, and not
> mess about giving the illusion of security.

You haven't understood the threat model.

> But seriously, they have the application. If the application is using a
> symmetric substitution cipher, it needs the key (because there is only
> one), so the receiver will have the cipher.

There is not only one key. The recipient would invent a new key for each file after the file is received.

> With the sort of substitution cipher the OP is experimenting with, forcing a
> particular result is trivially easy. The sender has access to the
> application, knows the cipher, knows the key, and can easily generate a
> file which will generate whatever content the sender wants after being
> obfuscated.

No, because the sender does not know the key.

>> Replace "ROT-13" with "ROT-n" where 'n' is a secret known only to the
>> receiver, and suddenly it's not such a bad method of obfuscation.
>
> There are only 256 possible values for n, one of which doesn't transform the
> data at all (ROT-0). If you're thinking of attacking this by pencil and
> paper, 255 transformations sounds like a lot. For a computer, that's barely
> harder than a single transformation.

Well, it means you need to send 256 times as much data, which is a start. If you're instead using a 256-byte translation table then an attack becomes utterly impractical.

>> Improve it to the random-translation-map method he's actually using
>> and you've got really quite a reasonable system.
>
> No, truly you haven't. The OP is experimenting with bytearray.translate,
> which likely makes it a monoalphabetic substitution cipher, and the
> techniques for cracking those go back to the 9th century AD.

Only if you have the ciphertext, which the attacker in this scenario does not. The attacker gets to set the plaintext, knows the algorithm, does not know the key (unless the method of choosing the key has a flaw), and wants to set the ciphertext to some specific string. Frequency analysis doesn't even begin to apply to this scenario.

> You're relying on security by obscurity

No, he really isn't.

>> The use case is pretty obvious (a peer-to-peer dropbox type thing) but
>> it does appear to be being misunderstood. This isn't actually a crypto
>> problem at all and "users taking short-cuts" isn't an issue.
>
> Yes it is. If users don't properly pre-encrypt their files before sending it
> out to the cloud, AND THEY WON'T,

Yes they will. He said his application encrypts the files for them, presumably he is indeed using "proper crypto" for that.

> receivers WILL be able to read those files,

That's a problem for the sender not the receiver.
| From | Randall Smith <randall@tnr.cc> |
|---|---|
| Date | 2015-06-25 13:58 -0500 |
| Message-ID | <mailman.77.1435258712.3674.python-list@python.org> |
| In reply to | #93146 |
Thanks Jon. I couldn't have answered those questions better myself, and I wrote the software in question. I didn't intend to describe the entire system, but rather just enough of it to present the issue at hand. You seem to understand it quite well.

I'm now using a randomly generated 256-byte translation table, which performs very well on the lowly Raspberry Pi ARM chip. The Raspberry Pi is to be my recommended storage node platform.

For those that care, the storage system is something like Amazon S3, except storage is distributed peer to peer. Clients compress, encrypt, and chunk data, then send it to storage nodes. Storage nodes propagate the data. Encryption and Authentication are handled through TLS. Files use AES encryption for storage. Storage Nodes are monitored for availability, integrity, and performance. Data transfers are coordinated by a centralized service which tracks storage and transfers. Redundancy is configurable by chunk. Storage nodes are compensated for storage x time. Uploads and downloads can utilize several storage nodes simultaneously to increase throughput.

-Randall
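A randomly generated 256-byte translation table of the kind mentioned in this post might look roughly like the following. This is a minimal sketch, not the OP's implementation: `make_table` and `invert_table` are hypothetical names, and drawing the permutation from `random.SystemRandom` (OS-provided randomness) is an assumption about how the key would be chosen.

```python
import random

_rng = random.SystemRandom()  # backed by the operating system's entropy source


def make_table() -> bytes:
    """Return a random permutation of the 256 byte values."""
    values = list(range(256))
    _rng.shuffle(values)
    return bytes(values)


def invert_table(table: bytes) -> bytes:
    """Build the table that undoes `table`."""
    inverse = bytearray(256)
    for plain, obfuscated in enumerate(table):
        inverse[obfuscated] = plain
    return bytes(inverse)


# The receiving node picks a fresh table only after the chunk has arrived.
chunk = b"data received from a peer"
table = make_table()
stored = chunk.translate(table)                    # what is written to disk
restored = stored.translate(invert_table(table))   # what is served back
```

`bytes.translate` does the mapping in C, which is consistent with the post's remark that the approach performs well on modest hardware.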
| From | Chris Angelico <rosuav@gmail.com> |
|---|---|
| Date | 2015-06-26 10:33 +1000 |
| Message-ID | <mailman.87.1435278799.3674.python-list@python.org> |
| In reply to | #93146 |
On Fri, Jun 26, 2015 at 1:26 AM, Jon Ribbens <jon+usenet@unequivocal.co.uk> wrote:
>> There are only 256 possible values for n, one of which doesn't transform the
>> data at all (ROT-0). If you're thinking of attacking this by pencil and
>> paper, 255 transformations sounds like a lot. For a computer, that's barely
>> harder than a single transformation.
>
> Well, it means you need to send 256 times as much data, which is a
> start. If you're instead using a 256-byte translation table then
> an attack becomes utterly impractical.
>

Utterly impractical? Maybe, if you attempt a pure brute-force approach - there are 256! possible translation tables, which is roughly e500 attempts [1], and at roughly four a microsecond [2] that'd still take a ridiculously long time. But there are two gigantic optimizations you could do.

Firstly, there are frequency-based attacks, and byte value duplicates will tell you a lot - classic cryptographic work.

And secondly, you can simply take the first few bytes of a file - let's say 16, although a lot of files can be recognized in less than that. Even if there are no duplicate bytes, that'd be a maximum of 16! translation tables that truly matter, or just 2e13. At the same speed, that makes about a million seconds of computing time required. Divide that across a bunch of separate computers (the job is embarrassingly parallel after all), and you could get that result pretty easily. Cut the prefix to just 8 bytes and you have a mere 40K encryption keys to try - so quick that you wouldn't even see it happen.

Nope, a simple substitution cipher is still not secure. Even the famous Enigma machine was a lot more than just letter-for-letter substitution - a double letter in the cleartext wouldn't be represented by a double letter in the result - and once the machine's secrets were figured out, the day's key could be reassembled fairly readily.

ChrisA

[1] It's actually closer to 8.6e506, if you care.
[2] timeit result from my laptop - you could do better, but that's a reasonable average
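The figures quoted in this post can be rechecked with the standard library. The sketch below only reproduces the arithmetic as stated (256!, 16!, 8!, and a rate of four attempts per microsecond); it does not judge whether 16! is the right search space for a 16-byte prefix. `RATE_PER_SECOND` and `seconds_to_try` are names made up for the illustration.

```python
import math

RATE_PER_SECOND = 4_000_000  # "roughly four a microsecond", as quoted


def seconds_to_try(candidates: int) -> float:
    """Time to test `candidates` keys at the quoted rate."""
    return candidates / RATE_PER_SECOND


full_keyspace = math.factorial(256)
digits = len(str(full_keyspace))            # number of decimal digits in 256!
log10_keys = math.log10(full_keyspace)      # works on arbitrarily large ints

prefix16 = math.factorial(16)               # the "2e13" figure
prefix8 = math.factorial(8)                 # the "40K" figure

print(digits, round(log10_keys, 2))
print(prefix16, seconds_to_try(prefix16))
print(prefix8, seconds_to_try(prefix8))
```

At the quoted rate, 16! candidates work out to a few million seconds, the same ballpark as the estimate in the post, and 8! candidates take about a hundredth of a second.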
| From | Jon Ribbens <jon+usenet@unequivocal.co.uk> |
|---|---|
| Date | 2015-06-26 10:49 +0000 |
| Message-ID | <slrnmoqbjk.1nu.jon+usenet@frosty.unequivocal.co.uk> |
| In reply to | #93169 |
On 2015-06-26, Chris Angelico <rosuav@gmail.com> wrote:
> On Fri, Jun 26, 2015 at 1:26 AM, Jon Ribbens
> <jon+usenet@unequivocal.co.uk> wrote:
>> Well, it means you need to send 256 times as much data, which is a
>> start. If you're instead using a 256-byte translation table then
>> an attack becomes utterly impractical.
>
> Utterly impractical? Maybe, if you attempt a pure brute-force approach
> - there are 256! possible translation tables, which is roughly e500
> attempts [1], and at roughly four a microsecond [2] that'd still take
> a ridiculously long time. But there are two gigantic optimizations you
> could do. Firstly, there are frequency-based attacks,

No, there aren't. As I already said, the attacker does not have the ciphertext. He can't do anything related to frequency analysis.
| From | Ian Kelly <ian.g.kelly@gmail.com> |
|---|---|
| Date | 2015-06-25 19:01 -0600 |
| Message-ID | <mailman.89.1435280528.3674.python-list@python.org> |
| In reply to | #93146 |
On Thu, Jun 25, 2015 at 6:33 PM, Chris Angelico <rosuav@gmail.com> wrote:
> On Fri, Jun 26, 2015 at 1:26 AM, Jon Ribbens
> <jon+usenet@unequivocal.co.uk> wrote:
>>> There are only 256 possible values for n, one of which doesn't transform the
>>> data at all (ROT-0). If you're thinking of attacking this by pencil and
>>> paper, 255 transformations sounds like a lot. For a computer, that's barely
>>> harder than a single transformation.
>>
>> Well, it means you need to send 256 times as much data, which is a
>> start. If you're instead using a 256-byte translation table then
>> an attack becomes utterly impractical.
>>
>
> Utterly impractical? Maybe, if you attempt a pure brute-force approach
> - there are 256! possible translation tables, which is roughly e500
> attempts [1], and at roughly four a microsecond [2] that'd still take
> a ridiculously long time. But there are two gigantic optimizations you
> could do. Firstly, there are frequency-based attacks, and byte value
> duplicates will tell you a lot - classic cryptographic work. And
> secondly, you can simply take the first few bytes of a file - let's
> say 16, although a lot of files can be recognized in less than that.
> Even if there are no duplicate bytes, that'd be a maximum of 16!
> translation tables that truly matter, or just 2e13. At the same speed,
> that makes about a million seconds of computing time required. Divide
> that across a bunch of separate computers (the job is embarrassingly
> parallel after all), and you could get that result pretty easily. Cut
> the prefix to just 8 bytes and you have a mere 40K encryption keys to
> try - so quick that you wouldn't even see it happen. Nope, a simple
> substitution cipher is still not secure. Even the famous Enigma
> machine was a lot more than just letter-for-letter substitution - a
> double letter in the cleartext wouldn't be represented by a double
> letter in the result - and once the machine's secrets were figured
> out, the day's key could be reassembled fairly readily.

You're making the same mistake that Steven did in misunderstanding the threat model. The goal isn't to prevent the attacker from working out the key for a file that has already been obfuscated. Any real data that might be exposed by a vulnerability in the server is presumed to have already been strongly encrypted by the user.

The goal is to prevent the attacker from guessing a key that hasn't even been generated yet, which could be exploited to engineer the obfuscated content into something malicious.

There are no frequency-based attacks possible here, because you can't do frequency analysis on the result of a key that hasn't even been generated yet. Assuming that you have no attack on the key generation itself, the best you can do is send a file deobfuscated with a random key and hope that the recipient randomly chooses the same key; the odds of that happening are 1 in 256!.

That said, I do see a potential weakness here: if the attacker can create a malicious payload using only a subset of the 256 possible byte values, then the odds of getting a correct key are increased, since multiple keys will work. For an extreme example, if the attacker can manage to craft a malicious payload that uses only the two byte values 32 and 47, then the probability of getting a key that will obfuscate to that is increased to 1 in 256! / 254!, or 1 in 65280. If they distribute 65280 copies of that payload to various recipients, then they can expect that one recipient on average will get the payload in its malicious form.
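The subset argument at the end of this post can be checked numerically. In the sketch below, `one_in` and `craft_upload` are hypothetical helper names, and the guessed table entries are arbitrary; the point is only that a payload with k distinct byte values constrains k entries of the table, so the odds are 1 in 256 × 255 × ... × (256 − k + 1).

```python
import math


def one_in(payload: bytes) -> int:
    """N such that a uniformly random table yields `payload` with odds 1 in N."""
    distinct = len(set(payload))
    return math.perm(256, distinct)  # requires Python 3.8+


def craft_upload(payload: bytes, guessed: dict) -> bytes:
    """Build the upload that becomes `payload` if the table agrees with `guessed`.

    `guessed` maps an uploaded byte value to the stored byte value it is
    assumed to turn into.
    """
    reverse = {stored: sent for sent, stored in guessed.items()}
    return bytes(reverse[b] for b in payload)


# A payload made of only the two byte values from the post's example.
payload = bytes([47, 32, 47, 32, 32])
guess = {0: 32, 1: 47}              # attacker's guess for two table entries
upload = craft_upload(payload, guess)

# One table (out of many) that happens to agree with the guess:
# start from the identity and swap 0<->32 and 1<->47.
table = bytearray(range(256))
table[0], table[32] = 32, 0
table[1], table[47] = 47, 1

print(one_in(payload))
print(upload.translate(bytes(table)) == payload)
```

Any table that agrees on just those two entries produces the payload, which is why the odds improve from 1 in 256! to 1 in 65280 for this extreme case.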
| From | Steven D'Aprano <steve@pearwood.info> |
|---|---|
| Date | 2015-06-27 03:06 +1000 |
| Message-ID | <558d86b0$0$1659$c3e8da3$5496439d@news.astraweb.com> |
| In reply to | #93172 |
On Fri, 26 Jun 2015 11:01 am, Ian Kelly wrote:

> You're making the same mistake that Steven did in misunderstanding the
> threat model.

I don't think I'm misunderstanding the threat, I think I'm pointing out a threat which the OP is hoping to just ignore.

In an earlier post, I suggested that the threat model should involve at least *three* different attacks, apart from the usual man-in-the-middle attacks of data in transit.

One is that the attacker is the person sending the data. E.g. I want to send a nasty payload (say, malware, or an offensive image). Another is that the attacker is the recipient of the file, who wants to read the sender's data.

As far as I can tell, the OP's plan to defend the sender's privacy is to dump responsibility for encrypting the files in the sender's lap. As far as I'm concerned, perhaps as many as one user in 20000 will pre-encrypt their files. (Early adopters will be unrepresentative of the eventual user base of this system. If this takes off, the user base will likely end up dominated by people who think that "qwerty" is the epitome of unguessable passwords.)

Users just don't use crypto unless their applications do it for them.

My opinion is that the application ought to do so, and not expect Aunt Tillie to learn how to correctly use encryption software before uploading her files.

http://www.catb.org/jargon/html/A/Aunt-Tillie.html

It is the OP's prerogative to disagree, of course, but to me, if the OP's app doesn't use strong crypto to encrypt users' data, that's tantamount to saying they don't care about their users' data privacy. Using a monoalphabetic substitution cipher to obfuscate the data is not strong crypto.

> The goal isn't to prevent the attacker from working out
> the key for a file that has already been obfuscated. Any real data
> that might be exposed by a vulnerability in the server is presumed to
> have already been strongly encrypted by the user.

I think that's a ridiculously unrealistic presumption, unless your user-base is entirely taken from a very small subset of security savvy and pedantically careful users.

> The goal is to prevent the attacker from guessing a key that hasn't
> even been generated yet, which could be exploited to engineer the
> obfuscated content into something malicious.

They don't need to predict the key exactly. If they can predict that the key will be, let's say, one of these thousand values, then they can generate one thousand files and upload them. One of them will match the key, and there's your exploit. That's one attack.

A second attack is to force the key. The attacker controls the machine the application is running on, they control /dev/urandom and can feed your app whatever not-so-random numbers they like, so potentially they can force the app to use the key of their choosing. Then they don't need 1000 files, they just need one.

That's two. Does anyone think that I've thought of all the possible attacks?

(Well, hypothetical attacks. I acknowledge that I don't know the application, and cannot be sure that it *actually is* vulnerable to these attacks.)

The problem here is that a monoalphabetic substitution cipher is not resistant to preimage attacks. Your only defence is that the key is unknown. If the attacker can force the key, or predict the key, or guess a small range of keys, they can exploit your weak cipher.

(Technically, "preimage attack" is usually used to refer to attacks on hash functions. I'm not sure if the same name is used for attacks on ciphers.)

https://en.wikipedia.org/wiki/Preimage_attack

With a strong crypto cipher, there are no known preimage attacks. Even if the attacker knows exactly what key you are using, they cannot predict what preimage they need to supply in order to generate the malicious payload they want after encryption. (As far as I know.)

That is the critical issue right there. The sort of simple monoalphabetic substitution cipher using bytes.translate that the OP is using is vulnerable to preimage attacks. Strong crypto is not.

> There are no
> frequency-based attacks possible here, because you can't do frequency
> analysis on the result of a key that hasn't even been generated yet.

Frequency-based attacks apply to a different threat. I'm referring to at least two different attacks here, with different attackers and different victims. Don't mix them up.

> Assuming that you have no attack on the key generation itself, the

Not a safe assumption!

> best you can do is send a file deobfuscated with a random key and hope
> that the recipient randomly chooses the same key; the odds of that
> happening are 1 in 256!.

It's easy to come up with attacks which are no better than brute force. It's the attacks which are better than brute force that you have to watch out for.

--
Steven
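The "one of these thousand values" scenario can be made concrete with a deliberately weak, purely hypothetical key generator. Nothing in the thread says the OP's application derives its table this way; `table_from_seed`, the seed range, and the sample payload are all assumptions made for the illustration. The sketch shows that when the table is predictable to within a small candidate set, producing a matching upload is just a matter of applying each candidate's inverse table.

```python
import random


def table_from_seed(seed: int) -> bytes:
    """Hypothetical weak key generation: a shuffle driven by a guessable seed."""
    values = list(range(256))
    random.Random(seed).shuffle(values)
    return bytes(values)


def invert(table: bytes) -> bytes:
    """Build the table that undoes `table`."""
    inverse = bytearray(256)
    for plain, obfuscated in enumerate(table):
        inverse[obfuscated] = plain
    return bytes(inverse)


def craft_uploads(payload: bytes, candidate_seeds) -> dict:
    """For each candidate seed, build the file that would obfuscate to `payload`."""
    return {seed: payload.translate(invert(table_from_seed(seed)))
            for seed in candidate_seeds}


payload = b"stand-in for content the attacker wants on disk"
uploads = craft_uploads(payload, range(1000))

# The receiver's seed is unknown to the attacker but inside the guessed range.
actual_seed = 417
receiver_table = table_from_seed(actual_seed)
hits = [seed for seed, blob in uploads.items()
        if blob.translate(receiver_table) == payload]
print(actual_seed in hits)
```

With a table drawn uniformly from a sound OS randomness source, no such small candidate set exists, which is the assumption the rest of the thread goes on to argue over.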
| From | Randall Smith <randall@tnr.cc> |
|---|---|
| Date | 2015-06-26 15:09 -0500 |
| Message-ID | <mailman.111.1435349412.3674.python-list@python.org> |
| In reply to | #93197 |
On 06/26/2015 12:06 PM, Steven D'Aprano wrote:
> On Fri, 26 Jun 2015 11:01 am, Ian Kelly wrote:
>
>> You're making the same mistake that Steven did in misunderstanding the
>> threat model.
>
> I don't think I'm misunderstanding the threat, I think I'm pointing out a
> threat which the OP is hoping to just ignore.

I'm not hoping to ignore anything. I didn't explain the entire system, as it was not necessary to find a solution to the problem at hand. But since you want to make negative assumptions about what I didn't tell you, I'll gladly address your accusations of negligence.

> In an earlier post, I suggested that the threat model should involve at
> least *three* different attacks, apart from the usual man-in-the-middle
> attacks of data in transit.

All communication is secured using TLS and authentication handled by X.509 certificates. This prevents man-in-the-middle attacks. Certificates are signed by CAs I control.

> One is that the attacker is the person sending the data. E.g. I want to send
> a nasty payload (say, malware, or an offensive image). Another is that the
> attacker is the recipient of the file, who wants to read the sender's data.

The only person who can read a file is the owner. AES encryption is built into the client software. The only way data can be uploaded unencrypted is if encryption is intentionally disabled.

> As far as I can tell, the OP's plan to defend the sender's privacy is to
> dump responsibility for encrypting the files in the sender's lap. As far as
> I'm concerned, perhaps as many as one user in 20000 will pre-encrypt their
> files. (Early adopters will be unrepresentative of the eventual user base
> of this system. If this takes off, the user base will likely end up
> dominated by people who think that "qwerty" is the epitome of unguessable
> passwords.)

Making assumptions again. See above. The client software encrypts by default. You're also assuming there is no password strength checking.

> Users just don't use crypto unless their applications do it for them.

And it does.

> My opinion is that the application ought to do so, and not expect Aunt
> Tillie to learn how to correctly use encryption software before uploading
> her files.
>
> http://www.catb.org/jargon/html/A/Aunt-Tillie.html
>
> It is the OP's prerogative to disagree, of course, but to me, if the OP's
> app doesn't use strong crypto to encrypt users' data, that's tantamount to
> saying they don't care about their users' data privacy. Using a
> monoalphabetic substitution cipher to obfuscate the data is not strong
> crypto.

You've gone on a rampage about nothing. My original description said the client was supposed to encrypt the data, but you want to assume the opposite for some unknown reason.

>> The goal isn't to prevent the attacker from working out
>> the key for a file that has already been obfuscated. Any real data
>> that might be exposed by a vulnerability in the server is presumed to
>> have already been strongly encrypted by the user.
>
> I think that's a ridiculously unrealistic presumption, unless your user-base
> is entirely taken from a very small subset of security savvy and
> pedantically careful users.

The difference is he's not assuming I'm a moron. He's giving me the benefit of the doubt. That plus I actually said, "data senders are supposed to encrypt data". In a networked system, you can't make assumptions about what the other peers are doing. You have to handle what comes across the wire. You also have to consider that you may come under attack. That's what this is about.

>> The goal is to prevent the attacker from guessing a key that hasn't
>> even been generated yet, which could be exploited to engineer the
>> obfuscated content into something malicious.
>
> They don't need to predict the key exactly. If they can predict that the key
> will be, lets say, one of these thousand values, then they can generate one
> thousand files and upload them. One of them will match the key, and there's
> your exploit. That's one attack.

Thousand values??? Isn't it 256!, which is just freaking huge!

import math; math.factorial(256)

> A second attack is to force the key. The attacker controls the machine the
> application is running on, they control /dev/urandom and can feed your app
> whatever not-so-random numbers they like, so potentially they can force the
> app to use the key of their choosing. Then they don't need 1000 files, they
> just need one.

If the attacker controlled the machine the app was on, why would he fool with /dev/urandom? I think he'd just plant the files he wanted to plant and be done. This is nonsensical anyway.

> That's two. Does anyone think that I've thought of all the possible attacks?
>
> (Well, hypothetical attacks. I acknowledge that I don't know the
> application, and cannot be sure that it *actually is* vulnerable to these
> attacks.)
>
> The problem here is that a monoalphabetic substitution cipher is not
> resistant to preimage attacks. Your only defence is that the key is
> unknown. If the attacker can force the key, or predict the key, or guess a
> small range of keys, they can exploit your weak cipher.
>
> (Technically, "preimage attack" is usually used to refer to attacks on hash
> functions. I'm not sure if the same name is used for attacks on ciphers.)
>
> https://en.wikipedia.org/wiki/Preimage_attack
>
> With a strong crypto cipher, there are no known preimage attacks. Even if
> the attacker knows exactly what key you are using, they cannot predict what
> preimage they need to supply in order to generate the malicious payload
> they want after encryption. (As far as I know.)
>
> That is the critical issue right there. The sort of simple monoalphabetic
> substitution cipher using bytes.translate that the OP is using is
> vulnerable to preimage attacks. Strong crypto is not.

It isn't vulnerable to preimage attacks unless you can guess the key out of 256! possibilities. The key doesn't even exist until after the data is sent. Give me one plausible scenario where an attacker can cause malware to hit the disk after bytearray.translate with a 256-byte translation table and I'll be thankful to you. As it stands now, you're either ignoring information I've already given, assuming I've made moronic design decisions then pouncing on them, or completely misunderstanding the issue at hand.

>> There are no
>> frequency-based attacks possible here, because you can't do frequency
>> analysis on the result of a key that hasn't even been generated yet.
>
> Frequency-based attacks apply to a different threat. I'm referring to at
> least two different attacks here, with different attackers and different
> victims. Don't mix them up.
>
>
>> Assuming that you have no attack on the key generation itself, the
>
> Not a safe assumption!

For this case it is a safe assumption, for the same reason you're assuming the PSU isn't defective. An attack on /dev/urandom, for instance, would also compromise TLS and every sort of secure key generation, not just the key for byte translation. That is a separate problem altogether with a separate solution.

>> best you can do is send a file deobfuscated with a random key and hope
>> that the recipient randomly chooses the same key; the odds of that
>> happening are 1 in 256!.
>
> It's easy to come up with attacks which are no better than brute force. It's
> the attacks which are better than brute force that you have to watch out
> for.

And that's why we're having this discussion. Do you know of an attack in which you can control the output (say at least 100 consecutive bytes) for data which goes through a 256-byte translation table, chosen randomly from 256! permutations after the data is sent? If you do, I'm all ears! But at this point you're just setting up straw men and knocking them down.
| From | Johannes Bauer <dfnsonfsduifb@gmx.de> |
|---|---|
| Date | 2015-06-26 23:07 +0200 |
| Message-ID | <mmkev4$uhl$1@news.albasani.net> |
| In reply to | #93204 |
On 26.06.2015 22:09, Randall Smith wrote:

> You've gone on a rampage about nothing. My original description said
> the client was supposed to encrypt the data, but you want to assume the
> opposite for some unknown reason.

While you seem to think that Steven is rampaging about nothing, he does have a fair point: You consistently were vague about whether you want to have encryption, authentication or obfuscation of data. This suggests that you may not be so sure yourself what it is you actually want.

All Steven is doing is pointing out that people do good crypto for a reason. It's 2015 and we're still discussing "substitution ciphers", really? Good crypto is available, it's fast, it has awesome cryptanalysis. All Steven is pointing out is that when ten crypto-laymen meet in a Python newsgroup and think they have invented a soooper secure scheme, it may still be complete and utter crap. Just not everyone can see it.

You always play around with the 256! which would be a ridiculously high security margin (1684 bits of security, woooo!). You totally ignore that the system can be broken in a linear fashion. I don't need to know all 256 characters to do damage, sometimes even a handful will already give me part of what I need and the option to crack more and more. This is something that would ultimately and instantly disqualify your "crypto"system as utterly insecure.

Nobody assumes you're a moron. But it's safe to assume that you're a crypto layman, because only laymen have no clue on how difficult it is to get cryptography even remotely right. Everyone who knows the trade uses proven constructions not because it's inconvenient, but because it's one of the very few ways to achieve a secure system.

That said, for your solution this type of obfuscation may be fine. And chances are that nobody will ever notice. But don't claim you weren't warned about the abyss when you designed your solution and people break this stuff. Because then you might *look* like a moron (even if you're not), since the first question people will ask will be: "Why? Why on earth?"

It's a blatantly obvious bad idea(tm). That people in 2015 actually defend inventing a substitution-cipher "crypto"system sends literally shivers down my spine.

Cheers,
Johannes

--
>> Where exactly did you say you had predicted the quake again?
> At least not publicly!
Ah, the latest and to this day most brilliant stunt by our great
cosmologist: the secret prediction.
 - Karl Kaos on Rüdiger Thomas in dsa <hidbv3$om2$1@speranza.aioe.org>
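The two numerical claims in this post can be checked with a short sketch, assuming Python 3. It assumes the attacker can see the mangled form of some input he already knows, which is exactly the assumption that others in the thread dispute. The names `learn_entries` and `secret_table` and the fixed seed are hypothetical and exist only to make the illustration reproducible.

```python
import math
import random

# Size of the key space: every permutation of the 256 byte values.
keyspace_bits = math.factorial(256).bit_length()  # 1684


def learn_entries(plaintext, mangled):
    """Recover table entries from input whose mangled form is also known."""
    return {p: m for p, m in zip(plaintext, mangled)}


# Hypothetical secret table; the fixed seed is only for reproducibility.
values = list(range(256))
random.Random(2015).shuffle(values)
secret_table = bytes(values)

# One known input/output pair reveals one table entry per distinct byte.
sample = b"hello world"
learned = learn_entries(sample, sample.translate(secret_table))

# With only those entries, the output for any input built from the
# learned bytes can be predicted without knowing the rest of the table.
crafted = b"hold"
predicted = bytes(learned[b] for b in crafted)
```

Here eight learned entries out of 256 are enough to predict the output for `crafted`, which is the sense in which the table falls piece by piece instead of requiring a search over 256! keys.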
| From | Jon Ribbens <jon+usenet@unequivocal.co.uk> |
|---|---|
| Date | 2015-06-26 21:29 +0000 |
| Message-ID | <slrnmorh57.1nu.jon+usenet@frosty.unequivocal.co.uk> |
| In reply to | #93205 |
On 2015-06-26, Johannes Bauer <dfnsonfsduifb@gmx.de> wrote:
> On 26.06.2015 22:09, Randall Smith wrote:
>> You've gone on a rampage about nothing. My original description said
>> the client was supposed to encrypt the data, but you want to assume the
>> opposite for some unknown reason.
>
> While you seem to think that Steven is rampaging about nothing, he does
> have a fair point: You consistently were vague about whether you want to
> have encryption, authentication or obfuscation of data. This suggests
> that you may not be so sure yourself what it is you actually want.

He hasn't been vague, you and Steven just haven't been paying attention.

> You always play around with the 256! which would be a ridiculously high
> security margin (1684 bits of security, woooo!). You totally ignore that
> the system can be broken in a linear fashion.

No, it can't, because the attacker does not have access to the ciphertext.

> Nobody assumes you're a moron. But it's safe to assume that you're a
> crypto layman, because only laymen have no clue on how difficult it is
> to get cryptography even remotely right.

Amateur crypto is indeed a bad idea. But what you're still not getting is that what he's doing here *isn't crypto*. He's just trying to avoid letting third parties write completely arbitrary data to the disk. You know what would be a perfectly good solution to his problem? Base 64 encoding. That would solve the issue pretty much completely; the only reason it's not an ideal solution is that it of course increases the size of the data.

> That people in 2015 actually defend inventing a substitution-cipher
> "crypto"system sends literally shivers down my spine.

Nobody is defending such a thing, you just haven't understood what problem is being solved here.
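The Base 64 suggestion and its size cost can be illustrated with the standard library's `base64` module. This is a small sketch, assuming Python 3; the 3000-byte random input is an arbitrary stand-in for uploaded data, and the variable names are hypothetical.

```python
import base64
import os
import string

raw = os.urandom(3000)              # arbitrary binary input
encoded = base64.b64encode(raw)

# Every output byte comes from a 64-character alphabet plus '=' padding.
allowed = set((string.ascii_letters + string.digits + "+/=").encode("ascii"))
only_allowed = set(encoded) <= allowed

# Four output bytes are produced for every three input bytes.
overhead = len(encoded) / len(raw)
```

So whatever bytes are sent, what lands on disk is restricted to that alphabet, at the price of roughly a one-third increase in size.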
| From | Mark Lawrence <breamoreboy@yahoo.co.uk> |
|---|---|
| Date | 2015-06-26 22:55 +0100 |
| Message-ID | <mailman.112.1435355733.3674.python-list@python.org> |
| In reply to | #93207 |
On 26/06/2015 22:29, Jon Ribbens wrote:
> On 2015-06-26, Johannes Bauer <dfnsonfsduifb@gmx.de> wrote:
>> On 26.06.2015 22:09, Randall Smith wrote:
>>> You've gone on a rampage about nothing. My original description said
>>> the client was supposed to encrypt the data, but you want to assume the
>>> opposite for some unknown reason.
>>
>> While you seem to think that Steven is rampaging about nothing, he does
>> have a fair point: You consistently were vague about whether you want to
>> have encryption, authentication or obfuscation of data. This suggests
>> that you may not be so sure yourself what it is you actually want.
>
> He hasn't been vague, you and Steven just haven't been paying
> attention.
>
>> You always play around with the 256! which would be a ridiculously high
>> security margin (1684 bits of security, woooo!). You totally ignore that
>> the system can be broken in a linear fashion.
>
> No, it can't, because the attacker does not have access to the
> ciphertext.
>
>> Nobody assumes you're a moron. But it's safe to assume that you're a
>> crypto layman, because only laymen have no clue on how difficult it is
>> to get cryptography even remotely right.
>
> Amateur crypto is indeed a bad idea. But what you're still not getting
> is that what he's doing here *isn't crypto*. He's just trying to avoid
> letting third parties write completely arbitrary data to the disk. You
> know what would be a perfectly good solution to his problem? Base 64
> encoding. That would solve the issue pretty much completely, the only
> reason it's not an ideal solution is that it of course increases the
> size of the data.
>
>> That people in 2015 actually defend inventing a substitution-cipher
>> "crypto"system sends literally shivers down my spine.
>
> Nobody is defending such a thing, you just haven't understood what
> problem is being solved here.
>

To be perfectly blunt, I gave up days ago trying to follow what was being said: just too many words from all angles and too few diagrams for me to follow. I sincerely hope it doesn't end in tears.

--
My fellow Pythonistas, ask not what our language can do for you, ask
what you can do for our language.

Mark Lawrence
| From | Johannes Bauer <dfnsonfsduifb@gmx.de> |
|---|---|
| Date | 2015-06-27 00:42 +0200 |
| Message-ID | <mmkkhc$m5d$1@news.albasani.net> |
| In reply to | #93207 |
On 26.06.2015 23:29, Jon Ribbens wrote:

>> While you seem to think that Steven is rampaging about nothing, he does
>> have a fair point: You consistently were vague about whether you want to
>> have encryption, authentication or obfuscation of data. This suggests
>> that you may not be so sure yourself what it is you actually want.
>
> He hasn't been vague, you and Steven just haven't been paying
> attention.

Bullshit. Even the topic indicates that he doesn't know what he wants: "data mangling" or "encryption", which one is it?

>> You always play around with the 256! which would be a ridiculously high
>> security margin (1684 bits of security, woooo!). You totally ignore that
>> the system can be broken in a linear fashion.
>
> No, it can't, because the attacker does not have access to the
> ciphertext.

Or so you claim.

I could go into detail about how the assumption that the ciphertext is secret is not a smart one in the context of cryptography. And how side channels and other leakage may affect overall system security. But I'm going to save my time on that. I do get paid to review cryptographic systems and part of the job is dealing with belligerent people who have read Schneier's blog and think they can outsmart anyone else. Since I don't get paid to convince you, it's absolutely fine that you think your substitution scheme is the grand prize.

>> Nobody assumes you're a moron. But it's safe to assume that you're a
>> crypto layman, because only laymen have no clue on how difficult it is
>> to get cryptography even remotely right.
>
> Amateur crypto is indeed a bad idea. But what you're still not getting
> is that what he's doing here *isn't crypto*.

So the topic says "Encrypting". If you look really closely at the word, the part "crypt" might give away to you that cryptography is involved.

> He's just trying to avoid
> letting third parties write completely arbitrary data to the disk.

There's your requirement. Then there's obviously some kind of implication when a third party *can* write arbitrary data to disk. And your other solution to that problem...

> You
> know what would be a perfectly good solution to his problem? Base 64
> encoding. That would solve the issue pretty much completely, the only
> reason it's not an ideal solution is that it of course increases the
> size of the data.

...wow.

That's a nice interpretation of not letting a third party write completely arbitrary data. According to your definition, this would be: It's okay if the attacker can control 6 of 8 bits.

>> That people in 2015 actually defend inventing a substitution-cipher
>> "crypto"system sends literally shivers down my spine.
>
> Nobody is defending such a thing, you just haven't understood what
> problem is being solved here.

Oh I understand your "solutions" plenty well. The only thing I don't understand is why you don't own a Fields medal yet for your groundbreaking work on bulletproof obfuscation.

Cheers,
Johannes

--
>> Where exactly did you say you had predicted the quake again?
> At least not publicly!
Ah, the latest and to this day most brilliant stunt by our great
cosmologist: the secret prediction.
 - Karl Kaos on Rüdiger Thomas in dsa <hidbv3$om2$1@speranza.aioe.org>
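The "6 of 8 bits" objection can be made concrete with a short sketch, assuming Python 3 and the standard `base64` module. The string in `wanted_on_disk` is a made-up example, and the variable names are hypothetical. The point it shows is that a sender who knows the stored form will be Base 64 can pick any text from the Base 64 alphabet (with a length that is a multiple of four) and work backwards to the upload that produces it.

```python
import base64

# Text the sender would like to see stored, drawn from the Base 64
# alphabet only; 16 characters, so no padding is involved.
wanted_on_disk = b"echo+pwned/1234x"

# Work backwards: decode the desired output to find the bytes to upload.
upload = base64.b64decode(wanted_on_disk, validate=True)

# The receiver encodes the upload and stores exactly the chosen text.
stored = base64.b64encode(upload)
```

Whether text restricted to that alphabet is dangerous depends on what later reads the file, which is the part the two posters disagree about.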
| From | Devin Jeanpierre <jeanpierreda@gmail.com> |
|---|---|
| Date | 2015-06-26 16:26 -0700 |
| Message-ID | <mailman.114.1435361236.3674.python-list@python.org> |
| In reply to | #93210 |
Johannes, I agree with a lot of what you say, but can you please have less of a mean attitude?

-- Devin

On Fri, Jun 26, 2015 at 3:42 PM, Johannes Bauer <dfnsonfsduifb@gmx.de> wrote:
> On 26.06.2015 23:29, Jon Ribbens wrote:
>
>>> While you seem to think that Steven is rampaging about nothing, he does
>>> have a fair point: You consistently were vague about whether you want to
>>> have encryption, authentication or obfuscation of data. This suggests
>>> that you may not be so sure yourself what it is you actually want.
>>
>> He hasn't been vague, you and Steven just haven't been paying
>> attention.
>
> Bullshit. Even the topic indicates that he doesn't know what he wants:
> "data mangling" or "encryption", which one is it?
>
>>> You always play around with the 256! which would be a ridiculously high
>>> security margin (1684 bits of security, woooo!). You totally ignore that
>>> the system can be broken in a linear fashion.
>>
>> No, it can't, because the attacker does not have access to the
>> ciphertext.
>
> Or so you claim.
>
> I could go into detail about how the assumption that the ciphertext is
> secret is not a smart one in the context of cryptography. And how side
> channels and other leakage may affect overall system security. But I'm
> going to save my time on that. I do get paid to review cryptographic
> systems and part of the job is dealing with belligerent people who have
> read Schneier's blog and think they can outsmart anyone else. Since I
> don't get paid to convince you, it's absolutely fine that you think your
> substitution scheme is the grand prize.
>
>>> Nobody assumes you're a moron. But it's safe to assume that you're a
>>> crypto layman, because only laymen have no clue on how difficult it is
>>> to get cryptography even remotely right.
>>
>> Amateur crypto is indeed a bad idea. But what you're still not getting
>> is that what he's doing here *isn't crypto*.
>
> So the topic says "Encrypting". If you look really closely at the word,
> the part "crypt" might give away to you that cryptography is involved.
>
>> He's just trying to avoid
>> letting third parties write completely arbitrary data to the disk.
>
> There's your requirement. Then there's obviously some kind of
> implication when a third party *can* write arbitrary data to disk. And
> your other solution to that problem...
>
>> You
>> know what would be a perfectly good solution to his problem? Base 64
>> encoding. That would solve the issue pretty much completely, the only
>> reason it's not an ideal solution is that it of course increases the
>> size of the data.
>
> ...wow.
>
> That's a nice interpretation of not letting a third party write
> completely arbitrary data. According to your definition, this would be:
> It's okay if the attacker can control 6 of 8 bits.
>
>>> That people in 2015 actually defend inventing a substitution-cipher
>>> "crypto"system sends literally shivers down my spine.
>>
>> Nobody is defending such a thing, you just haven't understood what
>> problem is being solved here.
>
> Oh I understand your "solutions" plenty well. The only thing I don't
> understand is why you don't own a Fields medal yet for your
> groundbreaking work on bulletproof obfuscation.
>
> Cheers,
> Johannes
>
> --
>>> Where exactly did you say you had predicted the quake again?
>> At least not publicly!
> Ah, the latest and to this day most brilliant stunt by our great
> cosmologist: the secret prediction.
> - Karl Kaos on Rüdiger Thomas in dsa <hidbv3$om2$1@speranza.aioe.org>
> --
> https://mail.python.org/mailman/listinfo/python-list