Groups > comp.lang.python > #75871 > unrolled thread
| Started by | Paul Wolf <paulwolf333@gmail.com> |
|---|---|
| First post | 2014-08-08 02:01 -0700 |
| Last post | 2014-08-10 10:38 -0600 |
| Articles | 20 on this page of 29 — 10 participants |
Template language for random string generation Paul Wolf <paulwolf333@gmail.com> - 2014-08-08 02:01 -0700
Re: Template language for random string generation Chris Angelico <rosuav@gmail.com> - 2014-08-08 19:22 +1000
Re: Template language for random string generation Paul Wolf <paulwolf333@gmail.com> - 2014-08-08 02:42 -0700
Re: Template language for random string generation Ned Batchelder <ned@nedbatchelder.com> - 2014-08-08 07:20 -0400
Re: Template language for random string generation Paul Wolf <paulwolf333@gmail.com> - 2014-08-08 06:02 -0700
Re: Template language for random string generation Chris Angelico <rosuav@gmail.com> - 2014-08-08 21:29 +1000
Re: Template language for random string generation Paul Wolf <paulwolf333@gmail.com> - 2014-08-08 06:03 -0700
Re: Template language for random string generation Steven D'Aprano <steve+comp.lang.python@pearwood.info> - 2014-08-09 00:08 +1000
Re: Template language for random string generation Skip Montanaro <skip@pobox.com> - 2014-08-08 09:35 -0500
Re: Template language for random string generation cwolf.algo@gmail.com - 2014-08-08 11:43 -0700
Re: Template language for random string generation Nick Cash <nick.cash@npcinternational.com> - 2014-08-08 20:28 +0000
Re: Template language for random string generation Ian Kelly <ian.g.kelly@gmail.com> - 2014-08-08 16:03 -0600
Re: Template language for random string generation Paul Wolf <paulwolf333@gmail.com> - 2014-08-08 23:52 -0700
Re: Template language for random string generation Ian Kelly <ian.g.kelly@gmail.com> - 2014-08-09 01:49 -0600
Re: Template language for random string generation Ian Kelly <ian.g.kelly@gmail.com> - 2014-08-09 01:57 -0600
Re: Template language for random string generation Devin Jeanpierre <jeanpierreda@gmail.com> - 2014-08-10 05:43 -0700
Re: Template language for random string generation Steven D'Aprano <steve+comp.lang.python@pearwood.info> - 2014-08-11 02:31 +1000
Re: Template language for random string generation Devin Jeanpierre <jeanpierreda@gmail.com> - 2014-08-10 11:28 -0700
Re: Template language for random string generation Steven D'Aprano <steve+comp.lang.python@pearwood.info> - 2014-08-11 12:22 +1000
Re: Template language for random string generation Chris Angelico <rosuav@gmail.com> - 2014-08-11 12:31 +1000
Re: Template language for random string generation Devin Jeanpierre <jeanpierreda@gmail.com> - 2014-08-11 00:01 -0700
Re: Template language for random string generation Chris Angelico <rosuav@gmail.com> - 2014-08-11 05:25 +1000
Re: Template language for random string generation Paul Wolf <paulwolf333@gmail.com> - 2014-08-10 22:06 -0700
Re: Template language for random string generation Mark Lawrence <breamoreboy@yahoo.co.uk> - 2014-08-11 08:58 +0100
Re: Template language for random string generation Paul Wolf <paulwolf333@gmail.com> - 2014-08-10 09:34 -0700
Re: Template language for random string generation Ian Kelly <ian.g.kelly@gmail.com> - 2014-08-10 10:47 -0600
Re: Template language for random string generation Paul Wolf <paulwolf333@gmail.com> - 2014-08-10 21:56 -0700
Re: Template language for random string generation Devin Jeanpierre <jeanpierreda@gmail.com> - 2014-08-10 11:48 -0700
Re: Template language for random string generation Ian Kelly <ian.g.kelly@gmail.com> - 2014-08-10 10:38 -0600
| From | Paul Wolf <paulwolf333@gmail.com> |
|---|---|
| Date | 2014-08-08 02:01 -0700 |
| Subject | Template language for random string generation |
| Message-ID | <14d94692-2257-4dfb-a82f-f1674a839233@googlegroups.com> |
This is a proposal with a working implementation for a random string generation template syntax for Python. `strgen` is a module for generating random strings in Python using a regex-like template language. Example:
>>> from strgen import StringGenerator as SG
>>> SG("[\l\d]{8:15}&[\d]&[\p]").render()
u'F0vghTjKalf4^mGLk'
The template ([\l\d]{8:15}&[\d]&[\p]) generates 8 to 15 letters and digits, then adds one more digit and one punctuation character and shuffles the result, so the final string is 10 to 17 characters long. It is guaranteed to have at least one digit (maybe more) and exactly one punctuation character.
If you look at various forums, like Stackoverflow, on how to generate random strings with Python, especially for passwords and other hopefully secure tokens, you will see dozens of variations of this:
>>> import random
>>> import string
>>> mypassword = ''.join(random.choice(string.ascii_uppercase + string.digits) for x in range(10))
There is nothing wrong with this (it's the right answer and is very fast), but it has recurring problems:
* It leads developers to use cryptographically weak methods
* It is easy to forget that it does not guarantee a result that includes the different classes of characters
* It does not provide variable-length or minimum-length output
* It's a lot of typing, and the resulting code is vastly different each time, making it hard to understand what features were implemented, especially for those new to the language
* You can extend it to include whatever requirements you want, but that is a constant exercise in wheel reinvention that is extremely verbose, error-prone and confusing, for exactly the same purposes each time
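To make the list above concrete, here is a rough hand-written equivalent of the `[\l\d]{8:15}&[\d]&[\p]` template using only the standard library. It is a sketch, not strgen's implementation: the function name `random_password` and its parameters are made up for illustration, and it assumes `random.SystemRandom` is available on the platform.

```python
import random
import string


def random_password(min_len=8, max_len=15):
    """Hypothetical helper: letters/digits plus one digit and one punctuation mark."""
    rng = random.SystemRandom()  # draws from the OS entropy source
    body_len = rng.randint(min_len, max_len)  # variable length, inclusive bounds
    chars = [rng.choice(string.ascii_letters + string.digits) for _ in range(body_len)]
    chars.append(rng.choice(string.digits))       # guarantees at least one digit
    chars.append(rng.choice(string.punctuation))  # exactly one punctuation character
    rng.shuffle(chars)  # so the guaranteed characters are not always at the end
    return "".join(chars)


print(random_password())
```

Every requirement (length range, guaranteed classes, shuffling) costs another line here, which is the verbosity the post is complaining about.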
This application (generation of random strings for passwords, vouchers, secure ids, test data, etc.) is so general, it seems to beg for a general solution. So, why not have a standard way of expressing these using a simple template language?
strgen:
* Is far less verbose than commonly offered solutions
* Trivial editing of the pattern lets you incorporate additional important features (variable length, minimum length, additional character classes, etc.)
* Uses a pattern language superficially similar to regular expressions, so it's easy to learn
* Uses SystemRandom class (if available, or falls back to Random)
* Supports > 2.6 through 3.3
* Supports unicode
* Uses a parse tree, so you can have complex - nested - expressions to do tricky data generation tasks, especially for test data generation
In my opinion, it would make using Python for this application much easier and more consistent for very common requirements. The template language could easily be a cross-language standard like regex.
You can `pip install strgen`.
It's on Github: https://github.com/paul-wolf/strgen
| From | Chris Angelico <rosuav@gmail.com> |
|---|---|
| Date | 2014-08-08 19:22 +1000 |
| Message-ID | <mailman.12743.1407489761.18130.python-list@python.org> |
| In reply to | #75871 |
On Fri, Aug 8, 2014 at 7:01 PM, Paul Wolf <paulwolf333@gmail.com> wrote:
> This is a proposal with a working implementation for a random string generation template syntax for Python. `strgen` is a module for generating random strings in Python using a regex-like template language.
Looks good! One thing, though:
> * Supports > 2.6 through 3.3
The implication of a simple reading of this statement is that your code should run on 2.6, 2.7, 3.0, 3.1, 3.2, and 3.3, and hasn't been tested on 3.4. But I eyeballed your code, and I'm seeing a lot of u'string' prefixes, which aren't supported on 3.0-3.2 (they were reinstated in 3.3 as per PEP 414), so a more likely version set would be 2.6+, 3.3+. What's the actual version support?
Apologies for making such a minor quibble! But I'm curious as to what you actually support.
ChrisA
| From | Paul Wolf <paulwolf333@gmail.com> |
|---|---|
| Date | 2014-08-08 02:42 -0700 |
| Message-ID | <1fc4393c-7d70-495e-ac10-a51acfb56d99@googlegroups.com> |
| In reply to | #75873 |
On Friday, 8 August 2014 10:22:33 UTC+1, Chris Angelico wrote:
> But I eyeballed your code, and I'm seeing a lot of
> u'string' prefixes, which aren't supported on 3.0-3.2 (they were
> reinstated in 3.3 as per PEP 414), so a more likely version set would
> be 2.6+, 3.3+. What's the actual version support?
> ChrisA
I'm going to have to assume you are right that I only tested on 3.3, skipping > 2.7 and < 3.3. I'll create an issue for that.
| From | Ned Batchelder <ned@nedbatchelder.com> |
|---|---|
| Date | 2014-08-08 07:20 -0400 |
| Message-ID | <mailman.12745.1407496858.18130.python-list@python.org> |
| In reply to | #75875 |
On 8/8/14 5:42 AM, Paul Wolf wrote:
> On Friday, 8 August 2014 10:22:33 UTC+1, Chris Angelico wrote:
>> But I eyeballed your code, and I'm seeing a lot of
>> u'string' prefixes, which aren't supported on 3.0-3.2 (they were
>> reinstated in 3.3 as per PEP 414), so a more likely version set would
>> be 2.6+, 3.3+. What's the actual version support?
>> ChrisA
>
> I'm going to have to assume you are right that I only tested on 3.3, skipping > 2.7 and < 3.3. I'll create an issue for that.
Don't bother trying to support <=3.2. It will be far more difficult than it is worth in terms of adoption of the library.
Also, you don't need to write a "proposal" for your library. You've written the library, and it's on PyPI. You aren't trying to add it to the stdlib, so there's no agreement you need to get from anyone else. It can simply succeed on its merits with people using it.
--
Ned Batchelder, http://nedbatchelder.com
| From | Paul Wolf <paulwolf333@gmail.com> |
|---|---|
| Date | 2014-08-08 06:02 -0700 |
| Message-ID | <5f1d7430-03da-4972-a649-acc7b6cc8fa4@googlegroups.com> |
| In reply to | #75876 |
On Friday, 8 August 2014 12:20:36 UTC+1, Ned Batchelder wrote:
> On 8/8/14 5:42 AM, Paul Wolf wrote:
> Don't bother trying to support <=3.2. It will be far more difficult
> than it is worth in terms of adoption of the library.
> Also, you don't need to write a "proposal" for your library. You've
> written the library, and it's on PyPI. You aren't trying to add it to
Thanks for that. I'll follow that advice.
| From | Chris Angelico <rosuav@gmail.com> |
|---|---|
| Date | 2014-08-08 21:29 +1000 |
| Message-ID | <mailman.12746.1407497357.18130.python-list@python.org> |
| In reply to | #75875 |
On Fri, Aug 8, 2014 at 9:20 PM, Ned Batchelder <ned@nedbatchelder.com> wrote:
> On 8/8/14 5:42 AM, Paul Wolf wrote:
>> On Friday, 8 August 2014 10:22:33 UTC+1, Chris Angelico wrote:
>>> But I eyeballed your code, and I'm seeing a lot of
>>> u'string' prefixes, which aren't supported on 3.0-3.2 (they were
>>> reinstated in 3.3 as per PEP 414), so a more likely version set would
>>> be 2.6+, 3.3+. What's the actual version support?
>>> ChrisA
>>
>> I'm going to have to assume you are right that I only tested on 3.3,
>> skipping > 2.7 and < 3.3. I'll create an issue for that.
>
> Don't bother trying to support <=3.2. It will be far more difficult than it
> is worth in terms of adoption of the library.
Agreed. I would be looking at the solution here being "test on 3.4, then (assuming no problems) declare that it works on 3.3+". Anyone on Debian Wheezy can spin up a Python 3 from source anyway, and presumably ditto for any other Linux distro that's distributing 3.1 or 3.2; most other platforms should have a more modern Python available one way or another.
ChrisA
| From | Paul Wolf <paulwolf333@gmail.com> |
|---|---|
| Date | 2014-08-08 06:03 -0700 |
| Message-ID | <252cd452-24ad-4ee7-b5e6-a992741f9eb5@googlegroups.com> |
| In reply to | #75877 |
On Friday, 8 August 2014 12:29:09 UTC+1, Chris Angelico wrote:
> Debian Wheezy can spin up a Python 3 from source anyway, and
> presumably ditto for any other Linux distro that's distributing 3.1 or
> 3.2; most other platforms should have a more modern Python available
> one way or another.
>
> ChrisA
Yes, agreed. I'll update the version info.
| From | Steven D'Aprano <steve+comp.lang.python@pearwood.info> |
|---|---|
| Date | 2014-08-09 00:08 +1000 |
| Message-ID | <53e4d9f4$0$6574$c3e8da3$5496439d@news.astraweb.com> |
| In reply to | #75871 |
Paul Wolf wrote:
> This is a proposal with a working implementation for a random string
> generation template syntax for Python. `strgen` is a module for generating
> random strings in Python using a regex-like template language. Example:
>
> >>> from strgen import StringGenerator as SG
> >>> SG("[\l\d]{8:15}&[\d]&[\p]").render()
> u'F0vghTjKalf4^mGLk'
Nice! Although very specialised :-)
I second what Ned and Chris have to say.
> If you look at various forums, like Stackoverflow, on how to generate
> random strings with Python, especially for passwords and other hopefully
> secure tokens, you will see dozens of variations of this:
[...]
> There is nothing wrong with this (it's the right answer and is very fast),
> but it leads developers to constantly:
>
> * Use cryptographically weak methods
> * Forget that the above does not guarantee a result that includes the
> different classes of characters
> * Doesn't include variable length or minimum length output
> * It's a lot of typing and the resulting code is vastly different each
> time making it hard to understand what features were
> implemented, especially for those new to the language
> * You can extend the above to include whatever requirements you want,
> but it's a constant exercise in wheel reinvention that is extremely
> verbose, error prone and confusing for exactly the same purposes
> each time
So, there's nothing wrong with it, except for the five things you list which
are wrong with it :-)
Seriously, if you're going to compete with the Stackoverflow ad hoc
solutions, you have to be more assertive that there is a problem with the
ad hoc solutions.
--
Steven
| From | Skip Montanaro <skip@pobox.com> |
|---|---|
| Date | 2014-08-08 09:35 -0500 |
| Message-ID | <mailman.12751.1407508515.18130.python-list@python.org> |
| In reply to | #75871 |
One suggestion, though perhaps nothing actually needs changing.
I occasionally run into sites which define their password constraints as something like "minimum 8 characters, at least one number, one uppercase letter, and one special character." Their notion of "special" (which in my mind means any printable character which isn't a letter, whitespace, or digit) is only a subset. You include a "/" or a ";" and they kick your nice random password back at you, sometimes without telling you what you actually did wrong, only repeating, "minimum 8 characters, at least one number and one special character." You are left to discover through trial-and-error which "special" characters are actually allowed. Once you figure that out, I suppose you could use something like "[.-,()&@]" or whatever is actually allowed, but it would be nice if perhaps there was a way to figure out what some of these sites actually mean by "special" characters and define a \-escape which represents the lowest common denominator set of "special" characters.
Definitely a small point though.
Skip
P.S. Probably a topic for a separate thread, and not actually Python-related, but on a related note, I have never found a free password keeper which works on all my platforms (Mac, Android, Unix). That is one stumbling block (for me) to actually using extremely strong passwords. If you have some thoughts, please contact me off-list.
| From | cwolf.algo@gmail.com |
|---|---|
| Date | 2014-08-08 11:43 -0700 |
| Message-ID | <e3b277a3-a5de-45d2-a13d-6493cc6213a7@googlegroups.com> |
| In reply to | #75890 |
On Friday, August 8, 2014 10:35:12 AM UTC-4, Skip Montanaro wrote:
> One suggestion, though perhaps nothing actually needs changing.
>
> I occasionally run into sites which define their password constraints as something like "minimum 8 characters, at least one number, one uppercase letter, and one special character." Their notion of "special" (which in my mind means any printable character which isn't a letter, whitespace, or digit) is only a subset. You include a "/" or a ";" and they kick your nice random password back at you, sometimes without telling you what you actually did wrong, only repeating, "minimum 8 characters, at least one number and one special character." You are left to discover through trial-and-error which "special" characters are actually allowed. Once you figure that out, I suppose you could use something like "[.-,()&@]" or whatever is actually allowed, but it would be nice if perhaps there was a way to figure out what some of these sites actually mean by "special" characters and define a \-escape which represents the lowest common denominator set of "special" characters.
>
> Definitely a small point though.
>
> Skip
>
> P.S. Probably a topic for a separate thread, and not actually Python-related, but on a related note, I have never found a free password keeper which works on all my platforms (Mac, Android, Unix). That is one stumbling block (for me) to actually using extremely strong passwords. If you have some thoughts, please contact me off-list.
Skip - try "lastpass.com" it's cross platform, include Win, Mac, Linux, Android and iOS.
| From | Nick Cash <nick.cash@npcinternational.com> |
|---|---|
| Date | 2014-08-08 20:28 +0000 |
| Message-ID | <mailman.12766.1407535707.18130.python-list@python.org> |
| In reply to | #75902 |
On 08/08/2014 01:45 PM, cwolf.algo@gmail.com wrote:
> On Friday, August 8, 2014 10:35:12 AM UTC-4, Skip Montanaro wrote:
>> P.S. Probably a topic for a separate thread, and not actually Python-related, but on a related note, I have never found a free password keeper which works on all my platforms (Mac, Android, Unix). That is one stumbling block (for me) to actually using extremely strong passwords. If you have some thoughts, please contact me off-list.
> Skip - try "lastpass.com" it's cross platform, include Win, Mac, Linux, Android and iOS.
LastPass is pretty nice (and I use it on Windows, Mac, Linux and Android myself), but the mobile versions aren't free: https://lastpass.com/misc_download2.php
| From | Ian Kelly <ian.g.kelly@gmail.com> |
|---|---|
| Date | 2014-08-08 16:03 -0600 |
| Message-ID | <mailman.12765.1407535446.18130.python-list@python.org> |
| In reply to | #75871 |
On Fri, Aug 8, 2014 at 3:01 AM, Paul Wolf <paulwolf333@gmail.com> wrote:
> * Uses SystemRandom class (if available, or falls back to Random)
A simple improvement would be to also allow the user to pass in a Random object, in case they have their own source of randomness they want to use, or for fake Randoms used for writing unit tests that invoke strgen.
Have you given any thought to adding a validation mode, where the user provides a template and a string and wants to know if the string matches the template?
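The suggestion to accept a caller-supplied Random object could look something like the sketch below. This is not strgen's API: `render_token` and its `rng` parameter are hypothetical names used only to illustrate the idea of defaulting to `random.SystemRandom` while letting tests inject a seeded `random.Random`.

```python
import random
import string


def render_token(length, alphabet=string.ascii_letters + string.digits, rng=None):
    """Hypothetical generator that accepts any object with a choice() method."""
    if rng is None:
        rng = random.SystemRandom()  # default: OS-provided randomness
    return "".join(rng.choice(alphabet) for _ in range(length))


# In a unit test, a seeded Random makes the output reproducible.
first = render_token(12, rng=random.Random(42))
second = render_token(12, rng=random.Random(42))
print(first == second)
```

Production callers would omit `rng` and get the system source; only tests (or users with their own randomness source) pass one in.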
| From | Paul Wolf <paulwolf333@gmail.com> |
|---|---|
| Date | 2014-08-08 23:52 -0700 |
| Message-ID | <58187503-1651-4eca-a131-49f474148f62@googlegroups.com> |
| In reply to | #75909 |
On Friday, 8 August 2014 23:03:18 UTC+1, Ian wrote:
> On Fri, Aug 8, 2014 at 3:01 AM, Paul Wolf <paulwolf333@gmail.com> wrote:
> > * Uses SystemRandom class (if available, or falls back to Random)
> A simple improvement would be to also allow the user to pass in a
> Random object
That is not a bad idea. I'll create an issue for it. It is a design goal to use the standard library within the implementation so users have a guarantee about exactly how the data is generated. But your suggestion is not inconsistent with that.
> Have you given any thought to adding a validation mode, where the user
> provides a template and a string and wants to know if the string
> matches the template?
Isn't that what regular expressions are? Or do you have a clarifying use case?
strgen is provided as the converse of regular expressions.
| From | Ian Kelly <ian.g.kelly@gmail.com> |
|---|---|
| Date | 2014-08-09 01:49 -0600 |
| Message-ID | <mailman.12786.1407570638.18130.python-list@python.org> |
| In reply to | #75933 |
On Sat, Aug 9, 2014 at 12:52 AM, Paul Wolf <paulwolf333@gmail.com> wrote:
> On Friday, 8 August 2014 23:03:18 UTC+1, Ian wrote:
>> Have you given any thought to adding a validation mode, where the user
>> provides a template and a string and wants to know if the string
>> matches the template?
>
> Isn't that what regular expressions are? Or do you have a clarifying use case?
>
> strgen is provided as the converse of regular expressions.
The syntax is not equivalent though. You can't take a strgen template, pass it into the re module, and just expect it to work.
Also, I'm not sure how best to go about writing a regular expression for, e.g. "12 or more letters, digits, and punctuation, including at least one each of uppercase letter, lowercase letter, digit, and punctuation". I'm fairly certain that language is regular, but actually matching it with a regular expression would be a nightmare.
| From | Ian Kelly <ian.g.kelly@gmail.com> |
|---|---|
| Date | 2014-08-09 01:57 -0600 |
| Message-ID | <mailman.12787.1407571088.18130.python-list@python.org> |
| In reply to | #75933 |
On Sat, Aug 9, 2014 at 1:49 AM, Ian Kelly <ian.g.kelly@gmail.com> wrote:
> On Sat, Aug 9, 2014 at 12:52 AM, Paul Wolf <paulwolf333@gmail.com> wrote:
>> On Friday, 8 August 2014 23:03:18 UTC+1, Ian wrote:
>>> Have you given any thought to adding a validation mode, where the user
>>> provides a template and a string and wants to know if the string
>>> matches the template?
>>
>> Isn't that what regular expressions are? Or do you have a clarifying use case?
>>
>> strgen is provided as the converse of regular expressions.
>
> The syntax is not equivalent though. You can't take a strgen template,
> pass it into the re module, and just expect it to work.
>
> Also, I'm not sure how best to go about writing a regular expression
> for, e.g. "12 or more letters, digits, and punctuation, including at
> least one each of uppercase letter, lowercase letter, digit, and
> punctuation". I'm fairly certain that language is regular, but
> actually matching it with a regular expression would be a nightmare.
To clarify further, validating that *without* using a regular expression is not too terribly difficult, but the value that I see in validating it with a strgen is that one could then be sure that one's string generation and validation were equivalent. In contrast, if you have a strgen for generation and a series of string manipulations for validation, then it's hard to be certain there aren't any differences.
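For reference, a validator written as plain string checks, for the rule quoted above ("12 or more letters, digits, and punctuation, including at least one each of uppercase letter, lowercase letter, digit, and punctuation"), might look like this. The function name `is_valid_password` is hypothetical, and "letters" and "punctuation" are assumed to mean the ASCII sets in the `string` module.

```python
import string

# Assumption: the allowed alphabet is ASCII letters, digits and punctuation.
ALLOWED = set(string.ascii_letters + string.digits + string.punctuation)
REQUIRED_CLASSES = (
    string.ascii_uppercase,
    string.ascii_lowercase,
    string.digits,
    string.punctuation,
)


def is_valid_password(candidate, min_len=12):
    """Return True if candidate satisfies the length, alphabet and class rules."""
    if len(candidate) < min_len:
        return False
    if not set(candidate) <= ALLOWED:  # reject anything outside the alphabet
        return False
    # every required class must appear at least once
    return all(any(c in cls for c in candidate) for cls in REQUIRED_CLASSES)


print(is_valid_password("Abcdefghij1!"))
```

This is the "series of string manipulations" kind of validator: easy to write, but nothing ties it to the template used for generation, which is the gap a validation mode would close.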
| From | Devin Jeanpierre <jeanpierreda@gmail.com> |
|---|---|
| Date | 2014-08-10 05:43 -0700 |
| Message-ID | <mailman.12817.1407674628.18130.python-list@python.org> |
| In reply to | #75871 |
On Fri, Aug 8, 2014 at 2:01 AM, Paul Wolf <paulwolf333@gmail.com> wrote:
> This is a proposal with a working implementation for a random string generation template syntax for Python. `strgen` is a module for generating random strings in Python using a regex-like template language. Example:
>
> >>> from strgen import StringGenerator as SG
> >>> SG("[\l\d]{8:15}&[\d]&[\p]").render()
> u'F0vghTjKalf4^mGLk'
Why aren't you using regular expressions? I am all for conciseness,
but using an existing format is so helpful...
Unfortunately, the equivalent regexp probably looks like
r'(?=.*[0-9])(?=.*[A-Z])(?=.*[a-z])[a-zA-Z0-9]{8:15}'
(I've been working on this kind of thing with regexps, but it's still
incomplete.)
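As a point of comparison, a runnable variant of the lookahead pattern quoted above is sketched here for matching (not generating). One adjustment is needed: Python's `re` module writes bounded repetition with a comma, `{8,15}`, and a `{8:15}` written with a colon is matched as literal text instead. The helper name `matches` is made up for this example.

```python
import re

# Lookaheads require a digit, an uppercase and a lowercase letter somewhere;
# the final character class restricts the alphabet and the length to 8..15.
PATTERN = re.compile(r'(?=.*[0-9])(?=.*[A-Z])(?=.*[a-z])[a-zA-Z0-9]{8,15}')


def matches(candidate):
    """Return True if the whole candidate satisfies the pattern."""
    return PATTERN.fullmatch(candidate) is not None


print(matches("Abcdefg1"))   # has all three classes, length 8
print(matches("abcdefg1"))   # no uppercase letter
```

Note that this pattern only covers letters and digits; it does not express the "exactly one punctuation character" part of the strgen template.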
> * Uses SystemRandom class (if available, or falls back to Random)
This sounds cryptographically weak. Isn't the normal thing to do to
use a cryptographic hash function to generate a pseudorandom sequence?
Someone should write a cryptographically secure pseudorandom number
generator library for Python. :(
(I think OpenSSL comes with one, but then you can't choose the seed.)
-- Devin
| From | Steven D'Aprano <steve+comp.lang.python@pearwood.info> |
|---|---|
| Date | 2014-08-11 02:31 +1000 |
| Message-ID | <53e79e46$0$29967$c3e8da3$5496439d@news.astraweb.com> |
| In reply to | #75981 |
Devin Jeanpierre wrote:
> On Fri, Aug 8, 2014 at 2:01 AM, Paul Wolf <paulwolf333@gmail.com> wrote:
>> This is a proposal with a working implementation for a random string
>> generation template syntax for Python. `strgen` is a module for
>> generating random strings in Python using a regex-like template language.
>> Example:
>>
>> >>> from strgen import StringGenerator as SG
>> >>> SG("[\l\d]{8:15}&[\d]&[\p]").render()
>> u'F0vghTjKalf4^mGLk'
>
> Why aren't you using regular expressions? I am all for conciseness,
> but using an existing format is so helpful...
You've just answered your own question:
> Unfortunately, the equivalent regexp probably looks like
> r'(?=.*[0-9])(?=.*[A-Z])(?=.*[a-z])[a-zA-Z0-9]{8:15}'
Apart from being needlessly verbose, regex syntax is not appropriate because
it specifies too much, specifies too little, and specifies the wrong
things. It specifies too much: regexes like ^ and $ are meaningless in this
case. It specifies too little: there's no regex for the "shuffle operator".
And it specifies the wrong things: regexes like (?= ...) as used in your
example are for matching, not generating strings, and it isn't clear
what "match any character but don't consume any of the string" means when
generating strings.
Personally, I think even the OP's specified language is too complex. For
example, it supports literal text, but given the use-case (password
generators) do we really want to support templates like "password[\d]"? I
don't think so, and if somebody did, they can trivially say "password" +
SG('[\d]').render().
Larry Wall (the creator of Perl) has stated that one of the mistakes with
Perl's regular expression mini-language is that the Huffman coding is
wrong. Common things should be short, uncommon things can afford to be
longer. Since the most common thing for password generation is to specify
character classes, they should be short, e.g. d rather than [\d] (one
character versus four).
The template given could potentially be simplified to:
"(LD){8:15}&D&P"
where the round brackets () are purely used for grouping. Character codes
are specified by a single letter. (I use uppercase to avoid the problem
that l & 1 look very similar. YMMV.) The model here is custom format codes
from spreadsheets, which should be comfortable to anyone who is familiar
with Excel or OpenOffice. If you insist on having the facility to include
literal text in your templates, might I suggest:
"'password'd" # Literal string "password", followed by a single digit.
but personally I believe that for the use-case given, that's a mistake.
Alternatively, date/time templates use two-character codes like %Y %m etc,
which is better than
> (I've been working on this kind of thing with regexps, but it's still
> incomplete.)
>
>> * Uses SystemRandom class (if available, or falls back to Random)
>
> This sounds cryptographically weak. Isn't the normal thing to do to
> use a cryptographic hash function to generate a pseudorandom sequence?
I don't think that using a good, but not cryptographically-strong, random
number generator to generate passwords is a serious vulnerability. What's
your threat model? Attacks on passwords tend to be one of a very few:
- dictionary attacks (including tables of common passwords and
simple transformations of words, e.g. 'pas5w0d');
- brute force against short and weak passwords;
- attacking the hash function used to store passwords (not the password
itself), e.g. rainbow tables;
- keyloggers or some other way of stealing the password (including
phishing sites and the ever-popular "beat them with a lead pipe
until they give up the password");
- other social attacks, e.g. guessing that the person's password is their
date of birth in reverse.
But unless the random number generator is *ridiculously* weak ("9, 9, 9, 9,
9, 9, ...") I can't see any way to realistically attack the password
generator based on the weakness of the random number generator. Perhaps I'm
missing something?
> Someone should write a cryptographically secure pseudorandom number
> generator library for Python. :(
Here, let me google that for you :-)
https://duckduckgo.com/html/?q=python+crypto
--
Steven
| From | Devin Jeanpierre <jeanpierreda@gmail.com> |
|---|---|
| Date | 2014-08-10 11:28 -0700 |
| Message-ID | <mailman.12823.1407696946.18130.python-list@python.org> |
| In reply to | #75984 |
On Sun, Aug 10, 2014 at 9:31 AM, Steven D'Aprano <steve+comp.lang.python@pearwood.info> wrote:
>> (I've been working on this kind of thing with regexps, but it's still
>> incomplete.)
>>
>>> * Uses SystemRandom class (if available, or falls back to Random)
>>
>> This sounds cryptographically weak. Isn't the normal thing to do to
>> use a cryptographic hash function to generate a pseudorandom sequence?
>
> I don't think that using a good, but not cryptographically-strong, random
> number generator to generate passwords is a serious vulnerability. What's
> your threat model?
I've always wanted a password generator that worked on the fly based off of a master password. If the passwords are generated randomly but not cryptographically securely so, then given sufficiently many passwords, the master password might be deduced. CSPRNGs guarantee otherwise.
>> Someone should write a cryptographically secure pseudorandom number
>> generator library for Python. :(
>
> Here, let me google that for you
I should clarify that OpenSSL has one (which is what I assume you're alluding to), but it doesn't let you choose the seed, so it's useless for deterministic password generation. There are also lots of small libraries some person wrote at some time, but that sounds shady. ;)
-- Devin
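The deterministic, master-password-based generator described above can be approximated with standard-library primitives. The sketch below is an illustration under stated assumptions, not a vetted design: `derive_password`, the use of the site name as the PBKDF2 salt, the iteration count and the alphabet are all choices made up for this example.

```python
import hashlib
import hmac
import string

ALPHABET = string.ascii_letters + string.digits  # assumed output alphabet


def derive_password(master, site, length=16):
    """Hypothetical helper: same (master, site) always yields the same password."""
    # Stretch the master password; the site name acts as the salt (an assumption).
    key = hashlib.pbkdf2_hmac(
        "sha256", master.encode("utf-8"), site.encode("utf-8"), 100_000
    )
    # Discard bytes at or above this limit so byte % len(ALPHABET) is unbiased.
    limit = 256 - (256 % len(ALPHABET))
    out = []
    counter = 0
    while len(out) < length:
        # HMAC over a running counter gives a deterministic stream of bytes.
        block = hmac.new(key, counter.to_bytes(4, "big"), hashlib.sha256).digest()
        for byte in block:
            if byte < limit and len(out) < length:
                out.append(ALPHABET[byte % len(ALPHABET)])
        counter += 1
    return "".join(out)


print(derive_password("correct horse", "example.org"))
```

Because the output stream comes from a keyed hash instead of a general-purpose PRNG, seeing many derived passwords is not expected to help recover the key, which is the property asked for in the post.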
| From | Steven D'Aprano <steve+comp.lang.python@pearwood.info> |
|---|---|
| Date | 2014-08-11 12:22 +1000 |
| Message-ID | <53e828e9$0$29966$c3e8da3$5496439d@news.astraweb.com> |
| In reply to | #75998 |
Devin Jeanpierre wrote:
> On Sun, Aug 10, 2014 at 9:31 AM, Steven D'Aprano
> <steve+comp.lang.python@pearwood.info> wrote:
>> I don't think that using a good, but not cryptographically-strong, random
>> number generator to generate passwords is a serious vulnerability. What's
>> your threat model?
>
> I've always wanted a password generator that worked on the fly based
> off of a master password. If the passwords are generated randomly but
> not cryptographically securely so, then given sufficiently many
> passwords, the master password might be deduced.
o_O
So, what you're saying is that you're concerned that if an attacker has all your passwords, they might be able to generate new passwords?
[...]
>>> Someone should write a cryptographically secure pseudorandom number
>>> generator library for Python. :(
>>
>> Here, let me google that for you
>
> I should clarify that OpenSSL has one (which is what I assume you're
> alluding to),
No. If you follow the link I provided, I'm sure you will find what you are after.
> but it doesn't let you choose the seed, so it's useless
> for deterministic password generation. There are also lots of small
> libraries some person wrote at some time, but that sounds shady. ;)
You mean the opposite to OpenSSL, which was handed down to Mankind from the Gods?
The size of the library doesn't matter, what matters is how well it implements what crypto standards.
--
Steven
| From | Chris Angelico <rosuav@gmail.com> |
|---|---|
| Date | 2014-08-11 12:31 +1000 |
| Message-ID | <mailman.12833.1407724268.18130.python-list@python.org> |
| In reply to | #76014 |
On Mon, Aug 11, 2014 at 12:22 PM, Steven D'Aprano <steve+comp.lang.python@pearwood.info> wrote:
> You mean the opposite to OpenSSL, which was handed down to Mankind from
> the Gods?
I thought Prometheus stole OpenSSL and gave it to mankind so a group of Minotaurs would stop teasing him about Heartbleed.
ChrisA