

Groups > comp.lang.python > #75871 > unrolled thread

Template language for random string generation

Started by: Paul Wolf <paulwolf333@gmail.com>
First post: 2014-08-08 02:01 -0700
Last post: 2014-08-10 10:38 -0600
Articles: 20 on this page of 29; 10 participants



Contents

  Template language for random string generation Paul Wolf <paulwolf333@gmail.com> - 2014-08-08 02:01 -0700
    Re: Template language for random string generation Chris Angelico <rosuav@gmail.com> - 2014-08-08 19:22 +1000
      Re: Template language for random string generation Paul Wolf <paulwolf333@gmail.com> - 2014-08-08 02:42 -0700
        Re: Template language for random string generation Ned Batchelder <ned@nedbatchelder.com> - 2014-08-08 07:20 -0400
          Re: Template language for random string generation Paul Wolf <paulwolf333@gmail.com> - 2014-08-08 06:02 -0700
        Re: Template language for random string generation Chris Angelico <rosuav@gmail.com> - 2014-08-08 21:29 +1000
          Re: Template language for random string generation Paul Wolf <paulwolf333@gmail.com> - 2014-08-08 06:03 -0700
    Re: Template language for random string generation Steven D'Aprano <steve+comp.lang.python@pearwood.info> - 2014-08-09 00:08 +1000
    Re: Template language for random string generation Skip Montanaro <skip@pobox.com> - 2014-08-08 09:35 -0500
      Re: Template language for random string generation cwolf.algo@gmail.com - 2014-08-08 11:43 -0700
        Re: Template language for random string generation Nick Cash <nick.cash@npcinternational.com> - 2014-08-08 20:28 +0000
    Re: Template language for random string generation Ian Kelly <ian.g.kelly@gmail.com> - 2014-08-08 16:03 -0600
      Re: Template language for random string generation Paul Wolf <paulwolf333@gmail.com> - 2014-08-08 23:52 -0700
        Re: Template language for random string generation Ian Kelly <ian.g.kelly@gmail.com> - 2014-08-09 01:49 -0600
        Re: Template language for random string generation Ian Kelly <ian.g.kelly@gmail.com> - 2014-08-09 01:57 -0600
    Re: Template language for random string generation Devin Jeanpierre <jeanpierreda@gmail.com> - 2014-08-10 05:43 -0700
      Re: Template language for random string generation Steven D'Aprano <steve+comp.lang.python@pearwood.info> - 2014-08-11 02:31 +1000
        Re: Template language for random string generation Devin Jeanpierre <jeanpierreda@gmail.com> - 2014-08-10 11:28 -0700
          Re: Template language for random string generation Steven D'Aprano <steve+comp.lang.python@pearwood.info> - 2014-08-11 12:22 +1000
            Re: Template language for random string generation Chris Angelico <rosuav@gmail.com> - 2014-08-11 12:31 +1000
            Re: Template language for random string generation Devin Jeanpierre <jeanpierreda@gmail.com> - 2014-08-11 00:01 -0700
        Re: Template language for random string generation Chris Angelico <rosuav@gmail.com> - 2014-08-11 05:25 +1000
        Re: Template language for random string generation Paul Wolf <paulwolf333@gmail.com> - 2014-08-10 22:06 -0700
          Re: Template language for random string generation Mark Lawrence <breamoreboy@yahoo.co.uk> - 2014-08-11 08:58 +0100
      Re: Template language for random string generation Paul Wolf <paulwolf333@gmail.com> - 2014-08-10 09:34 -0700
        Re: Template language for random string generation Ian Kelly <ian.g.kelly@gmail.com> - 2014-08-10 10:47 -0600
          Re: Template language for random string generation Paul Wolf <paulwolf333@gmail.com> - 2014-08-10 21:56 -0700
        Re: Template language for random string generation Devin Jeanpierre <jeanpierreda@gmail.com> - 2014-08-10 11:48 -0700
    Re: Template language for random string generation Ian Kelly <ian.g.kelly@gmail.com> - 2014-08-10 10:38 -0600



#75871 — Template language for random string generation

From: Paul Wolf <paulwolf333@gmail.com>
Date: 2014-08-08 02:01 -0700
Subject: Template language for random string generation
Message-ID: <14d94692-2257-4dfb-a82f-f1674a839233@googlegroups.com>
This is a proposal with a working implementation for a random string generation template syntax for Python. `strgen` is a module for generating random strings in Python using a regex-like template language. Example: 

    >>> from strgen import StringGenerator as SG
    >>> SG("[\l\d]{8:15}&[\d]&[\p]").render()
    u'F0vghTjKalf4^mGLk'

The template ([\l\d]{8:15}&[\d]&[\p]) generates 8 to 15 letters and digits, then shuffles in one more digit and one punctuation character, so the result is 10 to 17 characters long (the sample above has 17). It is guaranteed to have at least one digit (maybe more) and exactly one punctuation character.

If you look at various forums, like Stack Overflow, on how to generate random strings with Python, especially for passwords and other hopefully secure tokens, you will see dozens of variations of this:

   >>> import random
   >>> import string
   >>> mypassword = ''.join(random.choice(string.ascii_uppercase + string.digits) for x in range(10))

There is nothing wrong with this (it's the right answer and is very fast), but it leads developers to constantly:

* Use cryptographically weak methods
* Forget that the above does not guarantee a result that includes the different classes of characters
* Leave out variable-length or minimum-length output
* Type a lot, producing code that is vastly different each time, which makes it hard to understand what features were implemented, especially for those new to the language
* Reinvent the wheel: you can extend the above to include whatever requirements you want, but the result is extremely verbose, error-prone and confusing, for exactly the same purposes each time
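
For comparison, here is a minimal sketch of what a hand-written equivalent of the template above might look like. It assumes that `&` means "concatenate and shuffle"; `make_password` is a hypothetical helper written for illustration, not part of strgen.

```python
import random
import string

# SystemRandom draws from os.urandom rather than the default Mersenne Twister.
_rng = random.SystemRandom()


def make_password(min_len=8, max_len=15):
    """Return min_len..max_len letters/digits plus one digit and one
    punctuation character, all shuffled together."""
    body_len = _rng.randint(min_len, max_len)  # both bounds inclusive
    alphabet = string.ascii_letters + string.digits
    chars = [_rng.choice(alphabet) for _ in range(body_len)]
    chars.append(_rng.choice(string.digits))       # guaranteed digit
    chars.append(_rng.choice(string.punctuation))  # the only punctuation
    _rng.shuffle(chars)
    return "".join(chars)


password = make_password()
```

Even this short version has to get the inclusive bounds, the guaranteed classes and the shuffle right by hand, which is the kind of repetition the bullet points describe.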

This application (generation of random strings for passwords, vouchers, secure ids, test data, etc.) is so general, it seems to beg for a general solution. So, why not have a standard way of expressing these using a simple template language? 

strgen: 

* Is far less verbose than commonly offered solutions
* Lets you incorporate additional important features (variable length, minimum length, additional character classes, etc.) with trivial edits to the pattern
* Uses a pattern language superficially similar to regular expressions, so it's easy to learn
* Uses SystemRandom class (if available, or falls back to Random)
* Supports > 2.6 through 3.3
* Supports unicode
* Uses a parse tree, so you can have complex - nested - expressions to do tricky data generation tasks, especially for test data generation
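
The SystemRandom fallback mentioned in the list could be sketched roughly as follows. This is a guess at the general pattern, not strgen's actual code, and `get_rng` is a hypothetical name.

```python
import random


def get_rng():
    """Return a SystemRandom if the OS provides an entropy source,
    otherwise fall back to the default (non-cryptographic) Random."""
    try:
        rng = random.SystemRandom()
        rng.random()  # os.urandom raises NotImplementedError if unavailable
        return rng
    except NotImplementedError:
        return random.Random()


rng = get_rng()
```

A silent fallback like this trades security for availability, so a caller who needs strong randomness may prefer to let the exception propagate instead.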

In my opinion, it would make using Python for this application much easier and more consistent for very common requirements. The template language could easily be a cross-language standard like regex.  

You can `pip install strgen`. 

It's on Github: https://github.com/paul-wolf/strgen



#75873

From: Chris Angelico <rosuav@gmail.com>
Date: 2014-08-08 19:22 +1000
Message-ID: <mailman.12743.1407489761.18130.python-list@python.org>
In reply to: #75871
On Fri, Aug 8, 2014 at 7:01 PM, Paul Wolf <paulwolf333@gmail.com> wrote:
> This is a proposal with a working implementation for a random string generation template syntax for Python. `strgen` is a module for generating random strings in Python using a regex-like template language.

Looks good! One thing, though:

> * Supports > 2.6 through 3.3

The implication of a simple reading of this statement is that your
code should run on 2.6, 2.7, 3.0, 3.1, 3.2, and 3.3, and hasn't been
tested on 3.4. But I eyeballed your code, and I'm seeing a lot of
u'string' prefixes, which aren't supported on 3.0-3.2 (they were
reinstated in 3.3 as per PEP 414), so a more likely version set would
be 2.6+, 3.3+. What's the actual version support?

Apologies for making such a minor quibble! But I'm curious as to what
you actually support.

ChrisA



#75875

From: Paul Wolf <paulwolf333@gmail.com>
Date: 2014-08-08 02:42 -0700
Message-ID: <1fc4393c-7d70-495e-ac10-a51acfb56d99@googlegroups.com>
In reply to: #75873
On Friday, 8 August 2014 10:22:33 UTC+1, Chris Angelico  wrote:
> But I eyeballed your code, and I'm seeing a lot of
> u'string' prefixes, which aren't supported on 3.0-3.2 (they were
> reinstated in 3.3 as per PEP 414), so a more likely version set would
> be 2.6+, 3.3+. What's the actual version support?
> ChrisA

I'm going to have to assume you are right that I only tested on 3.3, skipping > 2.7 and < 3.3. I'll create an issue for that. 



#75876

From: Ned Batchelder <ned@nedbatchelder.com>
Date: 2014-08-08 07:20 -0400
Message-ID: <mailman.12745.1407496858.18130.python-list@python.org>
In reply to: #75875
On 8/8/14 5:42 AM, Paul Wolf wrote:
> On Friday, 8 August 2014 10:22:33 UTC+1, Chris Angelico  wrote:
>> But I eyeballed your code, and I'm seeing a lot of
>> u'string' prefixes, which aren't supported on 3.0-3.2 (they were
>> reinstated in 3.3 as per PEP 414), so a more likely version set would
>>
>> be 2.6+, 3.3+. What's the actual version support?
>> ChrisA
>
> I'm going to have to assume you are right that I only tested on 3.3, skipping > 2.7 and < 3.3. I'll create an issue for that.
>

Don't bother trying to support <=3.2.  It will be far more difficult 
than it is worth in terms of adoption of the library.

Also, you don't need to write a "proposal" for your library. You've 
written the library, and it's on PyPI.  You aren't trying to add it to 
the stdlib, so there's no agreement you need to get from anyone else. 
It can simply succeed on its merits with people using it.

-- 
Ned Batchelder, http://nedbatchelder.com



#75879

From: Paul Wolf <paulwolf333@gmail.com>
Date: 2014-08-08 06:02 -0700
Message-ID: <5f1d7430-03da-4972-a649-acc7b6cc8fa4@googlegroups.com>
In reply to: #75876
On Friday, 8 August 2014 12:20:36 UTC+1, Ned Batchelder  wrote:
> On 8/8/14 5:42 AM, Paul Wolf wrote:
>
> Don't bother trying to support <=3.2.  It will be far more difficult
> than it is worth in terms of adoption of the library.
>
> Also, you don't need to write a "proposal" for your library. You've
> written the library, and it's on PyPI.  You aren't trying to add it to

Thanks for that. I'll follow that advice. 



#75877

From: Chris Angelico <rosuav@gmail.com>
Date: 2014-08-08 21:29 +1000
Message-ID: <mailman.12746.1407497357.18130.python-list@python.org>
In reply to: #75875
On Fri, Aug 8, 2014 at 9:20 PM, Ned Batchelder <ned@nedbatchelder.com> wrote:
> On 8/8/14 5:42 AM, Paul Wolf wrote:
>>
>> On Friday, 8 August 2014 10:22:33 UTC+1, Chris Angelico  wrote:
>>>
>>> But I eyeballed your code, and I'm seeing a lot of
>>> u'string' prefixes, which aren't supported on 3.0-3.2 (they were
>>> reinstated in 3.3 as per PEP 414), so a more likely version set would
>>>
>>> be 2.6+, 3.3+. What's the actual version support?
>>> ChrisA
>>
>>
>> I'm going to have to assume you are right that I only tested on 3.3,
>> skipping > 2.7 and < 3.3. I'll create an issue for that.
>>
>
> Don't bother trying to support <=3.2.  It will be far more difficult than it
> is worth in terms of adoption of the library.

Agreed. I would be looking at the solution here being "test on 3.4,
then (assuming no problems) declare that it works on 3.3+". Anyone on
Debian Wheezy can spin up a Python 3 from source anyway, and
presumably ditto for any other Linux distro that's distributing 3.1 or
3.2; most other platforms should have a more modern Python available
one way or another.

ChrisA



#75880

From: Paul Wolf <paulwolf333@gmail.com>
Date: 2014-08-08 06:03 -0700
Message-ID: <252cd452-24ad-4ee7-b5e6-a992741f9eb5@googlegroups.com>
In reply to: #75877
On Friday, 8 August 2014 12:29:09 UTC+1, Chris Angelico  wrote:
> Debian Wheezy can spin up a Python 3 from source anyway, and
> presumably ditto for any other Linux distro that's distributing 3.1 or
> 3.2; most other platforms should have a more modern Python available
> one way or another.
>
> ChrisA

Yes, agreed. I'll update the version info. 



#75886

From: Steven D'Aprano <steve+comp.lang.python@pearwood.info>
Date: 2014-08-09 00:08 +1000
Message-ID: <53e4d9f4$0$6574$c3e8da3$5496439d@news.astraweb.com>
In reply to: #75871
Paul Wolf wrote:

> This is a proposal with a working implementation for a random string
> generation template syntax for Python. `strgen` is a module for generating
> random strings in Python using a regex-like template language. Example:
> 
>     >>> from strgen import StringGenerator as SG
>     >>> SG("[\l\d]{8:15}&[\d]&[\p]").render()
>     u'F0vghTjKalf4^mGLk'

Nice! Although very specialised :-)

I second what Ned and Chris have to say.

> If you look at various forums, like Stackoverflow, on how to generate
> random strings with Python, especially for passwords and other hopefully
> secure tokens, you will see dozens of variations of this:
[...]
> There is nothing wrong with this (it's the right answer and is very fast),
> but it leads developers to constantly:
> 
> * Use cryptographically weak methods
> * Forget that the above does not guarantee a result that includes the
>   different classes of characters 
> * Doesn't include variable length or minimum length output
> * It's a lot of typing and the resulting code is vastly different each
>   time making it hard to understand what features were 
>   implemented, especially for those new to the language 
> * You can extend the above to include whatever requirements you want,
>   but it's a constant exercise in wheel reinvention that is extremely
>   verbose, error prone and confusing for exactly the same purposes
>   each time 

So, there's nothing wrong with it, except for the five things you list which
are wrong with it :-)

Seriously, if you're going to compete with the Stackoverflow ad hoc
solutions, you have to be more assertive that there is a problem with the
ad hoc solutions.


-- 
Steven



#75890

From: Skip Montanaro <skip@pobox.com>
Date: 2014-08-08 09:35 -0500
Message-ID: <mailman.12751.1407508515.18130.python-list@python.org>
In reply to: #75871


One suggestion, though perhaps nothing actually needs changing.

I occasionally run into sites which define their password constraints as
something like "minimum 8 characters, at least one number, one uppercase
letter, and one special character." Their notion of "special" (which in my
mind means any printable character which isn't a letter, whitespace, or
digit) is only a subset.  You include a "/" or a ";" and they kick your
nice random password back at you, sometimes without telling you what you
actually did wrong, only repeating, "minimum 8 characters, at least one
number and one special character." You are left to discover through
trial-and-error which "special" characters are actually allowed. Once you
figure that out, I suppose you could use something like "[.-,()&@]" or
whatever is actually allowed, but it would be nice if perhaps there was a
way to figure out what some of these sites actually mean by "special"
characters and define a \-escape which represents the lowest common
denominator set of "special" characters.
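
A site-specific "special" set like the one above can also be handled by hand. The following is only a sketch with the standard library: `site_password` and its parameters are hypothetical, and the default special set is simply the example quoted above, not a researched lowest common denominator.

```python
import random
import string

_rng = random.SystemRandom()


def site_password(length=12, specials=".-,()&@"):
    """Random password with at least one uppercase letter, one digit and
    one character from a site-specific 'special' set."""
    if length < 8:
        raise ValueError("sites like this usually want 8 or more characters")
    # One guaranteed character from each required class.
    required = [
        _rng.choice(string.ascii_uppercase),
        _rng.choice(string.digits),
        _rng.choice(specials),
    ]
    pool = string.ascii_letters + string.digits + specials
    rest = [_rng.choice(pool) for _ in range(length - len(required))]
    chars = required + rest
    _rng.shuffle(chars)  # so the required characters are not always first
    return "".join(chars)


candidate = site_password()
```

Passing a different `specials` string is then the only change needed once trial and error reveals what a given site accepts.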

Definitely a small point though.

Skip

P.S. Probably a topic for a separate thread, and not actually
Python-related, but on a related note, I have never found a free password
keeper which works on all my platforms (Mac, Android, Unix). That is one
stumbling block (for me) to actually using extremely strong passwords. If
you have some thoughts, please contact me off-list.



#75902

From: cwolf.algo@gmail.com
Date: 2014-08-08 11:43 -0700
Message-ID: <e3b277a3-a5de-45d2-a13d-6493cc6213a7@googlegroups.com>
In reply to: #75890
On Friday, August 8, 2014 10:35:12 AM UTC-4, Skip Montanaro wrote:
> One suggestion, though perhaps nothing actually needs changing.
>
> I occasionally run into sites which define their password constraints as something like "minimum 8 characters, at least one number, one uppercase letter, and one special character." Their notion of "special" (which in my mind means any printable character which isn't a letter, whitespace, or digit) is only a subset.  You include a "/" or a ";" and they kick your nice random password back at you, sometimes without telling you what you actually did wrong, only repeating, "minimum 8 characters, at least one number and one special character." You are left to discover through trial-and-error which "special" characters are actually allowed. Once you figure that out, I suppose you could use something like "[.-,()&@]" or whatever is actually allowed, but it would be nice if perhaps there was a way to figure out what some of these sites actually mean by "special" characters and define a \-escape which represents the lowest common denominator set of "special" characters.
>
> Definitely a small point though.
>
> Skip
>
> P.S. Probably a topic for a separate thread, and not actually Python-related, but on a related note, I have never found a free password keeper which works on all my platforms (Mac, Android, Unix). That is one stumbling block (for me) to actually using extremely strong passwords. If you have some thoughts, please contact me off-list.

Skip, try "lastpass.com". It's cross-platform, including Win, Mac, Linux, Android and iOS.



#75910

From: Nick Cash <nick.cash@npcinternational.com>
Date: 2014-08-08 20:28 +0000
Message-ID: <mailman.12766.1407535707.18130.python-list@python.org>
In reply to: #75902
On 08/08/2014 01:45 PM, cwolf.algo@gmail.com wrote:
> On Friday, August 8, 2014 10:35:12 AM UTC-4, Skip Montanaro wrote:
>> P.S. Probably a topic for a separate thread, and not actually Python-related, but on a related note, I have never found a free password keeper which works on all my platforms (Mac, Android, Unix). That is one stumbling block (for me) to actually using extremely strong passwords. If you have some thoughts, please contact me off-list.
> Skip - try "lastpass.com" it's cross platform, include Win, Mac, Linux, Android and iOS.

LastPass is pretty nice (and I use it on Windows, Mac, Linux and Android
myself), but the mobile versions aren't free:
https://lastpass.com/misc_download2.php



#75909

From: Ian Kelly <ian.g.kelly@gmail.com>
Date: 2014-08-08 16:03 -0600
Message-ID: <mailman.12765.1407535446.18130.python-list@python.org>
In reply to: #75871
On Fri, Aug 8, 2014 at 3:01 AM, Paul Wolf <paulwolf333@gmail.com> wrote:
> * Uses SystemRandom class (if available, or falls back to Random)

A simple improvement would be to also allow the user to pass in a
Random object, in case they have their own source of randomness they
want to use, or for fake Randoms used for writing unit tests that
invoke strgen.
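
As a rough illustration of the idea (not strgen's API), a generator function could take an optional `rng` argument and default to SystemRandom. `random_token` is a hypothetical function used only to show the pattern.

```python
import random
import string


def random_token(length=10, rng=None):
    """Build a token from uppercase letters and digits.

    rng may be any object with a choice() method; it defaults to
    SystemRandom so that production callers get OS-level randomness.
    """
    if rng is None:
        rng = random.SystemRandom()
    alphabet = string.ascii_uppercase + string.digits
    return "".join(rng.choice(alphabet) for _ in range(length))


# In a unit test, a seeded Random makes the output reproducible.
first = random_token(rng=random.Random(42))
second = random_token(rng=random.Random(42))
assert first == second
```

The same hook would let someone plug in their own source of randomness without touching the generation logic.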

Have you given any thought to adding a validation mode, where the user
provides a template and a string and wants to know if the string
matches the template?



#75933

From: Paul Wolf <paulwolf333@gmail.com>
Date: 2014-08-08 23:52 -0700
Message-ID: <58187503-1651-4eca-a131-49f474148f62@googlegroups.com>
In reply to: #75909
On Friday, 8 August 2014 23:03:18 UTC+1, Ian  wrote:
> On Fri, Aug 8, 2014 at 3:01 AM, Paul Wolf <paulwolf333@gmail.com> wrote:
> 
> > * Uses SystemRandom class (if available, or falls back to Random)
> A simple improvement would be to also allow the user to pass in a
> Random object

That is not a bad idea. I'll create an issue for it. 

It is a design goal to use the standard library within the implementation so users have a guarantee about exactly how the data is generated. But your suggestion is not inconsistent with that. 

> 
> Have you given any thought to adding a validation mode, where the user
> provides a template and a string and wants to know if the string
> matches the template?

Isn't that what regular expressions are? Or do you have a clarifying use case? 

strgen is provided as the converse of regular expressions. 



#75935

From: Ian Kelly <ian.g.kelly@gmail.com>
Date: 2014-08-09 01:49 -0600
Message-ID: <mailman.12786.1407570638.18130.python-list@python.org>
In reply to: #75933
On Sat, Aug 9, 2014 at 12:52 AM, Paul Wolf <paulwolf333@gmail.com> wrote:
> On Friday, 8 August 2014 23:03:18 UTC+1, Ian  wrote:
>> Have you given any thought to adding a validation mode, where the user
>> provides a template and a string and wants to know if the string
>> matches the template?
>
> Isn't that what regular expressions are? Or do you have a clarifying use case?
>
> strgen is provided as the converse of regular expressions.

The syntax is not equivalent though. You can't take a strgen template,
pass it into the re module, and just expect it to work.

Also, I'm not sure how best to go about writing a regular expression
for, e.g. "12 or more letters, digits, and punctuation, including at
least one each of uppercase letter, lowercase letter, digit, and
punctuation". I'm fairly certain that language is regular, but
actually matching it with a regular expression would be a nightmare.



#75936

From: Ian Kelly <ian.g.kelly@gmail.com>
Date: 2014-08-09 01:57 -0600
Message-ID: <mailman.12787.1407571088.18130.python-list@python.org>
In reply to: #75933
On Sat, Aug 9, 2014 at 1:49 AM, Ian Kelly <ian.g.kelly@gmail.com> wrote:
> On Sat, Aug 9, 2014 at 12:52 AM, Paul Wolf <paulwolf333@gmail.com> wrote:
>> On Friday, 8 August 2014 23:03:18 UTC+1, Ian  wrote:
>>> Have you given any thought to adding a validation mode, where the user
>>> provides a template and a string and wants to know if the string
>>> matches the template?
>>
>> Isn't that what regular expressions are? Or do you have a clarifying use case?
>>
>> strgen is provided as the converse of regular expressions.
>
> The syntax is not equivalent though. You can't take a strgen template,
> pass it into the re module, and just expect it to work.
>
> Also, I'm not sure how best to go about writing a regular expression
> for, e.g. "12 or more letters, digits, and punctuation, including at
> least one each of uppercase letter, lowercase letter, digit, and
> punctuation". I'm fairly certain that language is regular, but
> actually matching it with a regular expression would be a nightmare.

To clarify further, validating that *without* using a regular
expression is not too terribly difficult, but the value that I see in
validating it with a strgen is that one could then be sure that one's
string generation and validation were equivalent. In contrast, if you
have a strgen for generation and a series of string manipulations for
validation, then it's hard to be certain there aren't any differences.
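
For reference, the "series of string manipulations" kind of validation might look like the sketch below, written for the example rule of 12 or more characters with at least one of each class. `is_valid` is a hypothetical helper; the point is that nothing ties it to whatever template generated the strings.

```python
import string

# The four classes that must each appear at least once.
CLASSES = (
    string.ascii_uppercase,
    string.ascii_lowercase,
    string.digits,
    string.punctuation,
)


def is_valid(candidate, min_len=12):
    """True if candidate is long enough, uses only letters, digits and
    punctuation, and contains at least one character from each class."""
    allowed = set("".join(CLASSES))
    if len(candidate) < min_len or not set(candidate) <= allowed:
        return False
    return all(any(c in cls for c in candidate) for cls in CLASSES)


ok = is_valid("Abcdefgh1234!")
```

If the generation rule later changes, this function has to be updated separately, which is the drift a shared template would avoid.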



#75981

From: Devin Jeanpierre <jeanpierreda@gmail.com>
Date: 2014-08-10 05:43 -0700
Message-ID: <mailman.12817.1407674628.18130.python-list@python.org>
In reply to: #75871
On Fri, Aug 8, 2014 at 2:01 AM, Paul Wolf <paulwolf333@gmail.com> wrote:
> This is a proposal with a working implementation for a random string generation template syntax for Python. `strgen` is a module for generating random strings in Python using a regex-like template language. Example:
>
>     >>> from strgen import StringGenerator as SG
>     >>> SG("[\l\d]{8:15}&[\d]&[\p]").render()
>     u'F0vghTjKalf4^mGLk'

Why aren't you using regular expressions? I am all for conciseness,
but using an existing format is so helpful...

Unfortunately, the equivalent regexp probably looks like
r'(?=.*[0-9])(?=.*[A-Z])(?=.*[a-z])[a-zA-Z0-9]{8:15}'

(I've been working on this kind of thing with regexps, but it's still
incomplete.)
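
A small sketch of how a regexp like the one above behaves as a validator with Python's `re` module. Note one assumption: `re` writes bounded repetition with a comma, `{8,15}`, and treats the colon form `{8:15}` as literal text, so the sketch uses the comma. The pattern describes 8 to 15 letters and digits with at least one digit, one uppercase and one lowercase letter, which is not quite the same language as the strgen template.

```python
import re

# Lookaheads assert "somewhere ahead there is a digit / upper / lower"
# without consuming input; the final class does the actual matching.
PATTERN = re.compile(r'(?=.*[0-9])(?=.*[A-Z])(?=.*[a-z])[a-zA-Z0-9]{8,15}')


def matches(candidate):
    """True if the whole string satisfies the pattern."""
    return PATTERN.fullmatch(candidate) is not None


result = matches("Abcdefg1")
```

This only checks strings; turning the same pattern into a generator is the harder part still described as incomplete above.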

> * Uses SystemRandom class (if available, or falls back to Random)

This sounds cryptographically weak. Isn't the normal thing to do to
use a cryptographic hash function to generate a pseudorandom sequence?

Someone should write a cryptographically secure pseudorandom number
generator library for Python. :(

(I think OpenSSL comes with one, but then you can't choose the seed.)

-- Devin



#75984

From: Steven D'Aprano <steve+comp.lang.python@pearwood.info>
Date: 2014-08-11 02:31 +1000
Message-ID: <53e79e46$0$29967$c3e8da3$5496439d@news.astraweb.com>
In reply to: #75981
Devin Jeanpierre wrote:

> On Fri, Aug 8, 2014 at 2:01 AM, Paul Wolf <paulwolf333@gmail.com> wrote:
>> This is a proposal with a working implementation for a random string
>> generation template syntax for Python. `strgen` is a module for
>> generating random strings in Python using a regex-like template language.
>> Example:
>>
>>     >>> from strgen import StringGenerator as SG
>>     >>> SG("[\l\d]{8:15}&[\d]&[\p]").render()
>>     u'F0vghTjKalf4^mGLk'
> 
> Why aren't you using regular expressions? I am all for conciseness,
> but using an existing format is so helpful...

You've just answered your own question:

> Unfortunately, the equivalent regexp probably looks like
> r'(?=.*[0-9])(?=.*[A-Z])(?=.*[a-z])[a-zA-Z0-9]{8:15}'

Apart from being needlessly verbose, regex syntax is not appropriate because
it specifies too much, specifies too little, and specifies the wrong
things. It specifies too much: regexes like ^ and $ are meaningless in this
case. It specifies too little: there's no regex for the "shuffle operator".
And it specifies the wrong things: regexes like (?= ...) as used in your
example are for matching, not generating strings, and it isn't clear
what "match any character but don't consume any of the string" means when
generating strings.

Personally, I think even the OP's specified language is too complex. For
example, it supports literal text, but given the use-case (password
generators) do we really want to support templates like "password[\d]"? I
don't think so, and if somebody did, they can trivially say "password" +
SG('[\d]').render().

Larry Wall (the creator of Perl) has stated that one of the mistakes with
Perl's regular expression mini-language is that the Huffman coding is
wrong. Common things should be short, uncommon things can afford to be
longer. Since the most common thing for password generation is to specify
character classes, they should be short, e.g. d rather than [\d] (one
character versus four).

The template given could potentially be simplified to:

"(LD){8:15}&D&P"

where the round brackets () are purely used for grouping. Character codes
are specified by a single letter. (I use uppercase to avoid the problem
that l & 1 look very similar. YMMV.) The model here is custom format codes
from spreadsheets, which should be comfortable to anyone who is familiar
with Excel or OpenOffice. If you insist on having the facility to include
literal text in your templates, might I suggest:

"'password'd"  # Literal string "password", followed by a single digit.

but personally I believe that for the use-case given, that's a mistake.
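
To make the simplified syntax concrete, here is a minimal sketch of an interpreter for templates like "(LD){8:15}&D&P". Everything in it is an assumption made for illustration: the codes L, D and P, the optional {n} or {m:n} count, and the reading of & as "generate each term, then shuffle the lot". Literal text is deliberately not supported.

```python
import random
import re
import string

CLASSES = {
    "L": string.ascii_letters,
    "D": string.digits,
    "P": string.punctuation,
}

# One term: a single code or a parenthesised group of codes,
# optionally followed by {n} or {m:n}.
TERM = re.compile(r'(?:\(([LDP]+)\)|([LDP]))(?:\{(\d+)(?::(\d+))?\})?')


def render(template, rng=None):
    """Render a template such as '(LD){8:15}&D&P' to a random string."""
    if rng is None:
        rng = random.SystemRandom()
    chars = []
    for term in template.split("&"):
        m = TERM.fullmatch(term)
        if m is None:
            raise ValueError("bad term: %r" % term)
        codes = m.group(1) or m.group(2)
        alphabet = "".join(CLASSES[code] for code in codes)
        low = int(m.group(3)) if m.group(3) else 1
        high = int(m.group(4)) if m.group(4) else low
        count = rng.randint(low, high)  # inclusive on both ends
        chars.extend(rng.choice(alphabet) for _ in range(count))
    rng.shuffle(chars)
    return "".join(chars)


sample = render("(LD){8:15}&D&P")
```

With this reading the whole parser is one regular expression and a loop, which supports the point that a smaller language is cheaper to specify and implement.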

Alternatively, date/time templates use two-character codes like %Y %m etc,
which is better than 



> (I've been working on this kind of thing with regexps, but it's still
> incomplete.)
> 
>> * Uses SystemRandom class (if available, or falls back to Random)
> 
> This sounds cryptographically weak. Isn't the normal thing to do to
> use a cryptographic hash function to generate a pseudorandom sequence?

I don't think that using a good, but not cryptographically-strong, random
number generator to generate passwords is a serious vulnerability. What's
your threat model? Attacks on passwords tend to be one of a very few:

- dictionary attacks (including tables of common passwords and 
  simple transformations of words, e.g. 'pas5w0d');

- brute force against short and weak passwords;

- attacking the hash function used to store passwords (not the password
  itself), e.g. rainbow tables;

- keyloggers or some other way of stealing the password (including
  phishing sites and the ever-popular "beat them with a lead pipe 
  until they give up the password");

- other social attacks, e.g. guessing that the person's password is their
  date of birth in reverse.

But unless the random number generator is *ridiculously* weak ("9, 9, 9, 9,
9, 9, ...") I can't see any way to realistically attack the password
generator based on the weakness of the random number generator. Perhaps I'm
missing something?
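
As a back-of-the-envelope check on the brute-force case, the size of the search space can be estimated as below. This is a sketch that assumes every character is drawn uniformly and independently, which is exactly what a reasonable generator is supposed to provide; `entropy_bits` is a hypothetical helper.

```python
import math


def entropy_bits(alphabet_size, length):
    """Bits of entropy in a string of `length` characters drawn
    uniformly at random from an alphabet of `alphabet_size` symbols."""
    return length * math.log2(alphabet_size)


# Ten characters from uppercase letters plus digits (36 symbols).
ten_upper_digits = entropy_bits(36, 10)
```

Ten characters from a 36-symbol alphabet come to roughly 52 bits, and each extra character adds a little over 5 more, so length and alphabet size dominate as long as the generator is not badly predictable.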


> Someone should write a cryptographically secure pseudorandom number
> generator library for Python. :(

Here, let me google that for you :-)

https://duckduckgo.com/html/?q=python+crypto



-- 
Steven



#75998

From: Devin Jeanpierre <jeanpierreda@gmail.com>
Date: 2014-08-10 11:28 -0700
Message-ID: <mailman.12823.1407696946.18130.python-list@python.org>
In reply to: #75984
On Sun, Aug 10, 2014 at 9:31 AM, Steven D'Aprano
<steve+comp.lang.python@pearwood.info> wrote:
>> (I've been working on this kind of thing with regexps, but it's still
>> incomplete.)
>>
>>> * Uses SystemRandom class (if available, or falls back to Random)
>>
>> This sounds cryptographically weak. Isn't the normal thing to do to
>> use a cryptographic hash function to generate a pseudorandom sequence?
>
> I don't think that using a good, but not cryptographically-strong, random
> number generator to generate passwords is a serious vulnerability. What's
> your threat model?

I've always wanted a password generator that worked on the fly based
off of a master password. If the passwords are generated randomly but
not cryptographically securely so, then given sufficiently many
passwords, the master password might be deduced. CSPRNGs guarantee
otherwise.
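
To make the idea concrete, here is a minimal sketch of deriving per-site passwords from a master password with `hashlib.pbkdf2_hmac` from the standard library (Python 3.4+). It is an illustration only, not a vetted scheme: `derive_password`, the use of the site name as salt, and the iteration count are all assumptions.

```python
import hashlib
import string

ALPHABET = string.ascii_letters + string.digits


def derive_password(master, site, length=16, iterations=200000):
    """Deterministically derive a site password from a master password."""
    key = hashlib.pbkdf2_hmac(
        "sha256",
        master.encode("utf-8"),  # the secret
        site.encode("utf-8"),    # used as the salt, so each site differs
        iterations,
        dklen=length,            # one derived byte per output character
    )
    # Caveat: mapping bytes with % is slightly biased, because 256 is
    # not a multiple of len(ALPHABET).
    return "".join(ALPHABET[b % len(ALPHABET)] for b in key)


example = derive_password("correct horse", "example.org", iterations=1000)
```

The same master password and site always give the same output, and recovering the master password from outputs would require inverting the key derivation function.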

>> Someone should write a cryptographically secure pseudorandom number
>> generator library for Python. :(
>
> Here, let me google that for you

I should clarify that OpenSSL has one (which is what I assume you're
alluding to), but it doesn't let you choose the seed, so it's useless
for deterministic password generation. There are also lots of small
libraries some person wrote at some time, but that sounds shady. ;)

-- Devin



#76014

From: Steven D'Aprano <steve+comp.lang.python@pearwood.info>
Date: 2014-08-11 12:22 +1000
Message-ID: <53e828e9$0$29966$c3e8da3$5496439d@news.astraweb.com>
In reply to: #75998
Devin Jeanpierre wrote:

> On Sun, Aug 10, 2014 at 9:31 AM, Steven D'Aprano
> <steve+comp.lang.python@pearwood.info> wrote:

>> I don't think that using a good, but not cryptographically-strong, random
>> number generator to generate passwords is a serious vulnerability. What's
>> your threat model?
> 
> I've always wanted a password generator that worked on the fly based
> off of a master password. If the passwords are generated randomly but
> not cryptographically securely so, then given sufficiently many
> passwords, the master password might be deduced.

o_O

So, what you're saying is that you're concerned that if an attacker has all
your passwords, they might be able to generate new passwords?


[...]
>>> Someone should write a cryptographically secure pseudorandom number
>>> generator library for Python. :(
>>
>> Here, let me google that for you
> 
> I should clarify that OpenSSL has one (which is what I assume you're
> alluding to), 

No. If you follow the link I provided, I'm sure you will find what you are
after.


> but it doesn't let you choose the seed, so it's useless 
> for deterministic password generation. There are also lots of small
> libraries some person wrote at some time, but that sounds shady. ;)

You mean the opposite to OpenSSL, which was handed down to Mankind from the
Gods? The size of the library doesn't matter, what matters is how well it
implements what crypto standards.




-- 
Steven



#76015

From: Chris Angelico <rosuav@gmail.com>
Date: 2014-08-11 12:31 +1000
Message-ID: <mailman.12833.1407724268.18130.python-list@python.org>
In reply to: #76014
On Mon, Aug 11, 2014 at 12:22 PM, Steven D'Aprano
<steve+comp.lang.python@pearwood.info> wrote:
> You mean the opposite to OpenSSL, which was handed down to Mankind from
> the Gods?

I thought Prometheus stole OpenSSL and gave it to mankind so a group
of Minotaurs would stop teasing him about Heartbleed.

ChrisA


