
help with explaining how to split a list of tuples into parts

Started by	peter@ifoley.id.au
First post	2013-07-12 23:43 -0700
Last post	2013-07-13 10:33 -0400
Articles	7 — 4 participants


  help with explaining how to split a list of tuples into parts peter@ifoley.id.au - 2013-07-12 23:43 -0700
    Re: help with explaining how to split a list of tuples into parts Peter Otten <__peter__@web.de> - 2013-07-13 09:28 +0200
      Re: help with explaining how to split a list of tuples into parts peter@ifoley.id.au - 2013-07-13 04:47 -0700
    Re: help with explaining how to split a list of tuples into parts Steven D'Aprano <steve+comp.lang.python@pearwood.info> - 2013-07-13 08:11 +0000
      Re: help with explaining how to split a list of tuples into parts peter@ifoley.id.au - 2013-07-13 05:05 -0700
        Re: help with explaining how to split a list of tuples into parts Roy Smith <roy@panix.com> - 2013-07-13 10:50 -0400
    Re: help with explaining how to split a list of tuples into parts Roy Smith <roy@panix.com> - 2013-07-13 10:33 -0400

#50578 — help with explaining how to split a list of tuples into parts

From	peter@ifoley.id.au
Date	2013-07-12 23:43 -0700
Subject	help with explaining how to split a list of tuples into parts
Message-ID	<fba5dac7-963f-4c1b-b40b-c0a54d681530@googlegroups.com>

Hi List,

I am new to Python and wondering if there is a better python way to do something.  As a learning exercise I decided to create a python bash script to wrap around the Python Crypt library (Version 2.7).

My attempt is located here - https://gist.github.com/pjfoley/5989653

I am trying to wrap my head around list comprehensions. I have read the docs at http://docs.python.org/2/tutorial/datastructures.html#list-comprehensions and various Google results.  I think my lack of knowledge is making it difficult to know which keywords to search on.

Essentially I have this list of tuples

# Tuple == (Hash Method, Salt Length, Magic String, Hashed Password Length)
supported_hashes=[('crypt',2,'',13), ('md5',8,'$1$',22), ('sha256',16,'$5$',43), ('sha512',16,'$6$',86)]

This list contains the valid hash methods that the Crypt Library supports plus some lookup values I want to use in the code.

I have managed to work out how to extract a list of just the first value of each tuple (line 16) which I use as part of the validation against the --hash argparse option.

My Question.

Looking at line 27: this line returns the tuple that matches the hash type the user selects from the command line, which I then split into its separate parts over lines 29 to 31.

I am wondering if there is a more efficient way to do this such that I could do:

salt_length, hash_type, expected_password_length = [x for x in supported_hashes if x[0] == args.hash]

From my limited understanding the first x is the return value from the function which meets the criteria.  So could I do something like:

... = [(x[0][1], x[0][2], x[0][3]) for x in supported_hashes if x[0] == args.hash]

I am happy to be pointed to some documentation which might help clarify what I need to do.  

Also, if there is anything else in the code that could be improved, I am happy to be contacted off list.

Thanks,

Peter.


#50580

From	Peter Otten <__peter__@web.de>
Date	2013-07-13 09:28 +0200
Message-ID	<mailman.4672.1373700539.3114.python-list@python.org>
In reply to	#50578

peter@ifoley.id.au wrote:

> Hi List,
> 
> I am new to Python and wondering if there is a better python way to do
> something.  As a learning exercise I decided to create a python bash
> script to wrap around the Python Crypt library (Version 2.7).
> 
> My attempt is located here - https://gist.github.com/pjfoley/5989653
> 
> I am trying to wrap my head around list comprehensions, I have read the
> docs at
> http://docs.python.org/2/tutorial/datastructures.html#list-comprehensions
> and read various google results.  I think my lack of knowledge is making
> it difficult to know what key word to search on.
> 
> Essentially I have this list of tuples
> 
> # Tuple == (Hash Method, Salt Length, Magic String, Hashed Password
> # Length)
> supported_hashes=[('crypt',2,'',13), ('md5',8,'$1$',22),
> ('sha256',16,'$5$',43), ('sha512',16,'$6$',86)]
> 
> This list contains the valid hash methods that the Crypt Library supports
> plus some lookup values I want to use in the code.
> 
> I have managed to work out how to extract a list of just the first value
> of each tuple (line 16) which I use as part of the validation against the
> --hash argparse option.
> 
> My Question.
> 
> Looking at line 27, This line returns the tuple that matches the hash
> type the user selects from the command line.  Which I then split the
> separate parts over lines 29 to 31.
> 
> I am wondering if there is a more efficient way to do this such that I
> could do:
> 
> salt_length, hash_type, expected_password_length = [x for x in
> supported_hashes if x[0] == args.hash]
> 
> From my limited understanding the first x is the return value from the
> function which meets the criteria.  So could I do something like:
> 
> ... = [(x[0][1], x[0][2], x[0][3]) for x in supported_hashes if x[0] ==
> args.hash]
> 
> I am happy to be pointed to some documentation which might help clarify
> what I need to do.
> 
> Also if there is anything else that could be improved on with the code
> happy to be contacted off list.

Whenever you have to look something up you should think 'dict', and I
expect that pretty soon that will happen automatically.
Also, to split a tuple into its items you can "unpack" it:

triple = (1, 2, 3)
one, two, three = triple
assert one == 1 and two == 2 and three == 3

So:

supported_hashes = {
    "crypt": (2, "", 13),
    "md5": (8, "$1$", 22),
    ...
}
...
parser.add_argument(
    '--hash', default='sha512', 
    choices=supported_hashes, # accept the keys
    help='Which Hash function to use')
...
salt_length, hash_type, expected_password_length = supported_hashes[args.hash]
...
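
A minimal runnable sketch of the same idea, with the elided parts filled in by assumptions: the dict repeats the four entries from the original post, the parser is built from scratch rather than taken from the gist, and an explicit argument list stands in for the real command line. The third name is called magic_string here to match the comment in the original post.

```python
import argparse

# Lookup table keyed by hash method; values are
# (salt length, magic string, hashed password length), copied from the thread.
supported_hashes = {
    "crypt": (2, "", 13),
    "md5": (8, "$1$", 22),
    "sha256": (16, "$5$", 43),
    "sha512": (16, "$6$", 86),
}

parser = argparse.ArgumentParser(description="Sketch only, not the gist's parser")
parser.add_argument(
    "--hash", default="sha512",
    choices=sorted(supported_hashes),  # the dict keys are the valid choices
    help="Which hash function to use")

# Parse an explicit list so the sketch runs without a real command line.
args = parser.parse_args(["--hash", "md5"])

# One dict lookup plus tuple unpacking replaces the list comprehension.
salt_length, magic_string, expected_password_length = supported_hashes[args.hash]
print(salt_length, magic_string, expected_password_length)
```

Because argparse rejects anything not in choices, the lookup cannot raise KeyError for a value that came through the parser.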


#50592

From	peter@ifoley.id.au
Date	2013-07-13 04:47 -0700
Message-ID	<2a457e73-cbb2-4f2f-9b84-5feaaab809dd@googlegroups.com>
In reply to	#50580

On Saturday, 13 July 2013 17:28:50 UTC+10, Peter Otten  wrote:
> Every time when you have to look up something you should think 'dict', and I
> expect that pretty soon that will happen automatically.
> Also, to split a tuple into its items you can "unpack" it:
>
> triple = (1, 2, 3)
> one, two, three = triple
> assert one == 1 and two == 2 and three == 3
>
> So:
>
> supported_hashes = {
>     "crypt": (2, "", 13),
>     "md5": (8, "$1$", 22),
>     ...
> }
> ...
> parser.add_argument(
>     '--hash', default='sha512',
>     choices=supported_hashes, # accept the keys
>     help='Which Hash function to use')
> ...
> salt_length, hash_type, expected_password_length = supported_hashes[args.hash]
> ...

Hi Peter,

Thanks for the pointers. I will try your suggestion out and read some more.

Peter.


#50582

From	Steven D'Aprano <steve+comp.lang.python@pearwood.info>
Date	2013-07-13 08:11 +0000
Message-ID	<51e10ba5$0$9505$c3e8da3$5496439d@news.astraweb.com>
In reply to	#50578

On Fri, 12 Jul 2013 23:43:55 -0700, peter wrote:

> Hi List,
> 
> I am new to Python and wondering if there is a better python way to do
> something.  As a learning exercise I decided to create a python bash
> script to wrap around the Python Crypt library (Version 2.7).

A Python bash script? What does that mean? Python and bash are two 
different languages.

> My attempt is located here - https://gist.github.com/pjfoley/5989653

A word of advice: don't assume that just because people are reading your 
posts, that they necessarily will follow links to view your code. There 
could be all sorts of reasons why they might not:

- they may have email access, but are blocked from the web;

- they may not have a browser that gets on with github;

- they may be reading email via a smart phone, and not want to pay extra 
to go to a website;

- too lazy, or too busy, to follow a link;

- they don't want to get bogged down in trying to debug a large block of 
someone else's code.

Or some other reason. For best results, you should try to simplify the 
problem as much as possible, bringing it down to the most trivial, easy 
example you can, small enough to include directly in the body of your 
email. That might be one line, or twenty lines.

See also: http://sscce.org/

> I am trying to wrap my head around list comprehensions, I have read the
> docs at
> http://docs.python.org/2/tutorial/datastructures.html#list-comprehensions
> and read various google results.  I think my lack of knowledge is making
> it difficult to know what key word to search on.

I don't really think that list comps have anything to do with the problem 
at hand. You seem to have discovered a hammer (list comps) and are now 
trying to hammer everything with it. I don't think the list comp is the 
right tool for the job below.

But, for what it is worth, a list comp is simply a short-cut for a for-
loop, where the body of the loop is limited to a single expression. So 
this for-loop:

result = []
for value in some_values:
    result.append(calculate(value))

can be re-written as this list comp:

result = [calculate(value) for value in some_values]
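
The same equivalence holds when the loop has a condition, which is the form used in the original question. A small sketch, where calculate and some_values are made-up placeholders:

```python
some_values = [1, 2, 3, 4, 5, 6]

def calculate(value):
    # Hypothetical placeholder for whatever work the loop body does.
    return value * 10

# For-loop with a filter...
result_loop = []
for value in some_values:
    if value % 2 == 0:
        result_loop.append(calculate(value))

# ...and the equivalent list comp: the "if" moves to the end.
result_comp = [calculate(value) for value in some_values if value % 2 == 0]

print(result_loop)
print(result_comp)
```

Either way the result is always a list, even when only one value passes the filter.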

> Essentially I have this list of tuples
> 
> # Tuple == (Hash Method, Salt Length, Magic String, Hashed Password
> Length) 
> supported_hashes=[('crypt',2,'',13), ('md5',8,'$1$',22),
> ('sha256',16,'$5$',43), ('sha512',16,'$6$',86)]
> 
> This list contains the valid hash methods that the Crypt Library
> supports plus some lookup values I want to use in the code.

Consider using namedtuple to use named fields rather than just numbered 
fields. For example:

from collections import namedtuple
Record = namedtuple("Record", "method saltlen magic hashpwdlen")

defines a new type called "Record", with four named fields. Then you 
might do something like this:

x = Record('crypt', 2, '', 13)
print(x.saltlen)

=> prints 2

You might do this:

supported_hashes = [
    Record('crypt', 2, '', 13),
    Record('md5', 8, '$1$', 22),
    Record('sha256', 16, '$5$', 43),
    Record('sha512', 16, '$6$', 86),
    ]

although I think a better plan would be to use a dict rather than a list, 
something like this:

from collections import namedtuple
CryptRecord = namedtuple("CryptRecord", 
    "salt_length magic_string expected_password_length")

supported_hashes = {
    'crypt': CryptRecord(2, '', 13),
    'md5': CryptRecord(8, '$1$', 22),
    'sha256': CryptRecord(16, '$5$', 43),
    'sha512': CryptRecord(16, '$6$', 86),
    }

This will let you look crypt methods up by name:

method = supported_hashes['md5']
print(method.magic_string)

=> prints '$1$'
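
A namedtuple is still a tuple, so indexing and unpacking keep working next to the named fields. A short sketch reusing the CryptRecord definition above (the record variable is just for illustration):

```python
from collections import namedtuple

CryptRecord = namedtuple("CryptRecord",
    "salt_length magic_string expected_password_length")

record = CryptRecord(8, '$1$', 22)

# Access by name or by position gives the same value.
by_name = record.salt_length
by_index = record[0]

# Ordinary tuple unpacking also works.
salt_length, magic_string, expected_password_length = record

print(by_name)
print(magic_string)
```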

> I have managed to work out how to extract a list of just the first value
> of each tuple (line 16) which I use as part of the validation against
> the --hash argparse option.
> 
> My Question.
> 
> Looking at line 27, This line returns the tuple that matches the hash
> type the user selects from the command line.  Which I then split the
> separate parts over lines 29 to 31.
> 
> I am wondering if there is a more efficient way to do this such that I
> could do:
> 
> salt_length, hash_type, expected_password_length = [x for x in
> supported_hashes if x[0] == args.hash]

Have you tried it? What happens when you do so? What error message do you 
get? If you print the list comp, what do you get?

Hint: on the left hand side, you have three names. On the right hand 
side, you have a list containing one item. That the list was created from 
a list comprehension is irrelevant. What happens when you do this?

spam, ham, eggs = [(1, 2, 3)]  # List of 1 item, a tuple.

What happens when you extract the item out of the list?

spam, ham, eggs = [(1, 2, 3)][0]
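
For anyone who cannot run the two lines above, here is a sketch of what to expect. The names matches, outcome and only_match are made up for illustration, and the exact wording of the error message differs between Python versions:

```python
matches = [(1, 2, 3)]  # List of 1 item, a tuple.

# Three names on the left, one item on the right: this raises ValueError.
try:
    spam, ham, eggs = matches
    outcome = "unpacked"
except ValueError as err:
    outcome = "ValueError"
    print("unpacking the list directly failed:", err)

# Pulling the single tuple out of the list first makes the counts agree.
spam, ham, eggs = matches[0]

# Alternative: a one-name target list, which also checks that the list
# holds exactly one item.
[only_match] = matches
print(only_match)
```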

> From my limited understanding the first x is the return value from the
> function which meets the criteria.  So could I do something like:

No, not the first x. The *only* x, since you only have one x that matches 
the condition.

Consider your list of supported_hashes. If you run this code:

result = []
for x in supported_hashes:
    if x[0] == args.hash:  # compare the first field, the hash method name
        result.append(x)

what is the value of result? How many items does it have? If need be, 
call len(result) to see.

You need to extract the first (only) item from the list. Then you will 
get a different error: *too many* items to unpack, instead of too few. So 
you need either an extra name on the left, which you ignore:

spam, ham, eggs = [(1, 2, 3, 4)][0]  # fails

who_cares, spam, ham, eggs = [(1, 2, 3, 4)][0]

or you need to reduce the number of items on the right:

spam, ham, eggs = [(1, 2, 3, 4)][0][1:]

Can you see why the last one works? The word you are looking for is 
"slicing", and you can test it like this:

print( [100, 200, 300, 400, 500][1:] )
print( [100, 200, 300, 400, 500][2:4] )
print( [100, 200, 300, 400, 500][2:5] )
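
For reference, a sketch of what those slices produce, followed by the same slice applied to one of the tuples from the question (the variable names are made up for illustration):

```python
numbers = [100, 200, 300, 400, 500]

tail = numbers[1:]     # everything from index 1 on: [200, 300, 400, 500]
middle = numbers[2:4]  # indexes 2 and 3 only: [300, 400]
to_end = numbers[2:5]  # indexes 2, 3 and 4: [300, 400, 500]

# Slicing a tuple gives a tuple, so dropping the first field leaves
# exactly three items to unpack.
record = ('md5', 8, '$1$', 22)
salt_length, magic_string, password_length = record[1:]

print(tail)
print(middle)
print(to_end)
print(record[1:])
```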

> ... = [(x[0][1], x[0][2], x[0][3]) for x in supported_hashes if x[0] ==
> args.hash]

You don't need to manually split the x tuple into 3 pieces. Slicing is 
faster and simpler:

[x[1:] for x in supported_hashes if x[0] == args.hash]

But if you use my suggestion for a dictionary, you can just say:

salt_length, hash_type, password_length = supported_hashes[args.hash]

as a simple, fast lookup, no list comp needed.
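
If the data already exists as a list of tuples, it does not have to be retyped: a dict comprehension (Python 2.7 and later) can build the dict from it. A sketch, where supported_hashes_list and chosen are made-up names and chosen stands in for args.hash:

```python
supported_hashes_list = [
    ('crypt', 2, '', 13),
    ('md5', 8, '$1$', 22),
    ('sha256', 16, '$5$', 43),
    ('sha512', 16, '$6$', 86),
]

# Key on the first field; keep the remaining three fields as the value.
supported_hashes = {row[0]: row[1:] for row in supported_hashes_list}

chosen = 'sha256'  # stand-in for args.hash
salt_length, magic_string, password_length = supported_hashes[chosen]

print(sorted(supported_hashes))
print(salt_length, magic_string, password_length)
```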

-- 
Steven


#50593

From	peter@ifoley.id.au
Date	2013-07-13 05:05 -0700
Message-ID	<892e3baa-b214-4c57-a828-a51db0ff7595@googlegroups.com>
In reply to	#50582

On Saturday, 13 July 2013 18:11:18 UTC+10, Steven D'Aprano  wrote:
> On Fri, 12 Jul 2013 23:43:55 -0700, peter wrote:
> > 
> > I am new to Python and wondering if there is a better python way to do
> 
> A Python bash script? What does that mean? Python and bash are two 
> different languages.

Sorry, I was trying to give some context; my mistake. I am glad it did not confuse the intent of my question and that you were able to provide some really helpful suggestions below.

> 
> > My attempt is located here - https://gist.github.com/pjfoley/5989653
>
> A word of advice: don't assume that just because people are reading your
> posts, that they necessarily will follow links to view your code. There
> could be all sorts of reasons why they might not:
>
> * SNIP Justification *
>
> Or some other reason. For best results, you should try to simplify the
> problem as much as possible, bringing it down to the most trivial, easy
> example you can, small enough to include directly in the body of your
> email. That might be one line, or twenty lines.
> 

In my defence, I was trying to give some context for my problem domain.  The bottom of the original post (which you provided some suggestions on) had the simplified question.  I find that people are more likely to help if they don't think it's just a *homework question* and can see that someone is genuinely trying to learn.

As I mentioned in the very first line of my email, I am just starting to learn Python and was not even sure what I was supposed to be looking for.  Obviously I found the hammer straight away *BINGO*!  However, in my defence, when I was searching I did not have the necessary knowledge to even start looking in the right place.  For reference, I started searching for slices, splitting and separating.

I will look into the techniques and keywords you suggested and try your various suggestions.

Thanks for your pointers.

Peter


#50597

From	Roy Smith <roy@panix.com>
Date	2013-07-13 10:50 -0400
Message-ID	<roy-FAFA7D.10503813072013@70-1-84-166.pools.spcsdns.net>
In reply to	#50593

In article <892e3baa-b214-4c57-a828-a51db0ff7595@googlegroups.com>,
 peter@ifoley.id.au wrote:

> In my defence I was trying to give some context for my problem domain.

I think posting a link to the github page was perfectly fine.  It wasn't 
a huge amount of code to look at, and the way github presents the code 
with line numbers (which you referred to) made it easy to understand 
what you were asking.


#50595

From	Roy Smith <roy@panix.com>
Date	2013-07-13 10:33 -0400
Message-ID	<roy-85313F.10332713072013@70-1-84-166.pools.spcsdns.net>
In reply to	#50578

In article <fba5dac7-963f-4c1b-b40b-c0a54d681530@googlegroups.com>,
 peter@ifoley.id.au wrote:

> Hi List,
> 
> I am new to Python and wondering if there is a better python way to do 
> something.  As a learning exercise I decided to create a python bash script 
> to wrap around the Python Crypt library (Version 2.7).
> 
> My attempt is located here - https://gist.github.com/pjfoley/5989653

This looks like it should work, but it's a kind of weird use of list 
comprehensions.  Fundamentally, you're not trying to create a list, 
you're trying to select the one item which matches your key.  A better 
data structure would be a dict:

supported_hashes={'crypt':   (2,  '',    13),
                  'md5':     (8,  '$1$', 22),
                  'sha256':  (16, '$5$', 43),
                  'sha512':  (16, '$6$', 86),
                  }

then your selection logic becomes:

try:
    crypt_tuple = supported_hashes[args.hash]
except KeyError:
    print "unknown hash type"

Another thing you might want to look into is named tuples 
(http://docs.python.org/2/library/collections.html).  You could do 
something like:

from collections import namedtuple
HashInfo = namedtuple('HashInfo', ['salt_length',
                                   'hash_type',
                                   'expected_password_length'])

supported_hashes={'crypt':   HashInfo(2,  '',    13),
                  'md5':     HashInfo(8,  '$1$', 22),
                  'sha256':  HashInfo(16, '$5$', 43),
                  'sha512':  HashInfo(16, '$6$', 86),
                  }

and now you can refer to the tuple elements by name instead of by 
numeric index.
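
Putting the two pieces together, a sketch of the dict lookup with KeyError handling and access by field name. The describe helper is hypothetical and exists only to show the pattern; it is not part of the gist:

```python
from collections import namedtuple

HashInfo = namedtuple('HashInfo', ['salt_length',
                                   'hash_type',
                                   'expected_password_length'])

supported_hashes = {'crypt':   HashInfo(2,  '',    13),
                    'md5':     HashInfo(8,  '$1$', 22),
                    'sha256':  HashInfo(16, '$5$', 43),
                    'sha512':  HashInfo(16, '$6$', 86),
                    }

def describe(hash_name):
    """Hypothetical helper: summarise one entry, or report an unknown name."""
    try:
        info = supported_hashes[hash_name]
    except KeyError:
        return "unknown hash type: %s" % hash_name
    # Fields are read by name, not by numeric index.
    return "%s: salt_length=%d, expected_password_length=%d" % (
        hash_name, info.salt_length, info.expected_password_length)

print(describe('md5'))
print(describe('rot13'))
```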

