Groups > comp.lang.python > #42377 > unrolled thread

Creating a dictionary from a .txt file

Started by	"C.T." <swilks06@gmail.com>
First post	2013-03-31 08:52 -0700
Last post	2013-04-01 20:12 -0400
Articles	20 — 9 participants

  Creating a dictionary from a .txt file "C.T." <swilks06@gmail.com> - 2013-03-31 08:52 -0700
    Re: Creating a dictionary from a .txt file Chris Angelico <rosuav@gmail.com> - 2013-04-01 03:06 +1100
      Re: Creating a dictionary from a .txt file "C.T." <swilks06@gmail.com> - 2013-03-31 10:19 -0700
        Re: Creating a dictionary from a .txt file Chris Angelico <rosuav@gmail.com> - 2013-04-01 04:22 +1100
      Re: Creating a dictionary from a .txt file "C.T." <swilks06@gmail.com> - 2013-03-31 10:19 -0700
    Re: Creating a dictionary from a .txt file Mark Janssen <dreamingforward@gmail.com> - 2013-03-31 09:20 -0700
      Re: Creating a dictionary from a .txt file "C.T." <swilks06@gmail.com> - 2013-03-31 09:52 -0700
        Re: Creating a dictionary from a .txt file Dave Angel <davea@davea.name> - 2013-03-31 13:31 -0400
          Re: Creating a dictionary from a .txt file Roy Smith <roy@panix.com> - 2013-03-31 14:41 -0400
            Re: Creating a dictionary from a .txt file Dave Angel <davea@davea.name> - 2013-03-31 17:37 -0400
            Re: Creating a dictionary from a .txt file Neil Cerutti <neilc@norwich.edu> - 2013-04-01 11:41 +0000
              Re: Creating a dictionary from a .txt file Steven D'Aprano <steve+comp.lang.python@pearwood.info> - 2013-04-01 23:53 +0000
                Re: Creating a dictionary from a .txt file Walter Hurry <walterhurry@lavabit.com> - 2013-04-02 00:28 +0000
                Re: Creating a dictionary from a .txt file Neil Cerutti <neilc@norwich.edu> - 2013-04-03 12:55 +0000
      Re: Creating a dictionary from a .txt file "C.T." <swilks06@gmail.com> - 2013-03-31 09:52 -0700
    Re: Creating a dictionary from a .txt file Roy Smith <roy@panix.com> - 2013-03-31 12:38 -0400
      Re: Creating a dictionary from a .txt file "C.T." <swilks06@gmail.com> - 2013-03-31 10:28 -0700
    Re: Creating a dictionary from a .txt file Terry Jan Reedy <tjreedy@udel.edu> - 2013-03-31 15:04 -0400
    Re: Creating a dictionary from a .txt file "C.T." <swilks06@gmail.com> - 2013-04-01 16:53 -0700
      Re: Creating a dictionary from a .txt file Dave Angel <davea@davea.name> - 2013-04-01 20:12 -0400

#42377 — Creating a dictionary from a .txt file

From	"C.T." <swilks06@gmail.com>
Date	2013-03-31 08:52 -0700
Subject	Creating a dictionary from a .txt file
Message-ID	<d15c39bc-5d2a-42c9-a76b-23768b61c391@googlegroups.com>

Hello,

I'm currently working on a homework problem that requires me to create a dictionary from a .txt file that contains some of the worst cars ever made. The file looks something like this:

1958 MGA Twin Cam
1958 Zunndapp Janus
1961 Amphicar
1961 Corvair
1966 Peel Trident
1970 AMC Gremlin
1970 Triumph Stag
1971 Chrysler Imperial LeBaron Two-Door Hardtop

The car manufacturer should be the key and a tuple containing the year and the model should be the key's value. I tried the following to just get the contents of the file into a list, but only the very last line in the txt file is shown as a list with three elements (ie, ['2004', 'Chevy', 'SSR']) when I print temp.

d={}
car_file = open('worstcars.txt', 'r')
for line in car_file:
    temp = line.split()
print (temp)
car_file.close()

After playing around with the code, I came up with the following code to get everything into a list:

d=[]
car_file = open('worstcars.txt', 'r')
for line in car_file:
    d.append(line.strip('\n'))
print (d)
car_file.close()

Every line is now an element in list d. The question I have now is how can I make a dictionary out of the list d with the car manufacturer as the key and a tuple containing the year and the model should be the key's value. Here is a sample of what list d looks like:

['1899 Horsey Horseless', '1909 Ford Model T', '1911 Overland OctoAuto', '2003 Hummer H2', '2004 Chevy SSR']

Any help would be appreciated!

#42378

From	Chris Angelico <rosuav@gmail.com>
Date	2013-04-01 03:06 +1100
Message-ID	<mailman.4018.1364745985.2939.python-list@python.org>
In reply to	#42377

On Mon, Apr 1, 2013 at 2:52 AM, C.T. <swilks06@gmail.com> wrote:
> After playing around with the code, I came up with the following code to get everything into a list:
>
> d=[]
> car_file = open('worstcars.txt', 'r')
> for line in car_file:
>     d.append(line.strip('\n'))
> print (d)
> car_file.close()
>
> Every line is now an element in list d. The question I have now is how can I make a dictionary out of the list d with the car manufacturer as the key and a tuple containing the year and the model should be the key's value.

Ah, a nice straight-forward text parsing problem!

The question is how to recognize the manufacturer. Is it guaranteed to
be the second blank-delimited word, with the year being the first? If
so, you were almost there with .split().

car_file = open('worstcars.txt', 'r')
# You may want to consider the 'with' statement here - no need to close()
for line in car_file:
    temp = line.split(None, 2)
    if len(temp)==3:
        year, mfg, model = temp
        # Now do something with these three values
        print("Manufacturer: %s  Year: %s  Model: %s"%(mfg,year,model))

That's sorted out the parsing side of things. Do you know how to build
up the dictionary from there?

What happens if there are multiple entries in the file for the same
manufacturer? Do you need to handle that?
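
One possible way to take it from there, as a minimal sketch. It assumes every line is "year manufacturer model" with a one-word manufacturer, and it keeps a list of (year, model) tuples per manufacturer so that repeated manufacturers are not lost. The name build_car_dict and the sample lines are illustrative only, not part of the assignment.

```python
def build_car_dict(lines):
    """Map manufacturer -> list of (year, model) tuples.

    Assumes each line is 'year manufacturer model' and that the
    manufacturer is a single word (an assumption, not a given).
    """
    cars = {}
    for line in lines:
        parts = line.split(None, 2)
        if len(parts) != 3:
            continue  # skip lines that lack one of the three fields
        year, mfg, model = parts
        # The last field keeps its trailing newline, so strip it.
        cars.setdefault(mfg, []).append((year, model.strip()))
    return cars


sample = ["1909 Ford Model T\n", "1971 Ford Pinto\n", "1961 Corvair\n"]
print(build_car_dict(sample))
```

Lines with only two fields, such as the Corvair one, are silently skipped here; whether that is acceptable depends on the assignment.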

ChrisA

#42383

From	"C.T." <swilks06@gmail.com>
Date	2013-03-31 10:19 -0700
Message-ID	<e181e462-38f7-4312-9125-ef47ef42b284@googlegroups.com>
In reply to	#42378

On Sunday, March 31, 2013 12:06:18 PM UTC-4, Chris Angelico wrote:
> On Mon, Apr 1, 2013 at 2:52 AM, C.T. wrote:
> > After playing around with the code, I came up with the following code to get everything into a list:
> >
> > d=[]
> > car_file = open('worstcars.txt', 'r')
> > for line in car_file:
> >     d.append(line.strip('\n'))
> > print (d)
> > car_file.close()
> >
> > Every line is now an element in list d. The question I have now is how can I make a dictionary out of the list d with the car manufacturer as the key and a tuple containing the year and the model should be the key's value.
>
> Ah, a nice straight-forward text parsing problem!
>
> The question is how to recognize the manufacturer. Is it guaranteed to
> be the second blank-delimited word, with the year being the first? If
> so, you were almost there with .split().
>
> car_file = open('worstcars.txt', 'r')
> # You may want to consider the 'with' statement here - no need to close()
> for line in car_file:
>     temp = line.split(None, 2)
>     if len(temp)==3:
>         year, mfg, model = temp
>         # Now do something with these three values
>         print("Manufacturer: %s  Year: %s  Model: %s"%(mfg,year,model))
>
> That's sorted out the parsing side of things. Do you know how to build
> up the dictionary from there?
>
> What happens if there are multiple entries in the file for the same
> manufacturer? Do you need to handle that?
>
> ChrisA

Thank you, Chris! I could use slicing and indexing to build the dictionary, but the problem is with the car manufacturer and the car model. Either or both could be multiple names.

#42385

From	Chris Angelico <rosuav@gmail.com>
Date	2013-04-01 04:22 +1100
Message-ID	<mailman.4022.1364750566.2939.python-list@python.org>
In reply to	#42383

On Mon, Apr 1, 2013 at 4:19 AM, C.T. <swilks06@gmail.com> wrote:
> Thank you, Chris! I could use slicing and indexing to build the dictionary but the problem is with the car manufacturer an the car model. Either or both could be multiple names.

Then you're going to need some other form of magic to recognize where
the manufacturer ends and the model starts. Do you have, say, tabs
between the fields and spaces within?

ChrisA

#42379

From	Mark Janssen <dreamingforward@gmail.com>
Date	2013-03-31 09:20 -0700
Message-ID	<mailman.4019.1364746832.2939.python-list@python.org>
In reply to	#42377

>
> Every line is now an element in list d. The question I have now is how can
> I make a dictionary out of the list d with the car manufacturer as the key
> and a tuple containing the year and the model should be the key's value.
> Here is a sample of what list d looks like:
>
> ['1899 Horsey Horseless', '1909 Ford Model T', '1911 Overland OctoAuto',
> '2003 Hummer H2', '2004 Chevy SSR']
>
> Any help would be appreciated!
>
>
As long as your data is consistently ordered, just use list indexing.  d[2]
is your key, and (d[1],d[3]) the key's value.
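
A minimal sketch of that suggestion, assuming the indexing is applied to the whitespace-split fields of a single line rather than to the list of lines, and that each field is a single word. Python indices start at 0, so the year, manufacturer and model are fields 0, 1 and 2. The names line, fields and cars are illustrative.

```python
line = "2004 Chevy SSR"
fields = line.split()

# fields[0] is the year, fields[1] the manufacturer, fields[2] the model.
cars = {fields[1]: (fields[0], fields[2])}

print(cars)
```

This only holds while every field is exactly one word; a model such as "Model T" would already break it.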

Mark
Tacoma, Washington

#42381

From	"C.T." <swilks06@gmail.com>
Date	2013-03-31 09:52 -0700
Message-ID	<d1a5fee7-2ce6-4446-8193-587e4c12a0a0@googlegroups.com>
In reply to	#42379

On Sunday, March 31, 2013 12:20:25 PM UTC-4, zipher wrote:
> > Every line is now an element in list d. The question I have now is how can I make a dictionary out of the list d with the car manufacturer as the key and a tuple containing the year and the model should be the key's value. Here is a sample of what list d looks like:
> >
> > ['1899 Horsey Horseless', '1909 Ford Model T', '1911 Overland OctoAuto', '2003 Hummer H2', '2004 Chevy SSR']
> >
> > Any help would be appreciated!
>
> As long as your data is consistently ordered, just use list indexing.  d[2] is your key, and (d[1],d[3]) the key's value.
>
> Mark
> Tacoma, Washington


Thank you, Mark! My problem is the data isn't consistently ordered. I can use slicing and indexing to put the year into a tuple, but because a car manufacturer could have two names (e.g., Aston Martin) or a car model could have two names (e.g., Iron Duke), it's harder to use slicing and indexing for those two. I've added the following, but the output is still not what I need it to be.


t={}
for i in d :
    t[d[d.index(i)][5:]]= tuple(d[d.index(i)][:4])

print (t)

The output looks something like this:

{'Ford Model T': ('1', '9', '0', '9'), 'Mosler Consulier GTP': ('1', '9', '8', '5'), 'Scripps-Booth Bi-Autogo': ('1', '9', '1', '3'), 'Morgan Plus 8 Propane': ('1', '9', '7', '5'), 'Fiat Multipla': ('1', '9', '9', '8'), 'Ford Pinto': ('1', '9', '7', '1'), 'Triumph Stag': ('1', '9', '7', '0'), 'BMW 7-series': ('2', '0', '0', '2')}


Here the key is the car manufacturer plus the car model, and the value is a tuple containing the individual digits of the year, separated by commas. (Not sure why that is?)
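
The likely reason, shown as a small sketch: tuple() iterates over its argument, and iterating over a string yields one character at a time, so a four-character year turns into four one-character elements. Writing the tuple with parentheses and a trailing comma keeps the year whole. Note also that d[d.index(i)] is simply i, so the lookup can be dropped. The variable names below are illustrative.

```python
entry = "1909 Ford Model T"

# tuple() iterates over the string, yielding one character per element.
split_year = tuple(entry[:4])

# A one-element tuple needs a trailing comma; the string stays whole.
whole_year = (entry[:4],)

print(split_year)
print(whole_year)
```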

#42387

From	Dave Angel <davea@davea.name>
Date	2013-03-31 13:31 -0400
Message-ID	<mailman.4023.1364751102.2939.python-list@python.org>
In reply to	#42381

On 03/31/2013 12:52 PM, C.T. wrote:
> On Sunday, March 31, 2013 12:20:25 PM UTC-4, zipher wrote:
>>  <SNIP>
>>
>
> Thank you, Mark! My problem is the data isn't consistently ordered. I can use slicing and indexing to put the year into a tuple, but because a car manufacturer could have two names (ie, Aston Martin) or a car model could have two names(ie, Iron Duke), its harder to use slicing and indexing for those two.  I've added the following, but the output is still not what I need it to be.
>
>

So the correct answer is "it cannot be done," and an explanation.

Many times I've been given impossible conditions for a problem.  And 
invariably the correct solution is to press bac on the supplier of the 
constraints.

Unless there are some invisible characters in that file, lie tabs in 
between the fields, it loocs liec you're out of luc.  Or you could 
manually edit the file before running the program.

[The character after 'j' is broccen on this cceyboard.]
-- 
DaveA

#42390

From	Roy Smith <roy@panix.com>
Date	2013-03-31 14:41 -0400
Message-ID	<roy-A43BBD.14414131032013@news.panix.com>
In reply to	#42387

In article <mailman.4023.1364751102.2939.python-list@python.org>,
 Dave Angel <davea@davea.name> wrote:

> On 03/31/2013 12:52 PM, C.T. wrote:
> > On Sunday, March 31, 2013 12:20:25 PM UTC-4, zipher wrote:
> >>  <SNIP>
> >>
> >
> > Thank you, Mark! My problem is the data isn't consistently ordered. I can 
> > use slicing and indexing to put the year into a tuple, but because a car 
> > manufacturer could have two names (ie, Aston Martin) or a car model could 
> > have two names(ie, Iron Duke), its harder to use slicing and indexing for 
> > those two.  I've added the following, but the output is still not what I 
> > need it to be.
> 
> So the correct answer is "it cannot be done," and an explanation.
> 
> Many times I've been given impossible conditions for a problem.  And 
> invariably the correct solution is to press [back] on the supplier of the 
> constraints.

In real life, you often have to deal with crappy input data (and bogus 
project requirements).  Sometimes you just need to be creative.

There's only a small set of car manufacturers.  A good start would be 
mining wikipedia's [[List of automobile manufacturers]].  Once you've 
got that list, you could try matching portions of the input against the 
list.

Depending on how much effort you wanted to put into this, you could 
explore all sorts of fuzzy matching (ie "delorean" vs "delorean motor 
company"), but even a simple search is better than giving up.
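
The "simple search" idea could look roughly like the sketch below. Everything in it is an assumption for illustration: KNOWN_MAKERS is a tiny hand-picked list rather than anything mined from Wikipedia, parse_line is a hypothetical helper name, the sample lines are made up, and the year is assumed to be separated from the rest by a single space.

```python
# Hypothetical, hand-picked list; a real one would be much longer.
KNOWN_MAKERS = ["Aston Martin", "AMC", "Chrysler", "Ford", "Triumph"]


def parse_line(line, makers=KNOWN_MAKERS):
    """Split 'year [manufacturer] model' using a list of known makers."""
    year, _, rest = line.strip().partition(" ")
    # Try longer names first so a multi-word maker wins over a shorter prefix.
    for maker in sorted(makers, key=len, reverse=True):
        if rest == maker or rest.startswith(maker + " "):
            return year, maker, rest[len(maker):].strip()
    # No known maker found: treat the whole remainder as the model.
    return year, "Unknown", rest


print(parse_line("1970 AMC Gremlin\n"))
print(parse_line("1976 Aston Martin Lagonda\n"))
print(parse_line("1961 Corvair\n"))
```

Exact prefix matching like this will miss spelling variants, which is where the fuzzy matching mentioned above would come in.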

And, this is a good excuse to explore some of the interesting 
third-party modules.  For example, mwclient ("pip install mwclient") 
gives you a neat Python interface to wikipedia.  And there's a whole 
landscape of string matching packages to explore.

We deal with this every day at Songza.  Are Kesha and Ke$ha the same 
artist?  Pushing back on the record labels to clean up their catalogs 
isn't going to get us very far.

#42418

From	Dave Angel <davea@davea.name>
Date	2013-03-31 17:37 -0400
Message-ID	<mailman.4032.1364765875.2939.python-list@python.org>
In reply to	#42390

On 03/31/2013 02:41 PM, Roy Smith wrote:
> In article <mailman.4023.1364751102.2939.python-list@python.org>,
>   Dave Angel <davea@davea.name> wrote:
>
>> On 03/31/2013 12:52 PM, C.T. wrote:
>>> On Sunday, March 31, 2013 12:20:25 PM UTC-4, zipher wrote:
>>>>   <SNIP>
>>>>
>>>
>>> Thank you, Mark! My problem is the data isn't consistently ordered. I can
>>> use slicing and indexing to put the year into a tuple, but because a car
>>> manufacturer could have two names (ie, Aston Martin) or a car model could
>>> have two names(ie, Iron Duke), its harder to use slicing and indexing for
>>> those two.  I've added the following, but the output is still not what I
>>> need it to be.
>>
>> So the correct answer is "it cannot be done," and an explanation.
>>
>> Many times I've been given impossible conditions for a problem.  And
>> invariably the correct solution is to press [back] on the supplier of the
>> constraints.
>
> In real life, you often have to deal with crappy input data (and bogus
> project requirements).  Sometimes you just need to be creative.
>
> There's only a small set of car manufacturers.  A good start would be
> mining wikipedia's [[List of automobile manufacturers]].  Once you've
> got that list, you could try matching portions of the input against the
> list.
>
> Depending on how much effort you wanted to put into this, you could
> explore all sorts of fuzzy matching (ie "delorean" vs "delorean motor
> company"), but even a simple search is better than giving up.
>
> And, this is a good excuse to explore some of the interesting
> third-party modules.  For example, mwclient ("pip install mwclient")
> gives you a neat Python interface to wikipedia.  And there's a whole
> landscape of string matching packages to explore.
>
> We deal with this every day at Songza.  Are Kesha and Ke$ha the same
> artist?  Pushing back on the record labels to clean up their catalogs
> isn't going to get us very far.
>

I agree with everything you've said, although in your case, presumably 
the record labels are not your client/boss, so that's not who you push 
back against.  The client should know when the data is being fudged, and 
have a say in how it's to be done.

But this is a homework assignment.  I think the OP is learning Python, 
not how to second-guess a client.


-- 
DaveA

#42456

From	Neil Cerutti <neilc@norwich.edu>
Date	2013-04-01 11:41 +0000
Message-ID	<arta2fFh3neU2@mid.individual.net>
In reply to	#42390

On 2013-03-31, Roy Smith <roy@panix.com> wrote:
> And, this is a good excuse to explore some of the interesting
> third-party modules.  For example, mwclient ("pip install
> mwclient") gives you a neat Python interface to wikipedia.  And
> there's a whole landscape of string matching packages to
> explore.
>
> We deal with this every day at Songza.  Are Kesha and Ke$ha the
> same artist?  Pushing back on the record labels to clean up
> their catalogs isn't going to get us very far.

I tried searching for Frost*, an interesting artist I recently
learned about. His name, in combination with a similarly named
rap artist, breaks most search tools.

My guess is this homework is simply borken.

-- 
Neil Cerutti

#42525

From	Steven D'Aprano <steve+comp.lang.python@pearwood.info>
Date	2013-04-01 23:53 +0000
Message-ID	<515a1e03$0$29967$c3e8da3$5496439d@news.astraweb.com>
In reply to	#42456

On Mon, 01 Apr 2013 11:41:03 +0000, Neil Cerutti wrote:

> I tried searching for Frost*, an interesting artist I recently learned
> about. 

"Interesting artist" -- is that another term for "wanker"?

*wink*

> His name, in combination with a similarly named rap artist,
> breaks most search tools.

As far as I'm concerned, anyone in the 21st century who names themselves 
or their work (a movie, book, programming language, etc.) something which 
breaks search tools is just *begging* for obscurity, and we ought to 
respect their wishes.

-- 
Steven

#42532

From	Walter Hurry <walterhurry@lavabit.com>
Date	2013-04-02 00:28 +0000
Message-ID	<kjd8n7$44v$1@news.albasani.net>
In reply to	#42525

On Mon, 01 Apr 2013 23:53:40 +0000, Steven D'Aprano wrote:

> As far as I'm concerned, anyone in the 21st century who names themselves
> or their work (a movie, book, programming language, etc.) something
> which breaks search tools is just *begging* for obscurity, and we ought
> to respect their wishes.

IIRC, there was an eccentric musician some years back who did just that.

I seem to remember that he changed his name to some kind of androgynous 
looking rune with what appeared to be a bent trumpet across it.

Thenceforth it became known as "the (piss) artist: formally gnome ass 
prints", or at least something sounding vaguely like that.

#42659

From	Neil Cerutti <neilc@norwich.edu>
Date	2013-04-03 12:55 +0000
Message-ID	<as2n6rFnjp7U1@mid.individual.net>
In reply to	#42525

On 2013-04-01, Steven D'Aprano <steve+comp.lang.python@pearwood.info> wrote:
> On Mon, 01 Apr 2013 11:41:03 +0000, Neil Cerutti wrote:
>
>
>> I tried searching for Frost*, an interesting artist I recently learned
>> about. 
>
> "Interesting artist" -- is that another term for "wanker"?
>
> *wink*

hee-hee. It depends on how much of a hankering you have for
pretentious progressive synth-rock.

>> His name, in combination with a similarly named rap artist,
>> breaks most search tools.
>
> As far as I'm concerned, anyone in the 21st century who names
> themselves or their work (a movie, book, programming language,
> etc.) something which breaks search tools is just *begging* for
> obscurity, and we ought to respect their wishes.

I do think it's something he did on purpose. The asterisk, I
believe, symbolizes the exclusive genius of his fans.

-- 
Neil Cerutti

#42380

From	Roy Smith <roy@panix.com>
Date	2013-03-31 12:38 -0400
Message-ID	<roy-7B35BF.12385631032013@news.panix.com>
In reply to	#42377

In article <d15c39bc-5d2a-42c9-a76b-23768b61c391@googlegroups.com>,
 "C.T." <swilks06@gmail.com> wrote:

> Hello,
> 
> I'm currently working on a homework problem that requires me to create a 
> dictionary from a .txt file that contains some of the worst cars ever made. 
> The file looks something like this:
> 
> 1958 MGA Twin Cam
> 1958 Zunndapp Janus
> 1961 Amphicar
> 1961 Corvair
> 1966 Peel Trident
> 1970 AMC Gremlin
> 1970 Triumph Stag
> 1971 Chrysler Imperial LeBaron Two-Door Hardtop
> 
> The car manufacturer should be the key and a tuple containing the year and 
> the model should be the key's value. I tried the following to just get the 
> contents of the file into a list, but only the very last line in the txt file 
> is shown as a list with three elements (ie, ['2004', 'Chevy', 'SSR']) when I 
> print temp.
> 
> d={}
> car_file = open('worstcars.txt', 'r')
> for line in car_file:
>     temp = line.split()
> print (temp)
> car_file.close()

Yup.  Because you run through the whole file, putting each line into 
temp, overwriting the previous temp value.
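
A sketch of one way to avoid that: do the work inside the loop body, so each line is handled before temp is rebound to the next one. Here io.StringIO stands in for the real file (with made-up sample contents) so the example runs on its own, and rows is an illustrative name.

```python
import io

# Stand-in for open('worstcars.txt', 'r'); the contents are a sample.
car_file = io.StringIO("1958 MGA Twin Cam\n1970 AMC Gremlin\n")

rows = []
with car_file:
    for line in car_file:
        temp = line.split()
        rows.append(temp)  # indented under the loop: runs once per line

print(rows)
```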

> d=[]
> car_file = open('worstcars.txt', 'r')
> for line in car_file:
>     d.append(line.strip('\n'))
> print (d)
> car_file.close()

You could do most of that with just:

car_file = open('worstcars.txt', 'r')
d = car_file.readlines()

but there's no real reason to read the whole file into a list.  What you 
probably want to do is something like:

d = {}
car_file = open('worstcars.txt', 'r')
for line in car_file:
   year, manufacturer, model = parse_line(line)
   d[manufacturer] = (year, model)

One comment about the above; it assumes that there's only a single entry 
for a given manufacturer in the file.  If that's not true, the above 
code will only keep the last one.  But let's assume it's true for the 
moment.

Now, we're just down to writing parse_line().  This takes a string and 
breaks it up into 3 strings.  I'm going to leave this as an exercise for 
you to work out.  The complicated part is going to be figuring out some 
logic to deal with anything from multi-word model names ("Imperial 
LeBaron Two-Door Hardtop"), to lines like the Corvair where there is no 
manufacturer (or maybe there's no model?).

#42386

From	"C.T." <swilks06@gmail.com>
Date	2013-03-31 10:28 -0700
Message-ID	<dc1bc939-fd8d-436d-872a-19779602351e@googlegroups.com>
In reply to	#42380

On Sunday, March 31, 2013 12:38:56 PM UTC-4, Roy Smith wrote:
> In article <d15c39bc-5d2a-42c9-a76b-23768b61c391@googlegroups.com>,
>  "C.T."  wrote:
>
> > Hello,
> >
> > I'm currently working on a homework problem that requires me to create a
> > dictionary from a .txt file that contains some of the worst cars ever made.
> > The file looks something like this:
> >
> > 1958 MGA Twin Cam
> > 1958 Zunndapp Janus
> > 1961 Amphicar
> > 1961 Corvair
> > 1966 Peel Trident
> > 1970 AMC Gremlin
> > 1970 Triumph Stag
> > 1971 Chrysler Imperial LeBaron Two-Door Hardtop
> >
> > The car manufacturer should be the key and a tuple containing the year and
> > the model should be the key's value. I tried the following to just get the
> > contents of the file into a list, but only the very last line in the txt file
> > is shown as a list with three elements (ie, ['2004', 'Chevy', 'SSR']) when I
> > print temp.
> >
> > d={}
> > car_file = open('worstcars.txt', 'r')
> > for line in car_file:
> >     temp = line.split()
> > print (temp)
> > car_file.close()
>
> Yup.  Because you run through the whole file, putting each line into
> temp, overwriting the previous temp value.
>
> > d=[]
> > car_file = open('worstcars.txt', 'r')
> > for line in car_file:
> >     d.append(line.strip('\n'))
> > print (d)
> > car_file.close()
>
> You could do most of that with just:
>
> car_file = open('worstcars.txt', 'r')
> d = car_file.readlines()
>
> but there's no real reason to read the whole file into a list.  What you
> probably want to do is something like:
>
> d = {}
> car_file = open('worstcars.txt', 'r')
> for line in car_file:
>    year, manufacturer, model = parse_line(line)
>    d[manufacturer] = (year, model)
>
> One comment about the above; it assumes that there's only a single entry
> for a given manufacturer in the file.  If that's not true, the above
> code will only keep the last one.  But let's assume it's true for the
> moment.
>
> Now, we're just down to writing parse_line().  This takes a string and
> breaks it up into 3 strings.  I'm going to leave this as an exercise for
> you to work out.  The complicated part is going to be figuring out some
> logic to deal with anything from multi-word model names ("Imperial
> LeBaron Two-Door Hardtop"), to lines like the Corvair where there is no
> manufacturer (or maybe there's no model?).

Roy, thank you so much! I'll do some more research to see how I can achieve this. Thank you!

#42392

From	Terry Jan Reedy <tjreedy@udel.edu>
Date	2013-03-31 15:04 -0400
Message-ID	<mailman.4024.1364756707.2939.python-list@python.org>
In reply to	#42377

On 3/31/2013 11:52 AM, C.T. wrote:
> Hello,
>
> I'm currently working on a homework problem that requires me to create a dictionary from a .txt file that contains some of the worst cars ever made. The file looks something like this:
>
> 1958 MGA Twin Cam
> 1958 Zunndapp Janus
> 1961 Amphicar
> 1961 Corvair
> 1966 Peel Trident
> 1970 AMC Gremlin
> 1970 Triumph Stag
> 1971 Chrysler Imperial LeBaron Two-Door Hardtop
>
> The car manufacturer should be the key and a tuple containing the year and the model should be the key's value. I tried the following to just get the contents of the file into a list, but only the very last line in the txt file is shown as a list with three elements (ie, ['2004', 'Chevy', 'SSR']) when I print temp.
>
> d={}
> car_file = open('worstcars.txt', 'r')
> for line in car_file:
>      temp = line.split()

If all makers are one word (Aston-Martin would be OK), and if the file 
is otherwise consistently year maker model words, then adding 
'maxsplit=2' to the split call would be all the parsing you need.
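
A quick illustration of how the split limit behaves, as a sketch under the one-word-maker assumption. maxsplit counts splits rather than fields, so a limit of 2 yields at most three fields and leaves a multi-word model intact. The keyword form shown here is Python 3 syntax; the positional form line.split(None, 2) does the same thing.

```python
line = "1971 Chrysler Imperial LeBaron Two-Door Hardtop\n"

# maxsplit=2 performs at most two splits, giving at most three fields.
year, maker, model = line.strip().split(maxsplit=2)

print(year, maker, model, sep=" | ")
```

A line with only two words, such as "1961 Corvair", would make the three-name unpacking fail, so a length check is still needed for those.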

#42526

From	"C.T." <swilks06@gmail.com>
Date	2013-04-01 16:53 -0700
Message-ID	<e83bb343-fbed-4d30-ad87-21a99a929f61@googlegroups.com>
In reply to	#42377

Thanks for all the help everyone! After I manually edited the txt file, this is what I came up with:

car_dict = {}
car_file = open('cars.txt', 'r')


 
for line in car_file: 
    temp = line.strip().split(None, 2)
    temp2 = line.strip().split('\t')
    
    
    if len(temp)==3:  
        year, manufacturer, model = temp[0] ,temp2[0][5:], temp2[1]
        value = (year, model)
        if manufacturer in car_dict:
            car_dict.setdefault(manufacturer,[]).append(value)
        else:
            car_dict[manufacturer] = [value]
        
    
    elif len(temp)==2:
        year, manufacturer, model = temp[0], 'Unknown' , temp2[1]
        value = (year, model)
        if manufacturer in car_dict:
            car_dict.setdefault(manufacturer,[]).append(value)
        else:
            car_dict[manufacturer] = [value]

    
car_file.close()

print (car_dict)

It may not be the most Pythonic way of doing this, but it works for me. I am learning Python, and this problem was probably the most challenging so far. Thank you all, again!

#42531

From	Dave Angel <davea@davea.name>
Date	2013-04-01 20:12 -0400
Message-ID	<mailman.26.1364861572.17481.python-list@python.org>
In reply to	#42526

On 04/01/2013 07:53 PM, C.T. wrote:
> Thanks for all the help everyone! After I manually edited the txt file, this is what I came up with:
>
> car_dict = {}
> car_file = open('cars.txt', 'r')
>
>
>
> for line in car_file:
>      temp = line.strip().split(None, 2)
>      temp2 = line.strip().split('\t')
>
>
>      if len(temp)==3:
>          year, manufacturer, model = temp[0] ,temp2[0][5:], temp2[1]
>          value = (year, model)
>          if manufacturer in car_dict:
>              car_dict.setdefault(manufacturer,[]).append(value)

That's rather redundant.  Once you've determined that the particular key 
is already there, why bother with the setdefault() call?  Or to put it 
another way, why bother to test if it's there when you're going to use 
setdefault to handle the case where it's not?
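
For example, the whole if/else can collapse to a single setdefault() call. A minimal sketch, with a made-up records list standing in for the values parsed from each line:

```python
car_dict = {}
records = [("Ford", ("1909", "Model T")), ("Ford", ("1971", "Pinto"))]

for manufacturer, value in records:
    # setdefault returns the existing list, or stores and returns a new one.
    car_dict.setdefault(manufacturer, []).append(value)

print(car_dict)
```

collections.defaultdict(list) from the standard library is another common way to get the same effect.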


>          else:
>              car_dict[manufacturer] = [value]
>
>
>      elif len(temp)==2:
>          year, manufacturer, model = temp[0], 'Unknown' , temp2[1]
>          value = (year, model)
>          if manufacturer in car_dict:
>              car_dict.setdefault(manufacturer,[]).append(value)
>          else:
>              car_dict[manufacturer] = [value]
>
>
> car_file.close()
>
> print (car_dict)
>
> It may not be the most pythonic way of doing this, but it works for me. I am learning python, and this problem was problem the most challenging so far. Thank you all, again!
>


-- 
DaveA
