| Started by | "C.T." <swilks06@gmail.com> |
|---|---|
| First post | 2013-03-31 08:52 -0700 |
| Last post | 2013-04-01 20:12 -0400 |
| Articles | 20 — 9 participants |
Creating a dictionary from a .txt file "C.T." <swilks06@gmail.com> - 2013-03-31 08:52 -0700
Re: Creating a dictionary from a .txt file Chris Angelico <rosuav@gmail.com> - 2013-04-01 03:06 +1100
Re: Creating a dictionary from a .txt file "C.T." <swilks06@gmail.com> - 2013-03-31 10:19 -0700
Re: Creating a dictionary from a .txt file Chris Angelico <rosuav@gmail.com> - 2013-04-01 04:22 +1100
Re: Creating a dictionary from a .txt file "C.T." <swilks06@gmail.com> - 2013-03-31 10:19 -0700
Re: Creating a dictionary from a .txt file Mark Janssen <dreamingforward@gmail.com> - 2013-03-31 09:20 -0700
Re: Creating a dictionary from a .txt file "C.T." <swilks06@gmail.com> - 2013-03-31 09:52 -0700
Re: Creating a dictionary from a .txt file Dave Angel <davea@davea.name> - 2013-03-31 13:31 -0400
Re: Creating a dictionary from a .txt file Roy Smith <roy@panix.com> - 2013-03-31 14:41 -0400
Re: Creating a dictionary from a .txt file Dave Angel <davea@davea.name> - 2013-03-31 17:37 -0400
Re: Creating a dictionary from a .txt file Neil Cerutti <neilc@norwich.edu> - 2013-04-01 11:41 +0000
Re: Creating a dictionary from a .txt file Steven D'Aprano <steve+comp.lang.python@pearwood.info> - 2013-04-01 23:53 +0000
Re: Creating a dictionary from a .txt file Walter Hurry <walterhurry@lavabit.com> - 2013-04-02 00:28 +0000
Re: Creating a dictionary from a .txt file Neil Cerutti <neilc@norwich.edu> - 2013-04-03 12:55 +0000
Re: Creating a dictionary from a .txt file "C.T." <swilks06@gmail.com> - 2013-03-31 09:52 -0700
Re: Creating a dictionary from a .txt file Roy Smith <roy@panix.com> - 2013-03-31 12:38 -0400
Re: Creating a dictionary from a .txt file "C.T." <swilks06@gmail.com> - 2013-03-31 10:28 -0700
Re: Creating a dictionary from a .txt file Terry Jan Reedy <tjreedy@udel.edu> - 2013-03-31 15:04 -0400
Re: Creating a dictionary from a .txt file "C.T." <swilks06@gmail.com> - 2013-04-01 16:53 -0700
Re: Creating a dictionary from a .txt file Dave Angel <davea@davea.name> - 2013-04-01 20:12 -0400
| From | "C.T." <swilks06@gmail.com> |
|---|---|
| Date | 2013-03-31 08:52 -0700 |
| Subject | Creating a dictionary from a .txt file |
| Message-ID | <d15c39bc-5d2a-42c9-a76b-23768b61c391@googlegroups.com> |
Hello,
I'm currently working on a homework problem that requires me to create a dictionary from a .txt file that contains some of the worst cars ever made. The file looks something like this:
1958 MGA Twin Cam
1958 Zunndapp Janus
1961 Amphicar
1961 Corvair
1966 Peel Trident
1970 AMC Gremlin
1970 Triumph Stag
1971 Chrysler Imperial LeBaron Two-Door Hardtop
The car manufacturer should be the key and a tuple containing the year and the model should be the key's value. I tried the following to just get the contents of the file into a list, but only the very last line in the txt file is shown as a list with three elements (ie, ['2004', 'Chevy', 'SSR']) when I print temp.
d={}
car_file = open('worstcars.txt', 'r')
for line in car_file:
    temp = line.split()
print (temp)
car_file.close()
After playing around with the code, I came up with the following code to get everything into a list:
d=[]
car_file = open('worstcars.txt', 'r')
for line in car_file:
    d.append(line.strip('\n'))
print (d)
car_file.close()
Every line is now an element in list d. The question I have now is how can I make a dictionary out of the list d with the car manufacturer as the key and a tuple containing the year and the model should be the key's value. Here is a sample of what list d looks like:
['1899 Horsey Horseless', '1909 Ford Model T', '1911 Overland OctoAuto', '2003 Hummer H2', '2004 Chevy SSR']
Any help would be appreciated!
| From | Chris Angelico <rosuav@gmail.com> |
|---|---|
| Date | 2013-04-01 03:06 +1100 |
| Message-ID | <mailman.4018.1364745985.2939.python-list@python.org> |
| In reply to | #42377 |
On Mon, Apr 1, 2013 at 2:52 AM, C.T. <swilks06@gmail.com> wrote:
> After playing around with the code, I came up with the following code to get everything into a list:
>
> d=[]
> car_file = open('worstcars.txt', 'r')
> for line in car_file:
>     d.append(line.strip('\n'))
> print (d)
> car_file.close()
>
> Every line is now an element in list d. The question I have now is how can I make a dictionary out of the list d with the car manufacturer as the key and a tuple containing the year and the model should be the key's value.
Ah, a nice straight-forward text parsing problem!
The question is how to recognize the manufacturer. Is it guaranteed to
be the second blank-delimited word, with the year being the first? If
so, you were almost there with .split().
car_file = open('worstcars.txt', 'r')
# You may want to consider the 'with' statement here - no need to close()
for line in car_file:
    temp = line.split(None, 2)
    if len(temp)==3:
        year, mfg, model = temp
        # Now do something with these three values
        print("Manufacturer: %s Year: %s Model: %s"%(mfg,year,model))
That's sorted out the parsing side of things. Do you know how to build
up the dictionary from there?
What happens if there are multiple entries in the file for the same
manufacturer? Do you need to handle that?
ChrisA
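One possible way to approach the two questions above, offered as a minimal sketch rather than the thread's agreed solution: assuming the manufacturer really is the second whitespace-delimited word, each manufacturer can map to a list of (year, model) tuples so that repeated manufacturers are not overwritten. The sample lines and the name `cars_by_maker` are made up for illustration.

```python
from collections import defaultdict

# Sample lines standing in for the file contents (assumed, based on the
# examples quoted in the thread).
lines = [
    "1909 Ford Model T",
    "1971 Ford Pinto",
    "2003 Hummer H2",
]

# Map each manufacturer to a list of (year, model) tuples.
cars_by_maker = defaultdict(list)
for line in lines:
    parts = line.split(None, 2)  # at most 3 pieces: year, manufacturer, rest
    if len(parts) == 3:
        year, mfg, model = parts
        cars_by_maker[mfg].append((year, model))

print(dict(cars_by_maker))
```

Using a list as the value is a design choice; if the assignment guarantees one car per manufacturer, a plain `d[mfg] = (year, model)` would do.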
| From | "C.T." <swilks06@gmail.com> |
|---|---|
| Date | 2013-03-31 10:19 -0700 |
| Message-ID | <e181e462-38f7-4312-9125-ef47ef42b284@googlegroups.com> |
| In reply to | #42378 |
On Sunday, March 31, 2013 12:06:18 PM UTC-4, Chris Angelico wrote:
> On Mon, Apr 1, 2013 at 2:52 AM, C.T. wrote:
> > After playing around with the code, I came up with the following code to get everything into a list:
> >
> > d=[]
> > car_file = open('worstcars.txt', 'r')
> > for line in car_file:
> >     d.append(line.strip('\n'))
> > print (d)
> > car_file.close()
> >
> > Every line is now an element in list d. The question I have now is how can I make a dictionary out of the list d with the car manufacturer as the key and a tuple containing the year and the model should be the key's value.
>
> Ah, a nice straight-forward text parsing problem!
>
> The question is how to recognize the manufacturer. Is it guaranteed to
> be the second blank-delimited word, with the year being the first? If
> so, you were almost there with .split().
>
> car_file = open('worstcars.txt', 'r')
> # You may want to consider the 'with' statement here - no need to close()
> for line in car_file:
>     temp = line.split(None, 2)
>     if len(temp)==3:
>         year, mfg, model = temp
>         # Now do something with these three values
>         print("Manufacturer: %s Year: %s Model: %s"%(mfg,year,model))
>
> That's sorted out the parsing side of things. Do you know how to build
> up the dictionary from there?
>
> What happens if there are multiple entries in the file for the same
> manufacturer? Do you need to handle that?
>
> ChrisA
Thank you, Chris! I could use slicing and indexing to build the dictionary, but the problem is with the car manufacturer and the car model. Either or both could be multiple names.
| From | Chris Angelico <rosuav@gmail.com> |
|---|---|
| Date | 2013-04-01 04:22 +1100 |
| Message-ID | <mailman.4022.1364750566.2939.python-list@python.org> |
| In reply to | #42383 |
On Mon, Apr 1, 2013 at 4:19 AM, C.T. <swilks06@gmail.com> wrote:
> Thank you, Chris! I could use slicing and indexing to build the dictionary but the problem is with the car manufacturer and the car model. Either or both could be multiple names.
Then you're going to need some other form of magic to recognize where
the manufacturer ends and the model starts. Do you have, say, tabs
between the fields and spaces within?
ChrisA
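If the fields did turn out to be tab-separated, splitting on the tab character alone would keep multi-word names intact. A small sketch under that assumption; the sample line is invented, since the thread does not show the file's actual delimiters.

```python
# Hypothetical tab-delimited line: year <TAB> manufacturer <TAB> model.
line = "2000\tAston Martin\tExample Model\n"

# Strip only the trailing newline, then split on tabs so that spaces
# inside a field are preserved.
year, manufacturer, model = line.rstrip("\n").split("\t")

print(manufacturer, (year, model))
```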
| From | Mark Janssen <dreamingforward@gmail.com> |
|---|---|
| Date | 2013-03-31 09:20 -0700 |
| Message-ID | <mailman.4019.1364746832.2939.python-list@python.org> |
| In reply to | #42377 |
> Every line is now an element in list d. The question I have now is how can
> I make a dictionary out of the list d with the car manufacturer as the key
> and a tuple containing the year and the model should be the key's value.
> Here is a sample of what list d looks like:
>
> ['1899 Horsey Horseless', '1909 Ford Model T', '1911 Overland OctoAuto',
> '2003 Hummer H2', '2004 Chevy SSR']
>
> Any help would be appreciated!
As long as your data is consistently ordered, just use list indexing. d[2] is your key, and (d[1],d[3]) the key's value.
Mark
Tacoma, Washington
| From | "C.T." <swilks06@gmail.com> |
|---|---|
| Date | 2013-03-31 09:52 -0700 |
| Message-ID | <d1a5fee7-2ce6-4446-8193-587e4c12a0a0@googlegroups.com> |
| In reply to | #42379 |
On Sunday, March 31, 2013 12:20:25 PM UTC-4, zipher wrote:
> Every line is now an element in list d. The question I have now is how can I make a dictionary out of the list d with the car manufacturer as the key and a tuple containing the year and the model should be the key's value. Here is a sample of what list d looks like:
>
> ['1899 Horsey Horseless', '1909 Ford Model T', '1911 Overland OctoAuto', '2003 Hummer H2', '2004 Chevy SSR']
>
> Any help would be appreciated!
>
> As long as your data is consistently ordered, just use list indexing. d[2] is your key, and (d[1],d[3]) the key's value.
>
> Mark
> Tacoma, Washington
Thank you, Mark! My problem is that the data isn't consistently ordered. I can use slicing and indexing to put the year into a tuple, but because a car manufacturer could have two names (e.g., Aston Martin) or a car model could have two names (e.g., Iron Duke), it's harder to use slicing and indexing for those two. I've added the following, but the output is still not what I need it to be.
t={}
for i in d :
    t[d[d.index(i)][5:]]= tuple(d[d.index(i)][:4])
print (t)
The output looks something like this:
{'Ford Model T': ('1', '9', '0', '9'), 'Mosler Consulier GTP': ('1', '9', '8', '5'), 'Scripps-Booth Bi-Autogo': ('1', '9', '1', '3'), 'Morgan Plus 8 Propane': ('1', '9', '7', '5'), 'Fiat Multipla': ('1', '9', '9', '8'), 'Ford Pinto': ('1', '9', '7', '1'), 'Triumph Stag': ('1', '9', '7', '0'), 'BMW 7-series': ('2', '0', '0', '2')}
Here the key is the car manufacturer plus the car model, and the value is a tuple containing the digits of the year separated by commas. (Not sure why that is?)
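The comma-separated digits come from `tuple()` itself: given a string, it iterates over it and produces one element per character. A short illustration using sample lines from the list above; the names `entry` and `year_only` are made up for this sketch.

```python
entry = "1909 Ford Model T"

# tuple() iterates over its argument, and iterating over a string
# yields its characters one at a time.
print(tuple(entry[:4]))   # ('1', '9', '0', '9')

# A tuple literal keeps the year as one string.
year_only = (entry[:4],)  # note the trailing comma for a one-element tuple
print(year_only)          # ('1909',)

# Iterating over the list items directly also avoids d[d.index(i)].
d = ["1909 Ford Model T", "2002 BMW 7-series"]
t = {}
for item in d:
    t[item[5:]] = (item[:4],)
print(t)
```

This only explains the odd output; it does not solve the harder question of where the manufacturer ends and the model begins.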
| From | Dave Angel <davea@davea.name> |
|---|---|
| Date | 2013-03-31 13:31 -0400 |
| Message-ID | <mailman.4023.1364751102.2939.python-list@python.org> |
| In reply to | #42381 |
On 03/31/2013 12:52 PM, C.T. wrote:
> On Sunday, March 31, 2013 12:20:25 PM UTC-4, zipher wrote:
>> <SNIP>
>>
>
> Thank you, Mark! My problem is the data isn't consistently ordered. I can use slicing and indexing to put the year into a tuple, but because a car manufacturer could have two names (ie, Aston Martin) or a car model could have two names(ie, Iron Duke), its harder to use slicing and indexing for those two. I've added the following, but the output is still not what I need it to be.
>
So the correct answer is "it cannot be done," and an explanation.
Many times I've been given impossible conditions for a problem. And invariably the correct solution is to press bac on the supplier of the constraints.
Unless there are some invisible characters in that file, lie tabs in between the fields, it loocs liec you're out of luc. Or you could manually edit the file before running the program.
[The character after 'j' is broccen on this cceyboard.]
--
DaveA
| From | Roy Smith <roy@panix.com> |
|---|---|
| Date | 2013-03-31 14:41 -0400 |
| Message-ID | <roy-A43BBD.14414131032013@news.panix.com> |
| In reply to | #42387 |
In article <mailman.4023.1364751102.2939.python-list@python.org>,
Dave Angel <davea@davea.name> wrote:
> On 03/31/2013 12:52 PM, C.T. wrote:
> > On Sunday, March 31, 2013 12:20:25 PM UTC-4, zipher wrote:
> >> <SNIP>
> >>
> >
> > Thank you, Mark! My problem is the data isn't consistently ordered. I can
> > use slicing and indexing to put the year into a tuple, but because a car
> > manufacturer could have two names (ie, Aston Martin) or a car model could
> > have two names(ie, Iron Duke), its harder to use slicing and indexing for
> > those two. I've added the following, but the output is still not what I
> > need it to be.
>
> So the correct answer is "it cannot be done," and an explanation.
>
> Many times I've been given impossible conditions for a problem. And
> invariably the correct solution is to press [back] on the supplier of the
> constraints.
In real life, you often have to deal with crappy input data (and bogus
project requirements). Sometimes you just need to be creative.
There's only a small set of car manufacturers. A good start would be
mining wikipedia's [[List of automobile manufacturers]]. Once you've
got that list, you could try matching portions of the input against the
list.
Depending on how much effort you wanted to put into this, you could
explore all sorts of fuzzy matching (ie "delorean" vs "delorean motor
company"), but even a simple search is better than giving up.
And, this is a good excuse to explore some of the interesting
third-party modules. For example, mwclient ("pip install mwclient")
gives you a neat Python interface to wikipedia. And there's a whole
landscape of string matching packages to explore.
We deal with this every day at Songza. Are Kesha and Ke$ha the same
artist? Pushing back on the record labels to clean up their catalogs
isn't going to get us very far.
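A rough sketch of the "match against a list of manufacturers" idea, using only an exact prefix match and no fuzzy matching. `KNOWN_MAKERS` is a hypothetical, hand-typed list and `split_maker` is a made-up helper name; neither comes from the thread or from any library.

```python
# Hypothetical, hand-typed list; a real one might be gathered from a
# reference source as suggested above.
KNOWN_MAKERS = ["Aston Martin", "AMC", "Chrysler", "Ford", "Triumph"]

def split_maker(rest):
    """Split 'maker model' text using the longest known maker prefix.

    Returns (maker, model); maker is None when nothing matches.
    """
    # Try longer names first so that a longer known name wins over a
    # shorter one that is also a prefix of the text.
    for maker in sorted(KNOWN_MAKERS, key=len, reverse=True):
        if rest == maker or rest.startswith(maker + " "):
            return maker, rest[len(maker):].strip()
    return None, rest

print(split_maker("Chrysler Imperial LeBaron Two-Door Hardtop"))
print(split_maker("Aston Martin Lagonda"))
print(split_maker("Corvair"))
```

Entries that match nothing, such as the Corvair line, still need a decision from whoever owns the requirements.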
| From | Dave Angel <davea@davea.name> |
|---|---|
| Date | 2013-03-31 17:37 -0400 |
| Message-ID | <mailman.4032.1364765875.2939.python-list@python.org> |
| In reply to | #42390 |
On 03/31/2013 02:41 PM, Roy Smith wrote:
> In article <mailman.4023.1364751102.2939.python-list@python.org>,
> Dave Angel <davea@davea.name> wrote:
>
>> On 03/31/2013 12:52 PM, C.T. wrote:
>>> On Sunday, March 31, 2013 12:20:25 PM UTC-4, zipher wrote:
>>>> <SNIP>
>>>>
>>>
>>> Thank you, Mark! My problem is the data isn't consistently ordered. I can
>>> use slicing and indexing to put the year into a tuple, but because a car
>>> manufacturer could have two names (ie, Aston Martin) or a car model could
>>> have two names(ie, Iron Duke), its harder to use slicing and indexing for
>>> those two. I've added the following, but the output is still not what I
>>> need it to be.
>>
>> So the correct answer is "it cannot be done," and an explanation.
>>
>> Many times I've been given impossible conditions for a problem. And
>> invariably the correct solution is to press [back] on the supplier of the
>> constraints.
>
> In real life, you often have to deal with crappy input data (and bogus
> project requirements). Sometimes you just need to be creative.
>
> There's only a small set of car manufacturers. A good start would be
> mining wikipedia's [[List of automobile manufacturers]]. Once you've
> got that list, you could try matching portions of the input against the
> list.
>
> Depending on how much effort you wanted to put into this, you could
> explore all sorts of fuzzy matching (ie "delorean" vs "delorean motor
> company"), but even a simple search is better than giving up.
>
> And, this is a good excuse to explore some of the interesting
> third-party modules. For example, mwclient ("pip install mwclient")
> gives you a neat Python interface to wikipedia. And there's a whole
> landscape of string matching packages to explore.
>
> We deal with this every day at Songza. Are Kesha and Ke$ha the same
> artist? Pushing back on the record labels to clean up their catalogs
> isn't going to get us very far.
>
I agree with everything you've said, although in your case, presumably
the record labels are not your client/boss, so that's not who you push
back against. The client should know when the data is being fudged, and
have a say in how it's to be done.
But this is a homework assignment. I think the OP is learning Python,
not how to second-guess a client.
--
DaveA
| From | Neil Cerutti <neilc@norwich.edu> |
|---|---|
| Date | 2013-04-01 11:41 +0000 |
| Message-ID | <arta2fFh3neU2@mid.individual.net> |
| In reply to | #42390 |
On 2013-03-31, Roy Smith <roy@panix.com> wrote:
> And, this is a good excuse to explore some of the interesting
> third-party modules. For example, mwclient ("pip install
> mwclient") gives you a neat Python interface to wikipedia. And
> there's a whole landscape of string matching packages to
> explore.
>
> We deal with this every day at Songza. Are Kesha and Ke$ha the
> same artist? Pushing back on the record labels to clean up
> their catalogs isn't going to get us very far.
I tried searching for Frost*, an interesting artist I recently
learned about. His name, in combination with a similarly named
rap artist, breaks most search tools.
My guess is this homework is simply borken.
--
Neil Cerutti
| From | Steven D'Aprano <steve+comp.lang.python@pearwood.info> |
|---|---|
| Date | 2013-04-01 23:53 +0000 |
| Message-ID | <515a1e03$0$29967$c3e8da3$5496439d@news.astraweb.com> |
| In reply to | #42456 |
On Mon, 01 Apr 2013 11:41:03 +0000, Neil Cerutti wrote:
> I tried searching for Frost*, an interesting artist I recently learned
> about.
"Interesting artist" -- is that another term for "wanker"?
*wink*
> His name, in combination with a similarly named rap artist,
> breaks most search tools.
As far as I'm concerned, anyone in the 21st century who names themselves
or their work (a movie, book, programming language, etc.) something
which breaks search tools is just *begging* for obscurity, and we ought
to respect their wishes.
--
Steven
| From | Walter Hurry <walterhurry@lavabit.com> |
|---|---|
| Date | 2013-04-02 00:28 +0000 |
| Message-ID | <kjd8n7$44v$1@news.albasani.net> |
| In reply to | #42525 |
On Mon, 01 Apr 2013 23:53:40 +0000, Steven D'Aprano wrote:
> As far as I'm concerned, anyone in the 21st century who names themselves
> or their work (a movie, book, programming language, etc.) something
> which breaks search tools is just *begging* for obscurity, and we ought
> to respect their wishes.
IIRC, there was an eccentric musician some years back who did just that.
I seem to remember that he changed his name to some kind of androgynous
looking rune with what appeared to be a bent trumpet across it.
Thenceforth it became known as "the (piss) artist: formally gnome ass
prints", or at least something sounding vaguely like that.
| From | Neil Cerutti <neilc@norwich.edu> |
|---|---|
| Date | 2013-04-03 12:55 +0000 |
| Message-ID | <as2n6rFnjp7U1@mid.individual.net> |
| In reply to | #42525 |
On 2013-04-01, Steven D'Aprano <steve+comp.lang.python@pearwood.info> wrote:
> On Mon, 01 Apr 2013 11:41:03 +0000, Neil Cerutti wrote:
>
>> I tried searching for Frost*, an interesting artist I recently learned
>> about.
>
> "Interesting artist" -- is that another term for "wanker"?
>
> *wink*
hee-hee. It depends on how much of a hankering you have for
pretentious progressive synth-rock.
>> His name, in combination with a similarly named rap artist,
>> breaks most search tools.
>
> As far as I'm concerned, anyone in the 21st century who names
> themselves or their work (a movie, book, programming language,
> etc.) something which breaks search tools is just *begging* for
> obscurity, and we ought to respect their wishes.
I do think it's something he did on purpose. The asterisk, I
believe, symbolizes the exclusive genius of his fans.
--
Neil Cerutti
| From | Roy Smith <roy@panix.com> |
|---|---|
| Date | 2013-03-31 12:38 -0400 |
| Message-ID | <roy-7B35BF.12385631032013@news.panix.com> |
| In reply to | #42377 |
In article <d15c39bc-5d2a-42c9-a76b-23768b61c391@googlegroups.com>,
"C.T." <swilks06@gmail.com> wrote:
> Hello,
>
> I'm currently working on a homework problem that requires me to create a
> dictionary from a .txt file that contains some of the worst cars ever made.
> The file looks something like this:
>
> 1958 MGA Twin Cam
> 1958 Zunndapp Janus
> 1961 Amphicar
> 1961 Corvair
> 1966 Peel Trident
> 1970 AMC Gremlin
> 1970 Triumph Stag
> 1971 Chrysler Imperial LeBaron Two-Door Hardtop
>
> The car manufacturer should be the key and a tuple containing the year and
> the model should be the key's value. I tried the following to just get the
> contents of the file into a list, but only the very last line in the txt file
> is shown as a list with three elements (ie, ['2004', 'Chevy', 'SSR']) when I
> print temp.
>
> d={}
> car_file = open('worstcars.txt', 'r')
> for line in car_file:
>     temp = line.split()
> print (temp)
> car_file.close()
Yup. Because you run through the whole file, putting each line into
temp, overwriting the previous temp value.
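To make the overwriting visible, here is a small sketch that keeps each line's result as well. It uses an in-memory `io.StringIO` object as a stand-in for `worstcars.txt` so it runs on its own; the list name `rows` is made up.

```python
import io

# In-memory stand-in for the file, so the sketch runs without worstcars.txt.
car_file = io.StringIO("1970 AMC Gremlin\n1970 Triumph Stag\n")

rows = []
for line in car_file:
    temp = line.split()
    rows.append(temp)  # keep each line's result instead of overwriting it

print(temp)  # only the last line's words
print(rows)  # every line's words
```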
> d=[]
> car_file = open('worstcars.txt', 'r')
> for line in car_file:
>     d.append(line.strip('\n'))
> print (d)
> car_file.close()
You could do most of that with just:
car_file = open('worstcars.txt', 'r')
d = car_file.readlines()
but there's no real reason to read the whole file into a list. What you
probably want to do is something like:
d = {}
car_file = open('worstcars.txt', 'r')
for line in car_file:
    year, manufacturer, model = parse_line(line)
    d[manufacturer] = (year, model)
One comment about the above; it assumes that there's only a single entry
for a given manufacturer in the file. If that's not true, the above
code will only keep the last one. But let's assume it's true for the
moment.
Now, we're just down to writing parse_line(). This takes a string and
breaks it up into 3 strings. I'm going to leave this as an exercise for
you to work out. The complicated part is going to be figuring out some
logic to deal with anything from multi-word model names ("Imperial
LeBaron Two-Door Hardtop"), to lines like the Corvair where there is no
manufacturer (or maybe there's no model?).
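For readers who want a starting point, here is one possible `parse_line()`, a sketch under loud assumptions rather than the intended answer to the exercise: the year comes first, the manufacturer is a single word, and a two-word line is treated as year plus model with the placeholder manufacturer `'Unknown'`.

```python
def parse_line(line):
    """Return (year, manufacturer, model) for one line of the file.

    Assumes the year comes first and the manufacturer is a single word.
    Lines with only two words (such as '1961 Corvair') are ambiguous; this
    sketch arbitrarily treats the second word as the model.
    """
    parts = line.split(None, 2)
    if len(parts) == 3:
        year, manufacturer, model = parts
        # The last piece keeps any trailing newline, so strip it.
        return year, manufacturer, model.strip()
    if len(parts) == 2:
        year, model = parts
        return year, "Unknown", model
    raise ValueError("cannot parse line: %r" % line)

print(parse_line("1971 Chrysler Imperial LeBaron Two-Door Hardtop\n"))
print(parse_line("1961 Corvair\n"))
```

Multi-word manufacturers such as Aston Martin would still be split incorrectly under these assumptions.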
| From | "C.T." <swilks06@gmail.com> |
|---|---|
| Date | 2013-03-31 10:28 -0700 |
| Message-ID | <dc1bc939-fd8d-436d-872a-19779602351e@googlegroups.com> |
| In reply to | #42380 |
On Sunday, March 31, 2013 12:38:56 PM UTC-4, Roy Smith wrote:
> In article <d15c39bc-5d2a-42c9-a76b-23768b61c391@googlegroups.com>,
> "C.T." wrote:
>
> > Hello,
> >
> > I'm currently working on a homework problem that requires me to create a
> > dictionary from a .txt file that contains some of the worst cars ever made.
> > The file looks something like this:
> >
> > 1958 MGA Twin Cam
> > 1958 Zunndapp Janus
> > 1961 Amphicar
> > 1961 Corvair
> > 1966 Peel Trident
> > 1970 AMC Gremlin
> > 1970 Triumph Stag
> > 1971 Chrysler Imperial LeBaron Two-Door Hardtop
> >
> > The car manufacturer should be the key and a tuple containing the year and
> > the model should be the key's value. I tried the following to just get the
> > contents of the file into a list, but only the very last line in the txt file
> > is shown as a list with three elements (ie, ['2004', 'Chevy', 'SSR']) when I
> > print temp.
> >
> > d={}
> > car_file = open('worstcars.txt', 'r')
> > for line in car_file:
> >     temp = line.split()
> > print (temp)
> > car_file.close()
>
> Yup. Because you run through the whole file, putting each line into
> temp, overwriting the previous temp value.
>
> > d=[]
> > car_file = open('worstcars.txt', 'r')
> > for line in car_file:
> >     d.append(line.strip('\n'))
> > print (d)
> > car_file.close()
>
> You could do most of that with just:
>
> car_file = open('worstcars.txt', 'r')
> d = car_file.readlines()
>
> but there's no real reason to read the whole file into a list. What you
> probably want to do is something like:
>
> d = {}
> car_file = open('worstcars.txt', 'r')
> for line in car_file:
>     year, manufacturer, model = parse_line(line)
>     d[manufacturer] = (year, model)
>
> One comment about the above; it assumes that there's only a single entry
> for a given manufacturer in the file. If that's not true, the above
> code will only keep the last one. But let's assume it's true for the
> moment.
>
> Now, we're just down to writing parse_line(). This takes a string and
> breaks it up into 3 strings. I'm going to leave this as an exercise for
> you to work out. The complicated part is going to be figuring out some
> logic to deal with anything from multi-word model names ("Imperial
> LeBaron Two-Door Hardtop"), to lines like the Corvair where there is no
> manufacturer (or maybe there's no model?).
Roy, thank you so much! I'll do some more research to see how I can achieve this. Thank you!
| From | Terry Jan Reedy <tjreedy@udel.edu> |
|---|---|
| Date | 2013-03-31 15:04 -0400 |
| Message-ID | <mailman.4024.1364756707.2939.python-list@python.org> |
| In reply to | #42377 |
On 3/31/2013 11:52 AM, C.T. wrote:
> Hello,
>
> I'm currently working on a homework problem that requires me to create a dictionary from a .txt file that contains some of the worst cars ever made. The file looks something like this:
>
> 1958 MGA Twin Cam
> 1958 Zunndapp Janus
> 1961 Amphicar
> 1961 Corvair
> 1966 Peel Trident
> 1970 AMC Gremlin
> 1970 Triumph Stag
> 1971 Chrysler Imperial LeBaron Two-Door Hardtop
>
> The car manufacturer should be the key and a tuple containing the year and the model should be the key's value. I tried the following to just get the contents of the file into a list, but only the very last line in the txt file is shown as a list with three elements (ie, ['2004', 'Chevy', 'SSR']) when I print temp.
>
> d={}
> car_file = open('worstcars.txt', 'r')
> for line in car_file:
>     temp = line.split()
If all makers are one word (Aston-Martin would be OK), and if the file
is otherwise consistently year, maker, model words, then adding
'maxsplit=2' to the split call would be all the parsing you need.
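A quick illustration of what a split limit does, using one of the sample lines from the original post. A limit of 2 splits yields at most three pieces, which is what the year, maker, model layout needs; the limit is passed positionally here, as in the earlier reply.

```python
line = "1971 Chrysler Imperial LeBaron Two-Door Hardtop"

# No limit: every word becomes its own element.
print(line.split())

# Limit of 2 splits: at most three pieces, so the model stays in one string.
year, maker, model = line.split(None, 2)
print(year, maker, model, sep=" | ")
```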
| From | "C.T." <swilks06@gmail.com> |
|---|---|
| Date | 2013-04-01 16:53 -0700 |
| Message-ID | <e83bb343-fbed-4d30-ad87-21a99a929f61@googlegroups.com> |
| In reply to | #42377 |
Thanks for all the help everyone! After I manually edited the txt file, this is what I came up with:
car_dict = {}
car_file = open('cars.txt', 'r')
for line in car_file:
    temp = line.strip().split(None, 2)
    temp2 = line.strip().split('\t')
    if len(temp)==3:
        year, manufacturer, model = temp[0] ,temp2[0][5:], temp2[1]
        value = (year, model)
        if manufacturer in car_dict:
            car_dict.setdefault(manufacturer,[]).append(value)
        else:
            car_dict[manufacturer] = [value]
    elif len(temp)==2:
        year, manufacturer, model = temp[0], 'Unknown' , temp2[1]
        value = (year, model)
        if manufacturer in car_dict:
            car_dict.setdefault(manufacturer,[]).append(value)
        else:
            car_dict[manufacturer] = [value]
car_file.close()
print (car_dict)
It may not be the most Pythonic way of doing this, but it works for me. I am learning Python, and this problem was probably the most challenging so far. Thank you all, again!
| From | Dave Angel <davea@davea.name> |
|---|---|
| Date | 2013-04-01 20:12 -0400 |
| Message-ID | <mailman.26.1364861572.17481.python-list@python.org> |
| In reply to | #42526 |
On 04/01/2013 07:53 PM, C.T. wrote:
> Thanks for all the help everyone! After I manually edited the txt file, this is what I came up with:
>
> car_dict = {}
> car_file = open('cars.txt', 'r')
>
>
>
> for line in car_file:
>     temp = line.strip().split(None, 2)
>     temp2 = line.strip().split('\t')
>
>     if len(temp)==3:
>         year, manufacturer, model = temp[0] ,temp2[0][5:], temp2[1]
>         value = (year, model)
>         if manufacturer in car_dict:
>             car_dict.setdefault(manufacturer,[]).append(value)
That's rather redundant. Once you've determined that the particular key
is already there, why bother with the setdefault() call? Or to put it
another way, why bother to test if it's there when you're going to use
setdefault to handle the case where it's not?
>         else:
>             car_dict[manufacturer] = [value]
>
>     elif len(temp)==2:
>         year, manufacturer, model = temp[0], 'Unknown' , temp2[1]
>         value = (year, model)
>         if manufacturer in car_dict:
>             car_dict.setdefault(manufacturer,[]).append(value)
>         else:
>             car_dict[manufacturer] = [value]
>
>
> car_file.close()
>
> print (car_dict)
>
> It may not be the most pythonic way of doing this, but it works for me. I am learning python, and this problem was probably the most challenging so far. Thank you all, again!
>
--
DaveA
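The point about redundancy can be sketched as follows: `dict.setdefault()` on its own covers both the "key present" and "key missing" cases, so the `if`/`else` around it can be dropped. The sample `entries` list is made up for illustration.

```python
car_dict = {}
entries = [
    ("Ford", ("1909", "Model T")),
    ("Ford", ("1971", "Pinto")),
    ("Unknown", ("1961", "Corvair")),
]

for manufacturer, value in entries:
    # setdefault() returns the existing list, or inserts and returns a new
    # empty one, so no separate membership test is needed.
    car_dict.setdefault(manufacturer, []).append(value)

print(car_dict)
```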