Date: Thu, 09 May 2013 08:26:03 -0400
From: Dave Angel
To: python-list@python.org
Subject: Re: help on Implementing a list of dicts with no data pattern
Newsgroups: comp.lang.python

On 05/09/2013 05:57 AM, rlelis wrote:
> On Thursday, May 9, 2013 12:47:47 AM UTC+1,
> rlelis wrote:
>> Hi guys,
>>
>> I'm working on this long file, where I have to keep reading and
>> storing different excerpts of text (data) in different variables (lists).
>>
>> Once that is done, I want to store the data I got from the lists
>> mentioned before in dicts. I want them in a list of dicts for later
>> RDBMS purposes.
>>
>> The data I'm working with doesn't have a fixed pattern (see example
>> below), so for each row I want to store combinations of word/value
>> (key-value) to keep track of all the data.
>>
>> My problem is that while iterating over the list (the original one,
>> a.k.a. file_content in the link), I'm nesting several if clauses to
>> match the keys I want. Once that is done, I select the keys I want,
>> give them values, and lastly append that dict to a new list. The
>> problem here is that I always end up with the last line repeated
>> several times, once for each row it finds.
>>
>> Please take a look at what I have now:
>> http://pastebin.com/A9eka7p9
>
> Sorry, I thought that a link to pastebin could be helpful since it
> captures the syntax highlighting and spacing. I don't have fifty lines
> of code there. The 25 lines below were meant to show you guys a picture
> of what is going on, to be more intuitive.
> This is what I have for now:
>

The entire following set of comments is probably outdated since you
apparently did NOT use readlines() or equivalent to get file_content. So
you'd better give us some sample data, a program that can actually run
without getting exceptions due to misnamed variables, and a description
of just what you expected to be in each result variable. It'd also be
smart to mention what version of Python you're targeting.

.... what follows was a waste of my time ...

file_content is not defined, but we can guess you have read it from a
text file with readlines(), or more efficiently that it's simply a file
object for a file opened with "r".

Can we see sample data, maybe for three or four lines?
file_content = [
    "A4 value2 aging",
    "b8 value99 paging",
    "-1 this is aging a test",
    "B2 repeaagingts",
]

The sample, or the description, should indicate if repeats of the
"columns" column are allowed, as with b and B above.

> highway_dict = {}
> aging_dict = {}
> queue_row = []
> for content in file_content:
>     if 'aging' in content:
>         # aging 0 100
>         collumns = ''.join(map(str, content[:1])).replace('-','_').lower()
>         total_values =''.join(map(str, content[1:2]))
>         aging_values = ''.join(map(str, content[2:]))

Those three lines would be much more reasonable and readable if you
eliminated all the list stuff, and just did what was needed. Also,
calling a one-character string "collumns" or "total_values" makes no
sense to me.

        collumns = content[:1].replace('-','_').lower()
        total_values = content[1:2]
        aging_values = content[2:]

>
>         aging_dict['total'], aging_dict[collumns] = total, aging_values

That line tries to get clever, and ends up obscuring what's really
happening. Further, the value in total, if any, is NOT what you just
extracted in total_values.

        aging_dict['total'] = total
        aging_dict[collumns] = aging_values

>         queue_row.append(aging_dict)

Just what do you expect to be in the aging_dict here? If you intended
that each item of queue_row contains a dict describing just one row,
then you need to create a new aging_dict each time through the loop. As
it stands, every append adds another reference to the same dict, so the
list ends up with a bunch of identical entries, all showing whatever the
dict held after the last pass. That is probably the repeated last line
you're seeing.

All the same remarks apply to the highway code below. Additionally, you
don't use collumns for anything, and you use lanes and state when you
presumably meant lanes_values and state_values.
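Before getting to the highway block, here is a minimal sketch of the shared-dict behavior described above. The rows are made up and already split into fields, which is only a guess at the real file format (none has been shown), and the names rows, shared, buggy, fixed and row_dict are purely illustrative. It should run unchanged on Python 2 or 3.

```python
# Hypothetical sample rows, already split into fields. This layout is an
# assumption; the real file format has not been posted.
rows = [
    ["aging", "0", "100"],
    ["aging", "2", "45"],
]

# Pattern from the posted code: one dict created before the loop and
# appended on every pass. append() stores a reference, not a copy.
shared = {}
buggy = []
for fields in rows:
    shared["total"] = fields[1]
    shared["aging"] = fields[2]
    buggy.append(shared)

# Alternative: build a new dict for each row, so each list item is a
# separate object.
fixed = []
for fields in rows:
    row_dict = {"total": fields[1], "aging": fields[2]}
    fixed.append(row_dict)

print(buggy)   # both items are the same dict, holding the last row
print(fixed)   # one distinct dict per row
```

In the first loop both list items are the very same object, so they both show the values from the last row. In the second loop each row gets its own dict, which is presumably what was intended.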
>
>     if 'highway' in content:
>         #highway | 4 | disable | 25
>         collumns = ''.join(map(str, content[:1])).replace('-','_').lower()
>         lanes_values =''.join(map(str, content[1:2]))
>         state_values = ''.join(map(str, content[2:3])).strip('')
>         limit_values = ''.join(map(str, content[3:4])).strip('')
>
>         highway_dict['lanes'], highway_dict['state'], highway_dict['limit(mph)'] = lanes, state, limit_values
>         queue_row.append(highway_dict)
>

Now, when

--
DaveA