To: python-list@python.org
From: Dennis Lee Bieber
Subject: Re: enumerate XML tags (keys that will become headers) along with text (values) and write to CSV in one row (as opposed to "stacked" values with one header)
Date: Mon, 29 Jun 2015 21:26:37 -0400
Organization: IISS Elusive Unicorn
References:
<14aeae7a-41ab-4619-8331-7995e2420e54@googlegroups.com>
Newsgroups: comp.lang.python

On Mon, 29 Jun 2015 07:52:07 -0700 (PDT), Sahlusar declaimed the following:

> From what I understand, therefore, based on your constructive insight,
> is that the 14 occurrences of the same tag (regardless of placement
> relative to neighbouring children and the root) are all being defined
> as the same key. However, their individual values are also being
> treated as the same (from the algorithm that I wrote in my Stack
> Overflow post (please see above)). The constraint is that I am
> anticipating terabytes of data every day from the client in the coming
> months. The algorithm should be able to parse, and write out to CSV in
> the most efficient manner. That is my design constraint. I welcome
> your feedback on this.

I sure hope that "terabytes of data" is hyperbole... My system takes
something like three hours just to generate a 500GB backup (one partition
each week -- I have a 4TB backup drive with only 740GB free; the other
drives are only half full or I'd need an 8TB backup).

And that's using a compiled backup program -- I'd hate to consider what
Python would require to back up the partition.

-- 
Wulfraed    Dennis Lee Bieber    AF6VN
wlfraed@ix.netcom.com    HTTP://wlfraed.home.netcom.com/
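
For a rough sense of scale, the backup figures above can be turned into a sustained throughput and compared with "terabytes every day". This is only a back-of-the-envelope sketch, not a benchmark: it assumes decimal units (1 GB = 10**9 bytes), treats "something like three hours" as exactly 3 hours, and the function names are made up for illustration.

```python
# Back-of-the-envelope arithmetic; all inputs are approximations.
GB = 10**9   # assumption: decimal gigabytes
TB = 10**12  # assumption: decimal terabytes


def throughput_bytes_per_sec(size_bytes, hours):
    """Average rate needed to move size_bytes in the given number of hours."""
    return size_bytes / (hours * 3600)


def hours_to_process(size_bytes, rate_bytes_per_sec):
    """Hours needed to move size_bytes at a constant rate."""
    return size_bytes / rate_bytes_per_sec / 3600


# 500 GB in roughly 3 hours, as described in the post.
rate = throughput_bytes_per_sec(500 * GB, 3)

# How much that rate could move in 24 hours of continuous work.
daily_capacity_tb = rate * 86400 / TB

print(f"backup rate:     {rate / 10**6:.1f} MB/s")
print(f"per-day ceiling: {daily_capacity_tb:.1f} TB")
print(f"time for 1 TB:   {hours_to_process(1 * TB, rate):.1f} h")
```

Under those assumptions the backup runs at roughly 46 MB/s, which caps out at about 4 TB for a full day of continuous writing; a single terabyte would occupy that system for about six hours before any XML parsing is done.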