| Path | csiph.com!fu-berlin.de!uni-berlin.de!not-for-mail |
|---|---|
| From | Bruce Kirk <bruce.kirk24@gmail.com> |
| Newsgroups | comp.lang.python |
| Subject | Re: Python to do CDC on XML files |
| Date | Wed, 23 Mar 2016 19:57:12 -0400 |
| Lines | 18 |
| Message-ID | <mailman.79.1458801774.2244.python-list@python.org> |
| References | <833ad88a-4840-4a23-8ab3-b736068b49fe@googlegroups.com> <CAP1rxO79Rzo3tAhR9E5djkhWB79x2QrHB-+0rStW_girQumobg@mail.gmail.com> |
| Xref | csiph.com comp.lang.python:105588 |
I agree. The challenge is the volume of data to compare: 13 million records. So it needs to be very fast.

Sent from my iPad

> On Mar 23, 2016, at 4:47 PM, Bob Gailer <bgailer@gmail.com> wrote:
>
>
> On Mar 23, 2016 4:20 PM, "Bruce Kirk" <bruce.kirk24@gmail.com> wrote:
> >
> > Does anyone know of any existing projects on how to generate a change data capture on 2 very large xml files.
> >
> > The xml structures are the same, it is the data within the files that may differ.
> >
> It should not be too difficult to write a program that locates the tags delimiting each record, then compare them.
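
For what it's worth, the approach quoted above (locate the tags delimiting each record, then compare them) could be sketched roughly as follows. This is only a minimal sketch under assumptions not stated in the thread: the record tag `record`, the key attribute `id`, and the helper names `record_digests` and `diff_records` are all hypothetical, and records are assumed to be direct children of the root element. It streams each file with `xml.etree.ElementTree.iterparse` and keeps only a digest per record rather than the whole tree.

```python
import hashlib
import io
import xml.etree.ElementTree as ET


def record_digests(source, record_tag="record", key_attr="id"):
    """Stream an XML file and map each record key to a digest of its content.

    record_tag and key_attr are hypothetical; adjust them to the real schema.
    Assumes records are direct children of the root and every record has a key.
    """
    digests = {}
    context = ET.iterparse(source, events=("start", "end"))
    _event, root = next(context)  # the first event is the start of the root
    for event, elem in context:
        if event == "end" and elem.tag == record_tag:
            elem.tail = None  # ignore whitespace between records
            digests[elem.get(key_attr)] = hashlib.sha1(ET.tostring(elem)).digest()
            root.clear()  # drop records already processed to bound memory
    return digests


def diff_records(old_source, new_source, **kwargs):
    """Return (inserted, deleted, updated) record keys between two files."""
    old = record_digests(old_source, **kwargs)
    new = record_digests(new_source, **kwargs)
    inserted = sorted(new.keys() - old.keys())
    deleted = sorted(old.keys() - new.keys())
    updated = sorted(k for k in old.keys() & new.keys() if old[k] != new[k])
    return inserted, deleted, updated


# Tiny in-memory example; real input would be file paths or open files.
OLD = (b'<records><record id="1"><name>Ann</name></record>'
       b'<record id="2"><name>Bob</name></record></records>')
NEW = (b'<records><record id="1"><name>Ann</name></record>'
       b'<record id="2"><name>Rob</name></record>'
       b'<record id="3"><name>Cy</name></record></records>')

inserted, deleted, updated = diff_records(io.BytesIO(OLD), io.BytesIO(NEW))
print(inserted, deleted, updated)  # ['3'] [] ['2']
```

With 13 million records the two dictionaries of digests may still be large, so whether this is fast enough, or whether an on-disk store or a sorted merge of the two files would be needed instead, is something that would have to be measured on the real data.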
Python to do CDC on XML files Bruce Kirk <bruce.kirk24@gmail.com> - 2016-03-23 13:16 -0700
Re: Python to do CDC on XML files Bob Gailer <bgailer@gmail.com> - 2016-03-23 16:47 -0400
Re: Python to do CDC on XML files Bruce Kirk <bruce.kirk24@gmail.com> - 2016-03-23 19:57 -0400
Re: Python to do CDC on XML files Chris Angelico <rosuav@gmail.com> - 2016-03-24 18:00 +1100
Re: Python to do CDC on XML files Peter Otten <__peter__@web.de> - 2016-03-24 09:19 +0100