From: Ethan Furman
Date: Fri, 27 Jul 2012 03:50:39 -0700
To: python-list@python.org
Newsgroups: comp.lang.python
Subject: Re: ANN: dbf.py 0.94.003

Simon Cropper wrote:
> On 27/07/12 05:31, Ethan Furman wrote:
>> A few more bug fixes, and I actually included the documentation this
>> time. :) It can be found at http://python.org/pypi/dbf, and has been
>> tested on CPythons 2.4 - 2.7, and PyPy 1.8.
>
> [snip]
>
> Ethan,
>
> That's great.
>
> Can you comment on the ultimate aim of the project?

To provide read/write access to the dbf format, from dBase III to dBase 7, including memos and index files.
Currently supports dBase III, FoxPro, Clipper, and Visual FoxPro (but not autoincrement nor varchar).

> Is this package primarily a "universal dbf translator" that allows the
> data stored in DBFs (which I might add I have many in legacy VFP
> applications and GIS Shapefiles) to be accessed and extracted or is the
> module being designed to be used interactively to extract data from and
> update tables?

Some folk use it as a dbf translator, some folk use it for interactive work. I use it for both those purposes, as well as for creating new dbf files which get processed by our in-house software and by third-party software every day.

> I remember on the last thread that someone mentioned that indexes are
> not supported. I presume then that moving around a table with a couple
> of million records might be a tad slow. Have you tested the package on
> large datasets, both DBFs with a large number of records as well as a
> large number of fields?

The largest tables I've had at my disposal so far were about 300,000 records with roughly 50 fields and a total record length of about 1,500 bytes. Processing (for me) involves going through every single record, and yes, it was a tad slow. This is my most common scenario, and index files would not help at all.

For more typical work (for others) of selecting and using a subset of the dbf, an in-memory index can be created -- initial creation can take a few moments, but searches afterwards are quite quick.

This is a pure-Python implementation, so speed is not the first goal. At some point in the future I would like to create a C accelerator, but that's pretty far down the to-do list.

~Ethan~
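
P.S. To illustrate the in-memory index trade-off described above (slow to build once, quick to search afterwards), here is a minimal sketch in plain Python. It is not dbf.py's API: the `Record` type, `build_index`, and `search` are hypothetical names made up for this example, and the three sample rows are invented.

```python
import bisect
from collections import namedtuple

# Hypothetical stand-in for a table row; NOT dbf.py's record class.
Record = namedtuple("Record", "name city balance")


def build_index(records, key):
    """One full pass plus a sort: the 'few moments' paid up front."""
    pairs = sorted((key(rec), pos) for pos, rec in enumerate(records))
    keys = [k for k, _ in pairs]
    positions = [p for _, p in pairs]
    return keys, positions


def search(index, records, value):
    """Binary search on the sorted keys; returns all matching records."""
    keys, positions = index
    lo = bisect.bisect_left(keys, value)
    hi = bisect.bisect_right(keys, value)
    return [records[p] for p in positions[lo:hi]]


# Invented sample data, standing in for records read from a table.
records = [
    Record("Adams", "Portland", 10.0),
    Record("Baker", "Salem", 25.5),
    Record("Clark", "Portland", 7.25),
]

by_city = build_index(records, key=lambda rec: rec.city)
portland = search(by_city, records, "Portland")
```

Building the index touches every record once, so it only pays off when several lookups follow. For a job that has to visit every record anyway, a plain loop over the table is just as good, which is why an index does not help in that case.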