Finding empty columns. Is there a faster way?

Path csiph.com!x330-a1.tempe.blueboxinc.net!feeder1.hal-mli.net!border3.nntp.dca.giganews.com!border1.nntp.dca.giganews.com!nntp.giganews.com!postnews.google.com!u12g2000vbf.googlegroups.com!not-for-mail
From nn <pruebauno@latinmail.com>
Newsgroups comp.lang.python
Subject Finding empty columns. Is there a faster way?
Date Thu, 21 Apr 2011 09:40:36 -0700 (PDT)
Organization http://groups.google.com
Lines 42
Message-ID <e6f7d142-0691-4a6f-91fe-401dcbe291c9@u12g2000vbf.googlegroups.com> (permalink)
NNTP-Posting-Host 67.98.187.69
Mime-Version 1.0
Content-Type text/plain; charset=ISO-8859-1
X-Trace posting.google.com 1303404037 18276 127.0.0.1 (21 Apr 2011 16:40:37 GMT)
X-Complaints-To groups-abuse@google.com
NNTP-Posting-Date Thu, 21 Apr 2011 16:40:37 +0000 (UTC)
Complaints-To groups-abuse@google.com
Injection-Info u12g2000vbf.googlegroups.com; posting-host=67.98.187.69; posting-account=2aOVSgkAAAAobcghlhtXDlbggz5lM5kz
User-Agent G2/1.0
X-HTTP-UserAgent Mozilla/5.0 (Windows NT 5.1; rv:2.0) Gecko/20100101 Firefox/4.0,gzip(gfe)
Xref x330-a1.tempe.blueboxinc.net comp.lang.python:3810


time head -1000000 myfile  >/dev/null

real    0m4.57s
user    0m3.81s
sys     0m0.74s

time ./repnullsalt.py '|' myfile
0 1 Null columns:
11, 20, 21, 22, 23, 24, 25, 26, 27, 30, 31, 33, 45, 50, 68

real    1m28.94s
user    1m28.11s
sys     0m0.72s



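import sys

def main():
    with open(sys.argv[2], 'rb') as inf:
        # Optional third argument: how many millions of rows to scan.
        # Convert to int; comparing an int with a str raises TypeError.
        limit = int(sys.argv[3]) if len(sys.argv) > 3 else 1
        dlm = sys.argv[1].encode('latin1')
        # Seed the flags from the first row; [:-1] drops the newline.
        nulls = [x == b'' for x in next(inf)[:-1].split(dlm)]
        # Bind frequently used names to locals for faster lookup in the loop.
        enum = enumerate
        split = bytes.split
        out = sys.stdout
        prn = print
        for j, r in enum(inf):
            # Progress report once per million rows; stop at the limit.
            if j % 1000000 == 0:
                prn(j // 1000000, end=' ')
                out.flush()
                if j // 1000000 >= limit:
                    break
            # Flag a column as soon as any row has an empty field in it.
            for i, cur in enum(split(r[:-1], dlm)):
                nulls[i] |= cur == b''
    print('Null columns:')
    # Report 1-based column numbers.
    print(', '.join(str(i + 1) for i, val in enumerate(nulls) if val))

if len(sys.argv) < 3:
    sys.exit("Usage: " + sys.argv[0] +
             " <delimiter> <filename> [limit]")

main()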
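
One possible direction for speeding this up, offered only as a sketch: skip the per-field loop for rows that cannot contain an empty field. A row has an empty field only if it is empty, starts or ends with the delimiter, or contains two delimiters back to back, and those checks run in C. The function name find_null_columns is hypothetical, and stripping '\r' as well as '\n' is an assumption that differs slightly from the r[:-1] used above. This has not been timed against the file in the post, and it only helps when most rows have no empty fields.

```python
def find_null_columns(lines, dlm=b'|'):
    """Return 1-based numbers of columns with at least one empty field.

    `lines` is any iterable of bytes rows, e.g. a file opened in 'rb' mode.
    """
    double = dlm + dlm
    nulls = set()
    for raw in lines:
        row = raw.rstrip(b'\r\n')  # assumption: tolerate CRLF line endings
        # Cheap pre-check: only rows passing it can hold an empty field.
        if row and double not in row \
                and not row.startswith(dlm) and not row.endswith(dlm):
            continue
        # Slow path: split and record which columns are empty in this row.
        for i, cur in enumerate(row.split(dlm)):
            if not cur:
                nulls.add(i + 1)
    return sorted(nulls)
```

Usage would be along the lines of find_null_columns(open('myfile', 'rb'), b'|'). Unlike the script above, this sketch has no row limit and no progress output.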


Thread

Finding empty columns. Is there a faster way? nn <pruebauno@latinmail.com> - 2011-04-21 09:40 -0700
  Re: Finding empty columns. Is there a faster way? Jon Clements <joncle@googlemail.com> - 2011-04-21 13:32 -0700
    Re: Finding empty columns. Is there a faster way? nn <pruebauno@latinmail.com> - 2011-04-22 07:46 -0700