


Re: Best approach to create humongous amount of files

Subject Re: Best approach to create humongous amount of files
From Cem Karan <cfkaran2@gmail.com>
Date Thu, 21 May 2015 23:42:31 -0400
Newsgroups comp.lang.python



On May 20, 2015, at 7:44 AM, Parul Mogra <scoria.799@gmail.com> wrote:

> Hello everyone,
> My objective is to create a large number of data files (say a million *.json files) from a pre-existing template file (*.json). Each file would have a unique name, possibly incorporating timestamp information. The files have to be generated in a specified folder.
> 
> What is the best strategy to achieve this task, so that the files will be generated in the shortest possible time? Say within an hour.

If you absolutely don't care about the name, then something like the following will work:

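    import uuid
    # templateString must already hold the template's contents (see below)
    for counter in range(1000000):
        # uuid1() yields a unique name, so no two files collide
        with open(uuid.uuid1().hex.upper() + ".json", "w") as f:
            f.write(templateString)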

where templateString is the template you want to write to each file. The only problem is that the files won't be in any particular order; they'll just be uniquely named. As a test, I ran the code above and killed the loop after about 10 minutes, by which point about 500,000 files had been created. At that rate a million files would take roughly 20 minutes, well inside your one-hour target. Note that my laptop is about 6 years old, so you might get better performance on your machine.
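If you do want the names to sort in creation order and to land in a specific folder, a variation along these lines might work. This is only a sketch, not something I timed: the function name write_copies, its parameters, and the name format (one timestamp per run plus a zero-padded counter) are all my own assumptions, so adjust them to taste.

```python
import os
import time


def write_copies(template_text, out_dir, count):
    """Write `count` copies of template_text into out_dir; return the names used.

    Hypothetical helper: the name format below is an assumption, not a standard.
    """
    os.makedirs(out_dir, exist_ok=True)
    # One timestamp for the whole run; the counter makes each name unique.
    stamp = time.strftime("%Y%m%dT%H%M%S")
    # Zero-pad the counter so lexical order matches creation order.
    width = len(str(max(count - 1, 0)))
    names = []
    for i in range(count):
        name = f"{stamp}_{i:0{width}d}.json"
        # Mode "x" raises FileExistsError instead of overwriting an old file.
        with open(os.path.join(out_dir, name), "x") as f:
            f.write(template_text)
        names.append(name)
    return names


if __name__ == "__main__":
    import tempfile

    # Small demo run in a throwaway directory.
    with tempfile.TemporaryDirectory() as demo_dir:
        created = write_copies('{"key": "value"}', demo_dir, 10)
        print(len(created), "files written, first is", created[0])
```

Names are only unique within one run; two runs started in the same second against the same folder would collide, which is why the files are opened with "x" rather than "w".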

Thanks,
Cem Karan
