Groups | Search | Server Info | Keyboard shortcuts | Login | Register [http] [https] [nntp] [nntps]


Groups > comp.lang.python > #97533

Re: change data in large excel file(more than 240 000 rows on sheet)

Path csiph.com!news.swapon.de!newsfeed.fsmpi.rwth-aachen.de!newsfeed.straub-nv.de!feeds.phibee-telecom.net!newsfeed.xs4all.nl!newsfeed8.news.xs4all.nl!nzpost1.xs4all.net!not-for-mail
Return-Path <lac@openend.se>
X-Original-To python-list@python.org
Delivered-To python-list@mail.python.org
X-Spam-Status OK 0.001
X-Spam-Evidence '*H*': 1.00; '*S*': 0.00; 'url:pypi': 0.03; 'only,': 0.07; 'postgresql': 0.07; 'subject:file': 0.07; 'cc:addr:python- list': 0.09; 'csv': 0.09; 'cvs': 0.09; 'received:openend.se': 0.09; 'received:theraft.openend.se': 0.09; 'sqlite': 0.09; 'python': 0.10; 'cc:addr:lac': 0.16; 'cc:addr:openend.se': 0.16; 'csv,': 0.16; 'from:addr:lac': 0.16; 'from:addr:openend.se': 0.16; 'from:name:laura creighton': 0.16; 'message-id:@fido.openend.se': 0.16; 'need,': 0.16; 'optimised': 0.16; 'received:fido': 0.16; 'received:fido.openend.se': 0.16; 'sure.': 0.16; 'memory': 0.17; 'laura': 0.18; 'changes': 0.20; '2015': 0.20; 'cc:addr:python.org': 0.20; 'cc:2**1': 0.22; "haven't": 0.24; 'tried': 0.24; 'import': 0.24; 'module': 0.25; 'comfortable': 0.27; 'fri,': 0.27; 'connects': 0.27; 'this.': 0.28; '-0700,': 0.29; 'accepts': 0.29; 'directly,': 0.29; 'received:se': 0.29; 'cc:no real name:2**1': 0.29; 'convert': 0.29; 'excel': 0.29; 'that.': 0.30; "can't": 0.32; 'problem': 0.33; 'url:python': 0.33; 'subject:change': 0.33; 'file': 0.34; 'done': 0.35; 'problem.': 0.35; 'step': 0.36; 'but': 0.36; 'instead': 0.36; 'url:org': 0.36; 'data.': 0.36; 'url:library': 0.36; 'subject:: ': 0.37; 'two': 0.37; 'there,': 0.37; 'charset:us-ascii': 0.37; 'wanted': 0.37; 'files': 0.38; 'skip:o 20': 0.38; 'means': 0.39; 'why': 0.39; 'data': 0.39; 'enough': 0.39; 'some': 0.40; 'url:3': 0.60; 'your': 0.60; 'header:Message-Id:1': 0.61; 'subject:more': 0.61; 'email addr:gmail.com': 0.62; 'export': 0.63; 'more': 0.63; 'reverse': 0.66; 'header:In-reply-to:1': 0.84; 'one).': 0.84; 'workbook': 0.84; 'sheet': 0.93; 'subject:000': 0.95
DKIM-Signature v=1; a=rsa-sha256; c=relaxed/relaxed; d=openend.se; s=default; t=1444388866; bh=8Uo7nPXsqjaAVxJQxa+zGBv+RTx2eaTgJkEmdZp/ei8=; h=To:cc:From:Subject:In-reply-to:References:Date:From; b=Lijj+v+/gEpox0KtOYzSpfnJlaH9iDXfEMBkmDlu/rIbs3v5CW5moV9tC8kU5DZ1r xmD3KmDeuGL/Of4nw0GHwud5tMx/ToDO+cfXKZZw4AsXGB8Ugip0yqv68cC4UokSky NTfmA/JjesoJYZOosGihgQyhmMHAhR/9gnwwZN2Q=
To gall.pavgal.gall@gmail.com
cc python-list@python.org, lac@openend.se
From Laura Creighton <lac@openend.se>
Subject Re: change data in large excel file(more than 240 000 rows on sheet)
In-reply-to <3d355596-b1cb-4a1e-b234-21bf1e637f63@googlegroups.com>
References <9c23c9d1-4257-475b-a7f6-811e85fe83f3@googlegroups.com> <caed8133-f4cb-420b-b9fd-4512fbe11400@googlegroups.com> <mailman.468.1444300069.28679.python-list@python.org> <3d355596-b1cb-4a1e-b234-21bf1e637f63@googlegroups.com>
Comments In-reply-to gall.pavgal.gall@gmail.com message dated "Fri, 09 Oct 2015 01:21:37 -0700."
MIME-Version 1.0
Content-Type text/plain; charset="us-ascii"
Content-ID <28471.1444388864.1@fido>
Content-Transfer-Encoding quoted-printable
Date Fri, 09 Oct 2015 13:07:44 +0200
X-Greylist Sender IP whitelisted, not delayed by milter-greylist-4.3.9 (theraft.openend.se [82.96.5.2]); Fri, 09 Oct 2015 13:07:46 +0200 (CEST)
X-BeenThere python-list@python.org
X-Mailman-Version 2.1.20+
Precedence list
List-Id General discussion list for the Python programming language <python-list.python.org>
List-Unsubscribe <https://mail.python.org/mailman/options/python-list>, <mailto:python-list-request@python.org?subject=unsubscribe>
List-Archive <http://mail.python.org/pipermail/python-list/>
List-Post <mailto:python-list@python.org>
List-Help <mailto:python-list-request@python.org?subject=help>
List-Subscribe <https://mail.python.org/mailman/listinfo/python-list>, <mailto:python-list-request@python.org?subject=subscribe>
Newsgroups comp.lang.python
Message-ID <mailman.506.1444388875.28679.python-list@python.org> (permalink)
Lines 57
NNTP-Posting-Host 2001:888:2000:d::a6
X-Trace 1444388875 news.xs4all.nl 23735 [2001:888:2000:d::a6]:60263
X-Complaints-To abuse@xs4all.nl
Xref csiph.com comp.lang.python:97533

Show key headers only | View raw


In a message of Fri, 09 Oct 2015 01:21:37 -0700, gall.pavgal.gall@gmail.com wri
tes:
>Thanks Laura!
>But i need to change existing excel file and if i use the optimised reader, i can read data only, but i can't change data.  

Hmmm.  When I have this problem I have other problems as well,
and they can pretty well all be sumarised as 'I never wanted 
an excel file anyway'. 

So first I convert the xlsx to a CVS.

import openpyxl
import csv

workbook = openpyxl.load_workbook('my_big_file.xlsx', read_only=True)
sheet = workbook['big_data']

with open('my_big_file.csv', 'wb') as f:
    c = csv.writer(f)
    for r in sh.rows:
        c.writerow([cell.value for cell in r])

This may be enough for you, and you may be able to use the Python
csv module to make the changes you want.
https://docs.python.org/3.5/library/csv.html
This may be all you need, if you can process your problem one line
at a time.  Reading it all into memory will give you the same problem.

But I have never actually done this.  Instead I convert the cvs to a 
postgresql database, and then make my changes there, and then
export it again as a csv, and then reverse the process to get an
xlsx again (if I need one).  I don't know any way to get postgresql
to take the xlsx directly, which is why I have the two step process
with the csv in the middle.

If you are more comfortable with MySQL or SQLite or some other database,
by all means use that.  They might even have a way to import xlsx directly.
I only do this because it is what I am used to.

The other thing to consider is pandas.
https://pypi.python.org/pypi/pandas

It uses openpyxl under the hood, but is set up for big data.  It accepts
xlxs files and also connects to databases.  It may be exactly what
you are looking for -- but I haven't tried this myself for such large
datasets, so  I don't know for sure.

Laura

Back to comp.lang.python | Previous | NextPrevious in thread | Next in thread | Find similar | Unroll thread


Thread

change data in large excel file(more than 240 000 rows on sheet) gall.pavgal.gall@gmail.com - 2015-10-07 07:13 -0700
  Re: change data in large excel file(more than 240 000 rows on sheet) Mark Lawrence <breamoreboy@yahoo.co.uk> - 2015-10-07 15:18 +0100
  Re: change data in large excel file(more than 240 000 rows on sheet) gall.pavgal.gall@gmail.com - 2015-10-08 02:58 -0700
    Re: change data in large excel file(more than 240 000 rows on sheet) Laura Creighton <lac@openend.se> - 2015-10-08 12:27 +0200
      Re: change data in large excel file(more than 240 000 rows on sheet) gall.pavgal.gall@gmail.com - 2015-10-09 01:21 -0700
        Re: change data in large excel file(more than 240 000 rows on sheet) Laura Creighton <lac@openend.se> - 2015-10-09 13:07 +0200
        Re: change data in large excel file(more than 240 000 rows on sheet) alister <alister.nospam.ware@ntlworld.com> - 2015-10-09 11:33 +0000
  Re: change data in large excel file(more than 240 000 rows on sheet) harirammanohar159@gmail.com - 2015-10-09 04:19 -0700

csiph-web