Path: csiph.com!news.swapon.de!newsfeed.fsmpi.rwth-aachen.de!newsfeed.straub-nv.de!feeds.phibee-telecom.net!newsfeed.xs4all.nl!newsfeed8.news.xs4all.nl!nzpost1.xs4all.net!not-for-mail Return-Path: X-Original-To: python-list@python.org Delivered-To: python-list@mail.python.org X-Spam-Status: OK 0.001 X-Spam-Evidence: '*H*': 1.00; '*S*': 0.00; 'url:pypi': 0.03; 'only,': 0.07; 'postgresql': 0.07; 'subject:file': 0.07; 'cc:addr:python- list': 0.09; 'csv': 0.09; 'cvs': 0.09; 'received:openend.se': 0.09; 'received:theraft.openend.se': 0.09; 'sqlite': 0.09; 'python': 0.10; 'cc:addr:lac': 0.16; 'cc:addr:openend.se': 0.16; 'csv,': 0.16; 'from:addr:lac': 0.16; 'from:addr:openend.se': 0.16; 'from:name:laura creighton': 0.16; 'message-id:@fido.openend.se': 0.16; 'need,': 0.16; 'optimised': 0.16; 'received:fido': 0.16; 'received:fido.openend.se': 0.16; 'sure.': 0.16; 'memory': 0.17; 'laura': 0.18; 'changes': 0.20; '2015': 0.20; 'cc:addr:python.org': 0.20; 'cc:2**1': 0.22; "haven't": 0.24; 'tried': 0.24; 'import': 0.24; 'module': 0.25; 'comfortable': 0.27; 'fri,': 0.27; 'connects': 0.27; 'this.': 0.28; '-0700,': 0.29; 'accepts': 0.29; 'directly,': 0.29; 'received:se': 0.29; 'cc:no real name:2**1': 0.29; 'convert': 0.29; 'excel': 0.29; 'that.': 0.30; "can't": 0.32; 'problem': 0.33; 'url:python': 0.33; 'subject:change': 0.33; 'file': 0.34; 'done': 0.35; 'problem.': 0.35; 'step': 0.36; 'but': 0.36; 'instead': 0.36; 'url:org': 0.36; 'data.': 0.36; 'url:library': 0.36; 'subject:: ': 0.37; 'two': 0.37; 'there,': 0.37; 'charset:us-ascii': 0.37; 'wanted': 0.37; 'files': 0.38; 'skip:o 20': 0.38; 'means': 0.39; 'why': 0.39; 'data': 0.39; 'enough': 0.39; 'some': 0.40; 'url:3': 0.60; 'your': 0.60; 'header:Message-Id:1': 0.61; 'subject:more': 0.61; 'email addr:gmail.com': 0.62; 'export': 0.63; 'more': 0.63; 'reverse': 0.66; 'header:In-reply-to:1': 0.84; 'one).': 0.84; 'workbook': 0.84; 'sheet': 0.93; 'subject:000': 0.95 DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=openend.se; s=default; t=1444388866; bh=8Uo7nPXsqjaAVxJQxa+zGBv+RTx2eaTgJkEmdZp/ei8=; h=To:cc:From:Subject:In-reply-to:References:Date:From; b=Lijj+v+/gEpox0KtOYzSpfnJlaH9iDXfEMBkmDlu/rIbs3v5CW5moV9tC8kU5DZ1r xmD3KmDeuGL/Of4nw0GHwud5tMx/ToDO+cfXKZZw4AsXGB8Ugip0yqv68cC4UokSky NTfmA/JjesoJYZOosGihgQyhmMHAhR/9gnwwZN2Q= To: gall.pavgal.gall@gmail.com cc: python-list@python.org, lac@openend.se From: Laura Creighton Subject: Re: change data in large excel file(more than 240 000 rows on sheet) In-reply-to: <3d355596-b1cb-4a1e-b234-21bf1e637f63@googlegroups.com> References: <9c23c9d1-4257-475b-a7f6-811e85fe83f3@googlegroups.com> <3d355596-b1cb-4a1e-b234-21bf1e637f63@googlegroups.com> Comments: In-reply-to gall.pavgal.gall@gmail.com message dated "Fri, 09 Oct 2015 01:21:37 -0700." MIME-Version: 1.0 Content-Type: text/plain; charset="us-ascii" Content-ID: <28471.1444388864.1@fido> Content-Transfer-Encoding: quoted-printable Date: Fri, 09 Oct 2015 13:07:44 +0200 X-Greylist: Sender IP whitelisted, not delayed by milter-greylist-4.3.9 (theraft.openend.se [82.96.5.2]); Fri, 09 Oct 2015 13:07:46 +0200 (CEST) X-BeenThere: python-list@python.org X-Mailman-Version: 2.1.20+ Precedence: list List-Id: General discussion list for the Python programming language List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , Newsgroups: comp.lang.python Message-ID: Lines: 57 NNTP-Posting-Host: 2001:888:2000:d::a6 X-Trace: 1444388875 news.xs4all.nl 23735 [2001:888:2000:d::a6]:60263 X-Complaints-To: abuse@xs4all.nl Xref: csiph.com comp.lang.python:97533 In a message of Fri, 09 Oct 2015 01:21:37 -0700, gall.pavgal.gall@gmail.co= m wri tes: >Thanks Laura! >But i need to change existing excel file and if i use the optimised reade= r, i can read data only, but i can't change data. = Hmmm. When I have this problem I have other problems as well, and they can pretty well all be sumarised as 'I never wanted = an excel file anyway'. = So first I convert the xlsx to a CVS. import openpyxl import csv workbook =3D openpyxl.load_workbook('my_big_file.xlsx', read_only=3DTrue) sheet =3D workbook['big_data'] with open('my_big_file.csv', 'wb') as f: c =3D csv.writer(f) for r in sh.rows: c.writerow([cell.value for cell in r]) This may be enough for you, and you may be able to use the Python csv module to make the changes you want. https://docs.python.org/3.5/library/csv.html This may be all you need, if you can process your problem one line at a time. Reading it all into memory will give you the same problem. But I have never actually done this. Instead I convert the cvs to a = postgresql database, and then make my changes there, and then export it again as a csv, and then reverse the process to get an xlsx again (if I need one). I don't know any way to get postgresql to take the xlsx directly, which is why I have the two step process with the csv in the middle. If you are more comfortable with MySQL or SQLite or some other database, by all means use that. They might even have a way to import xlsx directly= . I only do this because it is what I am used to. The other thing to consider is pandas. https://pypi.python.org/pypi/pandas It uses openpyxl under the hood, but is set up for big data. It accepts xlxs files and also connects to databases. It may be exactly what you are looking for -- but I haven't tried this myself for such large datasets, so I don't know for sure. Laura