Groups | Search | Server Info | Keyboard shortcuts | Login | Register [http] [https] [nntp] [nntps]


Groups > comp.lang.python > #97481 > unrolled thread

change data in large excel file(more than 240 000 rows on sheet)

Started bygall.pavgal.gall@gmail.com
First post2015-10-07 07:13 -0700
Last post2015-10-09 04:19 -0700
Articles 8 — 5 participants

Back to article view | Back to comp.lang.python


Contents

  change data in large excel file(more than 240 000 rows on sheet) gall.pavgal.gall@gmail.com - 2015-10-07 07:13 -0700
    Re: change data in large excel file(more than 240 000 rows on sheet) Mark Lawrence <breamoreboy@yahoo.co.uk> - 2015-10-07 15:18 +0100
    Re: change data in large excel file(more than 240 000 rows on sheet) gall.pavgal.gall@gmail.com - 2015-10-08 02:58 -0700
      Re: change data in large excel file(more than 240 000 rows on sheet) Laura Creighton <lac@openend.se> - 2015-10-08 12:27 +0200
        Re: change data in large excel file(more than 240 000 rows on sheet) gall.pavgal.gall@gmail.com - 2015-10-09 01:21 -0700
          Re: change data in large excel file(more than 240 000 rows on sheet) Laura Creighton <lac@openend.se> - 2015-10-09 13:07 +0200
          Re: change data in large excel file(more than 240 000 rows on sheet) alister <alister.nospam.ware@ntlworld.com> - 2015-10-09 11:33 +0000
    Re: change data in large excel file(more than 240 000 rows on sheet) harirammanohar159@gmail.com - 2015-10-09 04:19 -0700

#97481 — change data in large excel file(more than 240 000 rows on sheet)

Fromgall.pavgal.gall@gmail.com
Date2015-10-07 07:13 -0700
Subjectchange data in large excel file(more than 240 000 rows on sheet)
Message-ID<9c23c9d1-4257-475b-a7f6-811e85fe83f3@googlegroups.com>
Hi Guys,
I need to change data in large excel file(more than 240 000 rows on sheet), it's possible through win32com.client, but i need use Linux OS ...
Please, could you advise some-thing suitable!

best regards, Pavel.

[toc] | [next] | [standalone]


#97482

FromMark Lawrence <breamoreboy@yahoo.co.uk>
Date2015-10-07 15:18 +0100
Message-ID<mailman.454.1444227551.28679.python-list@python.org>
In reply to#97481
On 07/10/2015 15:13, gall.pavgal.gall@gmail.com wrote:
> Hi Guys,
> I need to change data in large excel file(more than 240 000 rows on sheet), it's possible through win32com.client, but i need use Linux OS ...
> Please, could you advise some-thing suitable!
>
> best regards, Pavel.
>

http://www.python-excel.org/

-- 
My fellow Pythonistas, ask not what our language can do for you, ask
what you can do for our language.

Mark Lawrence

[toc] | [prev] | [next] | [standalone]


#97497

Fromgall.pavgal.gall@gmail.com
Date2015-10-08 02:58 -0700
Message-ID<caed8133-f4cb-420b-b9fd-4512fbe11400@googlegroups.com>
In reply to#97481
thanks Guys!
xlrd it's cool, but i need change existing file, which contains some sheets with large data...)
So, if i use xlwt(i.e. create new excel document), i will lose data from other sheets.
I tried use openpyxl, but got error :

Traceback (most recent call last):
  File "D:/WebPython/oneMoreTest.py", line 15, in <module>
    wb = load_workbook(filename='D:\WebPython\COX.DSG.Offering Product Catalog VANE.xlsx')
  File "C:\Python27\lib\site-packages\openpyxl\reader\excel.py", line 149, in load_workbook
    _load_workbook(wb, archive, filename, read_only, keep_vba)
  File "C:\Python27\lib\site-packages\openpyxl\reader\excel.py", line 236, in _load_workbook
    color_index=wb._colors)
  File "C:\Python27\lib\site-packages\openpyxl\reader\worksheet.py", line 327, in read_worksheet
    fast_parse(ws, xml_source, shared_strings, style_table, color_index)
  File "C:\Python27\lib\site-packages\openpyxl\reader\worksheet.py", line 315, in fast_parse
    parser.parse()
  File "C:\Python27\lib\site-packages\openpyxl\reader\worksheet.py", line 88, in parse
    stream = _get_xml_iter(self.source)
  File "C:\Python27\lib\site-packages\openpyxl\reader\worksheet.py", line 36, in _get_xml_iter
    xml_source = xml_source.encode("utf-8")
MemoryError

[toc] | [prev] | [next] | [standalone]


#97499

FromLaura Creighton <lac@openend.se>
Date2015-10-08 12:27 +0200
Message-ID<mailman.468.1444300069.28679.python-list@python.org>
In reply to#97497
You need to use the optimised reader.
https://openpyxl.readthedocs.org/en/latest/optimized.html

Laura

[toc] | [prev] | [next] | [standalone]


#97531

Fromgall.pavgal.gall@gmail.com
Date2015-10-09 01:21 -0700
Message-ID<3d355596-b1cb-4a1e-b234-21bf1e637f63@googlegroups.com>
In reply to#97499
Thanks Laura!
But i need to change existing excel file and if i use the optimised reader, i can read data only, but i can't change data.  

[toc] | [prev] | [next] | [standalone]


#97533

FromLaura Creighton <lac@openend.se>
Date2015-10-09 13:07 +0200
Message-ID<mailman.506.1444388875.28679.python-list@python.org>
In reply to#97531
In a message of Fri, 09 Oct 2015 01:21:37 -0700, gall.pavgal.gall@gmail.com wri
tes:
>Thanks Laura!
>But i need to change existing excel file and if i use the optimised reader, i can read data only, but i can't change data.  

Hmmm.  When I have this problem I have other problems as well,
and they can pretty well all be sumarised as 'I never wanted 
an excel file anyway'. 

So first I convert the xlsx to a CVS.

import openpyxl
import csv

workbook = openpyxl.load_workbook('my_big_file.xlsx', read_only=True)
sheet = workbook['big_data']

with open('my_big_file.csv', 'wb') as f:
    c = csv.writer(f)
    for r in sh.rows:
        c.writerow([cell.value for cell in r])

This may be enough for you, and you may be able to use the Python
csv module to make the changes you want.
https://docs.python.org/3.5/library/csv.html
This may be all you need, if you can process your problem one line
at a time.  Reading it all into memory will give you the same problem.

But I have never actually done this.  Instead I convert the cvs to a 
postgresql database, and then make my changes there, and then
export it again as a csv, and then reverse the process to get an
xlsx again (if I need one).  I don't know any way to get postgresql
to take the xlsx directly, which is why I have the two step process
with the csv in the middle.

If you are more comfortable with MySQL or SQLite or some other database,
by all means use that.  They might even have a way to import xlsx directly.
I only do this because it is what I am used to.

The other thing to consider is pandas.
https://pypi.python.org/pypi/pandas

It uses openpyxl under the hood, but is set up for big data.  It accepts
xlxs files and also connects to databases.  It may be exactly what
you are looking for -- but I haven't tried this myself for such large
datasets, so  I don't know for sure.

Laura

[toc] | [prev] | [next] | [standalone]


#97536

Fromalister <alister.nospam.ware@ntlworld.com>
Date2015-10-09 11:33 +0000
Message-ID<mv88m3$cnv$1@speranza.aioe.org>
In reply to#97531
On Fri, 09 Oct 2015 01:21:37 -0700, gall.pavgal.gall wrote:

> Thanks Laura!
> But i need to change existing excel file and if i use the optimised
> reader, i can read data only, but i can't change data.

I know it does not help but if you have this amount of data being stored 
in a spreadsheet you are probably using the wrong technology.

This sounds like a typical example of miss-using a spreadsheet as a 
database.

once your current issue is resolved you may want to investigate moving 
the data to a more sensible format.




-- 
 _________________________________________
/ We Klingons believe as you do -- the    \
| sick should die. Only the strong should |
| live.                                   |
|                                         |
| -- Kras, "Friday's Child", stardate     |
\ 3497.2                                  /
 -----------------------------------------
   \
    \
        .--.
       |o_o |
       |:_/ |
      //   \ \
     (|     | )
    /'\_   _/`\
    \___)=(___/

[toc] | [prev] | [next] | [standalone]


#97534

Fromharirammanohar159@gmail.com
Date2015-10-09 04:19 -0700
Message-ID<e6a732d4-12a0-4fb7-b250-21bf1e055224@googlegroups.com>
In reply to#97481
On Wednesday, 7 October 2015 19:44:12 UTC+5:30, gall.pav...@gmail.com  wrote:
> Hi Guys,
> I need to change data in large excel file(more than 240 000 rows on sheet), it's possible through win32com.client, but i need use Linux OS ...
> Please, could you advise some-thing suitable!
> 
> best regards, Pavel.

Hi Pavel,

is there is no possibility to write a definition or macro in excel itself to change the rows which makes bit easier than the script.. :)

[toc] | [prev] | [standalone]


Back to top | Article view | comp.lang.python


csiph-web