Groups | Search | Server Info | Keyboard shortcuts | Login | Register [http] [https] [nntp] [nntps]
Groups > comp.lang.python > #97481 > unrolled thread
| Started by | gall.pavgal.gall@gmail.com |
|---|---|
| First post | 2015-10-07 07:13 -0700 |
| Last post | 2015-10-09 04:19 -0700 |
| Articles | 8 — 5 participants |
Back to article view | Back to comp.lang.python
change data in large excel file(more than 240 000 rows on sheet) gall.pavgal.gall@gmail.com - 2015-10-07 07:13 -0700
Re: change data in large excel file(more than 240 000 rows on sheet) Mark Lawrence <breamoreboy@yahoo.co.uk> - 2015-10-07 15:18 +0100
Re: change data in large excel file(more than 240 000 rows on sheet) gall.pavgal.gall@gmail.com - 2015-10-08 02:58 -0700
Re: change data in large excel file(more than 240 000 rows on sheet) Laura Creighton <lac@openend.se> - 2015-10-08 12:27 +0200
Re: change data in large excel file(more than 240 000 rows on sheet) gall.pavgal.gall@gmail.com - 2015-10-09 01:21 -0700
Re: change data in large excel file(more than 240 000 rows on sheet) Laura Creighton <lac@openend.se> - 2015-10-09 13:07 +0200
Re: change data in large excel file(more than 240 000 rows on sheet) alister <alister.nospam.ware@ntlworld.com> - 2015-10-09 11:33 +0000
Re: change data in large excel file(more than 240 000 rows on sheet) harirammanohar159@gmail.com - 2015-10-09 04:19 -0700
| From | gall.pavgal.gall@gmail.com |
|---|---|
| Date | 2015-10-07 07:13 -0700 |
| Subject | change data in large excel file(more than 240 000 rows on sheet) |
| Message-ID | <9c23c9d1-4257-475b-a7f6-811e85fe83f3@googlegroups.com> |
Hi Guys, I need to change data in large excel file(more than 240 000 rows on sheet), it's possible through win32com.client, but i need use Linux OS ... Please, could you advise some-thing suitable! best regards, Pavel.
[toc] | [next] | [standalone]
| From | Mark Lawrence <breamoreboy@yahoo.co.uk> |
|---|---|
| Date | 2015-10-07 15:18 +0100 |
| Message-ID | <mailman.454.1444227551.28679.python-list@python.org> |
| In reply to | #97481 |
On 07/10/2015 15:13, gall.pavgal.gall@gmail.com wrote: > Hi Guys, > I need to change data in large excel file(more than 240 000 rows on sheet), it's possible through win32com.client, but i need use Linux OS ... > Please, could you advise some-thing suitable! > > best regards, Pavel. > http://www.python-excel.org/ -- My fellow Pythonistas, ask not what our language can do for you, ask what you can do for our language. Mark Lawrence
[toc] | [prev] | [next] | [standalone]
| From | gall.pavgal.gall@gmail.com |
|---|---|
| Date | 2015-10-08 02:58 -0700 |
| Message-ID | <caed8133-f4cb-420b-b9fd-4512fbe11400@googlegroups.com> |
| In reply to | #97481 |
thanks Guys!
xlrd it's cool, but i need change existing file, which contains some sheets with large data...)
So, if i use xlwt(i.e. create new excel document), i will lose data from other sheets.
I tried use openpyxl, but got error :
Traceback (most recent call last):
File "D:/WebPython/oneMoreTest.py", line 15, in <module>
wb = load_workbook(filename='D:\WebPython\COX.DSG.Offering Product Catalog VANE.xlsx')
File "C:\Python27\lib\site-packages\openpyxl\reader\excel.py", line 149, in load_workbook
_load_workbook(wb, archive, filename, read_only, keep_vba)
File "C:\Python27\lib\site-packages\openpyxl\reader\excel.py", line 236, in _load_workbook
color_index=wb._colors)
File "C:\Python27\lib\site-packages\openpyxl\reader\worksheet.py", line 327, in read_worksheet
fast_parse(ws, xml_source, shared_strings, style_table, color_index)
File "C:\Python27\lib\site-packages\openpyxl\reader\worksheet.py", line 315, in fast_parse
parser.parse()
File "C:\Python27\lib\site-packages\openpyxl\reader\worksheet.py", line 88, in parse
stream = _get_xml_iter(self.source)
File "C:\Python27\lib\site-packages\openpyxl\reader\worksheet.py", line 36, in _get_xml_iter
xml_source = xml_source.encode("utf-8")
MemoryError
[toc] | [prev] | [next] | [standalone]
| From | Laura Creighton <lac@openend.se> |
|---|---|
| Date | 2015-10-08 12:27 +0200 |
| Message-ID | <mailman.468.1444300069.28679.python-list@python.org> |
| In reply to | #97497 |
You need to use the optimised reader. https://openpyxl.readthedocs.org/en/latest/optimized.html Laura
[toc] | [prev] | [next] | [standalone]
| From | gall.pavgal.gall@gmail.com |
|---|---|
| Date | 2015-10-09 01:21 -0700 |
| Message-ID | <3d355596-b1cb-4a1e-b234-21bf1e637f63@googlegroups.com> |
| In reply to | #97499 |
Thanks Laura! But i need to change existing excel file and if i use the optimised reader, i can read data only, but i can't change data.
[toc] | [prev] | [next] | [standalone]
| From | Laura Creighton <lac@openend.se> |
|---|---|
| Date | 2015-10-09 13:07 +0200 |
| Message-ID | <mailman.506.1444388875.28679.python-list@python.org> |
| In reply to | #97531 |
In a message of Fri, 09 Oct 2015 01:21:37 -0700, gall.pavgal.gall@gmail.com wri
tes:
>Thanks Laura!
>But i need to change existing excel file and if i use the optimised reader, i can read data only, but i can't change data.
Hmmm. When I have this problem I have other problems as well,
and they can pretty well all be sumarised as 'I never wanted
an excel file anyway'.
So first I convert the xlsx to a CVS.
import openpyxl
import csv
workbook = openpyxl.load_workbook('my_big_file.xlsx', read_only=True)
sheet = workbook['big_data']
with open('my_big_file.csv', 'wb') as f:
c = csv.writer(f)
for r in sh.rows:
c.writerow([cell.value for cell in r])
This may be enough for you, and you may be able to use the Python
csv module to make the changes you want.
https://docs.python.org/3.5/library/csv.html
This may be all you need, if you can process your problem one line
at a time. Reading it all into memory will give you the same problem.
But I have never actually done this. Instead I convert the cvs to a
postgresql database, and then make my changes there, and then
export it again as a csv, and then reverse the process to get an
xlsx again (if I need one). I don't know any way to get postgresql
to take the xlsx directly, which is why I have the two step process
with the csv in the middle.
If you are more comfortable with MySQL or SQLite or some other database,
by all means use that. They might even have a way to import xlsx directly.
I only do this because it is what I am used to.
The other thing to consider is pandas.
https://pypi.python.org/pypi/pandas
It uses openpyxl under the hood, but is set up for big data. It accepts
xlxs files and also connects to databases. It may be exactly what
you are looking for -- but I haven't tried this myself for such large
datasets, so I don't know for sure.
Laura
[toc] | [prev] | [next] | [standalone]
| From | alister <alister.nospam.ware@ntlworld.com> |
|---|---|
| Date | 2015-10-09 11:33 +0000 |
| Message-ID | <mv88m3$cnv$1@speranza.aioe.org> |
| In reply to | #97531 |
On Fri, 09 Oct 2015 01:21:37 -0700, gall.pavgal.gall wrote:
> Thanks Laura!
> But i need to change existing excel file and if i use the optimised
> reader, i can read data only, but i can't change data.
I know it does not help but if you have this amount of data being stored
in a spreadsheet you are probably using the wrong technology.
This sounds like a typical example of miss-using a spreadsheet as a
database.
once your current issue is resolved you may want to investigate moving
the data to a more sensible format.
--
_________________________________________
/ We Klingons believe as you do -- the \
| sick should die. Only the strong should |
| live. |
| |
| -- Kras, "Friday's Child", stardate |
\ 3497.2 /
-----------------------------------------
\
\
.--.
|o_o |
|:_/ |
// \ \
(| | )
/'\_ _/`\
\___)=(___/
[toc] | [prev] | [next] | [standalone]
| From | harirammanohar159@gmail.com |
|---|---|
| Date | 2015-10-09 04:19 -0700 |
| Message-ID | <e6a732d4-12a0-4fb7-b250-21bf1e055224@googlegroups.com> |
| In reply to | #97481 |
On Wednesday, 7 October 2015 19:44:12 UTC+5:30, gall.pav...@gmail.com wrote: > Hi Guys, > I need to change data in large excel file(more than 240 000 rows on sheet), it's possible through win32com.client, but i need use Linux OS ... > Please, could you advise some-thing suitable! > > best regards, Pavel. Hi Pavel, is there is no possibility to write a definition or macro in excel itself to change the rows which makes bit easier than the script.. :)
[toc] | [prev] | [standalone]
Back to top | Article view | comp.lang.python
csiph-web