| Date | 2011-12-20 21:03 +0100 |
|---|---|
| From | Jérôme <jerome@jolimont.fr> |
| Subject | Re: Text Processing |
| References | <209c2abf-dd56-4a7f-839b-fad92920d457@m7g2000vbc.googlegroups.com> |
| Newsgroups | comp.lang.python |
| Message-ID | <mailman.3879.1324411263.27778.python-list@python.org> |
On Tue, 20 Dec 2011 11:17:15 -0800 (PST), Yigit Turgut wrote:
> Hi all,
>
> I have a text file containing such data ;
>
> A B C
> -------------------------------------------------------
> -2.0100e-01 8.000e-02 8.000e-05
> -2.0000e-01 0.000e+00 4.800e-04
> -1.9900e-01 4.000e-02 1.600e-04
>
> But I only need Section B, and I need to change the notation to ;
>
> 8.000e-02 = 0.08
> 0.000e+00 = 0.00
> 4.000e-02 = 0.04
>
> Text file is approximately 10MB in size. I looked around to see if
> there is a quick and dirty workaround but there are lots of modules,
> lots of options.. I am confused.
>
> Which module is most suitable for this task ?
You could try to do it yourself.
You'd need to know what separates the data. A tab character? Spaces?
Example:
Input file
----------
A    B    C
-------------------------------------------------------
-2.0100e-01    8.000e-02    8.000e-05
-2.0000e-01    0.000e+00    4.800e-04
-1.9900e-01    4.000e-02    1.600e-04
Python code
-----------
# Open file
with open('test1.plt', 'r') as f:
    b_values = []
    # Skip the two header lines, then read the first data line
    line = f.readline()
    line = f.readline()
    line = f.readline()
    while line:
        # start = line.find(u"\u0009", 0) + 1  # seek Tab
        start = line.find("    ", 0) + 4  # seek 4 spaces
        # end = line.find(u"\u0009", start)
        end = line.find("    ", start)
        # Column B is the text between the first and second separators
        b_values.append(float(line[start:end].strip()))
        line = f.readline()
print(b_values)
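The question also asks for fixed-point notation (8.000e-02 becoming 0.08). Here is a minimal sketch of that step, assuming two decimal places are wanted, as in the examples quoted above. The helper name to_fixed is made up for illustration.

```python
def to_fixed(values, decimals=2):
    """Format floats in fixed-point notation with `decimals` decimal places."""
    return ["{:.{}f}".format(value, decimals) for value in values]


# Values of column B from the sample input
b_values = [8.000e-02, 0.000e+00, 4.000e-02]
print(to_fixed(b_values))  # ['0.08', '0.00', '0.04']
```

The result is a list of strings, so it can be written straight to an output file, one value per line.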
It gets trickier if the number of spaces is not constant. In that case I
would try regular expressions. A regexp might be more efficient in any case.
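A rough sketch of the regular-expression idea, in case the separator is a variable run of spaces or tabs. It assumes two header lines and that column B is always the second field; the names SEPARATOR and column_b are made up for illustration.

```python
import re

# One or more whitespace characters (spaces or tabs) between columns
SEPARATOR = re.compile(r"\s+")


def column_b(lines, header_lines=2):
    """Return the second column of each data line as a list of floats."""
    values = []
    for number, line in enumerate(lines):
        if number < header_lines:
            continue  # skip the column titles and the dashed rule
        fields = SEPARATOR.split(line.strip())
        if len(fields) < 2:
            continue  # blank or incomplete line
        values.append(float(fields[1]))
    return values


sample = [
    "A   B      C",
    "-------------------------",
    "-2.0100e-01  8.000e-02\t8.000e-05",
    "-2.0000e-01     0.000e+00 4.800e-04",
    "",
    "-1.9900e-01\t4.000e-02   1.600e-04",
]
print(column_b(sample))  # [0.08, 0.0, 0.04]
```

Since column_b only iterates over its argument, an open file object can be passed in directly. For plain whitespace, line.split() with no argument splits the same way without a regexp.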
--
Jérôme