CSV methodology

Started by	jetrn@newsguy.com
First post	2014-09-13 21:34 -0400
Last post	2014-09-23 09:59 +0200
Articles	20 — 11 participants

  CSV methodology jetrn@newsguy.com - 2014-09-13 21:34 -0400
    Re: CSV methodology kjs <bfb@riseup.net> - 2014-09-14 02:51 +0000
    Re: CSV methodology Terry Reedy <tjreedy@udel.edu> - 2014-09-14 03:02 -0400
      Re: CSV methodology jayte <jetrn@newsguy.com> - 2014-09-14 12:56 -0400
        Re: CSV methodology Chris Angelico <rosuav@gmail.com> - 2014-09-15 03:10 +1000
        Re: CSV methodology Terry Reedy <tjreedy@udel.edu> - 2014-09-14 14:42 -0400
          Re: CSV methodology jayte <jetrn@newsguy.com> - 2014-09-14 16:19 -0400
        Re: CSV methodology Peter Otten <__peter__@web.de> - 2014-09-15 09:29 +0200
          Re: CSV methodology jayte <jetrn@newsguy.com> - 2014-09-15 12:33 -0400
            Re: CSV methodology Peter Otten <__peter__@web.de> - 2014-09-16 13:22 +0200
              Re: CSV methodology jayte <jetrn@newsguy.com> - 2014-09-16 14:03 -0400
              Works perfectly (was Re: CSV methodology) jayte <jetrn@newsguy.com> - 2014-09-22 20:27 -0400
                Re: Works perfectly (was Re: CSV methodology) Peter Otten <__peter__@web.de> - 2014-09-23 09:59 +0200
    Re: CSV methodology Cameron Simpson <cs@zip.com.au> - 2014-09-14 18:38 +1000
      Re: CSV methodology Rustom Mody <rustompmody@gmail.com> - 2014-09-14 01:56 -0700
        Re: CSV methodology Cameron Simpson <cs@zip.com.au> - 2014-09-15 09:28 +1000
    Re: CSV methodology Akira Li <4kir4.1i@gmail.com> - 2014-09-15 11:12 +0400
      Re: CSV methodology pH <high@cidity.level> - 2014-09-15 12:40 -0400
    Re:CSV methodology Dave Angel <davea@davea.name> - 2014-09-15 09:29 -0400
      Re: CSV methodology jayte <jetrn@newsguy.com> - 2014-09-15 12:53 -0400

#77855 — CSV methodology

From	jetrn@newsguy.com
Date	2014-09-13 21:34 -0400
Subject	CSV methodology
Message-ID	<b2q91ada6b59ept81ac65vtnnu6sdklp1h@4ax.com>

Hello.  Back in the '80s, I wrote a fractal generator, which, over the years,
I've modified/etc to run under Windows.  I've been an Assembly Language
programmer for decades.  Recently, I decided to learn a new language,
and decided on Python, and I just love it, and the various IDEs.

Anyway, something I thought would be interesting, would be to export
some data from my fractal program (I call it MXP), and write something
in Python and its various scientific data analysis and plotting modules,
and... well, see what's in there.

An example of the data:
1.850358651774470E-0002
32
22
27
... (this format repeats)

So, I wrote a procedure in MXP which converts "the data" and exports
a csv file.  So far, here's what I've started with:

-----------------------------------------------
import csv

fname = 'E:/Users/jayte/Documents/Python Scripts/XportTestBlock.csv'

f = open(fname)

reader = csv.reader(f)

for flt in reader:
    x = len(flt)
f.close()
-----------------------------------------------

This will get me an addressable array, as:

flt[0], flt[1], flt[350], etc...  from which values can be assigned to
other variables, converted...

My question:  Is there a better way?  Do I need to learn more about
how csv files are organized?  Perhaps I know far too little of Python
to be attempting something like this, just yet.

Advice?

Jeff

#77858

From	kjs <bfb@riseup.net>
Date	2014-09-14 02:51 +0000
Message-ID	<mailman.14004.1410663113.18130.python-list@python.org>
In reply to	#77855

jetrn@newsguy.com wrote:
> 
> Hello.  Back in the '80s, I wrote a fractal generator, which, over the years,
> I've modified/etc to run under Windows.  I've been an Assembly Language
> programmer for decades.  Recently, I decided to learn a new language,
> and decided on Python, and I just love it, and the various IDEs.
> 
> Anyway, something I thought would be interesting, would be to export
> some data from my fractal program (I call it MXP), and write something
> in Python and its various scientific data analysis and plotting modules,
> and... well, see what's in there.
> 
> An example of the data:
> 1.850358651774470E-0002
> 32
> 22
> 27
> ... (this format repeats)
> 
> So, I wrote a procedure in MXP which converts "the data" and exports
> a csv file.  So far, here's what I've started with:
> 
> -----------------------------------------------
> import csv
> 
> fname = 'E:/Users/jayte/Documents/Python Scripts/XportTestBlock.csv'
> 
> f = open(fname)
> 
> reader = csv.reader(f)
> 
> for flt in reader:
>     x = len(flt)
> file.close(f)
> -----------------------------------------------

The csv.reader(f) object creates an iterable that will create lists from
lines in f. The list will have values at indexes based on the commas in
the file. EX:

my_header_1, my_header_2
111, 0001
101, 1010
100, 1001

The csv.reader will lazily make lists like ['my_header_1',
'my_header_2'], ['111', '0001'], ... and so forth.

Your program above will take the length of each of those lists, and assign
that value to x. For every line in the file, x will get rewritten with a new
value, the length of the list, which is derived from the number of commas on
that line of the csv.
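
A minimal sketch of that behaviour, assuming a small in-memory stand-in for the file (io.StringIO) and the sample rows above. The skipinitialspace=True option is added here so that the blank after each comma does not end up in the values; the rows list is only there to show how to keep the data after the loop.

```python
import csv
import io

# Stand-in for an open file; the contents mirror the small example above.
sample = io.StringIO(
    "my_header_1, my_header_2\n"
    "111, 0001\n"
    "101, 1010\n"
)

# skipinitialspace=True drops the blank that follows each comma.
reader = csv.reader(sample, skipinitialspace=True)

rows = []
for row in reader:
    x = len(row)      # rebound on every pass; only the last value survives
    rows.append(row)  # keep the row itself if it is needed after the loop

print(rows)
print(x)
```

Every value comes back as a string, so numeric fields still need int() or float().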

Also note that the csv.reader speaks many dialects, and can do similar
work on files with different quote characters and delimiters.

Generally, I prefer to read in csv files with the standard readline()
method of open files. I do like to use csv.DictWriter, which helps me to
keep my csv output tabular.
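
For the csv.DictWriter remark, a rough sketch. The column names are an assumption (they follow the field names given later in the thread), and the output goes to an in-memory buffer rather than a real file.

```python
import csv
import io

# Assumed column names, one per value in each four-value group.
fieldnames = ["distance", "iterations", "zc_x", "zc_y"]

out = io.StringIO(newline="")  # stand-in for open(path, "w", newline="")
writer = csv.DictWriter(out, fieldnames=fieldnames)
writer.writeheader()
writer.writerow({
    "distance": "1.850358651774470E-0002",
    "iterations": 32,
    "zc_x": 22,
    "zc_y": 27,
})

text = out.getvalue()
print(text)
```

Each dictionary becomes one row, with the values placed under the matching header regardless of the order of the keys.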

-Kevin

> 
> This will get me an addressable array, as:
> 
> flt[0], flt[1], flt[350], etc...  from which values can be assigned to
> other variables, converted...
> 
> My question:  Is there a better way?  Do I need to learn more about
> how csv file are organized?  Perhaps I know far too little of Python
> to be attempting something like this, just yet.
> 
> Advice?
> 
> Jeff
>

#77861

From	Terry Reedy <tjreedy@udel.edu>
Date	2014-09-14 03:02 -0400
Message-ID	<mailman.14005.1410678164.18130.python-list@python.org>
In reply to	#77855

On 9/13/2014 9:34 PM, jetrn@newsguy.com wrote:
>
> Hello.  Back in the '80s, I wrote a fractal generator, which, over the years,
> I've modified/etc to run under Windows.  I've been an Assembly Language
> programmer for decades.  Recently, I decided to learn a new language,
> and decided on Python, and I just love it, and the various IDEs.
>
> Anyway, something I thought would be interesting, would be to export
> some data from my fractal program (I call it MXP), and write something
> in Python and its various scientific data analysis and plotting modules,
> and... well, see what's in there.

First you need to think about (and document) what your numbers mean and 
how they should be organized for analysis.

> An example of the data:
> 1.850358651774470E-0002

Why is this so much smaller than the next numbers?  Are all those digits 
significant, or are they mostly just noise, best dropped by 
rounding the number to a few significant digits?

> 32
> 22
> 27
> ... (this format repeats)

After exactly 3 numbers in this range?

> So, I wrote a procedure in MXP which converts "the data" and exports
> a csv file.

Answer the questions above before writing code.  CSV is likely not the 
best format to use.

-- 
Terry Jan Reedy

#77866

From	jayte <jetrn@newsguy.com>
Date	2014-09-14 12:56 -0400
Message-ID	<i5db1a9s5mvgsmg99q8kqf4l672fbrvp0n@4ax.com>
In reply to	#77861

On Sun, 14 Sep 2014 03:02:12 -0400, Terry Reedy <tjreedy@udel.edu> wrote:

>On 9/13/2014 9:34 PM, jetrn@newsguy.com wrote:

[...]

>First you need to think about (and document) what your numbers mean and 
>how they should be organized for analysis.
>
>> An example of the data:
>> 1.850358651774470E-0002
>
>Why is this so much smaller than the next numbers?  Are all those digits 
>significant, or are they mostly just noise, best dropped by 
>rounding the number to a few significant digits?

Sorry, I neglected to mention the values' significance.  The MXP program
uses the "distance estimate" algorithm in its fractal data generation.  The
values are thus, for each point in a 1778 x 1000 image:

Distance,   (an extended double)
Iterations,  (a 16 bit int)
zc_x,        (a 16 bit int)
zc_y         (a 16 bit int)

(During the "orbit" calculations, the result of each iteration will be positive
or negative.  The "zc" is a "zero crossing" count, or frequency, and is
used in the eventual coloring algorithm.  "Distance" can range from zero
to very small.)

>> 32
>> 22
>> 27
>> ... (this format repeats)
>
>After exactly 3 numbers in this range?

As one zooms into the image, Iterations and the "zc" values will increase.

>> So, I wrote a procedure in MXP which converts "the data" and exports
>> a csv file.
>
>Answer the questions above before writing code.  CSV is likely not the 
>best format to use.

Initially, I tried exporting the raw binary (hex) data, and no matter
what I tried, could not get Python to read it in any useful way (which
I attribute to my lack of knowledge in Python)

Anyway, thanks (everyone) for responding.  I'm very anxious to
try some data analysis (what I'm hoping, is to discover some new
approaches / enhancements to coloring, as I'm not convinced we've
seen all there is to see, from The Mandelbrot Set)

Jeff

#77867

From	Chris Angelico <rosuav@gmail.com>
Date	2014-09-15 03:10 +1000
Message-ID	<mailman.14009.1410714615.18130.python-list@python.org>
In reply to	#77866

On Mon, Sep 15, 2014 at 2:56 AM, jayte <jetrn@newsguy.com> wrote:
> Anyway, thanks (everyone) for responding.  I'm very anxious to
> try some data analysis (what I'm hoping, is to discover some new
> approaches / enhancements to coloring, as I'm not convinced we've
> seen all there is to see, from The Mandelbrot Set)

I agree! For one thing, I'd like to see a Mandelbrot set built out of
ice. Spiralling all around... and one thought crystallizes... *goes
off humming*

But seriously, yes. There's so much to be found in it. Python can
probably help you more easily than you think - reading the raw binary
isn't too hard. Maybe we can help you get it into a more useful data
structure. How's the file laid out?

ChrisA

#77869

From	Terry Reedy <tjreedy@udel.edu>
Date	2014-09-14 14:42 -0400
Message-ID	<mailman.14011.1410720202.18130.python-list@python.org>
In reply to	#77866

On 9/14/2014 12:56 PM, jayte wrote:
> On Sun, 14 Sep 2014 03:02:12 -0400, Terry Reedy <tjreedy@udel.edu> wrote:
>
>> On 9/13/2014 9:34 PM, jetrn@newsguy.com wrote:
>
> [...]
>
>> First you need to think about (and document) what your numbers mean and
>> how they should be organized for analysis.
>>
>>> An example of the data:
>>> 1.850358651774470E-0002
>>
>> Why is this so much smaller than the next numbers?  Are all those digits
>> significant, or are they mostly just noise, best dropped by
>> rounding the number to a few significant digits?
>
> Sorry, I neglected to mention the values' significance.  The MXP program
> uses the "distance estimate" algorithm in its fractal data generation.  The
> values are thus, for each point in a 1778 x 1000 image:
>
> Distance,   (an extended double)
> Iterations,  (a 16 bit int)
> zc_x,        (a 16 bit int)
> zc_y         (a 16 bit int)

> (During the "orbit" calculations, the result of each iteration will be positive
> or negative.  The "zc" is a "zero crossing" count, or frequency, and is
> used in the eventual coloring algorithm.  "Distance" can range from zero
> to very small.)
>
>>> 32
>>> 22
>>> 27

If you can output Distance as an 8 byte double in IEEE 754 binary64 
format, you could read a binary file with Python using the struct module.
https://docs.python.org/3/library/struct.html#module-struct
This would be the fastest way for output and subsequent input.
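
A minimal sketch of the struct approach. The record layout is an assumption: little-endian, no padding, one 8-byte double followed by three unsigned 16-bit integers ("H"; use "h" if the counts are signed). The two packed records stand in for the contents of a file opened with "rb".

```python
import struct

# Assumed layout: "<" little-endian without padding, "d" 8-byte double,
# "HHH" three unsigned 16-bit integers. That is 14 bytes per point.
RECORD = struct.Struct("<dHHH")

# Stand-in for f.read(); the second record holds made-up values.
blob = (RECORD.pack(1.850358651774470e-02, 32, 22, 27)
        + RECORD.pack(0.0, 40, 25, 30))

# iter_unpack walks the buffer one record at a time and yields tuples.
points = list(RECORD.iter_unpack(blob))

for distance, iterations, zc_x, zc_y in points:
    print(distance, iterations, zc_x, zc_y)
```

For a full 1778 x 1000 image this produces 1,778,000 Python tuples, so it is fine for a first look at the data but slow and memory-hungry compared with an array.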

If you want a text file either for human readability or because you 
cannot write a readable binary file, each line would have the 4 fields 
listed above (with image row/col implied by line number). I strongly 
suggest a fixed column format with spaces between the fields.  It might 
be easiest to put the float at the end instead of the beginning. 
Allowing max iterations = 999, your first line would look like

  32  22  27 1.850358651774470E-0002

for line in file:
    # split() yields strings; convert them with int() and float() as needed
    iternum, zc_x, zc_y, distance = line.split()

This will be faster for both you and the machine to read.
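
A short sketch of reading that fixed-column layout, with the conversions included. The io.StringIO object stands in for the open text file, and the second line is made up for illustration.

```python
import io

# Stand-in for open(path); one line per pixel, float last.
sample = io.StringIO(
    "  32  22  27 1.850358651774470E-0002\n"
    "  40  25  30 3.100000000000000E-0003\n"
)

records = []
for line in sample:
    # split() with no argument splits on any run of whitespace.
    iternum, zc_x, zc_y, distance = line.split()
    records.append((int(iternum), int(zc_x), int(zc_y), float(distance)))

print(records[0])
```

float() accepts the four-digit exponent form written by the exporting program, so the text needs no clean-up first.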

> Initially, I tried exporting the raw binary (hex) data,

'hex' usually means a hex string representation of the raw binary, so I 
am not sure what you actually did.

 > and no matter
> what I tried, could not get Python to read it in any useful way (which
> I attribute to my lack of knowledge in Python)

Again, see struct module.  Your mention of 'zooming' implies that you 
might want to write and read multiple multi-megabyte images in a single 
session.  If so, true binary would be best.

Another suggestion is to use the numpy package to read your image files 
into binary arrays (rather than arrays of Python number objects).  This 
would be much faster (and use less memory). Numpy arrays are 'standard' 
within Pythonland and can be used by scipy and other programs to display 
and analyze the arrays.

A third idea is to make your generator directly callable from Python. 
If you can put it in a dll, you could access it with ctypes.  If you can 
wrap it in a C program, you could use cython to make a Python extension 
module.

Now I have probably given you too much to think about, so time to stop ;-).

-- 
Terry Jan Reedy

#77870

From	jayte <jetrn@newsguy.com>
Date	2014-09-14 16:19 -0400
Message-ID	<v7sb1at62fr6h8ljnq0kdjhd9dddqoa8mc@4ax.com>
In reply to	#77869

On Sun, 14 Sep 2014 14:42:52 -0400, Terry Reedy <tjreedy@udel.edu> wrote:

>On 9/14/2014 12:56 PM, jayte wrote:
>> On Sun, 14 Sep 2014 03:02:12 -0400, Terry Reedy <tjreedy@udel.edu> wrote:

[...]

>If you can output Distance as an 8 byte double in IEEE 754 binary64 
>format, you could read a binary file with Python using the struct module.
>https://docs.python.org/3/library/struct.html#module-struct
>This would be the fastest way for output and subsequent input.

This isn't a problem... in fact, I had tried that, at one point, though
I don't remember what I tried to read the data with, in Python.

>If you want a text file either for human readability or because you 
>cannot write a readable binary file,

It was just to try the csv approach; couldn't get anything else
to work <g>

> each line would have the 4 fields 
>listed above (with image row/col implied by line number). I strongly 
>suggest a fixed column format with spaces between the fields.  It might 
>be easiest to put the float at the end instead of the beginning. 
>Allowing max iterations = 999, your first line would look like
>
>  32  22  27 1.850358651774470E-0002
>
>for line in file:
>    iternum, zc_x, zc_y, distance = line.split()
>
>This will be faster for both you and the machine to read.
>
>> Initially, I tried exporting the raw binary (hex) data,
>
>'hex' usually means a hex string representation of the raw binary, so I 
>am not sure what you actually did.

I just created a file from the buffer in memory, which held the unaltered
data.

> > and no matter
>> what I tried, could not get Python to read it in any useful way (which
>> I attribute to my lack of knowledge in Python)
>
>Again, see struct module.  Your mention of 'zooming' implies that you 
>might want to write and read multiple multi-megabyte images in a single 
>session.  If so, true binary would be best.

I absolutely will explore the struct module, thank you!

>Another suggestion is to use the numpy package to read your image files 
>into binary arrays (rather than arrays of Python number objects).  This 
>would be much faster (and use less memory). Numpy arrays are 'standard' 
>within Pythonland and can be used by scipy and other programs to display 
>and analyze the arrays.

This is the goal.  It's just... getting to point A.

>A third idea is to make your generator directly callable from Python. 
>If you can put it in a dll, you could access it with ctypes.  If you can 
>wrap it in a C program, you could use cython to make a Python extension 
>module.

This would be good, as I would also like to perform various analysis
operations on the "orbit" data, itself (the "zc" stuff is already one type,
but it is performed during the iterations), though the "orbit data" set
can grow to gigabytes, very quickly.  So, doing as you suggest would
make that *much* more feasible.

>Now I have probably given you too much to think about, so time to stop ;-).

That's one of the things I love about programming, there's *always*
too much to think about.

Seriously, your input is very appreciated - thank you.

Jeff

#77877

From	Peter Otten <__peter__@web.de>
Date	2014-09-15 09:29 +0200
Message-ID	<mailman.14017.1410766159.18130.python-list@python.org>
In reply to	#77866

jayte wrote:

> Sorry, I neglected to mention the values' significance.  The MXP program
> uses the "distance estimate" algorithm in its fractal data generation. 
> The values are thus, for each point in a 1778 x 1000 image:
> 
> Distance,   (an extended double)
> Iterations,  (a 16 bit int)
> zc_x,        (a 16 bit int)
> zc_y         (a 16 bit int)
> 

Probably a bit too early in your "Python career", but you can read raw data 
with numpy. Something like

with open(filename, "rb") as f:
    a = numpy.fromfile(f, dtype=[
        ("distance", "f16"),
        ("iterations", "i2"), 
        ("zc_x", "i2"),
        ("zc_y", "i2"),
    ]).reshape(1778, 1000)

might do, assuming "extended double" takes 16 bytes.

#77896

From	jayte <jetrn@newsguy.com>
Date	2014-09-15 12:33 -0400
Message-ID	<u53e1aha9hno97cpqd0ff6gt9otrtfjsof@4ax.com>
In reply to	#77877

On Mon, 15 Sep 2014 09:29:02 +0200, Peter Otten <__peter__@web.de> wrote:

>jayte wrote:
>
>> Sorry, I neglected to mention the values' significance.  The MXP program
>> uses the "distance estimate" algorithm in its fractal data generation. 
>> The values are thus, for each point in a 1778 x 1000 image:
>> 
>> Distance,   (an extended double)
>> Iterations,  (a 16 bit int)
>> zc_x,        (a 16 bit int)
>> zc_y         (a 16 bit int)
>> 
>
>Probably a bit too early in your "Python career",

Absolutely, just thought it would be interesting to start experimenting,
while learning (plus, can't help but be anxious) <g>

> but you can read raw data 
>with numpy. Something like
>
>with open(filename, "rb") as f:
>    a = numpy.fromfile(f, dtype=[
>        ("distance", "f16"),
>        ("iterations", "i2"), 
>        ("zc_x", "i2"),
>        ("zc_y", "i2"),
>    ]).reshape(1778, 1000)
>
>might do, assuming "extended double" takes 16 bytes.

Will try.  Double extended precision is ten bytes, but I assume
changing  the "f16" to "f10" would account for that...

Jeff

#77927

From	Peter Otten <__peter__@web.de>
Date	2014-09-16 13:22 +0200
Message-ID	<mailman.14051.1410866541.18130.python-list@python.org>
In reply to	#77896

jayte wrote:

> On Mon, 15 Sep 2014 09:29:02 +0200, Peter Otten <__peter__@web.de> wrote:
> 
>>jayte wrote:
>>
>>> Sorry, I neglected to mention the values' significance.  The MXP program
>>> uses the "distance estimate" algorithm in its fractal data generation.
>>> The values are thus, for each point in a 1778 x 1000 image:
>>> 
>>> Distance,   (an extended double)
>>> Iterations,  (a 16 bit int)
>>> zc_x,        (a 16 bit int)
>>> zc_y         (a 16 bit int)
>>> 
>>
>>Probably a bit too early in your "Python career",
> 
> Absolutely, just thought it would be interesting to start experimenting,
> while learning (plus, can't help but be anxious) <g>
> 
>> but you can read raw data
>>with numpy. Something like
>>
>>with open(filename, "rb") as f:
>>    a = numpy.fromfile(f, dtype=[
>>        ("distance", "f16"),
>>        ("iterations", "i2"),
>>        ("zc_x", "i2"),
>>        ("zc_y", "i2"),
>>    ]).reshape(1778, 1000)
>>
>>might do, assuming "extended double" takes 16 bytes.
> 
> Will try.  Double extended precision is ten bytes, but I assume
> changing  the "f16" to "f10" would account for that...

Unfortunately it seems that numpy doesn't support "f10" 

>>> numpy.dtype("f8")
dtype('float64')
>>> numpy.dtype("f16")
dtype('float128')
>>> numpy.dtype("f10")
dtype('float32') # looks strange to me

But you'd better ask for confirmation (and possible workarounds) in a 
specialist forum.
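
One possible workaround, offered as a sketch rather than a confirmed numpy recipe: read the ten bytes as two integer fields and rebuild the value by hand. It assumes the Intel x87 80-bit layout (little-endian, 64-bit significand with an explicit integer bit, then a 16-bit word holding the sign and a 15-bit exponent biased by 16383) and no padding between fields. The field names "mant" and "sign_exp" and the helper extended_to_float64 are invented for this example, and infinities and NaNs are not handled.

```python
import struct
import numpy as np

# Assumed 16-byte record: the 10-byte extended value split into two
# integer fields, followed by the three 16-bit counts.
record = np.dtype([
    ("mant", "<u8"),        # 64-bit significand, explicit integer bit
    ("sign_exp", "<u2"),    # sign bit plus 15-bit biased exponent
    ("iterations", "<u2"),
    ("zc_x", "<u2"),
    ("zc_y", "<u2"),
])

def extended_to_float64(mant, sign_exp):
    """Rebuild float64 values from x87 extended fields (no inf/NaN handling)."""
    exp = (sign_exp & 0x7FFF).astype(np.int32)
    exp = np.where(exp == 0, 1, exp).astype(np.int32)  # zero and denormals
    sign = np.where((sign_exp & 0x8000) != 0, -1.0, 1.0)
    # value = sign * mant * 2**(exp - 16383 - 63)
    return sign * np.ldexp(mant.astype(np.float64), exp - (16383 + 63))

# Two hand-built records standing in for file contents: 1.0 and -2.5.
blob = (struct.pack("<QHHHH", 0x8000000000000000, 0x3FFF, 32, 22, 27)
        + struct.pack("<QHHHH", 0xA000000000000000, 0xC000, 40, 25, 30))

a = np.frombuffer(blob, dtype=record)  # numpy.fromfile(f, dtype=record) for a file
distance = extended_to_float64(a["mant"], a["sign_exp"])
print(distance)
```

The result is still float64, so the extra significand bits are rounded away; the gain is only that the exporting program can write its buffer unchanged.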

#77940

From	jayte <jetrn@newsguy.com>
Date	2014-09-16 14:03 -0400
Message-ID	<f1og1a10rrnnigus6mlfk7gvqp6lnoeohk@4ax.com>
In reply to	#77927

On Tue, 16 Sep 2014 13:22:02 +0200, Peter Otten <__peter__@web.de> wrote:

>jayte wrote:

[...]

>> Will try.  Double extended precision is ten bytes, but I assume
>> changing  the "f16" to "f10" would account for that...
>
>Unfortunately it seems that numpy doesn't support "f10" 

I noticed that, and it *is* unfortunate, as this has gotten me closer
than anything, to reading the raw data.
>
>>>> numpy.dtype("f8")
>dtype('float64')

I could, as far as that goes, convert to double, though I'd be giving
up 16 bits of precision

>>>> numpy.dtype("f16")
>dtype('float128')
>>>> numpy.dtype("f10")
>dtype('float32') # looks strange to me
>
>But you better ask for confirmation (and possible workarounds) in a 
>specialist forum.

Another possibility, would be to write a module that reads double
extended...  

At any rate, thanks (to everyone) for all your help,

Jeff

#78187 — Works perfectly (was Re: CSV methodology)

From	jayte <jetrn@newsguy.com>
Date	2014-09-22 20:27 -0400
Subject	Works perfectly (was Re: CSV methodology)
Message-ID	<otl02apfhfao8uq4ci11l7qnprg70lk49j@4ax.com>
In reply to	#77927

On Tue, 16 Sep 2014 13:22:02 +0200, Peter Otten <__peter__@web.de> wrote:

>jayte wrote:
>
>> On Mon, 15 Sep 2014 09:29:02 +0200, Peter Otten <__peter__@web.de> wrote:

[...]

>>> but you can read raw data
>>>with numpy. Something like
>>>
>>>with open(filename, "rb") as f:
>>>    a = numpy.fromfile(f, dtype=[
>>>        ("distance", "f16"),
>>>        ("iterations", "i2"),
>>>        ("zc_x", "i2"),
>>>        ("zc_y", "i2"),
>>>    ]).reshape(1778, 1000)
>>>
>>>might do, assuming "extended double" takes 16 bytes.
>> 
>> Will try.  Double extended precision is ten bytes, but I assume
>> changing  the "f16" to "f10" would account for that...
>
>Unfortunately it seems that numpy doesn't support "f10" 

Thus far, this appears to work perfectly:

with open(filename, "rb") as f:
    a = numpy.fromfile(f, dtype=[
        ("distance", "f8"),
        ("iterations", "u2"),
        ("zc_x", "u2"),
        ("zc_y", "u2"),
    ])

# (no close call is needed here: the with block closes f on exit)

d = a["distance"]
i = a["iterations"]
x = a["zc_x"]
y = a["zc_y"]

(except, of course, for the loss of precision)

"reshape()" does not appear to be necessary; the various variables
d, i, x, y return a len() of 1778000
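
A self-contained version of the same read, as a sketch: it first writes a tiny made-up data set in the same 14-byte layout so that there is something to read back. The file name, the 4 x 3 size and the values are all placeholders.

```python
import os
import tempfile
import numpy as np

# Same layout as above: 8-byte double plus three unsigned 16-bit ints.
point = np.dtype([
    ("distance", "f8"),
    ("iterations", "u2"),
    ("zc_x", "u2"),
    ("zc_y", "u2"),
])

# Made-up stand-in for the exported image: 4 x 3 points.
w, h = 4, 3
data = np.zeros(w * h, dtype=point)
data["distance"] = np.linspace(0.0, 0.02, w * h)
data["iterations"] = np.arange(w * h)

path = os.path.join(tempfile.mkdtemp(), "block.bin")  # placeholder name
data.tofile(path)

with open(path, "rb") as f:
    a = np.fromfile(f, dtype=point)
# f is already closed at this point; the with block took care of it.

d = a["distance"]
i = a["iterations"]
print(len(d), point.itemsize)
```

Each field access such as a["distance"] is a view onto the one flat array, which is why every field reports the same length.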

Again, thank you very much,

Jeff

#78198 — Re: Works perfectly (was Re: CSV methodology)

From	Peter Otten <__peter__@web.de>
Date	2014-09-23 09:59 +0200
Subject	Re: Works perfectly (was Re: CSV methodology)
Message-ID	<mailman.14247.1411459220.18130.python-list@python.org>
In reply to	#78187

jayte wrote:

> On Tue, 16 Sep 2014 13:22:02 +0200, Peter Otten <__peter__@web.de> wrote:
> 
>>jayte wrote:
>>
>>> On Mon, 15 Sep 2014 09:29:02 +0200, Peter Otten <__peter__@web.de>
>>> wrote:
> 
> [...]
> 
>>>> but you can read raw data
>>>>with numpy. Something like
>>>>
>>>>with open(filename, "rb") as f:
>>>>    a = numpy.fromfile(f, dtype=[
>>>>        ("distance", "f16"),
>>>>        ("iterations", "i2"),
>>>>        ("zc_x", "i2"),
>>>>        ("zc_y", "i2"),
>>>>    ]).reshape(1778, 1000)
>>>>
>>>>might do, assuming "extended double" takes 16 bytes.
>>> 
>>> Will try.  Double extended precision is ten bytes, but I assume
>>> changing  the "f16" to "f10" would account for that...
>>
>>Unfortunately it seems that numpy doesn't support "f10"
> 
> Thus far, this appears to work perfectly:
> 
> with open(filename, "rb") as f:
>     a = numpy.fromfile(f, dtype=[
>         ("distance", "f8"),
>         ("iterations", "u2"),
>         ("zc_x", "u2"),
>         ("zc_y", "u2"),
>     ])
> 
> file.close(f)
> 
> d = a["distance"]
> i = a["iterations"]
> x = a["zc_x"]
> y = a["zc_y"]
> 
> (except, of course, for the loss of precision)
> 
> "reshape()" does not appear to be necessary; the various variables
> d, i, x, y return a len() of 1778000

Assuming adjacent pixels are in the same row after 

b = a.reshape(h, w).T 

you can access a pixel as

b[x, y] # without the .T (transposition) it would be b[y, x]

instead of

a[y*w + x]

and a square of 9 pixels with

b[left:left+3, top:top+3]
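
The indexing above can be checked on a toy array. This sketch assumes the data is stored row by row, and uses a 4 x 3 grid of plain integers in place of the real 1778 x 1000 records.

```python
import numpy as np

w, h = 4, 3              # tiny stand-in for the real image size
a = np.arange(w * h)     # flat data: a[k] is row k // w, column k % w

b = a.reshape(h, w).T    # shape (w, h): first index is x, second is y

x, y = 2, 1
print(b[x, y], a[y * w + x])   # the same element, reached two ways

left, top = 1, 0
block = b[left:left + 3, top:top + 3]   # 3 x 3 neighbourhood
print(block.shape)
```

Neither reshape nor .T copies the data here; both return views, so this costs almost nothing even for a large image.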

> Again, thank you very much,
> 
> Jeff

#77862

From	Cameron Simpson <cs@zip.com.au>
Date	2014-09-14 18:38 +1000
Message-ID	<mailman.14006.1410683932.18130.python-list@python.org>
In reply to	#77855

On 13Sep2014 21:34, jetrn@newsguy.com <jetrn@newsguy.com> wrote:
>Hello.  Back in the '80s, I wrote a fractal generator, [...]
>Anyway, something I thought would be interesting, would be to export
>some data from my fractal program (I call it MXP), and write something
>in Python and its various scientific data analysis and plotting modules,
>and... well, see what's in there.
>
>An example of the data:
>1.850358651774470E-0002
>32
>22
>27
>... (this format repeats)
>
>So, I wrote a procedure in MXP which converts "the data" and exports
>a csv file.  So far, here's what I've started with:

Normally a CSV file will have multiple values per row. Echoing Terry, what 
shape did you intend your CSV data to be? i.e. what values appear on a row?

>import csv
>fname = 'E:/Users/jayte/Documents/Python Scripts/XportTestBlock.csv'
>f = open(fname)
>reader = csv.reader(f)
>for flt in reader:
>    x = len(flt)
>file.close(f)
>
>This will get me an addressable array, as:
>
>flt[0], flt[1], flt[350], etc...  from which values can be assigned to
>other variables, converted...
>
>My question:  Is there a better way?  Do I need to learn more about
>how csv file are organized?  Perhaps I know far too little of Python
>to be attempting something like this, just yet.

If you have a nice regular CSV file, with say 3 values per row, you can go:

   reader = csv.reader(f)
   for row in reader:
       a, b, c - row

and proceed with a, b and c directly from there. But of course, that requires 
your export format to be usable that way.

Cheers,
Cameron Simpson <cs@zip.com.au>

For a good prime, call:  391581 * 2^216193 -1

#77863

From	Rustom Mody <rustompmody@gmail.com>
Date	2014-09-14 01:56 -0700
Message-ID	<85cca2f1-47e3-490c-92e8-66c0461cf8ff@googlegroups.com>
In reply to	#77862

On Sunday, September 14, 2014 2:09:51 PM UTC+5:30, Cameron Simpson wrote:

> If you have a nice regular CSV file, with say 3 values per row, you can go:

>    reader = csv.reader(f)
>    for row in reader:
>        a, b, c - row

I guess you meant:  a, b, c = row  
?

Also you will want to do appropriate type conversions up front,
e.g. if the first field is a string, the second an int and the third a float,

something like

a,b,c = row[0], int(row[1]), float(row[2])
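
Applied to the data in this thread, a sketch might look like this. It assumes the export is changed to write one four-value group per row (a float followed by three ints); the io.StringIO object stands in for the open file and the second row is made up.

```python
import csv
import io

# Stand-in for open(fname); one point per row.
sample = io.StringIO(
    "1.850358651774470E-0002,32,22,27\n"
    "2.5E-0003,40,25,30\n"
)

points = []
for row in csv.reader(sample):
    # csv.reader hands back strings, so convert each field explicitly.
    distance = float(row[0])
    iterations, zc_x, zc_y = (int(field) for field in row[1:])
    points.append((distance, iterations, zc_x, zc_y))

print(points)
```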

More generally, as Terry said, csv is not necessarily an appropriate standard
(because there is no standard!!)
If input/output to/from a spreadsheet is required, that's OK.

Else there are dozens of others.
My favorite is YAML.
JSON is close enough and is getting more traction nowadays.
Then of course there's XML, which everyone loves to hate; for reasons similar
to csv, it may be required.

#77871

From	Cameron Simpson <cs@zip.com.au>
Date	2014-09-15 09:28 +1000
Message-ID	<mailman.14012.1410737288.18130.python-list@python.org>
In reply to	#77863

On 14Sep2014 01:56, rusi <rustompmody@gmail.com> wrote:
>On Sunday, September 14, 2014 2:09:51 PM UTC+5:30, Cameron Simpson wrote:
>
>> If you have a nice regular CSV file, with say 3 values per row, you can go:
>
>>    reader = csv.reader(f)
>>    for row in reader:
>>        a, b, c - row
>
>I guess you meant:  a, b, c = row
>?

Yeah :-(

Cheers,
Cameron Simpson <cs@zip.com.au>

#77876

From	Akira Li <4kir4.1i@gmail.com>
Date	2014-09-15 11:12 +0400
Message-ID	<mailman.14016.1410765161.18130.python-list@python.org>
In reply to	#77855

jetrn@newsguy.com writes:

> Hello.  Back in the '80s, I wrote a fractal generator, which, over the years,
> I've modified/etc to run under Windows.  I've been an Assembly Language
> programmer for decades.  Recently, I decided to learn a new language,
> and decided on Python, and I just love it, and the various IDEs.
>
> Anyway, something I thought would be interesting, would be to export
> some data from my fractal program (I call it MXP), and write something
> in Python and its various scientific data analysis and plotting modules,
> and... well, see what's in there.
>

Tools that are worth mentioning: ipython notebook, pandas

For example,

http://nbviewer.ipython.org/github/twiecki/financial-analysis-python-tutorial/blob/master/1.%20Pandas%20Basics.ipynb


--
Akira

#77897

From	pH <high@cidity.level>
Date	2014-09-15 12:40 -0400
Message-ID	<ta5e1at43a4ot9jj376tg5vapj342o4gfb@4ax.com>
In reply to	#77876

On Mon, 15 Sep 2014 11:12:27 +0400, Akira Li <4kir4.1i@gmail.com> wrote:

[...]

>Tools that are worth mentioning: ipython notebook, pandas
>
>For example,
>
>http://nbviewer.ipython.org/github/twiecki/financial-analysis-python-tutorial/blob/master/1.%20Pandas%20Basics.ipynb

Thanks, Akira.  When I started my journey into Python,
I installed the Anaconda distribution, which seems to have included
a very nice assortment.

Jeff

#77885

From	Dave Angel <davea@davea.name>
Date	2014-09-15 09:29 -0400
Message-ID	<mailman.14025.1410787674.18130.python-list@python.org>
In reply to	#77855

jetrn@newsguy.com Wrote in message:
> 
> Hello.  Back in the '80s, I wrote a fractal generator, which, over the years,
> I've modified/etc to run under Windows.  I've been an Assembly Language
> programmer for decades.  Recently, I decided to learn a new language,
> and decided on Python, and I just love it, and the various IDEs.
> 
> Anyway, something I thought would be interesting, would be to export
> some data from my fractal program (I call it MXP), and write something
> in Python and its various scientific data analysis and plotting modules,
> and... well, see what's in there.
> 
> An example of the data:
> 1.850358651774470E-0002
> 32
> 22
> 27
> ... (this format repeats)
> 
> So, I wrote a procedure in MXP which converts "the data" and exports
> a csv file.  So far, here's what I've started with:
> 
> -----------------------------------------------
> import csv
> 
> fname = 'E:/Users/jayte/Documents/Python Scripts/XportTestBlock.csv'
> 
> f = open(fname)
> 
> reader = csv.reader(f)
> 
> for flt in reader:
>     x = len(flt)
> file.close(f)
> -----------------------------------------------
> 
> This will get me an addressable array, as:
> 
> flt[0], flt[1], flt[350], etc...  from which values can be assigned to
> other variables, converted...
> 
> My question:  Is there a better way?  Do I need to learn more about
> how csv file are organized?  Perhaps I know far too little of Python
> to be attempting something like this, just yet.
> 
> 

Looks to me like your MXP has produced a single line file, with
 all the values on that single line separated by commas. If the
 data is really uniform,  then it'd be more customary to put one
 item per line. But your sample seems to imply the data is a float
 followed by 3 ints. If so, then I'd expect to see a line for each
 group of 4.

The only advantage of a csv is if the data is rectangular.  If
 it's really a single column, it should be one per line,  and
 you'd use readline instead. 
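
If the export stays as it is (everything on one comma-separated line), the values can still be regrouped after reading. A sketch, assuming the values really do repeat as a float followed by three ints; the sample line holds two such groups, the second one made up.

```python
import csv
import io

# Stand-in for the single-line export: two groups of four values.
sample = io.StringIO("1.850358651774470E-0002,32,22,27,2.5E-0003,40,25,30\n")

flat = next(csv.reader(sample))   # the whole file arrives as one row

# Cut the flat list into consecutive slices of four values each.
groups = [flat[k:k + 4] for k in range(0, len(flat), 4)]

points = [(float(d), int(n), int(zx), int(zy)) for d, n, zx, zy in groups]
print(points)
```

For a 1778 x 1000 image that single row holds over seven million strings, which is one more reason to prefer one group per line or a binary file.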

-- 
DaveA

#77898

From	jayte <jetrn@newsguy.com>
Date	2014-09-15 12:53 -0400
Message-ID	<uk5e1a9qpnc3tpako2khqao5bqmssuk4bq@4ax.com>
In reply to	#77885

On Mon, 15 Sep 2014 09:29:48 -0400 (EDT), Dave Angel <davea@davea.name> wrote:

[...]

>Looks to me like your MXP has produced a single line file, with
> all the values on that single line separated by commas.

Yes, that's exactly right.

> If the
> data is really uniform,  then it'd be more customary to put one
> item per line. But your sample seems to imply the data is a float
> followed by 3 ints. If so, then I'd expect to see a line for each
> group of 4.

See, that's why I figured there was something I was missing with
regard to csv files, in general.  Specifically, what the end-of-line
character would be;  traditional CR/LF pair, or... something else.
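
On the end-of-line question: Python's csv.writer ends each row with '\r\n' by default (the lineterminator setting of the default dialect), and csv.reader accepts either '\r\n' or '\n'. A small sketch, using an in-memory buffer in place of a real file and made-up values for the second row:

```python
import csv
import io

out = io.StringIO(newline="")   # newline="" leaves line endings untouched
writer = csv.writer(out)
writer.writerow(["1.850358651774470E-0002", 32, 22, 27])
writer.writerow(["2.5E-0003", 40, 25, 30])

text = out.getvalue()
print(repr(text))               # each row ends in a CR/LF pair

# Reading it back: the reader copes with the CR/LF endings on its own.
rows = list(csv.reader(io.StringIO(text, newline="")))
print(rows)
```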

>The only advantage of a csv is if the data is rectangular.  If
> it's really a single column, it should be one per line,  and
> you'd use readline instead. 

Thank you very much for your input.

Jeff
