


Re: CSV methodology

From Terry Reedy <tjreedy@udel.edu>
Subject Re: CSV methodology
Date 2014-09-14 14:42 -0400
References <b2q91ada6b59ept81ac65vtnnu6sdklp1h@4ax.com> <mailman.14005.1410678164.18130.python-list@python.org> <i5db1a9s5mvgsmg99q8kqf4l672fbrvp0n@4ax.com>
Newsgroups comp.lang.python
Message-ID <mailman.14011.1410720202.18130.python-list@python.org> (permalink)



On 9/14/2014 12:56 PM, jayte wrote:
> On Sun, 14 Sep 2014 03:02:12 -0400, Terry Reedy <tjreedy@udel.edu> wrote:
>
>> On 9/13/2014 9:34 PM, jetrn@newsguy.com wrote:
>
> [...]
>
>> First you need to think about (and document) what your numbers mean and
>> how they should be organized for analysis.
>>
>>> An example of the data:
>>> 1.850358651774470E-0002
>>
>> Why is this so much smaller than the next numbers?  Are all those digits
>> significant, or are they mostly just noise -- and best dropped by
>> rounding the number to a few significant digits?
>
> Sorry, I neglected to mention the values' significance.  The MXP program
> uses the "distance estimate" algorithm in its fractal data generation.  The
> values are thus, for each point in a 1778 x 1000 image:
>
> Distance,   (an extended double)
> Iterations,  (a 16 bit int)
> zc_x,        (a 16 bit int)
> zc_y         (a 16 bit int)
>
> (During the "orbit" calculations, the result of each iteration will be positive
> or negative.  The "zc" is a "zero crossing" count, or frequency, and is
> used in the eventual coloring algorithm.  "Distance" can range from zero
> to very small.)
>
>>> 32
>>> 22
>>> 27

If you can output Distance as an 8 byte double in IEEE 754 binary64 
format, you could read a binary file with Python using the struct module.
https://docs.python.org/3/library/struct.html#module-struct
This would be the fastest way for output and subsequent input.
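
As a rough sketch of the struct route: the record layout below (little-endian, one binary64 Distance followed by three unsigned 16-bit ints, no padding) is an assumption about what the generator could write, not something MXP is known to produce. The names RECORD, write_records and read_records are made up for illustration.

```python
import struct

# Assumed layout: "<" = little-endian with no padding,
# "d" = 8-byte IEEE 754 binary64, "H" = unsigned 16-bit int.
RECORD = struct.Struct("<dHHH")  # 8 + 2 + 2 + 2 = 14 bytes per point


def write_records(path, records):
    """Write (distance, iterations, zc_x, zc_y) tuples as packed binary."""
    with open(path, "wb") as f:
        for rec in records:
            f.write(RECORD.pack(*rec))


def read_records(path):
    """Return a list of (distance, iterations, zc_x, zc_y) tuples."""
    with open(path, "rb") as f:
        data = f.read()
    # iter_unpack requires len(data) to be a multiple of RECORD.size.
    return list(RECORD.iter_unpack(data))
```

If the 16-bit fields are signed, "h" would replace "H"; if the generator pads or uses big-endian output, the format string has to change to match.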

If you want a text file either for human readability or because you 
cannot write a readable binary file, each line would have the 4 fields 
listed above (with image row/col implied by line number). I strongly 
suggest a fixed column format with spaces between the fields.  It might 
be easiest to put the float at the end instead of the beginning.
Allowing max iterations = 999, your first line would look like

  32  22  27 1.850358651774470E-0002

for line in file:
    # split() with no argument splits on runs of whitespace
    iternum, zc_x, zc_y, distance = line.split()

This will be faster for both you and the machine to read.
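
The split gives strings, so each field still needs converting. A minimal sketch, assuming one point per line in the field order shown above and points written row by row (the row-major order and the names parse_line and read_points are assumptions for illustration):

```python
def parse_line(line):
    """Split one whitespace-separated record into typed fields."""
    iternum, zc_x, zc_y, distance = line.split()
    # float() accepts the zero-padded exponent form, e.g. "E-0002".
    return int(iternum), int(zc_x), int(zc_y), float(distance)


def read_points(lines, width=1778):
    """Yield (row, col, fields), with row and col implied by line number."""
    for n, line in enumerate(lines):
        row, col = divmod(n, width)
        yield row, col, parse_line(line)
```

Passing an open text file as lines works, since iterating over a file yields one line at a time.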

> Initially, I tried exporting the raw binary (hex) data,

'hex' usually means a hex string representation of the raw binary, so I 
am not sure what you actually did.

> and no matter
> what I tried, could not get Python to read it in any useful way (which
> I attribute to my lack of knowledge in Python)

Again, see struct module.  Your mention of 'zooming' implies that you 
might want to write and read multiple multi-megabyte images in a single 
session.  If so, true binary would be best.

Another suggestion is to use the numpy package to read your image files 
into binary arrays (rather than arrays of Python number objects).  This 
would be much faster (and use less memory). Numpy arrays are 'standard' 
within Pythonland and can be used by scipy and other programs to display 
and analyze the arrays.
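
A hedged sketch of the numpy route, assuming the same packed little-endian record as a raw binary file would use (one binary64 plus three unsigned 16-bit ints) and points stored row by row. The dtype field names and load_image are illustrative, not anything MXP defines.

```python
import numpy as np

# Assumed packed layout; numpy does not pad structured dtypes by default.
POINT = np.dtype([
    ("distance", "<f8"),    # little-endian binary64
    ("iterations", "<u2"),  # little-endian unsigned 16-bit
    ("zc_x", "<u2"),
    ("zc_y", "<u2"),
])


def load_image(path, rows=1000, cols=1778):
    """Read a whole image into a (rows, cols) structured array."""
    points = np.fromfile(path, dtype=POINT)
    # reshape raises ValueError if the file does not hold rows * cols points.
    return points.reshape(rows, cols)
```

Each field is then available as an ordinary 2-D array, for example load_image(path)["distance"].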

A third idea is to make your generator directly callable from Python. 
If you can put it in a DLL, you could access it with ctypes.  If you can
wrap it in a C program, you could use Cython to make a Python extension
module.

Now I have probably given you too much to think about, so time to stop ;-).

-- 
Terry Jan Reedy
