| From | William Rutiser <wruyahoo05@comcast.net> |
|---|---|
| Newsgroups | comp.lang.ruby |
| Subject | Re: Binary file: SAT |
| Date | 2011-04-23 13:20 -0500 |
| Organization | Service de news de lacave.net |
| Message-ID | <4DB31887.20803@comcast.net> |
| References | (3 earlier) <BANLkTimZvkXz=7bTAcdnwdU2AsH6MzYd8w@mail.gmail.com> <2cddb6943b03d69eca28d3dffeba1374@ruby-forum.com> <b1b002b0b933deeeb92ca664c4f7b79f@ruby-forum.com> <a9a17c857d094cdaea68f811a6bdfa1c@ruby-forum.com> <d05142eefdc15ef8496889ba2fdc2918@ruby-forum.com> |
On 2011-04-22 2:49 PM, 7stud -- wrote:
> Alessandro Barracco wrote in post #994473:
>>> Do not think of binary files as containing lines. A binary file is a
>>> long continuous sequence of integers contained in a varying number of
>>> bytes.
>> That's OK. but the file I need to parse is a special txt file (DXF
>> format) that consist of couple-of-line:
> Binary files do not have lines. Until you can understand that, you
> cannot proceed. Binary files consist of blocks of bytes. Each block
> contains some data. Each block consists of a different number of bytes.
>

It's not too helpful to someone trying to deal with DXF files to make
such a strong distinction between binary and text files. I haven't
worked with them and hope I never have to. A quick look at the Wikipedia
article and the most recent AutoCAD spec suggests that the files may be
best thought of as a mixture of binary and ASCII data.

The original DXF files were text files where each line was a key-value
pair, with the value generally a decimal representation of a floating
point number. There is now an optional file format that contains binary
representations of the numbers, to reduce precision losses caused by
repeated conversions and to save some space. Most of the 270-page
specification appears to describe the ASCII format, with the binary
format introduced on page 242.

You can get a recent DXF spec at:
http://images.autodesk.com/adsk/files/autocad_2012_pdf_dxf-reference_enu.pdf

This may give a helpful overview:
http://en.wikipedia.org/wiki/Dxf

Alessandro's problem is to read and parse a file that contains small
fields to be interpreted as ASCII text, binary integers, floating point
numbers, etc. Just what will come next is determined by what came just
before, with reference to a 270-page document which has a few examples
in Visual Basic 6.

I would proceed as follows:

* Figure out which kinds of primitive data are expected in the files of
  interest.
* For each kind, write and test a function to read and convert one such
  item.
* Write a function to read the next entity record from the file. It's
  likely that this function should return a Ruby object that represents
  the particular kind of entity.

The ACIS spec says "The header is followed by a sequence of entity
records. Each entity record consists of a sequence number (optional), an
entity type identifier, the entity data, and a terminator."

So to read an entity record, first read the sequence number if present,
then read the type identifier. The type identifier should be used to
select an appropriate function to read the data part of the entity
record. Then read the terminator, unless it was already used to end the
entity data.

Essential tools:

* Something to examine and print pieces of the data in hexadecimal. Use
  this to explore the data and resolve questions about byte order,
  number encoding, etc.
* Ruby's Array#pack and String#unpack methods.
* Possibly an assortment of colored pencils to mark up printed hex dumps
  of the data.

There may be some Ruby tools specifically intended for this kind of
work.

Caveat: I may have written more than I know about some of the details,
but I think the general ideas are correct.

-- Bill
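The steps described in the post above (a hex dump for exploration, one small reader per primitive type, and a record reader that dispatches on the type identifier) might be sketched roughly as follows. This is only an illustration under invented assumptions, not the real DXF or SAT layout: the little-endian byte order, the one-byte type tags (1 and 2), the length-prefixed string, the `#` terminator byte, and all the names (`hex_dump`, `read_entity`, `Point`, `Label`, `READERS`) are hypothetical. The actual encodings have to be taken from the spec and confirmed against hex dumps of real files.

```ruby
require "stringio"

# Print-friendly hex dump: offset, hex bytes, printable ASCII.
def hex_dump(data, width = 16)
  data.bytes.each_slice(width).with_index.map do |chunk, i|
    hex  = chunk.map { |b| format("%02x", b) }.join(" ")
    text = chunk.map { |b| (32..126).cover?(b) ? b.chr : "." }.join
    format("%08x  %-#{width * 3 - 1}s  %s", i * width, hex, text)
  end.join("\n")
end

# Read exactly n bytes or fail loudly; short reads usually mean a wrong guess
# about the layout.
def read_exact(io, n)
  bytes = io.read(n)
  raise EOFError, "wanted #{n} bytes" if bytes.nil? || bytes.bytesize < n
  bytes
end

# One reader per primitive kind. Byte order is an ASSUMPTION (little-endian).
def read_uint8(io)
  read_exact(io, 1).unpack("C").first
end

def read_int32(io)
  read_exact(io, 4).unpack("l<").first   # signed 32-bit, little-endian
end

def read_float64(io)
  read_exact(io, 8).unpack("E").first    # IEEE double, little-endian
end

# Hypothetical string encoding: one length byte, then that many ASCII bytes.
def read_string(io)
  len = read_uint8(io)
  read_exact(io, len).force_encoding(Encoding::US_ASCII)
end

# Ruby objects representing the (made-up) entity kinds.
Point = Struct.new(:x, :y)
Label = Struct.new(:text)

# The type identifier selects the function that reads the entity data.
READERS = {
  1 => ->(io) { Point.new(read_float64(io), read_float64(io)) },
  2 => ->(io) { Label.new(read_string(io)) }
}

TERMINATOR = 0x23 # "#", a hypothetical terminator byte

def read_entity(io)
  tag    = read_uint8(io)
  reader = READERS.fetch(tag) { raise ArgumentError, "unknown entity type #{tag}" }
  entity = reader.call(io)
  raise ArgumentError, "missing terminator" unless read_uint8(io) == TERMINATOR
  entity
end

# Usage: build two sample records with Array#pack, dump them, read them back.
sample = [1].pack("C") + [1.5, -2.0].pack("EE") + "#" +
         [2, 2].pack("CC") + "hi" + "#"
puts hex_dump(sample)
io = StringIO.new(sample)
p read_entity(io)
p read_entity(io)
```

Keeping the readers in a table keyed by type identifier means a new entity kind only needs one more entry, and raising on a short read or a missing terminator tends to expose a wrong assumption about the layout early.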
Binary file: SAT Alessandro Barracco <bomastudio@gmail.com> - 2011-04-20 17:00 -0500
Re: Binary file: SAT 7stud -- <bbxx789_05ss@yahoo.com> - 2011-04-20 19:02 -0500
Re: Binary file: SAT 7stud -- <bbxx789_05ss@yahoo.com> - 2011-04-20 19:45 -0500
Re: Binary file: SAT Roger Braun <roger@rogerbraun.net> - 2011-04-20 21:53 -0500
Re: Binary file: SAT Alessandro Barracco <bomastudio@gmail.com> - 2011-04-21 03:06 -0500
Re: Binary file: SAT 7stud -- <bbxx789_05ss@yahoo.com> - 2011-04-21 12:25 -0500
Re: Binary file: SAT Alessandro Barracco <bomastudio@gmail.com> - 2011-04-22 04:51 -0500
Re: Binary file: SAT 7stud -- <bbxx789_05ss@yahoo.com> - 2011-04-22 13:49 -0500
Re: Binary file: SAT William Rutiser <wruyahoo05@comcast.net> - 2011-04-23 13:20 -0500
Re: Binary file: SAT 7stud -- <bbxx789_05ss@yahoo.com> - 2011-04-21 12:53 -0500