Re: Does the theory and algorithms of compiler design also apply to data formats?

From "matt.ti...@gmail.com" <matt.timmermans@gmail.com>
Newsgroups comp.compilers
Subject Re: Does the theory and algorithms of compiler design also apply to data formats?
Date 2022-01-23 06:58 -0800
Organization Compilers Central
Message-ID <22-01-104@comp.compilers>
References <22-01-100@comp.compilers>


On Saturday, 22 January 2022 at 20:54:52 UTC-5, Roger L Costello wrote:
> Hello Compiler Experts!
>
> The books that I've read always talk about applying compiler theory and
> algorithms to programming languages. But there are other kinds of languages
> such as XML, JSON, Comma-Separated-Values (CSV). And aren't data formats such
> as JPEG, Powerpoint (ppt), Excel (xls) also languages? Does the rich theory
> and vast algorithms of compilers apply to these non-programming languages? Has
> anyone created a Bison parser for JPEG? For JSON? For CSV?

As the moderator indicates, these kinds of data formats are designed to be
simple, so it's not usually useful to use a grammar-based parser generator
for the data format itself.
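
To make the "designed to be simple" point concrete, here is a rough sketch of
a hand-written splitter for one CSV record. The function name split_csv_line
is made up for illustration, and it assumes comma separators, double-quote
quoting with "" as an escaped quote, and no newlines inside fields. It is not
a full CSV implementation.

```python
def split_csv_line(line):
    """Split one CSV record into fields (illustrative sketch only).

    Assumptions: comma separator, double-quote quoting, "" inside a
    quoted field means a literal quote, no embedded newlines.
    """
    fields = []
    current = []
    in_quotes = False
    i = 0
    while i < len(line):
        ch = line[i]
        if in_quotes:
            if ch == '"':
                # A doubled quote is a literal quote; a single one ends quoting.
                if i + 1 < len(line) and line[i + 1] == '"':
                    current.append('"')
                    i += 1
                else:
                    in_quotes = False
            else:
                current.append(ch)
        elif ch == '"':
            in_quotes = True
        elif ch == ",":
            # End of field: emit what has been collected so far.
            fields.append("".join(current))
            current = []
        else:
            current.append(ch)
        i += 1
    fields.append("".join(current))
    return fields


print(split_csv_line('a,"b,c","say ""hi""",'))
```

The whole "parser" is a two-state machine, which is why a generator like Bison
buys little here. In practice you would probably reach for Python's csv module
rather than write this yourself.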

SGML is a notable exception to this.  The standard that defines it is large
and its grammar is complicated.  It wouldn't be crazy to use a parser
generator for XML either.

For a lot of these data formats, though, you can apply schemas of some sort to
the data (SGML DTDs, XML Schema, JSON Schema, etc.), and when the data is
anticipated to represent a *document*, as in SGML or XML, these schemas are
basically a graph of nested regular expressions, much like a grammar, and a lot
of parsing theory applies.
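
As a rough illustration of "nested regular expressions", the sketch below
checks a small XML tree against DTD-like content models, where each model is
a regular expression over the names of an element's children. The element
names and the CONTENT_MODELS table are invented for this example, and text
content and attributes are ignored. Real DTD or XML Schema validation covers
far more than this.

```python
import re
import xml.etree.ElementTree as ET

# Hypothetical content models: each is a regular expression over the
# sequence of child element names, with every name followed by a space.
CONTENT_MODELS = {
    "manual": r"(title )(chapter )+",   # one title, then one or more chapters
    "chapter": r"(title )(para )*",     # one title, then any number of paras
    "title": r"",                       # no child elements allowed
    "para": r"",
}


def validate(element):
    """Return True if element and all its descendants match their models."""
    model = CONTENT_MODELS.get(element.tag)
    if model is None:
        return False  # unknown element name
    children = "".join(child.tag + " " for child in element)
    if re.fullmatch(model, children) is None:
        return False
    # The "graph" part: each child is checked against its own model.
    return all(validate(child) for child in element)


good = ET.fromstring(
    "<manual><title/><chapter><title/><para/><para/></chapter></manual>"
)
bad = ET.fromstring("<manual><chapter><title/></chapter></manual>")

print(validate(good), validate(bad))
```

Each element name plays the role of a nonterminal and its content model the
role of a production's right-hand side, which is where the resemblance to a
grammar comes from.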

Furthermore, document *processing*, as in generating a printed manual from the
structured document that defines its parts, involves applying rules to
structures that are recognized in the content.  This is syntax-directed
translation (https://en.wikipedia.org/wiki/Syntax-directed_translation), and
all the related compiler theory applies.  In some ways it is easier, because
the content you're translating is a tree instead of flat text, but in some
ways it is more difficult, because the job is to implement a manual human
process instead of a language that was designed to be parsed.
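
A minimal sketch of that kind of rule-driven translation, assuming a made-up
manual/chapter/title/para vocabulary and plain-text output: one rule per
element name, applied while walking the already-parsed tree. The rule names
and the RULES table are hypothetical, not taken from any real toolchain.

```python
import xml.etree.ElementTree as ET


def render(element, depth=0):
    """Apply the rule registered for this element's name."""
    rule = RULES.get(element.tag, render_children)
    return rule(element, depth)


def render_children(element, depth):
    """Default rule: translate the children and concatenate the results."""
    return "".join(render(child, depth) for child in element)


def rule_chapter(element, depth):
    # Children of a chapter are rendered one nesting level deeper.
    return render_children(element, depth + 1)


def rule_title(element, depth):
    text = (element.text or "").strip()
    if depth == 0:
        return text.upper() + "\n\n"              # manual title
    return text + "\n" + "-" * len(text) + "\n"   # chapter title, underlined


def rule_para(element, depth):
    return (element.text or "").strip() + "\n\n"


# Hypothetical rule table: element name -> translation rule.
RULES = {
    "manual": render_children,
    "chapter": rule_chapter,
    "title": rule_title,
    "para": rule_para,
}

doc = ET.fromstring(
    "<manual><title>Widget Guide</title>"
    "<chapter><title>Setup</title><para>Plug it in.</para></chapter>"
    "</manual>"
)
print(render(doc))
```

The depth argument acts like an inherited attribute and the returned string
like a synthesized one. The parsing step is skipped entirely because the
input is already a tree.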
