From: Roger L Costello
Newsgroups: comp.compilers
Subject: Does the theory and algorithms of compiler design also apply to data formats?
Date: Sat, 22 Jan 2022 23:54:30 +0000
Organization: Compilers Central
Message-ID: <22-01-100@comp.compilers>
Keywords: parse, question, comment

Hello Compiler Experts!

The books that I've read always talk about applying compiler theory and algorithms to programming languages. But there are other kinds of languages, such as XML, JSON, and Comma-Separated Values (CSV). And aren't data formats such as JPEG, PowerPoint (ppt), and Excel (xls) also languages? Do the rich theory and vast algorithms of compilers apply to these non-programming languages? Has anyone created a Bison parser for JPEG? For JSON? For CSV?

/Roger

[You could, but for the most part their syntax is so simple that a formal parser would be overkill. For example, JSON has a handful of atoms and only two data structures, a sequential list and a key:value object. Everything else is the semantics. The Microsoft formats like docx, xlsx, and pptx are in fact zip files containing XML files. Unzip one and take a look.
Also look at XDR, a widely used network data format, and rpcgen, which compiles an XDR description into code to read and write it. -John]
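
To make the "formal parser would be overkill" point concrete, here is a rough sketch of a hand-written recursive-descent reader for JSON in Python. It is not from the original post. The function names (parse, _value, _array, _object, _string, _skip) are made up for illustration, and it is assumed that basic escapes are enough, so \uXXXX surrogate pairs are not combined and error messages are minimal. The two data structures and the handful of atoms each get one small function, with no grammar tool involved.

```python
import re

# Atoms: numbers via one regex, plus three literal words and strings.
_NUMBER = re.compile(r'-?(?:0|[1-9]\d*)(?:\.\d+)?(?:[eE][+-]?\d+)?')
_WS = re.compile(r'[ \t\n\r]*')
_ESCAPES = {'"': '"', '\\': '\\', '/': '/', 'b': '\b',
            'f': '\f', 'n': '\n', 'r': '\r', 't': '\t'}
_LITERALS = {'true': True, 'false': False, 'null': None}


def parse(text):
    """Parse a complete JSON text and return the Python value."""
    value, pos = _value(text, _skip(text, 0))
    pos = _skip(text, pos)
    if pos != len(text):
        raise ValueError(f"trailing data at offset {pos}")
    return value


def _skip(text, pos):
    # Advance past whitespace; the pattern always matches (possibly empty).
    return _WS.match(text, pos).end()


def _value(text, pos):
    # Dispatch on the first character: one branch per kind of value.
    if pos >= len(text):
        raise ValueError("unexpected end of input")
    ch = text[pos]
    if ch == '[':
        return _array(text, pos + 1)
    if ch == '{':
        return _object(text, pos + 1)
    if ch == '"':
        return _string(text, pos + 1)
    for word, val in _LITERALS.items():
        if text.startswith(word, pos):
            return val, pos + len(word)
    m = _NUMBER.match(text, pos)
    if not m:
        raise ValueError(f"unexpected character at offset {pos}")
    s = m.group()
    number = float(s) if any(c in s for c in '.eE') else int(s)
    return number, m.end()


def _string(text, pos):
    # pos points just past the opening quote.
    out = []
    while True:
        if pos >= len(text):
            raise ValueError("unterminated string")
        ch = text[pos]
        if ch == '"':
            return ''.join(out), pos + 1
        if ch == '\\':
            esc = text[pos + 1:pos + 2]
            if esc == 'u':
                # Assumption: surrogate pairs are left uncombined.
                out.append(chr(int(text[pos + 2:pos + 6], 16)))
                pos += 6
            elif esc in _ESCAPES:
                out.append(_ESCAPES[esc])
                pos += 2
            else:
                raise ValueError(f"bad escape at offset {pos}")
        else:
            out.append(ch)
            pos += 1


def _array(text, pos):
    # Sequential list: '[' value (',' value)* ']'
    items = []
    pos = _skip(text, pos)
    if text[pos:pos + 1] == ']':
        return items, pos + 1
    while True:
        item, pos = _value(text, _skip(text, pos))
        items.append(item)
        pos = _skip(text, pos)
        sep = text[pos:pos + 1]
        if sep == ']':
            return items, pos + 1
        if sep != ',':
            raise ValueError(f"expected ',' or ']' at offset {pos}")
        pos += 1


def _object(text, pos):
    # Key:value object: '{' string ':' value (',' string ':' value)* '}'
    members = {}
    pos = _skip(text, pos)
    if text[pos:pos + 1] == '}':
        return members, pos + 1
    while True:
        pos = _skip(text, pos)
        if text[pos:pos + 1] != '"':
            raise ValueError(f"expected string key at offset {pos}")
        key, pos = _string(text, pos + 1)
        pos = _skip(text, pos)
        if text[pos:pos + 1] != ':':
            raise ValueError(f"expected ':' at offset {pos}")
        value, pos = _value(text, _skip(text, pos + 1))
        members[key] = value
        pos = _skip(text, pos)
        sep = text[pos:pos + 1]
        if sep == '}':
            return members, pos + 1
        if sep != ',':
            raise ValueError(f"expected ',' or '}}' at offset {pos}")
        pos += 1
```

Calling parse('{"a": [1, 2.5, true]}') returns an ordinary Python dict. In practice one would use the standard json module; the sketch is only meant to show how little grammar there is to handle.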