


Re: What does it mean to "move characters" in the lexer?

Path csiph.com!weretis.net!feeder6.news.weretis.net!news.misty.com!news.iecc.com!.POSTED.news.iecc.com!nerds-end
From Thomas Koenig <tkoenig@netcologne.de>
Newsgroups comp.compilers
Subject Re: What does it mean to "move characters" in the lexer?
Date Wed, 22 Jun 2022 11:45:22 -0000 (UTC)
Organization news.netcologne.de
Lines 24
Sender news@iecc.com
Approved comp.compilers@iecc.com
Message-ID <22-06-071@comp.compilers>
References <22-06-057@comp.compilers> <22-06-058@comp.compilers> <22-06-064@comp.compilers> <22-06-066@comp.compilers>
Injection-Info gal.iecc.com; posting-host="news.iecc.com:2001:470:1f07:1126:0:676f:7373:6970"; logging-data="77766"; mail-complaints-to="abuse@iecc.com"
Keywords parse, performance, parallel
Posted-Date 22 Jun 2022 10:01:48 EDT
X-submission-address compilers@iecc.com
X-moderator-address compilers-request@iecc.com
X-FAQ-and-archives http://compilers.iecc.com
Xref csiph.com comp.compilers:3092



Kaz Kylheku <480-992-1380@kylheku.com> wrote:

> I remember reading some article some years ago whereby some Javascript
> programmer discovered it was faster to read JSON from a file using
> dedicated JSON routines available in Javascript, than to declare the
> same syntax in the Javascript program as a literal and let it be
> scanned along with the program and available to it that way.

This came up on comp.arch recently.

There is an insanely fast JSON parser and UTF-8 validator based
on SIMD to be found at https://github.com/simdjson/simdjson .
They select a different vector length according to the CPU
they find at run time.  The algorithm is described at
https://arxiv.org/pdf/1902.08318.pdf .  It relies heavily on
special-casing, both for JSON and for the SIMD instructions
that are available.
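
To give a rough idea of what "special-casing for JSON" can look like,
here is a minimal sketch of the bitmask idea from the paper: classify
every byte of a block into bitmasks, then use a prefix XOR over the
quote mask to find out which bytes sit inside strings, so that
structural characters inside strings can be masked off without any
per-character branching.  This is not simdjson's code.  The function
names are made up for illustration, the Python loop in byte_mask stands
in for what would be a vector compare, and backslash-escaped quotes are
ignored (an assumption that the real library does not make).

```python
STRUCTURAL = b'{}[]:,'

def byte_mask(block, targets):
    """Bit i of the result is set when block[i] is one of the target bytes.

    Stand-in for a SIMD compare that classifies a whole block at once.
    """
    mask = 0
    for i, b in enumerate(block):
        if b in targets:
            mask |= 1 << i
    return mask

def prefix_xor(mask, width):
    """Bit i of the result is the XOR of bits 0..i of mask."""
    shift = 1
    while shift < width:
        mask ^= mask << shift
        shift <<= 1
    return mask & ((1 << width) - 1)

def structural_positions(block):
    """Indices of structural characters that are outside string literals.

    Simplification: escaped quotes (backslash followed by a quote) are
    not handled here.
    """
    width = len(block)
    quotes = byte_mask(block, b'"')
    # Set from each opening quote up to, but not including, its closing quote.
    in_string = prefix_xor(quotes, width)
    structural = byte_mask(block, STRUCTURAL) & ~in_string
    return [i for i in range(width) if (structural >> i) & 1]
```

For the input b'{"a,b":[1,2]}' the comma inside the string is skipped,
while the comma between 1 and 2 is reported.  Everything after the
classification step is plain integer arithmetic on whole-block masks,
which is the part that maps well onto wide registers.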

A general SIMD-based parser generator is likely to be even harder
to write and will probably not outperform the package above (nor,
for that matter, a traditional character-at-a-time approach).

Is there research on this?



Thread

What does it mean to "move characters" in the lexer? Roger L Costello <costello@mitre.org> - 2022-06-21 10:27 +0000
  Re: What does it mean to "move characters" in the lexer? gah4 <gah4@u.washington.edu> - 2022-06-21 10:30 -0700
    Re: What does it mean to "move characters" in the lexer? Christopher F Clark <christopher.f.clark@compiler-resources.com> - 2022-06-22 00:44 +0300
      Re: What does it mean to "move characters" in the lexer? Kaz Kylheku <480-992-1380@kylheku.com> - 2022-06-22 01:13 +0000
        Re: What does it mean to "move characters" in the lexer? Thomas Koenig <tkoenig@netcologne.de> - 2022-06-22 11:45 +0000
  Re: What does it mean to "move characters" in the lexer? Kaz Kylheku <480-992-1380@kylheku.com> - 2022-06-22 01:05 +0000
