From: gah4@u.washington.edu
Newsgroups: comp.compilers
Subject: Re: Spell checking identifiers
Date: Tue, 23 Jun 2020 16:51:38 -0700 (PDT)
Organization: Compilers Central
Message-ID: <20-06-012@comp.compilers>
References: <20-06-010@comp.compilers> <20-06-011@comp.compilers>
Keywords: lex, errors
Posted-Date: 24 Jun 2020 14:45:22 EDT

On Tuesday, June 23, 2020 at 12:59:35 PM UTC-7, Johann 'Myrkraverk' Oskarsson wrote:

(snip)

> This clang blog specifically mentions Levenshtein,
> http://blog.llvm.org/2010/04/amazing-feats-of-clang-error-recovery.html#spell_checker
> and it looks like what people do is to go through the entire symbol
> table and compute it against the individual erroneous identifier.
> I thought that'd be a bit on the expensive side,

With either constant weighting or character-dependent weighting, it is
easy to do with dynamic programming. The time for each comparison is
then O(m n), where m and n are the lengths of the two identifiers.

It seems most obvious to check only variables that are in the
appropriate scope to be misspelled, but I suspect catching variables
used out of scope is also worth doing. Well, in the latter case, you
could hope that they at least spell them the same.

I think you should turn it off for one-character names, though,
even though I suspect those are more likely. Too many false positives!
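
For what it's worth, here is a rough sketch of the dynamic programming version, scanned over a symbol table. This is only an illustration, not what clang actually does. The names edit_distance, suggest, sub_cost and max_dist are made up for the example, and the threshold of 2 is an arbitrary assumption. Passing a different sub_cost function gives character-dependent weighting; the default gives constant weighting.

```python
def edit_distance(a, b, sub_cost=lambda x, y: 1):
    """Levenshtein distance by dynamic programming.

    Runs in O(len(a) * len(b)) time and keeps only two rows of the table.
    sub_cost (hypothetical hook) lets the substitution weight depend on
    the two characters; insertions and deletions cost 1 here.
    """
    prev = list(range(len(b) + 1))      # distances from "" to prefixes of b
    for i, ca in enumerate(a, 1):
        cur = [i]                       # distance from a[:i] to ""
        for j, cb in enumerate(b, 1):
            cost = 0 if ca == cb else sub_cost(ca, cb)
            cur.append(min(prev[j] + 1,          # delete ca
                           cur[j - 1] + 1,       # insert cb
                           prev[j - 1] + cost))  # match or substitute
        prev = cur
    return prev[-1]


def suggest(name, symbols, max_dist=2):
    """Return the closest symbol within max_dist of name, or None.

    One-character names are skipped entirely, since nearly every other
    short name is within distance 1 of them (too many false positives).
    max_dist=2 is an assumed cutoff, not a recommended value.
    """
    if len(name) <= 1:
        return None
    best, best_d = None, max_dist + 1
    for s in symbols:                   # linear scan of the symbol table
        d = edit_distance(name, s)
        if d < best_d:                  # ties keep the earlier symbol
            best, best_d = s, d
    return best
```

Something like suggest("cuont", ["count", "total", "i"]) would compare the bad identifier against every entry, which is the whole-table scan the quoted text worries about. Restricting symbols to the names visible in the current scope, or to all names in the program, is the choice discussed above.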