From: Kaz Kylheku <480-992-1380@kylheku.com>
Newsgroups: comp.compilers
Subject: Re: Question about regex with negated character class
Date: Mon, 25 Apr 2022 23:46:44 -0000 (UTC)
Organization: A noiseless patient Spider
Message-ID: <22-04-021@comp.compilers>
References: <22-04-015@comp.compilers>
Keywords: lex

On 2022-04-25, Roger L Costello wrote:
> Hi Folks,
>
> On page 12 of the Flex specification it says this:
>
> "A negated character class such as [^A-Z] will match a newline
> unless \n (or an equivalent escape sequence) is one of the characters
> explicitly present in the negated character class (e.g., [^A-Z\n]).
> This is unlike how many other regular expression tools treat negated
> character classes ..."

I suspect this is a documentation mistake (in terms of the remark it
makes about other regex implementations).

There is something special in Flex with regard to newlines: namely, the
any-character regular expression . (dot) does not match every character:
it excludes the newline.

The documenter might have momentarily gotten their wires crossed,
misremembering which behavior is the special one.

Alternatively, I agree with John that it may in fact be a remark
about regex implementations in line-oriented text processing utilities,
which (in their standard forms, e.g. POSIX) don't have multi-line
matching features in which \n appears as a character.
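To make the dot-versus-negated-class distinction concrete, here is a minimal sketch. It uses Python's re module rather than Flex, purely as a stand-in for a regex tool with multi-line input; the helper name matches_newline and the constants are made up for illustration. In Python's re, by default, a negated class matches a newline unless \n is listed in it, while dot does not match a newline.

```python
import re

# Python's `re` is used only as an illustrative stand-in; this is not Flex.
NEGATED_CLASS = re.compile(r"[^A-Z]")            # newline is not in A-Z
NEGATED_CLASS_NO_NL = re.compile(r"[^A-Z\n]")    # newline explicitly excluded
DOT = re.compile(r".")                           # default flags, no DOTALL


def matches_newline(pattern):
    """Return True if the compiled pattern matches a lone newline character."""
    return pattern.fullmatch("\n") is not None


if __name__ == "__main__":
    print("[^A-Z]   matches newline:", matches_newline(NEGATED_CLASS))
    print("[^A-Z\\n] matches newline:", matches_newline(NEGATED_CLASS_NO_NL))
    print(".        matches newline:", matches_newline(DOT))
```

In this sketch only the first pattern matches the newline, so the newline exclusion belongs to dot, not to the negated class.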
-- 
TXR Programming Language: http://nongnu.org/txr
Cygnal: Cygwin Native Application Library: http://kylheku.com/cygnal