| From | Eric Blake <eblake@redhat.com> |
|---|---|
| Newsgroups | gnu.bash.bug |
| Subject | Re: built-in regex matches wrong character |
| Date | Thu, 6 Sep 2018 09:23:33 -0500 |
| Organization | Red Hat, Inc. |
| Message-ID | <mailman.444.1536243821.1284.bug-bash@gnu.org> |
| References | <201809051850.w85IoClP001449@mamatb-laptop> <5d3e2655-9b29-563e-a3aa-f96f6563f9fc@redhat.com> <cdf3707d-9e10-4be3-94f9-4cb5f5d9b9ed@case.edu> |
| To | chet.ramey@case.edu, bug-bash@gnu.org, amatbaeza@gmail.com |
| In-Reply-To | <cdf3707d-9e10-4be3-94f9-4cb5f5d9b9ed@case.edu> |
On 09/06/2018 09:17 AM, Chet Ramey wrote:
> On 9/5/18 4:39 PM, Eric Blake wrote:
>
>> Or, you can use bash's 'shopt -s globasciiranges' which is
>> supposed to enable Rational Range Interpretation, where even in non-C
>> locales, a character range bounded by two ASCII characters takes on the C
>> locale definition of only the ASCII characters in that range, rather than
>> the locale's definition of whatever other characters might also be
>> equivalent (actually, while I know that shopt affects globbing, I don't
>> know if it also affects regex matching - but if it doesn't, that's probably
>> a bug that should be fixed).
>
> Since bash uses the C library's regexp engine, and most C libraries don't
> implement RRI, much less expose it as a flags option available via
> regcomp(), there's no reason to expect that globasciiranges would have
> any effect on regular expression matching.

But bash could be taught to convert any regex that contains a range with
both endpoints ASCII into a different bracket expression before handing
things over to regcomp().  That is, if the user is matching against
[a-d], bash hands [abcd] to regcomp() instead.  You don't need a flag in
regcomp() to get RRI, just merely some pre-processing (and often memory
allocation, as the expansion of a range into a non-range tends to
require more characters).

--
Eric Blake, Principal Software Engineer
Red Hat, Inc.           +1-919-301-3266
Virtualization:  qemu.org | libvirt.org
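
The pre-processing described in the message above could be sketched roughly as follows. This is only an illustration, not code from bash: bash itself would do this in C before calling regcomp(), and Python is used here purely for brevity. The names `expand_ascii_ranges`, `_copy_bracket` and `_expand_range` are hypothetical. As a simplifying assumption, the sketch only expands ranges whose endpoints are both lowercase letters, both uppercase letters, or both digits, so the expansion can never introduce `]`, `^` or `-`; every other range is passed through unchanged. It also assumes range endpoints are plain characters, not collating elements such as `[.x.]`.

```python
import string
from typing import List, Optional

# Assumption: only expand ranges that stay inside one of these groups, so the
# expanded text never contains ']', '^' or '-' (which are special in brackets).
_SAFE_GROUPS = (string.ascii_lowercase, string.ascii_uppercase, string.digits)


def _expand_range(lo: str, hi: str) -> Optional[str]:
    """Return the explicit characters for lo-hi, or None to leave it alone."""
    for group in _SAFE_GROUPS:
        if lo in group and hi in group and lo <= hi:
            return group[group.index(lo):group.index(hi) + 1]
    return None


def _copy_bracket(regex: str, i: int, out: List[str]) -> int:
    """Copy the bracket expression starting at regex[i] into out, expanding
    safe ASCII ranges. Returns the index just past the bracket expression."""
    n = len(regex)
    j = i + 1
    body = ["["]
    if j < n and regex[j] == "^":      # negation marker
        body.append("^")
        j += 1
    if j < n and regex[j] == "]":      # a leading ']' is a literal
        body.append("]")
        j += 1
    while j < n and regex[j] != "]":
        # [:class:], [.coll.] and [=equiv=] are copied verbatim.
        if regex[j] == "[" and j + 1 < n and regex[j + 1] in ":.=":
            end = regex.find(regex[j + 1] + "]", j + 2)
            if end == -1:
                break                  # malformed; give up below
            body.append(regex[j:end + 2])
            j = end + 2
            continue
        # A range looks like X-Y where Y is not the closing ']'.
        if j + 2 < n and regex[j + 1] == "-" and regex[j + 2] != "]":
            expanded = _expand_range(regex[j], regex[j + 2])
            if expanded is not None:
                body.append(expanded)
                j += 3
                continue
            if regex[j + 2] != "[":
                body.append(regex[j:j + 3])   # unexpanded range, verbatim
                j += 3
                continue
        body.append(regex[j])
        j += 1
    if j >= n or regex[j] != "]":
        out.append(regex[i:])          # unterminated: pass through untouched
        return n
    body.append("]")
    out.append("".join(body))
    return j + 1


def expand_ascii_ranges(regex: str) -> str:
    """Rewrite e.g. [a-d] as [abcd] before the pattern reaches regcomp()."""
    out: List[str] = []
    i, n = 0, len(regex)
    while i < n:
        c = regex[i]
        if c == "\\" and i + 1 < n:    # escaped char outside a bracket
            out.append(regex[i:i + 2])
            i += 2
        elif c == "[":
            i = _copy_bracket(regex, i, out)
        else:
            out.append(c)
            i += 1
    return "".join(out)
```

With this sketch, `expand_ascii_ranges("[a-d]")` yields `[abcd]`, matching the example in the message, and the result is longer than the input, which is the extra allocation the message mentions.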