Word boundary anchors \< and \> not parsed correctly on the right side of =~

From marcelpaulo@gmail.com
Newsgroups gnu.bash.bug
Subject Word boundary anchors \< and \> not parsed correctly on the right side of =~
Date 2018-07-09 22:46 -0300
Message-ID <mailman.3347.1531187182.1292.bug-bash@gnu.org> (permalink)


Configuration Information [Automatically generated, do not change]:
Machine: x86_64
OS: linux-gnu
Compiler: gcc
Compilation CFLAGS:  -DPROGRAM='bash' -DCONF_HOSTTYPE='x86_64' -DCONF_OSTYPE='linux-gnu' -DCONF_MACHTYPE='x86_64-pc-linux-gnu' -DCONF_VENDOR='pc' -DLOCALEDIR='/usr/share/locale' -DPACKAGE='bash' -DSHELL -DHAVE_CONFIG_H   -I.  -I../. -I.././include -I.././lib  -Wdate-time -D_FORTIFY_SOURCE=2 -g -O2 -fdebug-prefix-map=/build/bash-vEMnMR/bash-4.4.18=. -fstack-protector-strong -Wformat -Werror=format-security -Wall -Wno-parentheses -Wno-format-security
uname output: Linux monk 4.15.0-24-generic #26-Ubuntu SMP Wed Jun 13 08:44:47 UTC 2018 x86_64 x86_64 x86_64 GNU/Linux
Machine Type: x86_64-pc-linux-gnu

Bash Version: 4.4
Patch Level: 19
Release Status: release

Description:
Word boundary anchors \< and \> are not parsed correctly on the right side of a =~ regex match expression. 

This evaluates as false:

    [[ 'foo bar' =~ \<foo\> ]]

From the bash reference manual:

    An additional binary operator, ‘=~’, is available, with the same precedence
    as ‘==’ and ‘!=’. When it is used, the string to the right of the operator
    is considered an extended regular expression and matched accordingly (as
    in regex(3)).

Reading regex(3), I presumed the regexes would be parsed as C strings, so the
backslashes would need to be escaped:

    [[ 'foo bar' =~ \\<foo\\> ]]

but this results in:

    bash: syntax error in conditional expression: unexpected token `<'
    bash: syntax error near `\\<f'
    
If the regex is stored in a variable, the expression evaluates as true:

    re='\<foo\>'
    [[ 'foo bar' =~ $re ]]
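
A minimal sketch of the variable-based form as a reusable function, for anyone who wants to test it. The name matches_word is hypothetical and not part of bash. The sketch assumes a regex(3) implementation that supports the \< and \> extensions (glibc does; other C libraries may not), and it uses the word unescaped as a regex, so metacharacters in it are not taken literally.

```bash
#!/usr/bin/env bash

# Hypothetical helper: succeed if the word $2 occurs as a whole word in $1.
# Assumes regex(3) understands the \< and \> word-boundary extensions.
matches_word() {
    local text=$1 word=$2
    # Build the pattern in a variable. The single-quoted parts keep the
    # backslashes intact, so the [[ ]] parser never sees them directly.
    local re='\<'"$word"'\>'
    # $re is left unquoted so that it is treated as a regex, not a literal.
    [[ $text =~ $re ]]
}

if matches_word 'foo bar' foo; then
    echo "foo is a whole word in 'foo bar'"
fi

if ! matches_word 'foobar' foo; then
    echo "foo is not a whole word in 'foobar'"
fi
```

Keeping the pattern in a variable sidesteps the question of how an unquoted backslash is handled on the right side of =~.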

Treating the regex as a C string does work for the \b anchor, so this evaluates as true:

    [[ 'foo bar' =~ \\bfoo\\b ]]

Repeat-By:
    This evaluates as false:

	[[ 'foo bar' =~ \<foo\> ]]

    but this evaluates as true:

	re='\<foo\>'
	[[ 'foo bar' =~ $re ]]
	
    and this results in a syntax error:

	[[ 'foo bar' =~ \\<foo\\> ]] 

    whereas this is parsed and evaluated correctly:

	[[ 'foo bar' =~ \\bfoo\\b ]]
