Re: Wildcard expansion can fail with nonprinting characters

Started by: Stephane Chazelas <stephane.chazelas@gmail.com>
First post: 2019-10-01 07:44 +0100
Last post: 2019-10-01 07:44 +0100
Articles: 1 (1 participant)

This discussion starts older than the indexed window; earlier articles aren't shown. The article labeled Started by below is the oldest one visible, not the original post.


#15440 — Re: Wildcard expansion can fail with nonprinting characters

From: Stephane Chazelas <stephane.chazelas@gmail.com>
Date: 2019-10-01 07:44 +0100
Subject: Re: Wildcard expansion can fail with nonprinting characters
Message-ID: <mailman.554.1569912268.2651.bug-bash@gnu.org>

2019-09-30 15:35:21 -0400, Chet Ramey:
[...]
> The $'\361' is a unicode combining
> character, which ends up making the entire sequence of characters an
> invalid wide character string in a bunch of different locales.
[...]

No, $'\u0361', the Unicode character 0x361 (hex), is "COMBINING
DOUBLE INVERTED BREVE" (encoded as \315\241 in UTF-8).

But $'\361' is the byte value 0361 (octal). In UTF-8, on its own,
it's an invalid byte sequence. That's 2#11110001, which would be
the first byte of a 4-byte character (one of the characters
U+40000 to U+7FFFF). In latin1, that's ñ (LATIN SMALL LETTER N
WITH TILDE).
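
The bit arithmetic above can be checked mechanically. What follows is
only a minimal sketch, not anything from bash itself: utf8_lead_info is
a hypothetical helper name. It takes a byte value in octal and reports
how long a UTF-8 sequence starting with that byte would be and which
code points such a sequence could encode. It only looks at the bit
pattern, so it does not reject lead bytes that UTF-8 forbids for other
reasons (0300, 0301, or 0365 and above).

```shell
# Hypothetical helper (not part of bash): classify a UTF-8 lead byte.
# $1 is the byte value in octal, e.g. 361 for $'\361'.
# The function body is a subshell, so its variables stay local to the call.
utf8_lead_info() (
  byte=$((0$1))   # the leading 0 makes the shell read the digits as octal
  if   [ $((byte & 0x80)) -eq 0 ];           then len=1; payload=$((byte & 0x7f))
  elif [ $((byte & 0xe0)) -eq $((0xc0)) ];   then len=2; payload=$((byte & 0x1f))
  elif [ $((byte & 0xf0)) -eq $((0xe0)) ];   then len=3; payload=$((byte & 0x0f))
  elif [ $((byte & 0xf8)) -eq $((0xf0)) ];   then len=4; payload=$((byte & 0x07))
  else
    echo "not a lead byte"
    return 1
  fi
  # Each continuation byte contributes 6 more bits below the payload.
  shift_bits=$((6 * (len - 1)))
  lo=$((payload << shift_bits))
  hi=$((lo | ((1 << shift_bits) - 1)))
  printf '%d bytes, U+%04X..U+%04X\n' "$len" "$lo" "$hi"
)

utf8_lead_info 361   # 4 bytes, U+40000..U+7FFFF
```

Calling it with 315 gives a 2-byte range that contains U+0361, which
matches the \315\241 encoding mentioned earlier.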

So $'foo\361bar' is not text in UTF-8, but that's an encoding
issue, not a problem with combining characters.

$ locale charmap
UTF-8
$ printf '\u361' | od -An -to1
 315 241
$ printf '\U40000' | od -An -vto1
 361 200 200 200
$ printf 'foo\361bar' | iconv -f utf8
fooiconv: illegal input sequence at position 3
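
In a script, the same check can rely on iconv's exit status instead of
its error message. A rough sketch, assuming an iconv that exits
non-zero on invalid input (as the session above suggests);
is_valid_utf8 and latin1_to_utf8 are hypothetical helper names:

```shell
# Hypothetical helper: succeed only if the bytes of $1 decode as UTF-8.
# The pipeline's status is iconv's, so invalid input makes it fail.
is_valid_utf8() {
  printf '%s' "$1" | iconv -f UTF-8 -t UTF-8 >/dev/null 2>&1
}

# Hypothetical helper: treat the bytes of $1 as latin1 and print them as UTF-8.
latin1_to_utf8() {
  printf '%s' "$1" | iconv -f ISO-8859-1 -t UTF-8
}

s=$(printf 'foo\361bar')            # byte 0361 on its own
if is_valid_utf8 "$s"; then
  echo "valid UTF-8"
else
  echo "not valid UTF-8"
fi
latin1_to_utf8 "$s" | od -An -to1   # the 361 byte should come out as 303 261
```

Read as latin1, the same bytes are perfectly good text, which is the
point: the problem is which encoding the bytes are decoded with.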

-- 
Stephane
