| From | Stephane Chazelas <stephane.chazelas@gmail.com> |
|---|---|
| Newsgroups | gnu.bash.bug |
| Subject | Re: Wildcard expansion can fail with nonprinting characters |
| Date | 2019-10-01 07:44 +0100 |
| Message-ID | <mailman.554.1569912268.2651.bug-bash@gnu.org> |
| References | <pnih84x47ql.fsf@bow.cs.hmc.edu> <9e9454a8-35db-c426-5388-7426169c4d63@case.edu> <20191001064420.7rzjoascomqaxm53@chaz.gmail.com> |
2019-09-30 15:35:21 -0400, Chet Ramey:
[...]
> The $'\361' is a unicode combining
> character, which ends up making the entire sequence of characters an
> invalid wide character string in a bunch of different locales.
[...]

No, $'\u0361', the unicode character 0x361 (hex) is "COMBINING DOUBLE INVERTED BREVE" (encoded as \315\241 in UTF-8)

But $'\361' is byte value 0361 (octal). In UTF-8, on its own it's an invalid byte sequence. That's 2#11110001, which would be the first byte of a 4 byte-long character (of characters U+40000 to U+7FFFF).

In latin1, that's ñ (LATIN SMALL LETTER N WITH TILDE).

So $'foo\361bar' is not text in UTF-8, but that's an encoding issue, not a problem with combining characters.

    $ locale charmap
    UTF-8
    $ printf '\u361' | od -An -to1
    315 241
    $ printf '\U40000' | od -An -vto1
    361 200 200 200
    $ printf 'foo\361bar' | iconv -f utf8
    fooiconv: illegal input sequence at position 3

--
Stephane
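The lead-byte reasoning above (0361 octal = 2#11110001, hence the start of a 4-byte sequence) can be sketched as a small shell function. This is only an illustration and not part of the original message: the name `utf8_lead_len` is hypothetical, and it classifies a single byte by value without checking that valid continuation bytes follow.

```shell
# Hypothetical helper: given one byte as octal digits (e.g. 361), print how
# many bytes a UTF-8 sequence starting with that byte would span, or 0 if
# the byte cannot start a sequence.
utf8_lead_len() {
  # A leading 0 makes shell arithmetic read the argument as octal.
  byte=$((0$1))
  if   [ "$byte" -le 127 ]; then echo 1                          # 0xxxxxxx: ASCII
  elif [ "$byte" -ge 194 ] && [ "$byte" -le 223 ]; then echo 2   # 0xC2..0xDF
  elif [ "$byte" -ge 224 ] && [ "$byte" -le 239 ]; then echo 3   # 0xE0..0xEF
  elif [ "$byte" -ge 240 ] && [ "$byte" -le 244 ]; then echo 4   # 0xF0..0xF4
  else echo 0                                                    # continuation byte or never valid
  fi
}

utf8_lead_len 361   # 4: first byte of a 4-byte sequence, invalid on its own
utf8_lead_len 315   # 2: first byte of \315\241 (U+0361)
utf8_lead_len 241   # 0: continuation byte, cannot start a sequence
```

In 'foo\361bar' the byte \361 announces three continuation bytes but is followed by the ASCII letter "b", which is why iconv rejects the input at position 3.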