Re: bash-4.3: casemod word expansions broken with UTF-8

From isabella parakiss <izaberina@gmail.com>
Newsgroups gnu.bash.bug
Subject Re: bash-4.3: casemod word expansions broken with UTF-8
Date Tue, 17 Nov 2015 01:28:45 +0100
Lines 32
Approved bug-bash@gnu.org
Message-ID <mailman.25.1447720378.31583.bug-bash@gnu.org>
References <22088.36043.764500.752406@a1i15.kph.uni-mainz.de>
Cc bug-bash@gnu.org
To Ulrich Mueller <ulm@gentoo.org>
In-Reply-To <22088.36043.764500.752406@a1i15.kph.uni-mainz.de>


On 11/15/15, Ulrich Mueller <ulm@gentoo.org> wrote:
> Description:
> 	In a UTF-8 locale like en_US.UTF-8, the case-modifying
> 	parameter expansions sometimes return invalid UTF-8 encodings.
>
> 	This seems to happen when the UTF-8 byte sequences that are
> 	encoding upper and lower case have different lengths.
>
> Repeat-By:
> 	$ LC_ALL=en_US.UTF-8
> 	$ x=$'\xc4\xb1' # LATIN SMALL LETTER DOTLESS I
> 	$ echo -n "${x^}" | od -t x1
> 	0000000 49 b1
> 	0000002
>
> 	This should have output "49" for "I" only. The "b1" is illegal
> 	as the first byte of a UTF-8 sequence.
>
> 	$ x=$'\xe1\xba\x9e' # LATIN CAPITAL LETTER SHARP S
> 	$ echo -n "${x,}" | od -t x1
> 	0000000 c3 9f 9e
> 	0000003
>
> 	This should have output "c3 9f" (for "sharp s") only.
>

Both examples should work as expected in 4.4-beta.
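
For readers who want to see the length mismatch from the quoted report without depending on any particular bash version or installed locale, here is a minimal sketch. The helper names hex_bytes and byte_len are made up for illustration, and the characters are written as octal escapes so that the snippet also runs in a plain POSIX sh. It only inspects the encodings; it does not exercise the case-modifying expansions themselves.

```shell
#!/bin/sh
# hex_bytes STRING: print the bytes of STRING as contiguous lowercase hex.
# (hypothetical helper, not part of bash)
hex_bytes() {
    printf '%s' "$1" | od -An -tx1 | tr -d ' \n'
}

# byte_len STRING: print the number of bytes in STRING.
# (hypothetical helper, not part of bash)
byte_len() {
    printf '%s' "$1" | wc -c | tr -d ' '
}

# The four characters from the report, given as octal escapes.
dotless_i=$(printf '\304\261')            # c4 b1    LATIN SMALL LETTER DOTLESS I
capital_i=$(printf '\111')                # 49       LATIN CAPITAL LETTER I
capital_sharp_s=$(printf '\341\272\236')  # e1 ba 9e LATIN CAPITAL LETTER SHARP S
small_sharp_s=$(printf '\303\237')        # c3 9f    LATIN SMALL LETTER SHARP S

# Each case pair differs in encoded length: 2 vs 1 bytes, and 3 vs 2 bytes.
for s in "$dotless_i" "$capital_i" "$capital_sharp_s" "$small_sharp_s"; do
    printf '%s: %s byte(s)\n' "$(hex_bytes "$s")" "$(byte_len "$s")"
done
```

In both failing cases the reported output equals the expected result followed by the final byte of the original sequence (49 then b1, c3 9f then 9e). That is consistent with the byte length of the original character being reused for the converted one, but this reading is an inference from the byte dumps and is not confirmed anywhere in the thread.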


---
xoxo iza
