From: Chet Ramey
Newsgroups: gnu.bash.bug
Subject: Re: Locale not Obeyed by Parameter Expansion with Pattern Substitution
Date: Wed, 20 Nov 2019 09:11:59 -0500
Reply-To: chet.ramey@case.edu
To: Chris Carlen, bug-bash@gnu.org
Cc: chet.ramey@case.edu
References: <0acc4767-e87f-f163-b39e-f137effdfea2.ref@sbcglobal.net> <0acc4767-e87f-f163-b39e-f137effdfea2@sbcglobal.net>

On 11/17/19 4:25 AM, Chris Carlen wrote:

> Bash Version: 5.0
> Patch Level: 0
> Release Status: release
>
> Description:
>   UTF-8 multibyte char string split into bytes rather than characters.
>
> Repeat-By:
>
> #!/bin/bash
>
> shopt -s extglob
> LC_ALL="en_US.UTF-8"
>
> # E.g., normal/expected behavior:
>
> # Create a string:
> A=abc
>
> # Replace left virtual empty strings with spaces, putting separated
> # chars into positional parameters, then print them quoted:
> set -- ${A//?()/ }
> echo "${@@Q}"       #-> 'a' 'b' 'c'
>
> # E.g., abnormal behavior:
>
> # write 'REVERSE PILCROW SIGN' to B, then repeat as above:
> printf -v B '\u204B'
> set -- ${B//?()/ }
> echo "${@@Q}"       #-> $'\342' $'\201' $'\213'

Yes, this is a problem. The null match requires advancing through the
string by one character, instead of one byte. I'll fix it.

Chet
--
``The lyf so short, the craft so long to lerne.'' - Chaucer
``Ars longa, vita brevis'' - Hippocrates
Chet Ramey, UTech, CWRU    chet@case.edu    http://tiswww.cwru.edu/~chet/
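
The byte-versus-character distinction in the reply can be illustrated with a small sketch. This is not bash's code: bash is written in C and works with whatever encoding the locale specifies, whereas this illustration is in Python and hard-codes UTF-8. The names `utf8_char_len` and `split_on_null_match` are hypothetical, chosen only for this example. The loop models a pattern that matches the empty string at every position, so the only question is how far to step after each match.

```python
def utf8_char_len(lead):
    """Return the byte length of the UTF-8 sequence that starts with `lead`."""
    if lead < 0x80:
        return 1
    if lead >> 5 == 0b110:
        return 2
    if lead >> 4 == 0b1110:
        return 3
    if lead >> 3 == 0b11110:
        return 4
    # Continuation byte or invalid lead byte: step over a single byte.
    return 1


def split_on_null_match(data, by_character):
    """Cut `data` at every position where an empty match is found.

    An empty match consumes nothing, so the scanner has to advance on its
    own. Stepping one byte cuts multibyte characters apart; stepping one
    character keeps them whole.
    """
    pieces = []
    pos = 0
    while pos < len(data):
        step = utf8_char_len(data[pos]) if by_character else 1
        pieces.append(data[pos:pos + step])
        pos += step
    return pieces


if __name__ == "__main__":
    text = "\u204b".encode("utf-8")  # the character from the bug report
    print(split_on_null_match(text, by_character=False))
    print(split_on_null_match(text, by_character=True))
```

With `by_character=False` the three bytes come out separately, which corresponds to the `$'\342' $'\201' $'\213'` result in the report (octal 342, 201, 213 are hex E2, 81, 8B). With `by_character=True` the character stays in one piece, and an ASCII string such as `abc` splits the same way in both modes.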