


Re: Unicode range and enumeration support.

Path csiph.com!xmission!news.snarked.org!news.linkpendium.com!news.linkpendium.com!panix!usenet.stanford.edu!not-for-mail
From L A Walsh <bash@tlinx.org>
Newsgroups gnu.bash.bug
Subject Re: Unicode range and enumeration support.
Date Mon, 23 Dec 2019 12:52:00 -0800
Lines 29
Approved bug-bash@gnu.org
Message-ID <mailman.1335.1577134334.1979.bug-bash@gnu.org> (permalink)
References <9dd3a388-39b1-c059-de99-813f1e411764@case.edu> <5DF2987E.5000309@tlinx.org> <568aeaaa-22b3-c7b9-0e18-a92bef6d2ffb@iki.fi> <5DF2FE31.9070406@tlinx.org> <0ff3a920-94c2-b0c9-5631-0964955657aa@archlinux.org> <5DF3D78B.4090208@tlinx.org> <20191213184213.GO851@eeg.ccf.org> <5DF4BDF0.6000402@tlinx.org> <20191216163906.GV851@eeg.ccf.org> <5DFA7AE2.2060504@tlinx.org> <20191218194651.GH851@eeg.ccf.org> <5DFD68B9.3050202@tlinx.org> <2334eff4-8a88-18ee-b086-4ba4e80af01b@archlinux.org> <5E0128F0.5000901@tlinx.org>
NNTP-Posting-Host lists.gnu.org
Mime-Version 1.0
Content-Type text/plain; charset=UTF-8; format=flowed
Content-Transfer-Encoding base64
X-Trace usenet.stanford.edu 1577134334 11554 209.51.188.17 (23 Dec 2019 20:52:14 GMT)
X-Complaints-To action@cs.stanford.edu
To Eli Schwartz <eschwartz@archlinux.org>, bug-bash <bug-bash@gnu.org>
Envelope-to bug-bash@gnu.org
User-Agent Thunderbird
In-Reply-To <2334eff4-8a88-18ee-b086-4ba4e80af01b@archlinux.org>
X-detected-operating-system by eggs.gnu.org: GNU/Linux 2.2.x-3.x (no timestamps) [generic] [fuzzy]
X-Received-From 173.164.175.65



On 2019/12/21 22:38, Eli Schwartz wrote:
> On 12/20/19 7:35 PM, L A Walsh wrote:
>   
>>
>> ⁰⁴⁵⁶⁷⁸⁹₀₁₂₃₄₅₆₇₈₉
>>
>> Q.E.D.
>>
>>
>> Is that sufficient proof?
>>     
>
> It's sufficient proof that you're wrong, yes.
>   
If you only knew how to use the tools you have on your machine.
> Given the discussion was about collation,
---
    But it wasn't.  It was about generating the characters between two
given characters.  In Unicode, those would be two code points.
Nothing about enumeration.
>  not simply enumerating
> codepoints in order of their codepoint values, it would be helpful to
> actually, you know, collate them.
>
> Given your sample text range:
>
> $ printf %s\\n ⁰ ⁴ ⁵ ⁶ ⁷ ⁸ ⁹ ₀ ₁ ₂ ₃ ₄ ₅ ₆ ₇ ₈ ₉ | sort -u
>
> ⁰
> ⁴
> ⁵
> ⁶
> ⁷
> ⁸
> ⁹
> ₀
> ₁
> ₂
> ₃
> ₄
> ₅
> ₆
> ₇
> ₈
> ₉
>
> This is plainly not in byte order.
>   
----
    It is in Unicode code point order, which is what you would use
for Unicode.  If you want to sort via Unicode, use the -u switch.
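
The code point claim can be checked independently of any locale. The following is a minimal sketch in Python (not part of the original exchange; the variable names are illustrative) that tests whether the sample string is already in code point order and whether that order matches UTF-8 byte order:

```python
# Sample range quoted earlier in the thread.
sample = "⁰⁴⁵⁶⁷⁸⁹₀₁₂₃₄₅₆₇₈₉"
chars = list(sample)

# Order by numeric code point value (locale-independent).
by_code_point = sorted(chars, key=ord)

# Order by the raw UTF-8 byte sequence of each character.
by_utf8_bytes = sorted(chars, key=lambda ch: ch.encode("utf-8"))

print(chars == by_code_point)          # is the sample already in code point order?
print(by_code_point == by_utf8_bytes)  # do code point order and byte order agree?
print(" ".join(f"U+{ord(ch):04X}" for ch in chars))
```

UTF-8 is designed so that byte-wise comparison preserves code point order, so the two orderings agree for any input. Note that in GNU sort the -u option means --unique and does not select an ordering; a byte-wise sort is what `LC_ALL=C sort` performs.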
> Now you need to ask yourself the question: which locale do you want to
> sort according to? I used en_US.UTF-8. Please don't say "C.UTF-8",
> because that's not actually a thing. And the plain C locale won't work
> for obvious reasons...
>   
----
    I don't need to ask myself which locale if I suggested using Unicode
code point order.
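
As an illustration of what "generating characters between two code points" could look like, here is a minimal Python sketch. The helper name `chars_between` is hypothetical and is not a bash feature or anything proposed in the thread; it only shows that such a range needs no locale at all:

```python
def chars_between(first: str, last: str) -> str:
    """Return every character from first to last, inclusive, by code point.

    Hypothetical helper for illustration; both endpoints must be single characters.
    """
    if len(first) != 1 or len(last) != 1:
        raise ValueError("each endpoint must be a single character")
    return "".join(chr(cp) for cp in range(ord(first), ord(last) + 1))


print(chars_between("⁴", "⁹"))  # U+2074 through U+2079
print(chars_between("₀", "₉"))  # U+2080 through U+2089
```

A pure code point range is only as tidy as the block it walks through. A range from ⁰ (U+2070) to ⁹ (U+2079) would also pass through U+2071 (ⁱ) and two unassigned code points, because ¹, ² and ³ live at U+00B9, U+00B2 and U+00B3.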
