Re: Unicode range and enumeration support.

From	L A Walsh <bash@tlinx.org>
Newsgroups	gnu.bash.bug
Subject	Re: Unicode range and enumeration support.
Date	2019-12-23 12:52 -0800
Message-ID	<mailman.1335.1577134334.1979.bug-bash@gnu.org>
References	(9 earlier) <5DFA7AE2.2060504@tlinx.org> <20191218194651.GH851@eeg.ccf.org> <5DFD68B9.3050202@tlinx.org> <2334eff4-8a88-18ee-b086-4ba4e80af01b@archlinux.org> <5E0128F0.5000901@tlinx.org>

On 2019/12/21 22:38, Eli Schwartz wrote:
> On 12/20/19 7:35 PM, L A Walsh wrote:
>   
>>
>> ⁰⁴⁵⁶⁷⁸⁹₀₁₂₃₄₅₆₇₈₉
>>
>> Q.E.D.
>>
>>
>> Is that sufficient proof?
>>     
>
> It's sufficient proof that you're wrong, yes.
>   
If you only knew how to use the tools you have on your machine.
> Given the discussion was about collation,
---
    But it wasn't.  It was about generating the characters between two
given characters.  In Unicode, those would be two code points.
Nothing about collation.
>  not simply enumerating
> codepoints in order of their codepoint values, it would be helpful to
> actually, you know, collate them.
>
> Given your sample text range:
>
> $ printf %s\\n ⁰ ⁴ ⁵ ⁶ ⁷ ⁸ ⁹ ₀ ₁ ₂ ₃ ₄ ₅ ₆ ₇ ₈ ₉ | sort -u
>
> ⁰
> ⁴
> ⁵
> ⁶
> ⁷
> ⁸
> ⁹
> ₀
> ₁
> ₂
> ₃
> ₄
> ₅
> ₆
> ₇
> ₈
> ₉
>
> This is plainly not in byte order.
>   
----
    It is in Unicode code point order, which is what you would use
for Unicode.  Note that the -u switch to sort only removes duplicate
lines; it does not select Unicode ordering.
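
    For illustration, a minimal sketch of sorting by code point with
standard tools.  It assumes the input is valid UTF-8: in the C locale,
sort compares raw bytes, and for UTF-8 that byte order coincides with
code point order.  The function name codepoint_sort is made up for this
example.

```shell
#!/usr/bin/env bash
# Hypothetical helper: sort lines by raw byte value.
# For valid UTF-8 input, byte order equals Unicode code point order,
# so no locale-specific collation rules are involved.
codepoint_sort() {
    LC_ALL=C sort
}

# Shuffled superscript and subscript digits come back in code point order.
printf '%s\n' ₉ ⁰ ₀ ⁹ ⁴ | codepoint_sort
```

    Forcing LC_ALL=C only for the sort call keeps the rest of the
session in its usual locale.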
> Now you need to ask yourself the question: which locale do you want to
> sort according to? I used en_US.UTF-8. Please don't say "C.UTF-8",
> because that's not actually a thing. And the plain C locale won't work
> for obvious reasons...
>   
----
    I don't need to ask myself which locale to use if I suggested using
Unicode code point order.
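
    As a rough sketch of what "generating the characters between two
code points" could look like in shell today, without any new brace
expansion syntax: the helper names utf8_char and codepoint_range are
hypothetical, and the UTF-8 bytes are computed by hand so that the
output does not depend on the current locale.  Surrogates and
unassigned code points are not filtered out.

```shell
#!/usr/bin/env bash
# Hypothetical helper: print the UTF-8 encoding of one code point.
# The argument is a number such as 0x2074 or 8308.
utf8_char() {
    local cp=$(( $1 )) esc
    if [ "$cp" -lt 128 ]; then          # below U+0080: 1 byte
        esc=$(printf '\\0%03o' "$cp")
    elif [ "$cp" -lt 2048 ]; then       # below U+0800: 2 bytes
        esc=$(printf '\\0%03o\\0%03o' \
            $(( 0xC0 | cp >> 6 )) $(( 0x80 | cp & 0x3F )))
    elif [ "$cp" -lt 65536 ]; then      # below U+10000: 3 bytes
        esc=$(printf '\\0%03o\\0%03o\\0%03o' \
            $(( 0xE0 | cp >> 12 )) $(( 0x80 | cp >> 6 & 0x3F )) \
            $(( 0x80 | cp & 0x3F )))
    else                                # up to U+10FFFF: 4 bytes
        esc=$(printf '\\0%03o\\0%03o\\0%03o\\0%03o' \
            $(( 0xF0 | cp >> 18 )) $(( 0x80 | cp >> 12 & 0x3F )) \
            $(( 0x80 | cp >> 6 & 0x3F )) $(( 0x80 | cp & 0x3F )))
    fi
    # %b expands the \0ooo octal escapes into the actual bytes.
    printf '%b' "$esc"
}

# Hypothetical helper: print every character from code point $1
# through code point $2 (inclusive), in code point order.
codepoint_range() {
    local cp=$(( $1 )) last=$(( $2 ))
    while [ "$cp" -le "$last" ]; do
        utf8_char "$cp"
        cp=$(( cp + 1 ))
    done
    printf '\n'
}

codepoint_range 0x2074 0x2079   # superscript four through nine
codepoint_range 0x2080 0x2089   # subscript zero through nine
```

    The range is walked purely by numeric code point value, so no
locale or collation table is consulted at any step.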
