| From | L A Walsh <bash@tlinx.org> |
|---|---|
| Newsgroups | gnu.bash.bug |
| Subject | Re: Unicode range and enumeration support. |
| Date | 2019-12-23 12:52 -0800 |
| Message-ID | <mailman.1335.1577134334.1979.bug-bash@gnu.org> |
| References | (9 earlier) <5DFA7AE2.2060504@tlinx.org> <20191218194651.GH851@eeg.ccf.org> <5DFD68B9.3050202@tlinx.org> <2334eff4-8a88-18ee-b086-4ba4e80af01b@archlinux.org> <5E0128F0.5000901@tlinx.org> |
On 2019/12/21 22:38, Eli Schwartz wrote:
> On 12/20/19 7:35 PM, L A Walsh wrote:
>
>>
>> ⁰⁴⁵⁶⁷⁸⁹₀₁₂₃₄₅₆₇₈₉
>>
>> Q.E.D.
>>
>>
>> Is that sufficient proof?
>>
>
> It's sufficient proof that you're wrong, yes.
>
If you only knew how to use the tools you have on your machine.
> Given the discussion was about collation,
---
But it wasn't. It was about generating characters between two
characters that were given. In unicode, that would be two code points.
Nothing about enumeration.
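
As a rough illustration of "generating characters between two code points", here is a minimal bash sketch. It is not an existing bash feature: `utf8_encode` and `cp_range` are hypothetical helper names, the endpoints are passed as numeric code points rather than as literal characters, and the UTF-8 bytes are computed by hand so the result does not depend on the current locale.

```shell
#!/usr/bin/env bash
# Sketch only: utf8_encode and cp_range are hypothetical helpers, not bash builtins.

# Print the UTF-8 encoding of one code point given as a number (e.g. 0x2080).
utf8_encode() {
  local cp=$(( $1 )) bytes=() b out=
  if (( cp < 0x80 )); then
    bytes=( "$cp" )
  elif (( cp < 0x800 )); then
    bytes=( $(( 0xC0 | (cp >> 6) )) $(( 0x80 | (cp & 0x3F) )) )
  elif (( cp < 0x10000 )); then
    bytes=( $(( 0xE0 | (cp >> 12) ))
            $(( 0x80 | ((cp >> 6) & 0x3F) ))
            $(( 0x80 | (cp & 0x3F) )) )
  else
    bytes=( $(( 0xF0 | (cp >> 18) ))
            $(( 0x80 | ((cp >> 12) & 0x3F) ))
            $(( 0x80 | ((cp >> 6) & 0x3F) ))
            $(( 0x80 | (cp & 0x3F) )) )
  fi
  # Turn each byte value into a \xHH escape and let printf %b emit the raw bytes.
  for b in "${bytes[@]}"; do
    printf -v b '\\x%02x' "$b"
    out+=$b
  done
  printf '%b' "$out"
}

# Print every character from code point $1 through code point $2, inclusive.
cp_range() {
  local cp
  for (( cp = $1; cp <= $2; cp++ )); do
    utf8_encode "$cp"
  done
  printf '\n'
}

cp_range 0x2080 0x2089   # prints ₀₁₂₃₄₅₆₇₈₉
```

The range here is purely numeric, so no collation order is consulted; whether that is the behaviour a shell range expression should have is the point under dispute in this thread.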
> not simply enumerating
> codepoints in order of their codepoint values, it would be helpful to
> actually, you know, collate them.
>
> Given your sample text range:
>
> $ printf %s\\n ⁰ ⁴ ⁵ ⁶ ⁷ ⁸ ⁹ ₀ ₁ ₂ ₃ ₄ ₅ ₆ ₇ ₈ ₉ | sort -u
>
> ⁰
> ⁴
> ⁵
> ⁶
> ⁷
> ⁸
> ⁹
> ₀
> ₁
> ₂
> ₃
> ₄
> ₅
> ₆
> ₇
> ₈
> ₉
>
> This is plainly not in byte order.
>
----
It is in unicode code point order. Which is what you would use
for unicode. If you want to sort via unicode, use the -u switch.
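
For reference, one way to get code point order out of sort(1) for UTF-8 input, sketched under the assumption that the input is valid UTF-8: force byte-wise comparison with `LC_ALL=C`, since UTF-8 byte order matches code point order. In POSIX sort the `-u` option suppresses duplicate lines and does not select an ordering. `codepoint_sort` is a hypothetical wrapper name used only for this example.

```shell
#!/usr/bin/env bash
# Sketch only: codepoint_sort is a hypothetical wrapper, not a standard tool.
# LC_ALL=C makes sort compare raw bytes; for valid UTF-8 input that ordering
# is the same as ordering by Unicode code point.
codepoint_sort() {
  LC_ALL=C sort
}

# Prints ⁰ ⁴ ⁹ ₀ ₉, one per line (U+2070, U+2074, U+2079, U+2080, U+2089).
printf '%s\n' ₉ ⁰ ₀ ⁹ ⁴ | codepoint_sort
```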
> Now you need to ask yourself the question: which locale do you want to
> sort according to? I used en_US.UTF-8. Please don't say "C.UTF-8",
> because that's not actually a thing. And the plain C locale won't work
> for obvious reasons...
>
----
I don't need to ask myself what locale if I suggested using unicode
code point order.