Re: Unicode range and enumeration support.

Path csiph.com!goblin1!goblin.stu.neva.ru!usenet.stanford.edu!not-for-mail
From L A Walsh <bash@tlinx.org>
Newsgroups gnu.bash.bug
Subject Re: Unicode range and enumeration support.
Date Mon, 23 Dec 2019 12:57:47 -0800
Lines 34
Approved bug-bash@gnu.org
Message-ID <mailman.1336.1577134680.1979.bug-bash@gnu.org> (permalink)
References <568aeaaa-22b3-c7b9-0e18-a92bef6d2ffb@iki.fi> <5DF2FE31.9070406@tlinx.org> <0ff3a920-94c2-b0c9-5631-0964955657aa@archlinux.org> <5DF3D78B.4090208@tlinx.org> <20191213184213.GO851@eeg.ccf.org> <5DF4BDF0.6000402@tlinx.org> <20191216163906.GV851@eeg.ccf.org> <5DFA7AE2.2060504@tlinx.org> <20191218194651.GH851@eeg.ccf.org> <5DFD68B9.3050202@tlinx.org> <20191223132049.GW851@eeg.ccf.org> <5E012A4B.1090304@tlinx.org>

On 2019/12/23 05:20, Greg Wooledge wrote:
> On Fri, Dec 20, 2019 at 04:35:05PM -0800, L A Walsh wrote:
>   
> You can't simply translate $start and $end to single Unicode code point
> values, enumerate the Unicode characters between those two points,
> and translate those characters back to the user's locale.  That doesn't
> give you the correct answer.  There will be extra characters in the
> Unicode code point range that don't fit the solution, 
You would have to limit your enumeration to the locale range as well --
i.e. check whether each character matches the locale you wanted.

But NOTE -- I never suggested doing locale matching.

I just suggested Unicode code-point enumeration in Unicode CP order as
a first delivered feature.  I thought that would be much easier.
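
For illustration, enumeration in code point order between two characters can be sketched as below. This is a minimal Python sketch, not bash code and not anything proposed in this thread; the function name `codepoint_range` is hypothetical, and skipping the surrogate block (U+D800 to U+DFFF) is an assumption about what a sensible implementation would do, since surrogates are not characters.

```python
def codepoint_range(start, end):
    """Return the characters from start to end (inclusive) in code point order.

    Hypothetical helper: no locale or collation is consulted, only the
    numeric Unicode code point values. A descending range counts down.
    """
    lo, hi = ord(start), ord(end)
    step = 1 if lo <= hi else -1
    return [
        chr(cp)
        for cp in range(lo, hi + step, step)
        # Assumption: surrogate code points are skipped, they are not characters.
        if not 0xD800 <= cp <= 0xDFFF
    ]


if __name__ == "__main__":
    # Roughly what a brace expansion such as {α..ε} would yield under this scheme.
    print(" ".join(codepoint_range("α", "ε")))
```

Because only `ord` and `chr` are involved, the result does not depend on the user's locale, which is the sense in which this is the easier first step.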
 

> The only way to do it is to iterate over the ENTIRE code point space,
> however many millions or billions of characters that is today.
>   
It took less than a tenth of a second in Perl, so probably a fraction
of that in C.

> Is that what you are proposing bash should do, in order to get a working
> brace expansion outside of the C locale?  I don't believe this is an
> acceptable solution.
>   
I said I'd probably go with enumeration between two code points as a
first step, but even going through the entire Unicode code space
is trivially fast on modern computers.
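
For scale, the Unicode code space runs from U+0000 to U+10FFFF, which is 1,114,112 code points, so a full pass is about a million iterations rather than billions. The sketch below shows one way such a pass with a locale-style filter might look. It is a Python illustration under stated assumptions: the function name `chars_in_charset` is hypothetical, and "matches the locale" is approximated as "can be encoded in the locale's character set", which may not be the test bash would need. Timings will differ from the Perl figure quoted above and are not claimed here.

```python
import time


def chars_in_charset(encoding, lo=0, hi=0x10FFFF):
    """Return the characters in [lo, hi] that can be encoded in `encoding`.

    Hypothetical helper: encodability in a charset stands in for
    "belongs to the user's locale", which is an assumption.
    """
    kept = []
    for cp in range(lo, hi + 1):
        if 0xD800 <= cp <= 0xDFFF:
            continue  # surrogates are not characters
        ch = chr(cp)
        try:
            ch.encode(encoding)
        except UnicodeEncodeError:
            continue  # not representable in the target charset
        kept.append(ch)
    return kept


if __name__ == "__main__":
    # Time one pass over the whole code space for an 8-bit charset.
    t0 = time.perf_counter()
    latin9 = chars_in_charset("iso8859-15")
    elapsed = time.perf_counter() - t0
    print(f"{len(latin9)} characters kept, {elapsed:.3f} s")
```

Passing `lo` and `hi` restricts the scan to the two-code-point range from the first step, so the same filter serves both the narrow and the full-space variants.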


