


Re: Could you verify this, Oh Great Unicode Experts of the Python-List?

References <mailman.468.1376201912.1251.python-list@python.org> <520754d7$0$30000$c3e8da3$5496439d@news.astraweb.com> <mailman.474.1376214330.1251.python-list@python.org> <5207722c$0$30000$c3e8da3$5496439d@news.astraweb.com>
From Joshua Landau <joshua@landau.ws>
Date 2013-08-11 12:59 +0100
Subject Re: Could you verify this, Oh Great Unicode Experts of the Python-List?
Newsgroups comp.lang.python
Message-ID <mailman.480.1376222399.1251.python-list@python.org>



On 11 August 2013 12:14, Steven D'Aprano
<steve+comp.lang.python@pearwood.info> wrote:
> On Sun, 11 Aug 2013 10:44:40 +0100, Joshua Landau wrote:
>
>> On 11 August 2013 10:09, Steven D'Aprano
>> <steve+comp.lang.python@pearwood.info> wrote:
>>> The reason some accented letters have single code point forms is to
>>> support legacy charsets; the reason some only exist as combining
>>> characters is due to the combinatorial explosion. Some languages allow
>>> you to add up to five or six different accents on any of dozens of
>>> different letters. If each combination needed its own unique code
>>> point, there wouldn't be enough code points. For bonus points, if there
>>> are five accents that can be placed in any combination of zero or more
>>> on any of four characters, how many code points would be needed?
>>
>> 52?
>
> More than double that.
>
> Consider a single character. It can have 0 to 5 accents, in any
> combination. Order doesn't matter, and there are no duplicates, so there
> are:
>
> 0 accents: take 0 from 5 = 1 combination;
> 1 accent: take 1 from 5 = 5 combinations;
> 2 accents: take 2 from 5 = 5!/(2!*3!) = 10 combinations;
> 3 accents: take 3 from 5 = 5!/(3!*2!) = 10 combinations;
> 4 accents: take 4 from 5 = 5 combinations;
> 5 accents: take 5 from 5 = 1 combination
>
> giving a total of 32 combinations for a single character. Since there are
> four characters in this hypothetical language that take accents, that
> gives a total of 4*32 = 128 distinct code points needed.

I didn't see "four characters", and I did (1 + 5 + 10) * 2 and came up
with 52...
Maybe I should get more sleep.
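
For anyone who wants to check the quoted arithmetic, here is a minimal sketch. It assumes Python 3.8 or later (for math.comb); the names ACCENTS, BASE_LETTERS and accent_sets_per_letter are made up for illustration and come from nobody's post. It sums the "take k from 5" terms and cross-checks the sum by listing every accent subset.

```python
from itertools import combinations
from math import comb  # available from Python 3.8

# Figures from the quoted puzzle; the names are hypothetical.
ACCENTS = 5        # distinct accents available
BASE_LETTERS = 4   # letters that can carry accents


def accent_sets_per_letter(n_accents):
    """Count the ways to choose 0..n_accents accents, ignoring order."""
    return sum(comb(n_accents, k) for k in range(n_accents + 1))


per_letter = accent_sets_per_letter(ACCENTS)
total = BASE_LETTERS * per_letter

# Brute-force cross-check: enumerate every subset of the accents.
all_subsets = [
    subset
    for k in range(ACCENTS + 1)
    for subset in combinations(range(ACCENTS), k)
]

print(per_letter, len(all_subsets), total)
```

The per-letter count is also 2**5, since each accent is either present or absent.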
