Re: Could you verify this, Oh Great Unicode Experts of the Python-List?

References <mailman.468.1376201912.1251.python-list@python.org> <520754d7$0$30000$c3e8da3$5496439d@news.astraweb.com> <mailman.474.1376214330.1251.python-list@python.org> <5207722c$0$30000$c3e8da3$5496439d@news.astraweb.com>
Date 2013-08-11 12:45 +0100
Subject Re: Could you verify this, Oh Great Unicode Experts of the Python-List?
From Chris Angelico <rosuav@gmail.com>
Newsgroups comp.lang.python
Message-ID <mailman.478.1376221550.1251.python-list@python.org>


On Sun, Aug 11, 2013 at 12:14 PM, Steven D'Aprano
<steve+comp.lang.python@pearwood.info> wrote:
> Consider a single character. It can have 0 to 5 accents, in any
> combination. Order doesn't matter, and there are no duplicates, so there
> are:
>
> 0 accent: take 0 from 5 = 1 combination;
> 1 accent: take 1 from 5 = 5 combinations;
> 2 accents: take 2 from 5 = 5!/(2!*3!) = 10 combinations;
> 3 accents: take 3 from 5 = 5!/(3!*2!) = 10 combinations;
> 4 accents: take 4 from 5 = 5 combinations;
> 5 accents: take 5 from 5 = 1 combination
>
> giving a total of 32 combinations for a single character. Since there are
> four characters in this hypothetical language that take accents, that
> gives a total of 4*32 = 128 distinct code points needed.

There's an easy way to calculate it. Instead of the "take N from 5"
notation, simply look at it as a set of independent bits - each of
your accents may be either present or absent. So it's 1<<5
combinations for a single character, which is the same 32 figure you
came up with, but easier to work with in the ridiculous case.
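
A quick sanity check of that equivalence, as a minimal sketch: the names ACCENTS, by_choose, by_bits and by_enum are mine, not from the thread, and math.comb assumes Python 3.8 or later.

```python
from itertools import combinations
from math import comb  # assumption: Python 3.8+

ACCENTS = 5  # the five accents from the quoted hypothetical language

# "Take N from 5", summed over every possible N from 0 to 5.
by_choose = sum(comb(ACCENTS, k) for k in range(ACCENTS + 1))

# One present/absent bit per accent.
by_bits = 1 << ACCENTS

# Brute force: actually enumerate every subset of the accents.
by_enum = sum(
    1
    for k in range(ACCENTS + 1)
    for _ in combinations(range(ACCENTS), k)
)

print(by_choose, by_bits, by_enum)  # 32 32 32
print(4 * by_bits)  # 128, for the four accent-taking characters
```

All three counts agree, but only the shift stays practical once the number of accents gets large.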

> In reality, Unicode currently assigns code points U+0300 to U+036F (112 code
> points) to combining characters. It's not really meaningful to combine
> all 112 of them, or even most of them...

If you *were* to use literally ANY combination, that would be 1<<112
which is... uhh... five billion yottacombinations. Don't bother
working that one out by the "take N" method, it'll take you too long
:)

Oh, and that's 1<<112 possible combining character combinations, so
you then need to multiply that by the number of base characters you
could use....
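
For the curious, a rough sketch of that arithmetic. COMBINING comes from the quoted U+0300 to U+036F range; BASE_CHARS is purely a placeholder I picked (the four accent-taking characters of the hypothetical language), not a real count of Unicode base characters.

```python
# Size of the quoted range U+0300..U+036F, inclusive.
COMBINING = 0x036F - 0x0300 + 1

# Each combining character is either present or absent.
subsets = 1 << COMBINING

# Placeholder only: the four characters from the hypothetical language.
BASE_CHARS = 4

print(COMBINING)                  # 112
print(len(str(subsets)))          # 34 decimal digits
print(subsets // 10**24)          # how many "yotta" (10**24) that is
print(BASE_CHARS * subsets == 1 << (COMBINING + 2))  # True: 4 is just 2 more bits
```

Python's arbitrary-precision ints make the shift exact, so nothing here overflows or rounds.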

ChrisA
