Newsgroups: comp.lang.python
Date: Mon, 21 Mar 2016 08:39:04 -0700 (PDT)
In-Reply-To: <56effbc1$0$1622$c3e8da3$5496439d@news.astraweb.com>
Complaints-To: groups-abuse@google.com
Injection-Info: glegroupsg2000goo.googlegroups.com; posting-host=61.2.48.150; posting-account=mBpa7woAAAAGLEWUUKpmbxm-Quu5D8ui
NNTP-Posting-Host: 61.2.48.150
References: <mailman.8.1457732171.12893.python-list@python.org> <nbvqg5$3cm$1@dont-email.me> <mailman.23.1457749230.12893.python-list@python.org> <56e44258$0$1598$c3e8da3$5496439d@news.astraweb.com> <mailman.46.1457819135.12893.python-list@python.org> <8737rvxs89.fsf@elektro.pacujo.net> <nc2a7s$6dv$1@dont-email.me> <mailman.48.1457831488.12893.python-list@python.org> <nc4ffn$f6s$1@dont-email.me> <56e7483d$0$1608$c3e8da3$5496439d@news.astraweb.com> <nc7kjr$jj8$1@dont-email.me> <ncnhp9$tt7$1@dont-email.me> <mailman.426.1458526919.12893.python-list@python.org> <ncopi1$vgc$1@dont-email.me> <mailman.438.1458565186.12893.python-list@python.org> <56effbc1$0$1622$c3e8da3$5496439d@news.astraweb.com>
User-Agent: G2/1.0
MIME-Version: 1.0
Message-ID: <bae998f2-63e6-4a76-a5f6-4b1dc5addbef@googlegroups.com>
Subject: Re: The Cost of Dynamism (was Re: Pyhon 2.x or 3.x, which is faster?)
From: Rustom Mody <rustompmody@gmail.com>
Injection-Date: Mon, 21 Mar 2016 15:39:04 +0000
Content-Type: text/plain; charset=ISO-8859-1
Lines: 48
Xref: csiph.com comp.lang.python:105364

On Monday, March 21, 2016 at 7:19:03 PM UTC+5:30, Steven D'Aprano wrote:
> On Mon, 21 Mar 2016 11:59 pm, Chris Angelico wrote:
> 
> > On Mon, Mar 21, 2016 at 11:34 PM, BartC  wrote:
> >> For Python I would have used a table of 0..255 functions, indexed by the
> >> ord() code of each character. So all 52 letter codes map to the same
> >> name-handling function. (No Dict is needed at this point.)
> >>
> > 
> > Once again, you forget that there are not 256 characters - there are
> > 1114112. (Give or take.)
> 
> Pardon me, do I understand you correctly? You're saying that the C parser is
> Unicode-aware and allows you to use Unicode in C source code? Because
> Bart's test is for a (simplified?) C tokeniser, and expecting his tokeniser
> to support character sets that C does not would be, well, Not Cricket, my
> good chap.

Sticking to C and integer switches, one would expect that

switch (n)
{
  case 1000: ...
  case 1001:
  case 1002:
  /* ... */
  case 2000:
  default:
}

would compile into faster/tighter code than

switch (n)
{
  case 1: ...
  case 100:
  case 200:
  case 1000:
  case 10000:
  default:
}

IOW if the compiler can detect an arithmetic progression, or a reasonably dense
subset of one, it can emit a jump table.  If not, the switch starts deteriorating
into if-else chains (or, at best, a tree of comparisons).
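
To make the difference concrete, here is roughly what the two lowerings look
like when written out by hand in C. This is only a sketch: the handler names
and return values are made up for illustration, the dense range is cut down to
three cases, and what a real compiler emits depends on the compiler and the
optimization level.

```c
#include <stddef.h>

/* Hypothetical case bodies; each returns a tag so the dispatch can be checked. */
static int case_1000(void)    { return 10; }
static int case_1001(void)    { return 11; }
static int case_1002(void)    { return 12; }
static int case_default(void) { return -1; }

typedef int (*handler)(void);

/* Dense case: one table slot per consecutive case value. */
static const handler dense_table[] = { case_1000, case_1001, case_1002 };

/* One subtraction, one bounds check, one indexed call. */
int dispatch_dense(int n)
{
    /* Values below 1000 wrap around to large numbers and fail the check. */
    unsigned idx = (unsigned)n - 1000u;

    if (idx < sizeof dense_table / sizeof dense_table[0])
        return dense_table[idx]();
    return case_default();
}

/* Sparse case: a table would be mostly holes, so compare one by one. */
int dispatch_sparse(int n)
{
    if (n == 1)     return 10;
    if (n == 100)   return 11;
    if (n == 10000) return 12;
    return -1;
}
```

The dense version costs the same no matter how many cases there are; the
sparse version costs more with every case added.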

The same applies to char, even if char is full Unicode: if the switching is over
a small dense/contiguous subset, a jump table works well at the assembly level,
and therefore so does a switch at the C level.

[And dicts/arrays of functions are OK approximations to that.]
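
A rough sketch of the array-of-functions version for a tokeniser, in C to stay
with the examples above: a 256-entry table indexed by character code, with all
52 letters sharing one handler, and any code point past the table falling
through to a default. The token kinds and function names here are invented for
the example; they are not taken from anyone's actual tokeniser.

```c
#include <stddef.h>

/* Hypothetical token categories for illustration. */
enum token_kind { TOK_OTHER, TOK_NAME, TOK_DIGIT };

typedef enum token_kind (*char_handler)(unsigned long cp);

static enum token_kind on_name(unsigned long cp)  { (void)cp; return TOK_NAME; }
static enum token_kind on_digit(unsigned long cp) { (void)cp; return TOK_DIGIT; }
static enum token_kind on_other(unsigned long cp) { (void)cp; return TOK_OTHER; }

/* One slot per code in the dense range 0..255. */
static char_handler handlers[256];

/* Fill the table once; must be called before classify(). */
void init_handlers(void)
{
    static const char letters[] =
        "abcdefghijklmnopqrstuvwxyzABCDEFGHIJKLMNOPQRSTUVWXYZ";
    static const char digits[] = "0123456789";
    size_t i;

    for (i = 0; i < 256; i++)
        handlers[i] = on_other;
    for (i = 0; letters[i] != '\0'; i++)
        handlers[(unsigned char)letters[i]] = on_name;   /* all 52 share one handler */
    for (i = 0; digits[i] != '\0'; i++)
        handlers[(unsigned char)digits[i]] = on_digit;
}

/* Dense range goes through the table; everything else takes the default. */
enum token_kind classify(unsigned long cp)
{
    if (cp < 256)
        return handlers[cp](cp);
    return on_other(cp);
}
```

The range check is what keeps the table small: the full code point space can be
as large as you like, as long as the interesting part of it is contiguous.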