To: python-list@python.org
From: Terry Reedy <tjreedy@udel.edu>
Subject: Re: String concatenation benchmarking weirdness
Date: Sat, 12 Jan 2013 06:31:09 -0500
Newsgroups: comp.lang.python

On 1/12/2013 3:38 AM, wxjmfauth@gmail.com wrote:
> from timeit import timeit, repeat
>
> size = 1000
>
> r = repeat("y = x + 'a'", setup = "x = 'a' * %i" % size)
> print('1:', r)
> r = repeat("y = x + 'é'", setup = "x = 'a' * %i" % size)
> print('2:', r)
> r = repeat("y = x + 'œ'", setup = "x = 'a' * %i" % size)
> print('3:', r)
> r = repeat("y = x + '€'", setup = "x = 'a' * %i" % size)
> print('4:', r)
> r = repeat("y = x + '€'", setup = "x = '€' * %i" % size)
> print('5:', r)
> r = repeat("y = x + 'œ'", setup = "x = 'œ' * %i" % size)
> print('6:', r)
> r = repeat("y = é + 'œ'", setup = "é = 'œ' * %i" % size)
> print('7:', r)
> r = repeat("y = é + 'œ'", setup = "é = '€' * %i" % size)
> print('8:', r)
>
>
>
>> c:\python32\pythonw -u "vitesse3.py"
> 1: [0.3603178435286996, 0.42901157137281515, 0.35459694357592086]
> 2: [0.3576409223543202, 0.4272010951864649, 0.3590055732104662]
> 3: [0.3552022735516487, 0.4256544908828328, 0.35824546465278573]
> 4: [0.35488168890607774, 0.4271707696118834, 0.36109528098614074]
> 5: [0.3560675370237849, 0.4261538782668417, 0.36138160167082134]
> 6: [0.3570182634788317, 0.4270155971913008, 0.35770629956705324]
> 7: [0.3556977225493485, 0.4264969117143753, 0.3645634239700426]
> 8: [0.35511247834379844, 0.4259628665308437, 0.3580737510097034]
>> Exit code: 0
>> c:\Python33\pythonw -u "vitesse3.py"
> 1: [0.3053600256152646, 0.3306491917840535, 0.3044963374976518]
> 2: [0.36252767208680514, 0.36937298133086727, 0.3685573415262271]
> 3: [0.7666293438924097, 0.7653473991487574, 0.7630926729867262]
> 4: [0.7636680712265038, 0.7647586103955284, 0.7631395397838059]
> 5: [0.44721085450773934, 0.3863234021671369, 0.45664368355696094]
> 6: [0.44699700013114807, 0.3873974001136613, 0.45167383387335036]
> 7: [0.4465200615491014, 0.387050034441188, 0.45459690419205856]
> 8: [0.44760587465455437, 0.3875261853459726, 0.45421212384964704]
>> Exit code: 0
>
>
> The difference between a correct (coherent) unicode handling and ...

By 'correct' Jim means 'speedy' for a subset of string operations*,
rather than 'accurate'. In 3.2 and before, CPython does not handle
extended plane characters (those outside the Basic Multilingual Plane)
correctly on Windows and other narrow builds. This is, by the way, true
of many other languages. For instance, Tcl 8.5 and before (not sure
about the new 8.6) does not handle them at all. The same is true of
Microsoft command windows.
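
To make 'narrow build' concrete, here is a minimal sketch (not part of
Jim's script). It reports which kind of build is running and compares
the number of code points with the number of UTF-16 code units for one
character outside the BMP. The character U+1F600 is only an
illustrative choice.

```python
import sys

# One character outside the Basic Multilingual Plane (illustrative choice).
s = '\U0001F600'

# 0xffff on a narrow build (e.g. 3.2 and before on Windows),
# 0x10ffff on a wide build and on every build from 3.3 on.
print(hex(sys.maxunicode))

# A narrow build stored this character as a surrogate pair, so len(s) was 2.
# From 3.3 on, len(s) is 1 on all platforms.
print(len(s))

# Number of UTF-16 code units, which is what a narrow build counted.
utf16_units = len(s.encode('utf-16-le')) // 2
print(utf16_units)
```

Run under 3.2 on Windows and under 3.3, the second printed value is
where the two versions disagree.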

* Let's try another comparison:

from timeit import timeit
print(timeit("a.encode()", "a = 'a'*10000"))

3.2: 12.1 seconds
3.3:  0.7 seconds

3.3 is more than 15 times faster!!! (The factor increases with the length of a.)
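
To check the claim about length on your own machine, a sketch along
these lines times the same statement for a few lengths. The lengths and
the reduced loop count are arbitrary choices made so that it finishes
quickly; the numbers depend on the machine, so none are quoted here.

```python
from timeit import timeit

# Lengths are arbitrary; number is kept small so this finishes quickly.
lengths = [100, 1000, 10000]
results = {}
for n in lengths:
    setup = "a = 'a' * %d" % n
    # Total seconds for 10000 calls of a.encode() on a string of length n.
    results[n] = timeit("a.encode()", setup, number=10000)
    print(n, results[n])
```

Run it under both 3.2 and 3.3 and divide the corresponding timings to
see how the factor changes with length.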

A fairer comparison is the approximately 120 micro benchmarks in
Tools/stringbench/stringbench.py. The script is in the Tools directory
of some distributions but not all (the Windows distribution does not
include it). It can be downloaded from
http://hg.python.org/cpython/file/6fe28afa6611/Tools/stringbench

In Firefox, right-click on the stringbench.py link and 'Save link as...'
to somewhere you can run it from.

Here are the results, uncensored, for 3.3.0 and 3.2.3.

 >>>
stringbench v2.0
3.3.0 (v3.3.0:bd8afb90ebf2, Sep 29 2012, 10:57:17) [MSC v.1600 64 bit (AMD64)]
2013-01-12 06:17:51.685781
bytes	unicode
(in ms)	(in ms)	%	comment
========== case conversion -- dense
0.41	0.43	95.2	("WHERE IN THE WORLD IS CARMEN SAN DEIGO?"*10).lower() (*1000)
0.42	0.43	95.8	("where in the world is carmen san deigo?"*10).upper() (*1000)
========== case conversion -- rare
0.41	0.43	95.8	("Where in the world is Carmen San Deigo?"*10).lower() (*1000)
0.42	0.43	96.3	("wHERE IN THE WORLD IS cARMEN sAN dEIGO?"*10).upper() (*1000)
========== concat 20 strings of words length 4 to 15
1.83	1.95	94.1	s1+s2+s3+s4+...+s20 (*1000)
========== concat two strings
0.10	0.10	98.7	"Andrew"+"Dalke" (*1000)
========== count AACT substrings in DNA example
2.46	2.44	100.9	dna.count("AACT") (*10)
========== count newlines
0.77	0.75	103.6	...text.with.2000.newlines.count("\n") (*10)
========== early match, single character
0.30	0.27	110.5	("A"*1000).find("A") (*1000)
0.45	0.06	750.5	"A" in "A"*1000 (*1000)
0.30	0.27	110.4	("A"*1000).index("A") (*1000)
0.24	0.22	107.2	("A"*1000).partition("A") (*1000)
0.33	0.29	116.6	("A"*1000).rfind("A") (*1000)
0.32	0.29	107.9	("A"*1000).rindex("A") (*1000)
0.20	0.21	94.1	("A"*1000).rpartition("A") (*1000)
0.42	0.45	93.4	("A"*1000).rsplit("A", 1) (*1000)
0.39	0.41	95.9	("A"*1000).split("A", 1) (*1000)
========== early match, two characters
0.32	0.27	121.1	("AB"*1000).find("AB") (*1000)
0.45	0.06	729.5	"AB" in "AB"*1000 (*1000)
0.30	0.27	111.2	("AB"*1000).index("AB") (*1000)
0.23	0.28	85.0	("AB"*1000).partition("AB") (*1000)
0.33	0.30	110.6	("AB"*1000).rfind("AB") (*1000)
0.33	0.30	110.5	("AB"*1000).rindex("AB") (*1000)
0.22	0.27	83.1	("AB"*1000).rpartition("AB") (*1000)
0.46	0.47	96.7	("AB"*1000).rsplit("AB", 1) (*1000)
0.44	0.48	90.9	("AB"*1000).split("AB", 1) (*1000)
========== endswith multiple characters
0.24	0.29	84.0	"Andrew".endswith("Andrew") (*1000)
========== endswith multiple characters - not!
0.26	0.28	92.9	"Andrew".endswith("Anders") (*1000)
========== endswith single character
0.25	0.28	90.0	"Andrew".endswith("w") (*1000)
========== formatting a string type with a dict
N/A	0.67	0.0	"The %(k1)s is %(k2)s the %(k3)s."%{"k1":"x","k2":"y","k3":"z",} (*1000)
========== join empty string, with 1 character sep
N/A	0.06	0.0	"A".join("") (*100)
========== join empty string, with 5 character sep
N/A	0.06	0.0	"ABCDE".join("") (*100)
========== join list of 100 words, with 1 character sep
0.87	1.27	68.8	"A".join(["Bob"]*100)) (*1000)
========== join list of 100 words, with 5 character sep
1.14	1.54	74.0	"ABCDE".join(["Bob"]*100)) (*1000)
========== join list of 26 characters, with 1 character sep
0.27	0.37	72.0	"A".join(list("ABC..Z")) (*1000)
========== join list of 26 characters, with 5 character sep
0.32	0.43	75.7	"ABCDE".join(list("ABC..Z")) (*1000)
========== join string with 26 characters, with 1 character sep
N/A	1.30	0.0	"A".join("ABC..Z") (*1000)
========== join string with 26 characters, with 5 character sep
N/A	1.37	0.0	"ABCDE".join("ABC..Z") (*1000)
========== late match, 100 characters
3.25	3.23	100.5	s="ABC"*33; ((s+"D")*500+s+"E").find(s+"E") (*100)
2.79	2.78	100.4	s="ABC"*33; ((s+"D")*500+"E"+s).find("E"+s) (*100)
1.98	1.94	102.3	s="ABC"*33; (s+"E") in ((s+"D")*300+s+"E") (*100)
3.24	3.23	100.3	s="ABC"*33; ((s+"D")*500+s+"E").index(s+"E") (*100)
4.26	3.62	117.7	s="ABC"*33; ((s+"D")*500+s+"E").partition(s+"E") (*100)
3.23	3.23	100.1	s="ABC"*33; ("E"+s+("D"+s)*500).rfind("E"+s) (*100)
2.32	2.32	100.1	s="ABC"*33; (s+"E"+("D"+s)*500).rfind(s+"E") (*100)
3.23	3.21	100.8	s="ABC"*33; ("E"+s+("D"+s)*500).rindex("E"+s) (*100)
3.58	3.57	100.4	s="ABC"*33; ("E"+s+("D"+s)*500).rpartition("E"+s) (*100)
3.60	3.60	100.0	s="ABC"*33; ("E"+s+("D"+s)*500).rsplit("E"+s, 1) (*100)
3.60	3.56	101.2	s="ABC"*33; ((s+"D")*500+s+"E").split(s+"E", 1) (*100)
========== late match, two characters
0.62	0.58	106.3	("AB"*300+"C").find("BC") (*1000)
0.92	0.82	111.8	("AB"*300+"CA").find("CA") (*1000)
0.73	0.33	218.8	"BC" in ("AB"*300+"C") (*1000)
0.61	0.60	101.0	("AB"*300+"C").index("BC") (*1000)
0.54	0.82	66.4	("AB"*300+"C").partition("BC") (*1000)
0.66	0.63	104.6	("C"+"AB"*300).rfind("CA") (*1000)
0.91	0.88	102.3	("BC"+"AB"*300).rfind("BC") (*1000)
0.65	0.62	105.1	("C"+"AB"*300).rindex("CA") (*1000)
0.53	0.56	94.5	("C"+"AB"*300).rpartition("CA") (*1000)
0.75	0.77	96.6	("C"+"AB"*300).rsplit("CA", 1) (*1000)
0.65	0.67	97.0	("AB"*300+"C").split("BC", 1) (*1000)
========== no match, single character
0.89	0.87	102.3	("A"*1000).find("B") (*1000)
1.03	0.64	159.1	"B" in "A"*1000 (*1000)
0.67	0.68	98.7	("A"*1000).partition("B") (*1000)
0.87	0.85	102.8	("A"*1000).rfind("B") (*1000)
0.67	0.68	98.5	("A"*1000).rpartition("B") (*1000)
0.87	0.87	99.2	("A"*1000).rsplit("B", 1) (*1000)
0.86	0.85	101.5	("A"*1000).split("B", 1) (*1000)
========== no match, two characters
1.22	1.16	104.9	("AB"*1000).find("BC") (*1000)
1.93	2.02	95.2	("AB"*1000).find("CA") (*1000)
1.37	0.94	145.3	"BC" in "AB"*1000 (*1000)
1.39	2.14	65.1	("AB"*1000).partition("BC") (*1000)
2.32	2.31	100.7	("AB"*1000).rfind("BC") (*1000)
1.47	1.44	102.1	("AB"*1000).rfind("CA") (*1000)
2.26	2.27	99.7	("AB"*1000).rpartition("BC") (*1000)
2.46	2.45	100.2	("AB"*1000).rsplit("BC", 1) (*1000)
1.15	1.16	99.1	("AB"*1000).split("BC", 1) (*1000)
========== quick replace multiple character match
0.13	0.12	105.0	("A" + ("Z"*128*1024)).replace("AZZ", "BBZZ", 1) (*10)
========== quick replace single character match
0.12	0.12	105.2	("A" + ("Z"*128*1024)).replace("A", "BB", 1) (*10)
========== repeat 1 character 10 times
0.08	0.10	80.6	"A"*10 (*1000)
========== repeat 1 character 1000 times
0.16	0.18	93.1	"A"*1000 (*1000)
========== repeat 5 characters 10 times
0.11	0.13	84.4	"ABCDE"*10 (*1000)
========== repeat 5 characters 1000 times
0.39	0.41	94.8	"ABCDE"*1000 (*1000)
========== replace and expand multiple characters, big string
2.02	2.36	85.6	"...text.with.2000.newlines...replace("\n", "\r\n") (*10)
========== replace multiple characters, dna
3.12	3.23	96.6	dna.replace("ATC", "ATT") (*10)
========== replace single character
0.33	0.40	82.4	"This is a test".replace(" ", "\t") (*1000)
========== replace single character, big string
0.75	0.86	87.4	"...text.with.2000.lines...replace("\n", " ") (*10)
========== replace/remove multiple characters
0.41	0.48	86.1	"When shall we three meet again?".replace("ee", "") (*1000)
========== split 1 whitespace
0.14	0.18	79.3	("Here are some words. "*2).partition(" ") (*1000)
0.11	0.14	75.1	("Here are some words. "*2).rpartition(" ") (*1000)
0.35	0.39	90.3	("Here are some words. "*2).rsplit(None, 1) (*1000)
0.32	0.38	83.9	("Here are some words. "*2).split(None, 1) (*1000)
========== split 2000 newlines
1.74	2.02	86.3	"...text...".rsplit("\n") (*10)
1.69	1.97	85.5	"...text...".split("\n") (*10)
1.89	2.55	74.0	"...text...".splitlines() (*10)
========== split newlines
0.35	0.39	88.9	"this\nis\na\ntest\n".rsplit("\n") (*1000)
0.34	0.40	86.4	"this\nis\na\ntest\n".split("\n") (*1000)
0.32	0.40	80.7	"this\nis\na\ntest\n".splitlines() (*1000)
========== split on multicharacter separator (dna)
2.28	2.30	99.1	dna.rsplit("ACTAT") (*10)
2.63	2.66	98.9	dna.split("ACTAT") (*10)
========== split on multicharacter separator (small)
0.55	0.69	79.0	"this--is--a--test--of--the--emergency--broadcast--system".rsplit("--") (*1000)
0.58	0.70	82.9	"this--is--a--test--of--the--emergency--broadcast--system".split("--") (*1000)
========== split whitespace (huge)
1.51	2.12	71.4	human_text.rsplit() (*10)
1.51	2.05	73.6	human_text.split() (*10)
========== split whitespace (small)
0.48	0.68	70.1	("Here are some words. "*2).rsplit() (*1000)
0.48	0.64	74.9	("Here are some words. "*2).split() (*1000)
========== startswith multiple characters
0.24	0.25	95.9	"Andrew".startswith("Andrew") (*1000)
========== startswith multiple characters - not!
0.24	0.25	95.7	"Andrew".startswith("Anders") (*1000)
========== startswith single character
0.23	0.25	95.4	"Andrew".startswith("A") (*1000)
========== strip terminal newline
0.09	0.21	44.1	s="Hello!\n"; s[:-1] if s[-1]=="\n" else s (*1000)
0.09	0.12	74.0	"\nHello!".rstrip() (*1000)
0.09	0.12	74.0	"Hello!\n".rstrip() (*1000)
0.09	0.12	71.6	"\nHello!\n".strip() (*1000)
0.09	0.12	73.2	"\nHello!".strip() (*1000)
0.09	0.12	72.9	"Hello!\n".strip() (*1000)
========== strip terminal spaces and tabs
0.09	0.13	69.6	"\t   \tHello".rstrip() (*1000)
0.09	0.13	72.3	"Hello\t   \t".rstrip() (*1000)
0.07	0.08	86.8	"Hello\t   \t".strip() (*1000)
========== tab split
0.59	0.65	90.9	GFF3_example.rsplit("\t", 8) (*1000)
0.55	0.59	94.2	GFF3_example.rsplit("\t") (*1000)
0.52	0.57	90.7	GFF3_example.split("\t", 8) (*1000)
0.52	0.57	90.1	GFF3_example.split("\t") (*1000)
108.87	116.31	93.6	TOTAL
 >>>
stringbench v2.0
3.2.3 (default, Apr 11 2012, 07:12:16) [MSC v.1500 64 bit (AMD64)]
2013-01-12 06:23:05.994000
bytes	unicode
(in ms)	(in ms)	%	comment
========== case conversion -- dense
0.63	3.01	21.0	("WHERE IN THE WORLD IS CARMEN SAN DEIGO?"*10).lower() (*1000)
0.63	2.90	21.5	("where in the world is carmen san deigo?"*10).upper() (*1000)
========== case conversion -- rare
0.84	2.83	29.8	("Where in the world is Carmen San Deigo?"*10).lower() (*1000)
0.50	3.47	14.3	("wHERE IN THE WORLD IS cARMEN sAN dEIGO?"*10).upper() (*1000)
========== concat 20 strings of words length 4 to 15
1.82	1.75	103.9	s1+s2+s3+s4+...+s20 (*1000)
========== concat two strings
0.09	0.08	115.5	"Andrew"+"Dalke" (*1000)
========== count AACT substrings in DNA example
2.40	2.64	91.1	dna.count("AACT") (*10)
========== count newlines
0.77	0.75	101.6	...text.with.2000.newlines.count("\n") (*10)
========== early match, single character
0.19	0.18	101.9	("A"*1000).find("A") (*1000)
0.39	0.05	824.7	"A" in "A"*1000 (*1000)
0.19	0.19	96.3	("A"*1000).index("A") (*1000)
0.20	0.22	87.5	("A"*1000).partition("A") (*1000)
0.20	0.20	101.8	("A"*1000).rfind("A") (*1000)
0.20	0.20	101.2	("A"*1000).rindex("A") (*1000)
0.18	0.22	82.5	("A"*1000).rpartition("A") (*1000)
0.41	0.45	91.7	("A"*1000).rsplit("A", 1) (*1000)
0.42	0.43	99.0	("A"*1000).split("A", 1) (*1000)
========== early match, two characters
0.19	0.19	102.3	("AB"*1000).find("AB") (*1000)
0.39	0.05	781.6	"AB" in "AB"*1000 (*1000)
0.19	0.20	97.9	("AB"*1000).index("AB") (*1000)
0.23	0.33	71.1	("AB"*1000).partition("AB") (*1000)
0.20	0.20	101.6	("AB"*1000).rfind("AB") (*1000)
0.20	0.20	100.1	("AB"*1000).rindex("AB") (*1000)
0.22	0.31	70.4	("AB"*1000).rpartition("AB") (*1000)
0.47	0.53	90.0	("AB"*1000).rsplit("AB", 1) (*1000)
0.45	0.52	85.0	("AB"*1000).split("AB", 1) (*1000)
========== endswith multiple characters
0.18	0.18	97.6	"Andrew".endswith("Andrew") (*1000)
========== endswith multiple characters - not!
0.18	0.18	100.4	"Andrew".endswith("Anders") (*1000)
========== endswith single character
0.18	0.18	97.1	"Andrew".endswith("w") (*1000)
========== formatting a string type with a dict
N/A	0.53	0.0	"The %(k1)s is %(k2)s the %(k3)s."%{"k1":"x","k2":"y","k3":"z",} (*1000)
========== join empty string, with 1 character sep
N/A	0.05	0.0	"A".join("") (*100)
========== join empty string, with 5 character sep
N/A	0.05	0.0	"ABCDE".join("") (*100)
========== join list of 100 words, with 1 character sep
1.02	1.02	99.6	"A".join(["Bob"]*100)) (*1000)
========== join list of 100 words, with 5 character sep
1.25	1.48	84.4	"ABCDE".join(["Bob"]*100)) (*1000)
========== join list of 26 characters, with 1 character sep
0.31	0.25	122.9	"A".join(list("ABC..Z")) (*1000)
========== join list of 26 characters, with 5 character sep
0.36	0.41	88.4	"ABCDE".join(list("ABC..Z")) (*1000)
========== join string with 26 characters, with 1 character sep
N/A	1.06	0.0	"A".join("ABC..Z") (*1000)
========== join string with 26 characters, with 5 character sep
N/A	1.22	0.0	"ABCDE".join("ABC..Z") (*1000)
========== late match, 100 characters
2.52	2.68	94.0	s="ABC"*33; ((s+"D")*500+s+"E").find(s+"E") (*100)
2.35	3.06	76.9	s="ABC"*33; ((s+"D")*500+"E"+s).find("E"+s) (*100)
1.55	1.61	96.2	s="ABC"*33; (s+"E") in ((s+"D")*300+s+"E") (*100)
2.51	2.68	94.0	s="ABC"*33; ((s+"D")*500+s+"E").index(s+"E") (*100)
3.57	4.66	76.7	s="ABC"*33; ((s+"D")*500+s+"E").partition(s+"E") (*100)
3.23	3.24	99.8	s="ABC"*33; ("E"+s+("D"+s)*500).rfind("E"+s) (*100)
2.35	2.56	91.7	s="ABC"*33; (s+"E"+("D"+s)*500).rfind(s+"E") (*100)
3.23	3.24	99.8	s="ABC"*33; ("E"+s+("D"+s)*500).rindex("E"+s) (*100)
3.58	3.92	91.4	s="ABC"*33; ("E"+s+("D"+s)*500).rpartition("E"+s) (*100)
3.62	3.96	91.4	s="ABC"*33; ("E"+s+("D"+s)*500).rsplit("E"+s, 1) (*100)
2.89	3.38	85.4	s="ABC"*33; ((s+"D")*500+s+"E").split(s+"E", 1) (*100)
========== late match, two characters
0.52	0.52	99.5	("AB"*300+"C").find("BC") (*1000)
0.69	0.90	76.5	("AB"*300+"CA").find("CA") (*1000)
0.67	0.37	179.2	"BC" in ("AB"*300+"C") (*1000)
0.51	0.53	96.8	("AB"*300+"C").index("BC") (*1000)
0.48	0.81	59.3	("AB"*300+"C").partition("BC") (*1000)
0.55	0.55	101.5	("C"+"AB"*300).rfind("CA") (*1000)
0.85	0.85	100.0	("BC"+"AB"*300).rfind("BC") (*1000)
0.55	0.55	100.3	("C"+"AB"*300).rindex("CA") (*1000)
0.52	0.60	87.1	("C"+"AB"*300).rpartition("CA") (*1000)
0.78	0.82	95.4	("C"+"AB"*300).rsplit("CA", 1) (*1000)
0.65	0.72	91.2	("AB"*300+"C").split("BC", 1) (*1000)
========== no match, single character
0.77	0.77	100.6	("A"*1000).find("B") (*1000)
0.98	0.63	155.1	"B" in "A"*1000 (*1000)
0.66	0.66	99.7	("A"*1000).partition("B") (*1000)
0.77	0.77	100.4	("A"*1000).rfind("B") (*1000)
0.66	0.66	99.7	("A"*1000).rpartition("B") (*1000)
0.88	0.88	100.4	("A"*1000).rsplit("B", 1) (*1000)
0.88	0.87	101.2	("A"*1000).split("B", 1) (*1000)
========== no match, two characters
1.19	1.21	98.1	("AB"*1000).find("BC") (*1000)
1.79	2.51	71.2	("AB"*1000).find("CA") (*1000)
1.28	1.08	119.1	"BC" in "AB"*1000 (*1000)
1.10	2.11	52.1	("AB"*1000).partition("BC") (*1000)
2.37	2.37	100.0	("AB"*1000).rfind("BC") (*1000)
1.36	1.36	100.5	("AB"*1000).rfind("CA") (*1000)
2.25	2.26	99.9	("AB"*1000).rpartition("BC") (*1000)
2.38	2.62	90.7	("AB"*1000).rsplit("BC", 1) (*1000)
1.18	1.30	90.1	("AB"*1000).split("BC", 1) (*1000)
========== quick replace multiple character match
0.12	0.32	37.1	("A" + ("Z"*128*1024)).replace("AZZ", "BBZZ", 1) (*10)
========== quick replace single character match
0.12	0.30	37.9	("A" + ("Z"*128*1024)).replace("A", "BB", 1) (*10)
========== repeat 1 character 10 times
0.08	0.09	90.3	"A"*10 (*1000)
========== repeat 1 character 1000 times
0.16	0.19	82.2	"A"*1000 (*1000)
========== repeat 5 characters 10 times
0.11	0.12	98.3	"ABCDE"*10 (*1000)
========== repeat 5 characters 1000 times
0.40	0.58	67.9	"ABCDE"*1000 (*1000)
========== replace and expand multiple characters, big string
1.95	2.13	91.7	"...text.with.2000.newlines...replace("\n", "\r\n") (*10)
========== replace multiple characters, dna
2.93	3.25	90.3	dna.replace("ATC", "ATT") (*10)
========== replace single character
0.25	0.26	96.6	"This is a test".replace(" ", "\t") (*1000)
========== replace single character, big string
0.73	1.01	72.0	"...text.with.2000.lines...replace("\n", " ") (*10)
========== replace/remove multiple characters
0.30	0.34	89.0	"When shall we three meet again?".replace("ee", "") (*1000)
========== split 1 whitespace
0.12	0.13	93.3	("Here are some words. "*2).partition(" ") (*1000)
0.11	0.11	98.8	("Here are some words. "*2).rpartition(" ") (*1000)
0.32	0.37	86.5	("Here are some words. "*2).rsplit(None, 1) (*1000)
0.32	0.33	96.9	("Here are some words. "*2).split(None, 1) (*1000)
========== split 2000 newlines
1.76	2.19	80.5	"...text...".rsplit("\n") (*10)
1.72	2.10	81.9	"...text...".split("\n") (*10)
1.87	2.58	72.4	"...text...".splitlines() (*10)
========== split newlines
0.36	0.34	103.9	"this\nis\na\ntest\n".rsplit("\n") (*1000)
0.35	0.33	105.9	"this\nis\na\ntest\n".split("\n") (*1000)
0.31	0.34	89.7	"this\nis\na\ntest\n".splitlines() (*1000)
========== split on multicharacter separator (dna)
2.18	2.34	93.4	dna.rsplit("ACTAT") (*10)
2.50	2.64	94.5	dna.split("ACTAT") (*10)
========== split on multicharacter separator (small)
0.59	0.62	95.3	"this--is--a--test--of--the--emergency--broadcast--system".rsplit("--") (*1000)
0.55	0.59	93.1	"this--is--a--test--of--the--emergency--broadcast--system".split("--") (*1000)
========== split whitespace (huge)
1.54	2.34	65.5	human_text.rsplit() (*10)
1.51	2.22	68.3	human_text.split() (*10)
========== split whitespace (small)
0.46	0.60	76.5	("Here are some words. "*2).rsplit() (*1000)
0.45	0.51	87.6	("Here are some words. "*2).split() (*1000)
========== startswith multiple characters
0.18	0.18	97.3	"Andrew".startswith("Andrew") (*1000)
========== startswith multiple characters - not!
0.18	0.18	100.1	"Andrew".startswith("Anders") (*1000)
========== startswith single character
0.17	0.18	96.8	"Andrew".startswith("A") (*1000)
========== strip terminal newline
0.11	0.21	52.0	s="Hello!\n"; s[:-1] if s[-1]=="\n" else s (*1000)
0.06	0.07	92.1	"\nHello!".rstrip() (*1000)
0.06	0.07	92.2	"Hello!\n".rstrip() (*1000)
0.06	0.07	91.2	"\nHello!\n".strip() (*1000)
0.06	0.07	91.1	"\nHello!".strip() (*1000)
0.06	0.07	91.1	"Hello!\n".strip() (*1000)
========== strip terminal spaces and tabs
0.07	0.07	89.4	"\t   \tHello".rstrip() (*1000)
0.07	0.07	91.4	"Hello\t   \t".rstrip() (*1000)
0.04	0.05	88.7	"Hello\t   \t".strip() (*1000)
========== tab split
0.57	0.56	100.8	GFF3_example.rsplit("\t", 8) (*1000)
0.53	0.53	100.7	GFF3_example.rsplit("\t") (*1000)
0.49	0.49	101.2	GFF3_example.split("\t", 8) (*1000)
0.51	0.49	103.5	GFF3_example.split("\t") (*1000)
102.13	125.57	81.3	TOTAL
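
For a quick summary of the two TOTAL rows, the following sketch
recomputes the % column and compares the totals across versions. The
numbers are copied from the output above. That the % column is bytes
time divided by unicode time is inferred from the rows themselves, not
taken from the stringbench source.

```python
# TOTAL rows copied from the two runs above: (bytes ms, unicode ms).
totals = {
    '3.2.3': (102.13, 125.57),
    '3.3.0': (108.87, 116.31),
}

# Assumed meaning of the % column: bytes time / unicode time * 100.
for version, (b, u) in sorted(totals.items()):
    print(version, round(b / u * 100, 1))

# Change from 3.2.3 to 3.3.0 as a ratio of total times (below 1 is faster).
unicode_ratio = totals['3.3.0'][1] / totals['3.2.3'][1]
bytes_ratio = totals['3.3.0'][0] / totals['3.2.3'][0]
print(round(unicode_ratio, 3), round(bytes_ratio, 3))
```

On these totals the unicode column is somewhat faster in 3.3.0 and the
bytes column somewhat slower; individual rows vary a good deal more
than the totals do.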

-- 
Terry Jan Reedy