
Unicode and Python - how often do you index strings?

Started by	Chris Angelico <rosuav@gmail.com>
First post	2014-06-04 10:39 +1000
Last post	2014-06-05 15:05 -0500
Articles	20 on this page of 40 — 21 participants


  Unicode and Python - how often do you index strings? Chris Angelico <rosuav@gmail.com> - 2014-06-04 10:39 +1000
    Re: Unicode and Python - how often do you index strings? Roy Smith <roy@panix.com> - 2014-06-03 21:18 -0400
      Re: Unicode and Python - how often do you index strings? Chris Angelico <rosuav@gmail.com> - 2014-06-04 12:13 +1000
        Re: Unicode and Python - how often do you index strings? Gregory Ewing <greg.ewing@canterbury.ac.nz> - 2014-06-04 18:48 +1200
          Re: Unicode and Python - how often do you index strings? alister <alister.nospam.ware@ntlworld.com> - 2014-06-04 10:57 +0000
      Re: Unicode and Python - how often do you index strings? alister <alister.nospam.ware@ntlworld.com> - 2014-06-04 10:50 +0000
        Re: Unicode and Python - how often do you index strings? Rustom Mody <rustompmody@gmail.com> - 2014-06-04 05:52 -0700
          Re: Unicode and Python - how often do you index strings? alister <alister.nospam.ware@ntlworld.com> - 2014-06-04 13:36 +0000
    Re: Unicode and Python - how often do you index strings? wxjmfauth@gmail.com - 2014-06-03 23:50 -0700
      Re: Unicode and Python - how often do you index strings? Michael Torrie <torriem@gmail.com> - 2014-06-04 08:50 -0600
        Re: Unicode and Python - how often do you index strings? wxjmfauth@gmail.com - 2014-06-05 00:06 -0700
          Re: Unicode and Python - how often do you index strings? Marko Rauhamaa <marko@pacujo.net> - 2014-06-05 10:20 +0300
          Re: Unicode and Python - how often do you index strings? alister <alister.nospam.ware@ntlworld.com> - 2014-06-05 15:39 +0000
            Re: Unicode and Python - how often do you index strings? Mark H Harris <harrismh777@gmail.com> - 2014-06-05 10:57 -0500
              Re: Unicode and Python - how often do you index strings? Mark Lawrence <breamoreboy@yahoo.co.uk> - 2014-06-05 18:15 +0100
                Re: Unicode and Python - how often do you index strings? alister <alister.nospam.ware@ntlworld.com> - 2014-06-05 17:33 +0000
      Re: Unicode and Python - how often do you index strings? Joshua Landau <joshua@landau.ws> - 2014-06-05 18:18 +0100
    Re: Unicode and Python Rustom Mody <rustompmody@gmail.com> - 2014-06-04 21:25 -0700
      Re: Unicode and Python wxjmfauth@gmail.com - 2014-06-05 00:23 -0700
    Re: Unicode and Python - how often do you index strings? Johannes Bauer <dfnsonfsduifb@gmx.de> - 2014-06-05 18:09 +0200
      Re: Unicode and Python - how often do you index strings? Paul Rubin <no.email@nospam.invalid> - 2014-06-05 11:16 -0700
        Re: Unicode and Python - how often do you index strings? Johannes Bauer <dfnsonfsduifb@gmx.de> - 2014-06-05 20:42 +0200
          Re: Unicode and Python - how often do you index strings? Ryan Hiebert <ryan@ryanhiebert.com> - 2014-06-05 13:52 -0500
            Re: Unicode and Python - how often do you index strings? Paul Rubin <no.email@nospam.invalid> - 2014-06-05 12:58 -0700
              Re: Unicode and Python - how often do you index strings? Ian Kelly <ian.g.kelly@gmail.com> - 2014-06-05 14:18 -0600
                Re: Unicode and Python - how often do you index strings? Johannes Bauer <dfnsonfsduifb@gmx.de> - 2014-06-06 10:47 +0200
                  Re: Unicode and Python - how often do you index strings? Tim Chase <python.list@tim.thechases.com> - 2014-06-06 05:37 -0500
                  Re: Unicode and Python - how often do you index strings? Steven D'Aprano <steve+comp.lang.python@pearwood.info> - 2014-06-06 11:52 +0000
              Re: Unicode and Python - how often do you index strings? Albert-Jan Roskam <fomcl@yahoo.com> - 2014-06-05 13:34 -0700
                Re: Unicode and Python - how often do you index strings? Roy Smith <roy@panix.com> - 2014-06-05 17:00 -0400
                  Re: Unicode and Python - how often do you index strings? Rustom Mody <rustompmody@gmail.com> - 2014-06-05 15:24 -0700
                    Re: Unicode and Python - how often do you index strings? Ned Deily <nad@acm.org> - 2014-06-05 15:57 -0700
                      Re: Unicode and Python - how often do you index strings? Roy Smith <roy@panix.com> - 2014-06-05 20:10 -0400
                        Re: Unicode and Python - how often do you index strings? Ned Deily <nad@acm.org> - 2014-06-05 17:43 -0700
                        Re: Unicode and Python - how often do you index strings? Grant Edwards <invalid@invalid.invalid> - 2014-06-06 14:20 +0000
              Re: Unicode and Python - how often do you index strings? Ian Kelly <ian.g.kelly@gmail.com> - 2014-06-05 18:05 -0600
            Re: Unicode and Python - how often do you index strings? Johannes Bauer <dfnsonfsduifb@gmx.de> - 2014-06-06 10:42 +0200
              Re: Unicode and Python - how often do you index strings? Larry Hudson <orgnut@yahoo.com> - 2014-06-06 20:24 -0700
          Re: Unicode and Python - how often do you index strings? Chris Angelico <rosuav@gmail.com> - 2014-06-06 05:59 +1000
          Re: Unicode and Python - how often do you index strings? Ryan Hiebert <ryan@ryanhiebert.com> - 2014-06-05 15:05 -0500


#72564 — Unicode and Python - how often do you index strings?

From	Chris Angelico <rosuav@gmail.com>
Date	2014-06-04 10:39 +1000
Subject	Unicode and Python - how often do you index strings?
Message-ID	<mailman.10656.1401842403.18130.python-list@python.org>

A current discussion regarding Python's Unicode support centres (or
centers, depending on how close you are to the cent[er]{2} of the
universe) around one critical question: Is string indexing common?

Python strings can be indexed with integers to produce characters
(strings of length 1). They can also be iterated over from beginning
to end. Lots of operations can be built on either one of those two
primitives; the question is, how much can NOT be implemented
efficiently over iteration, and MUST use indexing? Theories are great,
but solid use-cases are better - ideally, examples from actual
production code (actual code optional).

I know the collective experience of python-list can't fail to bring up
a few solid examples here :)

Thanks in advance, all!!

ChrisA


#72569

From	Roy Smith <roy@panix.com>
Date	2014-06-03 21:18 -0400
Message-ID	<roy-9D3770.21181203062014@news.panix.com>
In reply to	#72564

In article <mailman.10656.1401842403.18130.python-list@python.org>,
 Chris Angelico <rosuav@gmail.com> wrote:

> A current discussion regarding Python's Unicode support centres (or
> centers, depending on how close you are to the cent[er]{2} of the
> universe)

<sarcasm style="regex-pedant">Um, you mean cent(er|re), don't you?  The 
pattern you wrote also matches centee and centrr.</sarcasm>
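For anyone who wants to check the pedantry, here is a small sketch (not part of the original post) comparing the two patterns with Python's standard `re` module. The word list and variable names are made up for illustration.

```python
import re

# Character class: any two characters drawn from 'e' and 'r'.
loose = re.compile(r"cent[er]{2}")
# Alternation: exactly "er" or exactly "re".
strict = re.compile(r"cent(er|re)")

words = ["center", "centre", "centee", "centrr"]

loose_hits = [w for w in words if loose.fullmatch(w)]
strict_hits = [w for w in words if strict.fullmatch(w)]

print(loose_hits)   # ['center', 'centre', 'centee', 'centrr']
print(strict_hits)  # ['center', 'centre']
```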

> around one critical question: Is string indexing common?

Not in our code.  I've got 80008 non-blank lines of Python (2.7) source 
handy.  I tried a few heuristics to find patterns which might be string 
indexing.

$ find . -name '*.py' | xargs egrep '\[[^]][0-9]+\]'

and then looked them over manually.  I see this pattern a bunch of times 
(in a single-use script):

data['shard_key'] = hashlib.md5(str(id)).hexdigest()[:4]  

We do this once:

if tz_offset[0] == '-':

We do this somewhere in some command-line parsing:

process_match = args.process[:15]

There's this little gem:

return [dedup(x[1:-1].lower()) for x in 
re.findall('(\[[^\]\[]+\]|\([^\)\(]+\))',title)]

It appears I wrote this one, but I don't remember exactly what I had in 
mind at the time...

withhyphen = number if '-' in number else (number[:-2] + '-' + 
number[-2:]) # big assumption here

Anyway, there's a bunch more, but the bottom line is that in our code, 
indexing into a string (at least explicitly in application source code) 
is a pretty rare thing.
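A rough alternative to the egrep heuristic above, offered only as a sketch and not as what was actually run: parse the source with the standard `ast` module and list subscripts whose index or slice bounds are integer literals. It assumes Python 3.9 or later (where `Subscript.slice` holds the index expression directly), and the name `literal_subscripts` and the sample source are invented for illustration. Like the grep, it cannot tell whether the object being subscripted is a string.

```python
import ast


def literal_subscripts(source):
    """Return (line, text) pairs for subscripts with integer-literal indices."""

    def is_int(node):
        # Accept 3 and also -3, which parses as UnaryOp(USub, Constant(3)).
        if isinstance(node, ast.UnaryOp) and isinstance(node.op, ast.USub):
            node = node.operand
        return isinstance(node, ast.Constant) and type(node.value) is int

    hits = []
    for node in ast.walk(ast.parse(source)):
        if not isinstance(node, ast.Subscript):
            continue
        idx = node.slice
        if isinstance(idx, ast.Slice):
            # A slice counts if every bound that is present is a literal.
            bounds = [b for b in (idx.lower, idx.upper) if b is not None]
            literal = bool(bounds) and all(is_int(b) for b in bounds)
        else:
            literal = is_int(idx)
        if literal:
            text = ast.get_source_segment(source, node)
            hits.append((node.lineno, node.col_offset, text))
    # ast.walk is breadth-first, so sort into source order before returning.
    return [(line, text) for line, _col, text in sorted(hits)]


sample = (
    "a = digest[:4]\n"
    "b = tz[0] == '-'\n"
    "c = number[:-2] + '-' + number[-2:]\n"
    "d = data['shard_key']\n"
    "e = args[i]\n"
)

for line, text in literal_subscripts(sample):
    print(line, text)
```

The dictionary lookup and the variable index on the last two sample lines are skipped, which is the same blind spot the grep pattern has for non-literal indices.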


#72573

From	Chris Angelico <rosuav@gmail.com>
Date	2014-06-04 12:13 +1000
Message-ID	<mailman.10664.1401848034.18130.python-list@python.org>
In reply to	#72569

On Wed, Jun 4, 2014 at 11:18 AM, Roy Smith <roy@panix.com> wrote:
> In article <mailman.10656.1401842403.18130.python-list@python.org>,
>  Chris Angelico <rosuav@gmail.com> wrote:
>
>> A current discussion regarding Python's Unicode support centres (or
>> centers, depending on how close you are to the cent[er]{2} of the
>> universe)
>
> <sarcasm style="regex-pedant">Um, you mean cent(er|re), don't you?  The
> pattern you wrote also matches centee and centrr.</sarcasm>

Maybe there's someone who spells it that way! Let's not be excluding
people. That'd be rude.

>> around one critical question: Is string indexing common?
>
> Not in our code.  I've got 80008 non-blank lines of Python (2.7) source
> handy.  I tried a few heuristics to find patterns which might be string
> indexing.
>
> $ find . -name '*.py' | xargs egrep '\[[^]][0-9]+\]'
>
> and then looked them over manually.  I see this pattern a bunch of times
> (in a single-use script):
>
> data['shard_key'] = hashlib.md5(str(id)).hexdigest()[:4]

Slicing is a form of indexing too, although in this case (slicing from
the front) it could be implemented on top of UTF-8 without much
problem.

> withhyphen = number if '-' in number else (number[:-2] + '-' +
> number[-2:]) # big assumption here

This *definitely* counts; if strings were represented internally in
UTF-8, this would involve two scans (although a smart implementation
could probably count backward rather than forward). By the way, any
time you slice up to the third from the end, you win two extra awesome
points, just for putting [:-3] into your code and having it mean
something. But I digress.

> Anyway, there's a bunch more, but the bottom line is that in our code,
> indexing into a string (at least explicitly in application source code)
> is a pretty rare thing.

Thanks. Of course, the pattern you searched for is looking only for
literals; it's a bit harder to find cases where the index (or slice
position) comes from a variable or expression, and those situations
are also rather harder to optimize (the MD5 prefix is clearly better
scanned from the front, the number tail is clearly better scanned from
the back - but with a variable?).

ChrisA
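To make the "scan from the front" versus "scan from the back" point concrete, here is a minimal sketch (not from the original post) of taking the first or last n code points out of raw UTF-8 bytes without decoding them. It assumes the input is valid UTF-8; the helper names `prefix` and `suffix` and the sample value are hypothetical.

```python
def is_continuation(byte):
    # UTF-8 continuation bytes have the bit pattern 10xxxxxx.
    return byte & 0xC0 == 0x80


def prefix(data, n):
    """First n code points of UTF-8 bytes, found by scanning forward."""
    if n <= 0:
        return b""
    seen = 0
    for pos, byte in enumerate(data):
        if not is_continuation(byte):
            if seen == n:
                # pos is the start of code point n+1, so stop here.
                return data[:pos]
            seen += 1
    return data


def suffix(data, n):
    """Last n code points of UTF-8 bytes, found by scanning backward."""
    if n <= 0:
        return b""
    seen = 0
    for pos in range(len(data) - 1, -1, -1):
        if not is_continuation(data[pos]):
            seen += 1
            if seen == n:
                return data[pos:]
    return data


# The hyphen-insertion example, done on bytes with a multi-byte first character.
number = "\u211612345"  # NUMERO SIGN followed by five digits
raw = number.encode("utf-8")
tail = suffix(raw, 2)
head = raw[: len(raw) - len(tail)]
with_hyphen = (head + b"-" + tail).decode("utf-8")
print(with_hyphen)
```

Either scan costs time proportional to n rather than to the length of the string, which is why fixed positions near one end are the easy case and a variable index in the middle is the hard one.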


#72602

From	Gregory Ewing <greg.ewing@canterbury.ac.nz>
Date	2014-06-04 18:48 +1200
Message-ID	<bv7tprFubtmU1@mid.individual.net>
In reply to	#72573

Chris Angelico wrote:
> On Wed, Jun 4, 2014 at 11:18 AM, Roy Smith <roy@panix.com> wrote:
> 
>><sarcasm style="regex-pedant">Um, you mean cent(er|re), don't you?  The
>>pattern you wrote also matches centee and centrr.</sarcasm>
> 
> Maybe there's someone who spells it that way!

Come visit Pirate Island, the centrr of the universe!

-- 
Pegleg Greg


#72625

From	alister <alister.nospam.ware@ntlworld.com>
Date	2014-06-04 10:57 +0000
Message-ID	<lYCjv.383812$Fw5.30307@fx28.am4>
In reply to	#72602

On Wed, 04 Jun 2014 18:48:29 +1200, Gregory Ewing wrote:

> Chris Angelico wrote:
>> On Wed, Jun 4, 2014 at 11:18 AM, Roy Smith <roy@panix.com> wrote:
>> 
>>><sarcasm style="regex-pedant">Um, you mean cent(er|re), don't you?  The
>>>pattern you wrote also matches centee and centrr.</sarcasm>
>> 
>> Maybe there's someone who spells it that way!
> 
> Come visit Pirate Island, the centrr of the universe!

that should be Cent-argh



-- 
I hope the ``Eurythmics'' practice birth control ...


#72624

From	alister <alister.nospam.ware@ntlworld.com>
Date	2014-06-04 10:50 +0000
Message-ID	<tRCjv.383811$Fw5.146365@fx28.am4>
In reply to	#72569

On Tue, 03 Jun 2014 21:18:12 -0400, Roy Smith wrote:

> In article <mailman.10656.1401842403.18130.python-list@python.org>,
>  Chris Angelico <rosuav@gmail.com> wrote:
> 
>> A current discussion regarding Python's Unicode support centres (or
>> centers, depending on how close you are to the cent[er]{2} of the
>> universe)
> 
> <sarcasm style="regex-pedant">Um, you mean cent(er|re), don't you?  The
> pattern you wrote also matches centee and centrr.</sarcasm>
>
<super pedant mode>
The language is ENGLISH so the correct spelling is Centre regional 
variations my be common but they are incorrect
</super pedant mode>
:-)
-- 
Prepare for tomorrow -- get ready.
		-- Edith Keeler, "The City On the Edge of Forever",
		   stardate unknown


#72634

From	Rustom Mody <rustompmody@gmail.com>
Date	2014-06-04 05:52 -0700
Message-ID	<4759a008-b961-4fb2-9420-812ce1103547@googlegroups.com>
In reply to	#72624

On Wednesday, June 4, 2014 4:20:01 PM UTC+5:30, alister wrote:
> The language is ENGLISH so the correct spelling is Centre regional 
> variations my be common but they are incorrect

"my"?

O mee Oo my -- cockney (or Aussie) pedant??


#72636

From	alister <alister.nospam.ware@ntlworld.com>
Date	2014-06-04 13:36 +0000
Message-ID	<MhFjv.307530$vt2.122460@fx36.am4>
In reply to	#72634

On Wed, 04 Jun 2014 05:52:24 -0700, Rustom Mody wrote:

> On Wednesday, June 4, 2014 4:20:01 PM UTC+5:30, alister wrote:
>> The language is ENGLISH so the correct spelling is Centre regional
>> variations my be common but they are incorrect
> 
> "my"?
> 
> O mee Oo my -- cockney (or Aussie) pedant??

I made no claims about my typing or spelling being correct.
That post was actually quite good fro me usually my typing is worse.
 



-- 
The difference between genius and stupidity is that genius has its limits.


#72603

From	wxjmfauth@gmail.com
Date	2014-06-03 23:50 -0700
Message-ID	<2cefb490-5cec-43ab-8fa8-f99cf9388b5a@googlegroups.com>
In reply to	#72564

Le mercredi 4 juin 2014 02:39:54 UTC+2, Chris Angelico a écrit :
> A current discussion regarding Python's Unicode support centres (or
> 
> centers, depending on how close you are to the cent[er]{2} of the
> 
> universe) around one critical question: Is string indexing common?
> 
> 
> 
> Python strings can be indexed with integers to produce characters
> 
> (strings of length 1). They can also be iterated over from beginning
> 
> to end. Lots of operations can be built on either one of those two
> 
> primitives; the question is, how much can NOT be implemented
> 
> efficiently over iteration, and MUST use indexing? Theories are great,
> 
> but solid use-cases are better - ideally, examples from actual
> 
> production code (actual code optional).
> 
> 
> 
> I know the collective experience of python-list can't fail to bring up
> 
> a few solid examples here :)
> 
> 
> 
> Thanks in advance, all!!
> 
> 
> 
> ChrisA

=============

Like many, you are not understanding unicode because
you do not understand the coding of characters.

You do not understand the coding of the characters
because you do not understand the mathematics behind it.

You focussed on the wrong problem.

(All this stuff has been discussed, tested and worked on
>20 (twenty) years ago.)

Sorry.

jmf


#72641

From	Michael Torrie <torriem@gmail.com>
Date	2014-06-04 08:50 -0600
Message-ID	<mailman.10704.1401893469.18130.python-list@python.org>
In reply to	#72603

On 06/04/2014 12:50 AM, wxjmfauth@gmail.com wrote:
> Like many, you are not understanding unicode because
> you do not understand the coding of characters.

If that is true, then I'm sure a well-written paragraph or two can set
him straight.  You continually berate people for not understanding
unicode, but you've posted nothing to explain anything, nor demonstrate
your own understanding.  That's one reason your posts are so frustrating
and considered trolling.  You never ever explain yourself, instead just
flailing around and muttering about folks not understanding unicode,
just as you've done here, true to form.

> 
> You do not understand the coding of the characters
> because you do not understand the mathematics behind it.

flamebaiting here... FSR *is* UTF-32 internally, compresses off leading
zero bits during string creation.

> You focussed on the wrong problem.

Frankly it is you who is focused on the wrong problem, at least with
this particular thread.  I think you got distracted by the subject line.
 Chris's original post really has nothing to do with unicode at all.
He's simply asking for use cases for string indexing where O(1) is
desired or necessary.  Could be old Python 2 byte strings, or Python 3
unicode strings.  It does not matter.  Unicode is orthogonal to his
question.

Maybe his purpose in asking the question is to justify a fixed-length
encoding scheme (which is what FSR actually is), or maybe it is to
explore the costs of using a much slower, but more compact,
variable-length encoding scheme like UTF-8.  Particularly in the context
of low-memory applications where unicode support would be nice, but
memory is at a premium.  But either way, you got hung up on the wrong thing.
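The fixed-width behaviour described above can be observed, at least roughly, from Python itself. The following is a sketch under the assumption of CPython 3.3 or later; `sys.getsizeof` reports an implementation detail, so other interpreters may give different numbers. The helper name `bytes_per_char` is made up for illustration.

```python
import sys


def bytes_per_char(ch, n=1000):
    """Estimate storage per character by growing a string of `ch` by one."""
    return sys.getsizeof(ch * (n + 1)) - sys.getsizeof(ch * n)


widths = {
    "a": bytes_per_char("a"),                    # ASCII / Latin-1 range
    "\u20ac": bytes_per_char("\u20ac"),          # BMP, beyond Latin-1
    "\U0001F600": bytes_per_char("\U0001F600"),  # outside the BMP
}
print(widths)
```

On CPython with the flexible string representation, the three values should come out as 1, 2 and 4: each string uses the narrowest fixed width that fits its widest character, so indexing stays O(1).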

> 
> (All this stuff has been discussed, tested and worked on
> 20 (twenty) years ago.)
> 
> Sorry.

As am I.


#72682

From	wxjmfauth@gmail.com
Date	2014-06-05 00:06 -0700
Message-ID	<2c02e569-7127-423c-9718-a33cb96082f8@googlegroups.com>
In reply to	#72641

Le mercredi 4 juin 2014 16:50:59 UTC+2, Michael Torrie a écrit :
> On 06/04/2014 12:50 AM, wxjmfauth@gmail.com wrote:
> 
> > Like many, you are not understanding unicode because
> 
> > you do not understand the coding of characters.
> 
> 
> 
> If that is true, then I'm sure a well-written paragraph or two can set
> 
> him straight.  You continually berate people for not understanding
> 
> unicode, but you've posted nothing to explain anything, nor demonstrate
> 
> your own understanding.  That's one reason your posts are so frustrating
> 
> and considered trolling.  You never ever explain yourself, instead just
> 
> flailing around and muttering about folks not understanding unicode,
> 
> just as you've done here, true to form.
> 
> 
> 
> > 
> 
> > You do not understand the coding of the characters
> 
> > because you do not understand the mathematics behind it.
> 
> 
> 
> flamebaiting here... FSR *is* UTF-32 internally, compresses off leading
> 
> zero bits during string creation.
> 
> 
> 
> > You focussed on the wrong problem.
> 
> 
> 
> Frankly it is you who is focused on the wrong problem, at least with
> 
> this particular thread.  I think you got distracted by the subject line.
> 
>  Chris's original post really has nothing to do with unicode at all.
> 
> He's simply asking for use cases for string indexing where O(1) is
> 
> desired or necessary.  Could be old Python 2 byte strings, or Python 3
> 
> unicode strings.  It does not matter.  Unicode is orthogonal to his
> 
> question.
> 
> 
> 
> Maybe his purpose in asking the question is to justify a fixed-length
> 
> encoding scheme (which is what FSR actually is), or maybe it is to
> 
> explore the costs of using a much slower, but more compact,
> 
> variable-length encoding scheme like UTF-8.  Particularly in the context
> 
> of low-memory applications where unicode support would be nice, but
> 
> memory is at a premium.  But either way, you got hung up on the wrong thing.
> 
> 
> 
> > 
> 
> > (All this stuff has been discussed, tested and worked on
> 
> > 20 (twenty) years ago.)
> 
> > 
> 
> > Sorry.
> 
> 
> 
> As am I.

=========

Unicode ?
I have the feeling is similar as explaining,
i (the imaginary number) is not equal to
sqrt(-1).

jmf

PS Once I gave you a link pointing
to unicode.org doc, you obviously did not read it.


#72685

From	Marko Rauhamaa <marko@pacujo.net>
Date	2014-06-05 10:20 +0300
Message-ID	<8761kf7rh2.fsf@elektro.pacujo.net>
In reply to	#72682

wxjmfauth@gmail.com:

> Unicode ?
> I have the feeling is similar as explaining,
> i (the imaginary number) is not equal to
> sqrt(-1).
>
> jmf
>
> PS Once I gave you a link pointing
> to unicode.org doc, you obviously did not read it.

Sir, you are an artist, a poet even!

With admiration,


Marko


#72714

From	alister <alister.nospam.ware@ntlworld.com>
Date	2014-06-05 15:39 +0000
Message-ID	<Sa0kv.254870$ub6.61382@fx32.am4>
In reply to	#72682

On Thu, 05 Jun 2014 00:06:54 -0700, wxjmfauth wrote:

> Le mercredi 4 juin 2014 16:50:59 UTC+2, Michael Torrie a écrit :
>> On 06/04/2014 12:50 AM, wxjmfauth@gmail.com wrote:
>> 
>> > Like many, you are not understanding unicode because
>> 
>> > you do not understand the coding of characters.
>> 
>> 
>> 
>> If that is true, then I'm sure a well-written paragraph or two can set
>> 
>> him straight.  You continually berate people for not understanding
>> 
>> unicode, but you've posted nothing to explain anything, nor demonstrate
>> 
>> your own understanding.  That's one reason your posts are so
>> frustrating
>> 
>> and considered trolling.  You never ever explain yourself, instead just
>> 
>> flailing around and muttering about folks not understanding unicode,
>> 
>> just as you've done here, true to form.
>> 
>> 
>> 
>> 
>> > 
>> > You do not understand the coding of the characters
>> 
>> > because you do not understand the mathematics behind it.
>> 
>> 
>> 
>> flamebaiting here... FSR *is* UTF-32 internally, compresses off leading
>> 
>> zero bits during string creation.
>> 
>> 
>> 
>> > You focussed on the wrong problem.
>> 
>> 
>> 
>> Frankly it is you who is focused on the wrong problem, at least with
>> 
>> this particular thread.  I think you got distracted by the subject
>> line.
>> 
>>  Chris's original post really has nothing to do with unicode at all.
>> 
>> He's simply asking for use cases for string indexing where O(1) is
>> 
>> desired or necessary.  Could be old Python 2 byte strings, or Python 3
>> 
>> unicode strings.  It does not matter.  Unicode is orthogonal to his
>> 
>> question.
>> 
>> 
>> 
>> Maybe his purpose in asking the question is to justify a fixed-length
>> 
>> encoding scheme (which is what FSR actually is), or maybe it is to
>> 
>> explore the costs of using a much slower, but more compact,
>> 
>> variable-length encoding scheme like UTF-8.  Particularly in the
>> context
>> 
>> of low-memory applications where unicode support would be nice, but
>> 
>> memory is at a premium.  But either way, you got hung up on the wrong
>> thing.
>> 
>> 
>> 
>> 
>> > 
>> > (All this stuff has been discussed, tested and worked on
>> 
>> > 20 (twenty) years ago.)
>> 
>> 
>> > 
>> > Sorry.
>> 
>> 
>> 
>> As am I.
> 
> =========
> 
> Unicode ?
> I have the feeling is similar as explaining,
> i (the imaginary number) is not equal to sqrt(-1).
> 
> jmf
> 
> PS Once I gave you a link pointing to unicode.org doc, you obviously did
> not read it.



And you have many times been given a link explaining the problems with 
posting from Google Groups but deliberately choose not to make your 
replies readable.

-- 
If you're not part of the solution, you're part of the precipitate.


#72720

From	Mark H Harris <harrismh777@gmail.com>
Date	2014-06-05 10:57 -0500
Message-ID	<lmq40m$kdp$1@speranza.aioe.org>
In reply to	#72714

On 6/5/14 10:39 AM, alister wrote:
> {snipped all the mess}
>
> And you have many times been given a link explaining the problems with
> posting from Google Groups but deliberately choose not to make your
> replies readable.
>

The problem is that things look fine in Google Groups. What helps is 
getting to see what the mess looks like from Thunderbird or equivalent.


#72733

From	Mark Lawrence <breamoreboy@yahoo.co.uk>
Date	2014-06-05 18:15 +0100
Message-ID	<mailman.10747.1401988550.18130.python-list@python.org>
In reply to	#72720

On 05/06/2014 16:57, Mark H Harris wrote:
> On 6/5/14 10:39 AM, alister wrote:
>> {snipped all the mess}
>>
>> And you have many times been given a link explaining the problems with
>> posting from Google Groups but deliberately choose not to make your
>> replies readable.
>>
>
> The problem is that thing look fine in google groups. What helps is
> getting to see what the mess looks like from Thunderbird or equivalent.
>

Wrong.  99.99% of people when asked politely take action so there is no 
problem.  The remaining 0.01% consists of one complete ignoramus.

-- 
My fellow Pythonistas, ask not what our language can do for you, ask 
what you can do for our language.

Mark Lawrence



#72738

From	alister <alister.nospam.ware@ntlworld.com>
Date	2014-06-05 17:33 +0000
Message-ID	<FR1kv.237201$Mx1.138801@fx02.am4>
In reply to	#72733

On Thu, 05 Jun 2014 18:15:31 +0100, Mark Lawrence wrote:
>>>
>> The problem is that thing look fine in google groups. What helps is
>> getting to see what the mess looks like from Thunderbird or equivalent.
>>
>>
> Wrong.  99.99% of people when asked politely take action so there is no
> problem.  The remaining 0.01% consists of one complete ignoramus.

Who has actively stated he will not change.
Pretty much the same attitude he shows in constantly saying Python's Unicode 
implementation is broken* without any valid supporting evidence.

* Not just incomplete or inefficient but irrevocably broken.

-- 
Yow!  It's some people inside the wall!  This is better than mopping!


#72735

From	Joshua Landau <joshua@landau.ws>
Date	2014-06-05 18:18 +0100
Message-ID	<mailman.10749.1401988750.18130.python-list@python.org>
In reply to	#72603

On 4 June 2014 15:50, Michael Torrie <torriem@gmail.com> wrote:
> On 06/04/2014 12:50 AM, wxjmfauth@gmail.com wrote:
>> [Things]
>
> [Reply to things]

Please. Just don't.


#72678 — Re: Unicode and Python

From	Rustom Mody <rustompmody@gmail.com>
Date	2014-06-04 21:25 -0700
Subject	Re: Unicode and Python
Message-ID	<6bff7112-af34-4ceb-a15b-160334d22d86@googlegroups.com>
In reply to	#72564

On Wednesday, June 4, 2014 6:09:54 AM UTC+5:30, Chris Angelico wrote:
> A current discussion regarding Python's Unicode support centres (or
> centers, depending on how close you are to the cent[er]{2} of the
> universe) around one critical question: Is string indexing common?

Not exactly on-topic for this thread...
Still thought it might interest some:
http://www.unicodeit.net/


#72686 — Re: Unicode and Python

From	wxjmfauth@gmail.com
Date	2014-06-05 00:23 -0700
Subject	Re: Unicode and Python
Message-ID	<327091ca-86bf-438d-ade0-9245d76f213a@googlegroups.com>
In reply to	#72678

Le jeudi 5 juin 2014 06:25:49 UTC+2, Rustom Mody a écrit :
> 

%%%%%%%%%%

Stick with Xe(La)TeX and do not spend too much time on
the web, you will learn a lot about unicode.

Send me a private e-mail, I will explain how this
whole font configuration in a TeX unicode engine (not
utf-8 engine!) works.

jmf


#72721

From	Johannes Bauer <dfnsonfsduifb@gmx.de>
Date	2014-06-05 18:09 +0200
Message-ID	<lmq4mu$8nv$1@news.albasani.net>
In reply to	#72564

On 04.06.2014 02:39, Chris Angelico wrote:

> I know the collective experience of python-list can't fail to bring up
> a few solid examples here :)

Just also grepped lots of code and have surprisingly few instances of
index-search. Most are with constant indices. One particular example
that comes up a lot is

line = line[:-1]

Which truncates the trailing "\n" of a textfile line.

Then some indexing in the form of

negative = (line[0] == "-")

All in all I'm actually a bit surprised this isn't too common.
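Both idioms above touch only one end of the string, so neither needs random access into the middle. As a hedged sketch (not from the original post, with hypothetical helper names), here are guarded equivalents that also behave sensibly at the edges:

```python
def strip_newline(line):
    # Unlike a bare line[:-1], this leaves a final line without "\n" intact.
    return line[:-1] if line.endswith("\n") else line


def is_negative(field):
    # Unlike field[0] == "-", this does not raise IndexError on "".
    return field.startswith("-")


print(repr(strip_newline("last line, no newline")))
print(is_negative(""))
```

On Python 3.9 or later, `line.removesuffix("\n")` does the same job as `strip_newline`.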

Cheers,
Johannes

-- 
>> Wo hattest Du das Beben nochmal GENAU vorhergesagt?
> Zumindest nicht öffentlich!
Ah, der neueste und bis heute genialste Streich unsere großen
Kosmologen: Die Geheim-Vorhersage.
 - Karl Kaos über Rüdiger Thomas in dsa <hidbv3$om2$1@speranza.aioe.org>

