Groups > comp.lang.python > #41834 > unrolled thread

Performance of int/long in Python 3

Started by	Chris Angelico <rosuav@gmail.com>
First post	2013-03-26 08:51 +1100
Last post	2013-03-28 12:39 +0000
Articles	20 on this page of 206 — 30 participants

Back to article view | Back to comp.lang.python

  Performance of int/long in Python 3 Chris Angelico <rosuav@gmail.com> - 2013-03-26 08:51 +1100
    Re: Performance of int/long in Python 3 Cousin Stanley <cousinstanley@gmail.com> - 2013-03-25 23:35 +0000
      Re: Performance of int/long in Python 3 Dan Stromberg <drsalists@gmail.com> - 2013-03-25 17:12 -0700
      Re: Performance of int/long in Python 3 Chris Angelico <rosuav@gmail.com> - 2013-03-26 17:26 +1100
        Re: Performance of int/long in Python 3 Cousin Stanley <cousinstanley@gmail.com> - 2013-03-26 13:38 +0000
          Re: Performance of int/long in Python 3 Chris Angelico <rosuav@gmail.com> - 2013-03-27 01:08 +1100
            Re: Performance of int/long in Python 3 Cousin Stanley <cousinstanley@gmail.com> - 2013-03-26 16:41 +0000
              Re: Performance of int/long in Python 3 Chris Angelico <rosuav@gmail.com> - 2013-03-27 03:54 +1100
              Re: Performance of int/long in Python 3 Terry Reedy <tjreedy@udel.edu> - 2013-03-26 14:24 -0400
    Re: Performance of int/long in Python 3 jmfauth <wxjmfauth@gmail.com> - 2013-03-26 11:50 -0700
      Re: Performance of int/long in Python 3 Chris Angelico <rosuav@gmail.com> - 2013-03-27 06:03 +1100
        Re: Performance of int/long in Python 3 jmfauth <wxjmfauth@gmail.com> - 2013-03-26 13:44 -0700
          Re: Performance of int/long in Python 3 Mark Lawrence <breamoreboy@yahoo.co.uk> - 2013-03-26 20:50 +0000
            Re: Performance of int/long in Python 3 Grant Edwards <invalid@invalid.invalid> - 2013-03-26 21:08 +0000
              Re: Performance of int/long in Python 3 Chris Angelico <rosuav@gmail.com> - 2013-03-27 08:14 +1100
                Re: Performance of int/long in Python 3 Gregory Ewing <greg.ewing@canterbury.ac.nz> - 2013-03-27 12:10 +1300
                  Re: Performance of int/long in Python 3 Dave Angel <davea@davea.name> - 2013-03-26 19:19 -0400
              Re: Performance of int/long in Python 3 Mark Lawrence <breamoreboy@yahoo.co.uk> - 2013-03-26 21:26 +0000
              Re: Performance of int/long in Python 3 Dave Angel <davea@davea.name> - 2013-03-26 17:28 -0400
              Re: Performance of int/long in Python 3 Dennis Lee Bieber <wlfraed@ix.netcom.com> - 2013-03-26 23:14 -0400
              Re: Performance of int/long in Python 3 jmfauth <wxjmfauth@gmail.com> - 2013-03-27 13:30 -0700
          Re: Performance of int/long in Python 3 Chris Angelico <rosuav@gmail.com> - 2013-03-27 07:52 +1100
          Re: Performance of int/long in Python 3 Ned Deily <nad@acm.org> - 2013-03-26 17:00 -0700
            Re: Performance of int/long in Python 3 rurpy@yahoo.com - 2013-03-26 21:31 -0700
          Re: Performance of int/long in Python 3 Mark Lawrence <breamoreboy@yahoo.co.uk> - 2013-03-27 00:20 +0000
          Re: Performance of int/long in Python 3 Ned Deily <nad@acm.org> - 2013-03-26 18:31 -0700
          Re: Performance of int/long in Python 3 Mark Lawrence <breamoreboy@yahoo.co.uk> - 2013-03-27 11:51 +0000
            Re: Performance of int/long in Python 3 Steven D'Aprano <steve+comp.lang.python@pearwood.info> - 2013-03-28 01:47 +0000
              flaming vs accuracy [was Re: Performance of int/long in Python 3] Ethan Furman <ethan@stoneleaf.us> - 2013-03-27 20:18 -0700
                Re: flaming vs accuracy [was Re: Performance of int/long in Python 3] rusi <rustompmody@gmail.com> - 2013-03-27 20:49 -0700
                  Re: flaming vs accuracy [was Re: Performance of int/long in Python 3] Steven D'Aprano <steve+comp.lang.python@pearwood.info> - 2013-03-28 05:20 +0000
                    Re: flaming vs accuracy [was Re: Performance of int/long in Python 3] rusi <rustompmody@gmail.com> - 2013-03-27 22:42 -0700
                      Re: flaming vs accuracy [was Re: Performance of int/long in Python 3] Steven D'Aprano <steve+comp.lang.python@pearwood.info> - 2013-03-28 07:48 +0000
                        Re: flaming vs accuracy [was Re: Performance of int/long in Python 3] rurpy@yahoo.com - 2013-03-28 12:54 -0700
                          Re: flaming vs accuracy [was Re: Performance of int/long in Python 3] Ethan Furman <ethan@stoneleaf.us> - 2013-03-28 13:31 -0700
                            Re: flaming vs accuracy [was Re: Performance of int/long in Python 3] Grant Edwards <invalid@invalid.invalid> - 2013-03-29 14:52 +0000
                              Re: flaming vs accuracy [was Re: Performance of int/long in Python 3] Ethan Furman <ethan@stoneleaf.us> - 2013-03-29 08:51 -0700
                                Re: flaming vs accuracy [was Re: Performance of int/long in Python 3] Grant Edwards <invalid@invalid.invalid> - 2013-03-29 16:50 +0000
                            Re: flaming vs accuracy [was Re: Performance of int/long in Python 3] rurpy@yahoo.com - 2013-03-29 14:26 -0700
                              Re: flaming vs accuracy [was Re: Performance of int/long in Python 3] Ethan Furman <ethan@stoneleaf.us> - 2013-03-29 16:07 -0700
                                Re: flaming vs accuracy [was Re: Performance of int/long in Python 3] jmfauth <wxjmfauth@gmail.com> - 2013-03-31 00:35 -0700
                                  ASCII versus non-ASCII [was Re: flaming vs accuracy [was Re: Performance of int/long in Python 3]] Steven D'Aprano <steve+comp.lang.python@pearwood.info> - 2013-03-31 08:22 +0000
                                  Re: flaming vs accuracy [was Re: Performance of int/long in Python 3] Mark Lawrence <breamoreboy@yahoo.co.uk> - 2013-03-31 13:55 +0100
                                    Re: Performance of int/long in Python 3 rusi <rustompmody@gmail.com> - 2013-03-31 22:33 -0700
                                      Re: Performance of int/long in Python 3 Ian Kelly <ian.g.kelly@gmail.com> - 2013-03-31 23:52 -0600
                                      Re: Performance of int/long in Python 3 Chris Angelico <rosuav@gmail.com> - 2013-04-01 16:57 +1100
                                      Re: Performance of int/long in Python 3 Steven D'Aprano <steve+comp.lang.python@pearwood.info> - 2013-04-01 08:14 +0000
                                        Re: Performance of int/long in Python 3 Roy Smith <roy@panix.com> - 2013-04-01 08:15 -0400
                                          Re: Performance of int/long in Python 3 rusi <rustompmody@gmail.com> - 2013-04-01 06:11 -0700
                                            Re: Performance of int/long in Python 3 Steven D'Aprano <steve+comp.lang.python@pearwood.info> - 2013-04-01 17:02 +0000
                                          Re: Performance of int/long in Python 3 Steven D'Aprano <steve+comp.lang.python@pearwood.info> - 2013-04-01 17:07 +0000
                                            Re: Performance of int/long in Python 3 Chris Angelico <rosuav@gmail.com> - 2013-04-02 04:20 +1100
                                            Re: Performance of int/long in Python 3 MRAB <python@mrabarnett.plus.com> - 2013-04-01 18:53 +0100
                                              Re: Performance of int/long in Python 3 jmfauth <wxjmfauth@gmail.com> - 2013-04-01 12:15 -0700
                                                Re: Performance of int/long in Python 3 Chris Angelico <rosuav@gmail.com> - 2013-04-02 06:28 +1100
                                                  Re: Performance of int/long in Python 3 jmfauth <wxjmfauth@gmail.com> - 2013-04-01 13:28 -0700
                                                    Re: Performance of int/long in Python 3 Chris Angelico <rosuav@gmail.com> - 2013-04-02 07:35 +1100
                                                    Re: Performance of int/long in Python 3 Mark Lawrence <breamoreboy@yahoo.co.uk> - 2013-04-01 22:38 +0100
                                                    Re: Performance of int/long in Python 3 Mark Lawrence <breamoreboy@yahoo.co.uk> - 2013-04-01 22:43 +0100
                                                      Re: Performance of int/long in Python 3 Neil Hodgson <nhodgson@iinet.net.au> - 2013-04-02 10:43 +1100
                                                        Re: Performance of int/long in Python 3 jmfauth <wxjmfauth@gmail.com> - 2013-04-02 00:24 -0700
                                                          Re: Performance of int/long in Python 3 Chris Angelico <rosuav@gmail.com> - 2013-04-02 19:03 +1100
                                                            Re: Performance of int/long in Python 3 Steven D'Aprano <steve+comp.lang.python@pearwood.info> - 2013-04-02 08:35 +0000
                                                              Re: Performance of int/long in Python 3 jmfauth <wxjmfauth@gmail.com> - 2013-04-02 02:24 -0700
                                                                Re: Performance of int/long in Python 3 Mark Lawrence <breamoreboy@yahoo.co.uk> - 2013-04-02 10:43 +0100
                                                                Re: Performance of int/long in Python 3 Steve Simmons <square.steve@gmail.com> - 2013-04-02 11:58 +0100
                                                                  Re: Performance of int/long in Python 3 rusi <rustompmody@gmail.com> - 2013-04-02 06:42 -0700
                                                                  Re: Performance of int/long in Python 3 Steven D'Aprano <steve+comp.lang.python@pearwood.info> - 2013-04-02 14:03 +0000
                                                                    Re: Performance of int/long in Python 3 Steve Simmons <square.steve@gmail.com> - 2013-04-02 15:39 +0100
                                                                    Re: Performance of int/long in Python 3 Mark Lawrence <breamoreboy@yahoo.co.uk> - 2013-04-02 16:02 +0100
                                                                    Re: Performance of int/long in Python 3 jmfauth <wxjmfauth@gmail.com> - 2013-04-02 08:12 -0700
                                                                      Re: Performance of int/long in Python 3 Mark Lawrence <breamoreboy@yahoo.co.uk> - 2013-04-02 16:43 +0100
                                                                      Re: Performance of int/long in Python 3 rusi <rustompmody@gmail.com> - 2013-04-02 10:08 -0700
                                                                      Re: Performance of int/long in Python 3 Terry Jan Reedy <tjreedy@udel.edu> - 2013-04-02 17:33 -0400
                                                                      Re: Performance of int/long in Python 3 Joshua Landau <joshua.landau.ws@gmail.com> - 2013-04-02 23:40 +0100
                                                                    Re: Performance of int/long in Python 3 Ethan Furman <ethan@stoneleaf.us> - 2013-04-02 08:09 -0700
                                                                Re: Performance of int/long in Python 3 Mark Lawrence <breamoreboy@yahoo.co.uk> - 2013-04-02 15:12 +0100
                                                                Re: Performance of int/long in Python 3 Steve Simmons <square.steve@gmail.com> - 2013-04-02 16:03 +0100
                                                                Re: Performance of int/long in Python 3 Ethan Furman <ethan@stoneleaf.us> - 2013-04-02 08:17 -0700
                                                                  Re: Performance of int/long in Python 3 rusi <rustompmody@gmail.com> - 2013-04-02 09:57 -0700
                                                                    Re: Performance of int/long in Python 3 jmfauth <wxjmfauth@gmail.com> - 2013-04-02 11:22 -0700
                                                                      Re: Performance of int/long in Python 3 rusi <rustompmody@gmail.com> - 2013-04-02 11:50 -0700
                                                                      Re: Performance of int/long in Python 3 Lele Gaifax <lele@metapensiero.it> - 2013-04-03 00:52 +0200
                                                            Re: Performance of int/long in Python 3 jmfauth <wxjmfauth@gmail.com> - 2013-04-02 02:20 -0700
                                                              Re: Performance of int/long in Python 3 Ian Kelly <ian.g.kelly@gmail.com> - 2013-04-02 13:44 -0600
                                                                Re: Performance of int/long in Python 3 Neil Hodgson <nhodgson@iinet.net.au> - 2013-04-03 14:31 +1100
                                                                  Re: Performance of int/long in Python 3 rusi <rustompmody@gmail.com> - 2013-04-02 20:53 -0700
                                                                    Re: Performance of int/long in Python 3 Neil Hodgson <nhodgson@iinet.net.au> - 2013-04-03 15:03 +1100
                                                                      Re: Performance of int/long in Python 3 rusi <rustompmody@gmail.com> - 2013-04-02 22:11 -0700
                                                                      Re: Performance of int/long in Python 3 Chris Angelico <rosuav@gmail.com> - 2013-04-03 17:22 +1100
                                                                        Re: Performance of int/long in Python 3 Roy Smith <roy@panix.com> - 2013-04-03 09:28 -0400
                                                                          Re: Performance of int/long in Python 3 Chris Angelico <rosuav@gmail.com> - 2013-04-04 00:38 +1100
                                                                    Re: Performance of int/long in Python 3 Roy Smith <roy@panix.com> - 2013-04-03 00:10 -0400
                                                                      Re: Performance of int/long in Python 3 Neil Hodgson <nhodgson@iinet.net.au> - 2013-04-03 19:15 +1100
                                                                        Re: Performance of int/long in Python 3 Roy Smith <roy@panix.com> - 2013-04-03 09:25 -0400
                                                                          Re: Performance of int/long in Python 3 Chris Angelico <rosuav@gmail.com> - 2013-04-04 00:34 +1100
                                                                  Re: Performance of int/long in Python 3 Steven D'Aprano <steve+comp.lang.python@pearwood.info> - 2013-04-03 05:32 +0000
                                                                    Re: Performance of int/long in Python 3 Terry Jan Reedy <tjreedy@udel.edu> - 2013-04-03 02:19 -0400
                                                                      Re: Performance of int/long in Python 3 Neil Hodgson <nhodgson@iinet.net.au> - 2013-04-03 17:27 +1100
                                                                    Re: Performance of int/long in Python 3 Chris Angelico <rosuav@gmail.com> - 2013-04-03 17:25 +1100
                                                                      Re: Performance of int/long in Python 3 Neil Hodgson <nhodgson@iinet.net.au> - 2013-04-03 17:29 +1100
                                                                        Re: Performance of int/long in Python 3 Chris Angelico <rosuav@gmail.com> - 2013-04-03 17:52 +1100
                                                                        Re: Performance of int/long in Python 3 Ian Kelly <ian.g.kelly@gmail.com> - 2013-04-03 01:06 -0600
                                                                        Re: Performance of int/long in Python 3 Chris Angelico <rosuav@gmail.com> - 2013-04-03 18:24 +1100
                                                                          Re: Performance of int/long in Python 3 Neil Hodgson <nhodgson@iinet.net.au> - 2013-04-03 18:37 +1100
                                                                            Re: Performance of int/long in Python 3 rusi <rustompmody@gmail.com> - 2013-04-03 01:07 -0700
                                                                              Re: Performance of int/long in Python 3 Neil Hodgson <nhodgson@iinet.net.au> - 2013-04-03 19:22 +1100
                                                                                Re: Performance of int/long in Python 3 Dave Angel <davea@davea.name> - 2013-04-03 06:20 -0400
                                                                                  Re: Performance of int/long in Python 3 Neil Hodgson <nhodgson@iinet.net.au> - 2013-04-03 22:05 +1100
                                                                                    Re: Performance of int/long in Python 3 Dave Angel <davea@davea.name> - 2013-04-03 07:52 -0400
                                                                                      Sorting [was Re: Performance of int/long in Python 3] Steven D'Aprano <steve+comp.lang.python@pearwood.info> - 2013-04-03 14:43 +0000
                                                                                        Re: Sorting [was Re: Performance of int/long in Python 3] Roy Smith <roy@panix.com> - 2013-04-03 11:00 -0400
                                                                                    Re: Performance of int/long in Python 3 Ian Kelly <ian.g.kelly@gmail.com> - 2013-04-03 10:30 -0600
                                                                                    Re: Performance of int/long in Python 3 Dave Angel <davea@davea.name> - 2013-04-03 13:51 -0400
                                                                                    Re: Performance of int/long in Python 3 Neil Hodgson <nhodgson@iinet.net.au> - 2013-04-04 09:58 +1100
                                                                          Re: Performance of int/long in Python 3 Steven D'Aprano <steve+comp.lang.python@pearwood.info> - 2013-04-03 07:53 +0000
                                                                            Re: Performance of int/long in Python 3 Chris Angelico <rosuav@gmail.com> - 2013-04-03 19:02 +1100
                                                                              Re: Performance of int/long in Python 3 jmfauth <wxjmfauth@gmail.com> - 2013-04-03 01:08 -0700
                                                                                Re: Performance of int/long in Python 3 Mark Lawrence <breamoreboy@yahoo.co.uk> - 2013-04-03 12:27 +0100
                                                                            Re: Performance of int/long in Python 3 Roy Smith <roy@panix.com> - 2013-04-03 09:43 -0400
                                                                              Re: Performance of int/long in Python 3 Chris Angelico <rosuav@gmail.com> - 2013-04-04 01:17 +1100
                                                                                Re: Performance of int/long in Python 3 Steven D'Aprano <steve+comp.lang.python@pearwood.info> - 2013-04-03 15:07 +0000
                                                                                  Re: Performance of int/long in Python 3 Chris Angelico <rosuav@gmail.com> - 2013-04-04 08:57 +1100
                                                                                  Re: Performance of int/long in Python 3 Serhiy Storchaka <storchaka@gmail.com> - 2013-04-06 12:09 +0300
                                                                                  Re: Performance of int/long in Python 3 Chris Angelico <rosuav@gmail.com> - 2013-04-07 07:24 +1000
                                                                                  Re: Performance of int/long in Python 3 Ethan Furman <ethan@stoneleaf.us> - 2013-04-06 14:58 -0700
                                                                                    Re: Performance of int/long in Python 3 Steven D'Aprano <steve+comp.lang.python@pearwood.info> - 2013-04-07 01:29 +0000
                                                                                      Re: Performance of int/long in Python 3 Ian Kelly <ian.g.kelly@gmail.com> - 2013-04-06 19:58 -0600
                                                                                        Re: Performance of int/long in Python 3 Roy Smith <roy@panix.com> - 2013-04-06 22:18 -0400
                                                                                          Re: Performance of int/long in Python 3 Ian Kelly <ian.g.kelly@gmail.com> - 2013-04-06 23:22 -0600
                                                                                        Re: Performance of int/long in Python 3 Steven D'Aprano <steve+comp.lang.python@pearwood.info> - 2013-04-07 08:29 +0000
                                                                                  Re: Performance of int/long in Python 3 Ian Kelly <ian.g.kelly@gmail.com> - 2013-04-06 20:00 -0600
                                                                                  Re: Performance of int/long in Python 3 Serhiy Storchaka <storchaka@gmail.com> - 2013-04-07 11:02 +0300
                                                                                  Re: Performance of int/long in Python 3 Mark Lawrence <breamoreboy@yahoo.co.uk> - 2013-04-07 16:14 +0100
                                                                              Re: Performance of int/long in Python 3 Steven D'Aprano <steve+comp.lang.python@pearwood.info> - 2013-04-03 15:02 +0000
                                                                                Re: Performance of int/long in Python 3 Ian Kelly <ian.g.kelly@gmail.com> - 2013-04-03 10:38 -0600
                                                                                  Re: Performance of int/long in Python 3 Steven D'Aprano <steve+comp.lang.python@pearwood.info> - 2013-04-03 17:43 +0000
                                                                                    Re: Performance of int/long in Python 3 Chris Angelico <rosuav@gmail.com> - 2013-04-04 08:55 +1100
                                                                                    Re: Performance of int/long in Python 3 Mark Lawrence <breamoreboy@yahoo.co.uk> - 2013-04-03 23:39 +0100
                                                                                Re: Performance of int/long in Python 3 Roy Smith <roy@panix.com> - 2013-04-03 20:49 -0400
                                                                              Re: Performance of int/long in Python 3 rusi <rustompmody@gmail.com> - 2013-04-03 09:10 -0700
                                                                                Re: Performance of int/long in Python 3 Ethan Furman <ethan@stoneleaf.us> - 2013-04-03 10:09 -0700
                                                                                Re: Performance of int/long in Python 3 Roy Smith <roy@panix.com> - 2013-04-03 20:46 -0400
                                                                            Re: Performance of int/long in Python 3 Ian Kelly <ian.g.kelly@gmail.com> - 2013-04-03 10:53 -0600
                                                          Re: Performance of int/long in Python 3 Neil Hodgson <nhodgson@iinet.net.au> - 2013-04-02 20:28 +1100
                                                            Re: Performance of int/long in Python 3 Mark Lawrence <breamoreboy@yahoo.co.uk> - 2013-04-03 14:56 +0100
                                                Re: Performance of int/long in Python 3 Mark Lawrence <breamoreboy@yahoo.co.uk> - 2013-04-01 20:54 +0100
                                            Re: Performance of int/long in Python 3 roy@panix.com (Roy Smith) - 2013-04-01 16:31 -0400
                          Re: flaming vs accuracy [was Re: Performance of int/long in Python 3] Steven D'Aprano <steve+comp.lang.python@pearwood.info> - 2013-03-29 00:35 +0000
                    Re: flaming vs accuracy [was Re: Performance of int/long in Python 3] Chris Angelico <rosuav@gmail.com> - 2013-03-28 21:22 +1100
                    Re: flaming vs accuracy [was Re: Performance of int/long in Python 3] Ned Deily <nad@acm.org> - 2013-03-28 13:23 -0700
                  Re: flaming vs accuracy [was Re: Performance of int/long in Python 3] Ethan Furman <ethan@stoneleaf.us> - 2013-03-27 23:12 -0700
                    Re: flaming vs accuracy [was Re: Performance of int/long in Python 3] jmfauth <wxjmfauth@gmail.com> - 2013-03-28 02:03 -0700
                      Re: flaming vs accuracy [was Re: Performance of int/long in Python 3] Ian Foote <ian@feete.org> - 2013-03-28 09:36 +0000
                        Re: flaming vs accuracy [was Re: Performance of int/long in Python 3] Neil Hodgson <nhodgson@iinet.net.au> - 2013-03-28 23:11 +1100
                          Re: flaming vs accuracy [was Re: Performance of int/long in Python 3] Steven D'Aprano <steve+comp.lang.python@pearwood.info> - 2013-03-28 13:01 +0000
                            Re: flaming vs accuracy [was Re: Performance of int/long in Python 3] jmfauth <wxjmfauth@gmail.com> - 2013-03-28 07:12 -0700
                              Re: flaming vs accuracy [was Re: Performance of int/long in Python 3] Chris Angelico <rosuav@gmail.com> - 2013-03-29 01:38 +1100
                                Re: flaming vs accuracy [was Re: Performance of int/long in Python 3] jmfauth <wxjmfauth@gmail.com> - 2013-03-28 08:14 -0700
                                  Re: flaming vs accuracy [was Re: Performance of int/long in Python 3] Chris Angelico <rosuav@gmail.com> - 2013-03-29 02:21 +1100
                                  Re: flaming vs accuracy [was Re: Performance of int/long in Python 3] jmfauth <wxjmfauth@gmail.com> - 2013-03-28 08:45 -0700
                              Re: flaming vs accuracy [was Re: Performance of int/long in Python 3] Terry Reedy <tjreedy@udel.edu> - 2013-03-28 12:01 -0400
                              Re: flaming vs accuracy [was Re: Performance of int/long in Python 3] Ian Kelly <ian.g.kelly@gmail.com> - 2013-03-28 10:11 -0600
                                Surrogate pairs in new flexible string representation [was Re: flaming vs accuracy [was Re: Performance of int/long in Python 3]] Steven D'Aprano <steve+comp.lang.python@pearwood.info> - 2013-03-29 00:39 +0000
                                  Re: Surrogate pairs in new flexible string representation [was Re: flaming vs accuracy [was Re: Performance of int/long in Python 3]] Chris Angelico <rosuav@gmail.com> - 2013-03-29 11:54 +1100
                                    Re: Surrogate pairs in new flexible string representation [was Re: flaming vs accuracy [was Re: Performance of int/long in Python 3]] Steven D'Aprano <steve+comp.lang.python@pearwood.info> - 2013-03-29 02:37 +0000
                                      Re: Surrogate pairs in new flexible string representation [was Re: flaming vs accuracy [was Re: Performance of int/long in Python 3]] Chris Angelico <rosuav@gmail.com> - 2013-03-29 13:44 +1100
                                      Re: Surrogate pairs in new flexible string representation [was Re: flaming vs accuracy [was Re: Performance of int/long in Python 3]] Ian Kelly <ian.g.kelly@gmail.com> - 2013-03-29 00:11 -0600
                                      Re: Surrogate pairs in new flexible string representation [was Re: flaming vs accuracy [was Re: Performance of int/long in Python 3]] Ian Kelly <ian.g.kelly@gmail.com> - 2013-03-29 00:22 -0600
                                      Re: Surrogate pairs in new flexible string representation [was Re: flaming vs accuracy [was Re: Performance of int/long in Python 3]] Terry Reedy <tjreedy@udel.edu> - 2013-03-29 14:06 -0400
                                      Re: Surrogate pairs in new flexible string representation Christian Heimes <christian@python.org> - 2013-03-29 23:05 +0100
                                  Re: Surrogate pairs in new flexible string representation [was Re: flaming vs accuracy [was Re: Performance of int/long in Python 3]] Mark Lawrence <breamoreboy@yahoo.co.uk> - 2013-03-29 01:03 +0000
                                  Re: Surrogate pairs in new flexible string representation [was Re: flaming vs accuracy [was Re: Performance of int/long in Python 3]] Chris Angelico <rosuav@gmail.com> - 2013-03-29 12:10 +1100
                                  Re: Surrogate pairs in new flexible string representation [was Re: flaming vs accuracy [was Re: Performance of int/long in Python 3]] MRAB <python@mrabarnett.plus.com> - 2013-03-29 02:00 +0000
                              Re: flaming vs accuracy [was Re: Performance of int/long in Python 3] Chris Angelico <rosuav@gmail.com> - 2013-03-29 03:16 +1100
                            Re: flaming vs accuracy [was Re: Performance of int/long in Python 3] Ian Kelly <ian.g.kelly@gmail.com> - 2013-03-28 10:01 -0600
                            Re: flaming vs accuracy [was Re: Performance of int/long in Python 3] Neil Hodgson <nhodgson@iinet.net.au> - 2013-03-29 14:34 +1100
                              unicode and the FSR [was: Re: flaming vs accuracy [was Re: Performance of int/long in Python 3]] Ethan Furman <ethan@stoneleaf.us> - 2013-03-28 21:56 -0700
                              Re: flaming vs accuracy [was Re: Performance of int/long in Python 3] Chris Angelico <rosuav@gmail.com> - 2013-03-29 16:33 +1100
                                Re: flaming vs accuracy [was Re: Performance of int/long in Python 3] Neil Hodgson <nhodgson@iinet.net.au> - 2013-03-29 16:46 +1100
                          Re: flaming vs accuracy [was Re: Performance of int/long in Python 3] MRAB <python@mrabarnett.plus.com> - 2013-03-28 14:51 +0000
                            Re: flaming vs accuracy [was Re: Performance of int/long in Python 3] Neil Hodgson <nhodgson@iinet.net.au> - 2013-03-29 14:57 +1100
                          Re: flaming vs accuracy [was Re: Performance of int/long in Python 3] Chris Angelico <rosuav@gmail.com> - 2013-03-29 02:07 +1100
                      Re: flaming vs accuracy [was Re: Performance of int/long in Python 3] Oscar Benjamin <oscar.j.benjamin@gmail.com> - 2013-03-28 09:47 +0000
                      Re: flaming vs accuracy [was Re: Performance of int/long in Python 3] Chris Angelico <rosuav@gmail.com> - 2013-03-28 21:30 +1100
                        Re: flaming vs accuracy [was Re: Performance of int/long in Python 3] jmfauth <wxjmfauth@gmail.com> - 2013-03-28 06:34 -0700
                          Re: flaming vs accuracy [was Re: Performance of int/long in Python 3] Ian Kelly <ian.g.kelly@gmail.com> - 2013-03-28 10:33 -0600
                            Re: flaming vs accuracy [was Re: Performance of int/long in Python 3] jmfauth <wxjmfauth@gmail.com> - 2013-03-28 09:55 -0700
                              Re: flaming vs accuracy [was Re: Performance of int/long in Python 3] Chris Angelico <rosuav@gmail.com> - 2013-03-29 04:13 +1100
                            Re: flaming vs accuracy [was Re: Performance of int/long in Python 3] jmfauth <wxjmfauth@gmail.com> - 2013-03-28 10:48 -0700
                              Re: flaming vs accuracy [was Re: Performance of int/long in Python 3] Chris Angelico <rosuav@gmail.com> - 2013-03-29 04:55 +1100
                                Re: flaming vs accuracy [was Re: Performance of int/long in Python 3] jmfauth <wxjmfauth@gmail.com> - 2013-03-28 13:26 -0700
                                  Re: flaming vs accuracy [was Re: Performance of int/long in Python 3] Chris Angelico <rosuav@gmail.com> - 2013-03-29 08:45 +1100
                                  Re: flaming vs accuracy [was Re: Performance of int/long in Python 3] Terry Reedy <tjreedy@udel.edu> - 2013-03-28 19:12 -0400
                              Re: flaming vs accuracy [was Re: Performance of int/long in Python 3] Benjamin Kaplan <benjamin.kaplan@case.edu> - 2013-03-28 13:29 -0700
                                Re: flaming vs accuracy [was Re: Performance of int/long in Python 3] jmfauth <wxjmfauth@gmail.com> - 2013-03-28 14:11 -0700
                                  Re: flaming vs accuracy [was Re: Performance of int/long in Python 3] jmfauth <wxjmfauth@gmail.com> - 2013-03-28 14:33 -0700
                                  Re: flaming vs accuracy [was Re: Performance of int/long in Python 3] MRAB <python@mrabarnett.plus.com> - 2013-03-28 21:50 +0000
                                  Re: flaming vs accuracy [was Re: Performance of int/long in Python 3] Benjamin Kaplan <benjamin.kaplan@case.edu> - 2013-03-28 14:52 -0700
                  Re: flaming vs accuracy [was Re: Performance of int/long in Python 3] Dennis Lee Bieber <wlfraed@ix.netcom.com> - 2013-03-28 19:53 -0400
                  Re: flaming vs accuracy [was Re: Performance of int/long in Python 3] Chris Angelico <rosuav@gmail.com> - 2013-03-29 11:03 +1100
                  Re: flaming vs accuracy [was Re: Performance of int/long in Python 3] Mark Lawrence <breamoreboy@yahoo.co.uk> - 2013-03-29 00:15 +0000
              Re: flaming vs accuracy [was Re: Performance of int/long in Python 3] Chris Angelico <rosuav@gmail.com> - 2013-03-28 14:40 +1100
                Re: flaming vs accuracy [was Re: Performance of int/long in Python 3] 88888 Dihedral <dihedral88888@googlemail.com> - 2013-03-28 16:04 -0700
                Re: flaming vs accuracy [was Re: Performance of int/long in Python 3] 88888 Dihedral <dihedral88888@googlemail.com> - 2013-03-28 16:04 -0700
              Re: flaming vs accuracy [was Re: Performance of int/long in Python 3] Mark Lawrence <breamoreboy@yahoo.co.uk> - 2013-03-28 12:39 +0000

Page 9 of 11 — ← Prev page 1 … 7 8 [9] 10 11 Next page →

#42151 — Re: flaming vs accuracy [was Re: Performance of int/long in Python 3]

From	jmfauth <wxjmfauth@gmail.com>
Date	2013-03-28 08:45 -0700
Subject	Re: flaming vs accuracy [was Re: Performance of int/long in Python 3]
Message-ID	<9f3c6d10-46a3-4835-baf5-76a7da4b0c7c@k1g2000yqf.googlegroups.com>
In reply to	#42143

On 28 mar, 16:14, jmfauth <wxjmfa...@gmail.com> wrote:
> On 28 mar, 15:38, Chris Angelico <ros...@gmail.com> wrote:
>
>
>
>
>
>
>
>
>
> > On Fri, Mar 29, 2013 at 1:12 AM, jmfauth <wxjmfa...@gmail.com> wrote:
> > > This flexible string representation is so absurd that not only
> > > "it" does not know you can not write Western European Languages
> > > with latin-1, "it" penalizes you by just attempting to optimize
> > > latin-1. Shown in my multiple examples.
>
> > PEP393 strings have two optimizations, or kinda three:
>
> > 1a) ASCII-only strings
> > 1b) Latin1-only strings
> > 2) BMP-only strings
> > 3) Everything else
>
> > Options 1a and 1b are almost identical - I'm not sure what the detail
> > is, but there's something flagging those strings that fit inside seven
> > bits. (Something to do with optimizing encodings later?) Both are
> > optimized down to a single byte per character.
>
> > Option 2 is optimized to two bytes per character.
>
> > Option 3 is stored in UTF-32.
>
> > Once again, jmf, you are forgetting that option 2 is a safe and
> > bug-free optimization.
>
> > ChrisA
>
> As long as you are attempting to devide a set of characters in
> chunks and try to handle them seperately, it will never work.
>
> Read my previous post about the unicode transformation format.
> I know what pep393 does.
>
> jmf

Addendum.

This was you correctly percieved in one another thread.
You qualified it as a "switch". Now you have to understand
from where this "switch" is coming from.

jmf

by toy with

[toc] | [prev] | [next] | [standalone]

#42160 — Re: flaming vs accuracy [was Re: Performance of int/long in Python 3]

From	Terry Reedy <tjreedy@udel.edu>
Date	2013-03-28 12:01 -0400
Subject	Re: flaming vs accuracy [was Re: Performance of int/long in Python 3]
Message-ID	<mailman.3895.1364486507.2939.python-list@python.org>
In reply to	#42128

On 3/28/2013 10:38 AM, Chris Angelico wrote:

> PEP393 strings have two optimizations, or kinda three:
>
> 1a) ASCII-only strings
> 1b) Latin1-only strings
> 2) BMP-only strings
> 3) Everything else
>
> Options 1a and 1b are almost identical - I'm not sure what the detail
> is, but there's something flagging those strings that fit inside seven
> bits. (Something to do with optimizing encodings later?)

Yes. 'Encoding' an ascii-only string to any ascii-compatible encoding 
amounts to a simple copy of the internal bytes. I do not know if *all* 
the codecs for such encodings are 393-aware, but I do know that the 
utf-8 and latin-1 group are. This is one operation that 3.3+ does much 
faster than 3.2-

-- 
Terry Jan Reedy

[toc] | [prev] | [next] | [standalone]

#42164 — Re: flaming vs accuracy [was Re: Performance of int/long in Python 3]

From	Ian Kelly <ian.g.kelly@gmail.com>
Date	2013-03-28 10:11 -0600
Subject	Re: flaming vs accuracy [was Re: Performance of int/long in Python 3]
Message-ID	<mailman.3898.1364487167.2939.python-list@python.org>
In reply to	#42128

On Thu, Mar 28, 2013 at 8:38 AM, Chris Angelico <rosuav@gmail.com> wrote:
> PEP393 strings have two optimizations, or kinda three:
>
> 1a) ASCII-only strings
> 1b) Latin1-only strings
> 2) BMP-only strings
> 3) Everything else
>
> Options 1a and 1b are almost identical - I'm not sure what the detail
> is, but there's something flagging those strings that fit inside seven
> bits. (Something to do with optimizing encodings later?) Both are
> optimized down to a single byte per character.

The only difference for ASCII-only strings is that they are kept in a
struct with a smaller header.  The smaller header omits the utf8
pointer (which optionally points to an additional UTF-8 representation
of the string) and its associated length variable.  These are not
needed for ASCII-only strings because an ASCII string can be directly
interpreted as a UTF-8 string for the same result.  The smaller header
also omits the "wstr_length" field which, according to the PEP,
"differs from length only if there are surrogate pairs in the
representation."  For an ASCII string, of course there would not be
any surrogate pairs.

[toc] | [prev] | [next] | [standalone]

#42208 — Surrogate pairs in new flexible string representation [was Re: flaming vs accuracy [was Re: Performance of int/long in Python 3]]

From	Steven D'Aprano <steve+comp.lang.python@pearwood.info>
Date	2013-03-29 00:39 +0000
Subject	Surrogate pairs in new flexible string representation [was Re: flaming vs accuracy [was Re: Performance of int/long in Python 3]]
Message-ID	<5154e2dd$0$29974$c3e8da3$5496439d@news.astraweb.com>
In reply to	#42164

On Thu, 28 Mar 2013 10:11:59 -0600, Ian Kelly wrote:

> On Thu, Mar 28, 2013 at 8:38 AM, Chris Angelico <rosuav@gmail.com>
> wrote:
>> PEP393 strings have two optimizations, or kinda three:
>>
>> 1a) ASCII-only strings
>> 1b) Latin1-only strings
>> 2) BMP-only strings
>> 3) Everything else
>>
>> Options 1a and 1b are almost identical - I'm not sure what the detail
>> is, but there's something flagging those strings that fit inside seven
>> bits. (Something to do with optimizing encodings later?) Both are
>> optimized down to a single byte per character.
> 
> The only difference for ASCII-only strings is that they are kept in a
> struct with a smaller header.  The smaller header omits the utf8 pointer
> (which optionally points to an additional UTF-8 representation of the
> string) and its associated length variable.  These are not needed for
> ASCII-only strings because an ASCII string can be directly interpreted
> as a UTF-8 string for the same result.  The smaller header also omits
> the "wstr_length" field which, according to the PEP, "differs from
> length only if there are surrogate pairs in the representation."  For an
> ASCII string, of course there would not be any surrogate pairs.

I wonder why they need care about surrogate pairs? 

ASCII and Latin-1 strings obviously do not have them. Nor do BMP-only 
strings. It's only strings in the SMPs that could need surrogate pairs, 
and they don't need them in Python's implementation since it's a full 32-
bit implementation. So where do the surrogate pairs come into this?

I also wonder why the implementation bothers keeping a UTF-8 
representation. That sounds like premature optimization to me. Surely you 
only need it when writing to a file with UTF-8 encoding? For most 
strings, that will never happen.

-- 
Steven

[toc] | [prev] | [next] | [standalone]

#42209 — Re: Surrogate pairs in new flexible string representation [was Re: flaming vs accuracy [was Re: Performance of int/long in Python 3]]

From	Chris Angelico <rosuav@gmail.com>
Date	2013-03-29 11:54 +1100
Subject	Re: Surrogate pairs in new flexible string representation [was Re: flaming vs accuracy [was Re: Performance of int/long in Python 3]]
Message-ID	<mailman.3928.1364518484.2939.python-list@python.org>
In reply to	#42208

On Fri, Mar 29, 2013 at 11:39 AM, Steven D'Aprano
<steve+comp.lang.python@pearwood.info> wrote:
> ASCII and Latin-1 strings obviously do not have them. Nor do BMP-only
> strings. It's only strings in the SMPs that could need surrogate pairs,
> and they don't need them in Python's implementation since it's a full 32-
> bit implementation. So where do the surrogate pairs come into this?

PEP 393 says:
"""
wstr_length, wstr: representation in platform's wchar_t
(null-terminated). If wchar_t is 16-bit, this form may use surrogate
pairs (in which cast wstr_length differs form length). wstr_length
differs from length only if there are surrogate pairs in the
representation.

utf8_length, utf8: UTF-8 representation (null-terminated).

data: shortest-form representation of the unicode string. The string
is null-terminated (in its respective representation).

All three representations are optional, although the data form is
considered the canonical representation which can be absent only while
the string is being created. If the representation is absent, the
pointer is NULL, and the corresponding length field may contain
arbitrary data.
"""

If the string was created from a wchar_t string, that string will be
retained, and presumably can be used to re-output the original for a
clean and fast round-trip. Same with...

> I also wonder why the implementation bothers keeping a UTF-8
> representation. That sounds like premature optimization to me. Surely you
> only need it when writing to a file with UTF-8 encoding? For most
> strings, that will never happen.

... the UTF-8 version. It'll keep it if it has it, and not else. A lot
of content will go out in the same encoding it came in in, so it makes
sense to hang onto it where possible.

Though, from the same quote: The UTF-8 representation is
null-terminated. Does this mean that it can't be used if there might
be a \0 in the string?

Minor nitpick, btw:
> (in which cast wstr_length differs form length)
Should be "in which case" and "from". Who has the power to correct
typos in PEPs?

ChrisA

[toc] | [prev] | [next] | [standalone]

#42213 — Re: Surrogate pairs in new flexible string representation [was Re: flaming vs accuracy [was Re: Performance of int/long in Python 3]]

From	Steven D'Aprano <steve+comp.lang.python@pearwood.info>
Date	2013-03-29 02:37 +0000
Subject	Re: Surrogate pairs in new flexible string representation [was Re: flaming vs accuracy [was Re: Performance of int/long in Python 3]]
Message-ID	<5154fe82$0$29974$c3e8da3$5496439d@news.astraweb.com>
In reply to	#42209

On Fri, 29 Mar 2013 11:54:41 +1100, Chris Angelico wrote:

> On Fri, Mar 29, 2013 at 11:39 AM, Steven D'Aprano
> <steve+comp.lang.python@pearwood.info> wrote:
>> ASCII and Latin-1 strings obviously do not have them. Nor do BMP-only
>> strings. It's only strings in the SMPs that could need surrogate pairs,
>> and they don't need them in Python's implementation since it's a full
>> 32- bit implementation. So where do the surrogate pairs come into this?
> 
> PEP 393 says:
> """
> wstr_length, wstr: representation in platform's wchar_t
> (null-terminated). If wchar_t is 16-bit, this form may use surrogate
> pairs (in which cast wstr_length differs form length). wstr_length
> differs from length only if there are surrogate pairs in the
> representation.
> 
> utf8_length, utf8: UTF-8 representation (null-terminated).
> 
> data: shortest-form representation of the unicode string. The string is
> null-terminated (in its respective representation).
> 
> All three representations are optional, although the data form is
> considered the canonical representation which can be absent only while
> the string is being created. If the representation is absent, the
> pointer is NULL, and the corresponding length field may contain
> arbitrary data.
> """

All the words are in English (well, most of them...) but what does it 
mean?

> If the string was created from a wchar_t string, that string will be
> retained, and presumably can be used to re-output the original for a
> clean and fast round-trip.

Under what circumstances will a string be created from a wchar_t string? 
How, and why, would such a string be created? Why would Python still 
support strings containing surrogates when it now has a nice, shiny, 
surrogate-free flexible representation?



>> I also wonder why the implementation bothers keeping a UTF-8
>> representation. That sounds like premature optimization to me. Surely
>> you only need it when writing to a file with UTF-8 encoding? For most
>> strings, that will never happen.
> 
> ... the UTF-8 version. It'll keep it if it has it, and not else. A lot
> of content will go out in the same encoding it came in in, so it makes
> sense to hang onto it where possible.

Not to me. That almost doubles the size of the string, on the off-chance 
that you'll need the UTF-8 encoding. Which for many uses, you don't, and 
even if you do, it seems like premature optimization to keep it around 
just in case. Encoding to UTF-8 will be fast for small N, and for large 
N, why carry around (potentially) multiple megabytes of duplicated data 
just in case the encoded version is needed some time?


> Though, from the same quote: The UTF-8 representation is
> null-terminated. Does this mean that it can't be used if there might be
> a \0 in the string?
> 
> Minor nitpick, btw:
>> (in which cast wstr_length differs form length)
> Should be "in which case" and "from". Who has the power to correct typos
> in PEPs?
> 
> ChrisA

[toc] | [prev] | [next] | [standalone]

#42215 — Re: Surrogate pairs in new flexible string representation [was Re: flaming vs accuracy [was Re: Performance of int/long in Python 3]]

From	Chris Angelico <rosuav@gmail.com>
Date	2013-03-29 13:44 +1100
Subject	Re: Surrogate pairs in new flexible string representation [was Re: flaming vs accuracy [was Re: Performance of int/long in Python 3]]
Message-ID	<mailman.3932.1364525092.2939.python-list@python.org>
In reply to	#42213

On Fri, Mar 29, 2013 at 1:37 PM, Steven D'Aprano
<steve+comp.lang.python@pearwood.info> wrote:
> Under what circumstances will a string be created from a wchar_t string?
> How, and why, would such a string be created? Why would Python still
> support strings containing surrogates when it now has a nice, shiny,
> surrogate-free flexible representation?

Strings are created from some form of content. If not from another
Python string, then - most likely - it's from a stream of bytes. If
from a C API that returns wchar_t, then it'd make sense to have that
form around.

ChrisA

[toc] | [prev] | [next] | [standalone]

#42227 — Re: Surrogate pairs in new flexible string representation [was Re: flaming vs accuracy [was Re: Performance of int/long in Python 3]]

From	Ian Kelly <ian.g.kelly@gmail.com>
Date	2013-03-29 00:11 -0600
Subject	Re: Surrogate pairs in new flexible string representation [was Re: flaming vs accuracy [was Re: Performance of int/long in Python 3]]
Message-ID	<mailman.3939.1364537545.2939.python-list@python.org>
In reply to	#42213

On Thu, Mar 28, 2013 at 8:37 PM, Steven D'Aprano
<steve+comp.lang.python@pearwood.info> wrote:
>>> I also wonder why the implementation bothers keeping a UTF-8
>>> representation. That sounds like premature optimization to me. Surely
>>> you only need it when writing to a file with UTF-8 encoding? For most
>>> strings, that will never happen.
>>
>> ... the UTF-8 version. It'll keep it if it has it, and not else. A lot
>> of content will go out in the same encoding it came in in, so it makes
>> sense to hang onto it where possible.
>
> Not to me. That almost doubles the size of the string, on the off-chance
> that you'll need the UTF-8 encoding. Which for many uses, you don't, and
> even if you do, it seems like premature optimization to keep it around
> just in case. Encoding to UTF-8 will be fast for small N, and for large
> N, why carry around (potentially) multiple megabytes of duplicated data
> just in case the encoded version is needed some time?

>From the PEP:

"""
A new function PyUnicode_AsUTF8 is provided to access the UTF-8
representation. It is thus identical to the existing
_PyUnicode_AsString, which is removed. The function will compute the
utf8 representation when first called. Since this representation will
consume memory until the string object is released, applications
should use the existing PyUnicode_AsUTF8String where possible (which
generates a new string object every time). APIs that implicitly
converts a string to a char* (such as the ParseTuple functions) will
use PyUnicode_AsUTF8 to compute a conversion.
"""

So the utf8 representation is not populated when the string is
created, but when a utf8 representation is requested, and only when
requested by the API that returns a char*, not by the API that returns
a bytes object.

[toc] | [prev] | [next] | [standalone]

#42228 — Re: Surrogate pairs in new flexible string representation [was Re: flaming vs accuracy [was Re: Performance of int/long in Python 3]]

From	Ian Kelly <ian.g.kelly@gmail.com>
Date	2013-03-29 00:22 -0600
Subject	Re: Surrogate pairs in new flexible string representation [was Re: flaming vs accuracy [was Re: Performance of int/long in Python 3]]
Message-ID	<mailman.3940.1364538176.2939.python-list@python.org>
In reply to	#42213

On Fri, Mar 29, 2013 at 12:11 AM, Ian Kelly <ian.g.kelly@gmail.com> wrote:
> From the PEP:
>
> """
> A new function PyUnicode_AsUTF8 is provided to access the UTF-8
> representation. It is thus identical to the existing
> _PyUnicode_AsString, which is removed. The function will compute the
> utf8 representation when first called. Since this representation will
> consume memory until the string object is released, applications
> should use the existing PyUnicode_AsUTF8String where possible (which
> generates a new string object every time). APIs that implicitly
> converts a string to a char* (such as the ParseTuple functions) will
> use PyUnicode_AsUTF8 to compute a conversion.
> """
>
> So the utf8 representation is not populated when the string is
> created, but when a utf8 representation is requested, and only when
> requested by the API that returns a char*, not by the API that returns
> a bytes object.

Since the PEP specifically mentions ParseTuple string conversion, I am
thinking that this is probably the motivation for caching it.  A
string that is passed into a C function (that uses one of the various
UTF-8 char* format specifiers) is perhaps likely to be passed into
that function again at some point, so the UTF-8 representation is kept
around to avoid the need to recompose it at on each call.

[toc] | [prev] | [next] | [standalone]

#42262 — Re: Surrogate pairs in new flexible string representation [was Re: flaming vs accuracy [was Re: Performance of int/long in Python 3]]

From	Terry Reedy <tjreedy@udel.edu>
Date	2013-03-29 14:06 -0400
Subject	Re: Surrogate pairs in new flexible string representation [was Re: flaming vs accuracy [was Re: Performance of int/long in Python 3]]
Message-ID	<mailman.3957.1364580449.2939.python-list@python.org>
In reply to	#42213

On 3/28/2013 10:37 PM, Steven D'Aprano wrote:

> Under what circumstances will a string be created from a wchar_t string?
> How, and why, would such a string be created? Why would Python still
> support strings containing surrogates when it now has a nice, shiny,
> surrogate-free flexible representation?

I believe because surrogates are legal codepoints and users may put them 
in strings even though python does not (except for surrogate_escape 
error handling).

I believe some of the internal complexity comes from supporting the old 
C-api so as to not immediately invalidate existing extensions.

-- 
Terry Jan Reedy

[toc] | [prev] | [next] | [standalone]

#42285 — Re: Surrogate pairs in new flexible string representation

From	Christian Heimes <christian@python.org>
Date	2013-03-29 23:05 +0100
Subject	Re: Surrogate pairs in new flexible string representation
Message-ID	<mailman.3969.1364594730.2939.python-list@python.org>
In reply to	#42213

Am 29.03.2013 07:22, schrieb Ian Kelly:
> Since the PEP specifically mentions ParseTuple string conversion, I am
> thinking that this is probably the motivation for caching it.  A
> string that is passed into a C function (that uses one of the various
> UTF-8 char* format specifiers) is perhaps likely to be passed into
> that function again at some point, so the UTF-8 representation is kept
> around to avoid the need to recompose it at on each call.

It's not just about caching but also about memory management. The
additional utf8 member is required for backward compatibility. The APIs
expect a pointer to an existing and shared block of memory. They don't
take ownership of the memory block and therefore don't free() it.

Christian

[toc] | [prev] | [next] | [standalone]

#42210 — Re: Surrogate pairs in new flexible string representation [was Re: flaming vs accuracy [was Re: Performance of int/long in Python 3]]

From	Mark Lawrence <breamoreboy@yahoo.co.uk>
Date	2013-03-29 01:03 +0000
Subject	Re: Surrogate pairs in new flexible string representation [was Re: flaming vs accuracy [was Re: Performance of int/long in Python 3]]
Message-ID	<mailman.3929.1364519036.2939.python-list@python.org>
In reply to	#42208

On 29/03/2013 00:54, Chris Angelico wrote:
>
> Minor nitpick, btw:
>> (in which cast wstr_length differs form length)
> Should be "in which case" and "from". Who has the power to correct
> typos in PEPs?
>
> ChrisA
>

Sneak it in here? http://bugs.python.org/issue13604

-- 
If you're using GoogleCrap™ please read this 
http://wiki.python.org/moin/GoogleGroupsPython.

Mark Lawrence

[toc] | [prev] | [next] | [standalone]

#42211 — Re: Surrogate pairs in new flexible string representation [was Re: flaming vs accuracy [was Re: Performance of int/long in Python 3]]

From	Chris Angelico <rosuav@gmail.com>
Date	2013-03-29 12:10 +1100
Subject	Re: Surrogate pairs in new flexible string representation [was Re: flaming vs accuracy [was Re: Performance of int/long in Python 3]]
Message-ID	<mailman.3930.1364519457.2939.python-list@python.org>
In reply to	#42208

On Fri, Mar 29, 2013 at 12:03 PM, Mark Lawrence <breamoreboy@yahoo.co.uk> wrote:
> On 29/03/2013 00:54, Chris Angelico wrote:
>> Minor nitpick, btw:
>>>
>>> (in which cast wstr_length differs form length)
>>
>> Should be "in which case" and "from". Who has the power to correct
>> typos in PEPs?
>
> Sneak it in here? http://bugs.python.org/issue13604

Ah! Turns out it's already been fixed; a reword of that section, as
shown in the attached files, no longer has the parenthesis, and thus
its typos.

ChrisA

[toc] | [prev] | [next] | [standalone]

#42212 — Re: Surrogate pairs in new flexible string representation [was Re: flaming vs accuracy [was Re: Performance of int/long in Python 3]]

From	MRAB <python@mrabarnett.plus.com>
Date	2013-03-29 02:00 +0000
Subject	Re: Surrogate pairs in new flexible string representation [was Re: flaming vs accuracy [was Re: Performance of int/long in Python 3]]
Message-ID	<mailman.3931.1364522420.2939.python-list@python.org>
In reply to	#42208

On 29/03/2013 00:54, Chris Angelico wrote:
> On Fri, Mar 29, 2013 at 11:39 AM, Steven D'Aprano
> <steve+comp.lang.python@pearwood.info> wrote:
>> ASCII and Latin-1 strings obviously do not have them. Nor do BMP-only
>> strings. It's only strings in the SMPs that could need surrogate pairs,
>> and they don't need them in Python's implementation since it's a full 32-
>> bit implementation. So where do the surrogate pairs come into this?
>
> PEP 393 says:
> """
> wstr_length, wstr: representation in platform's wchar_t
> (null-terminated). If wchar_t is 16-bit, this form may use surrogate
> pairs (in which cast wstr_length differs form length). wstr_length
> differs from length only if there are surrogate pairs in the
> representation.
>
> utf8_length, utf8: UTF-8 representation (null-terminated).
>
> data: shortest-form representation of the unicode string. The string
> is null-terminated (in its respective representation).
>
> All three representations are optional, although the data form is
> considered the canonical representation which can be absent only while
> the string is being created. If the representation is absent, the
> pointer is NULL, and the corresponding length field may contain
> arbitrary data.
> """
>
> If the string was created from a wchar_t string, that string will be
> retained, and presumably can be used to re-output the original for a
> clean and fast round-trip. Same with...
>
>> I also wonder why the implementation bothers keeping a UTF-8
>> representation. That sounds like premature optimization to me. Surely you
>> only need it when writing to a file with UTF-8 encoding? For most
>> strings, that will never happen.
>
> ... the UTF-8 version. It'll keep it if it has it, and not else. A lot
> of content will go out in the same encoding it came in in, so it makes
> sense to hang onto it where possible.
>
> Though, from the same quote: The UTF-8 representation is
> null-terminated. Does this mean that it can't be used if there might
> be a \0 in the string?
>
You could ask the same question about any encoding.

It's only an issue if it's passed to a C function which expects a
null-terminated string.

> Minor nitpick, btw:
>> (in which cast wstr_length differs form length)
> Should be "in which case" and "from". Who has the power to correct
> typos in PEPs?
>

[toc] | [prev] | [next] | [standalone]

#42166 — Re: flaming vs accuracy [was Re: Performance of int/long in Python 3]

From	Chris Angelico <rosuav@gmail.com>
Date	2013-03-29 03:16 +1100
Subject	Re: flaming vs accuracy [was Re: Performance of int/long in Python 3]
Message-ID	<mailman.3900.1364487418.2939.python-list@python.org>
In reply to	#42128

On Fri, Mar 29, 2013 at 3:01 AM, Terry Reedy <tjreedy@udel.edu> wrote:
> On 3/28/2013 10:38 AM, Chris Angelico wrote:
>
>> PEP393 strings have two optimizations, or kinda three:
>>
>> 1a) ASCII-only strings
>> 1b) Latin1-only strings
>> 2) BMP-only strings
>> 3) Everything else
>>
>> Options 1a and 1b are almost identical - I'm not sure what the detail
>> is, but there's something flagging those strings that fit inside seven
>> bits. (Something to do with optimizing encodings later?)
>
>
> Yes. 'Encoding' an ascii-only string to any ascii-compatible encoding
> amounts to a simple copy of the internal bytes. I do not know if *all* the
> codecs for such encodings are 393-aware, but I do know that the utf-8 and
> latin-1 group are. This is one operation that 3.3+ does much faster than
> 3.2-

Thanks Terry. So that's not so much a representation difference as a
flag that costs little or nothing to retain, and can improve
performance in the encode later on. Sounds like a useful tweak to the
basics of flexible string representation, without being particularly
germane to jmf's complaints.

ChrisA

[toc] | [prev] | [next] | [standalone]

#42161 — Re: flaming vs accuracy [was Re: Performance of int/long in Python 3]

From	Ian Kelly <ian.g.kelly@gmail.com>
Date	2013-03-28 10:01 -0600
Subject	Re: flaming vs accuracy [was Re: Performance of int/long in Python 3]
Message-ID	<mailman.3896.1364486514.2939.python-list@python.org>
In reply to	#42123

On Thu, Mar 28, 2013 at 7:01 AM, Steven D'Aprano
<steve+comp.lang.python@pearwood.info> wrote:
> Any string method that takes a starting offset requires the method to
> walk the string byte-by-byte. I've even seen languages put responsibility
> for dealing with that onto the programmer: the "start offset" is given in
> *bytes*, not characters. I don't remember what language this was... it
> might have been Haskell? Whatever it was, it horrified me.

Go does this.  I remember because it came up in one of these threads,
where jmf (or was it Ranting Rick?) was praising Go for just getting
Unicode "right".

[toc] | [prev] | [next] | [standalone]

#42216 — Re: flaming vs accuracy [was Re: Performance of int/long in Python 3]

From	Neil Hodgson <nhodgson@iinet.net.au>
Date	2013-03-29 14:34 +1100
Subject	Re: flaming vs accuracy [was Re: Performance of int/long in Python 3]
Message-ID	<-LGdnWTpyKcdkcjMnZ2dnUVZ_jCdnZ2d@westnet.com.au>
In reply to	#42123

Steven D'Aprano:

> Some string operations need to inspect every character, e.g. str.upper().
> Even for them, the increased complexity of a variable-width encoding
> costs. It's not sufficient to walk the string inspecting a fixed 1, 2 or
> 4 bytes per character. You have to walk the string grabbing 1 byte at a
> time, and then decide whether you need another 1, 2 or 3 bytes. Even
> though it's still O(N), the added bit-masking and overhead of variable-
> width encoding adds to the overall cost.

    It does add to implementation complexity but should only add a small 
amount of time.

    To compare costs, I am using the text of the web site 
http://www.mofa.go.jp/mofaj/ since it has a reasonable amount (10%) of 
multi-byte characters. Since the document fits in the the BMP, Python 
would choose a 2-byte wide implementation so I am emulating that choice 
with a very simple 16-bit table-based upper-caser. Real Unicode case 
conversion code is more concerned with edge cases like Turkic and 
Lithuanian locales and Greek combining characters and also allowing for 
measurement/reallocation for the cases where the result is 
smaller/larger. See, for example, glib's real_toupper in 
https://git.gnome.org/browse/glib/tree/glib/guniprop.c

    Here is some simplified example code that implements upper-casing 
over 16-bit wide (utf16_up) and UTF-8 (utf8_up) buffers:
http://www.scintilla.org/UTF8Up.cxx
    Since I didn't want to spend too much time writing code it only 
handles the BMP and doesn't have upper-case table entries outside ASCII 
for now. If this was going to be worked on further to be made 
maintainable, most of the masking and so forth would be in macros 
similar to UTF8_COMPUTE/UTF8_GET in glib.
    The UTF-8 case ranges from around 5% slower on average in a 32 bit 
release build (VC2012 on an i7 870) to averaging a little faster in a 
64-bit build. They're both around a billion characters per-second.

C:\u\hg\UpUTF\UpUTF>..\x64\Release\UpUTF.exe
Time taken for UTF8 of 80449=0.006528
Time taken for UTF16 of 71525=0.006610
Relative time taken UTF8/UTF16 0.987581

> Any string method that takes a starting offset requires the method to
> walk the string byte-by-byte. I've even seen languages put responsibility
> for dealing with that onto the programmer: the "start offset" is given in
> *bytes*, not characters. I don't remember what language this was... it
> might have been Haskell? Whatever it was, it horrified me.

    It doesn't horrify me - I've been working this way for over 10 years 
and it seems completely natural. You can wrap access in iterators that 
hide the byte offsets if you like. This then ensures that all operations 
on those iterators are safe only allowing the iterator to point at the 
start/end of valid characters.

> Sure. And over a different set of samples, it is less compact. If you
> write a lot of Latin-1, Python will use one byte per character, while
> UTF-8 will use two bytes per character.

    I think you mean writing a lot of Latin-1 characters outside ASCII. 
However, even people writing texts in, say, French will find that only a 
small proportion of their text is outside ASCII and so the cost of UTF-8 
is correspondingly small.

    The counter-problem is that a French document that needs to include 
one mathematical symbol (or emoji) outside Latin-1 will double in size 
as a Python string.

    Neil

[toc] | [prev] | [next] | [standalone]

#42222 — unicode and the FSR [was: Re: flaming vs accuracy [was Re: Performance of int/long in Python 3]]

From	Ethan Furman <ethan@stoneleaf.us>
Date	2013-03-28 21:56 -0700
Subject	unicode and the FSR [was: Re: flaming vs accuracy [was Re: Performance of int/long in Python 3]]
Message-ID	<mailman.3936.1364533284.2939.python-list@python.org>
In reply to	#42216

On 03/28/2013 08:34 PM, Neil Hodgson wrote:
> Steven D'Aprano:
>
>> Any string method that takes a starting offset requires the method to
>> walk the string byte-by-byte. I've even seen languages put responsibility
>> for dealing with that onto the programmer: the "start offset" is given in
>> *bytes*, not characters. I don't remember what language this was... it
>> might have been Haskell? Whatever it was, it horrified me.
>
>     It doesn't horrify me - I've been working this way for over 10 years and it seems completely natural.

Horrifying or not, I am willing to give up a small amount of speed for correctness.  Heck, I'm willing to give up a lot 
of speed for correctness.  Once I have my slow but correct prototype going I can recode in a faster language (if needed) 
and compare it's blazingly fast output with my slowly-generated but known-good output.

>  You can wrap
> access in iterators that hide the byte offsets if you like. This then ensures that all operations on those iterators are
> safe only allowing the iterator to point at the start/end of valid characters.

Sure.  Or I can let Python handle it for me.

>     The counter-problem is that a French document that needs to include one mathematical symbol (or emoji) outside
> Latin-1 will double in size as a Python string.

True.  But how often do you have the entire document as a single string?  Use readlines() instead of read().  Besides, 
memory is cheap.

--
~Ethan~

[toc] | [prev] | [next] | [standalone]

#42224 — Re: flaming vs accuracy [was Re: Performance of int/long in Python 3]

From	Chris Angelico <rosuav@gmail.com>
Date	2013-03-29 16:33 +1100
Subject	Re: flaming vs accuracy [was Re: Performance of int/long in Python 3]
Message-ID	<mailman.3937.1364535241.2939.python-list@python.org>
In reply to	#42216

On Fri, Mar 29, 2013 at 2:34 PM, Neil Hodgson <nhodgson@iinet.net.au> wrote:
>    It doesn't horrify me - I've been working this way for over 10 years and
> it seems completely natural. You can wrap access in iterators that hide the
> byte offsets if you like. This then ensures that all operations on those
> iterators are safe only allowing the iterator to point at the start/end of
> valid characters.

But both this and your example of case conversion are, fundamentally,
iterating over the string. What if you aren't doing that? What if you
want to parse and process?

ChrisA

[toc] | [prev] | [next] | [standalone]

#42226 — Re: flaming vs accuracy [was Re: Performance of int/long in Python 3]

From	Neil Hodgson <nhodgson@iinet.net.au>
Date	2013-03-29 16:46 +1100
Subject	Re: flaming vs accuracy [was Re: Performance of int/long in Python 3]
Message-ID	<mtGdnT6PbeH5tsjMnZ2dnUVZ_vadnZ2d@westnet.com.au>
In reply to	#42224

Chris Angelico:

> But both this and your example of case conversion are, fundamentally,
> iterating over the string. What if you aren't doing that? What if you
> want to parse and process?

    Parsing is also normally a scanning operation. If you want to 
process pieces of the string based on the parse then you remember the 
positions (as iterators) at the significant places and extract/process 
the data based on those positions.

    Neil

[toc] | [prev] | [next] | [standalone]

Page 9 of 11 — ← Prev page 1 … 7 8 [9] 10 11 Next page →

csiph-web

Performance of int/long in Python 3

Contents

#42151 — Re: flaming vs accuracy [was Re: Performance of int/long in Python 3]

#42160 — Re: flaming vs accuracy [was Re: Performance of int/long in Python 3]

#42164 — Re: flaming vs accuracy [was Re: Performance of int/long in Python 3]

#42208 — Surrogate pairs in new flexible string representation [was Re: flaming vs accuracy [was Re: Performance of int/long in Python 3]]

#42209 — Re: Surrogate pairs in new flexible string representation [was Re: flaming vs accuracy [was Re: Performance of int/long in Python 3]]

#42213 — Re: Surrogate pairs in new flexible string representation [was Re: flaming vs accuracy [was Re: Performance of int/long in Python 3]]

#42215 — Re: Surrogate pairs in new flexible string representation [was Re: flaming vs accuracy [was Re: Performance of int/long in Python 3]]

#42227 — Re: Surrogate pairs in new flexible string representation [was Re: flaming vs accuracy [was Re: Performance of int/long in Python 3]]

#42228 — Re: Surrogate pairs in new flexible string representation [was Re: flaming vs accuracy [was Re: Performance of int/long in Python 3]]

#42262 — Re: Surrogate pairs in new flexible string representation [was Re: flaming vs accuracy [was Re: Performance of int/long in Python 3]]

#42285 — Re: Surrogate pairs in new flexible string representation

#42210 — Re: Surrogate pairs in new flexible string representation [was Re: flaming vs accuracy [was Re: Performance of int/long in Python 3]]

#42211 — Re: Surrogate pairs in new flexible string representation [was Re: flaming vs accuracy [was Re: Performance of int/long in Python 3]]

#42212 — Re: Surrogate pairs in new flexible string representation [was Re: flaming vs accuracy [was Re: Performance of int/long in Python 3]]

#42166 — Re: flaming vs accuracy [was Re: Performance of int/long in Python 3]

#42161 — Re: flaming vs accuracy [was Re: Performance of int/long in Python 3]

#42216 — Re: flaming vs accuracy [was Re: Performance of int/long in Python 3]

#42222 — unicode and the FSR [was: Re: flaming vs accuracy [was Re: Performance of int/long in Python 3]]

#42224 — Re: flaming vs accuracy [was Re: Performance of int/long in Python 3]

#42226 — Re: flaming vs accuracy [was Re: Performance of int/long in Python 3]