From: Mark Lawrence
Newsgroups: comp.lang.python
To: python-list@python.org
Subject: Re: Finding size of Variable
Date: Wed, 12 Feb 2014 14:04:42 +0000

On 12/02/2014 07:49, wxjmfauth@gmail.com wrote:
> On Tuesday, 11 February 2014 20:04:02 UTC+1, Mark Lawrence wrote:
>> On 11/02/2014 18:53, wxjmfauth@gmail.com wrote:
>>> On Monday, 10 February 2014 15:43:08 UTC+1, Tim Chase wrote:
>>>> On 2014-02-10 06:07, wxjmfauth@gmail.com wrote:
>>>>> Python does not save memory at all. A str (unicode string)
>>>>> uses less memory only - and only - because and when one uses
>>>>> explicitly characters which are consuming less memory.
>>>>>
>>>>> Not only the memory gain is zero, Python falls back to the
>>>>> worst case.
>>>>>
>>>>> >>> sys.getsizeof('a' * 1000000)
>>>>> 1000025
>>>>> >>> sys.getsizeof('a' * 1000000 + 'œ')
>>>>> 2000040
>>>>> >>> sys.getsizeof('a' * 1000000 + 'œ' + '\U00010000')
>>>>> 4000048
>>>>
>>>> If Python used UTF-32 for EVERYTHING, then all three of those cases
>>>> would be 4000048, so it clearly disproves your claim that "python
>>>> does not save memory at all".
>>>>
>>>>> The opposite of what the utf8/utf16 do!
>>>>>
>>>>> >>> sys.getsizeof(('a' * 1000000 + 'œ' + '\U00010000').encode('utf-8'))
>>>>> 1000023
>>>>> >>> sys.getsizeof(('a' * 1000000 + 'œ' + '\U00010000').encode('utf-16'))
>>>>> 2000025
>>>>
>>>> However, as pointed out repeatedly, string indexing in fixed-width
>>>> encodings is O(1) while indexing into variable-width encodings (e.g.
>>>> UTF8/UTF16) is O(N). The FSR gives the benefits of O(1) indexing
>>>> while saving space when a string doesn't need to use a full 32-bit
>>>> width.
>>>>
>>>
>>> A utf optimizes the memory and the performance at the same time.
>>> It behaves like a mathematical operator, a unique operator for
>>> a unique set of elements. Unbeatable.
>>>
>>> The FSR is an exclusive or mechanism. If you wish to
>>> save memory, you have to encode, and if you are encoding,
>>> maybe because you have to, one loses performance. Paradoxical.
>>>
>>> Your O(1) indexing works only and only because and
>>> when you are working explicitly with a "static" unicode
>>> string you never touch.
>>> It's a little bit the "corresponding" performance
>>> case of the memory case.
>>>
>>> jmf
>>>
>>
>> Why are you so rude as to continually post your nonsense here that not a
>> single person believes, and at the same time still quite deliberately
>> use gg to post it with double line spacing? If you lack the courtesy to
>> stop the former, please have the courtesy to stop the latter.
>>
>> --
>> My fellow Pythonistas, ask not what our language can do for you, ask
>> what you can do for our language.
>>
>
> Nonsense?
>
> >>> sys.getsizeof('') - sys.getsizeof('a')
> -1
>
>
> The day you find an operator working on the set of
> reals (R) and it is somehow "optimized" for N
> (the subset of natural numbers), let me know.
>
> A conflict is quickly appearing. Either the operator is
> not correctly defined or the choice of the set is wrong.
>
> You can replace the "operator" with an "encoding" and
> the "set" with a "repertoire of characters".
>
> It's the main reason, why we have to live today with
> all these coding schemes. Even in more sophisticated
> cases like, CID-fonts or "char boxes" in a pdf (with the
> hope you understand how it works).
>
> jmf
>

I ask you, members of the jury, to find the accused, jmf, guilty of
writing nonsense and deliberately using google groups to double line
space. The evidence is directly above and quite clearly proves, beyond
a reasonable doubt, that no verdict other than guilty can be recorded.

I rest my case, m'lud.

--
My fellow Pythonistas, ask not what our language can do for you, ask
what you can do for our language.

Mark Lawrence
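
For anyone who wants to check the fixed-width behaviour Tim Chase
describes in the quoted text, here is a minimal sketch. It is an
illustration only, assuming CPython 3.6 or later (the flexible string
representation arrived in 3.3; the f-strings need 3.6). The helper
name bytes_per_char is made up for this example and does not come
from the thread. It compares two strings of the same kind so that the
fixed per-object overhead cancels out.

```python
import sys


def bytes_per_char(ch, n=1000):
    """Estimate bytes stored per character for a string made only of ch.

    Two strings of the same kind are compared so the fixed object
    overhead cancels out. Assumes CPython's flexible string
    representation; other implementations may report other sizes.
    """
    short = ch * n
    long_ = ch * (2 * n)
    return (sys.getsizeof(long_) - sys.getsizeof(short)) / n


# One sample character from each storage width.
widths = {
    "a": bytes_per_char("a"),                    # ASCII
    "\u0153": bytes_per_char("\u0153"),          # beyond Latin-1, inside the BMP
    "\U00010000": bytes_per_char("\U00010000"),  # outside the BMP
}

# A single wide character widens the whole string, as in the quoted session.
n = 1000
ascii_only = sys.getsizeof("a" * n)
with_wide = sys.getsizeof("a" * n + "\u0153")

for ch, width in widths.items():
    print(f"U+{ord(ch):04X}: {width} bytes per character")
print(f"'a' * {n}: {ascii_only} bytes; with one U+0153 appended: {with_wide} bytes")
```

Under those assumptions the three widths should come out as 1, 2 and 4
bytes per character, which matches the ratios in the quoted getsizeof
figures. The absolute sizes depend on the interpreter version and on
whether the build is 32-bit or 64-bit, so they will not necessarily
match the numbers quoted above.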