Path: csiph.com!usenet.pasdenom.info!news.redatomik.org!border1.nntp.ams1.giganews.com!nntp.giganews.com!usenetcore.com!newsfeed.xs4all.nl!newsfeed7.news.xs4all.nl!newsgate.cistron.nl!newsgate.news.xs4all.nl!post.news.xs4all.nl!not-for-mail Return-Path: X-Original-To: python-list@python.org Delivered-To: python-list@mail.python.org X-Spam-Status: OK 0.001 X-Spam-Evidence: '*H*': 1.00; '*S*': 0.00; '16,': 0.03; 'base.': 0.05; 'binary': 0.05; 'linear': 0.07; 'shortcut': 0.07; 'subject:skip:b 10': 0.07; 'table.': 0.07; 'logic': 0.09; 'lookup': 0.09; 'bug': 0.10; 'cc:addr:python-list': 0.10; 'def': 0.14; 'wed,': 0.15; '32,': 0.16; '64,': 0.16; 'algorithm.': 0.16; 'algorithmic': 0.16; 'bases,': 0.16; 'binary,': 0.16; 'complexity,': 0.16; 'digits.': 0.16; 'division.': 0.16; 'from:addr:rosuav': 0.16; 'from:name:chris angelico': 0.16; 'hexadecimal': 0.16; 'itself),': 0.16; 'mappings,': 0.16; 'received:mail-ig0-x22a.google.com': 0.16; 'scope.': 0.16; 'wrote:': 0.16; 'instance,': 0.18; 'integer': 0.18; 'intermediate': 0.18; '>>>': 0.20; 'otherwise,': 0.20; 'machine': 0.21; 'cc:2**0': 0.21; 'cc:addr:python.org': 0.21; '(the': 0.22; 'fairly': 0.22; 'am,': 0.23; '2015': 0.23; 'header:In-Reply-To:1': 0.24; 'converting': 0.27; 'operations,': 0.27; 'sequence': 0.27; 'message-id:@mail.gmail.com': 0.28; 'division': 0.29; 'function:': 0.29; 'value)': 0.29; 'convert': 0.29; 'certainly': 0.31; 'subject:time': 0.31; "can't": 0.32; 'table': 0.32; 'related': 0.32; 'true.': 0.33; 'running': 0.34; 'received:google.com': 0.34; 'that,': 0.34; 'useful': 0.35; 'done': 0.35; 'but': 0.36; 'except': 0.36; 'there': 0.36; 'faster': 0.36; 'subject:: ': 0.37; "skip:' 20": 0.37; 'tue,': 0.38; 'christian': 0.38; 'building': 0.38; 'where': 0.40; 'some': 0.40; 'simple': 0.61; 'taking': 0.62; '30,': 0.63; 'relatively': 0.63; 'complete': 0.63; 'within': 0.64; 'different': 0.64; 'between': 0.65; 'therefore': 0.65; '>>>>>': 0.66; 'believe': 0.67; 'jul': 0.72; 'special': 0.72; 'million': 0.73; 'square': 0.76; 'actually,': 0.84; 'chrisa': 0.84; 'faster.': 0.84; 'gollwitzer': 0.84; 'variation': 0.84; 'to:none': 0.90; 'analyses': 0.91; 'ratio': 0.91 DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=gmail.com; s=20120113; h=mime-version:in-reply-to:references:date:message-id:subject:from:cc :content-type; bh=xUsj1B1ScTpuhf4ruh5nk7UoT542+dJHGY7MvTLtIOk=; b=X22m6W565TlbJyZHh1n4EOFYio714ZTkdruBxwacSfAZJNGVK+Jb7HkTUlPNNVO4pJ EVHF5TqcNwwtC9f+vKpDGMS6rY6sumjyi4Ko2va8EIeUrJ42Z4+XGAHZzosIqmGtDUXg 0fkurlEPlCI/ZaxoFCwluAWShoLjy90kD8OcVvhiWq+96vyXyJWIF1RaTIcinVcauoGT NSyWTnjCHdyd1Firw22Jw9o/3uvUPx/WrJqHN9iOxmLd2BNTpV6waUmH8Z88qb4kYfHu yQxnurRmQ4DsVSz/qr+DfFWsBI16qBcyEOJszw6xeQO1PvRyecHFcpHnF1nfnAkm275u k1iQ== MIME-Version: 1.0 X-Received: by 10.50.143.104 with SMTP id sd8mr25160294igb.14.1435680615398; Tue, 30 Jun 2015 09:10:15 -0700 (PDT) In-Reply-To: References: <777831f0-d4b4-48f6-ae0b-c9b1ea7ffc06@googlegroups.com> <87r3ouawgt.fsf@bsb.me.uk> <7c6dac9d-5722-4179-bd7e-ceaac6698490@googlegroups.com> Date: Wed, 1 Jul 2015 02:10:15 +1000 Subject: Re: Linear time baseconversion From: Chris Angelico Cc: Python Content-Type: text/plain; charset=UTF-8 X-BeenThere: python-list@python.org X-Mailman-Version: 2.1.20+ Precedence: list List-Id: General discussion list for the Python programming language List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , Newsgroups: comp.lang.python Message-ID: Lines: 57 NNTP-Posting-Host: 2001:888:2000:d::a6 X-Trace: 1435680618 news.xs4all.nl 2859 [2001:888:2000:d::a6]:41505 X-Complaints-To: abuse@xs4all.nl Xref: csiph.com comp.lang.python:93336 On Wed, Jul 1, 2015 at 1:45 AM, Ian Kelly wrote: > On Tue, Jun 30, 2015 at 9:40 AM, Ian Kelly wrote: >> On Tue, Jun 30, 2015 at 3:07 AM, Christian Gollwitzer wrote: >>> Am 30.06.15 um 10:52 schrieb jonas.thornvall@gmail.com: >>>> >>>> It still bug out on very big numbers if base outside integer scope. >>>> I am very keen on suggestions regarding the logic to make it faster. >>> >>> >>> Concerning the algorithmic complexity, it can't be faster than square time >>> in the number of digits N. Baseconversion needs to do a sequence of division >>> operations, where every operation gves you one digit in the new base. The >>> number of digits in the new base is proportional to the number of digits in >>> the old base (the ratio is log b1/log b2). Therefore it will be O(N^2). >> >> I don't think that's true. Here's a linear hexadecimal to binary function: >> >>>>> def hextobin(value): >> ... digits = {'0': '0000', '1': '0001', '2': '0010', '3': '0011', >> ... '4': '0100', '5': '0101', '6': '0110', '7': '0111', >> ... '8': '1000', '9': '1001', 'A': '1010', 'B': '1011', >> ... 'C': '1100', 'D': '1101', 'E': '1110', 'F': '1111'} >> ... return ''.join(digits[d.upper()] for d in value) >> ... >>>>> hextobin('3f') >> '00111111' >> >> I believe this approach can be extended to arbitrary bases with some >> effort, although for converting arbitrary base b1 to b2, you would >> need up to b2 different mappings if b1 and b2 are relatively prime. > > Actually, I think you need up to b1 * b2 mappings, as you're > effectively building a state machine with b1 * b2 states. The mappings > can be pre-computed, however, so actually running the state machine > would then just be a linear algorithm. Arbitrary base conversion has to be stateful. You can take a shortcut like this when the bases are related (eg binary, octal, hexadecimal), but otherwise, you need the division. Consider what happens when you convert the binary digit "1" to decimal, and then follow it with varying numbers of zeros: 1, 2, 4, 8, 16, 32, 64,... 32768, 65536, 131072,... 1048576,... 1073741824,... You can certainly do some useful analyses on the last digits (they'll never be anything other than 2, 4, 8, 6, except for the special case of 1 itself), but there's a lot of variation in the intermediate digits. When there's a simple ratio between the bases, it's fairly straight-forward to convert a few digits at a time. Converting base 256 into base 64, for instance, can be done by taking three digits and yielding four. But within that, you would still need a complete table of all sixteen million possibilities, if you want to do the lookup table. And that only works when there is that kind of relationship. ChrisA