Path: csiph.com!usenet.pasdenom.info!nntpfeed.proxad.net!proxad.net!feeder1-1.proxad.net!ecngs!feeder2.ecngs.de!novso.com!newsfeed.xs4all.nl!newsfeed4a.news.xs4all.nl!xs4all!newsgate.cistron.nl!newsgate.news.xs4all.nl!post.news.xs4all.nl!not-for-mail
Date: Thu, 30 Apr 2015 14:53:10 -0400
From: Dave Angel <davea@davea.name>
User-Agent: Mozilla/5.0 (X11; Linux x86_64; rv:31.0) Gecko/20100101 Thunderbird/31.6.0
MIME-Version: 1.0
To: python-list@python.org
Subject: Re: Is my implementation of happy number OK
References: <87oam5vc8k.fsf@Equus.decebal.nl>
In-Reply-To: <87oam5vc8k.fsf@Equus.decebal.nl>
Content-Type: text/plain; charset=windows-1252; format=flowed
Content-Transfer-Encoding: 7bit
Precedence: list
Newsgroups: comp.lang.python
Message-ID: <mailman.140.1430420006.3680.python-list@python.org>
Lines: 103
NNTP-Posting-Host: 2001:888:2000:d::a6
Xref: csiph.com comp.lang.python:89657

On 04/30/2015 11:59 AM, Cecil Westerhof wrote:
> I implemented happy_number function:
>      _happy_set      = { '1' }
>      _unhappy_set    = set()
>
>      def happy_number(n):
>          """
>          Check if a number is a happy number
>          https://en.wikipedia.org/wiki/Happy_number
>          """
>
>          def create_current(n):
>              current_array = sorted([value for value in str(n) if value != '0'])
>              return (current_array, ''.join(current_array))
>
>          global _happy_set
>          global _unhappy_set
>
>          current_run         = set()
>          current_array, \
>              current_string  = create_current(n)
>          if current_string in _happy_set:
>              return True
>          if current_string in _unhappy_set:
>              return False
>          while True:
>              current_run.add(current_string)
>              current_array, \
>                  current_string = create_current(sum([int(value) ** 2
>                                                       for value in current_string]))
>              if current_string in current_run | _unhappy_set:
>                  _unhappy_set |= current_run
>                  return False
>              if current_string in _happy_set:
>                  _happy_set |= current_run
>                  return True
>
> Besides it need some documentation: is it a good implementation? Or
> are there things I should do differently?
>
> To decide for the values from 1 to 1E8 if it is happy or not, takes
> 280 seconds. Not to bad I think. Also not very good.
>

First comment, if you're coding a complex implementation like this, take 
the time to do a brute force one as well. Then you can compare the 
results between brute force and your optimized one for all possible 
values, and make sure you haven't introduced any bugs.

My brute force looks like:

#Dave's version, brute force

def davea1(n):
     cache = []
     anum = str(n)
     newnum = 0
     while newnum != 1:
         newnum = sum(int(i)*int(i) for i in anum)
         anum = str(newnum)
         if newnum in cache:
             return False     #not a happy number
         cache.append(newnum)
     return True  #found a happy number

I then tried an optimized one, and my speed is only about 10% faster 
than yours for 1e7 loops.  I show it anyway, since I think it reads a 
little better.  And readability counts much more than a little performance.

  #optimizations:
#   cached happy and unhappy sets
#   sort the digits, and compare only the sorted values, without zeroes
davea_happy = {1}
davea_unhappy = set()

SQUARES = dict((str(i), i*i) for i in xrange(10))

def davea2(n):
     global davea_happy, davea_unhappy
     cache = set()
     newnum = n
     while newnum != 1:
         newnum = int("".join(sorted(str(newnum))))
         if newnum in davea_unhappy or newnum in cache:
             davea_unhappy |= cache
             return False     #not a happy number
         if newnum in davea_happy:
             break
         cache.add(newnum)
         newnum = sum(SQUARES[ch] for ch in str(newnum))
     davea_happy |= cache
     return True  #found a happy number

Finally, I did some testing on Jon Ribben's version.  His was 
substantially faster for smaller sets, and about the same for 10*7.  So 
it's likely it'll be slower than yours and mine for 10**8.  But the real 
reason I didn't like it was it produced a much larger set of 
happy_numbers, which could clog memory a lot sooner.  For 10**7 items, I 
had 3250 happy members, and 19630 unhappy.  And Jon had 1418854 happy 
members.

-- 
DaveA