Groups > comp.lang.python > #49945 > unrolled thread

How to make this faster

Started by	Helmut Jarausch <jarausch@igpm.rwth-aachen.de>
First post	2013-07-05 08:22 +0000
Last post	2013-07-05 18:42 +0000
Articles	20 on this page of 27 — 6 participants

Back to article view | Back to comp.lang.python

  How to make this faster Helmut Jarausch <jarausch@igpm.rwth-aachen.de> - 2013-07-05 08:22 +0000
    Re: How to make this faster Fábio Santos <fabiosantosart@gmail.com> - 2013-07-05 10:38 +0100
    Re: How to make this faster Helmut Jarausch <jarausch@igpm.rwth-aachen.de> - 2013-07-05 10:07 +0000
    Re: How to make this faster Oscar Benjamin <oscar.j.benjamin@gmail.com> - 2013-07-05 11:13 +0100
    Re: How to make this faster Helmut Jarausch <jarausch@igpm.rwth-aachen.de> - 2013-07-05 10:53 +0000
      Re: How to make this faster Steven D'Aprano <steve+comp.lang.python@pearwood.info> - 2013-07-05 12:02 +0000
      Re: How to make this faster Fábio Santos <fabiosantosart@gmail.com> - 2013-07-05 13:44 +0100
      Re: How to make this faster Oscar Benjamin <oscar.j.benjamin@gmail.com> - 2013-07-05 14:41 +0100
      Re: How to make this faster Helmut Jarausch <jarausch@igpm.rwth-aachen.de> - 2013-07-05 14:28 +0000
        Re: How to make this faster Oscar Benjamin <oscar.j.benjamin@gmail.com> - 2013-07-05 15:45 +0100
      Re: How to make this faster Helmut Jarausch <jarausch@igpm.rwth-aachen.de> - 2013-07-05 14:48 +0000
        Re: How to make this faster Oscar Benjamin <oscar.j.benjamin@gmail.com> - 2013-07-05 16:26 +0100
      Re: How to make this faster Helmut Jarausch <jarausch@igpm.rwth-aachen.de> - 2013-07-05 14:54 +0000
        Re: How to make this faster Fábio Santos <fabiosantosart@gmail.com> - 2013-07-05 16:18 +0100
        Re: How to make this faster Steven D'Aprano <steve+comp.lang.python@pearwood.info> - 2013-07-05 16:24 +0000
      Re: How to make this faster Helmut Jarausch <jarausch@igpm.rwth-aachen.de> - 2013-07-05 15:17 +0000
        Re: How to make this faster Oscar Benjamin <oscar.j.benjamin@gmail.com> - 2013-07-05 16:38 +0100
        Re: How to make this faster MRAB <python@mrabarnett.plus.com> - 2013-07-05 17:25 +0100
        Re: How to make this faster Joshua Landau <joshua.landau.ws@gmail.com> - 2013-07-06 02:45 +0100
      Re: How to make this faster Helmut Jarausch <jarausch@igpm.rwth-aachen.de> - 2013-07-05 15:47 +0000
        Re: How to make this faster Steven D'Aprano <steve+comp.lang.python@pearwood.info> - 2013-07-05 16:52 +0000
      Re: How to make this faster Helmut Jarausch <jarausch@igpm.rwth-aachen.de> - 2013-07-05 16:07 +0000
        Re: How to make this faster Steven D'Aprano <steve+comp.lang.python@pearwood.info> - 2013-07-05 16:50 +0000
          Re: How to make this faster Helmut Jarausch <jarausch@igpm.rwth-aachen.de> - 2013-07-05 18:39 +0000
            Re: How to make this faster Steven D'Aprano <steve+comp.lang.python@pearwood.info> - 2013-07-06 03:05 +0000
            Re: How to make this faster Helmut Jarausch <jarausch@igpm.rwth-aachen.de> - 2013-07-06 07:25 +0000
      Re: How to make this faster Helmut Jarausch <jarausch@igpm.rwth-aachen.de> - 2013-07-05 18:42 +0000

Page 1 of 2 [1] 2 Next page →

#49945 — How to make this faster

From	Helmut Jarausch <jarausch@igpm.rwth-aachen.de>
Date	2013-07-05 08:22 +0000
Subject	How to make this faster
Message-ID	<b3ne2rFojnkU1@mid.dfncis.de>

Hi,

I have coded a simple algorithm to solve a Sudoku (probably not the first one).
Unfortunately, it takes 13 seconds for a difficult problem which is more than 75 times slower
than the same algorithm coded in C++.
Is this to be expected or could I have made my Python version faster *** without *** sacrificing readability.
Profiling shows that the function find_good_cell is called (only) 45267 times and this take 12.9 seconds
CPU time (on a 3.2 GHz machine)

For your pleasure I include the full code below.

Many thanks for a hint,
Helmut


#!/usr/bin/python3
import numpy as np

Grid= np.zeros((9,9),dtype=int)
Row_Digits = np.asarray(np.zeros((9,10)),dtype=bool)
Col_Digits = np.asarray(np.zeros((9,10)),dtype=bool)
Sqr_Digits = np.asarray(np.zeros((9,10)),dtype=bool)

def Read_Sudoku(Input) :
  r= -1
  R_Cells= 81
  for line in Input :
    line= line.strip()
    if  len(line) == 0  or  line[0] == '#' : continue
    r+= 1
    for (c,ch) in enumerate(line) :
      if  ch == '_' : continue
      if not ch in "123456789_" :
        raise ValueError("invalid character {0} in input line {1}".format(c,line))
      Sq_No= (r//3)*3+c//3
      d= int(ch)
      Grid[r,c]= d
      Row_Digits[r,d]= True
      Col_Digits[c,d]= True
      Sqr_Digits[Sq_No,d]= True
      R_Cells-= 1
  
  return R_Cells

def Print_Grid() :
  Sep = "+---+---+---#---+---+---#---+---+---+"
  SepK= "#####################################"
  print(Sep)
  for r in range(9) :
    print('|',end='')
    for c in range(9) :
      d= Grid[r,c]
      print(" {0} {1}".format( str(d) if d>0 else ' ',
                               '#' if (c+1)%3==0 and c>0 and c<8 else '|' ), end='')
    print("\n{}".format(SepK if (r+1)%3==0 and r>0 and r<8 else Sep))


def find_good_cell() :
  Best= None
  minPoss= 10
  for r in range(9) :
    for c in range(9) :
      if  Grid[r,c] > 0 : continue
      Sq_No= (r//3)*3+c//3
      Possibilities= 0
      for d in range(1,10) :
        if Row_Digits[r,d] or Col_Digits[c,d] or Sqr_Digits[Sq_No,d] : continue
        Possibilities+= 1
      
      if ( Possibilities < minPoss ) :
        minPoss= Possibilities
        Best= (r,c)

  if minPoss == 0 : Best=(-1,-1)
  return Best

def Solve(R_Cells) :
  if  R_Cells == 0 :
    print("\n\n++++++++++ S o l u t i o n ++++++++++\n")
    Print_Grid()
    return True

  r,c= find_good_cell()
  if r < 0 : return False
  Sq_No= (r//3)*3+c//3
  
  for d in range(1,10) :
    if Row_Digits[r,d] or Col_Digits[c,d] or Sqr_Digits[Sq_No,d] : continue
    # put d into Grid
    Grid[r,c]= d
    Row_Digits[r,d]= True
    Col_Digits[c,d]= True
    Sqr_Digits[Sq_No,d]= True
    
    Success= Solve(R_Cells-1)

    # remove d again
    Grid[r,c]= 0
    Row_Digits[r,d]= False
    Col_Digits[c,d]= False
    Sqr_Digits[Sq_No,d]= False
  
    if Success :
      Zuege.append((d,r,c))
      return True

  return False

from io import StringIO

Problem='''
_________
_____3_85
__1_2____
___5_7___
__4___1__
_9_______
5______73
__2_1____
____4___9
'''

Input= StringIO(Problem)
from time import process_time
R_Cells= Read_Sudoku(Input)
Print_Grid()
Zuege=[]
Start= process_time()
# import cProfile
# cProfile.run("Success = Solve(R_Cells)")
Success = Solve(R_Cells)
Stop= process_time()
print("after {} seconds:".format(Stop-Start))
if Success :
  print("\nZuege:")
  n=0
  for Z in reversed(Zuege) :
    print("{0} -> ({1},{2})\t".format(Z[0],Z[1]+1,Z[2]+1),end='')
    n+= 1
    if n%5 == 0 :
      print()

  print()

[toc] | [next] | [standalone]

#49958

From	Fábio Santos <fabiosantosart@gmail.com>
Date	2013-07-05 10:38 +0100
Message-ID	<mailman.4287.1373017118.3114.python-list@python.org>
In reply to	#49945

[Multipart message — attachments visible in raw view] — view raw

On 5 Jul 2013 09:29, "Helmut Jarausch" <jarausch@igpm.rwth-aachen.de> wrote:
>
> Hi,
>
> I have coded a simple algorithm to solve a Sudoku (probably not the first
one).
> Unfortunately, it takes 13 seconds for a difficult problem which is more
than 75 times slower
> than the same algorithm coded in C++.
> Is this to be expected or could I have made my Python version faster ***
without *** sacrificing readability.
> Profiling shows that the function find_good_cell is called (only) 45267
times and this take 12.9 seconds
> CPU time (on a 3.2 GHz machine)

[Skipping to bottleneck]

> def find_good_cell() :

In this function you are accessing global variables a lot of times. Since
accessing globals takes much more time than accessing locals, I advise you
to assign them to local names, and use them.

>   Best= None
>   minPoss= 10
>   for r in range(9) :
>     for c in range(9) :
>       if  Grid[r,c] > 0 : continue
>       Sq_No= (r//3)*3+c//3
>       Possibilities= 0
>       for d in range(1,10) :

On this condition (below) you can try to check which condition is True more
often and put that condition first in order to take advantage of short
circuiting and minimize array access.

>         if Row_Digits[r,d] or Col_Digits[c,d] or Sqr_Digits[Sq_No,d] :
continue
>         Possibilities+= 1
>
>       if ( Possibilities < minPoss ) :
>         minPoss= Possibilities
>         Best= (r,c)
>
>   if minPoss == 0 : Best=(-1,-1)
>   return Best

Well I think that is as far as I go with my knowledge. Someone who knows
numpy arrays' performance traits should be able to help you more.

[toc] | [prev] | [next] | [standalone]

#49959

From	Helmut Jarausch <jarausch@igpm.rwth-aachen.de>
Date	2013-07-05 10:07 +0000
Message-ID	<b3nk7tFojnkU2@mid.dfncis.de>
In reply to	#49945

On Fri, 05 Jul 2013 10:38:35 +0100, Fábio Santos wrote:

> [Skipping to bottleneck]
> 
>> def find_good_cell() :
> 
> In this function you are accessing global variables a lot of times. Since
> accessing globals takes much more time than accessing locals, I advise you
> to assign them to local names, and use them.

I've tried to use local "references" like
G= Grid and using G instead of Grid below and similarly for
Row_Digits, Col_Digits and Sqr_Digits but it had no noticeable effect.

> 
>>   Best= None
>>   minPoss= 10
>>   for r in range(9) :
>>     for c in range(9) :
>>       if  Grid[r,c] > 0 : continue
>>       Sq_No= (r//3)*3+c//3
>>       Possibilities= 0
>>       for d in range(1,10) :
> 
> On this condition (below) you can try to check which condition is True more
> often and put that condition first in order to take advantage of short
> circuiting and minimize array access.
Unfortunately, that's unpredictable.
> 
>>         if Row_Digits[r,d] or Col_Digits[c,d] or Sqr_Digits[Sq_No,d] :
> continue

Many thanks,
Helmut

[toc] | [prev] | [next] | [standalone]

#49960

From	Oscar Benjamin <oscar.j.benjamin@gmail.com>
Date	2013-07-05 11:13 +0100
Message-ID	<mailman.4288.1373019242.3114.python-list@python.org>
In reply to	#49945

On 5 July 2013 09:22, Helmut Jarausch <jarausch@igpm.rwth-aachen.de> wrote:
> Hi,
>
> I have coded a simple algorithm to solve a Sudoku (probably not the first one).
> Unfortunately, it takes 13 seconds for a difficult problem which is more than 75 times slower
> than the same algorithm coded in C++.
> Is this to be expected or could I have made my Python version faster *** without *** sacrificing readability.

It is to be expected that this kind of processing is faster in C than
in straight Python. Where Python can win in these kind of problems is
if it makes it easier to implement a better algorithm but if your code
is just a transliteration from C to Python you should expect it to run
slower. Another way that Python can win is if you can express your
problem in terms of optimised operations from e.g. the numpy library
but you're not doing that here.

Of course that does depend on your Python implementation so you could
try e.g. using PyPy/nuitka/cython etc. to speed up the core
processing.

> Profiling shows that the function find_good_cell is called (only) 45267 times and this take 12.9 seconds
> CPU time (on a 3.2 GHz machine)
[snip]
>
> def find_good_cell() :
>   Best= None
>   minPoss= 10
>   for r in range(9) :
>     for c in range(9) :
>       if  Grid[r,c] > 0 : continue
>       Sq_No= (r//3)*3+c//3
>       Possibilities= 0
>       for d in range(1,10) :
>         if Row_Digits[r,d] or Col_Digits[c,d] or Sqr_Digits[Sq_No,d] : continue
>         Possibilities+= 1
>
>       if ( Possibilities < minPoss ) :
>         minPoss= Possibilities
>         Best= (r,c)
>
>   if minPoss == 0 : Best=(-1,-1)
>   return Best

My one comment is that you're not really making the most out of numpy
arrays. Numpy's ndarrays are efficient when each line of Python code
is triggering a large number of numerical computations performed over
the array. Because of their N-dimensional nature and the fact that
they are in some sense second class citizens in CPython they are often
not as good as lists for this kind of looping and indexing.

I would actually expect this program to run faster with ordinary
Python lists and lists of lists. It means that you need to change e.g.
Grid[r, c] to Grid[r][c] but really I think that the indexing syntax
is all you're getting out of numpy here.

Oscar

[toc] | [prev] | [next] | [standalone]

#49968

From	Helmut Jarausch <jarausch@igpm.rwth-aachen.de>
Date	2013-07-05 10:53 +0000
Message-ID	<b3nmtfFojnkU3@mid.dfncis.de>
In reply to	#49945

On Fri, 05 Jul 2013 11:13:33 +0100, Oscar Benjamin wrote:

> My one comment is that you're not really making the most out of numpy
> arrays. Numpy's ndarrays are efficient when each line of Python code
> is triggering a large number of numerical computations performed over
> the array. Because of their N-dimensional nature and the fact that
> they are in some sense second class citizens in CPython they are often
> not as good as lists for this kind of looping and indexing.
> 
> I would actually expect this program to run faster with ordinary
> Python lists and lists of lists. It means that you need to change e.g.
> Grid[r, c] to Grid[r][c] but really I think that the indexing syntax
> is all you're getting out of numpy here.
> 

Thanks Oscar, that was a big improvement, indeed.
Using lists of lists instead of numpy arrays made the code more than
twice as fast (13 seconds down to 6 seconds)

Since I don't do any numerical stuff with the arrays, Numpy doesn't seem to be
a good choice. I think this is an argument to add real arrays to Python.

I even tried to use dictionaries instead of Numpy arrays. This version is a bit
slower then the lists of lists version (7.2 seconds instead of 6 second) but still
much faster than the Numpy array solution.

Helmut

[toc] | [prev] | [next] | [standalone]

#49975

From	Steven D'Aprano <steve+comp.lang.python@pearwood.info>
Date	2013-07-05 12:02 +0000
Message-ID	<51d6b5cd$0$29999$c3e8da3$5496439d@news.astraweb.com>
In reply to	#49968

On Fri, 05 Jul 2013 10:53:35 +0000, Helmut Jarausch wrote:

> Since I don't do any numerical stuff with the arrays, Numpy doesn't seem
> to be a good choice. I think this is an argument to add real arrays to
> Python.

Guido's time machine strikes again:

import array

By the way, I'm not exactly sure how you go from "I don't do numerical 
calculations on numpy arrays" to "therefore Python should have arrays".

-- 
Steven

[toc] | [prev] | [next] | [standalone]

#49980

From	Fábio Santos <fabiosantosart@gmail.com>
Date	2013-07-05 13:44 +0100
Message-ID	<mailman.4294.1373028307.3114.python-list@python.org>
In reply to	#49968

[Multipart message — attachments visible in raw view] — view raw

On 5 Jul 2013 11:58, "Helmut Jarausch" <jarausch@igpm.rwth-aachen.de> wrote:
>
> On Fri, 05 Jul 2013 11:13:33 +0100, Oscar Benjamin wrote:
>
> > My one comment is that you're not really making the most out of numpy
> > arrays. Numpy's ndarrays are efficient when each line of Python code
> > is triggering a large number of numerical computations performed over
> > the array. Because of their N-dimensional nature and the fact that
> > they are in some sense second class citizens in CPython they are often
> > not as good as lists for this kind of looping and indexing.
> >
> > I would actually expect this program to run faster with ordinary
> > Python lists and lists of lists. It means that you need to change e.g.
> > Grid[r, c] to Grid[r][c] but really I think that the indexing syntax
> > is all you're getting out of numpy here.
> >
>
> Thanks Oscar, that was a big improvement, indeed.
> Using lists of lists instead of numpy arrays made the code more than
> twice as fast (13 seconds down to 6 seconds)
>
> Since I don't do any numerical stuff with the arrays, Numpy doesn't seem
to be
> a good choice. I think this is an argument to add real arrays to Python.
>
> I even tried to use dictionaries instead of Numpy arrays. This version is
a bit
> slower then the lists of lists version (7.2 seconds instead of 6 second)
but still
> much faster than the Numpy array solution.

May I suggest you avoid range and use enumerate(the_array) instead? It
might be faster.

[toc] | [prev] | [next] | [standalone]

#49983

From	Oscar Benjamin <oscar.j.benjamin@gmail.com>
Date	2013-07-05 14:41 +0100
Message-ID	<mailman.4296.1373031707.3114.python-list@python.org>
In reply to	#49968

On 5 July 2013 11:53, Helmut Jarausch <jarausch@igpm.rwth-aachen.de> wrote:
> I even tried to use dictionaries instead of Numpy arrays. This version is a bit
> slower then the lists of lists version (7.2 seconds instead of 6 second) but still
> much faster than the Numpy array solution.

When you switched to dictionaries did you take advantage of the
sparseness by iterating over dictionary keys instead of indices? This
is the kind of thing that I meant when I said that in Python it's
often easier to implement a better algorithm than in C. What I mean is
that if Grid is a dict so that Grid[(r, c)] is the entry at row r and
column c (if it exists) then you can change a loop like:

    for r in range(9):
        for c in range(9):
            if Grid[r, c] > 0: continue
            # do stuff

so that it looks like:

    for r, c in Grid:
        # do stuff

If the grid is sparsely occupied then this could be a significant improvement.

Oscar

[toc] | [prev] | [next] | [standalone]

#49988

From	Helmut Jarausch <jarausch@igpm.rwth-aachen.de>
Date	2013-07-05 14:28 +0000
Message-ID	<b3o3gkFtc39U1@mid.dfncis.de>
In reply to	#49968

On Fri, 05 Jul 2013 14:41:23 +0100, Oscar Benjamin wrote:

> On 5 July 2013 11:53, Helmut Jarausch <jarausch@igpm.rwth-aachen.de> wrote:
>> I even tried to use dictionaries instead of Numpy arrays. This version is a bit
>> slower then the lists of lists version (7.2 seconds instead of 6 second) but still
>> much faster than the Numpy array solution.
> 
> When you switched to dictionaries did you take advantage of the
> sparseness by iterating over dictionary keys instead of indices? This
> is the kind of thing that I meant when I said that in Python it's
> often easier to implement a better algorithm than in C. What I mean is
> that if Grid is a dict so that Grid[(r, c)] is the entry at row r and
> column c (if it exists) then you can change a loop like:
> 
>     for r in range(9):
>         for c in range(9):
>             if Grid[r, c] > 0: continue
>             # do stuff
> 
> so that it looks like:
> 
>     for r, c in Grid:
>         # do stuff
> 
> If the grid is sparsely occupied then this could be a significant improvement.
> 
> 
> Oscar

This gives a big speedup. Now, the time is gone down to 1.73 seconds in comparison to
original 13 seconds or the 7 seconds for the first version above.

Many thanks,
it seems hard to optimize a Python program,
Helmut

[toc] | [prev] | [next] | [standalone]

#49989

From	Oscar Benjamin <oscar.j.benjamin@gmail.com>
Date	2013-07-05 15:45 +0100
Message-ID	<mailman.4299.1373035554.3114.python-list@python.org>
In reply to	#49988

On 5 July 2013 15:28, Helmut Jarausch <jarausch@igpm.rwth-aachen.de> wrote:
> On Fri, 05 Jul 2013 14:41:23 +0100, Oscar Benjamin wrote:
>
>> On 5 July 2013 11:53, Helmut Jarausch <jarausch@igpm.rwth-aachen.de> wrote:
>>> I even tried to use dictionaries instead of Numpy arrays. This version is a bit
>>> slower then the lists of lists version (7.2 seconds instead of 6 second) but still
>>> much faster than the Numpy array solution.
>>
>> When you switched to dictionaries did you take advantage of the
>> sparseness by iterating over dictionary keys instead of indices? This
>> is the kind of thing that I meant when I said that in Python it's
>> often easier to implement a better algorithm than in C. What I mean is
>> that if Grid is a dict so that Grid[(r, c)] is the entry at row r and
>> column c (if it exists) then you can change a loop like:
>>
>>     for r in range(9):
>>         for c in range(9):
>>             if Grid[r, c] > 0: continue
>>             # do stuff
>>
>> so that it looks like:
>>
>>     for r, c in Grid:
>>         # do stuff
>>
>> If the grid is sparsely occupied then this could be a significant improvement.
>>
> This gives a big speedup. Now, the time is gone down to 1.73 seconds in comparison to
> original 13 seconds or the 7 seconds for the first version above.

Presumably then you're now down to the innermost loop as a bottle-neck:

      Possibilities= 0
      for d in range(1,10) :
        if Row_Digits[r,d] or Col_Digits[c,d] or Sqr_Digits[Sq_No,d] : continue
        Possibilities+= 1

If you make it so that e.g. Row_Digits[r] is a set of indices rather
than a list of bools then you can do this with something like

    Possibilities = len(Row_Digits[r] | Col_Digits[c] | Sqr_Digits[Sq_No])

or perhaps

    Possibilities = len(set.union(Row_Digits[r], Col_Digits[c],
Sqr_Digits[Sq_No]))

which I would expect to be a little faster than looping over range
since the loop is then performed under the hood by the builtin
set-type.

> Many thanks,
> it seems hard to optimize a Python program,

It just takes practice. It's a little less obvious in Python than in
low-level languages where the bottlenecks will be and which operations
are faster/slower but optimisation always involves a certain amount of
trial and error anyway.


Oscar

[toc] | [prev] | [next] | [standalone]

#49990

From	Helmut Jarausch <jarausch@igpm.rwth-aachen.de>
Date	2013-07-05 14:48 +0000
Message-ID	<b3o4lhFtc39U2@mid.dfncis.de>
In reply to	#49968

On Fri, 05 Jul 2013 12:02:21 +0000, Steven D'Aprano wrote:

> On Fri, 05 Jul 2013 10:53:35 +0000, Helmut Jarausch wrote:
> 
>> Since I don't do any numerical stuff with the arrays, Numpy doesn't seem
>> to be a good choice. I think this is an argument to add real arrays to
>> Python.
> 
> Guido's time machine strikes again:
> 
> import array
> 
> 
> By the way, I'm not exactly sure how you go from "I don't do numerical 
> calculations on numpy arrays" to "therefore Python should have arrays".

I should have been more clear. I meant multi-dimensional arrays (2D, at least)
Numpy is fine if I do math with matrices (without loops in python).

Given that I don't like to use the old FORTRAN way (when "dynamic" arrays are passed to
functions) of indexing a 2-d array I would need a MACRO or an INLINED function in Python
or something like a META-compiler phase     transforming

def access2d(v,i,j,dim1) :  # doesn't work on the l.h.s.
  return v[i*dim1+j]

access2d(v,i,j,dim1) = 7    # at compile time, please

to

v[i*dim1+j]= 7  # this, by itself, is considered ugly (the FORTRAN way)

Thanks for the discussion,
Helmut

[toc] | [prev] | [next] | [standalone]

#49994

From	Oscar Benjamin <oscar.j.benjamin@gmail.com>
Date	2013-07-05 16:26 +0100
Message-ID	<mailman.4301.1373038497.3114.python-list@python.org>
In reply to	#49990

On 5 July 2013 15:48, Helmut Jarausch <jarausch@igpm.rwth-aachen.de> wrote:
> On Fri, 05 Jul 2013 12:02:21 +0000, Steven D'Aprano wrote:
>
>> On Fri, 05 Jul 2013 10:53:35 +0000, Helmut Jarausch wrote:
>>
>>> Since I don't do any numerical stuff with the arrays, Numpy doesn't seem
>>> to be a good choice. I think this is an argument to add real arrays to
>>> Python.
>>
>> Guido's time machine strikes again:
>>
>> import array
>>
>>
>> By the way, I'm not exactly sure how you go from "I don't do numerical
>> calculations on numpy arrays" to "therefore Python should have arrays".
>
> I should have been more clear. I meant multi-dimensional arrays (2D, at least)
> Numpy is fine if I do math with matrices (without loops in python).
>
> Given that I don't like to use the old FORTRAN way (when "dynamic" arrays are passed to
> functions) of indexing a 2-d array I would need a MACRO or an INLINED function in Python
> or something like a META-compiler phase     transforming
>
> def access2d(v,i,j,dim1) :  # doesn't work on the l.h.s.
>   return v[i*dim1+j]
>
> access2d(v,i,j,dim1) = 7    # at compile time, please
>
> to
>
> v[i*dim1+j]= 7  # this, by itself, is considered ugly (the FORTRAN way)

The list of lists approach works fine for what you're doing. I don't
think that a[r][c] is that much worse than a[r, c]. It's only when you
want to do something like a[:, c] that it breaks down. In any case,
your algorithm would work better with Python's set/dict/list types
than numpy arrays.

One of the reasons that it's faster to use lists than numpy arrays (as
you found out) is precisely because the N-dimensional array logic
complicates 1-dimensional processing. I've seen discussions in Cython
and numpy about lighter-weight 1-dimensional array types for this
reason.

The other reason that numpy arrays are slower for what you're doing is
that (just like the stdlib array type Steven referred to) they use
homogeneous types in a contiguous buffer and each element is not a
Python object in its own right until you access it with e.g. a[0].
That means that the numpy array has to create a new object every time
you index into it whereas the list can simply return a new reference
to an existing object. You can get the same effect with numpy arrays
by using dtype=object but I'd still expect it to be slower for this.

Oscar

[toc] | [prev] | [next] | [standalone]

#49991

From	Helmut Jarausch <jarausch@igpm.rwth-aachen.de>
Date	2013-07-05 14:54 +0000
Message-ID	<b3o512Ftc39U3@mid.dfncis.de>
In reply to	#49968

On Fri, 05 Jul 2013 13:44:57 +0100, Fábio Santos wrote:
May I suggest you avoid range and use enumerate(the_array) instead? It
might be faster.

How does this work?

Given

Grid= [[0 for j in range(9)] for i in range(9)]

for (r,c,val) in ????(Grid) :

Helmut

[toc] | [prev] | [next] | [standalone]

#49993

From	Fábio Santos <fabiosantosart@gmail.com>
Date	2013-07-05 16:18 +0100
Message-ID	<mailman.4300.1373037531.3114.python-list@python.org>
In reply to	#49991

[Multipart message — attachments visible in raw view] — view raw

On 5 Jul 2013 15:59, "Helmut Jarausch" <jarausch@igpm.rwth-aachen.de> wrote:
>
> On Fri, 05 Jul 2013 13:44:57 +0100, Fábio Santos wrote:
> May I suggest you avoid range and use enumerate(the_array) instead? It
> might be faster.
>
> How does this work?
>
> Given
>
> Grid= [[0 for j in range(9)] for i in range(9)]
>
> for (r,c,val) in ????(Grid) :
>
> Helmut

for r, row_lst in enumerate(Grid):
    for c, val in enumerate(row_lst):

[toc] | [prev] | [next] | [standalone]

#49999

From	Steven D'Aprano <steve+comp.lang.python@pearwood.info>
Date	2013-07-05 16:24 +0000
Message-ID	<51d6f332$0$29999$c3e8da3$5496439d@news.astraweb.com>
In reply to	#49991

On Fri, 05 Jul 2013 14:54:26 +0000, Helmut Jarausch wrote:

> On Fri, 05 Jul 2013 13:44:57 +0100, Fábio Santos wrote: May I suggest
> you avoid range and use enumerate(the_array) instead? It might be
> faster.
> 
> How does this work?
> 
> Given
> 
> Grid= [[0 for j in range(9)] for i in range(9)]

This creates a list containing nine lists, each one of which contains 0 
repeated nine times.

That can be simplified to:

grid = [[0]*9 for i in range(9)]

which will likely be faster, although for such a small number of lists 
you may not notice the difference.

> for (r,c,val) in ????(Grid) :

This causes a SyntaxError, as ???? is not legal Python code. I'm not 
really sure what you think you are trying to do here. But I think Fábio 
means that instead of writing code like this:

for r in range(9):
    for c in range(9):
        value = grid[r][c]
        ...

you should consider writing this:

for r, row in enumerate(grid):
    for c, value in enumerate(row):
        ...

-- 
Steven

[toc] | [prev] | [next] | [standalone]

#49992

From	Helmut Jarausch <jarausch@igpm.rwth-aachen.de>
Date	2013-07-05 15:17 +0000
Message-ID	<b3o6bkFtc39U4@mid.dfncis.de>
In reply to	#49968

On Fri, 05 Jul 2013 15:45:25 +0100, Oscar Benjamin wrote:

> Presumably then you're now down to the innermost loop as a bottle-neck:
> 
>       Possibilities= 0
>       for d in range(1,10) :
>         if Row_Digits[r,d] or Col_Digits[c,d] or Sqr_Digits[Sq_No,d] : continue
>         Possibilities+= 1
> 
> If you make it so that e.g. Row_Digits[r] is a set of indices rather
> than a list of bools then you can do this with something like
> 
>     Possibilities = len(Row_Digits[r] | Col_Digits[c] | Sqr_Digits[Sq_No])
> 
> or perhaps
> 
>     Possibilities = len(set.union(Row_Digits[r], Col_Digits[c],
> Sqr_Digits[Sq_No]))
> 
> which I would expect to be a little faster than looping over range
> since the loop is then performed under the hood by the builtin
> set-type.
> 
> It just takes practice. 

indeed

> It's a little less obvious in Python than in
> low-level languages where the bottlenecks will be and which operations
> are faster/slower but optimisation always involves a certain amount of
> trial and error anyway.
> 
> 
> Oscar

I've tried the following version

def find_good_cell() :
  Best= None
  minPoss= 10
  for r,c in Grid :
    if  Grid[(r,c)] > 0 : continue
    Sq_No= (r//3)*3+c//3
    Possibilities= 9-len(Row_Digits[r] | Col_Digits[c] | Sqr_Digits[Sq_No])
    if ( Possibilities < minPoss ) :
      minPoss= Possibilities
      Best= (r,c)

  if minPoss == 0 : Best=(-1,-1)
  return Best

All_digits= set((1,2,3,4,5,6,7,8,9))

def Solve(R_Cells) :
  if  R_Cells == 0 :
    print("\n\n++++++++++ S o l u t i o n ++++++++++\n")
    Print_Grid()
    return True

  r,c= find_good_cell()
  if r < 0 : return False
  Sq_No= (r//3)*3+c//3
  
  for d in All_digits - (Row_Digits[r] | Col_Digits[c] | Sqr_Digits[Sq_No]) :
    # put d into Grid
    Grid[(r,c)]= d
    Row_Digits[r].add(d)
    Col_Digits[c].add(d)
    Sqr_Digits[Sq_No].add(d)
    
    Success= Solve(R_Cells-1)

    # remove d again
    Grid[(r,c)]= 0
    Row_Digits[r].remove(d)
    Col_Digits[c].remove(d)
    Sqr_Digits[Sq_No].remove(d)
  
    if Success :
      Zuege.append((d,r,c))
      return True

  return False

which turns out to be as fast as the previous "dictionary only version".
Probably,  set.remove is a bit slow

Thanks,
Helmut

[toc] | [prev] | [next] | [standalone]

#49996

From	Oscar Benjamin <oscar.j.benjamin@gmail.com>
Date	2013-07-05 16:38 +0100
Message-ID	<mailman.4302.1373038751.3114.python-list@python.org>
In reply to	#49992

On 5 July 2013 16:17, Helmut Jarausch <jarausch@igpm.rwth-aachen.de> wrote:
>
> I've tried the following version
>
> def find_good_cell() :
>   Best= None
>   minPoss= 10
>   for r,c in Grid :
>     if  Grid[(r,c)] > 0 : continue

Sorry, I think what I meant was that you should have a structure
called e.g. Remaining which is the set of all (r, c) pairs that you
want to loop over here. Then there's no need to check on each
iteration whether or not Grid[(r, c)] > 0. When I said "sparse" I
meant that you don't need to set keys in Grid unless you actually have
a value there so the test "Grid[(r, c)] > 0" would look like "(r, c)
in Grid". Remaining is the set of all (r, c) pairs not in Grid that
you update incrementally with .add() and .remove().

Then this

   for r,c in Grid :
     if  Grid[(r,c)] > 0 : continue

becomes

    for r, c in Remaining:

>     Sq_No= (r//3)*3+c//3
>     Possibilities= 9-len(Row_Digits[r] | Col_Digits[c] | Sqr_Digits[Sq_No])
>     if ( Possibilities < minPoss ) :
>       minPoss= Possibilities
>       Best= (r,c)
>
>   if minPoss == 0 : Best=(-1,-1)
>   return Best
>
> All_digits= set((1,2,3,4,5,6,7,8,9))

All_digits= set(range(1, 10))

or

All_digits = {1,2,3,4,5,6,7,8,9}

>
> def Solve(R_Cells) :
>   if  R_Cells == 0 :
>     print("\n\n++++++++++ S o l u t i o n ++++++++++\n")
>     Print_Grid()
>     return True
>
>   r,c= find_good_cell()
>   if r < 0 : return False
>   Sq_No= (r//3)*3+c//3
>
>   for d in All_digits - (Row_Digits[r] | Col_Digits[c] | Sqr_Digits[Sq_No]) :
>     # put d into Grid
>     Grid[(r,c)]= d
>     Row_Digits[r].add(d)
>     Col_Digits[c].add(d)
>     Sqr_Digits[Sq_No].add(d)
>
>     Success= Solve(R_Cells-1)
>
>     # remove d again
>     Grid[(r,c)]= 0
>     Row_Digits[r].remove(d)
>     Col_Digits[c].remove(d)
>     Sqr_Digits[Sq_No].remove(d)
>
>     if Success :
>       Zuege.append((d,r,c))
>       return True
>
>   return False
>
> which turns out to be as fast as the previous "dictionary only version".
> Probably,  set.remove is a bit slow

No it's not and you're not using it in your innermost loops anyway.
Probably the loop I referred to isn't your bottleneck.


Oscar

[toc] | [prev] | [next] | [standalone]

#50000

From	MRAB <python@mrabarnett.plus.com>
Date	2013-07-05 17:25 +0100
Message-ID	<mailman.4303.1373041542.3114.python-list@python.org>
In reply to	#49992

On 05/07/2013 16:17, Helmut Jarausch wrote:
> On Fri, 05 Jul 2013 15:45:25 +0100, Oscar Benjamin wrote:
>
>> Presumably then you're now down to the innermost loop as a bottle-neck:
>>
>>       Possibilities= 0
>>       for d in range(1,10) :
>>         if Row_Digits[r,d] or Col_Digits[c,d] or Sqr_Digits[Sq_No,d] : continue
>>         Possibilities+= 1
>>
>> If you make it so that e.g. Row_Digits[r] is a set of indices rather
>> than a list of bools then you can do this with something like
>>
>>     Possibilities = len(Row_Digits[r] | Col_Digits[c] | Sqr_Digits[Sq_No])
>>
>> or perhaps
>>
>>     Possibilities = len(set.union(Row_Digits[r], Col_Digits[c],
>> Sqr_Digits[Sq_No]))
>>
>> which I would expect to be a little faster than looping over range
>> since the loop is then performed under the hood by the builtin
>> set-type.
>>
>> It just takes practice.
>
> indeed
>
>> It's a little less obvious in Python than in
>> low-level languages where the bottlenecks will be and which operations
>> are faster/slower but optimisation always involves a certain amount of
>> trial and error anyway.
>>
>>
>> Oscar
>
> I've tried the following version
>
> def find_good_cell() :
>    Best= None
>    minPoss= 10
>    for r,c in Grid :
>      if  Grid[(r,c)] > 0 : continue
>      Sq_No= (r//3)*3+c//3
>      Possibilities= 9-len(Row_Digits[r] | Col_Digits[c] | Sqr_Digits[Sq_No])
>      if ( Possibilities < minPoss ) :
>        minPoss= Possibilities
>        Best= (r,c)
>
>    if minPoss == 0 : Best=(-1,-1)
>    return Best
>
> All_digits= set((1,2,3,4,5,6,7,8,9))
>
> def Solve(R_Cells) :
>    if  R_Cells == 0 :
>      print("\n\n++++++++++ S o l u t i o n ++++++++++\n")
>      Print_Grid()
>      return True
>
>    r,c= find_good_cell()
>    if r < 0 : return False
>    Sq_No= (r//3)*3+c//3
>
>    for d in All_digits - (Row_Digits[r] | Col_Digits[c] | Sqr_Digits[Sq_No]) :
>      # put d into Grid
>      Grid[(r,c)]= d
>      Row_Digits[r].add(d)
>      Col_Digits[c].add(d)
>      Sqr_Digits[Sq_No].add(d)
>
>      Success= Solve(R_Cells-1)
>
>      # remove d again
>      Grid[(r,c)]= 0
>      Row_Digits[r].remove(d)
>      Col_Digits[c].remove(d)
>      Sqr_Digits[Sq_No].remove(d)
>
>      if Success :
>        Zuege.append((d,r,c))
>        return True
>
>    return False
>
> which turns out to be as fast as the previous "dictionary only version".
> Probably,  set.remove is a bit slow
>
For comparison, here's my solution:

from collections import Counter

problem = '''
_________
_____3_85
__1_2____
___5_7___
__4___1__
_9_______
5______73
__2_1____
____4___9
'''

# Build the grid.
digits = "123456789"

grid = []

for row in problem.splitlines():
   if not row:
     continue

   new_row = []

   for cell in row:
     if cell.isdigit():
       new_row.append({cell})
     else:
       new_row.append(set(digits))

   grid.append(new_row)

# Solve the grid.
changed = True
while changed:
   changed = False

   # Look for cells that contain only one digit.
   for r in range(9):
     for c in range(9):
       if len(grid[r][c]) == 1:
         digit = list(grid[r][c])[0]

         # Remove from other cells in same row.
         for c2 in range(9):
           if c2 != c and digit in grid[r][c2]:
             grid[r][c2].remove(digit)
             changed = True

         # Remove from other cells in same column.
         for r2 in range(9):
           if r2 != r and digit in grid[r2][c]:
             grid[r2][c].remove(digit)
             changed = True

         # Remove from other cells in the same block of 9.
         start_row = r - r % 3
         start_column = c - c % 3
         for r2 in range(start_row, start_row + 3):
           for c2 in range(start_column, start_column + 3):
             if (r2, c2) != (r, c) and digit in grid[r2][c2]:
               grid[r2][c2].remove(digit)
               changed = True

   # Look for digits that occur in only one cell in a row.
   for r in range(9):
     counts = Counter()
     for c in range(9):
       counts += Counter(grid[r][c])

     unique = {digit for digit, times in counts.items() if times == 1}

     for c in range(9):
       if len(grid[r][c]) > 1 and len(grid[r][c] & unique) == 1:
         grid[r][c] &= unique
         changed = True

   # Look for digits that occur in only one cell in a column.
   for c in range(9):
     counts = Counter()
     for r in range(9):
       counts += Counter(grid[r][c])

     unique = {digit for digit, times in counts.items() if times == 1}

     for r in range(9):
       if len(grid[r][c]) > 1 and len(grid[r][c] & unique) == 1:
         grid[r][c] &= unique
         changed = True

   # Look for digits that occur in only one cell in a block of 9.
   for start_row in range(0, 9, 3):
     for start_column in range(0, 9, 3):
       counts = Counter()
       for r in range(start_row, start_row + 3):
         for c in range(start_column, start_column + 3):
           counts += Counter(grid[r][c])

       unique = {digit for digit, times in counts.items() if times == 1}

       for r in range(start_row, start_row + 3):
         for c in range(start_column, start_column + 3):
           if len(grid[r][c]) > 1 and len(grid[r][c] & unique) == 1:
             grid[r][c] &= unique
             changed = True

# Display the solution.
for row in grid:
   for cell in row:
     print("".join(sorted(cell)), end=" ")

   print()

[toc] | [prev] | [next] | [standalone]

#50037

From	Joshua Landau <joshua.landau.ws@gmail.com>
Date	2013-07-06 02:45 +0100
Message-ID	<mailman.4321.1373075195.3114.python-list@python.org>
In reply to	#49992

On 5 July 2013 17:25, MRAB <python@mrabarnett.plus.com> wrote:
> For comparison, here's my solution:
<CODE>

Unfortunately, there are some sudokus that require guessing - your
algorithm will not solve those. A combination algorithm would be best,
AFAIK.

-----

FWIW, this is my interpretation of the original algorithm:

from collections import defaultdict
from io import StringIO

# For printing
DONE = defaultdict(lambda: " ")

# How many different numbers can go in each slot, indexes are of the
form: [position]
NUM_POSSIBILITIES = defaultdict(lambda: 9)

# How many times each value has been removed from the possibilities,
# indexes are of the form: [value, position]
TIMES_REMOVED = defaultdict(lambda: 0)

# Set of empty spaces in the grid
REMAINING = {(i,j) for i in range(9) for j in range(9)}

# Cache, would normally be done inline
PLACES_TO_REMOVE = {}
for row in range(9):
    for column in range(9):
        square_top, square_left = 3*(row // 3), 3*(column // 3)

        # Across the row
        PLACES_TO_REMOVE[row, column]  = {(p_row, column) for p_row in range(9)}

        # Across the collumn
        PLACES_TO_REMOVE[row, column] |= {(row, p_collumn) for
p_collumn in range(9)}

        # Inside the square
        PLACES_TO_REMOVE[row, column] |= {
            (p_row, p_collumn)
            for p_row     in range(square_top,  3+square_top)
            for p_collumn in range(square_left, 3+square_left)
        }

def add_at(value, position):
    """Add value to the position, updating needed globals"""

    DONE[position] = value

    # Use "REMAINING &" so that we only change places we care about
    # It's reversed in a consistent order to allow this
    for add_pos in REMAINING & PLACES_TO_REMOVE[position]:

        # Remove possibility if it hasn't already been removed
        if not TIMES_REMOVED[value, add_pos]:
            NUM_POSSIBILITIES[add_pos] -= 1

        TIMES_REMOVED[value, add_pos] += 1

    REMAINING.remove(position)


def remove_at(value, position):
    """The inverse of add_at"""

    REMAINING.add(position)

    # Use "REMAINING &" so that we only change places we care about
    # It's reversed in a consistent order to allow this
    for remove_pos in REMAINING & PLACES_TO_REMOVE[position]:
        TIMES_REMOVED[value, remove_pos] -= 1

        # Add posibility back if TIMES_REMOVED falls to 0
        if not TIMES_REMOVED[value, remove_pos]:
            NUM_POSSIBILITIES[remove_pos] += 1

    del DONE[position]


def read_sudoku(input):
    number_remaining = 81

    # StringIO iterates over lines
    lines = (line.strip() for line in StringIO(input))
    lines = (line for line in lines if line and not line.startswith("#"))

    for row, line in enumerate(lines):
        line = line.strip().replace(" ", "").replace("\t", "")

        for column, character in enumerate(line):
            if character == "_":
                pass

            elif character.isdigit():
                add_at(int(character), (row, column))
                number_remaining -= 1

            else:
                raise ValueError("Invalid character {} in input line
{}".format(character, line))

    return number_remaining


def print_grid(highlight=None):
    row_seperator     = "¦   ·   ·   ¦   ·   ·   ¦   ·   ·   ¦"
    special_seperator = "·-----------·-----------·-----------·"
    seperators = "  ¦  ¦  ¦"

    row_seperators = [special_seperator, row_seperator, row_seperator] * 3

    for row, row_seperator in enumerate(row_seperators):
        print(row_seperator)

        print("¦", end=" ")

        for column, seperator in enumerate(seperators):
            if (row, column) == highlight:
                # These are colors, ignore 'em if you wish
                # This is to let you see changes when debugging
                print("\x1b[31;01m{}\x1b[39;49;00m".format(DONE[row,
column]), seperator, end=" ")

            else:
                print(DONE[row, column], seperator, end=" ")

        print()

    print(special_seperator)
    print()


def solve(number_remaining):
    if not number_remaining:
        return True

    # We need to guess -- go for the one where we have the least choice
    position = min(REMAINING, key=NUM_POSSIBILITIES.__getitem__)

    for value in range(1, 10):
        if not TIMES_REMOVED[value, position]:
            add_at(value, position)

            if solve(number_remaining-1):
                return True

            remove_at(value, position)

    return False



problem = """
    _ _ _  _ _ _  _ _ _
    _ _ _  _ _ 3  _ 8 5
    _ _ 1  _ 2 _  _ _ _

    _ _ _  5 _ 7  _ _ _
    _ _ 4  _ _ _  1 _ _
    _ 9 _  _ _ _  _ _ _

    5 _ _  _ _ _  _ 7 3
    _ _ 2  _ 1 _  _ _ _
    _ _ _  _ 4 _  _ _ 9
"""

solve(read_sudoku(problem))
print_grid()

[toc] | [prev] | [next] | [standalone]

#49997

From	Helmut Jarausch <jarausch@igpm.rwth-aachen.de>
Date	2013-07-05 15:47 +0000
Message-ID	<b3o851Fu5lhU1@mid.dfncis.de>
In reply to	#49968

On Fri, 05 Jul 2013 16:18:41 +0100, Fábio Santos wrote:

> On 5 Jul 2013 15:59, "Helmut Jarausch" <jarausch@igpm.rwth-aachen.de> wrote:
>>
>> On Fri, 05 Jul 2013 13:44:57 +0100, Fábio Santos wrote:
>> May I suggest you avoid range and use enumerate(the_array) instead? It
>> might be faster.
>>
>> How does this work?
>>
>> Given
>>
>> Grid= [[0 for j in range(9)] for i in range(9)]
>>
>> for (r,c,val) in ????(Grid) :
>>
>> Helmut
> 
> for r, row_lst in enumerate(Grid):
>     for c, val in enumerate(row_lst):

This is only slightly faster. I assume the creation of the temporary lists "row_list"
is a bit expensive.
Taking 5.4 seconds it's much slower than the current champion ( 0.79 seconds )

Thanks,
Helmut.

[toc] | [prev] | [next] | [standalone]

Page 1 of 2 [1] 2 Next page →

csiph-web

How to make this faster

Contents

#49945 — How to make this faster

#49958

#49959

#49960

#49968

#49975

#49980

#49983

#49988

#49989

#49990

#49994

#49991

#49993

#49999

#49992

#49996

#50000

#50037

#49997