
generate De Bruijn sequence memory and string vs lists

From Vincent Davis <vincent@vincentdavis.net>
Date Thu, 23 Jan 2014 08:23:19 -0600
Subject generate De Bruijn sequence memory and string vs lists
To "python-list@python.org" <python-list@python.org>
Newsgroups comp.lang.python


For reference, Wikipedia entry for De Bruijn sequence
http://en.wikipedia.org/wiki/De_Bruijn_sequence

At the above link is a Python algorithm for generating De Bruijn sequences.
It works fine but outputs a list of integers [0, 0, 0, 1, 0, 1, 1, 1], and I
would prefer a string '00010111'. This can be accomplished by changing the
last line from
return sequence
to
return ''.join([str(i) for i in sequence])
See de_bruijn_1 below.
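
The conversion itself can be tried in isolation. This is only a small sketch using the sample list from above; the variable names are illustrative. Note that joining digits this way is unambiguous only while every symbol is a single character, i.e. k <= 10.

```python
# Sample output of the list-based algorithm for k=2, n=3 (taken from the text above).
sequence = [0, 0, 0, 1, 0, 1, 1, 1]

# Convert each integer symbol to its decimal digit and concatenate once.
as_string = ''.join(map(str, sequence))

print(as_string)
```

A single join at the end builds the string in one pass, instead of creating a new string for every appended symbol.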

The other option would be to manipulate strings directly (kind of).
I butchered the original algorithm to do this; see de_bruijn_2 below. But
it is much slower and ugly.

I want to generate a few large De Bruijn sequences, hopefully on the
order of de_bruijn(4, 50) to de_bruijn(4, 100) (wishful thinking?). I don't
know the limits (memory or time) of the current algorithms. I think I
will hit the memory maxsize limit at about 4^31. The system I will be using
has 64GB RAM.
The length of a De Bruijn sequence is k^n.
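
Since the length is k^n, a rough feasibility check is plain arithmetic. The sketch below assumes the best case of one byte per symbol, treats 64GB as 64 * 2**30 bytes, and ignores any working memory; the helper name largest_order_that_fits is made up for illustration.

```python
def largest_order_that_fits(k, budget_bytes, bytes_per_symbol=1):
    """Largest n such that a sequence of k**n symbols fits in budget_bytes."""
    n = 0
    while k ** (n + 1) * bytes_per_symbol <= budget_bytes:
        n += 1
    return n

ram_bytes = 64 * 2**30                        # assumption: 64GB read as 64 GiB
max_n = largest_order_that_fits(4, ram_bytes)  # output alone, nothing else in RAM
length_n50 = 4 ** 50                           # symbols needed for de_bruijn(4, 50)

print(max_n, length_n50)
```

Under these assumptions the output string alone fills the RAM at n = 18 for k = 4, and 4^50 is about 1.27e30 symbols, so the orders mentioned above would have to be streamed or sampled rather than held in memory.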

My questions:
1. de_bruijn_2 is ugly, any suggestions to do it better?
2. de_bruijn_2 is significantly slower than de_bruijn_1. Speedups?
3. Any thoughts on which is more memory efficient during computation?

#### 1 ####
def de_bruijn_1(k, n):
    """
    De Bruijn sequence for alphabet size k (0,1,2...k-1)
    and subsequences of length n.
    From wikipedia Sep 22 2013
    """
    a = [0] * k * n
    sequence = []
    def db(t, p):
        if t > n:
            if n % p == 0:
                for j in range(1, p + 1):
                    sequence.append(a[j])
        else:
            a[t] = a[t - p]
            db(t + 1, p)
            for j in range(int(a[t - p]) + 1, k):
                a[t] = j
                db(t + 1, t)
    db(1, 1)
    #return sequence  #original
    return ''.join([str(i) for i in sequence])

d1 = de_bruijn_1(4, 8)
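
One possible variation on the list version, offered as a sketch rather than a measured answer: keep the same recursion but collect the output in a bytearray, which stores one byte per symbol, whereas a list stores one pointer per element (8 bytes on typical 64-bit CPython builds). The name de_bruijn_bytes and the k <= 10 restriction are assumptions of this sketch.

```python
def de_bruijn_bytes(k, n):
    """Same recursion as de_bruijn_1, collecting ASCII digits in a bytearray."""
    if not 1 <= k <= 10:
        raise ValueError("single-character digits require 1 <= k <= 10")
    a = [0] * (n + 1)   # only indices 0..n are ever used
    out = bytearray()   # one byte per output symbol

    def db(t, p):
        if t > n:
            if n % p == 0:
                for j in range(1, p + 1):
                    out.append(48 + a[j])  # 48 == ord('0')
        else:
            a[t] = a[t - p]
            db(t + 1, p)
            for j in range(a[t - p] + 1, k):
                a[t] = j
                db(t + 1, t)

    db(1, 1)
    return out.decode("ascii")

d_bytes = de_bruijn_bytes(2, 3)
```

The final decode makes one copy of the data; returning the bytearray itself would avoid that copy if a str is not strictly needed.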

#### 2 ####
def de_bruijn_2(k, n):
    global sequence, a  # db() declares both names global, so both must be bound at module level
    a = '0' * k * n
    sequence = ''
    def db(t, p):
        global sequence
        global a
        if t > n:
            if n % p == 0:
                for j in range(1, p + 1):
                    sequence = sequence + a[j]
        else:
            a = a[:t] + a[t - p]  + a[t+1:]
            db(t + 1, p)
            for j in range(int(a[t - p]) + 1, k):
                a = a[:t] + str(j)  + a[t+1:]
                db(t + 1, t)
        return sequence
    db(1, 1)
    return sequence

d2 = de_bruijn_2(4, 8)
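
When comparing variants, it helps to confirm that each result really is a De Bruijn sequence. The checker below is a minimal sketch: the name is_de_bruijn is hypothetical, it assumes digit symbols with k <= 10, and it reads the string cyclically.

```python
def is_de_bruijn(seq, k, n):
    """True if every length-n word over the digits 0..k-1 occurs exactly once
    in seq when seq is read cyclically."""
    if len(seq) != k ** n:
        return False
    if not set(seq) <= set("0123456789"[:k]):
        return False
    wrapped = seq + seq[:n - 1]  # let windows wrap around the end
    windows = {wrapped[i:i + n] for i in range(len(seq))}
    # k**n windows, all distinct, over a k-letter alphabet: each word once.
    return len(windows) == k ** n

ok = is_de_bruijn("00010111", 2, 3)
bad = is_de_bruijn("00010110", 2, 3)
```

The same call can be applied to d1 and d2 above, for example is_de_bruijn(d1, 4, 8). The set of windows needs memory proportional to k^n, so this is meant for small test orders only.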


Vincent Davis
