Path: csiph.com!v102.xanadu-bbs.net!xanadu-bbs.net!feeder.erje.net!eu.feeder.erje.net!newsfeed.xs4all.nl!newsfeed3a.news.xs4all.nl!xs4all!post.news.xs4all.nl!not-for-mail Return-Path: X-Original-To: python-list@python.org Delivered-To: python-list@mail.python.org X-Spam-Status: OK 0.001 X-Spam-Evidence: '*H*': 1.00; '*S*': 0.00; 'resulting': 0.04; 'encoding': 0.05; 'table.': 0.07; 'compact': 0.09; 'lookup': 0.09; 'received:209.85.219': 0.09; 'sub': 0.09; 'subject:string': 0.09; 'way:': 0.09; 'cc:addr:python-list': 0.11; 'jan': 0.12; '24,': 0.16; 'algorithm.': 0.16; 'integer,': 0.16; 'overlaps': 0.16; 'sequence.': 0.16; 'index': 0.16; 'wrote:': 0.18; '(the': 0.22; 'cc:addr:python.org': 0.22; 'integer': 0.24; 'simpler': 0.24; '(or': 0.24; 'cc:2**0': 0.24; 'header:In-Reply-To:1': 0.27; 'am,': 0.29; 'message-id:@mail.gmail.com': 0.30; 'quite': 0.32; 'fri,': 0.33; 'received:209.85': 0.35; 'convert': 0.35; 'subject:lists': 0.35; 'received:google.com': 0.35; 'i.e.': 0.36; 'sequence': 0.36; 'received:209': 0.37; 'easily': 0.37; 'mapping': 0.38; 'length': 0.61; 'times': 0.62; 'kind': 0.63; 'needing': 0.65; 'between': 0.67; 'observed': 0.84; 'recover': 0.91 X-Google-DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=1e100.net; s=20130820; h=x-gm-message-state:mime-version:in-reply-to:references:from:date :message-id:subject:to:cc:content-type; bh=KjRXJ5Uo8C/SXBzXjnDcH7ev7fGCy2YR+Oep6IIoYjY=; b=b3h1Z27yr3OmUvL/YgsHmy8ZC/566fB3nrbHLuoR+jxHWp0t0xU3w3T444T0+jByZP 5RTrDT5kxwzaQlrxeFDnUYgQolVu2ljKEaZgozzt2EaBrxqeYQmcWMNvHnL5wUy2XMMY gP4QAdRXDWUDlFlhKhg9fSrQ/i7gWxCoJw9PpGLlf650Saxd7AiWyRBqw/T6PvbwYcLZ oI5m/7IghNH1WMdUeQYTo2sYufxK5KhsZO/LnECGiOlS9AOCKTYwrgQ2I5hr67RDb5Ww pPWgpgvzP9Q9CU85q38BG4l/WNpC8+2HwvMVvN+FeI+eqU8AcwWZJUCH+UjoQGeETyiS jN0g== X-Gm-Message-State: ALoCoQlEzqPRWhgsIWzj26hzrcgEBOuUtLI5wYUJmdmmAd/jmXLQqISC38znh1f2vaNaLVYDHbOS X-Received: by 10.182.24.69 with SMTP id s5mr12213264obf.35.1390569669982; Fri, 24 Jan 2014 05:21:09 -0800 (PST) MIME-Version: 1.0 In-Reply-To: References: From: Vincent Davis Date: Fri, 24 Jan 2014 07:20:49 -0600 Subject: Re: generate De Bruijn sequence memory and string vs lists To: Gregory Ewing Content-Type: multipart/alternative; boundary=001a11c2a20c370fef04f0b73b26 Cc: "python-list@python.org" X-BeenThere: python-list@python.org X-Mailman-Version: 2.1.15 Precedence: list List-Id: General discussion list for the Python programming language List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , Newsgroups: comp.lang.python Message-ID: Lines: 69 NNTP-Posting-Host: 2001:888:2000:d::a6 X-Trace: 1390569680 news.xs4all.nl 2878 [2001:888:2000:d::a6]:45711 X-Complaints-To: abuse@xs4all.nl Xref: csiph.com comp.lang.python:64680 --001a11c2a20c370fef04f0b73b26 Content-Type: text/plain; charset=UTF-8 Content-Transfer-Encoding: quoted-printable On Fri, Jan 24, 2014 at 2:29 AM, Gregory Ewing wrote: > If all you want is a mapping between a sequence of > length n and compact representation of it, there's > a much simpler way: just convert it to a base-k > integer, where k is the size of the alphabet. > > The resulting integer won't be any larger than an > index into the de Bruijn sequence would be, and > you can easily recover the original sequence from > its encoding without needing any kind of lookup > table. > =E2=80=8BTrue, =E2=80=8Bthe "all you want is a mapping" is not quite true. = I actually plan to plot frequency (the number of times an observed sub sequence overlaps a value in the De Bruijn sequence) The way the sub sequences overlap is important to me and I don't see a way go from base-k (or any other base) to the index location in the De Bruijn sequence. i.e. a decoding algorithm. Vincent Davis 720-301-3003 --001a11c2a20c370fef04f0b73b26 Content-Type: text/html; charset=UTF-8 Content-Transfer-Encoding: quoted-printable

= On Fri, Jan 24, 2014 at 2:29 AM, Gregory Ewing <greg.ewing@cante= rbury.ac.nz> wrote:
If all you want= is a mapping between a sequence of
length n and compact representation of it, there's
a much simpler way: just convert it to a base-k
integer, where k is the size of the alphabet.

The resulting integer won't be any larger than an
index into the de Bruijn sequence would be, and
you can easily recover the original sequence from
its encoding without needing any kind of lookup
table.

=E2=80=8BTrue, =E2=80=8Bthe "all you w= ant is a mapping" is not quite true. I=C2=A0actually plan to plot freq= uency (the number of times an observed sub sequence overlaps a value in the= =C2=A0De Bruijn sequence) The way the sub sequences overlap is important to= me and I don't see a way go from base-k (or any other base) to the ind= ex location in the De Bruijn sequence. i.e. a=C2=A0decoding algorithm.


Vincent = Davis
720-301-3003
--001a11c2a20c370fef04f0b73b26--