| From | Martin Hansen <mail@maasha.dk> |
|---|---|
| Newsgroups | comp.lang.ruby |
| Subject | Re: Need for speed -> a C extension? |
| Date | 2011-04-18 13:30 -0500 |
| Organization | Service de news de lacave.net |
| Message-ID | <4638db940d54573d9643ab0a369c8c7e@ruby-forum.com> (permalink) |
| References | <c094098c0ea21c2b9618d1b8d7a4b176@ruby-forum.com> <iohsms02e2q@enews2.newsguy.com> |
WJ wrote in post #993576:
> Martin Hansen wrote:
>
>> The below code is too slow for practical use. I need it to run at least
>> 20 times faster. Perhaps that is possible with some C code? I have no
>> experience with writing Ruby extensions. What are the pitfalls? Which
>> part of the code should be ported? Any pointers to get me started?
>
> Please give a clear description of the algorithm, and then
> give some sample input and output.

Here is a working version of the code that can be profiled (though it
will take forever with 20M iterations):

http://pastie.org/1808127

The slow part according to the profiler is:

      %   cumulative     self               self     total
     time    seconds  seconds    calls   ms/call  ms/call  name
    29.39       2.66     2.66     1521      1.75    11.78  Range#each
    15.80       4.09     1.43    33000      0.04     0.06  Seq#match?
    10.72       5.06     0.97    78940      0.01     0.03  Kernel.dup
     9.28       5.90     0.84    78940      0.01     0.01  Kernel.initialize_dup
     6.63       6.50     0.60   142380      0.00     0.00  Seq::Score#edit_distance
     5.30       6.98     0.48    22220      0.02     0.03  Seq#deletion?
     3.54       7.30     0.32    66016      0.00     0.00  String#ord
     3.43       7.61     0.31    14680      0.02     0.04  Seq#mismatch?
     3.31       7.91     0.30     8300      0.04     0.05  Seq#insertion?

The input is DNA sequences: basically strings of A, T, C, G and N, of
length 50-100. These come in files with 20M-30M sequences per file. I've
got ~50 of these files and more incoming. The output will be truncated
sequences based on the match position located with this bit of code.

The algorithm is of the dynamic programming flavor and was inspired by
the paper by Bruno Woltzenlogel Paleo (page 197):

http://www.logic.at/people/bruno/Papers/2007-GATE-ESSLLI.pdf

Locating variable length matches is tricky!

Cheers,

Martin

--
Posted via http://www.ruby-forum.com/.
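For readers without access to the pastie link, the kind of computation described above can be sketched in plain Ruby. This is not the code from the post, nor necessarily the method in the cited paper: it is a textbook dynamic-programming scan for approximate substring matching, and the name `approx_find`, its arguments, and its return value are all assumptions made for illustration. `N` is treated as an ordinary character here; whether it should act as a wildcard is not stated in the post.

```ruby
# Hypothetical helper (not taken from the original code): find the best
# approximate occurrence of +pattern+ inside +text+, allowing at most
# +max_dist+ edits (mismatches, insertions, deletions).
# Returns [end_index, edit_distance], or nil if nothing is within max_dist.
def approx_find(text, pattern, max_dist)
  m = pattern.length
  # prev[i] = fewest edits needed to align pattern[0, i] with some
  # substring of text ending just before the current text position.
  prev = (0..m).to_a
  best = nil

  text.each_char.with_index do |tc, j|
    # A match may start at any text position, so the empty pattern
    # prefix always costs 0.
    curr = [0]
    pattern.each_char.with_index do |pc, i|
      cost = pc == tc ? 0 : 1
      curr << [prev[i] + cost,   # match or mismatch
               prev[i + 1] + 1,  # extra character in text
               curr[i] + 1].min  # pattern character missing from text
    end
    # Keep the first position that reaches the lowest distance seen so far.
    if curr[m] <= max_dist && (best.nil? || curr[m] < best[1])
      best = [j, curr[m]]
    end
    prev = curr
  end

  best
end

p approx_find("TTTTACGTACGTTTT", "ACGTACG", 1)  # => [10, 0]
```

Only two arrays of length `pattern.length + 1` are kept per sequence, so the cost is proportional to text length times pattern length. A loop of this shape is also a plausible candidate for porting to C, since it works on characters and small integers only.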