Groups > comp.lang.ruby > #3105 > unrolled thread

Need for speed -> a C extension?

Started by: Martin Hansen <mail@maasha.dk>
First post: 2011-04-18 10:15 -0500
Last post: 2011-05-15 13:46 +0200
Articles 8 on this page of 28 — 9 participants



Contents

  Need for speed -> a C extension? Martin Hansen <mail@maasha.dk> - 2011-04-18 10:15 -0500
    Re: Need for speed -> a C extension? Chuck Remes <cremes.devlist@mac.com> - 2011-04-18 11:10 -0500
    Re: Need for speed -> a C extension? Robert Klemme <shortcutter@googlemail.com> - 2011-04-18 11:10 -0500
    Re: Need for speed -> a C extension? "WJ" <w_a_x_man@yahoo.com> - 2011-04-18 17:34 +0000
      Re: Need for speed -> a C extension? Martin Hansen <mail@maasha.dk> - 2011-04-18 13:30 -0500
        Re: Need for speed -> a C extension? Ryan Davis <ryand-ruby@zenspider.com> - 2011-04-18 14:15 -0500
          Re: Need for speed -> a C extension? Martin Hansen <mail@maasha.dk> - 2011-04-19 05:30 -0500
            Re: Need for speed -> a C extension? Robert Klemme <shortcutter@googlemail.com> - 2011-04-19 07:21 -0500
              Re: Need for speed -> a C extension? Martin Hansen <mail@maasha.dk> - 2011-04-19 08:13 -0500
                Re: Need for speed -> a C extension? Robert Klemme <shortcutter@googlemail.com> - 2011-04-19 09:56 -0500
                Re: Need for speed -> a C extension? Robert Klemme <shortcutter@googlemail.com> - 2011-04-19 10:19 -0500
            Re: Need for speed -> a C extension? brabuhr@gmail.com - 2011-04-19 08:35 -0500
              Re: Need for speed -> a C extension? Martin Hansen <mail@maasha.dk> - 2011-04-19 09:12 -0500
                Re: Need for speed -> a C extension? brabuhr@gmail.com - 2011-04-19 13:51 -0500
                Re: Need for speed -> a C extension? brabuhr@gmail.com - 2011-04-19 18:13 -0500
                  Re: Need for speed -> a C extension? Martin Hansen <mail@maasha.dk> - 2011-04-20 02:04 -0500
                    Re: Need for speed -> a C extension? brabuhr@gmail.com - 2011-04-20 07:33 -0500
                      Re: Need for speed -> a C extension? brabuhr@gmail.com - 2011-04-20 07:40 -0500
                        Re: Need for speed -> a C extension? Martin Hansen <mail@maasha.dk> - 2011-04-20 07:55 -0500
                          Re: Need for speed -> a C extension? brabuhr@gmail.com - 2011-04-20 08:42 -0500
                            Re: Need for speed -> a C extension? Martin Hansen <mail@maasha.dk> - 2011-04-20 10:18 -0500
                              Re: Need for speed -> a C extension? Phillip Gawlowski <cmdjackryan@googlemail.com> - 2011-04-20 10:24 -0500
                                Re: Need for speed -> a C extension? Eric Christopherson <echristopherson@gmail.com> - 2011-04-20 17:08 -0500
                              Re: Need for speed -> a C extension? brabuhr@gmail.com - 2011-04-20 10:34 -0500
                                Re: Need for speed -> a C extension? brabuhr@gmail.com - 2011-04-20 10:39 -0500
                Re: Need for speed -> a C extension? Colin Bartlett <colinb2r@googlemail.com> - 2011-04-20 22:39 -0500
    Re: Need for speed -> a C extension? Martin Hansen <mail@maasha.dk> - 2011-05-15 04:16 -0500
      Re: Need for speed -> a C extension? Robert Klemme <shortcutter@googlemail.com> - 2011-05-15 13:46 +0200



#3245

From: Martin Hansen <mail@maasha.dk>
Date: 2011-04-20 10:18 -0500
Message-ID: <4e0068850b4e890703a7f787e157759e@ruby-forum.com>
In reply to: #3242

> Okay, so the issue is in your ruby not RubyInline.
>
> I updated my machine to 10.6.7 (uname: 10.7.0 :-) and my previously
> compiled 1.9.2 still works correctly.  Can you try to install a fresh
> build in another prefix (by hand, RVM, or homebrew) and see if the
> same behavior is still present?
>
> Thanks.

I reverted to ruby 1.9.1p129

gem install RubyInline
ERROR:  While executing gem ... (ArgumentError)
    undefined class/module Digest::Base

This happens on both Mac and Linux

Darwin mel.local 10.7.0 Darwin Kernel Version 10.7.0: Sat Jan 29 
15:17:16 PST 2011; root:xnu-1504.9.37~1/RELEASE_I386 i386 i386

Linux bixeonws 2.6.32-30-server #59-Ubuntu SMP Tue Mar 1 22:46:09 UTC 
2011 x86_64 GNU/Linux

What is this Digest::Base anyway? Can we have it shot and killed?


Martin

-- 
Posted via http://www.ruby-forum.com/.



#3247

From: Phillip Gawlowski <cmdjackryan@googlemail.com>
Date: 2011-04-20 10:24 -0500
Message-ID: <BANLkTi=4zdsUZ1N=iqucuJN5TaSwgnhnog@mail.gmail.com>
In reply to: #3245

On Wed, Apr 20, 2011 at 5:18 PM, Martin Hansen <mail@maasha.dk> wrote:
>
> What is this Digest::Base anyway? Can we have it shot and killed?

http://www.ruby-doc.org/stdlib/libdoc/digest/rdoc/classes/Digest/Base.html

If you still have the log for your Ruby compile lying around (and the
config file ./configure has generated), you could grep for OpenSSL
errors (that's what Ruby uses for SSL and, I think, crypto in
general).

-- 
Phillip Gawlowski

Though the folk I have met,
(Ah, how soon!) they forget
When I've moved on to some other place,
There may be one or two,
When I've played and passed through,
Who'll remember my song or my face.



#3272

From: Eric Christopherson <echristopherson@gmail.com>
Date: 2011-04-20 17:08 -0500
Message-ID: <BANLkTikuVSiuO7WpbS50E1r=qT1f=YmHEQ@mail.gmail.com>
In reply to: #3247

On Wed, Apr 20, 2011 at 10:24 AM, Phillip Gawlowski
<cmdjackryan@googlemail.com> wrote:
> On Wed, Apr 20, 2011 at 5:18 PM, Martin Hansen <mail@maasha.dk> wrote:
>>
>> What is this Digest::Base anyway? Can we have it shot and killed?
>
> http://www.ruby-doc.org/stdlib/libdoc/digest/rdoc/classes/Digest/Base.html
>
> If you still have the log for your Ruby compile lying around (and the
> config file ./configure has generated), you could grep for OpenSSL
> errors (that's what Ruby uses for SSL and, I think, crypto in
> general).

I just remembered something I heard on IRC the other night: Ruby 1.9
sometimes has a problem with MacPorts's OpenSSL. I don't know the
details, but that might be worth considering if Martin is using
MacPorts.

MacPorts ticket: <https://trac.macports.org/ticket/28582>



#3249

From: brabuhr@gmail.com
Date: 2011-04-20 10:34 -0500
Message-ID: <BANLkTindb9w1DtTRHHiwrkhQ5b0obUGx3g@mail.gmail.com>
In reply to: #3245

Martin Hansen <mail@maasha.dk> wrote:
> I reverted to ruby 1.9.1p129
>
> gem install RubyInline
> ERROR:  While executing gem ... (ArgumentError)
>    undefined class/module Digest::Base
>
> This happens on both Mac and Linux
>
> Darwin mel.local 10.7.0 Darwin Kernel Version 10.7.0: Sat Jan 29
> 15:17:16 PST 2011; root:xnu-1504.9.37~1/RELEASE_I386 i386 i386
>
> Linux bixeonws 2.6.32-30-server #59-Ubuntu SMP Tue Mar 1 22:46:09 UTC
> 2011 x86_64 GNU/Linux
>
> What is this Digest::Base anyway?

"This abstract class provides a common interface to message digest
implementation classes written in C."  Maybe both machines are missing
something so it didn't compile properly?

> Can we have it shot and killed?

I don't think so :-)
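
A quick way to test the "didn't compile properly" theory from inside a
given Ruby is to ask the interpreter itself. The following is only a
sketch: the method name digest_extension_status is made up for
illustration, and it checks nothing beyond whether the digest extension
loads and defines Digest::Base.

```ruby
# Report whether the digest C extension loads in the running interpreter.
# A failed `require` or a missing Digest::Base would suggest the extension
# was not built (or not installed) when this Ruby was compiled.
# NOTE: digest_extension_status is a hypothetical helper name.
def digest_extension_status
  require 'digest'
  Digest::Base.is_a?(Class) ? :ok : :base_missing
rescue LoadError, NameError => e
  "digest unavailable: #{e.class}: #{e.message}"
end

puts digest_extension_status
```

Running this with each installed Ruby (the self-compiled one, an RVM
build, and so on) would show which builds lack the extension.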



#3250

From: brabuhr@gmail.com
Date: 2011-04-20 10:39 -0500
Message-ID: <BANLkTi=PsxOSGmoYdb77U4OsLiXs14t9eQ@mail.gmail.com>
In reply to: #3249

On Wed, Apr 20, 2011 at 11:34 AM,  <brabuhr@gmail.com> wrote:
>> What is this Digest::Base anyway?
>
> "This abstract class provides a common interface to message digest
> implementation classes written in C."  Maybe both machines are missing
> something so it didn't compile properly?

$ cd ext/digest/
$ make
make: Nothing to be done for `all'.

$ make clean
$ make all
gcc -I. -I../../.ext/include/x86_64-linux -I../.././include
-I../.././ext/digest -DRUBY_EXTCONF_H=\"extconf.h\"    -fPIC -O3 -ggdb
-Wextra -Wno-unused-parameter -Wno-parentheses -Wpointer-arith
-Wwrite-strings -Wno-missing-field-initializers -Wno-long-long  -fPIC
-o digest.o -c digest.c
gcc -shared -o ../../.ext/x86_64-linux/digest.so digest.o -L. -L../..
-L.  -rdynamic -Wl,-export-dynamic   -Wl,-R
-Wl,/nas/fs/users/cameron/unix/.rvm/rubies/ruby-1.9.2-p180/lib
-L/nas/fs/users/cameron/unix/.rvm/rubies/ruby-1.9.2-p180/lib -lruby
-lpthread -lrt -ldl -lcrypt -lm   -lc
cp ../.././ext/digest/lib/digest/hmac.rb ../../.ext/common/digest
cp ../.././ext/digest/lib/digest.rb ../../.ext/common
cp ../.././ext/digest/digest.h ../../.ext/include/ruby

(Linux 2.6.32-71.24.1.el6.x86_64)



#3287

From: Colin Bartlett <colinb2r@googlemail.com>
Date: 2011-04-20 22:39 -0500
Message-ID: <BANLkTinhdZhq0fEW6aJ6L_HTWqWxbkLDyQ@mail.gmail.com>
In reply to: #3173

On Tue, Apr 19, 2011 at 3:12 PM, Martin Hansen <mail@maasha.dk> wrote:
> Ruby I compiled and installed myself. Inline was installed: gem install RubyInline

I hesitate to suggest this, given that if you've compiled and
installed Ruby yourself you're probably well aware of the various
possibilities. But: does the extension have to be in C and does it
have to be the Ruby you've compiled and installed? If not, and you're
having problems getting the C extensions you want to work with Ruby,
have you considered using JRuby and Java? I've found the integration
of Java with JRuby to be fairly easy: I have had some problems, but
that's due to my ignorance, not JRuby or Java, and I've usually found
a way round my problem quickly. And in my experience, using JRuby with
Java for the bits that need to be fast gives speed similar to compiled
FreePascal.



#4557

From: Martin Hansen <mail@maasha.dk>
Date: 2011-05-15 04:16 -0500
Message-ID: <c16430d6dfe5b07ba3cf5e525e65ff30@ruby-forum.com>
In reply to: #3105

Sorry guys, I have been busy with a few other things before continuing
on this one (also solving the pesky issue with my RubyInline install).

@Robert.

I have been thinking hard about your comments - these contain a lot of
programming insight of the kind you don't get from a "learning
<programming language>" book. However, I struggle to grasp the
wisdom:

"Frankly, I find your code has a design issue: it seems you mix data
and iteration in a single class.  This is visible from how #match
works

def match(pattern, pos = 0, max_edit_distance = 0)
    @pattern           = pattern
    @pos               = pos
    @max_edit_distance = max_edit_distance
    @vector            = vector_init

..

IMHO it would be better to separate representation of the sequence and
the matching process.  The matcher then would only carry a reference
to the sequence and all the data it needs to do matching.
"

What is this "mixing data with iteration in a single class"? To me you
have data and then you iterate - what exactly is the problem? What
should I do for separating these?


&&

"Maybe on the interface, but you create side effects on the String (Seq
in your case).  This is neither a clean separation (makes classes
bigger and harder to understand) nor is it thread safe (e.g. if you
want to try to concurrently match several patterns against the same
sequence)."

Again, "separation", but what is the problem with big classes? When is a
class too big? And how to divide your code in the best way?

Also, I managed to get down to a single vector, but I think I may have
more .dup's than needed - though removing any causes erroneous output:


http://pastie.org/1902844



Cheers,


Martin

-- 
Posted via http://www.ruby-forum.com/.



#4560

From: Robert Klemme <shortcutter@googlemail.com>
Date: 2011-05-15 13:46 +0200
Message-ID: <939sphF22pU1@mid.individual.net>
In reply to: #4557

On 15.05.2011 11:16, Martin Hansen wrote:
> Sorry guys, I have been busy with a few other things before continuing
> on this one (also solving the pesky issue with my RubyInline install).
>
> @Robert.
>
> I have been thinking hard about your comments - these contain a lot of
> programming insight of the kind you don't get from a "learning
> <programming language>" book. However, I struggle to grasp the
> wisdom:
>
> "Frankly, I find your code has a design issue: it seems you mix data
> and iteration in a single class.  This is visible from how #match
> works
>
> def match(pattern, pos = 0, max_edit_distance = 0)
>      @pattern           = pattern
>      @pos               = pos
>      @max_edit_distance = max_edit_distance
>      @vector            = vector_init
>
> ..
>
> IMHO it would be better to separate representation of the sequence and
> the matching process.  The matcher then would only carry a reference
> to the sequence and all the data it needs to do matching.
> "
>
> What is this "mixing data with iteration in a single class"? To me you
> have data and then you iterate - what exactly is the problem? What
> should I do for separating these?

I am referring to http://pastie.org/1808127 : The issue with the code 
presented lies in the *state*: an instance of Seq represents a sequence 
of items (in your case amino acids, I guess).  You need state to 
represent this sequence.  This state is stored in instance variables of 
an instance of class Seq.

The first thing that your method #match (see above) does, is to set some 
other instance variables of Seq.  This poses a problem if

- the Seq instance is frozen, i.e. immutable,

- more than one matching process is under way concurrently (i.e.
#match is invoked from more than one thread at a time).

The reason is that you mixed state needed to represent your sequence 
with state needed to execute the matching process.  Apart from the 
problems listed above this also makes code harder to read and more error 
prone.  For example, you might modify the Seq implementation and 
accidentally reuse the name of an instance variable which then can make 
your code break in unpredictable ways.
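
A minimal sketch of the separation described here, under loud
assumptions: the real Seq and its matching algorithm are only on
pastie, so the class Matcher, the parameter max_mismatches and the
substitution-only matching below are stand-ins invented for
illustration, not the code under discussion. The point is only where
the state lives: Seq holds the sequence and can be frozen, while every
match gets its own Matcher.

```ruby
# Sequence data only: nothing here changes after construction.
class Seq
  def initialize(residues)
    @residues = residues.dup.freeze
    freeze
  end

  def length
    @residues.length
  end

  def [](index)
    @residues[index]
  end

  # Convenience wrapper: all per-match state lives in a fresh Matcher.
  def match(pattern, pos = 0, max_mismatches = 0)
    Matcher.new(self, pattern, pos, max_mismatches).match
  end
end

# Matching state only (hypothetical class): it holds a reference to the
# sequence and never modifies it.
class Matcher
  def initialize(seq, pattern, pos = 0, max_mismatches = 0)
    @seq            = seq
    @pattern        = pattern
    @pos            = pos
    @max_mismatches = max_mismatches
  end

  # First start index >= pos where the pattern fits with at most
  # max_mismatches substitutions, or nil. Simplification: no insertions
  # or deletions, unlike a true edit distance.
  def match
    last_start = @seq.length - @pattern.length
    (@pos..last_start).find { |start| mismatches_at(start) <= @max_mismatches }
  end

  private

  def mismatches_at(start)
    (0...@pattern.length).count { |i| @seq[start + i] != @pattern[i] }
  end
end

seq = Seq.new("ACGTACGT")
# Two concurrent matches against one frozen Seq; neither touches the other.
threads = %w[ACG GTA].map { |pattern| Thread.new { seq.match(pattern) } }
p threads.map(&:value)   # => [0, 2]
```

Because Seq#match assigns no instance variables, it works on a frozen
instance, and renaming a variable inside Matcher cannot collide with
anything in Seq.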

> "Maybe on the interface, but you create side effects on the String (Seq
> in your case).  This is neither a clean separation (makes classes
> bigger and harder to understand) nor is it thread safe (e.g. if you
> want to try to concurrently match several patterns against the same
> sequence)."
>
> Again, "separation", but what is the problem with big classes? When is a
> class too big? And how to divide your code in the best way?

There is no easy answer to that.  In this case you violate modularity by
lumping too many different functionalities together in a single class.
Decoupling the matching process from the sequence representation avoids
the mischief laid out above, and you gain further modularity if the
matching process relies only on the public interface of Seq.  That way
you can even have different implementations of sequences which can all
be scanned by the same matcher.  I am not saying you should necessarily
do this, as efficient matching might also need some knowledge of Seq
internals, but it illustrates the kind of things you should consider
when designing classes.
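
To illustrate the public-interface point, here is a small sketch in
which one matcher scans two different sequence representations. The
names ArraySeq and ExactMatcher are hypothetical, and the matcher does
exact matching only to keep the example short; it assumes nothing about
a sequence except that it responds to #length and #[].

```ruby
# Hypothetical alternative representation: items stored in an Array.
class ArraySeq
  def initialize(items)
    @items = items.dup.freeze
  end

  def length
    @items.length
  end

  def [](index)
    @items[index]
  end
end

# Exact matcher that touches the sequence only through #length and #[].
class ExactMatcher
  def initialize(seq, pattern)
    @seq     = seq
    @pattern = pattern
  end

  # First index at which the whole pattern occurs, or nil.
  def match
    last_start = @seq.length - @pattern.length
    (0..last_start).find do |start|
      (0...@pattern.length).all? { |i| @seq[start + i] == @pattern[i] }
    end
  end
end

# The same matcher works on a plain String and on an ArraySeq.
p ExactMatcher.new("ACGTACGT", "GTA").match
p ExactMatcher.new(ArraySeq.new(%w[A C G T A C G T]), "GTA").match
```

Both calls report the same position, because the matcher never looks
behind the two methods it relies on.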

> Also, I managed to get down to a single vector, but I think I may have
> more .dup's than needed - though removing any causes erroneous output:
>
> http://pastie.org/1902844

That code still stores matching state in the Seq instance.  And you have
quite a few #dups, each of which carries object creation overhead.  I
think you would be better off working with two Arrays of Score instances
and swapping them at each sequence position.  IMHO you do not even need
to reinitialize the Score instances: with your dynamic programming
approach it is guaranteed that you access instances sequentially from
index 0 on and need to access at most last[i - 1], last[i] and
current[i - 1].
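
A sketch of the two-array idea, with assumptions stated up front: the
real Score class is not shown in this thread, so the one-field Struct
below is a stand-in, and best_edit_distance is a hypothetical helper
that computes the smallest edit distance between the pattern and any
substring of the text using the standard approximate-matching
recurrence. It is not the code from the pastie.

```ruby
# Stand-in for the thread's Score class (assumption: one mutable field).
Score = Struct.new(:edits)

# Smallest edit distance between pattern and any substring of text.
def best_edit_distance(text, pattern)
  m = pattern.length
  # Both rows are allocated once. row[i] holds the cost of aligning
  # pattern[0, i] with a substring ending at the current text position.
  last    = Array.new(m + 1) { |i| Score.new(i) }
  current = Array.new(m + 1) { |i| Score.new(i) }
  best    = last[m].edits

  text.each_char do |char|
    current[0].edits = 0   # a match may start at any text position
    (1..m).each do |i|
      substitute        = last[i - 1].edits + (pattern[i - 1] == char ? 0 : 1)
      skip_text_char    = last[i].edits + 1
      skip_pattern_char = current[i - 1].edits + 1
      current[i].edits  = [substitute, skip_text_char, skip_pattern_char].min
    end
    best = [best, current[m].edits].min
    last, current = current, last   # swap rows; no #dup, no new objects
  end

  best
end

p best_edit_distance("ACGTACGT", "GTAC")   # => 0
p best_edit_distance("ACGTACGT", "GTT")    # => 1
```

Each entry of current is overwritten before it is read, which is why
the rows never need to be reinitialized after the swap.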

Kind regards

	robert

-- 
remember.guy do |as, often| as.you_can - without end
http://blog.rubybestpractices.com/


