Re: Pairwise count of frequency from an incidence matrix of group membership

From Peter Otten <__peter__@web.de>
Subject Re: Pairwise count of frequency from an incidence matrix of group membership
Date 2011-04-20 10:30 +0200
Organization None
References <367389.80532.qm@web65402.mail.ac4.yahoo.com>
Newsgroups comp.lang.python
Message-ID <mailman.628.1303288234.9059.python-list@python.org>

Shafique, M. (UNU-MERIT) wrote:

> Hi,
> I have a number of different groups g1, g2, … g100 in my data. Each group
> comprises a known but different set of members from the population
> m1, m2, … m1000. The data has been organized in an incidence matrix:
>      g1 g2 g3 g4 g5
> m1    1  1  1  0  1
> m2    1  0  0  1  0
> m3    0  1  1  0  0
> m4    1  1  0  1  1
> m5    0  0  1  1  0
> 
> I need to count how many groups each possible pair of members shares (i.e.,
> how many groups both are members of).
> I would prefer the result as a pairwise edge list with weight/frequency, in
> a format like the following:
> m1, m1, 4
> m1, m2, 1
> m1, m3, 2
> m1, m4, 3
> m1, m5, 1
> m2, m2, 2
> ... and so on.
> 
> I would highly appreciate it if anybody could suggest or share some
> code, tool, or module that could help do this.

Homework? What have you tried?

One strategy is to convert the initial matrix into a list of sets, one per
member, each containing the indices of the groups that member belongs to:

matrix = [
[1, 1, 1, 0, 1],
[1, 0, 0, 1, 0],
]

sets = [ # zero-based indices
   set([0,1,2,4]),
   set([0,3]),
   ...
]

The enumerate() builtin may help you with the conversion. You can then find 
the shared groups with set arithmetic:

sets[0] & sets[1] #m1/m2
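
Putting the two steps together, a minimal sketch might look like the
following. It assumes the full 5x5 matrix from the question; the helper name
pairwise_counts and the "m1", "m2", ... labels are made up for illustration,
and itertools.combinations_with_replacement needs Python 2.7 or 3.1+.

```python
from itertools import combinations_with_replacement

# Incidence matrix from the question: rows are members m1..m5,
# columns are groups g1..g5.
matrix = [
    [1, 1, 1, 0, 1],
    [1, 0, 0, 1, 0],
    [0, 1, 1, 0, 0],
    [1, 1, 0, 1, 1],
    [0, 0, 1, 1, 0],
]

# One set per member, holding the zero-based indices of its groups.
# enumerate() pairs each cell with its column index.
sets = [set(g for g, flag in enumerate(row) if flag) for row in matrix]


def pairwise_counts(sets):
    """Return (member, member, shared_group_count) for every pair.

    Hypothetical helper: pairs include a member with itself, matching
    the "m1, m1, 4" line in the requested output.
    """
    edges = []
    for i, j in combinations_with_replacement(range(len(sets)), 2):
        shared = sets[i] & sets[j]  # groups both members belong to
        edges.append(("m%d" % (i + 1), "m%d" % (j + 1), len(shared)))
    return edges


edges = pairwise_counts(sets)
for a, b, n in edges:
    print("%s, %s, %d" % (a, b, n))
```

The loop compares every pair of members, so for 1000 members it does about
500,000 set intersections; that should still be manageable, but it is not
tuned for speed.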
