

Groups > comp.lang.python > #47444 > unrolled thread

I used defaultdic to store some variables but the output is blank

Started by: claire morandin <claire.morandin@gmail.com>
First post: 2013-06-09 03:12 -0700
Last post: 2013-06-09 14:46 +0200
Articles 4 — 2 participants


Contents

  I used defaultdic to store some variables but the output is blank claire morandin <claire.morandin@gmail.com> - 2013-06-09 03:12 -0700
    Re: I used defaultdic to store some variables but the output is blank Peter Otten <__peter__@web.de> - 2013-06-09 13:56 +0200
    Re: I used defaultdic to store some variables but the output is blank claire morandin <claire.morandin@gmail.com> - 2013-06-09 05:27 -0700
      Re: I used defaultdic to store some variables but the output is blank Peter Otten <__peter__@web.de> - 2013-06-09 14:46 +0200

#47444 — I used defaultdic to store some variables but the output is blank

From: claire morandin <claire.morandin@gmail.com>
Date: 2013-06-09 03:12 -0700
Subject: I used defaultdic to store some variables but the output is blank
Message-ID: <35dd1554-de27-4209-b62e-6a2968c19d0c@googlegroups.com>

I have the following script, which does not return anything. There is no apparent mistake, but my output file is empty. I am just trying to extract some decimal numbers from a file according to their names, which are in another file.

[code]from collections import defaultdict
import numpy as np

ercc_contigs= {}
for line in open ('Faq_ERCC_contigs_name.txt'):
    gene = line.strip().split()

ercc_rpkm = defaultdict(lambda: np.zeros(1, dtype=float))
output_file = open('out.txt','w')

rpkm_file = open('RSEM_Faq_Q1.genes.results.txt')
rpkm_file.readline()
for line in rpkm_file:
    line = line.strip()
    columns =  line.strip().split()
    gene = columns[0].strip()
    rpkm_value = float(columns[6].strip())
    if gene in ercc_contigs:
        ercc_rpkm[gene] += rpkm_value

ercc_fh = open ('out.txt','w')
for gene, rpkm_value in ercc_rpkm.iteritems():
    ercc = '{0}\t{1}\n'.format(gene, rpkm_value)
    ercc_fh.write (ercc)[/code]

If someone could help me spot what's wrong, it would be much appreciated. Cheers


#47452

From: Peter Otten <__peter__@web.de>
Date: 2013-06-09 13:56 +0200
Message-ID: <mailman.2914.1370779004.3114.python-list@python.org>
In reply to: #47444

claire morandin wrote:

> I have the following script, which does not return anything. There is no
> apparent mistake, but my output file is empty. I am just trying to extract
> some decimal numbers from a file according to their names, which are in
> another file.
> 
> [code]from collections import defaultdict
> import numpy as np
> 
> ercc_contigs= {}
> for line in open ('Faq_ERCC_contigs_name.txt'):
>     gene = line.strip().split()

You probably planned to use the loop above to populate the ercc_contigs 
dict, but there's no code for that.

 
> ercc_rpkm = defaultdict(lambda: np.zeros(1, dtype=float))
> output_file = open('out.txt','w')
> 
> rpkm_file = open('RSEM_Faq_Q1.genes.results.txt')
> rpkm_file.readline()
> for line in rpkm_file:
>     line = line.strip()
>     columns =  line.strip().split()
>     gene = columns[0].strip()
>     rpkm_value = float(columns[6].strip())

Remember that ercc_contigs is empty; therefore the test 

>     if gene in ercc_contigs:

always fails and the following line is never executed.

>         ercc_rpkm[gene] += rpkm_value
> 
> ercc_fh = open ('out.txt','w')
> for gene, rpkm_value in ercc_rpkm.iteritems():
>     ercc = '{0}\t{1}\n'.format(gene, rpkm_value)
>     ercc_fh.write (ercc)[/code]
> 
> If someone could help me spot what's wrong, it would be much appreciated.
> Cheers

By the way: it is unclear to me why you are using a numpy array here:

> ercc_rpkm = defaultdict(lambda: np.zeros(1, dtype=float))

I think

ercc_rpkm = defaultdict(float)

should suffice. Also:

>     line = line.strip()
>     columns =  line.strip().split()
>     gene = columns[0].strip()
>     rpkm_value = float(columns[6].strip())

You can remove all strip() method calls here: line.split() with no arguments 
splits on runs of whitespace and discards leading and trailing whitespace, 
so the resulting fields never contain any.
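
Both suggestions above can be checked in a few lines. This is only a sketch: the sample line and its values are made up for illustration and are not taken from the real RSEM file, and it is written for Python 3.

```python
from collections import defaultdict

# defaultdict(float) supplies 0.0 for a missing key, so += works
# without a numpy array as the default value.
ercc_rpkm = defaultdict(float)
ercc_rpkm["ERCC-00002"] += 1.5
ercc_rpkm["ERCC-00002"] += 2.0
print(ercc_rpkm["ERCC-00002"])

# Hypothetical results line with stray whitespace (made-up values).
line = "  ERCC-00002\tx\t1.0\t2.0\t3.0\t4.0\t5.5  \n"

# Original approach: strip the line, split it, then strip a field again.
columns_stripped = line.strip().split()
gene_stripped = columns_stripped[0].strip()

# Simplified approach: split() alone yields the same fields.
columns = line.split()
gene = columns[0]
rpkm_value = float(columns[6])

print(columns == columns_stripped, gene == gene_stripped)
print(repr(gene), rpkm_value)
```

With the numpy default, each total would also be written to the output file as a one-element array rather than as a plain number, which is another reason to prefer defaultdict(float) here.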


#47456

From: claire morandin <claire.morandin@gmail.com>
Date: 2013-06-09 05:27 -0700
Message-ID: <7b40883a-4211-48cc-8db4-2414fa52e23a@googlegroups.com>
In reply to: #47444

Thanks Peter. True, I did not realize that ercc_contigs is empty, but I am not sure how to "populate" the dictionary if I only have one column for the value but no key.


#47459

From: Peter Otten <__peter__@web.de>
Date: 2013-06-09 14:46 +0200
Message-ID: <mailman.2917.1370782008.3114.python-list@python.org>
In reply to: #47456

claire morandin wrote:

> Thanks Peter. True, I did not realize that ercc_contigs is empty, but I am
> not sure how to "populate" the dictionary if I only have one column for
> the value but no key.

You could use a "dummy value":

ercc_contigs = {}
for line in open('Faq_ERCC_contigs_name.txt'):
    gene = line.split()[0]
    ercc_contigs[gene] = None

but a better approach is to use a set instead of a dict:

ercc_contigs = set()
for line in open('Faq_ERCC_contigs_name.txt'):
    gene = line.split()[0]
    ercc_contigs.add(gene)
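
Putting the pieces of this thread together, the whole script might look roughly like the sketch below. It is not the original poster's final code: the helper names read_gene_names and sum_rpkm are hypothetical, the in-memory sample data (including the header names) is invented so the example runs without the real files, and it uses Python 3's items() where the original used iteritems().

```python
from collections import defaultdict
import io


def read_gene_names(lines):
    """Return the set of first fields from each non-blank line."""
    names = set()
    for line in lines:
        fields = line.split()
        if fields:  # skip blank lines
            names.add(fields[0])
    return names


def sum_rpkm(lines, wanted, value_column=6):
    """Sum the value column per gene, keeping only genes in `wanted`."""
    totals = defaultdict(float)
    rows = iter(lines)
    next(rows, None)  # skip the header line, as readline() did originally
    for line in rows:
        columns = line.split()
        if not columns:  # skip blank lines
            continue
        gene = columns[0]
        if gene in wanted:
            totals[gene] += float(columns[value_column])
    return totals


# Made-up stand-ins for Faq_ERCC_contigs_name.txt and
# RSEM_Faq_Q1.genes.results.txt; the real files may look different.
names_file = io.StringIO("ERCC-00002\nERCC-00003\n")
results_file = io.StringIO(
    "gene_id\tc1\tc2\tc3\tc4\tc5\tRPKM\n"
    "ERCC-00002\ta\t1\t1\t1\t1\t2.5\n"
    "ERCC-00002\tb\t1\t1\t1\t1\t1.5\n"
    "other_gene\tc\t1\t1\t1\t1\t9.0\n"
)

ercc_contigs = read_gene_names(names_file)
ercc_rpkm = sum_rpkm(results_file, ercc_contigs)

# Stand-in for out.txt; one tab-separated line per matching gene.
output = io.StringIO()
for gene, rpkm_value in sorted(ercc_rpkm.items()):
    output.write('{0}\t{1}\n'.format(gene, rpkm_value))

print(output.getvalue(), end='')
```

With real files, opening each one in a with statement (for example with open('out.txt', 'w') as ercc_fh:) makes sure the output is flushed and closed, and out.txt then only needs to be opened once instead of twice as in the original script.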

