| Header | Value |
|---|---|
| X-Received | by 10.180.11.239 with SMTP id t15mr1680876wib.3.1370400099001; Tue, 04 Jun 2013 19:41:39 -0700 (PDT) |
| X-Received | by 10.50.102.71 with SMTP id fm7mr574882igb.7.1370400098449; Tue, 04 Jun 2013 19:41:38 -0700 (PDT) |
| Path | csiph.com!v102.xanadu-bbs.net!xanadu-bbs.net!feeder.erje.net!eu.feeder.erje.net!newsfeed.fsmpi.rwth-aachen.de!proxad.net!feeder1-2.proxad.net!209.85.212.216.MISMATCH!h2no3879858wiw.1!news-out.google.com!fw11ni1470wic.0!nntp.google.com!h2no3879855wiw.1!postnews.google.com!glegroupsg2000goo.googlegroups.com!not-for-mail |
| Newsgroups | comp.lang.python |
| Date | Tue, 4 Jun 2013 19:41:38 -0700 (PDT) |
| Complaints-To | groups-abuse@google.com |
| Injection-Info | glegroupsg2000goo.googlegroups.com; posting-host=203.181.243.17; posting-account=4fiQAQkAAAANjB-r4EnBE_ZGEHG3iI9X |
| NNTP-Posting-Host | 203.181.243.17 |
| User-Agent | G2/1.0 |
| MIME-Version | 1.0 |
| Message-ID | <d6c181ee-d2bf-4278-a26c-22bc26402271@googlegroups.com> (permalink) |
| Subject | Issue values dictionary |
| From | claire morandin <claire.morandin@gmail.com> |
| Injection-Date | Wed, 05 Jun 2013 02:41:38 +0000 |
| Content-Type | text/plain; charset=ISO-8859-1 |
| Xref | csiph.com comp.lang.python:46994 |
I have two text files, each with a bunch of transcript names and their corresponding lengths. They look like this:
ERCC.txt
ERCC-00002 1061
ERCC-00003 1023
ERCC-00004 523
ERCC-00009 984
ERCC-00012 994
ERCC-00013 808
ERCC-00014 1957
ERCC-00016 844
ERCC-00017 1136
ERCC-00019 644
blast.txt
ERCC-00002 1058
ERCC-00003 1017
ERCC-00004 519
ERCC-00009 977
ERCC-00019 638
ERCC-00022 746
ERCC-00024 134
ERCC-00024 126
ERCC-00024 98
ERCC-00025 445
I want to compare the lengths of the transcripts and check whether the length in blast.txt is at least 90% of the length in ERCC.txt for the corresponding transcript name (I hope I am clear!)
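To make the rule concrete: for ERCC-00002 above, the check is whether 1058 is at least 0.9 times 1061. A minimal sketch of that test on its own, assuming Python 3 print syntax; the function name `is_long_enough` and the second test value are hypothetical and only there for illustration:

```python
def is_long_enough(blast_length, ercc_length, fraction=0.9):
    # True when the BLAST length reaches the given fraction of the ERCC length
    return blast_length >= fraction * ercc_length


# ERCC-00002: 1058 in blast.txt against 1061 in ERCC.txt
print(is_long_enough(1058, 1061))  # True, since 0.9 * 1061 = 954.9

# Hypothetical shorter hit, not taken from the files above
print(is_long_enough(900, 1061))   # False
```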
So I wrote the following script:
ercctranscript_size = {}
for line in open('ERCC.txt'):
    columns = line.strip().split()
    transcript = columns[0]
    size = columns[1]
    ercctranscript_size[transcript] = int(size)

unknown_transcript = open('Not_sequenced_ERCC_transcript.txt', 'w')
blast_file = open('blast.txt')
out_file = open ('out.txt', 'w')

blast_transcript = {}
blast_file.readline()
for line in blast_file:
    blasttranscript = columns[0].strip()
    blastsize = columns[1].strip()
    blast_transcript[blasttranscript] = int(blastsize)

blastsize = blast_transcript[blasttranscript]
size = ercctranscript_size[transcript]
print size
if transcript not in blast_transcript:
    unknown_transcript.write('{0}\n'.format(transcript))
else:
    size = ercctranscript_size[transcript]
    if blastsize >= 0.9*size:
        print >> out_file, transcript, True
    else:
        print >> out_file, transcript, False
But I have a problem storing all the lengths in the variable size: it always comes back with the last entry.
Could anyone explain to me what I am doing wrong and how I should set the values for each dictionary? I am really new to Python and this is my first script.
Thanks for your help everybody!
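For reference, two things are visible in the script as posted: the second loop never splits `line`, so `columns` still holds the last line of ERCC.txt, and `transcript` and `size` are plain variables that keep whatever the first loop assigned last. The following is a minimal sketch of one possible restructuring, not necessarily what the replies in the thread propose. It assumes Python 3, assumes neither file has a header line, and lets the last entry win when a name repeats (as ERCC-00024 does in blast.txt); the names `parse_lengths` and `compare` are hypothetical.

```python
def parse_lengths(lines):
    """Map each transcript name to its length (last occurrence wins)."""
    lengths = {}
    for line in lines:
        columns = line.split()  # split *this* line, not a leftover one
        if len(columns) < 2:
            continue  # skip blank or malformed lines
        lengths[columns[0]] = int(columns[1])
    return lengths


def compare(ercc, blast, threshold=0.9):
    """Check every ERCC transcript against the BLAST lengths.

    Returns (results, missing): results maps a name to True/False,
    missing lists names that do not appear in the BLAST data.
    """
    results = {}
    missing = []
    for transcript, size in ercc.items():
        if transcript not in blast:
            missing.append(transcript)
        else:
            results[transcript] = blast[transcript] >= threshold * size
    return results, missing


# A few lines taken from the two files above
ercc_lines = ["ERCC-00002 1061", "ERCC-00003 1023", "ERCC-00012 994"]
blast_lines = ["ERCC-00002 1058", "ERCC-00003 1017", "ERCC-00022 746"]

results, missing = compare(parse_lengths(ercc_lines), parse_lengths(blast_lines))
print(results)  # {'ERCC-00002': True, 'ERCC-00003': True}
print(missing)  # ['ERCC-00012']
```

Because an open file iterates over its lines, `parse_lengths(open('ERCC.txt'))` should work the same way on the real files; writing `results` and `missing` to out.txt and Not_sequenced_ERCC_transcript.txt is left out to keep the sketch short.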
Issue values dictionary claire morandin <claire.morandin@gmail.com> - 2013-06-04 19:41 -0700
Re: Issue values dictionary alex23 <wuwei23@gmail.com> - 2013-06-04 20:17 -0700
Re: Issue values dictionary Peter Otten <__peter__@web.de> - 2013-06-05 09:43 +0200
Re: Issue values dictionary alex23 <wuwei23@gmail.com> - 2013-06-05 02:46 -0700
Re: Issue values dictionary claire morandin <claire.morandin@gmail.com> - 2013-06-04 21:05 -0700