Path: csiph.com!fu-berlin.de!uni-berlin.de!not-for-mail
From: Matt Wheeler <m@funkyhat.org>
Newsgroups: comp.lang.python
Subject: Re: Review Request of Python Code
Date: Thu, 10 Mar 2016 18:51:46 +0000
Lines: 44
Message-ID: <mailman.145.1457635934.15725.python-list@python.org>
References: <f0973a0d-62ba-402b-ab23-cb68bdd15323@googlegroups.com> <af65a7a6-3179-4bca-9022-ae0d2ec61a11@googlegroups.com>
Mime-Version: 1.0
Content-Type: text/plain; charset=UTF-8
In-Reply-To: <af65a7a6-3179-4bca-9022-ae0d2ec61a11@googlegroups.com>
Precedence: list
Xref: csiph.com comp.lang.python:104541

On 10 March 2016 at 18:12,  <subhabangalore@gmail.com> wrote:
> Matt, thank you for if...else suggestion, the data of NewTotalTag.txt
> is like a simple list of words with unconventional tags, like,
>
> w1 tag1
> w2 tag2
> w3 tag3
> ...
> ...
> w3  tag3
>
> like that.

I suspected so. The way your code currently works, if your input text
contains one of the tags, e.g. 'tag1', you'll get an entry in your
output something like 'tag1/w2'. I assume you don't want that :).

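This is because you're using a single list to hold all of the words
and all of the tags together, so a tag can be matched as if it were a
word. Mapping each word to its tag in a dictionary avoids that. Try
something along the lines of: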

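dict_word = {}  # empty dictionary, will map word -> tag
for line in dict_read.splitlines():
    if not line.strip():
        continue  # skip blank lines
    word, tag = line.split()  # copes with runs of spaces ("w3  tag3")
    dict_word[word] = tag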

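Notice I'm using splitlines() instead of split() to do the initial
chopping up of your input. With no arguments, split() splits on any
run of whitespace, newlines included, so the words and tags all land
in one flat list. splitlines() splits only at line breaks, so each
word stays next to its tag.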
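
To make the difference concrete, here is a small sketch. The string
'sample' is made up to stand in for the contents of your file; I
haven't seen the real data.

```python
# 'sample' is a made-up stand-in for the contents of NewTotalTag.txt.
sample = "w1 tag1\nw2 tag2\nw3  tag3\n"

# split() with no arguments breaks on every run of whitespace, so words
# and tags end up interleaved in one flat list.
flat = sample.split()

# splitlines() breaks only at line boundaries, keeping each pair together.
rows = sample.splitlines()

print(flat)  # ['w1', 'tag1', 'w2', 'tag2', 'w3', 'tag3']
print(rows)  # ['w1 tag1', 'w2 tag2', 'w3  tag3']
```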

I would split this and the file-open out into a separate function at
this point. Large blobs of sequential code are not particularly easy
on the eyes or the brain -- choose a sensible name, like
load_dictionary. Perhaps something you could call like:

dict_word = load_dictionary("NewTotalTag.txt")
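
A rough sketch of what that function might look like. This is
untested against your real data, and I'm assuming Python 3, a UTF-8
file, and exactly one "word tag" pair per line. Skipping lines that
don't fit is my own choice; you may prefer to raise an error instead.

```python
def load_dictionary(path):
    """Return a {word: tag} dict read from a file of 'word tag' lines."""
    dict_word = {}
    # The with block closes the file as soon as the loop is done.
    # encoding="utf-8" is an assumption about how the file was saved.
    with open(path, encoding="utf-8") as dict_file:
        for line in dict_file:
            parts = line.split()
            if len(parts) != 2:
                continue  # assumption: skip blank or malformed lines
            word, tag = parts
            dict_word[word] = tag
    return dict_word
```

If a word appears more than once in the file, the last tag seen wins.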


You also aren't closing the file that you open at any point -- once
you've loaded the data from it there's no need to keep the file open
(look up context managers, i.e. the with statement).
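
For instance, a minimal sketch of the with statement closing a file
for you. The temporary file is only there so the example runs
anywhere; in your script you would open NewTotalTag.txt instead.

```python
import os
import tempfile

# Create a throwaway file (a stand-in for NewTotalTag.txt).
fd, path = tempfile.mkstemp(suffix=".txt")
with os.fdopen(fd, "w") as out:
    out.write("w1 tag1\nw2 tag2\n")

with open(path) as dict_file:
    dict_read = dict_file.read()
    print(dict_file.closed)  # False: still inside the with block

print(dict_file.closed)      # True: closed automatically on exit

os.remove(path)  # tidy up the throwaway file
```

The file is closed even if an exception is raised inside the block,
which is the main advantage over calling close() by hand.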

-- 
Matt Wheeler
http://funkyh.at