Python – Parsing a word list in Python

list, parsing, python

I have a wlist.txt file of about 58k words of the English language, a small excerpt of which looks like this:

aardvark
aardwolf
aaron
aback
abacus
abaft
abalone
abandon
abandoned
abandonment
abandons
abase
abased
abasement

What I would like to do is have a program search through the list and see whether a given word is in it, and if so print the word. My issue is that the code I have written always reports that the word is not in the list, even when I know for sure that it is. My code looks like this; does anybody notice any bugs?

match = 'aardvark'
f = 'wlist.txt'
success = False
try:
    for word in open(f):
        if word == match:
            success = True
            break
except IOError:
    print f, "not found!"
if success:
    print "The word has been found with a value of", word
else:
    print "Word not found"

Thanks in advance everyone!!

Best Solution

As others have already said, your problem stems from the fact that the newline characters are part of the words you are reading in. The best way to get rid of these is to use the strip() method of str.
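To make the newline issue concrete, here is a minimal sketch. It is written in Python 3 syntax (so print is a function, unlike in the question's code), and it uses an in-memory io.StringIO object as a stand-in for wlist.txt; the name fake_file and its three-word contents are assumptions made only so the example runs on its own.

```python
import io

# Stand-in for wlist.txt: an in-memory file with one word per line
# (an assumption, so the example runs without the real file).
fake_file = io.StringIO("aardvark\naardwolf\naaron\n")

match = 'aardvark'
first_line = next(fake_file)

# Iterating over a file yields each line including its trailing newline.
raw_equal = (first_line == match)               # compares 'aardvark\n' with 'aardvark'
stripped_equal = (first_line.strip() == match)  # compares 'aardvark' with 'aardvark'

print(repr(first_line))
print(raw_equal, stripped_equal)
```

The repr() call makes the trailing newline visible, which is why the comparison in the question never succeeds until the line is stripped.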

In addition, your code does too much to accomplish a simple task. All you need to do is build a set from your word list and look for the occurrence of your word in the set. A set is far better for this task than a list because checking for the occurrence of an element in a set is much faster. So something like this should work.

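try:
    # 'rU' opens the file with universal newline support (Python 2)
    with open('wlist.txt', 'rU') as infile:
        # strip() removes the trailing newline from each line
        wordSet = set(line.strip() for line in infile)
except IOError:
    print 'error opening file'
    wordSet = set()  # fall back to an empty set so the lookup below still runs

aWord = 'aardvark'

if aWord in wordSet:
    print 'found word', aWord
else:
    print 'word not found'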

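Note: if aWord in wordSet is dramatically faster than scanning a list. A set lookup is a hash lookup that takes roughly constant time on average, while a list scan compares entries one by one. If you're looking for a word close to the end of the word list, the set is nearly 60000 times faster for a 267000-word list. It's still marginally faster even if you're looking for the very first word.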
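If you want to check the speed difference yourself, a rough timing sketch along these lines may help. It is written in Python 3 and uses a synthetic list of made-up strings rather than a real word file; the names words, word_list, word_set and target are illustrative assumptions, and the exact ratio you see will depend on your machine and Python version.

```python
import timeit

# Synthetic stand-in for a large word list (assumption: 267000 made-up
# "words", matching the size mentioned above, not a real word file).
words = ['word%d' % i for i in range(267000)]
word_list = list(words)
word_set = set(words)

target = words[-1]  # worst case for the list: the very last entry

# Time 100 membership checks against each container.
list_time = timeit.timeit(lambda: target in word_list, number=100)
set_time = timeit.timeit(lambda: target in word_set, number=100)

print('list: %.6f s, set: %.6f s' % (list_time, set_time))
```

Setting target = words[0] instead gives the best case for the list, where the gap between the two containers is expected to be much smaller.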