Whenever you build an indexing program, you have to take care of word sense disambiguation, because a word can have several meanings in different contexts and you have to choose the right one for a given piece of text. Build a co-occurrence network for the chosen word, then calculate the PMI (pointwise mutual information) and the Dice coefficient between that word and its neighbours to resolve the ambiguity.
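As a rough illustration of the two measures, here is a minimal Python sketch. It assumes sentence-level co-occurrence (two words co-occur if they appear in the same sentence) and plain whitespace tokenization. The toy corpus and the function names `cooccurrence_counts`, `pmi` and `dice` are made up for this example. How the scores are then used to pick a sense is left open.

```python
import math
from collections import Counter
from itertools import combinations


def cooccurrence_counts(sentences):
    """Count, per sentence, how often each word and each word pair appears."""
    word_counts = Counter()
    pair_counts = Counter()
    for sentence in sentences:
        tokens = set(sentence.lower().split())  # each word counted once per sentence
        word_counts.update(tokens)
        pair_counts.update(frozenset(p) for p in combinations(sorted(tokens), 2))
    return word_counts, pair_counts


def pmi(a, b, word_counts, pair_counts, n):
    """PMI = log2( P(a, b) / (P(a) * P(b)) ), with probabilities over n sentences."""
    joint = pair_counts[frozenset((a, b))]
    if joint == 0:
        return float("-inf")  # the words never co-occur
    return math.log2(joint * n / (word_counts[a] * word_counts[b]))


def dice(a, b, word_counts, pair_counts):
    """Dice = 2 * count(a, b) / (count(a) + count(b))."""
    joint = pair_counts[frozenset((a, b))]
    total = word_counts[a] + word_counts[b]
    return 2 * joint / total if total else 0.0


# Toy corpus (an assumption for illustration only).
sentences = [
    "the bank approved the loan",
    "she deposited money at the bank",
    "the river bank was muddy",
    "he borrowed money from the bank",
    "the river flooded the village",
    "money was tight that year",
]
word_counts, pair_counts = cooccurrence_counts(sentences)

for neighbour in ("money", "river", "loan"):
    print(
        neighbour,
        pmi("bank", neighbour, word_counts, pair_counts, len(sentences)),
        dice("bank", neighbour, word_counts, pair_counts),
    )
```

On a corpus this small the numbers mean little. With real text, neighbours that score high against the ambiguous word in a given context point towards the sense that is in use there.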
You also have to account for the many idioms used in English. You have to work out which words in the text are part of an idiom and which are not. This is another important point. There is a link to a list of English idioms below.
Link: http://www.usingenglish.com/reference/idioms/b.html
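One simple way to separate idiom words from ordinary words is to match the text against a list of known idioms. The sketch below assumes exact, lower-cased matching on whitespace tokens. The three-entry `IDIOMS` list and the function name `mark_idioms` are placeholders for this example, not something taken from the linked page.

```python
# Placeholder idiom list (an assumption); a real indexer would load a full list.
IDIOMS = ["kick the bucket", "break the ice", "back to square one"]


def mark_idioms(text, idioms=IDIOMS):
    """Return (token, is_part_of_idiom) pairs for the given text."""
    tokens = text.lower().split()
    in_idiom = [False] * len(tokens)
    for idiom in idioms:
        parts = idiom.split()
        # Slide a window of the idiom's length over the tokens.
        for i in range(len(tokens) - len(parts) + 1):
            if tokens[i:i + len(parts)] == parts:
                for j in range(i, i + len(parts)):
                    in_idiom[j] = True
    return list(zip(tokens, in_idiom))


marked = mark_idioms("we are back to square one again")
print(marked)
```

Words flagged as part of an idiom can then be indexed as one unit instead of being disambiguated one by one. Exact matching misses inflected forms such as "kicked the bucket" and tokens with punctuation attached, so real text would need normalization first.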
Tuesday, September 16, 2008