Nounification

I could not find a similar tool on GitHub, so here you go. This is a handy tool that performs "nounification", also known as nominalization: it turns a word such as a verb or adjective into a related noun.

This is often useful in keyword-extraction algorithms.

Usage

nominalize.py provides two functions. nounify_tag takes a word and its POS tag; nounify_context takes a word and a sentence that contains it:

print(nounify_tag("elect", "VV")) prints election

print(nounify_context("russian", "He is Russian.")) prints russia

Requirements

Python 3

NLTK (with WordNet)

pickle (part of the Python standard library)

How it works

1. The word is lemmatized into its root form.
2. The synsets of the root word are obtained for the corresponding POS tag.
3. The lemmas of each word in the synsets are collected (narrowed down to the desired tag / adjective).
4. Derivationally related forms are computed for each lemma obtained in step 3.
5. The resulting forms are filtered down to the desired POS tag.
6. The filtered lemmas are converted into proper words.
7. The resulting list is lowercased and duplicates are removed.
8. A frequency-based probability distribution (built from the Brown Corpus) is applied, and the word with the highest probability is returned.
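
The steps above can be sketched roughly as follows. This is a minimal illustration and not the code in nominalize.py: the function names related_nouns and most_frequent are hypothetical, the frequency table is a plain dictionary with made-up counts instead of the Brown Corpus data, and raw counts stand in for probabilities (the highest count and the highest probability pick the same word). The WordNet part assumes NLTK and its WordNet data are installed.

```python
from collections import Counter

# WordNet part-of-speech codes: "n" noun, "v" verb, "a" adjective, "r" adverb.
NOUN = "n"


def related_nouns(word, pos):
    """Collect nouns derivationally related to `word` (steps 1 to 7).

    Hypothetical helper, not part of nominalize.py.
    """
    # Imported here so the frequency helper below works without NLTK installed.
    from nltk.corpus import wordnet as wn
    from nltk.stem import WordNetLemmatizer

    root = WordNetLemmatizer().lemmatize(word.lower(), pos=pos)  # step 1
    nouns = set()
    for synset in wn.synsets(root, pos=pos):                     # step 2
        for lemma in synset.lemmas():                            # step 3
            for related in lemma.derivationally_related_forms():  # step 4
                if related.synset().pos() == NOUN:               # step 5
                    # Steps 6 and 7: plain lowercase words, no duplicates.
                    nouns.add(related.name().replace("_", " ").lower())
    return nouns


def most_frequent(candidates, freq):
    """Return the candidate with the highest corpus count (step 8).

    Hypothetical helper; `freq` maps word -> count.
    """
    if not candidates:
        return None
    # sorted() makes ties deterministic: the alphabetically first word wins.
    return max(sorted(candidates), key=lambda w: freq.get(w, 0))


# Made-up counts, for illustration only (not taken from the Brown Corpus).
freq = Counter({"election": 120, "elector": 4})
print(most_frequent({"elector", "election"}, freq))  # election
```

With NLTK and WordNet available, most_frequent(related_nouns("elect", "v"), freq) would chain the two helpers; the result depends on the WordNet version and on the frequency table used.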
