In this notebook, I use the Pomegranate library to build a hidden Markov model for part-of-speech tagging with a universal tagset.
Hidden Markov models have achieved >96% tag accuracy with larger tagsets on realistic text corpora. They have also been used for speech recognition and speech generation, machine translation, gene recognition in bioinformatics, human gesture recognition in computer vision, and more.
Part of the code was provided as a template. My task was to add new functionality in the sections marked 'IMPLEMENTATION' in their headers.
This project was developed in the context of Udacity's Natural Language Processing nanodegree.