This hit quality algorithm categorizes MLB contacts into seven types based on launch velocity and launch angle. Four supervised machine learning models were tested: Regression Trees, Bagged Regression Trees, Support Vector Machine, and Naive Bayes. Naive Bayes performed best and can be used to classify future contacts. Early in the modeling process, hierarchical clustering, an unsupervised machine learning algorithm, was used to label the data. Its only purpose is to produce those labels so that the more robust algorithms available in supervised machine learning can then be trained on them.
See the results section for the estimated batting average and wOBA of each of the seven hit types.
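
The two-stage workflow described above (cluster to create labels, then train a classifier on those labels) can be sketched roughly as follows. This is an illustrative sketch only, not the project's code: it uses the Python standard library, made-up data with three hypothetical hit types instead of seven, average linkage for the clustering, and a Gaussian variant of Naive Bayes. All of those choices, and every function name, are assumptions.

```python
import math
import random
from collections import defaultdict

# Synthetic (made-up) contacts: (launch velocity in mph, launch angle in degrees).
rng = random.Random(0)
centers = [(70.0, -10.0), (90.0, 15.0), (105.0, 28.0)]  # hypothetical hit types
points = [(rng.gauss(v, 1.5), rng.gauss(a, 1.5)) for v, a in centers for _ in range(20)]


def agglomerative_labels(data, n_clusters):
    """Naive average-linkage hierarchical clustering; returns one cluster id per point."""
    clusters = [[i] for i in range(len(data))]  # start with every point on its own

    def linkage(a, b):
        # Average pairwise Euclidean distance between two clusters.
        return sum(math.dist(data[i], data[j]) for i in a for j in b) / (len(a) * len(b))

    while len(clusters) > n_clusters:
        # Find the closest pair of clusters and merge them.
        _, i, j = min(
            (linkage(clusters[i], clusters[j]), i, j)
            for i in range(len(clusters))
            for j in range(i + 1, len(clusters))
        )
        merged = clusters.pop(j)  # j > i, so index i is unaffected
        clusters[i].extend(merged)

    labels = [0] * len(data)
    for cluster_id, members in enumerate(clusters):
        for idx in members:
            labels[idx] = cluster_id
    return labels


def fit_gaussian_nb(X, y):
    """Estimate a log prior plus per-feature mean and variance for each class."""
    by_class = defaultdict(list)
    for row, label in zip(X, y):
        by_class[label].append(row)
    model = {}
    for label, rows in by_class.items():
        n = len(rows)
        means = [sum(col) / n for col in zip(*rows)]
        # Small constant keeps the variance strictly positive.
        variances = [
            sum((v - m) ** 2 for v in col) / n + 1e-6
            for col, m in zip(zip(*rows), means)
        ]
        model[label] = (math.log(n / len(X)), means, variances)
    return model


def predict(model, row):
    """Return the class with the highest log posterior for one contact."""
    def log_posterior(params):
        log_prior, means, variances = params
        return log_prior + sum(
            -0.5 * math.log(2 * math.pi * var) - (x - m) ** 2 / (2 * var)
            for x, m, var in zip(row, means, variances)
        )
    return max(model, key=lambda label: log_posterior(model[label]))


# Stage 1: unsupervised labeling. Stage 2: supervised training on those labels.
labels = agglomerative_labels(points, n_clusters=3)
model = fit_gaussian_nb(points, labels)
accuracy = sum(predict(model, p) == l for p, l in zip(points, labels)) / len(points)
print("training accuracy against cluster labels:", accuracy)
```

Once fitted, the classifier can label a new contact without re-running the clustering, for example `predict(model, (98.0, 22.0))`. The accuracy printed here only measures agreement with the cluster labels on the training data, so it says nothing about how the real models compared.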