Linguist's Classifier in Go.
Linguist can be used as a Go package by
import "github.com/lazywei/linguist"
and it also has a CLI (command line interface) in cli/
$ cd cli/
$ ./build.sh
$ ./mockingbird --help
- Clone the RosettaCodeData
git clone git@github.com:acmeism/RosettaCodeData.git
- Build this
cli
executable
cd cli/
./build.sh
- Run the
collectRosetta
according to the cloned RosettaCodeData, and collect files to../samples
./mockingbird collectRosetta path/to/clones/RosettaCodeData ../samples
Build from scratch
./mockingbird convertLibsvm ../samples ../
This will save libsvm.samples
and bow.gob
to ../
. The bow.gob
is the
parameters for constructing bag-of-words. This can be used afterward:
./mockingbird convertLibsvm ../samples ../ --bowPath ../bow.gob
For example, train a logisitic regression classifier:
./mockingbird train --sample=./test_fixture/test_samples.libsvm --solver 1
This will save a model file in $PWD/model/lr.model
, which can be used in
later prediction.
Full usage:
usage: mockingbird train [<flags>]
Train Classifier
Flags:
--help Show help (also see --help-long and --help-man).
--sample="samples.libsvm"
Path for samples (in libsvm format)
--output="model" Path for saving trained model
--solver=0 0 = NaiveBayes, 1 = LogisticRegression
For example, make prediction via previously trained logisitic regression classifier:
./mockingbird predict --model=./model/lr.model --data=./test_fixture/test_samples.libsvm --solver=1
Full usage:
usage: mockingbird predict --data=DATA [<flags>]
Predict via trained Classifier
Flags:
--help Show help (also see --help-long and --help-man).
--model="./model/naive_bayes.gob"
Path for loading saved model
--data=DATA Path for testing data (in libsvm format)
--solver=0 0 = NaiveBayes, 1 = LogisticRegression