SkimLit - NLP Project

This project aims to classify sentences into predefined categories using a combination of token-level, character-level, and positional embeddings. The workflow includes data preprocessing, model building, training, evaluation, and prediction.

Workflow

Data Preprocessing:

Load and preprocess data from the PubMed RCT dataset.
Convert text data into token-level and character-level sequences.
One-hot encode line numbers and total lines.
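
The preprocessing steps above can be sketched in plain Python. This is a minimal illustration, not the project's actual code: it assumes the PubMed RCT text layout (an abstract starts with `###<id>`, each sentence is `LABEL<TAB>text`, and a blank line ends the abstract), and the helper names `parse_abstracts`, `split_chars`, and `one_hot` are hypothetical.

```python
from typing import Dict, List


def parse_abstracts(raw_lines: List[str]) -> List[Dict]:
    """Turn raw dataset lines into one record per sentence.

    Assumed layout: "###<id>" opens an abstract, each sentence line is
    "LABEL<TAB>text", and a blank line closes the abstract.
    """
    samples: List[Dict] = []
    current: List[str] = []

    def flush() -> None:
        # Emit one record per sentence of the abstract collected so far.
        for i, entry in enumerate(current):
            label, text = entry.split("\t", 1)
            samples.append({
                "target": label,
                "text": text.lower(),
                "line_number": i,                 # position within the abstract
                "total_lines": len(current) - 1,  # index of the last sentence
            })
        current.clear()

    for line in raw_lines:
        line = line.rstrip("\n")
        if line.startswith("###") or not line.strip():
            flush()
        else:
            current.append(line)
    flush()  # handle a file that does not end with a blank line
    return samples


def split_chars(text: str) -> str:
    """Space-separate characters so a vectorizer can treat each one as a token."""
    return " ".join(list(text))


def one_hot(index: int, depth: int) -> List[float]:
    """One-hot encode an index; out-of-range indices give an all-zero vector."""
    return [1.0 if i == index else 0.0 for i in range(depth)]
```

A positional feature would then be encoded with something like `one_hot(sample["line_number"], depth=15)`. The depth is an assumption here; it would normally be chosen from the distribution of abstract lengths in the training data.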

Model Building:

Create token-level and character-level embedding models.
Create models for the line-number and total-lines features.
Combine embeddings using tf.keras.layers.Concatenate.
Build a tribrid embedding model using tf.keras.Model.
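
As a rough sketch of how the three kinds of input might be combined, the example below builds a small tribrid model with the Keras functional API. It is an illustration under stated assumptions, not the project's architecture: the inputs are assumed to be already vectorized integer sequences, and every size (vocabulary, sequence length, one-hot depth, layer width, and the five output classes) is a placeholder. The function name `build_tribrid_model` is hypothetical.

```python
import tensorflow as tf

layers = tf.keras.layers

# All sizes below are illustrative assumptions, not the project's settings.
NUM_CLASSES = 5
TOKEN_VOCAB, TOKEN_LEN = 10000, 55
CHAR_VOCAB, CHAR_LEN = 70, 290
LINE_DEPTH, TOTAL_DEPTH = 15, 20


def build_tribrid_model() -> tf.keras.Model:
    # 1. Token-level branch: embed token ids and pool over the sequence.
    token_in = layers.Input(shape=(TOKEN_LEN,), dtype="int32", name="token_ids")
    x_tok = layers.Embedding(TOKEN_VOCAB, 128)(token_in)
    x_tok = layers.GlobalAveragePooling1D()(x_tok)
    x_tok = layers.Dense(128, activation="relu")(x_tok)

    # 2. Character-level branch: embed character ids and summarize with a BiLSTM.
    char_in = layers.Input(shape=(CHAR_LEN,), dtype="int32", name="char_ids")
    x_chr = layers.Embedding(CHAR_VOCAB, 25)(char_in)
    x_chr = layers.Bidirectional(layers.LSTM(32))(x_chr)

    # 3. Positional branches: one-hot line number and total lines.
    line_in = layers.Input(shape=(LINE_DEPTH,), dtype="float32", name="line_number")
    x_line = layers.Dense(32, activation="relu")(line_in)
    total_in = layers.Input(shape=(TOTAL_DEPTH,), dtype="float32", name="total_lines")
    x_total = layers.Dense(32, activation="relu")(total_in)

    # 4. Combine token and character features, then add the positional features.
    text = layers.Concatenate(name="token_char")([x_tok, x_chr])
    text = layers.Dropout(0.5)(text)
    merged = layers.Concatenate(name="tribrid")([x_line, x_total, text])

    # 5. Classify each sentence into one of NUM_CLASSES categories.
    out = layers.Dense(NUM_CLASSES, activation="softmax", name="label")(merged)
    return tf.keras.Model(
        inputs=[token_in, char_in, line_in, total_in],
        outputs=out,
        name="tribrid_model",
    )
```

Calling `build_tribrid_model().summary()` prints the layer graph, which is a quick way to confirm that the four inputs reach the final `Concatenate` layer.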

Training:

Compile the model with the CategoricalCrossentropy loss and the Adam optimizer.
Train the model on the training dataset with validation.
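
The compile and fit steps might look like the sketch below. To stay self-contained it uses a tiny stand-in model and random data, both of which are assumptions made only for illustration; the loss and optimizer are the ones named above, while the metric, batch size, and epoch count are placeholders.

```python
import numpy as np
import tensorflow as tf

# Stand-in model: a small dense classifier in place of the tribrid model.
model = tf.keras.Sequential([
    tf.keras.Input(shape=(8,)),
    tf.keras.layers.Dense(16, activation="relu"),
    tf.keras.layers.Dense(5, activation="softmax"),
])

# CategoricalCrossentropy expects one-hot encoded labels.
model.compile(
    loss=tf.keras.losses.CategoricalCrossentropy(),
    optimizer=tf.keras.optimizers.Adam(),
    metrics=["accuracy"],
)

# Random placeholder data with 5 classes (not real abstracts).
rng = np.random.default_rng(0)
x_train = rng.normal(size=(64, 8)).astype("float32")
y_train = tf.keras.utils.to_categorical(rng.integers(0, 5, size=64), num_classes=5)
x_val = rng.normal(size=(16, 8)).astype("float32")
y_val = tf.keras.utils.to_categorical(rng.integers(0, 5, size=16), num_classes=5)

# Train with a validation set so val_loss and val_accuracy are tracked per epoch.
history = model.fit(
    x_train, y_train,
    validation_data=(x_val, y_val),
    epochs=1,
    batch_size=16,
    verbose=0,
)
```

The returned `history.history` dictionary holds the per-epoch training and validation metrics, which is useful for spotting overfitting.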

Evaluation:

Evaluate the model on the test dataset.
Calculate accuracy, precision, recall, and F1 score.
Display confusion matrix and analyze misclassifications.
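
The evaluation metrics listed above can be computed from true and predicted labels as in the following dependency-free sketch. The support-weighted averaging of precision, recall, and F1 is an assumption (the project may average differently), and the function names are hypothetical; scikit-learn's `sklearn.metrics` module offers equivalent, better-tested functions.

```python
from typing import Dict, List, Sequence


def confusion_matrix(y_true: Sequence[str], y_pred: Sequence[str],
                     labels: Sequence[str]) -> List[List[int]]:
    """Rows are true labels, columns are predicted labels."""
    index = {label: i for i, label in enumerate(labels)}
    matrix = [[0] * len(labels) for _ in labels]
    for t, p in zip(y_true, y_pred):
        matrix[index[t]][index[p]] += 1
    return matrix


def classification_scores(y_true: Sequence[str], y_pred: Sequence[str],
                          labels: Sequence[str]) -> Dict[str, float]:
    """Accuracy plus support-weighted precision, recall, and F1."""
    cm = confusion_matrix(y_true, y_pred, labels)
    total = len(y_true)
    scores = {"accuracy": 0.0, "precision": 0.0, "recall": 0.0, "f1": 0.0}
    if total == 0:
        return scores
    scores["accuracy"] = sum(cm[i][i] for i in range(len(labels))) / total
    for i in range(len(labels)):
        tp = cm[i][i]
        predicted = sum(row[i] for row in cm)  # column sum
        actual = sum(cm[i])                    # row sum (support)
        p = tp / predicted if predicted else 0.0
        r = tp / actual if actual else 0.0
        f = 2 * p * r / (p + r) if (p + r) else 0.0
        weight = actual / total
        scores["precision"] += weight * p
        scores["recall"] += weight * r
        scores["f1"] += weight * f
    return scores


def misclassified(y_true: Sequence[str], y_pred: Sequence[str]) -> List[int]:
    """Indices of samples whose prediction differs from the true label."""
    return [i for i, (t, p) in enumerate(zip(y_true, y_pred)) if t != p]
```

The indices returned by `misclassified` can be used to look up the corresponding sentences and inspect which categories the model tends to confuse.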

Prediction:

Load the trained model and make predictions on new abstracts.
Visualize predicted labels for each sentence in the abstract.
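
A minimal sketch of the prediction flow for a new abstract is shown below. It makes several assumptions: sentences are split with a naive regular expression (the project may use a proper sentence segmenter), and the trained model is hidden behind a hypothetical `predict_fn` callable that maps per-sentence features to label strings. In a real run, that callable would wrap the loaded model and the saved vectorizer and label encoder.

```python
import re
from typing import Callable, Dict, List, Tuple


def split_sentences(abstract: str) -> List[str]:
    """Naive splitter: break after ., ! or ? when followed by a capital or digit."""
    parts = re.split(r"(?<=[.!?])\s+(?=[A-Z0-9])", abstract.strip())
    return [p for p in parts if p]


def label_abstract(
    abstract: str,
    predict_fn: Callable[[List[Dict]], List[str]],
) -> List[Tuple[str, str]]:
    """Build per-sentence features, predict, and pair each label with its sentence."""
    sentences = split_sentences(abstract)
    total = len(sentences) - 1
    features = [
        {"text": s, "line_number": i, "total_lines": total}
        for i, s in enumerate(sentences)
    ]
    labels = predict_fn(features)  # hypothetical wrapper around the trained model
    return list(zip(labels, sentences))


def render(labelled: List[Tuple[str, str]]) -> str:
    """Plain-text view: one 'LABEL: sentence' line per sentence."""
    return "\n".join(f"{label}: {sentence}" for label, sentence in labelled)
```

With a stub such as `lambda feats: ["METHODS"] * len(feats)` in place of the model, `print(render(label_abstract(text, stub)))` shows the output layout before any real model is wired in.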

Example of model classification

Files

char_vectorizer.pkl: Saved character vectorizer.
final_model.keras: Trained model.
label_encoder.pkl: Saved label encoder.

Run the app on your local machine

Clone the repository:

git clone https://github.com/davydantoniuk/skimlit-nlp-project.git

Navigate to the project directory:

cd skimlit-nlp-project

Create a virtual environment:

python -m venv venv

Activate the virtual environment:

Windows:

venv\Scripts\activate

macOS/Linux:

source venv/bin/activate

Install the required packages:

pip install -r requirements.txt

Run the application:

python app/app.py
