Evaluating a fine-tuned GPT-3.5 model’s effectiveness in distinguishing truthful from deceptive opinions.
LLM, AI, neuroscience, Fine-tuning
- About the Project
- Key Features
- Key Results
- Data Overview
- Methodology
- Screenshots and Graphs
- Technologies Used
- Setup & Installation
- Usage
- Contributing
- License
- Contact
This study explores the performance of a fine-tuned GPT-3.5 model in detecting deception, inspired by Loconte et al. (2023). We use a dataset of opinions labeled as truthful or deceptive and evaluate whether fine-tuning improves detection accuracy. The study shows that the fine-tuned GPT-3.5 outperforms FLAN-T5 on this task: the model exhibits a truth bias but generalizes better across topics.
- Enhanced Detection Accuracy: The fine-tuned GPT-3.5 model achieved high accuracy (86.6%) in distinguishing between truthful and deceptive statements.
- Performance Comparison: GPT-3.5-turbo-0163 surpasses FLAN-T5 in accuracy, precision, and F-score across several opinion-based topics.
- Truth Bias Analysis: Examination of the model’s cognitive biases, particularly a truth bias resulting in higher false-positive rates.
- Model Accuracy: The average accuracy across folds is 0.866, indicating consistent performance and minimal overfitting.
- Truth Bias: The model displays a tendency to classify opinions as truthful, reflected in a higher number of false positives.
- Performance on Topics:
- Highest Accuracy: Gay Marriage (89.0%)
- Lowest Accuracy: Cannabis Legalization (83.8%)
The dataset contains opinion statements categorized by topic and labeled as either truthful or deceptive.
- Source: Scenario 1 opinions dataset as used by Loconte et al. (2023).
- Sample Size: 2500 statements.
- Cross-Validation: 4-fold cross-validation, with a 75%/25% train-test split per fold (a minimal data-preparation sketch follows this list).
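For illustration, the 4-fold splits and the chat-format fine-tuning files could be prepared roughly as below. This is a minimal sketch: the file name opinions.csv, the statement/label column names, and the system prompt are assumptions, not the actual dataset schema used in the notebook.

```python
# Minimal sketch: build 4-fold train/test splits and write OpenAI chat-format
# JSONL files. "opinions.csv" and the "statement"/"label" columns are assumptions.
import json
import pandas as pd
from sklearn.model_selection import StratifiedKFold

df = pd.read_csv("opinions.csv")  # assumed columns: statement, label ("T"/"F")
skf = StratifiedKFold(n_splits=4, shuffle=True, random_state=42)

for fold, (train_idx, test_idx) in enumerate(skf.split(df["statement"], df["label"])):
    with open(f"train_fold{fold}.jsonl", "w") as f:
        for _, row in df.iloc[train_idx].iterrows():
            f.write(json.dumps({
                "messages": [
                    {"role": "system", "content": "Classify the opinion as truthful (T) or deceptive (F)."},
                    {"role": "user", "content": row["statement"]},
                    {"role": "assistant", "content": row["label"]},
                ]
            }) + "\n")
    df.iloc[test_idx].to_csv(f"test_fold{fold}.csv", index=False)
```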
The project uses a structured fine-tuning process on the GPT-3.5-turbo-0163 model, with cross-validation and explainability analysis:
- Fine-Tuning Process: GPT-3.5 is fine-tuned for 3 epochs per fold on the training split, with labels assigned as either True (T) or False (F); a minimal sketch of one fold follows this list.
- Metrics: Key metrics—accuracy, precision, recall, and F-score—are calculated per cross-validation fold.
- Explainability Analysis: A Common Language Effect Size (CLES) analysis is conducted on linguistic features, identifying patterns associated with truthful or deceptive statements.
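A rough sketch of how one fold's fine-tuning job and evaluation could look with the OpenAI Python SDK is shown below. The file names, system prompt, and base model identifier are illustrative assumptions; the actual workflow lives in the notebook.

```python
# Minimal sketch of one fold: launch a 3-epoch fine-tuning job, then score the
# held-out split. File names and the system prompt are illustrative assumptions.
import time
import pandas as pd
from openai import OpenAI
from sklearn.metrics import accuracy_score, precision_score, recall_score, f1_score

client = OpenAI()

# 1) Upload training data and start the fine-tuning job.
train_file = client.files.create(file=open("train_fold0.jsonl", "rb"), purpose="fine-tune")
job = client.fine_tuning.jobs.create(
    training_file=train_file.id,
    model="gpt-3.5-turbo",                 # illustrative base model name
    hyperparameters={"n_epochs": 3},
)

# 2) Poll until the job finishes, then read the fine-tuned model name.
while True:
    job = client.fine_tuning.jobs.retrieve(job.id)
    if job.status in ("succeeded", "failed", "cancelled"):
        break
    time.sleep(60)
fine_tuned_model = job.fine_tuned_model

# 3) Classify the held-out fold and compute the per-fold metrics.
test = pd.read_csv("test_fold0.csv")
preds = []
for statement in test["statement"]:
    resp = client.chat.completions.create(
        model=fine_tuned_model,
        messages=[
            {"role": "system", "content": "Classify the opinion as truthful (T) or deceptive (F)."},
            {"role": "user", "content": statement},
        ],
        temperature=0,
    )
    preds.append(resp.choices[0].message.content.strip())

y_true, y_pred = test["label"], preds
print("accuracy ", accuracy_score(y_true, y_pred))
print("precision", precision_score(y_true, y_pred, pos_label="T"))
print("recall   ", recall_score(y_true, y_pred, pos_label="T"))
print("f-score  ", f1_score(y_true, y_pred, pos_label="T"))
```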
These visuals enhance understanding of the model’s performance and bias tendencies:
- Cross-Validation Performance (Table): Summarizes the model evaluation metrics (accuracy, precision, recall, F-score) across the four folds, showing consistency and reliability.

  | Metric    | Model 1 | Model 2 | Model 3 | Model 4 | Average Value | Standard Deviation |
  |-----------|---------|---------|---------|---------|---------------|--------------------|
  | Accuracy  | 0.8816  | 0.8512  | 0.8688  | 0.8624  | 0.8660        | 0.0110             |
  | Precision | 0.8692  | 0.8904  | 0.8775  | 0.8558  | 0.8732        | 0.0126             |
  | Recall    | 0.8971  | 0.8100  | 0.8548  | 0.8669  | 0.8572        | 0.0313             |
  | F-score   | 0.8829  | 0.8483  | 0.8660  | 0.8571  | 0.8636        | 0.0128             |

- Confusion Matrix (Heatmap): Displays the distribution of true positives, true negatives, false positives, and false negatives, indicating a truth bias in the model's classification (a small plotting sketch follows this list).
- Accuracy Comparison by Topic (Bar Chart): Shows accuracy scores for various topics (e.g., Gay Marriage, Immigration), highlighting the model's effectiveness across distinct subjects.
- Moving Average Training Loss and Accuracy (Line Charts): Plots the training accuracy and loss across epochs, reflecting model convergence during fine-tuning.
- CLES Analysis for Linguistic Features (Bar Chart): Visualizes the top linguistic features that differentiate truthful from deceptive statements, as measured by the Common Language Effect Size (CLES).
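A confusion-matrix heatmap of this kind can be reproduced from the per-fold predictions; the sketch below uses placeholder labels rather than the study's actual outputs.

```python
# Minimal sketch: plot a confusion-matrix heatmap from predictions.
# The y_true / y_pred values are placeholders, not study results.
import matplotlib.pyplot as plt
from sklearn.metrics import ConfusionMatrixDisplay

y_true = ["T", "T", "F", "F", "T", "F"]   # gold labels (placeholder)
y_pred = ["T", "T", "T", "F", "T", "F"]   # model outputs (placeholder)

ConfusionMatrixDisplay.from_predictions(y_true, y_pred, labels=["T", "F"], cmap="Blues")
plt.title("Truthful (T) vs. deceptive (F)")
plt.show()
```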
🛠️ Highlighting essential tools and methods:
- Python: Main programming language.
- OpenAI API: For model fine-tuning and evaluation using GPT-3.5-turbo-0163.
- Explainability Analysis: Linguistic feature analysis using the Common Language Effect Size (CLES); a small CLES sketch follows this list.
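For reference, the CLES of a single linguistic feature can be computed from the Mann-Whitney U statistic. The sketch below uses statement length in words purely as an illustrative feature; it is not necessarily one of the features analysed in the report.

```python
# Minimal sketch: CLES for one linguistic feature (word count is only illustrative).
# CLES = P(random truthful value > random deceptive value) = U / (n1 * n2).
import pandas as pd
from scipy.stats import mannwhitneyu

df = pd.read_csv("opinions.csv")                        # assumed schema: statement, label
df["word_count"] = df["statement"].str.split().str.len()

truthful = df.loc[df["label"] == "T", "word_count"]
deceptive = df.loc[df["label"] == "F", "word_count"]

u_stat, _ = mannwhitneyu(truthful, deceptive, alternative="two-sided")
cles = u_stat / (len(truthful) * len(deceptive))
print(f"CLES (truthful vs. deceptive): {cles:.3f}")
```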
Clone the repository and install dependencies to replicate the study:
```bash
# Clone the repository
git clone https://github.com/username/GPT-Truth.git

# Navigate to the project directory
cd GPT-Truth

# Install dependencies
pip install -r requirements.txt
```
The repository includes the following files:
- GPT_3_5_Opinioni_Scenario1.ipynb: Jupyter notebook for the full workflow, from data loading to model fine-tuning and evaluation.
- Report.pdf: Detailed report of the study's methodology and findings.
- Presentation.pdf: Summary presentation of key points and visual results.
To run the project, open GPT_3_5_Opinioni_Scenario1.ipynb in Jupyter Notebook and execute the cells sequentially.
Contributions are welcome! Please refer to the contributing guidelines for more details.