Before starting, make sure you meet the following requirements:
- Python 3.x installed.
- Operating System: Preferably Linux or macOS.
- Hardware Requirements: Minimum of four NVIDIA RTX A4500 GPUs, totaling approximately 80 GiB of GPU memory.
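To check a machine against the hardware requirement above, a minimal sketch follows. It assumes `nvidia-smi` is on the PATH and reports one memory total in MiB per GPU; the helper names `total_gpu_memory_gib` and `query_gpu_memory_gib` are illustrative and not part of this project.

```python
import shutil
import subprocess


def total_gpu_memory_gib(smi_output: str) -> float:
    """Sum per-GPU memory totals (one MiB value per line) and convert to GiB."""
    values = [float(line) for line in smi_output.splitlines() if line.strip()]
    return sum(values) / 1024.0


def query_gpu_memory_gib() -> float:
    """Ask nvidia-smi for the total memory of every visible GPU."""
    if shutil.which("nvidia-smi") is None:
        raise RuntimeError("nvidia-smi not found; is the NVIDIA driver installed?")
    result = subprocess.run(
        ["nvidia-smi", "--query-gpu=memory.total", "--format=csv,noheader,nounits"],
        check=True,
        capture_output=True,
        text=True,
    )
    return total_gpu_memory_gib(result.stdout)
```

Calling `query_gpu_memory_gib()` on the target machine and comparing the result with the roughly 80 GiB stated above gives a quick sanity check before launching any script.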
Install the required packages using the following command in your terminal:
pip install -r requirements.txt
To ensure the scripts function correctly, you need to update the file paths in each script according to your system setup:
- base_model_name: Identifier for the Hugging Face model.
- new_model_path: Where new, unlearned models are saved.
- pretrained_model_name: Where combined models are saved.
- data_name: Path to the dataset.
- file_path: Where evaluation outputs are logged.

- base_model_name: Identifier for the TinyLLaMA model.
- new_model_path: Where unlearned model checkpoints are stored.
- new_model_retrained: Where retrained TinyLLaMA models are saved.
- file_path: Where evaluation outputs are logged.

- model_path: For saving tokenizer and model configurations.
- output_path: Where distilled and pre-trained models are saved.
- data_name: Name or path of the dataset file.

- dataset_name: Name or path of the dataset file.
- Set OPENAI_API_KEY in your environment variables for accessing OpenAI services.
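Because a missing path or API key tends to surface only after a long run has started, a small pre-flight check can help. The sketch below is an assumption about how such a check might look; `missing_settings` and `has_openai_key` are hypothetical helpers, and the example values are placeholders rather than the project's real settings.

```python
import os


def missing_settings(settings: dict) -> list:
    """Return the names of settings that are empty or unset."""
    return [name for name, value in settings.items() if not value]


def has_openai_key(env=os.environ) -> bool:
    """True when OPENAI_API_KEY is set to a non-empty value."""
    return bool(env.get("OPENAI_API_KEY", "").strip())


# Placeholder values for illustration only; replace them with your own paths.
example_settings = {
    "base_model_name": "your-model-identifier",
    "new_model_path": "path/to/unlearned_models",
    "data_name": "",  # left empty on purpose to show the check
    "file_path": "path/to/eval_log",
}
```

With these placeholder values, `missing_settings(example_settings)` reports `data_name` as unset, and `has_openai_key()` tells you whether the key is visible to the current process.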
Use the provided shell script main_run.sh to run all models and scripts simultaneously. Ensure this script is correctly set up with paths to the Python files and is executable:
chmod +x main_run.sh
./main_run.sh
This script runs each Python script in parallel, directing their outputs to designated log files and ensuring comprehensive execution tracking.
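The contents of main_run.sh are not reproduced here, but the pattern it is described as following (start every script at once, send each one's output to its own log file, then wait for all of them) can be sketched in Python. The function name `run_in_parallel`, the `logs` directory, and the log file naming are assumptions made for illustration.

```python
import subprocess
import sys
from pathlib import Path


def run_in_parallel(scripts, log_dir="logs"):
    """Start every script at once, one log file each, and return exit codes."""
    Path(log_dir).mkdir(parents=True, exist_ok=True)
    running = []
    for script in scripts:
        # One log per script, named after the script file (assumed convention).
        log_file = open(Path(log_dir) / (Path(script).stem + ".log"), "w")
        proc = subprocess.Popen(
            [sys.executable, script],
            stdout=log_file,
            stderr=subprocess.STDOUT,  # keep errors in the same log
        )
        running.append((script, proc, log_file))

    exit_codes = {}
    for script, proc, log_file in running:
        exit_codes[script] = proc.wait()  # block until this script finishes
        log_file.close()
    return exit_codes
```

A non-zero value in the returned dictionary points to the script whose log is worth reading first.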
Execute the models using the shell script:
./main_run.sh
This command initiates parallel processing of the models and logs their output for review.
You can contribute to this project in several ways:
- Reporting Bugs: Submit detailed reports of any issues encountered.
- Suggesting Enhancements: Propose ideas for improvements or new features.
- Making Pull Requests: Follow the guidelines to create and submit pull requests effectively.
Please refer to CONTRIBUTING.md for detailed guidelines on contributing to the project.
This project is licensed under the Apache License, Version 2.0 (January 2004). The full license text is available in the LICENSE file.