Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

FSDP-QLoRA doc updates for TRL integration #1471

Open
wants to merge 1 commit into
base: main
Choose a base branch
from

Conversation

blbadger
Copy link

@blbadger blbadger commented Jan 8, 2025

Small documentation changes to maintain compatibility as the SFTTrainer class received updates as of TRL v0.13.0:

  1. max_seq_length and dataset_text_field moved to become SFTConfig class arguments from the SFTTrainer, causing errors when calling the trainer if these arguments are retained.
  2. The tokenizer argument is being deprecated in favor of the more general processing_class, which can be a tokenizer.

The doc currently does not define the training_arguments arg, but we can assume that this is a an instance of the SFTConfig class in which case max_seq_length and dataset_text_field would be defined there. I have removed these args from the SFTTrainer instantiation because of this, although these arguments could be defined in a config if that is preferable.

Swapping the tokenizer for processing_class deals with the second change. This modification is optional as the tokenizer kwarg is deprecated and swapped automatically, but will probably be useful going forward.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

1 participant