
Can it do LoRA SFT? #24

Open
ReverseSystem001 opened this issue Nov 6, 2023 · 2 comments

Comments

@ReverseSystem001

Most people are limited by their GPU hardware, so for them LoRA is the only practical way to fine-tune. Can this repo do LoRA SFT?

@CoinCheung
Owner

Hi, thanks for paying attention to this!

This repo is currently designed for full-parameter finetuning, whereas LoRA freezes most of the parameters. Since the two approaches contradict each other, this repo does not support LoRA at the moment.
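For reference, the contradiction is easy to see in code: LoRA freezes the pretrained weight and trains only a small low-rank update on the side. A minimal PyTorch sketch (the `LoRALinear` class and its hyperparameters are illustrative, not something from this repo):

```python
import torch
import torch.nn as nn

class LoRALinear(nn.Module):
    """Wraps a frozen linear layer with trainable low-rank adapters."""
    def __init__(self, base: nn.Linear, r: int = 8, alpha: int = 16):
        super().__init__()
        self.base = base
        self.base.weight.requires_grad_(False)   # freeze the pretrained weight
        if self.base.bias is not None:
            self.base.bias.requires_grad_(False)
        # only these two small matrices receive gradients
        self.lora_A = nn.Parameter(torch.randn(r, base.in_features) * 0.01)
        self.lora_B = nn.Parameter(torch.zeros(base.out_features, r))
        self.scaling = alpha / r

    def forward(self, x):
        return self.base(x) + (x @ self.lora_A.T @ self.lora_B.T) * self.scaling
```

Only `lora_A` and `lora_B` are updated, which is the opposite of the full-parameter training that this repo's pipeline is built around.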

This repo is based on the pipeline method, which lets you train your model with DP + PP (Megatron-LM is DP + PP + TP, the so-called 3D layout). This is faster and requires less memory than ZeRO-based methods when you do not have very many GPUs (100+). You can train a 7B or 13B model on a server with 8 GPUs (24 GB each), which I believe many companies can afford.
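For context, this is roughly how a DP + PP run is wired up with DeepSpeed's pipeline engine. It is a hedged sketch, not this repo's actual training script; the toy layer list, `ds_config.json`, and `train_iter` are placeholders:

```python
import deepspeed
import torch.nn as nn
from deepspeed.pipe import PipelineModule

# Toy layer list standing in for embedding + transformer blocks + lm head.
# In a real run these would be the model's actual layers.
layers = [nn.Linear(512, 512) for _ in range(8)]

model = PipelineModule(
    layers=layers,
    num_stages=4,              # PP degree; DP degree = world_size // num_stages
    loss_fn=nn.MSELoss(),      # placeholder loss for the toy layers
)

# ds_config.json would be a normal DeepSpeed config (batch size, fp16, optimizer, ...).
engine, _, _, _ = deepspeed.initialize(
    model=model,
    model_parameters=model.parameters(),
    config="ds_config.json",
)

# One pipeline step pulls micro-batches from the iterator and runs
# forward/backward/optimizer across all stages.
loss = engine.train_batch(data_iter=train_iter)  # train_iter: placeholder data iterator
```

Each of the `num_stages` pipeline stages holds only its slice of the layers, and the remaining GPUs replicate the pipeline for data parallelism, which is why 7B/13B models fit on 8x24 GB cards without ZeRO.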

@ReverseSystem001
Author

ReverseSystem001 commented Nov 7, 2023 via email
