use token ids in vllm #153

AlexPiche · 2024-12-21T02:08:49Z

Use the token_ids produced by vLLM

Simplify the Trainable class, e.g. no need to remove leading white space, etc.
Process the log probs and record the produced token ids
Remove all the token verification from rl_orchestrator.py
Modify finetune/data.py to overwrite the Hugging Face-generated input_ids and labels when these are provided in the entry.
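The last item can be sketched roughly as follows. This is a minimal illustration, not the code in finetune/data.py: the function name `preprocess_entry`, the `text` field, and the toy tokenizer are hypothetical. Only the `input_ids` and `labels` keys on the entry come from the description above. The motivation is that decoding generated text and tokenizing it again is not guaranteed to reproduce the token ids the sampler emitted, so ids recorded at generation time take precedence when they are present and non-empty.

```python
from typing import Callable, Dict, List


def toy_tokenize(text: str) -> List[int]:
    """Stand-in tokenizer for illustration only: one id per character."""
    return [ord(ch) for ch in text]


def preprocess_entry(entry: Dict, tokenize: Callable[[str], List[int]]) -> Dict:
    """Build training features, preferring ids recorded at generation time.

    `entry["text"]` is a hypothetical field name; `entry["input_ids"]` and
    `entry["labels"]` are the optional overrides mentioned in the PR text.
    """
    # Default path: re-tokenize the text, as a tokenizer-based pipeline would.
    input_ids = tokenize(entry["text"])
    labels = list(input_ids)

    # Override path: non-empty recorded ids replace the re-tokenized ones.
    if entry.get("input_ids"):
        input_ids = list(entry["input_ids"])
        # Fall back to the ids themselves when no labels were recorded.
        labels = list(entry.get("labels") or input_ids)

    if len(labels) != len(input_ids):
        raise ValueError("labels and input_ids must have the same length")

    return {
        "input_ids": input_ids,
        "labels": labels,
        "attention_mask": [1] * len(input_ids),
    }


# Entry carrying recorded ids: they overwrite the tokenizer output.
with_ids = preprocess_entry(
    {"text": "ab", "input_ids": [5, 6, 7], "labels": [-100, 6, 7]}, toy_tokenize
)
# Entry without recorded ids: falls back to re-tokenizing the text.
without_ids = preprocess_entry({"text": "ab"}, toy_tokenize)
```

An empty `input_ids` list is treated the same as a missing one, so such entries fall back to re-tokenization rather than producing an empty training example.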

Replications

DeepSeek RFT

Before

After

Gemma2 2b RFT

train_mean_reward does not improve as much as before, but test_mean_reward behaves similarly.

Before

After

tapeagents/llms.py

rizar

Good stuff! LGTM! I'd try to simplify the code in finetune.data before the merge, though. I feel like the way it is written is a bit of a trap that can give us trouble later on.

examples/rl_gsm8k/orchestrate_rl.py

tapeagents/finetune/data.py

tapeagents/llms.py

AlexPiche added 28 commits December 21, 2024 02:08

use token ids

348ca2f

skip empty tokens

ce463f7

fix attention_mask

7bc2e83

fix try except

b591885

use complete

582a0e7

revert to chat complete

51107e9

rm rm_leading_white_space

06876e2

no max batch tokens

210a933

fix typo

03325c8

do not encode empty token

cda2e18

128_max_num_batched_tokens

0653aeb

128 max tokens

2ec847f

max num seqs

d58158f

--max-num-seqs: 64

7963a03

max num seqs 128

f4a781d

256 max num tokens

dbc0c0b

easier log prob

f74016d

64 max num seqs

91a1886

no chuncked prefill

aa44c4c

baseline

769dc64

max num seqs 64

51cf53c

clean up

3acd833

max num batched tokens 256

97757b1

clean vllm_args

3808056

add assert and logger info

7b1290b

Merge remote-tracking branch 'origin/main' into vllm_token_ids

7b021f5

check if entry input ids is empty

ed901c2

fix test

1a1ae2c

AlexPiche changed the title ~~[WIP] use token ids in vllm~~ use token ids in vllm Dec 27, 2024

rizar reviewed Jan 1, 2025


tapeagents/llms.py

rizar approved these changes Jan 6, 2025


examples/rl_gsm8k/orchestrate_rl.py (outdated)

tapeagents/finetune/data.py

tapeagents/llms.py

AlexPiche added 2 commits January 6, 2025 15:54

simplify finetune preprocessing code

2dabb13

Merge remote-tracking branch 'origin/main' into vllm_token_ids

c3e39db

AlexPiche merged commit 535afdd into main Jan 6, 2025
5 checks passed
