[QUESTION] Protocol of adding a new model (Stella_en_<*>_v5 family) implementation with Candle #2525
Comments
Adding popular models such as these is certainly fine (as long as the license permits it; e.g. yolo-v8 directly in candle-transformers or llama 3.2 vision would be problematic). On best practices:
Thanks a ton for the elaborate response @LaurentMazare, I'll take a stab at it after verifying the license.
Am I missing something? The smaller stella_400M docs say it's based on the same model, but the config lists different activation functions, and swapping the model address produces:
I don't think you are missing anything .. I was stumped there too. The difference between the 1.5B variant and the 400M variant is not just a config change. I've not had the chance to work on the 400M variant yet for Candle transformers. Candle transformers supports only the 1.5B variant for now.
Ah, thanks for the heads up! If you don't mind, what was your process for the 1.5B? The 400M has a modeling.py that I've got converted and outputting embeddings, but they don't match the sample and they're way off.
Yes, that's what I did .. picked up the base Qwen model Candle implementation, removed everything and kept adding one Just noting down a few misses I had during the phase you are in that resulted in output mismatch:
Some of the above may be obvious .. these are just mistakes I made. By the way .. a PR with 400M would be welcome. Would very much love to hear about the progress you make.
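When chasing an output mismatch like the one discussed above, it can help to quantify how far the Candle embeddings are from the reference sample instead of eyeballing them. The following is a minimal, dependency-free sketch, not code from the thread or from Candle; the function names `cosine_similarity` and `max_abs_diff` are hypothetical helpers, and the embeddings are assumed to have been copied out of the tensors into plain `f32` slices.

```rust
/// Cosine similarity between two equally sized vectors.
/// (Hypothetical helper, not part of Candle.)
fn cosine_similarity(a: &[f32], b: &[f32]) -> f32 {
    assert_eq!(a.len(), b.len(), "vectors must have the same length");
    let dot: f32 = a.iter().zip(b).map(|(x, y)| x * y).sum();
    let norm_a = a.iter().map(|x| x * x).sum::<f32>().sqrt();
    let norm_b = b.iter().map(|x| x * x).sum::<f32>().sqrt();
    dot / (norm_a * norm_b)
}

/// Largest element-wise absolute difference between two vectors.
/// (Hypothetical helper, not part of Candle.)
fn max_abs_diff(a: &[f32], b: &[f32]) -> f32 {
    assert_eq!(a.len(), b.len(), "vectors must have the same length");
    a.iter().zip(b).map(|(x, y)| (x - y).abs()).fold(0.0, f32::max)
}
```

Running the same comparison on intermediate activations (after the embedding layer, after each block, after pooling) is one way to narrow down where the two implementations first diverge.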
The tokenizer padding_side comes from tokenization_qwen.py, right? But the 400M doesn't use that as far as I can tell. It seems to be based more on gte-large-en-v1.5.
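To make the padding_side question concrete, here is a small sketch of what left versus right padding does to a batch entry and its attention mask. It is an illustration only: `PaddingSide`, `pad`, and `pad_id` are hypothetical names, and in a real Candle example the padding would normally be configured on the tokenizer rather than done by hand.

```rust
/// Which end of the sequence receives the padding tokens.
/// (Hypothetical type for illustration.)
#[derive(Clone, Copy, Debug, PartialEq)]
enum PaddingSide {
    Left,
    Right,
}

/// Pads `ids` to `max_len` with `pad_id` and returns
/// (padded token ids, attention mask), where mask entries are 1 for real
/// tokens and 0 for padding.
fn pad(ids: &[u32], max_len: usize, pad_id: u32, side: PaddingSide) -> (Vec<u32>, Vec<u32>) {
    assert!(ids.len() <= max_len, "sequence longer than max_len");
    let n_pad = max_len - ids.len();
    let pads = vec![pad_id; n_pad];
    let ones = vec![1u32; ids.len()];
    let zeros = vec![0u32; n_pad];
    match side {
        PaddingSide::Left => ([pads.as_slice(), ids].concat(), [zeros, ones].concat()),
        PaddingSide::Right => ([ids, pads.as_slice()].concat(), [ones, zeros].concat()),
    }
}
```

If the pooling step or the position indices implicitly assume one side, feeding a batch padded on the other side is one plausible source of embeddings that differ from the reference.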
The tokenizer config actually comes from tokenizer_config and yes they are using Most importantly, they define custom To implement this in candle we'll need to walk through and re-write starting with Check out the linked By the way, for the
Yeah, I got which repo; I just wasn't clear on whether there was an existing architecture reference in candle. I ended up just starting from scratch, and it seems like it needs this custom rope implementation as well. I can get this output, but I'm not clear on how to hook in the MLR layer.
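For readers unfamiliar with rotary position embeddings, the sketch below shows the standard "rotate half" formulation on a single head vector in plain Rust. This is the generic technique, not the 400M model's custom variant mentioned above, and the function name `apply_rope` and the `theta` base parameter are assumptions made for illustration.

```rust
/// Applies a standard rotary position embedding to one head vector `x`
/// at sequence position `pos`. Uses the "rotate half" layout, where
/// element i is paired with element i + dim/2.
/// (Generic sketch; the model's custom rope may differ.)
fn apply_rope(x: &[f32], pos: usize, theta: f32) -> Vec<f32> {
    let dim = x.len();
    assert!(dim % 2 == 0, "head dimension must be even");
    let half = dim / 2;
    let mut out = vec![0.0f32; dim];
    for i in 0..half {
        // Frequency for this pair: theta^(-2i / dim).
        let inv_freq = theta.powf(-2.0 * i as f32 / dim as f32);
        let angle = pos as f32 * inv_freq;
        let (sin, cos) = angle.sin_cos();
        // 2-D rotation of the pair (x[i], x[i + half]) by `angle`.
        out[i] = x[i] * cos - x[i + half] * sin;
        out[i + half] = x[i + half] * cos + x[i] * sin;
    }
    out
}
```

Two quick sanity checks apply to any rope variant: position 0 must leave the vector unchanged, and the rotation must preserve the vector's norm. A mismatch between the half-split layout and an interleaved layout is a common reason for outputs that look plausible but disagree with the reference.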
Below is what I have so far; it can reproduce the repo's example, and I followed your 1.5B example.
Hi,
I have a working implementation of the Stella_en_<*>_v5 family of models, which are among the top-ranking models on the MTEB leaderboard for reranking and retrieval.
It's basically built on top of the candle-transformers::qwen2 implementation with the language modeling head swapped for their pre-trained dense layer.
I was hoping to open a pull request with candle along with an example.
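As a rough illustration of what "swapping the language modeling head for a dense layer" can look like for an embedding model, the sketch below pools the final hidden states, projects them through a dense layer, and L2-normalizes the result. It is plain Rust on `Vec<f32>` rather than Candle tensors, and it rests on assumptions not stated in this thread: that pooling is a mask-aware mean over tokens and that the output is normalized. The names `masked_mean_pool`, `dense`, and `l2_normalize` are hypothetical.

```rust
/// Mean of the token vectors whose attention-mask entry is 1.
/// `hidden` is [seq_len][hidden_dim]. (Assumed pooling strategy.)
fn masked_mean_pool(hidden: &[Vec<f32>], mask: &[u32]) -> Vec<f32> {
    assert!(!hidden.is_empty(), "need at least one token");
    assert_eq!(hidden.len(), mask.len(), "one mask entry per token");
    let dim = hidden[0].len();
    let mut sum = vec![0.0f32; dim];
    let mut count = 0.0f32;
    for (tok, &m) in hidden.iter().zip(mask) {
        if m == 1 {
            for (s, v) in sum.iter_mut().zip(tok) {
                *s += v;
            }
            count += 1.0;
        }
    }
    // Guard against an all-padding row.
    sum.iter().map(|s| s / count.max(1.0)).collect()
}

/// Dense layer y = W x + b, with W stored as [out_dim][in_dim].
fn dense(weight: &[Vec<f32>], bias: &[f32], x: &[f32]) -> Vec<f32> {
    assert_eq!(weight.len(), bias.len(), "one bias per output row");
    weight
        .iter()
        .zip(bias)
        .map(|(row, b)| row.iter().zip(x).map(|(w, v)| w * v).sum::<f32>() + b)
        .collect()
}

/// Scales a vector to unit L2 norm.
fn l2_normalize(v: &[f32]) -> Vec<f32> {
    let norm = v.iter().map(|x| x * x).sum::<f32>().sqrt().max(1e-12);
    v.iter().map(|x| x / norm).collect()
}
```

Chaining the three, `l2_normalize(&dense(&w, &b, &masked_mean_pool(&hidden, &mask)))`, yields one embedding per input sequence; whether the model's reference code applies the steps in exactly this order should be checked against the model repository.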
Questions:
candle_transformers? Something akin to "How to add a model to transformers".
embedding models: are there any candle standard API implementations required? My implementation just spits out the logits from the forward pass.

Looking for some guidance. Thanks in advance.
CC: @EricLBuehler @LaurentMazare