DeepSpeed is a deep learning optimization library that makes distributed training and inference easy, efficient, and effective.
Updated Jan 4, 2025 - Python
Run Mixtral-8x7B models in Colab or on consumer desktops
Decentralized deep learning in PyTorch. Built to train models on hardware contributed by thousands of volunteers across the world.
Mixture-of-Experts for Large Vision-Language Models
Optimizing inference proxy for LLMs
PyTorch Re-Implementation of "The Sparsely-Gated Mixture-of-Experts Layer" by Noam Shazeer et al. https://arxiv.org/abs/1701.06538
Codebase for Aria - an Open Multimodal Native MoE
⛷️ LLaMA-MoE: Building Mixture-of-Experts from LLaMA with Continual Pre-training (EMNLP 2024)
Tutel MoE: An Optimized Mixture-of-Experts Implementation
Surrogate Modeling Toolbox
A TensorFlow Keras implementation of "Modeling Task Relationships in Multi-task Learning with Multi-gate Mixture-of-Experts" (KDD 2018)
A PyTorch implementation of Sparsely-Gated Mixture of Experts, for massively increasing the parameter count of language models
From-scratch implementation of a sparse mixture of experts language model inspired by Andrej Karpathy's makemore :)
Chinese Mixtral Mixture-of-Experts large language models (Chinese Mixtral MoE LLMs)
A library for easily merging multiple LLM experts and efficiently training the merged LLM.
GMoE could be the next backbone model for many kinds of generalization tasks.
Implementation of ST-MoE, the latest incarnation of MoE after years of research at Google Brain, in PyTorch
Implementation of Soft MoE, proposed by Google Brain's Vision team, in PyTorch
Inferflow is an efficient and highly configurable inference engine for large language models (LLMs).
MoH: Multi-Head Attention as Mixture-of-Head Attention