globus-labs / self-adaptive-llms Public


Per-Prompt Adaptive Parameter and Resource Allocation



Scale is the Name of the Game:

Need to cut the power usage of large models by a factor of 1000
Not every prompt needs the same treatment

What:

Fine-grained selection of parameters based on the difficulty of the prompt
Allocation of resources to each prompt based on parameters, cost, and system state
Adaptive tradeoff navigation at multiple levels (number of experts, number of agents, “thinking time”)
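
As a rough illustration of the goals above, the following is a minimal sketch of per-prompt parameter selection, not the project's implementation. The names `InferenceParams`, `estimate_difficulty`, and `select_params`, the keyword heuristic, and all numeric ranges are assumptions made for the example; a real system would presumably use a learned difficulty predictor and measured system state.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class InferenceParams:
    num_experts: int      # experts activated per prompt (hypothetical knob)
    num_agents: int       # cooperating agents (hypothetical knob)
    thinking_tokens: int  # budget for intermediate reasoning (hypothetical knob)


def estimate_difficulty(prompt: str) -> float:
    """Toy difficulty score in [0, 1]; a stand-in for a learned predictor."""
    words = prompt.split()
    length_score = min(len(words) / 50.0, 1.0)
    hard_markers = {"prove", "derive", "optimize", "debug", "why"}
    has_marker = any(w.lower().strip("?.,") in hard_markers for w in words)
    return 0.5 * length_score + 0.5 * (1.0 if has_marker else 0.0)


def select_params(prompt: str, system_load: float) -> InferenceParams:
    """Pick parameters from difficulty, then shrink them when the system is busy.

    system_load is assumed to lie in [0, 1], where 1 means fully saturated.
    """
    difficulty = estimate_difficulty(prompt)
    headroom = 1.0 - max(0.0, min(system_load, 1.0))
    # A saturated system halves the budget; an idle one leaves it unchanged.
    budget = difficulty * (0.5 + 0.5 * headroom)
    return InferenceParams(
        num_experts=1 + round(budget * 7),     # 1 to 8 experts
        num_agents=1 + round(budget * 3),      # 1 to 4 agents
        thinking_tokens=round(budget * 2048),  # 0 to 2048 tokens
    )
```

Calling `select_params("Hi", system_load=0.0)` yields a small allocation, while a prompt such as "Prove that the algorithm terminates" receives more experts, agents, and thinking tokens, and the same prompt under `system_load=1.0` receives fewer. Folding load into a single scalar budget keeps the tradeoff easy to reason about; separate budgets per level would be an equally plausible design.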

Roadmap:

Variable scaling of the number of experts based on the prompt
Integrating a per-prompt allocation system
Semantic caching of parameters based on usage patterns; cache-aware routing
Prompt- and agent-based feedback to the routing network
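
The semantic caching item in the roadmap could look roughly like the sketch below, which reuses previously chosen parameters when a new prompt is similar enough to an earlier one. Everything here is an assumption for illustration: the class `SemanticParamCache`, the bag-of-words `embed` function (a real system would likely use a sentence encoder), and the 0.8 similarity threshold.

```python
import math
from collections import Counter


def embed(text: str) -> Counter:
    """Toy bag-of-words embedding; a placeholder for a real sentence encoder."""
    return Counter(text.lower().split())


def cosine(a: Counter, b: Counter) -> float:
    """Cosine similarity between two sparse term-count vectors."""
    dot = sum(a[t] * b[t] for t in a)
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


class SemanticParamCache:
    """Maps prompts to previously chosen parameters via similarity lookup."""

    def __init__(self, threshold: float = 0.8):
        self.threshold = threshold  # minimum similarity that counts as a hit
        self.entries = []           # list of (embedding, params) pairs

    def put(self, prompt: str, params: dict) -> None:
        self.entries.append((embed(prompt), params))

    def get(self, prompt: str):
        """Return the params of the most similar cached prompt, or None on a miss."""
        query = embed(prompt)
        best_score, best_params = 0.0, None
        for vec, params in self.entries:
            score = cosine(query, vec)
            if score > best_score:
                best_score, best_params = score, params
        return best_params if best_score >= self.threshold else None
```

A router would consult `get` first and fall back to full parameter selection on a miss, storing the result with `put`. The linear scan is only suitable for small caches; an approximate nearest-neighbor index would be the natural replacement at scale.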

Members: Arham, Matt, Alok, Valerie, Hai, Haochen
