[router] Improve HAR routing in Router #1414

Open
wants to merge 1 commit into
base: main

gaojieliu
Contributor

@gaojieliu gaojieliu commented Dec 20, 2024

Update the Venice Router HAR least-loaded algorithm to take group latency into account, for the
following reasons:
a. When the total QPS is relatively low, load balancing based on pending request count doesn't
work well because the pending count is 0 most of the time. For example, if the average latency of
one group is 10 ms and the group QPS is 10, there are pending requests only about 10-20% of the
time, so even if another group's latency is much lower (1-2 ms), the faster group won't be
preferred. With this change, the latency-based LB algorithm kicks in in that case and prefers the
faster group whenever the pending count is equal among all the groups.
b. Completely getting rid of pending-request-count-based LB would cause another issue: the
relatively slower group would receive almost no requests.
c. A combination of pending request count and latency selects the faster groups when all groups
are busy or idle, while still sending some requests to a slower group when it is idle. This helps
bring the slower group back into the rotation once it has recovered from the slowness.
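
For reviewers, here is a minimal sketch of the selection rule described in (a) and (c). It is not the code in this PR: `GroupSelector`, `GroupStats`, `selectGroup`, and `avgPending` are hypothetical names, and treating latency purely as a tie-breaker on pending count is an assumption drawn from the description above. The `avgPending` helper applies Little's law (average in-flight requests = arrival rate x latency) to the example in (a): 10 QPS at 10 ms gives 0.1 pending requests on average, which is why the pending count is usually 0.

```java
import java.util.List;

/** Illustrative only: hypothetical names, not the actual Venice Router classes. */
final class GroupSelector {

  /** Hypothetical per-group snapshot of load and latency. */
  static final class GroupStats {
    final int groupId;
    final int pendingRequests;
    final double avgLatencyMs;

    GroupStats(int groupId, int pendingRequests, double avgLatencyMs) {
      this.groupId = groupId;
      this.pendingRequests = pendingRequests;
      this.avgLatencyMs = avgLatencyMs;
    }
  }

  /** Little's law: average number of in-flight requests = QPS * latency in seconds. */
  static double avgPending(double qps, double avgLatencyMs) {
    return qps * avgLatencyMs / 1000.0;
  }

  /**
   * Assumed rule: pick the group with the fewest pending requests and,
   * when pending counts are equal, prefer the lower average latency.
   */
  static int selectGroup(List<GroupStats> groups) {
    if (groups.isEmpty()) {
      throw new IllegalArgumentException("no groups to select from");
    }
    GroupStats best = groups.get(0);
    for (GroupStats g : groups.subList(1, groups.size())) {
      boolean fewerPending = g.pendingRequests < best.pendingRequests;
      boolean tieButFaster = g.pendingRequests == best.pendingRequests
          && g.avgLatencyMs < best.avgLatencyMs;
      if (fewerPending || tieButFaster) {
        best = g;
      }
    }
    return best.groupId;
  }
}
```

Under this assumed rule, two idle groups resolve to the faster one, while a slower group that is idle still wins over a faster group that has requests in flight, so it keeps receiving some traffic.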

How was this PR tested?

CI

Does this PR introduce any user-facing changes?

  • No. You can skip the rest of this section.
  • Yes. Make sure to explain your proposed changes and call out the behavior change.

@gaojieliu gaojieliu force-pushed the Router_HAR_enhancement branch from d82217c to f84c662 on January 2, 2025 19:48
@gaojieliu gaojieliu changed the title from "[router][server] Fixed the conn metric in server and improve HAR routing in Router" to "[router] Improve HAR routing in Router" on Jan 2, 2025