Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Add test for cold start #9

Merged
merged 3 commits into from
Mar 12, 2024
Merged

Add test for cold start #9

merged 3 commits into from
Mar 12, 2024

Conversation

aarsilv
Copy link
Collaborator

@aarsilv aarsilv commented Mar 8, 2024

Eppo Internal: 🎟️ **Ticket: ** FF-1599 - Add test to Bandits Beta SDK for cold start (no coefficients) vs uninitialized (no entry at all)

When working on generating the bandit RAC (see this comment in Eppo PR #8785) we realized there are two different situations where a bandit would select a random action:

  1. Uninitialized - We have no knowledge of the bandit (i.e., the bandit key is unrecognized)
  2. Cold Start - We know about the bandit (e.g., it's created) but haven't trained it (i.e., no model parameters yet)

If our UI/UX works as expected, the first one--an uninitialized bandit--shouldn't happen. However, we still want the SDK to handle the case should it arrive.

This PR renamed the test of an uninitialized bandit to better reflect what is happening, and adds a new test for a cold-start bandit.

@@ -23,6 +23,6 @@ static public Variation selectVariation(String inputKey, int subjectShards, List
}

static public double variationProbability(Variation variation, int subjectShards) {
return (double)(variation.getShardRange().end - variation.getShardRange().start + 1) / subjectShards;
Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This math was wrong because the end of the range is exclusive.

This was caught by the unit test finding 0.3335 probability -- yay unit tests!

@@ -287,7 +287,7 @@ public void testBanditColdStartAction() {

// Attempt to get a bandit assignment
Optional<String> stringAssignment = EppoClient.getInstance().getStringAssignment(
"subject2",
"subject1",
Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

GitHub diff is a bit confused as the old "ColdStart" test was actually for the "Uninitialized" case. So that one has been adjusted, and this one added. Probably easiest to just look at the green and not look at the red part of this.

Comment on lines +323 to +324
assertEquals(0.3333, capturedBanditLog.actionProbability, 0.0002);
assertEquals("falcon cold start", capturedBanditLog.modelVersion);
Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

☝️ key part of this test: action was selected with 1/3 probability and logged as a cold start

}

@Test
public void testBanditUninitializedAction() {
Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This is the old "ColdStart" test renamed.

verify(mockAssignmentLogger, times(1)).logAssignment(assignmentLogCaptor.capture());
AssignmentLogData capturedAssignmentLog = assignmentLogCaptor.getValue();
assertEquals("uninitialized-bandit-experiment-bandit", capturedAssignmentLog.experiment);
assertEquals("uninitialized-bandit-experiment", capturedAssignmentLog.featureFlag);
Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

renamed the bandit key to be very obvious what is happening

Copy link
Member

@leoromanovsky leoromanovsky left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Approving to unblock, I'm on mobile, if you feel confident merging.

Comment on lines +65 to +72
{
"name": "bandit",
"value": "this-bandit-does-not-exist",
"typedValue": "this-bandit-does-not-exist",
"shardRange": {
"start": 2000,
"end": 10000
},

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'm a bit confused what is being tested using this. IIUC, it's "a flag that will contain a bandit has been created, but the bandit itself hasn't been created". That feels like a weird edge case, especially with our creation flow.

A more important case is when the SDK attempts to get a bandit action for which no bandit exists, the flag that contains it doesn't exist, etc. It's just not in the RAC at all. Are we testing that case? (Please let me know if this question doesn't make sense – I might be misunderstanding the whole architecture.)

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

We had that test already--I've been calling it an "uninitialized bandit". (source)

This newly added "cold start" test is making sure we handle the more likely case that a user creates a bandit in our UI so the bandit exists, but it hasn't been trained yet.

@aarsilv aarsilv merged commit f501161 into main Mar 12, 2024
1 check passed
@aarsilv aarsilv deleted the aaron/ff-1599/test-for-cold-start branch March 12, 2024 21:03
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

3 participants