Add test for cold start #9

aarsilv · 2024-03-08T23:20:30Z

Eppo Internal: 🎟️ **Ticket:** FF-1599 - Add test to Bandits Beta SDK for cold start (no coefficients) vs. uninitialized (no entry at all)

While generating the bandit RAC (see this comment in Eppo PR #8785), we realized there are two different situations in which a bandit would select a random action:

Uninitialized - We have no knowledge of the bandit (i.e., the bandit key is unrecognized)
Cold Start - We know about the bandit (e.g., it's created) but haven't trained it (i.e., no model parameters yet)
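
To make the distinction concrete, here is a minimal sketch of how the two situations could be told apart. Every name in it (`BanditStateSketch`, `BanditEntry`, `classify`, the bandit keys) is hypothetical and chosen for illustration; it is not the SDK's actual data model, and it assumes "no coefficients" is represented as an empty map.

```java
import java.util.Collections;
import java.util.HashMap;
import java.util.Map;

// Hypothetical sketch; names and structure are assumptions, not the SDK's API.
class BanditStateSketch {
    enum State { UNINITIALIZED, COLD_START, TRAINED }

    // A known bandit; an empty coefficient map means it has not been trained yet.
    static final class BanditEntry {
        final Map<String, Double> coefficients;

        BanditEntry(Map<String, Double> coefficients) {
            this.coefficients = coefficients;
        }
    }

    static State classify(Map<String, BanditEntry> knownBandits, String banditKey) {
        BanditEntry entry = knownBandits.get(banditKey);
        if (entry == null) {
            return State.UNINITIALIZED; // no entry at all: the key is unrecognized
        }
        if (entry.coefficients.isEmpty()) {
            return State.COLD_START; // entry exists, but there are no model parameters
        }
        return State.TRAINED;
    }

    public static void main(String[] args) {
        Map<String, BanditEntry> knownBandits = new HashMap<>();
        knownBandits.put("cold-start-bandit",
            new BanditEntry(Collections.<String, Double>emptyMap()));

        System.out.println(classify(knownBandits, "this-bandit-does-not-exist"));
        System.out.println(classify(knownBandits, "cold-start-bandit"));
    }
}
```

In both of the first two states the only sensible fallback is a random action; what differs is whether the bandit is known at all, and so what can be logged about it.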

If our UI/UX works as expected, the first one (an uninitialized bandit) shouldn't happen. However, we still want the SDK to handle that case should it arise.

This PR renames the test of an uninitialized bandit to better reflect what is happening, and adds a new test for a cold-start bandit.

aarsilv · 2024-03-08T23:20:58Z

src/main/java/com/eppo/sdk/helpers/VariationHelper.java

@@ -23,6 +23,6 @@ static public Variation selectVariation(String inputKey, int subjectShards, List
    }

    static public double variationProbability(Variation variation, int subjectShards) {
-        return (double)(variation.getShardRange().end - variation.getShardRange().start + 1) / subjectShards;


This math was wrong because the end of the range is exclusive.

This was caught by the unit test finding 0.3335 probability -- yay unit tests!
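
For anyone skimming, a small standalone sketch of the off-by-one. The class and method names are made up for illustration, and the shard values are illustrative rather than taken from the test fixture. The only thing it relies on from this PR is that the range is half-open, i.e. `end` is exclusive.

```java
// Hypothetical sketch of the arithmetic; not the SDK's VariationHelper.
class ShardProbabilitySketch {
    // Half-open range [start, endExclusive): the shard count is simply end - start.
    static double probability(int start, int endExclusive, int subjectShards) {
        return (double) (endExclusive - start) / subjectShards;
    }

    // The removed formula: treats `end` as inclusive, so it counts one shard too many.
    static double probabilityInclusiveEnd(int start, int end, int subjectShards) {
        return (double) (end - start + 1) / subjectShards;
    }

    public static void main(String[] args) {
        int subjectShards = 10000; // illustrative value

        // A range covering 3333 of 10000 shards.
        System.out.println(probability(0, 3333, subjectShards));
        System.out.println(probabilityInclusiveEnd(0, 3333, subjectShards));

        // Adjacent half-open ranges should sum to exactly 1.
        double total = probability(0, 2000, subjectShards)
            + probability(2000, 10000, subjectShards);
        System.out.println(total);
    }
}
```

With the inclusive-end formula every variation is over-counted by one shard, so the probabilities across adjacent ranges add up to slightly more than 1.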

aarsilv · 2024-03-08T23:23:14Z

src/test/java/com/eppo/sdk/EppoClientTest.java

@@ -287,7 +287,7 @@ public void testBanditColdStartAction() {

    // Attempt to get a bandit assignment
    Optional<String> stringAssignment = EppoClient.getInstance().getStringAssignment(
-      "subject2",
+      "subject1",


GitHub's diff is a bit confusing here because the old "ColdStart" test was actually covering the "Uninitialized" case. That test has been renamed and adjusted, and this one is newly added. It's probably easiest to read only the green (added) lines and ignore the red ones.

aarsilv · 2024-03-08T23:23:47Z

src/test/java/com/eppo/sdk/EppoClientTest.java

+    assertEquals(0.3333, capturedBanditLog.actionProbability, 0.0002);
+    assertEquals("falcon cold start", capturedBanditLog.modelVersion);


☝️ key part of this test: action was selected with 1/3 probability and logged as a cold start
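
A rough sketch of where the 1/3 comes from, assuming the cold-start fallback picks uniformly among three available actions. The class, method, and action names are hypothetical; the expected value and tolerance mirror the `assertEquals(0.3333, ..., 0.0002)` call above.

```java
import java.util.Arrays;
import java.util.List;

// Hypothetical sketch; assumes a uniform choice among the available actions.
class ColdStartProbabilitySketch {
    // With no trained coefficients, every action is treated as equally likely.
    static double uniformActionProbability(List<String> actions) {
        if (actions.isEmpty()) {
            throw new IllegalArgumentException("at least one action is required");
        }
        return 1.0 / actions.size();
    }

    public static void main(String[] args) {
        List<String> actions = Arrays.asList("action-a", "action-b", "action-c");
        double probability = uniformActionProbability(actions);

        // Tolerance-based comparison, in the same spirit as the test's delta.
        boolean withinTolerance = Math.abs(0.3333 - probability) <= 0.0002;
        System.out.println(probability + " within tolerance: " + withinTolerance);
    }
}
```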

aarsilv · 2024-03-08T23:24:11Z

src/test/java/com/eppo/sdk/EppoClientTest.java

+  }
+
+  @Test
+  public void testBanditUninitializedAction() {


This is the old "ColdStart" test renamed.

aarsilv · 2024-03-08T23:24:29Z

src/test/java/com/eppo/sdk/EppoClientTest.java

+    verify(mockAssignmentLogger, times(1)).logAssignment(assignmentLogCaptor.capture());
+    AssignmentLogData capturedAssignmentLog = assignmentLogCaptor.getValue();
+    assertEquals("uninitialized-bandit-experiment-bandit", capturedAssignmentLog.experiment);
+    assertEquals("uninitialized-bandit-experiment", capturedAssignmentLog.featureFlag);


Renamed the bandit key to make it very obvious what is happening.

leoromanovsky

Approving to unblock (I'm on mobile). Merge if you feel confident.

giorgiomartini0 · 2024-03-09T00:18:35Z

src/test/resources/bandits/rac-experiments-bandits-beta.json

+            {
+              "name": "bandit",
+              "value": "this-bandit-does-not-exist",
+              "typedValue": "this-bandit-does-not-exist",
+              "shardRange": {
+                "start": 2000,
+                "end": 10000
+              },


I'm a bit confused about what is being tested with this. IIUC, it's "a flag that will contain a bandit has been created, but the bandit itself hasn't been created". That feels like a weird edge case, especially with our creation flow.

A more important case is when the SDK attempts to get a bandit action for which no bandit exists, the flag that contains it doesn't exist, etc. It's just not in the RAC at all. Are we testing that case? (Please let me know if this question doesn't make sense – I might be misunderstanding the whole architecture.)

aarsilv · Mar 12, 2024

We had that test already. I've been calling it an "uninitialized bandit". (source)

The newly added "cold start" test makes sure we handle the more likely case: a user creates a bandit in our UI, so the bandit exists, but it hasn't been trained yet.

explicit test for cold start

8da6eba

aarsilv requested review from leoromanovsky and giorgiomartini0 March 8, 2024 23:25

aarsilv assigned giorgiomartini0 Mar 8, 2024

aarsilv added 2 commits March 8, 2024 16:27

upgrade gcloud to appease linter

7fb1e2d

update deserializer test

4054d8c

leoromanovsky approved these changes Mar 8, 2024

giorgiomartini0 reviewed Mar 9, 2024

aarsilv merged commit f501161 into main Mar 12, 2024
1 check passed

aarsilv deleted the aaron/ff-1599/test-for-cold-start branch March 12, 2024 21:03
