This repository has been archived by the owner on Nov 29, 2024. It is now read-only.

Added shapley values to rest scorer endpoint #241

Merged 21 commits on Aug 27, 2021

Contributor

@Rajimut Rajimut commented Aug 5, 2021

Aim: optionally return Shapley values from the /model/score endpoint when the client requests them.

The boolean is called shapley_results
(still under discussion here https://docs.google.com/document/d/1HWArQP7RTqJ7JHt2C_Tw8MAZkBCIfwCnLT9cR2kLdT8/edit#)
The boolean can be true or false.

When true - it provides Shapley values for transformed features
When false - it provides just the predictions

When the boolean is not specified the response contains just the predictions (for backward compatibility)

Sample request

curl -X POST -H "Content-Type: application/json" -d @- "http://localhost:8080/model/score?shapley_results=true" << EOF
{
  "fields": [
    "SEX",
    "EDUCATION",
    "MARRIAGE",
    "AGE",
    "PAY_0",
    "PAY_2",
    "PAY_3",
    "PAY_5",
    "PAY_6",
    "BILL_AMT1",
    "BILL_AMT2",
    "BILL_AMT3",
    "BILL_AMT4",
    "BILL_AMT5",
    "BILL_AMT6",
    "default payment next month"
  ],
  "rows": [
    [
      "200000",
      "5656540",
      "545645",
      "46560",
      "5646540",
      "65650",
      "8970",
      "650",
      "65450",
      "650",
      "323",
      "65654610",
      "54654560",
      "6540",
      "5654640",
      "45"
    ]
  ]
}
EOF

Sample response:

{
    "fields": [
        "DEFAULT_PAYMENT_NEXT_MONTH.0",
        "DEFAULT_PAYMENT_NEXT_MONTH.1"
    ],
    "id": "d4d9f3b2-6684-11eb-b26b-0242ac160002",
    "inputShapleyContributions": {
        "contributions": [
            [
                "0.08655235576142849",
                "0.17755834498323503",
                "0.11726781304049533",
                "0.05944305682214447",
                "0.18670045406517202",
                "0.19617270279719623",
                "0.0235254064670083",
                "0.011334227180886265",
                "0.1293810222042146",
                "0.01399531834032121",
                "0.4860454284545842",
                "-0.017030369151606387",
                "-0.03653463864110455",
                "0.02236090691885419",
                "0.07098252251027308",
                "0.062368919563852994",
                "-0.018671770872832012",
                "0.04777002737189121",
                "0.2882069919194666",
                "-0.11866462772248354",
                "-1.5065382607973277",
                "0.005561879751873049",
                "0.012783472437596623",
                "0.0019412590251431153",
                "-0.020104641954568125",
                "-0.0013931671882688665",
                "-0.007807095850078645",
                "-0.0019267584840470028",
                "-0.004633151264516323"
            ]
        ],
        "fields": [
            "contrib_0_AGE",
            "contrib_12_PAY_5",
            "contrib_14_PAY_AMT1",
            "contrib_15_PAY_AMT2",
            "contrib_16_PAY_AMT3",
            "contrib_17_PAY_AMT4",
            "contrib_18_PAY_AMT5",
            "contrib_19_PAY_AMT6",
            "contrib_1_BILL_AMT1",
            "contrib_28_CVTE:EDUCATION.0",
            "contrib_29_NumToCatTE:PAY_1:PAY_2:PAY_3:PAY_6.0",
            "contrib_2_BILL_AMT2",
            "contrib_34_NumToCatTE:PAY_1:PAY_2:PAY_AMT2.0",
            "contrib_35_NumToCatWoE:PAY_1:PAY_2:PAY_3.0",
            "contrib_36_PAY_6",
            "contrib_3_BILL_AMT3",
            "contrib_5_BILL_AMT5",
            "contrib_6_BILL_AMT6",
            "contrib_7_LIMIT_BAL",
            "contrib_8_PAY_1",
            "contrib_bias",
            "contrib_39_CVCatNumEnc:EDUCATION:MARRIAGE:PAY_3.max",
            "contrib_39_CVCatNumEnc:EDUCATION:MARRIAGE:PAY_AMT2.max",
            "contrib_39_CVCatNumEnc:EDUCATION:MARRIAGE:PAY_AMT4.max",
            "contrib_41_CVCatNumEnc:EDUCATION:MARRIAGE:SEX:PAY_1.count",
            "contrib_41_CVCatNumEnc:EDUCATION:MARRIAGE:SEX:PAY_2.count",
            "contrib_42_CVCatNumEnc:EDUCATION:AGE.mean",
            "contrib_42_CVCatNumEnc:EDUCATION:PAY_1.mean",
            "contrib_43_Freq:EDUCATION:MARRIAGE:SEX"
        ]
    },
    "score": [
        [
            "0.43373027989329327",
            "0.5662697201067067"
        ]
    ]
}
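
For a client consuming this response, pairing each contribution with its field name might look roughly like the sketch below. It uses a three-entry excerpt of the sample response above (values copied verbatim); the helper name `top_contributions` is hypothetical and not part of the API.

```python
import json

# Three-entry excerpt of the sample response above (values copied verbatim).
response_text = """
{
  "inputShapleyContributions": {
    "contributions": [["0.08655235576142849", "0.17755834498323503", "-1.5065382607973277"]],
    "fields": ["contrib_0_AGE", "contrib_12_PAY_5", "contrib_bias"]
  }
}
"""


def top_contributions(response, row=0):
    """Return ([(field, value), ...], bias) for one row.

    Hypothetical helper: feature contributions are sorted by absolute value,
    and the bias term is split off so it is not ranked with the features.
    """
    shap = response["inputShapleyContributions"]
    pairs = [(f, float(v)) for f, v in zip(shap["fields"], shap["contributions"][row])]
    bias = dict(pairs).get("contrib_bias")
    features = [(f, v) for f, v in pairs if f != "contrib_bias"]
    features.sort(key=lambda fv: abs(fv[1]), reverse=True)
    return features, bias


features, bias = top_contributions(json.loads(response_text))
print(features[0][0], bias)
```

The values arrive as strings, so the sketch converts them to floats before ranking.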

@Rajimut Rajimut self-assigned this Aug 5, 2021
@mmalohlava
Member

@Rajimut @nkpng2k I would suggest running the DAI scoring pipeline and outputting Shapley values from it, so you can see the format. The main point: the Shap values do not map 1:1 to input features, because categoricals are handled in an expanded way. Also, do not forget the bias term, which is part of the output (in DAI or in plain XGB).

@mmalohlava
Member

one more point: pls make sure that user can:

  • ask only for predictions (avoid computing Shap values, since that is expensive)
  • ask only for shap values (unusual request)
  • ask for both (pretty usual in our support channel)

@nkpng2k
Contributor

nkpng2k commented Aug 6, 2021

@mmalohlava aren't the columns seen in the sample response (first comment, #241 (comment)) a decent representation of what we should expect? At least for Shapley values on transformed features?

@Rajimut
Contributor Author

Rajimut commented Aug 6, 2021

@mmalohlava -
Checked it here
Looks like we are using a boolean called pred_contribs, which is set to true to get Shapley values alone, without predictions.
But I do see that we only provide contributions without column names
[Screenshot 2021-08-05 at 6 05 27 PM]
Since Shap values are not 1:1 with the input features, won't we need to return the column names as well, as shown here?

@nkpng2k
Contributor

nkpng2k commented Aug 6, 2021

@Rajimut I think this is different... these are shap contributions on the original features. cc @mmalohlava

@Rajimut
Contributor Author

Rajimut commented Aug 6, 2021

Ah, I see, thanks. So that leaves us with what we have: column names listed along with their contributions for transformed features.

@mmalohlava
Member

mmalohlava commented Aug 6, 2021

Here is an example: a simple pipeline for
iris.csv with only the original transformer (meaning no feature engineering, only identity) on a multinomial problem (3 output classes):

Call of scoring pipeline:

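import pandas as pd

# `scorer` is the scorer object from the Python scoring pipeline, created beforehand.
columns = [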
    pd.Series(['4.599999904632568', '4.900000095367432', '4.800000190734863', '4.699999809265137', '5.099999904632568', '5.199999809265137', '4.400000095367432', '5.0', '5.199999809265137', '4.900000095367432', '5.099999904632568', '4.300000190734863', '4.800000190734863', '4.300000190734863', '5.199999809265137'], name='Sepal_Length', dtype='float32'),
    pd.Series(['1.600000023841858', '1.2000000476837158', '1.0', '1.5', '1.2000000476837158', '1.600000023841858', '1.2999999523162842', '1.7000000476837158', '1.0', '3.0', '1.0', '1.2999999523162842', '1.2000000476837158', '1.2999999523162842', '1.100000023841858'], name='Petal_Length', dtype='float32'),
]
df = pd.concat(columns, axis=1)
preds = (scorer.score_batch(df, apply_data_recipes=False,  pred_contribs=True, pred_contribs_original=True))
print(preds)
preds.to_csv("preds.csv")

result:

    contrib_Petal_Length.Iris-setosa  contrib_Petal_Width.Iris-setosa  contrib_Sepal_Length.Iris-setosa  ...  contrib_Sepal_Length.Iris-virginica  contrib_Sepal_Width.Iris-virginica  contrib_bias.Iris-virginica
0                          28.859739                              0.0                         -5.731611  ...                             1.898529                                 0.0                    -3.937054
1                          34.209087                              0.0                         -4.348648  ...                             1.440438                                 0.0                    -3.937054
2                          36.883762                              0.0                         -4.809634  ...                             1.593135                                 0.0                    -3.937054
3                          30.197077                              0.0                         -5.270624  ...                             1.745832                                 0.0                    -3.937054
4                          34.209087                              0.0                         -3.426673  ...                             1.135045                                 0.0                    -3.937054
5                          28.859739                              0.0                         -2.965687  ...                             0.982349                                 0.0                    -3.937054
6                          32.871750                              0.0                         -6.653584  ...                             2.203922                                 0.0                    -3.937054
7                          27.522402                              0.0                         -3.887660  ...                             1.287742                                 0.0                    -3.937054
8                          36.883762                              0.0                         -2.965687  ...                             0.982349                                 0.0                    -3.937054
9                          10.137017                              0.0                         -4.348648  ...                             1.440438                                 0.0                    -3.937054
10                         36.883762                              0.0                         -3.426673  ...                             1.135045                                 0.0                    -3.937054
11                         32.871750                              0.0                         -7.114572  ...                             2.356618                                 0.0                    -3.937054
12                         34.209087                              0.0                         -4.809634  ...                             1.593135                                 0.0                    -3.937054
13                         32.871750                              0.0                         -7.114572  ...                             2.356618                                 0.0                    -3.937054
14                         35.546429                              0.0                         -2.965687  ...                             0.982349                                 0.0                    -3.937054

See full file:
preds.csv

💡 Observation:

  • this is a multinomial problem with 3 output classes
  • each feature has 3 contribution columns, one per output class
  • there are 3 bias terms, one per output class
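
The observation above implies (number of features + 1 bias) × (number of classes) contribution columns. A small sketch of that count for this example, following the naming seen in the printed frame; the class label Iris-versicolor is an assumption, since the printed output elides the middle columns:

```python
from itertools import product

# Feature names as they appear in the printed frame above.
features = ["Petal_Length", "Petal_Width", "Sepal_Length", "Sepal_Width"]
# "Iris-versicolor" is assumed; it is hidden behind "..." in the printed output.
classes = ["Iris-setosa", "Iris-versicolor", "Iris-virginica"]

# One column per (class, feature-or-bias) pair, class-major, bias last.
expected = [
    f"contrib_{name}.{cls}" for cls, name in product(classes, features + ["bias"])
]

print(len(expected))
print(expected[0], expected[-1])
```

With 4 features and 3 classes this gives 15 columns, so the width of the Shapley output grows with both the feature count and the class count.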

@mmalohlava
Member

mmalohlava commented Aug 6, 2021

This is another example, using the same dataset, but forcing DAI to generate some transformed features. Then if i score data:

  1. user is interested only in Shap values of original features - this situation is similar to the situation above:
columns = [
    pd.Series(['4.599999904632568' ], name='Sepal_Length', dtype='float32'),
    pd.Series(['2.700000047683716'], name='Sepal_Width', dtype='float32'),
    pd.Series(['1.2999999523162842'], name='Petal_Length', dtype='float32'),
    pd.Series(['1.2999999523162842'], name='Petal_Width', dtype='float32'),
]
df = pd.concat(columns, axis=1)
preds = (scorer.score_batch(df, apply_data_recipes=False, pred_contribs=True, pred_contribs_original=True))
print(preds)
preds.to_csv("preds_only_original.csv")

then result is:

   contrib_Petal_Length.Iris-setosa  contrib_Petal_Width.Iris-setosa  contrib_Sepal_Length.Iris-setosa  ...  contrib_Sepal_Length.Iris-virginica  contrib_Sepal_Width.Iris-virginica  contrib_bias.Iris-virginica
0                          1.367255                          1.23179                          0.050558  ...                            -0.104741                           -0.370407                    -1.846507

(see full results here: preds_only_original.csv)

  2. user is interested in Shap values of transformed features - this will return all contributions of all features which are directly connected to internal models (XGB, lgbm):
print('---------- Score Frame ----------')
columns = [
    pd.Series(['4.599999904632568' ], name='Sepal_Length', dtype='float32'),
    pd.Series(['2.700000047683716'], name='Sepal_Width', dtype='float32'),
    pd.Series(['1.2999999523162842'], name='Petal_Length', dtype='float32'),
    pd.Series(['1.2999999523162842'], name='Petal_Width', dtype='float32'),
]
df = pd.concat(columns, axis=1)

preds = (scorer.score_batch(df, apply_data_recipes=False, pred_contribs=True, pred_contribs_original=False))
print(preds)
preds.to_csv("preds_only_transformed.csv")

result:

  contrib_0_InteractionAdd:Petal_Length:Petal_Width.Iris-setosa  ...  contrib_bias.Iris-virginica
0                                           2.645117              ...                    -1.846507

the full results:
preds_only_transformed.csv

@zoido zoido left a comment

Is the example output correct? I cannot match it to the https://github.com/h2oai/mojo2/blob/7a1ab76b09f056334842a5b442ff89859aabf518/doc/shap.md

What if we have a "richer" data structure in the output? Something that would not need to be parsed to get the combination. Is it useful? Is the notation in the example and the description something standard?

Member

@orendain orendain left a comment

New to Shap values, apologies if the following may not be as relevant as I think it is.


Seems like the separator character between InputName and OutputIndex (i.e. _, according to the MOJO2 doc) might generate non-regular field names. E.g., since the InputNames (e.g., fields) that Shap is generating also contain at least one underscore themselves, it seems like it might be difficult to deterministically parse them into InputName and OutputIndex.

I assume the reason why the sample output in the original description doesn't include the OutputIndex (i.e. output class) is because the sample problem is binomial. Had the problem been multinomial, and the number of shap values returned been multiplied, I imagine it would be difficult to grok what each Shap field was referring to (because of the way the contrib labels are generated).

The sample output shows the result as a mapping from shap label to shap value. I wonder what a nested structure would look like, where the top-level is a mapping from output class to the same structure in the sample response.

E.g.:

shap_val = [
  {
    output_class = "iris-setosa"
    data = {
      fields = [ "contrib_0_AGE", "contrib_12_Pay" ...]
      contributions = [ "0.123", "0.456", ...]
    }
  },

  {
    output_class = "iris-virginica"
    data = {
      ...
    }
  }
]

That type of structure may be more intuitive to traverse, considering it seems we may have dozens++ of shap fields that are similarly labeled? At least, the user doesn't have to attempt to parse the Shap-generated field names. Unless there's a simple grammar that exists that just isn't documented.
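
A rough sketch of how a flat, class-suffixed result could be regrouped into the nested shape proposed above. It assumes the class labels are known up front (so nothing has to be guessed by parsing) and that columns end in ".<class>", as in the iris example earlier in the thread; `group_by_class` is a hypothetical helper, and the values are taken from that example.

```python
def group_by_class(flat, classes):
    """Group flat contribution columns by output class.

    Assumes each column name ends with "." + class label; the labels are
    passed in explicitly rather than parsed out of the names.
    """
    grouped = []
    for cls in classes:
        suffix = "." + cls
        fields = [name[: -len(suffix)] for name in flat if name.endswith(suffix)]
        grouped.append(
            {
                "output_class": cls,
                "data": {
                    "fields": fields,
                    "contributions": [flat[f + suffix] for f in fields],
                },
            }
        )
    return grouped


# Values from the single-row iris example earlier in the thread.
flat = {
    "contrib_Petal_Length.Iris-setosa": 1.367255,
    "contrib_Petal_Width.Iris-setosa": 1.23179,
    "contrib_Sepal_Width.Iris-virginica": -0.370407,
    "contrib_bias.Iris-virginica": -1.846507,
}

grouped = group_by_class(flat, ["Iris-setosa", "Iris-virginica"])
print(grouped[0]["data"]["fields"])
```

Matching on a known suffix avoids splitting on "." or "_", which would be ambiguous for names such as contrib_28_CVTE:EDUCATION.0.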


Q: Looking at @mmalohlava 's example: Does the MOJO2 lib have a way for clients to define that they want Shap values for just original features? The MOJO2 doc shows setShapPredictContrib(boolean), but no parameter for specifying if values should be generated for original or also transformed columns. If it does (doesn't look like it?), would that be a good param for this API to expose as well?

Else, at the moment it looks like MOJO2 gives us Shap values for all original + transformed columns by default. In which case, do users expect Shap values to be lumped together, or separated into original-only and transformed-only?

@nkpng2k
Contributor

nkpng2k commented Aug 6, 2021

So, I discussed a little bit with @mmalohlava earlier, so I will try my best to summarize all the comments into something somewhat cohesive:

  1. java mojo currently can only provide shapley values for the transformed feature set NOT the original features. However, the python scoring pipeline is capable of providing both.
  2. as per the examples provided for the iris dataset (multiclass -- 3 classes in this case) there will be extra bias columns that are included in the shapley predictions as well as potentially additional columns for each feature + class
  • Ex. input = [col1, col2, col3], labels = [label1, label2] ==> predictions = [target.label1, target.label2], shapley = [contrib_col1, contrib_col2, contrib_interaction_col1_col2, contrib_other_feature_transformation_col2_col3, contrib_col1.label1, contrib_col1.label2, contrib_bias_label1, contrib_bias_label2 ... etc]
  • The number of columns output by the shapley predictions can be quite large, even for a small feature space.
  3. Shapley predictions will come as a 2D array/table in which the index of the table is directly related to the index of the predictions.

so for input:

id  col1  col2  col3
0   dog   1     0.55
1   cat   4     1.34
2   bird  2     -5.2

you could get (assuming you joined the predictions and shapley results):

id  target.label1  target.label2  contrib_col1  contrib_interaction_col1_col2  contrib_cvte:col1_col2  contrib_bias_label1  etc
0   0.6            0.4            0.123         -22.33                         162.99                  -32.532              ...
1   0.3            0.7            -32.11        1.342                          0.233                   -0.44                ...
2   0.9            0.1            55.323        -2.365                         2.5873                  -5.9867              ...
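
As a sketch of what "joined" means here: because row i of the Shapley table belongs to row i of the predictions, the two can be combined positionally. The snippet uses a subset of the hypothetical columns from the tables above; `join_by_row` is an illustrative helper, not part of any existing API.

```python
# Subset of the hypothetical columns from the tables above.
predictions = [
    {"target.label1": 0.6, "target.label2": 0.4},
    {"target.label1": 0.3, "target.label2": 0.7},
    {"target.label1": 0.9, "target.label2": 0.1},
]
contributions = [
    {"contrib_col1": 0.123, "contrib_bias_label1": -32.532},
    {"contrib_col1": -32.11, "contrib_bias_label1": -0.44},
    {"contrib_col1": 55.323, "contrib_bias_label1": -5.9867},
]


def join_by_row(preds, contribs):
    """Join two row-aligned tables by position; the row index acts as the id."""
    if len(preds) != len(contribs):
        raise ValueError("predictions and contributions must have the same number of rows")
    return [{"id": i, **p, **c} for i, (p, c) in enumerate(zip(preds, contribs))]


joined = join_by_row(predictions, contributions)
print(joined[1])
```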

Related to 2/3, it is important I think, to acknowledge that output of shapley values can be quite large/complex, but in the context of the output of response from the api, this is less important.

My suggestion for the api is as follows:

  1. /model/score update the ScoreRequest to include an optional parameter shapValues (or something of the sort). This, I think needs to be an enum [None, Original, Transformed]
  • None --> return just predictions
  • Original --> return a single table with both predictions and shapley values on original features (not yet available for java mojo)
  • Transformed --> return a single table with both predictions and shapley values on transformed features
  2. add endpoint for /model/contrib which would accept ContribScoreRequest and return either shapley values for transformed or original features dependent on the parameters.
  3. add endpoint for /model/predict which would assume the same behavior as /model/score prior to api changes. (we could alternatively add this as THE endpoint for scoring optionally predictions + shapley values and leave /model/score untouched.)
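
The enum-driven behavior of the first suggestion could be sketched as follows. This is only an illustration in Python, not the service code: `ShapleyType`, `parse_shapley_type`, and `score` are hypothetical names, and the response keys are placeholders.

```python
from enum import Enum


class ShapleyType(Enum):
    NONE = "None"
    ORIGINAL = "Original"
    TRANSFORMED = "Transformed"


def parse_shapley_type(raw):
    """Map the optional request parameter to the enum; a missing value means NONE."""
    if raw is None:
        return ShapleyType.NONE
    try:
        return ShapleyType[raw.upper()]
    except KeyError:
        raise ValueError(f"unsupported shapValues value: {raw!r}")


def score(predictions, raw_shapley_type, compute_transformed):
    """Build a response dict; contributions are computed only when requested."""
    shapley_type = parse_shapley_type(raw_shapley_type)
    response = {"score": predictions}
    if shapley_type is ShapleyType.TRANSFORMED:
        response["contributions"] = compute_transformed()
    elif shapley_type is ShapleyType.ORIGINAL:
        # Not yet available for the Java MOJO, per the summary above.
        raise NotImplementedError("Shapley values on original features are not available")
    return response


calls = []


def fake_contributions():
    # Stand-in for the expensive Shapley computation; records each call.
    calls.append(1)
    return [["0.1", "-0.2"]]


plain = score([["0.4", "0.6"]], None, fake_contributions)
with_shap = score([["0.4", "0.6"]], "Transformed", fake_contributions)
print(plain, with_shap, len(calls))
```

Defaulting a missing parameter to NONE keeps existing clients unaffected and skips the Shapley computation entirely when it is not requested.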

^^ or something of the sort. My reasoning is as follows:

  1. this leaves the api for /model/score relatively unscathed. The default behavior would be only to return the predictions, and means we wouldn't break backwards compatibility (as far as I would know)
  2. having some endpoint, either /model/score or /model/predict to allow users to get all possible predictions with single http request is ideal
  3. users will have option to get ONLY predictions or shapley values if they want

@orendain @mmalohlava @mwysokin @zoido @Rajimut WDYT?

@mmalohlava
Member

mmalohlava commented Aug 6, 2021

@zoido

Is the example output correct? I cannot match it to the

yes, it is copy-paste from the actual output of Python scoring pipeline, i can provide reproducible example.

What if we have "richer" data structure in the output? Something that would not to be parsed to get the combination. Is it useful?

not sure, if i understand the question

Is the notation in the example and the description something standard?

It is the Python scoring pipeline output; in MOJO we are trying to get as close to it as possible. What is important: (1) names of original features, (2) names of output categories in case of a multinomial problem, (3) clear separation of the bias term (so that it cannot be confused with any feature's Shap value)

@mmalohlava
Member

@orendain

reading your comment and you have good points there: i think if we are producing shapley values for original features (the inputs of pipeline) we do not need to list names (like contrib_0_AGE, ...) - we just need to output them in the order of input features. However, if we produce Shap values for transformed features, we have to output names since they reflect names of internally engineered features.

Looking at @mmalohlava 's example: Does the MOJO2 lib have a way for clients to define that they want Shap values for just original features? The MOJO2 doc shows setShapPredictContrib(boolean), but no parameter for specifying if values should be generated for original or also transformed columns. If it does (doesn't look like it?), would that be a good param for this API to expose as well?

not yet, right now only transformed, but we will have to separate the API calls (CC: @pkozelka ) - original vs transformed.

Else, at the moment it looks like MOJO2 gives us Shap values for all original + transformed columns by default. In which case, do users expect Shap values to be lumped together, or separated into original-only and transformed-only?

At the moment MOJO provides only Shap values of transformed features (note: the list can still contain original features if they are input for any of internal models).
IMHO the end-users will be mostly interested in Shap values of original features.

@nkpng2k
Contributor

nkpng2k commented Aug 6, 2021

@mmalohlava I would argue we should always output the name of the column. Note that in the API the user can request that certain columns be included in the response via the includeFieldsInOutput parameter (https://github.com/h2oai/dai-deployment-templates/blob/master/common/swagger/swagger.yaml#L147), so there could quite possibly be a collision between those columns and the columns of the Shapley response. Prefixing the Shapley response columns with contrib_, regardless of whether they are for original or transformed features, more or less eliminates this issue and reduces confusion.
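
To make the collision concrete, here is a small sketch with illustrative values. `merge_columns` is a hypothetical helper: it merges columns echoed back through includeFieldsInOutput with Shapley columns, adding the contrib_ prefix when it is missing.

```python
def merge_columns(included, shapley, prefix="contrib_"):
    """Merge echoed input columns with Shapley columns, prefixing the latter."""
    merged = dict(included)
    for name, value in shapley.items():
        key = name if name.startswith(prefix) else prefix + name
        if key in merged:
            raise ValueError(f"column name collision: {key}")
        merged[key] = value
    return merged


# Illustrative values only: an echoed input column and a Shapley value for
# the same original feature would otherwise both be called "AGE".
included = {"AGE": "46560"}
shapley = {"AGE": "0.0865"}

merged = merge_columns(included, shapley)
print(merged)
```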

@Rajimut
Contributor Author

Rajimut commented Aug 9, 2021

Regarding the API request

@nkpng2k

/model/score update the ScoreRequest to include an optional parameter shapValues (or something of the sort). This, I think needs to be an enum [None, Original, Transformed]
None --> return just predictions
Original --> return a single table with both predictions and shapley values on original features (not yet available for java mojo)
Transformed --> return a single table with both predictions and shapley values on transformed features
add endpoint for /model/contrib which would accept ContribScoreRequest and return either Shapley values for transformed or original features dependent on the parameters.

I definitely like this idea. I think if users are not providing any enum parameter then we could default it to None, thus providing backward compatibility for the existing /score endpoint.

Regarding the API response
@mmalohlava - Is there a way to know beforehand whether the Shap outputs will be binomial or multinomial? In other words, can we have the output class names separated from the contrib_ column name by a ".", like contrib_Petal_Length.Iris-setosa, to determine whether the obtained output is a 2D array of Shap values?

In that case it might make sense to define the following, output class can be parsed for the 2D structure below as per @orendain 's example

E.g.:

shap_val = [
  {
    output_class = "iris-setosa"
    data = {
      fields = [ "contrib_0_AGE", "contrib_12_Pay" ...]
      contributions = [ "0.123", "0.456", ...]
    }
  },

  {
    output_class = "iris-virginica"
    data = {
      ...
    }
  }
]

@arnocandel
Member

https://github.com/h2oai/h2oai/blob/418cb3eab13e69d01d72a9e8a2a213e4c9981091/h2oaicore/transformer_utils.py#L4598-L4636 this makes the column names

@Rajimut
Contributor Author

Rajimut commented Aug 18, 2021

@orendain - as discussed the support for shap values for h2o-3 mojos is not available currently in the mojo2 library

@nkpng2k
Contributor

nkpng2k commented Aug 19, 2021

FYI I tried this on an H2O-3 MOJO (gbm_multinomial.zip). Regular scoring came back fine, but asking for Shap values gives back

2021-08-17 04:23:07.001  INFO 68016 --- [nio-8080-exec-1] a.h.m.d.l.r.c.ModelsApiController        : Failed scoring request: class ScoreRequest {
    shapleyResults: TRANSFORMED
    includeFieldsInOutput: null
    noFieldNamesInOutput: null
    idField: null
    fields: [Time, Temp, Relative_Humidity, TGS2600, TGS2602A, TGS2602B, TGS2620A, TGS2612, TGS2620B, TGS2611, TGS2610]
    rows: [class Row {
        [0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0]
    }]
}, due to: No columns in output frame

With stacktrace:

java.lang.IllegalStateException: No columns in output frame
        at ai.h2o.mojos.runtime.MojoPipelineProtoImpl$AllocatedBuffers.<init>(SourceFile:277) ~[mojo2-runtime-impl-2.6.1.jar!/:2.6.1]
        at ai.h2o.mojos.runtime.MojoPipelineProtoImpl.buffers(SourceFile:211) ~[mojo2-runtime-impl-2.6.1.jar!/:2.6.1]
        at ai.h2o.mojos.runtime.MojoPipelineProtoImpl.getMeta(SourceFile:67) ~[mojo2-runtime-impl-2.6.1.jar!/:2.6.1]
        at ai.h2o.mojos.runtime.MojoPipelineProtoImpl.getFrameBuilder(SourceFile:60) ~[mojo2-runtime-impl-2.6.1.jar!/:2.6.1]
        at ai.h2o.mojos.runtime.MojoPipeline.getInputFrameBuilder(MojoPipeline.java:95) ~[mojo2-runtime-api-2.6.1.jar!/:2.6.1]
        at ai.h2o.mojos.deploy.common.transform.MojoScorer.contributionResponse(MojoScorer.java:120) ~[transform-1.1.3-SNAPSHOT.jar!/:na]
        at ai.h2o.mojos.deploy.common.transform.MojoScorer.scoreResponse(MojoScorer.java:94) ~[transform-1.1.3-SNAPSHOT.jar!/:na]
        at ai.h2o.mojos.deploy.local.rest.controller.ModelsApiController.getScore(ModelsApiController.java:56) ~[classes!/:na]

@orendain - This PR is not handling h2o3 mojos yet.. I think it has to be handled in a separate PR.

@Rajimut sorry for the delayed response. I think there is a way in the pipeline to know whether it is an H2O-3 model or not. I feel like simple logic would be able to return some warning/error, no?

if (isH2o3Mojo) {
  return UnimplementedResponse
}

@Rajimut
Contributor Author

Rajimut commented Aug 19, 2021

I think currently it is not possible for us to know whether the uploaded mojo is from h2o3, since it is not exposed by the mojo pipeline. We can do a work-around by unzipping the file and reading the extension but it might not be a correct way as the pipeline is doing this already

@nkpng2k
Contributor

nkpng2k commented Aug 19, 2021

As per discussion on slack, my above comment is related to:
#165
#62

Since not specific to shapley scores. Deferring to later changes. Discussed with @Rajimut that we can add additional field in response:
message which is a map of {message_level: <enum of: LOG, WARN, ERROR>, message: <text of message>} or something.
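
One possible shape for such a message field, sketched in Python for illustration only. The key names follow the comment above; `MessageLevel`, `ResponseMessage`, and the surrounding response keys are assumptions rather than an agreed schema.

```python
from dataclasses import dataclass
from enum import Enum


class MessageLevel(Enum):
    LOG = "LOG"
    WARN = "WARN"
    ERROR = "ERROR"


@dataclass
class ResponseMessage:
    message_level: MessageLevel
    message: str

    def to_dict(self):
        """Serialize to the map form described above."""
        return {"message_level": self.message_level.value, "message": self.message}


# Illustrative response: scores are returned, and the message explains why
# the Shapley part is missing.
response = {
    "score": [["0.43", "0.57"]],
    "message": ResponseMessage(
        MessageLevel.WARN, "Shapley values are not supported for this model"
    ).to_dict(),
}
print(response["message"])
```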

Comment on lines 95 to 110
try {
  ShapleyType requestedShapleyType = shapleyType(request.getShapleyValuesRequested());
  switch (requestedShapleyType) {
    case TRANSFORMED:
      response.setFeatureShapleyContributions(computeContribution(request));
      break;
    case ORIGINAL:
      log.info(UNIMPLEMENTED_MESSAGE);
      break;
    default:
      break;
  }
} catch (Exception e) {
  log.info("Failed shapley values: {}, due to: {}", request, e.getMessage());
  log.debug(" - failure cause: ", e);
}

We should always fail when we cannot fulfill the request. "Failing silently" causes a lot of "why does it not return what I have requested" situations.

Contributor Author

@Rajimut Rajimut Aug 20, 2021

This was regarding the following discussion:

As per discussion on slack, my above comment is related to:
#165
#62

Since this is not specific to Shapley scores, deferring it to later changes. Discussed with @Rajimut that we can add an additional field to the response:
message, which is a map of {message_level: <enum of: LOG, WARN, ERROR>, message: <text of message>} or something.

The idea is not to fail the scoring request when Shapley values cannot be computed for a model. For MOJOs that do not come from DAI, such as H2O-3 MOJOs, the Shapley request will fail, which would cause the scoring response to fail as well. To handle this, as per the discussion above, we have decided to return a 400 response on the Shapley-only endpoint /model/contribution. The existing endpoint /model/score will remain unaffected if the Shapley values are not available.

[For the future] We also want to include a message field with the score and Shapley response to provide additional information describing the reason for the failure.
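
As a rough illustration of the behavior described above, here is a sketch of how the two endpoints could treat a Shapley failure differently. `ShapleyFailurePolicy`, `Result`, and both method names are hypothetical stand-ins and not the controller code from this PR; the `Supplier` stands in for the Shapley computation.

```java
import java.util.Optional;
import java.util.function.Supplier;

// Hypothetical sketch of the two failure policies; not the actual controller code.
final class ShapleyFailurePolicy {

  // Hypothetical stand-in for an HTTP response: a status code plus optional contributions.
  static final class Result {
    final int status;
    final Optional<String> contributions;

    Result(int status, Optional<String> contributions) {
      this.status = status;
      this.contributions = contributions;
    }
  }

  // /model/score: predictions are still returned (200); Shapley values are omitted on failure.
  static Result score(Supplier<String> shapley) {
    try {
      return new Result(200, Optional.of(shapley.get()));
    } catch (RuntimeException e) {
      return new Result(200, Optional.empty());
    }
  }

  // /model/contribution: Shapley values are the only payload, so a failure maps to 400.
  static Result contribution(Supplier<String> shapley) {
    try {
      return new Result(200, Optional.of(shapley.get()));
    } catch (RuntimeException e) {
      return new Result(400, Optional.empty());
    }
  }
}
```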

Member

@orendain orendain Aug 24, 2021

Some thoughts:

  • If a user explicitly requests Shap scores for a scorer that has loaded an H2O-3 MOJO (perhaps the client even knowing that the feature isn't supported), would we consider that to be client error? Should it be the server's responsibility to "fix/adjust" a client's request if the client hasn't requested adjustment?
  • The try-catch block is catching all shap errors. If the error is due to something other than compatibility, silently failing would hide all of it.
  • I envision the two points above on the same level as how we fail if one of the regular client-provided scoring rows is malformed. We don't currently ignore one row while allowing the rest to continue.

Contributor Author

@Rajimut Rajimut Aug 26, 2021

If the error is due to something other than compatibility, silently failing would hide all of it.

Yes, that is true. We have scenarios where the error is caused by the model not supporting Shapley values. My thoughts: this is best conveyed to the user by providing a message instead of throwing an exception. But we could handle user-related errors, like passing in a wrong Shapley value, by throwing an error.

Errors due to h2o-3 mojo - shouldn't we fail this silently?
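
One way to picture the split being discussed: reject an invalid client-supplied value with an exception, but report a model limitation as a warning. This is only a sketch under assumptions. `ShapleyRequestSketch`, `parse`, and `warningFor` are hypothetical names, and the nested `ShapleyType` enum is re-declared here for illustration (its `NONE` constant is an assumption).

```java
import java.util.Locale;
import java.util.Optional;

// Hypothetical sketch separating client errors from model limitations.
final class ShapleyRequestSketch {

  // Re-declared for illustration only; NONE is an assumed constant.
  enum ShapleyType { NONE, TRANSFORMED, ORIGINAL }

  // Client error: an unrecognized requested value is rejected with an exception.
  static ShapleyType parse(String requested) {
    if (requested == null || requested.isEmpty()) {
      return ShapleyType.NONE;
    }
    try {
      return ShapleyType.valueOf(requested.toUpperCase(Locale.ROOT));
    } catch (IllegalArgumentException e) {
      throw new IllegalArgumentException("Unknown Shapley type: " + requested, e);
    }
  }

  // Model limitation: returns a warning text instead of throwing.
  static Optional<String> warningFor(ShapleyType type, boolean modelSupportsShapley) {
    if (type != ShapleyType.NONE && !modelSupportsShapley) {
      return Optional.of("Shapley values are not supported by the loaded model");
    }
    return Optional.empty();
  }
}
```

With this split, errors other than the known compatibility case are not swallowed by a blanket catch, which addresses the concern quoted above.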

@Rajimut Rajimut requested a review from zoido August 20, 2021 21:47
@orendain
Member

Side note: Could we bump the (minor) version of the API?

https://github.com/h2oai/dai-deployment-templates/blob/master/common/swagger/swagger.yaml#L2-L7

May need to coordinate with @mmalohlava to double check it does not interfere with any existing plans.

@Rajimut Rajimut requested a review from orendain August 25, 2021 00:20

@zoido zoido left a comment

Still two questions.

LGTM. I don't want to approve myself as there are people more competent in java and this repo.

Member

@orendain orendain left a comment

Awesome, LGTM.

Thanks for untangling all of the data science + implementation + future API concerns and tackling it in this PR. Not only a huge enhancement to the API and REST scorer, but a good lesson on Shapley and how it all now connects with the H2O.ai tech stack!

@mwysokin mwysokin left a comment

LGTM 🖖

I can simply say WOW! What an enormous effort and what a great result! I'm not super proficient with Java and the scorers' code so I'm giving my LGTM as a general ACK.

Let's have a final confirmation from @mmalohlava and let's merge it!

    response.setFeatureShapleyContributions(transformedFeatureContribution(request));
    break;
  case ORIGINAL:
    log.info(UNIMPLEMENTED_MESSAGE);
Member

The latest MOJO2 runtime, 2.7.0, should support original Shapley values as well.

Contributor Author

Created an issue for this here: #247

Member

@mmalohlava mmalohlava left a comment

@Rajimut thank you! nice result!

Tiny request: can you create issues for the parts which are missing/not implemented:

  • original Shapley
  • any changes to the MOJO API you would suggest.

@Rajimut
Contributor Author

Rajimut commented Aug 27, 2021

Created some issues for other things that need to be done after this PR:
In Mojo runtime

In local rest scorer

@Rajimut Rajimut merged commit 4df9f4f into master Aug 27, 2021
@Rajimut Rajimut deleted the raji/shapley-rest-scorer branch August 27, 2021 20:54
7 participants