
MitsuruFujiwara/KDD-Cup-2019


KDD-Cup-2019

This repository contains my solution to the KDD Cup 2019 Regular ML track (Context-Aware Multi-Modal Transportation Recommendation). See Competition Website for the details. In this competition, I placed 57th in phase1 and 52nd in phase2 (and did not advance to phase3).

Phase1

Result

57th place out of 1702 teams.

  • LB score: 0.69917984
  • Local cv score: 0.678330

lb_phase1

Model Pipeline

phase1_model_pipeline

See phase1 final version for the details.

Key Findings

  • Features
    See features I used. The most important feature was plan_0_transport_mode. In phase1, users clicked plan_0_transport_mode in about 60% of sessions (which suggests that people tend to click the plan displayed at the top). I also used count-encoded and target-encoded features for the categorical variables. With these features, my best single model scored 0.6925 on the LB.
  • Sub Models
    I prepared two sub models, one trained on queries and the other on queries and profiles. Adding their outputs to the main model's features improved the LB score from 0.6925 to 0.6945.
  • Post Processing
    Post processing improved the LB score from 0.6945 to 0.6991. In the out-of-fold predictions, some classes (0, 3, 4, 6) accounted for a smaller share than in the train data, so I multiplied the predictions for these classes by constant factors. The factors were chosen by maximizing the out-of-fold F1 score (see blending).
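The post-processing step can be illustrated with a minimal sketch. It is not the code from this repository: the function names `weighted_f1` and `tune_class_multipliers`, the greedy per-class grid search, and the use of a support-weighted F1 are all assumptions made for illustration. The idea is only to scale the predicted probabilities of selected classes and keep a multiplier when it raises the out-of-fold F1 score.

```python
import numpy as np

def weighted_f1(y_true, y_pred, n_classes):
    """Support-weighted F1 over all classes (assumed metric, plain NumPy)."""
    total = 0.0
    for c in range(n_classes):
        tp = np.sum((y_pred == c) & (y_true == c))
        fp = np.sum((y_pred == c) & (y_true != c))
        fn = np.sum((y_pred != c) & (y_true == c))
        denom = 2 * tp + fp + fn
        f1 = 2 * tp / denom if denom > 0 else 0.0
        total += f1 * np.sum(y_true == c)
    return total / len(y_true)

def tune_class_multipliers(oof_proba, y_true, classes, grid):
    """Greedy search (hypothetical): for each class in `classes`, keep the
    multiplier from `grid` that gives the best out-of-fold F1 so far."""
    n_classes = oof_proba.shape[1]
    weights = np.ones(n_classes)
    best = weighted_f1(y_true, np.argmax(oof_proba * weights, axis=1), n_classes)
    for c in classes:
        for w in grid:
            trial = weights.copy()
            trial[c] = w
            pred = np.argmax(oof_proba * trial, axis=1)
            score = weighted_f1(y_true, pred, n_classes)
            if score > best:
                best, weights = score, trial
    return weights, best
```

The tuned `weights` would then be applied to the test predictions in the same way, for example `np.argmax(test_proba * weights, axis=1)`. Because the multipliers are fitted on out-of-fold predictions, the reported gain is an estimate and may not transfer exactly to the leaderboard.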

Phase2

Result

52nd place out of 100 teams.

  • LB score: 0.69362814
  • Local cv score: 0.657519

lb_phase2

Model Pipeline

phase2_model_pipeline

See phase2 final version for the details.

Key Findings

  • Splitting Model
    In phase2, the dataset covered 3 cities. I split the main model by city because the distributions of transport modes were quite different. After splitting the model, the LB score reached 0.6900.
  • Features
    Features were almost the same as in phase1, but I computed the target encoding separately for each of the 3 cities.
  • Post Processing
    The same post processing as in phase1 was applied to classes 0, 3 and 4. The final best LB score was 0.6936.
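Per-city target encoding can be sketched as follows. This is an illustration under stated assumptions rather than the repository's implementation: the function name `oof_target_encode_by_city`, the column names (`city`, and whichever categorical and target columns are passed in), the random fold assignment, and the out-of-fold scheme are all hypothetical. The target is assumed to be numeric, for example a 0/1 indicator for one transport mode, and the DataFrame index is assumed to be unique.

```python
import numpy as np
import pandas as pd

def oof_target_encode_by_city(df, cat_col, target_col, city_col="city",
                              n_splits=5, seed=0):
    """Out-of-fold mean-target encoding computed separately within each city."""
    encoded = pd.Series(np.nan, index=df.index, dtype=float)
    rng = np.random.default_rng(seed)
    # Random fold id per row (a simplification of a proper K-fold split).
    fold = pd.Series(rng.integers(0, n_splits, size=len(df)), index=df.index)
    for _, city_df in df.groupby(city_col):
        city_fold = fold.loc[city_df.index]
        for k in range(n_splits):
            train = city_df[city_fold != k]
            valid_idx = city_df.index[(city_fold == k).to_numpy()]
            if len(valid_idx) == 0 or train.empty:
                continue
            # Category means come only from other folds of the same city.
            means = train.groupby(cat_col)[target_col].mean()
            prior = train[target_col].mean()  # fallback for unseen categories
            values = city_df.loc[valid_idx, cat_col].map(means).fillna(prior)
            encoded.loc[valid_idx] = values.to_numpy()
    return encoded
```

Encoding within each city keeps one city's click behaviour from leaking into another city's statistics, which matters when the transport mode distributions differ as described above.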

About

52nd solution of KDD Cup 2019 Regular ML track
