Version 6.0.0

@joshuaspear joshuaspear released this 17 Jul 16:31
· 27 commits to master since this release
  • Updated PropensityModels structure for sklearn and added a helper class for compatibility with torch
  • Full runtime typechecking with jaxtyping
  • Fixed a bug in the importance sampling (IS) methods where the average was taken twice
  • Significantly simplified API, especially integrating Policy classes with propensity models
  • Generalised d3rlpy API to allow for wrapping continuous policies with D3RlPyTorchAlgoPredict
  • Added explicit stochastic policies for d3rlpy
  • Introduced 'policy_func', which is any function or method that returns type Union[TorchPolicyReturn, NumpyPolicyReturn]
  • Simplified and unified ISCallback in d3rlpy/api using PolicyFactory
  • Added 'premade' doubly robust estimators for vanilla DR, weighted DR, per-decision DR and weighted per-decision DR
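For readers unfamiliar with the IS methods mentioned in the notes above, the following is a minimal, standard-library-only sketch of trajectory-wise importance sampling in which the average over trajectories is taken exactly once. It is not this package's implementation: the function names (`trajectory_weights`, `discounted_return`, `vanilla_is`, `weighted_is`) and the nested-list input layout are assumptions made purely for illustration.

```python
from math import prod
from typing import List, Sequence

# Hypothetical helper names; they do not come from the released package.


def trajectory_weights(
    eval_probs: Sequence[Sequence[float]],
    behav_probs: Sequence[Sequence[float]],
) -> List[float]:
    """Cumulative importance ratio per trajectory: prod_t pi_e(a_t|s_t) / pi_b(a_t|s_t)."""
    return [
        prod(pe / pb for pe, pb in zip(traj_e, traj_b))
        for traj_e, traj_b in zip(eval_probs, behav_probs)
    ]


def discounted_return(rewards: Sequence[float], gamma: float) -> float:
    """Discounted sum of rewards for a single trajectory."""
    return sum((gamma ** t) * r for t, r in enumerate(rewards))


def vanilla_is(eval_probs, behav_probs, rewards, gamma: float = 1.0) -> float:
    """Vanilla IS: weighted returns averaged once over the n trajectories."""
    weights = trajectory_weights(eval_probs, behav_probs)
    returns = [discounted_return(r, gamma) for r in rewards]
    # The only division by n in the whole estimate happens here.
    return sum(w * g for w, g in zip(weights, returns)) / len(weights)


def weighted_is(eval_probs, behav_probs, rewards, gamma: float = 1.0) -> float:
    """Weighted (self-normalised) IS: normalise by the sum of the weights instead of n."""
    weights = trajectory_weights(eval_probs, behav_probs)
    returns = [discounted_return(r, gamma) for r in rewards]
    return sum(w * g for w, g in zip(weights, returns)) / sum(weights)
```

Each function takes one inner list per trajectory, holding per-step action probabilities under the evaluation and behaviour policies and per-step rewards. Keeping the normalisation in a single place makes it easier to check that no second average is applied downstream.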