UCRL2/UCFH confidence intervals are incorrect #5

Open
vzhuang opened this issue Jan 29, 2020 · 2 comments

vzhuang commented Jan 29, 2020

As per Jaksch et al. (2010), the confidence intervals for UCRL2 use t_k, the timestep at the start of episode k. However, run_finite_tabular_experiment in experiment.py passes the episode index k instead of the timestep.

UCFH is also affected by this bug.
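
For reference, here is a minimal sketch of the UCRL2 confidence radii as I recall them from Jaksch et al. (2010), written so that the dependence on t_k is explicit. The function name, argument names, and default delta are hypothetical and are not taken from experiment.py; the point is only that the argument inside the log is a timestep.

```python
import math


def ucrl2_radii(n_visits, t_k, n_states, n_actions, delta=0.05):
    """Confidence radii for one (state, action) pair.

    Assumption: constants follow Jaksch et al. (2010). `t_k` is the
    timestep at the start of episode k, NOT the episode index k.
    """
    n = max(1, n_visits)  # avoid division by zero for unvisited pairs
    # Radius of the reward confidence interval.
    reward_radius = math.sqrt(
        7.0 * math.log(2.0 * n_states * n_actions * t_k / delta) / (2.0 * n)
    )
    # L1 radius of the transition-probability confidence set.
    transition_radius = math.sqrt(
        14.0 * n_states * math.log(2.0 * n_actions * t_k / delta) / n
    )
    return reward_radius, transition_radius
```

Passing the episode index in place of t_k makes the log argument smaller, so both radii come out narrower than the paper prescribes.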

vzhuang changed the title from "UCRL2 confidence intervals are incorrect" to "UCRL2/UCFH confidence intervals are incorrect" on Jan 29, 2020

iosband (Owner) commented Jan 30, 2020

Are you 100% sure this is a bug?

If the episodes are of fixed length (they are) then you can compute t_k from just k as (k * episode_length).

I believe this is what is happening?
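
A small sketch of that conversion, assuming fixed-length episodes and a 0-indexed episode counter. The helper name is hypothetical. The `+ 1` is an assumption that timesteps are counted from 1, which keeps the log argument positive in the very first episode; drop it if timesteps are counted from 0 and the first episode is handled separately.

```python
def episode_start_timestep(episode_index, episode_length):
    """Timestep at the start of a 0-indexed episode of fixed length.

    Assumption: timesteps are counted from 1, so episode 0 starts at t = 1.
    """
    return episode_index * episode_length + 1


# Episode 3 with 20 steps per episode starts at timestep 61.
t_k = episode_start_timestep(3, 20)
```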

vzhuang (Author) commented Jan 30, 2020

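Right, it's a simple fix. Since the time is inside a log factor, this can't be "fixed" by adjusting the scaling constant: log(k * episode_length) = log(k) + log(episode_length), so the missing factor shows up as an additive term inside the log rather than as a constant multiple of the interval width. I'm guessing it has at least a small impact on your results, depending on whether you tune the scaling factor.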
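
To illustrate the log-factor point, here is a rough sketch that compares the interval width computed with the timestep k * episode_length against the width computed with the bare episode index k. All names and the numeric settings are hypothetical. If a single scaling constant could absorb the difference, the ratio would be the same for every k.

```python
import math


def width_ratio(k, episode_length, n_states=5, n_actions=2, delta=0.05):
    """Ratio of interval widths: timestep-based over episode-index-based.

    Only the sqrt(log(...)) factor differs between the two, so the visit
    count and the other constants cancel out of the ratio.
    """
    c = 2.0 * n_states * n_actions / delta
    with_timestep = math.log(c * k * episode_length)
    with_episode_index = math.log(c * k)
    return math.sqrt(with_timestep / with_episode_index)


# The ratio shrinks as k grows, so no fixed constant reproduces it.
ratios = [width_ratio(k, episode_length=20) for k in (1, 10, 100, 1000)]
```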
