-
Notifications
You must be signed in to change notification settings - Fork 0
/
Copy pathREADME.Rmd
183 lines (155 loc) · 6.35 KB
/
README.Rmd
1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
19
20
21
22
23
24
25
26
27
28
29
30
31
32
33
34
35
36
37
38
39
40
41
42
43
44
45
46
47
48
49
50
51
52
53
54
55
56
57
58
59
60
61
62
63
64
65
66
67
68
69
70
71
72
73
74
75
76
77
78
79
80
81
82
83
84
85
86
87
88
89
90
91
92
93
94
95
96
97
98
99
100
101
102
103
104
105
106
107
108
109
110
111
112
113
114
115
116
117
118
119
120
121
122
123
124
125
126
127
128
129
130
131
132
133
134
135
136
137
138
139
140
141
142
143
144
145
146
147
148
149
150
151
152
153
154
155
156
157
158
159
160
161
162
163
164
165
166
167
168
169
170
171
172
173
174
175
176
177
178
179
180
181
182
183
---
output: github_document
---
<!-- README.md is generated from README.Rmd. Please edit that file -->
```{r, include = FALSE}
knitr::opts_chunk$set(
collapse = TRUE,
comment = "#>",
fig.path = "man/figures/README-",
out.width = "100%"
)
```
# netidmtpreg
<!-- badges: start -->
[![Lifecycle Status](https://img.shields.io/badge/lifecycle-maturing-blue.svg)](https://lifecycle.r-lib.org/articles/stages.html)
[![Tests](https://github.com/qmarcou/netidmtpreg/actions/workflows/test-coverage.yaml/badge.svg)](https://github.com/qmarcou/netidmtpreg/actions/workflows/test-coverage.yaml)
[![codecov](https://codecov.io/gh/qmarcou/netidmtpreg/graph/badge.svg?token=BMMMFGO30V)](https://codecov.io/gh/qmarcou/netidmtpreg)
[![R-CMD-check](https://github.com/qmarcou/netidmtpreg/actions/workflows/R-CMD-check.yaml/badge.svg)](https://github.com/qmarcou/netidmtpreg/actions/workflows/R-CMD-check.yaml)
<!-- badges: end -->
The goal of netidmtpreg is to enable net survival estimation through direct
binomial regression, allowing modeling continuous covariate effects that could
not be handled through e.g stratified Pohar-Perme estimation, all this in a
multistate Illness-Death setting.
## Installation
You can install the development version of netidmtpreg like so:
```{r eval=FALSE}
devtools::install_github("qmarcou/netidmtpreg")
```
If working from source through a git clone, the package should be installed locally to get tests running `future::multisession` planning working.
This can be accomplished using `devtools`
```{r eval=FALSE}
devtools::dev_mode(on = TRUE)
devtools::install_local(force = TRUE) # force package update
devtools::load_all() # required to make tests visible
testthat::test_package("netidmtpreg")
# or testthat::test_check("netidmtpreg") to run R CMD check
```
## Example
This is a basic example illustrating net survival estimation on data simulated
using the package. Let's first generate some data:
```{r Generate random data}
library(netidmtpreg)
library(tidyverse)
n_ind <- 5e2 # number of simulated individuals
# Generate exponentially distributed event times without censoring
synth_idm_data <- generate_uncensored_ind_exp_idm_data(
n_individuals = n_ind,
lambda_illness = 1.0,
lambda_death = 1.0
)
# Generate random age and sex labels
synth_idm_data <-
synth_idm_data %>% tibble::add_column(
sex = ifelse(rbinom(n_ind, 1, prob = .5), "male", "female"),
age = runif(n = n_ind, min = 50, max = 80)
)
# Generate random start of follow up dates
synth_idm_data <-
synth_idm_data %>% tibble::add_column(start_date = as.Date.numeric(
x = rnorm(n = n_ind, mean = 0, sd = 1e2),
origin = as.Date("15/06/1976", "%d/%m/%Y")
))
# Generate population mortality assuming equal constant population rate
l_pop_death <- 1.0 # extra disease mortality doubles the population mortality
population_death_times <- generate_exponential_time_to_event(
n_individuals = n_ind,
lambda = l_pop_death
)
# create a corresponding ratetable object
const_ratetable <- survival::survexp.us
const_ratetable[] <- l_pop_death # ratetable's covariate do not matter
# Update death time accordingly to create an observed crude survival dataset
crude_synth_idm_data <- netidmtpreg:::apply_iddata_death(
synth_idm_data,
population_death_times
)
```
Now let's carry net survival estimation:
```{r estimation, echo=TRUE, warning=FALSE, message=FALSE}
# Estimation can be sped up and carried in parrallel using futures:
future::plan("multisession") # will work on any OS
# future::plan("multicore") # more efficient but only works on UNIX systems
net_estimate <- fit_netTPreg(
formula = ~1, # intercept only model, similar to Pohar-Perme estimation
data = crude_synth_idm_data,
# Use a standard ratetable
ratetable = const_ratetable,
rmap = list(
age = age,
sex = sex,
year = start_date
),
time_dep_popvars = list("age", "year"),
s = 0.2,
by = n_ind / 10,
trans = "11",
link = "logit",
R = 100 # Number of bootstraps
)
future::plan("sequential") # close the multisession, see future's documentation
```
We obtain a `TPreg` object with a dedicated plotting method using ggplot:
```{r plot}
plot(net_estimate) + ggplot2::coord_cartesian(ylim = c(-5, 5), clip = "off")
```
*Warning: note the use of `ggplot2::coord_cartesian` with `clip="off"` instead of `ggplot2::ylim`.
The former acts as a "zoom" command, while the latter acts as a filter on input
data, and as such might prevent display of large confidence intervals.*
The obtained TPreg plots are composable ggplot objects that can be combined.
Let's use this to compare our results with: 1) what would have been obtained
without background mortality ('ground truth'), and 2) what would have been
obtained on the same data without taking into account population mortality
('crude estimate').
```{r estimation2, echo=TRUE, warning=FALSE, message=FALSE}
# Estimation can be sped up and carried in parrallel using futures:
future::plan("multisession") # will work on any OS
# future::plan("multicore") # more efficient but only works on UNIX systems
crude_estimate <- fit_netTPreg(
formula = ~1, # intercept only model, similar to Pohar-Perme estimation
data = crude_synth_idm_data,
# Use a standard ratetable
ratetable = NULL,
s = 0.2,
by = n_ind / 10,
trans = "11",
link = "logit",
R = 100 # Number of bootstraps
)
truth_estimate <- fit_netTPreg(
formula = ~1, # intercept only model, similar to Pohar-Perme estimation
data = synth_idm_data, # data without added population mortality
# Use a standard ratetable
ratetable = NULL,
s = 0.2,
by = n_ind / 10,
trans = "11",
link = "logit",
R = 100 # Number of bootstraps
)
future::plan("sequential") # close the multisession, see future's documentation
```
You can use the reserved keyword `model` to set the model name and automatically
set the linetype accordingly.
```{r plot_compare}
autoplot(net_estimate, model = "Net estimate") +
autolayer(crude_estimate, model = "Crude estimate") +
autolayer(truth_estimate, model = "Ground Truth") +
ggplot2::coord_cartesian(ylim = c(-5, 5), xlim = c(0, 2), clip = "off")
```
These automated plotting routines are designed for quick inspection of the model.
If you need more flexibility you can rely on
[tidymodels `broom`](https://broom.tidymodels.org/articles/broom.html) style
methods `tidy`, `augment` (*to be implemented*) and `glance` (*to be implemented*)
to leverage `ggplot`'s flexibility.
More examples to come!