A tidyverse toolkit to visualize, wrangle, and transform time series data
Download the development version with latest features:
remotes::install_github("business-science/timetk")
Or, download the CRAN-approved version:
install.packages("timetk")
- Full Time Series Machine Learning and Feature Engineering Tutorial:
- API Documentation for articles and a complete list of function references.
There are many R packages for working with time series data. Here’s how `timetk` compares to the “tidy” time series R packages for data visualization, wrangling, and feature engineering (those that leverage data frames or tibbles).
| Task | timetk | tsibble | feasts | tibbletime |
|---|---|---|---|---|
| **Structure** | | | | |
| Data Structure | tibble (tbl) | tsibble (tbl_ts) | tsibble (tbl_ts) | tibbletime (tbl_time) |
| **Visualization** | | | | |
| Interactive Plots (plotly) | ✅ | ❌ | ❌ | ❌ |
| Static Plots (ggplot) | ✅ | ❌ | ✅ | ❌ |
| Time Series | ✅ | ❌ | ✅ | ❌ |
| Correlation, Seasonality | ✅ | ❌ | ✅ | ❌ |
| Anomaly Detection | ✅ | ❌ | ❌ | ❌ |
| **Data Wrangling** | | | | |
| Time-Based Summarization | ✅ | ❌ | ❌ | ✅ |
| Time-Based Filtering | ✅ | ❌ | ❌ | ✅ |
| Padding Gaps | ✅ | ✅ | ❌ | ❌ |
| Low to High Frequency | ✅ | ❌ | ❌ | ❌ |
| Imputation | ✅ | ✅ | ❌ | ❌ |
| Sliding / Rolling | ✅ | ✅ | ❌ | ✅ |
| **Feature Engineering (recipes)** | | | | |
| Date Feature Engineering | ✅ | ❌ | ❌ | ❌ |
| Holiday Feature Engineering | ✅ | ❌ | ❌ | ❌ |
| Fourier Series | ✅ | ❌ | ❌ | ❌ |
| Smoothing & Rolling | ✅ | ❌ | ❌ | ❌ |
| Padding | ✅ | ❌ | ❌ | ❌ |
| Imputation | ✅ | ❌ | ❌ | ❌ |
| **Cross Validation (rsample)** | | | | |
| Time Series Cross Validation | ✅ | ❌ | ❌ | ❌ |
| Time Series CV Plan Visualization | ✅ | ❌ | ❌ | ❌ |
| **More Awesomeness** | | | | |
| Making Time Series (Intelligently) | ✅ | ✅ | ❌ | ✅ |
| Handling Holidays & Weekends | ✅ | ❌ | ❌ | ❌ |
| Class Conversion | ✅ | ✅ | ❌ | ❌ |
| Automatic Frequency & Trend | ✅ | ❌ | ❌ | ❌ |
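To make a few of the Data Wrangling rows concrete, here is a minimal sketch that pads gaps, imputes the padded values, filters by time, and summarizes by time on an ordinary tibble. The `daily` data set and its column names are made up for illustration, and the argument choices (`.by = "day"`, `period = 1`) are assumptions rather than recommended settings.

```r
library(dplyr)
library(timetk)

# Hypothetical daily series: 2020-01-04 and 2020-01-05 are missing
daily <- tibble(
  date  = as.Date("2020-01-01") + c(0, 1, 2, 5, 6),
  value = c(10, 12, 11, 15, 14)
)

# Padding gaps: add the missing days as rows with NA values
padded <- daily %>%
  pad_by_time(date, .by = "day")

# Imputation: fill the NA values (period = 1 means no seasonality is assumed)
filled <- padded %>%
  mutate(value = ts_impute_vec(value, period = 1))

# Time-based filtering: keep a window of dates, endpoints included
window <- filled %>%
  filter_by_time(date, .start_date = "2020-01-03", .end_date = "2020-01-05")

# Time-based summarization: total value per week
weekly <- filled %>%
  summarise_by_time(date, .by = "week", total = sum(value))
```

Everything stays a plain tibble from start to finish, so the result can be passed straight to other tidyverse verbs.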
Timetk is an amazing package that is part of the modeltime ecosystem for time series analysis and forecasting. The forecasting system is extensive, and it can take a long time to learn:
- Many algorithms
- Ensembling and Resampling
- Machine Learning
- Deep Learning
- Scalable Modeling: 10,000+ time series
You’re probably thinking, “How am I ever going to learn time series forecasting?” Here’s the solution that will save you years of struggling.
Become the forecasting expert for your organization
High-Performance Time Series Course
Time series is changing. Businesses now need 10,000+ time series forecasts every day. This is what I call a High-Performance Time Series Forecasting System (HPTSF) - Accurate, Robust, and Scalable Forecasting.
High-Performance Forecasting Systems will save companies by improving accuracy and scalability. Imagine what will happen to your career if you can provide your organization a “High-Performance Time Series Forecasting System” (HPTSF System).
I teach how to build an HPTSF System in my High-Performance Time Series Forecasting Course. You will learn:
- Time Series Machine Learning (cutting-edge) with Modeltime - 30+ Models (Prophet, ARIMA, XGBoost, Random Forest, & many more)
- Deep Learning with GluonTS (Competition Winners)
- Time Series Preprocessing, Noise Reduction, & Anomaly Detection
- Feature engineering using lagged variables & external regressors
- Hyperparameter Tuning
- Time series cross-validation
- Ensembling Multiple Machine Learning & Univariate Modeling Techniques (Competition Winner)
- Scalable Forecasting - Forecast 1000+ time series in parallel
- and more.
Become the Time Series Expert for your organization.
Take the High-Performance Time Series Forecasting Course
The `timetk` package wouldn’t be possible without other amazing time series packages.
- stats - Basically every `timetk` function that uses a period (frequency) argument owes it to `ts()`.
  - `plot_acf_diagnostics()`: Leverages `stats::acf()`, `stats::pacf()` & `stats::ccf()`
  - `plot_stl_diagnostics()`: Leverages `stats::stl()`
- lubridate: `timetk` makes heavy use of `floor_date()`, `ceiling_date()`, and `duration()` for “time-based phrases”.
  - Add and Subtract Time (`%+time%` & `%-time%`): `"2012-01-01" %+time% "1 month 4 days"` uses `lubridate` to intelligently offset the day
- xts: Used to calculate periodicity and fast lag automation.
- forecast (retired): Possibly my favorite R package of all time. It’s based on `ts`, and its successor is the `tidyverts` (`fable`, `tsibble`, `feasts`, and `fabletools`).
  - The `ts_impute_vec()` function for low-level vectorized imputation using STL + Linear Interpolation uses `na.interp()` under the hood.
  - The `ts_clean_vec()` function for low-level vectorized outlier cleaning and imputation using STL + Linear Interpolation uses `tsclean()` under the hood.
  - Box Cox transformation `auto_lambda()` uses `BoxCox.lambda()`.
- tibbletime (retired): While `timetk` does not import `tibbletime`, it uses much of the innovative functionality to interpret time-based phrases:
  - `tk_make_timeseries()` - Extends `seq.Date()` and `seq.POSIXt()` using a simple phrase like “2012-02” to populate the entire time series from start to finish in February 2012.
  - `filter_by_time()`, `between_time()` - Uses innovative endpoint detection from phrases like “2012”
  - `slidify()` is basically `rollify()` using `slider` (see below).
- slider: A powerful R package that provides a `purrr`-syntax for complex rolling (sliding) calculations.
  - `slidify()` uses `slider::pslide` under the hood.
  - `slidify_vec()` uses `slider::slide_vec()` for simple vectorized rolls (slides).
- padr: Used for padding time series from low frequency to high frequency and filling in gaps.
  - The `pad_by_time()` function is a wrapper for `padr::pad()`.
  - See the `step_ts_pad()` to apply padding as a preprocessing recipe!
- TSstudio: This is the best interactive time series visualization tool out there. It leverages the `ts` system, which is the same system the `forecast` R package uses. A ton of inspiration for visuals came from using `TSstudio`.
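Several of the acknowledgements above describe “time-based phrases” and rolling calculations. The short sketch below shows how three of those functions might be called. The input vector `x` and the window settings are illustrative assumptions, not values taken from the package documentation.

```r
library(timetk)

# lubridate-backed date arithmetic using a time-based phrase
shifted <- "2012-01-01" %+time% "1 month 4 days"

# A partial date phrase expands to every day in February 2012
feb_2012 <- tk_make_timeseries("2012-02", by = "day")

# slider-backed rolling mean over a 3-observation, right-aligned window;
# the first two positions have incomplete windows and come back as NA
x      <- c(1, 2, 3, 4, 5, 6)
rolled <- slidify_vec(x, mean, .period = 3, .align = "right")
```

A right-aligned window only looks backward, which is usually what you want for features that must not leak future values into a forecast.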