Kats - Time Series Analysis Framework
Kats aims to be a one-stop shop for time series analysis: from understanding key statistics and detecting anomalies to forecasting trends, feature extraction/embedding, and multivariate analysis.
Time series analysis is a fundamental domain in data science and machine learning, with applications in sectors such as e-commerce, finance, capacity planning, supply chain management, medicine, weather, energy, astronomy, and many others.
Time Series Analysis
Time series analysis is a statistical technique for examining and modeling time-dependent data. Some common features of time series analysis tools include:
- Time series decomposition: the ability to break down a time series into its component parts, such as trend, seasonality, and residuals
- Forecasting: the ability to predict future values of a time series based on past data
- Anomaly detection: the ability to identify unusual or unexpected behavior in a time series
- Multivariate analysis: the ability to analyze multiple time series simultaneously, taking into account the relationships between them
- Feature extraction/embedding: the ability to extract meaningful features from time series data or to represent time series data in a lower-dimensional space for further analysis.
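To make the first item in the list concrete, here is a minimal sketch of an additive decomposition in plain Python. It is not how Kats implements decomposition. The function names (centered_moving_average, decompose_additive) and the toy series are made up for this illustration, and the sketch assumes the series covers at least two full seasonal cycles.

```python
def centered_moving_average(values, period):
    """Trend estimate: centered moving average, None where the window does not fit."""
    n, half = len(values), period // 2
    trend = [None] * n
    for i in range(half, n - half):
        window = values[i - half : i + half + 1]
        total = sum(window)
        if period % 2 == 0:
            # Even period: the window has period + 1 points, so half-weight both ends.
            total -= 0.5 * (window[0] + window[-1])
        trend[i] = total / period
    return trend


def decompose_additive(values, period):
    """Split values into trend + seasonal + residual (additive model)."""
    trend = centered_moving_average(values, period)
    # Average the detrended values at each position within the cycle.
    buckets = [[] for _ in range(period)]
    for i, (v, t) in enumerate(zip(values, trend)):
        if t is not None:
            buckets[i % period].append(v - t)
    means = [sum(b) / len(b) for b in buckets]
    offset = sum(means) / period  # center the seasonal component around zero
    seasonal = [means[i % period] - offset for i in range(len(values))]
    residual = [
        None if t is None else v - t - s
        for v, t, s in zip(values, trend, seasonal)
    ]
    return trend, seasonal, residual


# Toy series: a linear trend (2 per step) plus a repeating pattern of length 4.
pattern = [3.0, -1.0, -2.0, 0.0]
series = [2.0 * i + pattern[i % 4] for i in range(16)]
trend, seasonal, residual = decompose_additive(series, period=4)
```

On this noise-free toy series the sketch recovers the linear trend and the repeating pattern, and the residuals are zero wherever the trend is defined. Real data would leave non-zero residuals, which is where anomaly detection usually starts.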
These are just a few examples of the types of functionality that may be included in a time series analysis tool. Let’s see what Kats can provide us with.
Kats is a one-stop shop
Kats is a lightweight, easy-to-use, and generalizable framework for generic time series analysis, including forecasting, anomaly detection, multivariate analysis, and feature extraction/embedding.
According to its authors, Kats is the first comprehensive Python library for generic time series analysis, providing both classical and advanced techniques for modeling time series data.
Kats connects various domains in time series analysis, where the users can explore the basic characteristics of their time series data, predict the future values, monitor the anomalies, and incorporate them into their ML models and pipelines.
What it does
Kats provides a set of algorithms and models for four domains in time series analysis: forecasting, detection, feature extraction and embedding, and multivariate analysis.
- Forecasting: Kats provides a full set of tools for forecasting that includes 10+ individual forecasting models, ensembling, a self-supervised learning (meta-learning) model, backtesting, hyperparameter tuning, and empirical prediction intervals.
- Detection: Kats can detect various patterns in time series data, including seasonality, outliers, change points, and slow trend changes.
- Feature extraction and embedding: The time series feature (TSFeature) extraction module in Kats can produce 65 features with clear statistical definitions, which can be incorporated in most machine learning (ML) models, such as classification and regression.
- Useful utilities: Kats also provides a set of useful utilities, such as time series simulators.
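To give a feel for the feature extraction item above, here is a small sketch that turns a series into a handful of statistical features in plain Python. It does not use the Kats TSFeature module. The function name basic_features and the dictionary keys are hypothetical, and the three features are only examples of what "clear statistical definitions" can look like.

```python
import statistics


def basic_features(values):
    """Compute a few simple, well-defined features of a series."""
    mean = statistics.fmean(values)
    var = statistics.pvariance(values, mu=mean)  # population variance
    # Lag-1 autocorrelation: how strongly each point resembles the previous one.
    num = sum((a - mean) * (b - mean) for a, b in zip(values, values[1:]))
    den = sum((v - mean) ** 2 for v in values)
    acf1 = num / den if den else 0.0
    return {"mean": mean, "var": var, "acf1": acf1}


features = basic_features([1.0, 2.0, 3.0, 4.0, 5.0])
```

A dictionary like this can be flattened into one row of a feature matrix, with one row per time series, and passed to a classification or regression model.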
Installation in Python
Kats is on PyPI, so you can install it with pip: pip install kats
Forecasting Example
Using the Prophet model to forecast the air_passengers data set.
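A sketch of what such a forecast can look like is given here, wrapped in a function so that nothing runs until you call it. It assumes that pandas and Kats are installed, that the air_passengers data is available as a two-column CSV file at a path you supply (the csv_path argument is a placeholder), and that the Kats interface matches the one in its public examples. The function name, the multiplicative seasonality setting, and the 30-step monthly horizon are illustrative choices rather than requirements.

```python
def forecast_air_passengers(csv_path, steps=30, freq="MS"):
    """Fit a Prophet model through Kats and forecast `steps` periods ahead."""
    # Local imports: pandas and Kats are third-party packages that must be installed.
    import pandas as pd
    from kats.consts import TimeSeriesData
    from kats.models.prophet import ProphetModel, ProphetParams

    # Kats expects a column named "time" plus a value column.
    df = pd.read_csv(csv_path, header=0, names=["time", "passengers"])
    ts = TimeSeriesData(df)

    # Multiplicative seasonality: seasonal swings grow with the level of the series.
    params = ProphetParams(seasonality_mode="multiplicative")
    model = ProphetModel(ts, params)
    model.fit()

    # "MS" is the pandas alias for month-start frequency.
    return model.predict(steps=steps, freq=freq)
```

Calling forecast_air_passengers with the path to your copy of the data set returns the forecast for the next 30 months. If your installed version of Kats differs from the one assumed here, check its documentation for the current parameter names.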
Detection Examples
Kats can detect the following patterns:
- Outlier Detection: Detects anomalous increases or decreases within the time series.
- Change Point Detection: Detects sudden changes in the time series. Kats has 3 different algorithms for this:
  - CUSUM Detection
  - Bayesian Online Change Point Detection (BOCPD)
  - Robust Stat Detection
- Trend Change Detection: Detects trend changes in the time series using the Mann-Kendall Detection algorithm.
Outlier Detection
Outlier detection requires at least 24 data points (rows).
Kats can also clean the outliers it detects. It offers 2 methods for this:
- No Interpolation: Replaces outliers with NaN, without interpolating.
- With Interpolation: Replaces outliers with linearly interpolated values.
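The difference between the two cleaning methods is easy to see in a small standalone sketch. The code here does not call Kats. It flags outliers with a simple interquartile-range rule applied directly to the raw values, which is an assumption of this example and not a description of the Kats detector, and then applies each cleaning strategy. The function names, the iqr_mult parameter, and the sample readings are all made up for illustration.

```python
import math
import statistics


def iqr_outlier_indices(values, iqr_mult=3.0):
    """Indices of points lying more than iqr_mult * IQR outside the quartiles."""
    q1, _, q3 = statistics.quantiles(values, n=4)
    spread = iqr_mult * (q3 - q1)
    return [i for i, v in enumerate(values) if v < q1 - spread or v > q3 + spread]


def remove_outliers(values, outlier_indices, interpolate=False):
    """Replace outliers with NaN, or with a linear interpolation of their neighbours."""
    outliers = set(outlier_indices)
    cleaned = [math.nan if i in outliers else float(v) for i, v in enumerate(values)]
    if not interpolate:
        return cleaned
    good = [i for i in range(len(cleaned)) if i not in outliers]
    for i in sorted(outliers):
        left = max((g for g in good if g < i), default=None)
        right = min((g for g in good if g > i), default=None)
        if left is None or right is None:
            continue  # no valid neighbour on one side: leave the NaN in place
        weight = (i - left) / (right - left)
        cleaned[i] = cleaned[left] + weight * (cleaned[right] - cleaned[left])
    return cleaned


readings = [10, 11, 12, 11, 10, 12, 11, 95, 10, 11, 12, 11]
flagged = iqr_outlier_indices(readings)
with_nan = remove_outliers(readings, flagged, interpolate=False)
interpolated = remove_outliers(readings, flagged, interpolate=True)
```

Leaving NaN in place keeps the gap visible to downstream code, while interpolation gives a complete series at the cost of inventing a plausible value. Which one is appropriate depends on whether the next step in your pipeline tolerates missing data.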
Change Point Detection
With Kats it is possible to detect the change points in the time series. There are 3 different algorithms in Kats for this process:
- CUSUMDetector
- BOCPDetector
- RobustStatDetector
Using the CUSUM detection algorithm on a simulated data set.
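The idea behind CUSUM can be sketched in a few lines of plain Python. This is a simplified, single change point version written for illustration and makes no attempt to reproduce CUSUMDetector: it only locates the most likely shift in the mean and reports its direction, without any significance test. The function name cusum_changepoint and the simulated series (40 points around 0 followed by 20 points around 5) are assumptions of this example.

```python
import random
import statistics


def cusum_changepoint(values):
    """Locate a single mean shift from the cumulative sum of deviations from the mean."""
    if len(values) < 2:
        raise ValueError("need at least two observations")
    mean = statistics.fmean(values)
    cusum, running = [], 0.0
    for v in values:
        running += v - mean
        cusum.append(running)
    # The cumulative sum drifts away from zero until the mean shifts, then drifts back.
    # Its last entry is ~0 by construction, so it is excluded from the search.
    last_before = max(range(len(values) - 1), key=lambda i: abs(cusum[i]))
    before, after = values[: last_before + 1], values[last_before + 1 :]
    rising = statistics.fmean(after) > statistics.fmean(before)
    # Return the index of the first point after the change, plus the direction.
    return last_before + 1, "increase" if rising else "decrease"


# Simulated data: the level jumps from about 0 to about 5 at index 40.
rng = random.Random(0)
simulated = [rng.gauss(0, 1) for _ in range(40)] + [rng.gauss(5, 1) for _ in range(20)]
index, direction = cusum_changepoint(simulated)
```

With a jump this large relative to the noise, the reported index should land at or very near 40. A production detector would also test whether the shift is statistically significant before reporting it.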
Trend Change Detection
Kats can also detect the trend direction of a series. It uses MKDetector for this, which is based on the Mann-Kendall test, a non-parametric test for a monotonic trend.
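For readers who want to see what the Mann-Kendall test computes, here is a minimal sketch in plain Python. It implements the textbook form of the test under the assumption that the series has no tied values, and it does not reproduce MKDetector. The function name mann_kendall and the alpha parameter are choices made for this illustration.

```python
import math


def mann_kendall(values, alpha=0.05):
    """Return (S, z, p, trend) for the Mann-Kendall test, assuming no tied values."""
    n = len(values)
    # S = (number of later-is-larger pairs) - (number of later-is-smaller pairs).
    s = sum(
        (values[j] > values[i]) - (values[j] < values[i])
        for i in range(n - 1)
        for j in range(i + 1, n)
    )
    var_s = n * (n - 1) * (2 * n + 5) / 18  # variance of S without a tie correction
    if s > 0:
        z = (s - 1) / math.sqrt(var_s)  # continuity correction
    elif s < 0:
        z = (s + 1) / math.sqrt(var_s)
    else:
        z = 0.0
    # Two-sided p-value from the standard normal distribution.
    p = math.erfc(abs(z) / math.sqrt(2))
    if p >= alpha:
        trend = "no trend"
    else:
        trend = "increasing" if z > 0 else "decreasing"
    return s, z, p, trend


s, z, p, trend = mann_kendall([1, 2, 3, 4, 5, 6, 7, 8, 9, 10])
```

Because the test only looks at the signs of pairwise differences, it makes no assumption about the shape of the trend or the distribution of the values. That is what "non-parametric" means here.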
Useful links
- Homepage: https://facebookresearch.github.io/Kats/
- Kats Python package: https://pypi.org/project/kats/0.1.0/
- Facebook Engineering: https://engineering.fb.com/2021/06/21/open-source/kats/
- Source code repository: https://github.com/facebookresearch/kats
- Contributing: https://github.com/facebookresearch/Kats/blob/master/CONTRIBUTING.md
- Tutorials: https://github.com/facebookresearch/Kats/tree/master/tutorials
Conclusion
Kats is a time series analysis tool whose meta-learning method can identify the most appropriate model and corresponding parameters for a given time series. It does this by extracting metadata with TSFeatures and applying the Random Forest algorithm to determine the best model based on this metadata. This feature allows users to build their own automatic machine learning (AutoML) tool on top of Kats.