Time Series Forecasting with sktime

2021-03-02 沙克芬 SharkFin

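This is a translation of the original tutorial; apologies for any inaccuracies. It is also hosted on Yuque: https://www.yuque.com/alipayqgthu1irbf/sharkfin/que2qh

The Yuque version is easier to read.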

https://www.sktime.org/en/latest/examples/01_forecasting.html

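Forecasting with sktime

In forecasting, we are interested in using past data to make predictions about the future. sktime provides common statistical forecasting algorithms and tools for building composite machine learning models.

For more details, see our paper on forecasting with sktime, in which we discuss the forecasting API in more detail and use it to replicate and extend the M4 study.

Preliminaries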

[2]:
from warnings import simplefilter
import numpy as np
import pandas as pd
from sktime.datasets import load_airline
from sktime.forecasting.arima import ARIMA, AutoARIMA
from sktime.forecasting.base import ForecastingHorizon
from sktime.forecasting.compose import (
    EnsembleForecaster,
    ReducedRegressionForecaster,
    TransformedTargetForecaster,
)
from sktime.forecasting.exp_smoothing import ExponentialSmoothing
from sktime.forecasting.model_selection import (
    ForecastingGridSearchCV,
    SlidingWindowSplitter,
    temporal_train_test_split,
)
from sktime.forecasting.naive import NaiveForecaster
from sktime.forecasting.theta import ThetaForecaster
from sktime.forecasting.trend import PolynomialTrendForecaster
from sktime.performance_metrics.forecasting import sMAPE, smape_loss
from sktime.transformations.series.detrend import Deseasonalizer, Detrender
from sktime.utils.plotting import plot_series
simplefilter("ignore", FutureWarning)
%matplotlib inline

Data

To start, we use the Box-Jenkins univariate airline data set, which shows the number of international airline passengers per month from 1949 to 1960.

[3]:
y = load_airline()
plot_series(y);

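A time series consists of a sequence of timepoint-value pairs, where the value is what we observed and the timepoint is when we observed it.

We represent a time series as a pd.Series whose index holds the time points. sktime supports pandas integer, period and timestamp indices. In this example, we have a period index.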

[4]:
y.index
[4]:
PeriodIndex(['1949-01', '1949-02', '1949-03', '1949-04', '1949-05', '1949-06',
'1949-07', '1949-08', '1949-09', '1949-10',
...
'1960-03', '1960-04', '1960-05', '1960-06', '1960-07', '1960-08',
'1960-09', '1960-10', '1960-11', '1960-12'],
dtype='period[M]', name='Period', length=144, freq='M')

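Specifying the forecasting task

Next we define a forecasting task.

Since the data is monthly, we hold out the last 36 observations (three years) as test data and train on the rest. We can split the data as follows.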

[5]:
y_train, y_test = temporal_train_test_split(y, test_size=36)
plot_series(y_train, y_test, labels=["y_train", "y_test"])
print(y_train.shape[0], y_test.shape[0])
108 36
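To make the idea of a temporal split concrete, here is a minimal sketch in plain Python. The helper name `temporal_split` and the stand-in data are our own illustration; this is not how sktime implements `temporal_train_test_split`.

```python
def temporal_split(y, test_size):
    """Hold out the last `test_size` observations, preserving time order."""
    return y[:-test_size], y[-test_size:]


y_demo = list(range(144))  # stand-in for 144 monthly observations
train_demo, test_demo = temporal_split(y_demo, test_size=36)
print(len(train_demo), len(test_demo))  # 108 36
```

Every test point comes strictly after every training point, which is the property we need for forecasting.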

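When we generate forecasts, we need to specify the forecasting horizon and pass it to our forecasting algorithm.

Relative forecasting horizon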

One of the simplest ways is to define a np.array with the steps ahead that you want to predict relative to the end of the training series.

[6]:
fh = np.arange(len(y_test)) + 1
fh
[6]:
array([ 1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12, 13, 14, 15, 16, 17,
18, 19, 20, 21, 22, 23, 24, 25, 26, 27, 28, 29, 30, 31, 32, 33, 34,
35, 36])

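So here we are interested in forecasting from the first to the 36th step ahead. Of course, you can use other forecasting horizons. For example, to forecast only the second and fifth step ahead, you could write: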

import numpy as np
fh = np.array([2, 5]) # 2nd and 5th step ahead

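Absolute forecasting horizon

Alternatively, we can specify the forecasting horizon using the absolute time points we want to predict. To do so, we need to use sktime's ForecastingHorizon class. This way, we can simply create the forecasting horizon from the time points of the test set.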

[7]:
fh = ForecastingHorizon(y_test.index, is_relative=False)
fh
[7]:
ForecastingHorizon(['1958-01', '1958-02', '1958-03', '1958-04', '1958-05', '1958-06',
'1958-07', '1958-08', '1958-09', '1958-10', '1958-11', '1958-12',
'1959-01', '1959-02', '1959-03', '1959-04', '1959-05', '1959-06',
'1959-07', '1959-08', '1959-09', '1959-10', '1959-11', '1959-12',
'1960-01', '1960-02', '1960-03', '1960-04', '1960-05', '1960-06',
'1960-07', '1960-08', '1960-09', '1960-10', '1960-11', '1960-12'],
dtype='period[M]', name='Period', freq='M', is_relative=False)
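Relative and absolute horizons describe the same time points once the cutoff (the last training period) is known. The sketch below shows the arithmetic for monthly data in plain Python. The function `relative_to_absolute` is a hypothetical helper for illustration, not part of sktime.

```python
def relative_to_absolute(cutoff_year, cutoff_month, steps):
    """Map steps ahead of a monthly cutoff to (year, month) pairs."""
    out = []
    for h in steps:
        months = cutoff_year * 12 + (cutoff_month - 1) + h  # months since year 0
        out.append((months // 12, months % 12 + 1))
    return out


# cutoff = last training period, 1957-12
horizon = relative_to_absolute(1957, 12, range(1, 37))
print(horizon[0], horizon[-1])  # (1958, 1) (1960, 12)
```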

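Generating forecasts

As in scikit-learn, in order to make forecasts, we first need to specify (or build) a model, then fit it to the training data, and finally call predict to generate forecasts for the given forecasting horizon.

sktime comes with several forecasting algorithms (or forecasters) and tools for composite model building. All forecasters share a common interface. Forecasters are trained on a single series of data and make forecasts for the provided forecasting horizon.

We start with two naïve forecasting strategies, which can serve as a reference when comparing more sophisticated methods.

Predicting the last value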

[8]:
# using sktime
forecaster = NaiveForecaster(strategy="last")
forecaster.fit(y_train)
y_pred = forecaster.predict(fh)
plot_series(y_train, y_test, y_pred, labels=["y_train", "y_test", "y_pred"])
smape_loss(y_pred, y_test)
[8]:
0.23195770387951434

Predicting the last value of the same season

[9]:
forecaster = NaiveForecaster(strategy="last", sp=12)
forecaster.fit(y_train)
y_pred = forecaster.predict(fh)
plot_series(y_train, y_test, y_pred, labels=["y_train", "y_test", "y_pred"])
smape_loss(y_pred, y_test)
[9]:
0.145427686270316
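The two naïve strategies and the error measure are simple enough to write out by hand. The following NumPy sketch is our own illustration: the function names are hypothetical, and `smape` uses one common definition of the symmetric MAPE, which we assume matches what `smape_loss` reports.

```python
import numpy as np


def naive_last(y_train, n_steps):
    """Repeat the last observed value for every step ahead."""
    return np.full(n_steps, y_train[-1], dtype=float)


def seasonal_naive(y_train, n_steps, sp):
    """Repeat the last observed season (the final `sp` values) cyclically."""
    last_season = np.asarray(y_train[-sp:], dtype=float)
    return last_season[np.arange(n_steps) % sp]


def smape(y_true, y_pred):
    """Symmetric MAPE: mean of 2|y - yhat| / (|y| + |yhat|)."""
    y_true, y_pred = np.asarray(y_true, float), np.asarray(y_pred, float)
    return np.mean(2 * np.abs(y_true - y_pred) / (np.abs(y_true) + np.abs(y_pred)))


toy = np.array([1.0, 2.0, 3.0, 4.0, 2.0, 3.0, 4.0, 5.0])  # period-4 pattern plus drift
print(naive_last(toy, 3))            # [5. 5. 5.]
print(seasonal_naive(toy, 6, sp=4))  # [2. 3. 4. 5. 2. 3.]
```

On seasonal data the second strategy usually does better, which is consistent with the lower sMAPE in the cell above.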

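Why not just use scikit-learn?

You may wonder why we do not simply use scikit-learn for forecasting. Isn't forecasting, in the end, just a regression problem?

In principle, yes. But scikit-learn is not designed for solving forecasting tasks, so beware of the pitfalls!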

[10]:
from sklearn.model_selection import train_test_split
y_train, y_test = train_test_split(y)
plot_series(y_train.sort_index(), y_test.sort_index(), labels=["y_train", "y_test"]);

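This leads to leakage:

The data you are using to train a machine learning algorithm happens to contain the information you are trying to predict.

However, train_test_split(y, shuffle=False) works, and this is what temporal_train_test_split(y) does in sktime.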

[11]:
y_train, y_test = temporal_train_test_split(y)
plot_series(y_train, y_test, labels=["y_train", "y_test"]);
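The leakage can be checked directly on the time indices. The sketch below uses only the standard library and a fixed seed. It mimics a shuffled 75/25 split rather than calling scikit-learn, and the helper `n_future_points` is our own.

```python
import random

n = 144
indices = list(range(n))

# Shuffled split, as a default random train/test split would produce
shuffled = indices[:]
random.Random(0).shuffle(shuffled)
train_shuffled, test_shuffled = shuffled[:108], shuffled[108:]

# Temporal split: the test set is strictly later than the training set
train_temporal, test_temporal = indices[:108], indices[108:]


def n_future_points(train, test):
    """Number of training points that lie after the earliest test point."""
    return sum(i > min(test) for i in train)


print(n_future_points(train_temporal, test_temporal))      # 0
print(n_future_points(train_shuffled, test_shuffled) > 0)  # True
```

With the shuffled split, the model is trained on observations that come after some of the points it is later asked to predict.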

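To use scikit-learn, we first have to transform the data into the required tabular format, then fit a regressor, and finally generate forecasts.

Key idea: reduction

Forecasting is often solved via regression. This approach is sometimes called reduction, because we reduce the forecasting task to the simpler but related task of tabular regression. This allows us to apply any regression algorithm to the forecasting problem.

Reduction to regression works as follows. We first need to transform the data into the required tabular format. We can do this by cutting the training series into windows of a fixed length and stacking them on top of each other. Our target variable consists of the observation that follows each window.

We could write some code to do this, as was done, for example, in the M4 competition.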

[12]:
# slightly modified code from the M4 competition
def split_into_train_test(data, in_num, fh):
    """
    Splits the series into train and test sets.

    Each step takes multiple points as inputs
    :param data: an individual TS
    :param fh: number of out of sample points
    :param in_num: number of input points for the forecast
    :return:
    """
    train, test = data[:-fh], data[-(fh + in_num) :]
    x_train, y_train = train[:-1], np.roll(train, -in_num)[:-in_num]
    x_test, y_test = test[:-1], np.roll(test, -in_num)[:-in_num]
    # x_test, y_test = train[-in_num:], np.roll(test, -in_num)[:-in_num]

    # reshape input to be [samples, time steps, features]
    # (N-NF samples, 1 time step, 1 feature)
    x_train = np.reshape(x_train, (-1, 1))
    x_test = np.reshape(x_test, (-1, 1))
    temp_test = np.roll(x_test, -1)
    temp_train = np.roll(x_train, -1)
    for x in range(1, in_num):
        x_train = np.concatenate((x_train[:-1], temp_train[:-1]), 1)
        x_test = np.concatenate((x_test[:-1], temp_test[:-1]), 1)
        temp_test = np.roll(temp_test, -1)[:-1]
        temp_train = np.roll(temp_train, -1)[:-1]

    return x_train, y_train, x_test, y_test
[13]:
# here we split the time index, rather than the actual values,
# to show how we split the windows
feature_window, target_window, _, _ = split_into_train_test(
    np.arange(len(y)), 10, len(fh)
)

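To better understand this data transformation, we can look at how the training series is split into windows. Here we show the generated windows expressed as integer indices.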

[14]:
feature_window[:5, :]
[14]:
array([[ 0, 1, 2, 3, 4, 5, 6, 7, 8, 9],
[ 1, 2, 3, 4, 5, 6, 7, 8, 9, 10],
[ 2, 3, 4, 5, 6, 7, 8, 9, 10, 11],
[ 3, 4, 5, 6, 7, 8, 9, 10, 11, 12],
[ 4, 5, 6, 7, 8, 9, 10, 11, 12, 13]])
[15]:
target_window[:5]
[15]:
array([10, 11, 12, 13, 14])
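The same windows can be produced with much less code. The sketch below is our own compact alternative (the name `make_windows` is hypothetical): each row is a window of 10 consecutive indices and the target is the index that follows.

```python
import numpy as np


def make_windows(series, window_length):
    """Stack fixed-length windows as rows; the target is the next observation."""
    series = np.asarray(series)
    n_windows = len(series) - window_length
    X = np.stack([series[i : i + window_length] for i in range(n_windows)])
    y = series[window_length:]
    return X, y


X_idx, y_idx = make_windows(np.arange(108), window_length=10)
print(X_idx.shape, y_idx.shape)  # (98, 10) (98,)
print(X_idx[0], y_idx[0])        # [0 1 2 3 4 5 6 7 8 9] 10
```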
[16]:
# now we can split the actual values of the time series
x_train, y_train, x_test, y_test = split_into_train_test(y.values, 10, len(fh))
print(x_train.shape, y_train.shape)
(98, 10) (98,)
[17]:
from sklearn.ensemble import RandomForestRegressor
model = RandomForestRegressor()
model.fit(x_train, y_train)
[17]:
RandomForestRegressor()

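What are the potential pitfalls here?

This requires a lot of hand-written code, which is often error-prone, not modular and not tunable.

Note also that these steps involve a number of implicit hyper-parameters: the way you slice the time series into windows (e.g. the window length), and the way you generate forecasts (recursive strategy, direct strategy, or other hybrid strategies).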
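To illustrate the direct strategy mentioned above, the sketch below fits one separate linear model per step ahead, using ordinary least squares from NumPy. This is a toy illustration under our own assumptions (linear models, a perfectly linear toy series, hypothetical function names), not the M4 code or sktime's implementation.

```python
import numpy as np


def fit_direct(series, window_length, n_steps):
    """Direct strategy: fit one linear model per step ahead h = 1..n_steps."""
    series = np.asarray(series, dtype=float)
    coefs = []
    for h in range(1, n_steps + 1):
        n_rows = len(series) - window_length - h + 1
        X = np.stack([series[i : i + window_length] for i in range(n_rows)])
        X = np.hstack([np.ones((n_rows, 1)), X])  # intercept column
        y = series[window_length + h - 1 :]       # value h steps after each window
        coefs.append(np.linalg.lstsq(X, y, rcond=None)[0])
    return coefs


def predict_direct(series, window_length, coefs):
    """Apply each per-step model to the last observed window."""
    last = np.concatenate([[1.0], np.asarray(series, dtype=float)[-window_length:]])
    return np.array([last @ c for c in coefs])


trend = np.arange(20, dtype=float)  # toy series 0, 1, ..., 19
models = fit_direct(trend, window_length=3, n_steps=3)
print(np.round(predict_direct(trend, 3, models), 6))  # approximately 20, 21, 22
```

In contrast, the recursive strategy fits a single one-step-ahead model and feeds its own forecasts back in as inputs.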

Pitfall 3: Given a fitted regression algorithm, how can we generate forecasts?

[18]:
print(x_test.shape, y_test.shape)
# add back time index to y_test
y_test = pd.Series(y_test, index=y.index[-len(fh) :])
(36, 10) (36,)
[19]:
y_pred = model.predict(x_test)
smape_loss(pd.Series(y_pred, index=y_test.index), y_test)
[19]:
0.11455911283150787

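But what is the problem here?

We do not actually make a multi-step-ahead forecast up to the 36th step. Instead, we make 36 single-step-ahead forecasts, always using the most recent data. But that is a solution to a different learning task!

To fix this problem, we could write some code to forecast recursively, as in the M4 competition.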

[20]:
# slightly modified code from the M4 study
predictions = []
last_window = x_train[-1, :].reshape(1, -1) # make it into 2d array
last_prediction = model.predict(last_window)[0] # take value from array
for i in range(len(fh)):
    # append prediction
    predictions.append(last_prediction)

    # update last window using previously predicted value
    last_window[0] = np.roll(last_window[0], -1)
    last_window[0, (len(last_window[0]) - 1)] = last_prediction

    # predict next step ahead
    last_prediction = model.predict(last_window)[0]
y_pred_rec = pd.Series(predictions, index=y_test.index)
smape_loss(y_pred_rec, y_test)
[20]:
0.15670668827071418
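The recursive strategy can be separated from the model itself. The sketch below is our own simplification: `recursive_forecast` accepts any one-step-ahead predictor, and `drift_model` is a hypothetical stand-in for a fitted regressor.

```python
import numpy as np


def recursive_forecast(predict_one_step, last_window, n_steps):
    """Roll a one-step-ahead predictor forward, feeding forecasts back in."""
    window = np.asarray(last_window, dtype=float).copy()
    forecasts = []
    for _ in range(n_steps):
        y_next = predict_one_step(window)
        forecasts.append(y_next)
        window = np.append(window[1:], y_next)  # drop oldest value, add forecast
    return np.array(forecasts)


# Hypothetical one-step model: last value plus the most recent increment
def drift_model(window):
    return window[-1] + (window[-1] - window[-2])


print(recursive_forecast(drift_model, [1.0, 2.0, 4.0], n_steps=3))  # [ 6.  8. 10.]
```

Because each forecast becomes an input to the next step, errors can accumulate over the horizon, which is one reason the recursive sMAPE above is higher than the one-step-ahead figure.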

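Forecasting with sktime

sktime provides a meta-estimator for this approach, which is:

modular and compatible with scikit-learn, so that we can easily apply any scikit-learn regressor to solve our forecasting problem;

tunable, allowing us to tune hyper-parameters such as the window length or the strategy used to generate forecasts;

adaptive, in the sense that it adapts scikit-learn's estimator interface to that of a forecaster, making sure that we can properly tune and evaluate our model.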

[21]:
y = load_airline()
y_train, y_test = temporal_train_test_split(y, test_size=36)
print(y_train.shape[0], y_test.shape[0])
108 36
[22]:
from sklearn.neighbors import KNeighborsRegressor
regressor = KNeighborsRegressor(n_neighbors=1)
forecaster = ReducedRegressionForecaster(
    regressor=regressor, window_length=12, strategy="recursive"
)
forecaster.fit(y_train)
y_pred = forecaster.predict(fh)
plot_series(y_train, y_test, y_pred, labels=["y_train", "y_test", "y_pred"])
smape_loss(y_test, y_pred)
[22]:
0.14008272913734346

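sktime has a number of statistical forecasting algorithms, based on implementations in statsmodels. For example, to use exponential smoothing with an additive trend component and multiplicative seasonality, we can write the following.

Note that since this is monthly data, the seasonal periodicity (sp), or the number of periods per year, is 12.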

[23]:
forecaster = ExponentialSmoothing(trend="add", seasonal="multiplicative", sp=12)
forecaster.fit(y_train)
y_pred = forecaster.predict(fh)
plot_series(y_train, y_test, y_pred, labels=["y_train", "y_test", "y_pred"])
smape_loss(y_test, y_pred)
[23]:
0.05108252343492944
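For intuition about what exponential smoothing does, here is the simplest variant (no trend, no seasonality) in plain Python. This is only a sketch: the model fitted above additionally tracks a trend and a seasonal component, and initialising the level with the first observation is our own assumption.

```python
def simple_exp_smoothing(y, alpha):
    """Return the final smoothed level, which is also the flat forecast."""
    level = y[0]  # assumption: initialise the level with the first observation
    for value in y[1:]:
        level = alpha * value + (1 - alpha) * level
    return level


print(simple_exp_smoothing([10.0, 20.0, 30.0], alpha=0.5))  # 22.5
```

A larger alpha puts more weight on recent observations; alpha = 1 reduces to the naïve "last value" forecast.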

Exponential smoothing state space models can also be fitted automatically, similar to the ets function in R.

[24]:
from sktime.forecasting.ets import AutoETS
forecaster = AutoETS(auto=True, sp=12, n_jobs=-1)
forecaster.fit(y_train)
y_pred = forecaster.predict(fh)
plot_series(y_train, y_test, y_pred, labels=["y_train", "y_test", "y_pred"])
smape_loss(y_test, y_pred)
[24]:
0.06317467074033545

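Another common model is the ARIMA model. In sktime, we interface pmdarima (https://github.com/alkaline-ml/pmdarima), a package for automatically selecting the best ARIMA model. Since this searches over many possible model parametrisations, it may take a bit longer.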

[25]:
forecaster = AutoARIMA(sp=12, suppress_warnings=True)
forecaster.fit(y_train)
y_pred = forecaster.predict(fh)
plot_series(y_train, y_test, y_pred, labels=["y_train", "y_test", "y_pred"])
smape_loss(y_test, y_pred)
[25]:
0.04117062367656992

A single ARIMA model can also be configured manually.

[26]:
forecaster = ARIMA(
    order=(1, 1, 0), seasonal_order=(0, 1, 0, 12), suppress_warnings=True
)
forecaster.fit(y_train)
y_pred = forecaster.predict(fh)
plot_series(y_train, y_test, y_pred, labels=["y_train", "y_test", "y_pred"])
smape_loss(y_test, y_pred)
[26]:
0.04257105737228371
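In the configuration above, the middle entries of order=(1, 1, 0) and seasonal_order=(0, 1, 0, 12) request one round of ordinary differencing and one round of seasonal differencing at lag 12. The NumPy sketch below shows what these two operations do to a toy series of our own making; it does not fit an ARIMA model.

```python
import numpy as np


def difference(y, lag=1):
    """Subtract the value observed `lag` steps earlier."""
    y = np.asarray(y, dtype=float)
    return y[lag:] - y[:-lag]


t = np.arange(48)
toy_series = 2.0 * t + 10.0 * (t % 12 == 0)     # linear trend plus a yearly spike
seasonal_diff = difference(toy_series, lag=12)  # D=1 with period 12
both = difference(seasonal_diff, lag=1)         # then d=1
print(seasonal_diff[:3], both[:3])  # [24. 24. 24.] [0. 0. 0.]
```

Seasonal differencing removes the repeating yearly pattern, and the additional first difference removes what remains of the trend.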

BATS and TBATS are two other time series forecasting algorithms, included in sktime by wrapping the tbats package (https://github.com/intive-DataScience/tbats).

[27]:
from sktime.forecasting.bats import BATS
forecaster = BATS(sp=12, use_trend=True, use_box_cox=False)
forecaster.fit(y_train)
y_pred = forecaster.predict(fh)
plot_series(y_train, y_test, y_pred, labels=["y_train", "y_test", "y_pred"])
smape_loss(y_test, y_pred)
[27]:
0.08689500756325415

[28]:
from sktime.forecasting.tbats import TBATS
forecaster = TBATS(sp=12, use_trend=True, use_box_cox=False)
forecaster.fit(y_train)
y_pred = forecaster.predict(fh)
plot_series(y_train, y_test, y_pred, labels=["y_train", "y_test", "y_pred"])
smape_loss(y_test, y_pred)
[28]:
0.08493353477049964

sktime also provides an interface to Facebook's fbprophet (https://github.com/facebook/prophet). Note that fbprophet is strongly tied to data with a time index of type pd.DatetimeIndex, so we have to convert the index type first.

[30]:
# Convert index to pd.DatetimeIndex
z = y.copy()
z = z.to_timestamp(freq="M")
z_train, z_test = temporal_train_test_split(z, test_size=36)
[32]:
from sktime.forecasting.fbprophet import Prophet
forecaster = Prophet(
    seasonality_mode="multiplicative",
    n_changepoints=int(len(y_train) / 12),
    add_country_holidays={"country_name": "Germany"},
    yearly_seasonality=True,
)
forecaster.fit(z_train)
y_pred = forecaster.predict(fh.to_relative(cutoff=y_train.index[-1]))
y_pred.index = y_test.index
plot_series(y_train, y_test, y_pred, labels=["y_train", "y_test", "y_pred"])
smape_loss(y_test, y_pred)
INFO:fbprophet:Disabling weekly seasonality. Run prophet with weekly_seasonality=True to override this.
INFO:fbprophet:Disabling daily seasonality. Run prophet with daily_seasonality=True to override this.
[32]:
0.06939056917256975

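Building composite models

sktime provides a modular API for composite model building for forecasting.

Like scikit-learn, sktime provides a meta-forecaster to ensemble multiple forecasting algorithms. For example, we can combine different variants of exponential smoothing as follows.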

[ ]:
forecaster = EnsembleForecaster(
    [
        ("ses", ExponentialSmoothing(seasonal="multiplicative", sp=12)),
        (
            "holt",
            ExponentialSmoothing(
                trend="add", damped_trend=False, seasonal="multiplicative", sp=12
            ),
        ),
        (
            "damped",
            ExponentialSmoothing(
                trend="add", damped_trend=True, seasonal="multiplicative", sp=12
            ),
        ),
    ]
)
forecaster.fit(y_train)
y_pred = forecaster.predict(fh)
plot_series(y_train, y_test, y_pred, labels=["y_train", "y_test", "y_pred"])
smape_loss(y_test, y_pred)
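Conceptually, the simplest way to combine several forecasters is to average their forecasts step by step. The sketch below assumes an unweighted mean, which we believe is what the ensemble above does by default; the forecast values are made up for illustration.

```python
import numpy as np


def ensemble_mean(forecasts):
    """Average several forecasts of equal length, step by step."""
    return np.mean(np.vstack(forecasts), axis=0)


f_a = np.array([100.0, 110.0, 120.0])
f_b = np.array([90.0, 100.0, 130.0])
f_c = np.array([110.0, 120.0, 110.0])
print(ensemble_mean([f_a, f_b, f_c]))  # [100. 110. 120.]
```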

Tuning

In the ReducedRegressionForecaster, both the window_length and strategy arguments are hyper-parameters which we may want to optimise.

[31]:
forecaster = ReducedRegressionForecaster(
    regressor=regressor, window_length=15, strategy="recursive"
)
param_grid = {"window_length": [5, 10, 15]}
# we fit the forecaster on the initial window,
# and then use temporal cross-validation to find the optimal parameter
cv = SlidingWindowSplitter(initial_window=int(len(y_train) * 0.5))
gscv = ForecastingGridSearchCV(forecaster, cv=cv, param_grid=param_grid)
gscv.fit(y_train)
y_pred = gscv.predict(fh)
[32]:
plot_series(y_train, y_test, y_pred, labels=["y_train", "y_test", "y_pred"])
smape_loss(y_test, y_pred)
[32]:
0.14187443909112035

[33]:
gscv.best_params_
[33]:
{'window_length': 15}
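Temporal cross-validation differs from ordinary k-fold in that every training window precedes its test point. The generator below sketches the sliding-window idea in plain Python. It is a hypothetical simplification with its own parameter names and does not reproduce the exact behaviour or defaults of SlidingWindowSplitter.

```python
def sliding_window_splits(n_timepoints, window_length, fh=1, step=1):
    """Yield (train_indices, test_index) pairs that move forward in time."""
    start = 0
    while start + window_length + fh <= n_timepoints:
        train = list(range(start, start + window_length))
        test = start + window_length + fh - 1
        yield train, test
        start += step


for train_idx, test_idx in sliding_window_splits(6, window_length=3):
    print(train_idx, test_idx)
# [0, 1, 2] 3
# [1, 2, 3] 4
# [2, 3, 4] 5
```

Each candidate window_length is scored by averaging the forecast error over such splits, and the value with the best average is kept.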

Using scikit-learn's GridSearchCV, we can tune regressors imported from scikit-learn, in addition to tuning window_length.

[34]:
from sklearn.model_selection import GridSearchCV
# tuning the 'n_estimators' hyperparameter of RandomForestRegressor from scikit-learn
regressor_param_grid = {"n_estimators": [100, 200, 300]}
forecaster_param_grid = {"window_length": [5, 10, 15, 20, 25]}
# create a tunable regressor with GridSearchCV
regressor = GridSearchCV(RandomForestRegressor(), param_grid=regressor_param_grid)
forecaster = ReducedRegressionForecaster(
    regressor, window_length=15, strategy="recursive"
)
cv = SlidingWindowSplitter(initial_window=int(len(y_train) * 0.5))
gscv = ForecastingGridSearchCV(forecaster, cv=cv, param_grid=forecaster_param_grid)
gscv.fit(y_train)
y_pred = gscv.predict(fh)
plot_series(y_train, y_test, y_pred, labels=["y_train", "y_test", "y_pred"])
smape_loss(y_test, y_pred)
[34]:
0.12834791719456862

[35]:
print(gscv.best_params_, gscv.best_forecaster_.regressor_.best_params_)
{'window_length': 25} {'n_estimators': 200}

During tuning, we can use the scoring argument of ForecastingGridSearchCV to obtain the performance on a specific metric.

[36]:
gscv = ForecastingGridSearchCV(
    forecaster, cv=cv, param_grid=forecaster_param_grid, scoring=sMAPE()
)
gscv.fit(y_train)
pd.DataFrame(gscv.cv_results_)
[36]:

   mean_fit_time  mean_score_time  param_window_length                 params  mean_test_sMAPE  rank_test_sMAPE
0       5.004688         1.640830                    5   {'window_length': 5}         0.296896                5
1       4.795189         1.559630                   10  {'window_length': 10}         0.269926                4
2       4.777340         1.652045                   15  {'window_length': 15}         0.245826                3
3       4.634498         1.150868                   20  {'window_length': 20}         0.242409                2
4       4.768382         1.578212                   25  {'window_length': 25}         0.237839                1

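Note that the reduction approach above does not take any seasonality or trend into account so far, but we can easily specify a pipeline which first detrends the data.

sktime provides a generic detrender, a transformer which uses any forecaster and returns the in-sample residuals of the forecaster's predicted values. For example, to remove the linear trend of a time series, we can write: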

[37]:
# linear detrending
forecaster = PolynomialTrendForecaster(degree=1)
transformer = Detrender(forecaster=forecaster)
yt = transformer.fit_transform(y_train)
# internally, the Detrender uses the in-sample predictions
# of the PolynomialTrendForecaster
forecaster = PolynomialTrendForecaster(degree=1)
fh_ins = -np.arange(len(y_train)) # in-sample forecasting horizon
y_pred = forecaster.fit(y_train).predict(fh=fh_ins)
plot_series(y_train, y_pred, yt, labels=["y_train", "fitted linear trend", "residuals"]);
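Linear detrending amounts to fitting a straight line and keeping the residuals. The NumPy sketch below shows this on a toy series using np.polyfit. It illustrates the idea only and is not the Detrender's implementation; the helper name and toy data are our own.

```python
import numpy as np


def linear_detrend(y):
    """Fit a straight line by least squares and return (trend, residuals)."""
    y = np.asarray(y, dtype=float)
    t = np.arange(len(y))
    slope, intercept = np.polyfit(t, y, deg=1)
    trend = intercept + slope * t
    return trend, y - trend


t = np.arange(24)
toy_y = 5.0 + 2.0 * t + np.sin(2 * np.pi * t / 12)  # trend plus a seasonal wiggle
trend_hat, resid = linear_detrend(toy_y)
print(abs(resid.mean()) < 1e-8)  # True: least-squares residuals average to zero
```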

Let's use the detrender in a pipeline together with de-seasonalisation. Note that in forecasting, when we apply data transformations before fitting, we need to apply the inverse transformation to the predicted values. For this purpose, we provide the following pipeline class.

[38]:
forecaster = TransformedTargetForecaster(
    [
        ("deseasonalise", Deseasonalizer(model="multiplicative", sp=12)),
        ("detrend", Detrender(forecaster=PolynomialTrendForecaster(degree=1))),
        (
            "forecast",
            ReducedRegressionForecaster(
                regressor=regressor, window_length=12, strategy="recursive"
            ),
        ),
    ]
)
forecaster.fit(y_train)
y_pred = forecaster.predict(fh)
plot_series(y_train, y_test, y_pred, labels=["y_train", "y_test", "y_pred"])
smape_loss(y_test, y_pred)
[38]:
0.05448013755454164
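The transform, forecast, inverse-transform pattern can be written out for a single transformation. In the sketch below the transformation is linear detrending and the inner forecaster is a naïve "last value" rule. Both choices are our own simplifications for illustration; the pipeline above also removes seasonality and uses a regression-based forecaster.

```python
import numpy as np


def forecast_with_detrending(y, n_steps):
    """Detrend, forecast the residuals, then add the extrapolated trend back."""
    y = np.asarray(y, dtype=float)
    t = np.arange(len(y))
    slope, intercept = np.polyfit(t, y, deg=1)    # transform: fit linear trend
    resid = y - (intercept + slope * t)
    resid_forecast = np.full(n_steps, resid[-1])  # inner forecaster: naive "last"
    t_future = np.arange(len(y), len(y) + n_steps)
    return resid_forecast + intercept + slope * t_future  # inverse transform


line = 3.0 + 0.5 * np.arange(10)
print(np.round(forecast_with_detrending(line, 3), 6))  # approximately 8, 8.5, 9
```

Without the inverse step, the forecasts would live on the detrended scale and could not be compared with y_test.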

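Of course, we could again try to optimise the hyper-parameters of the components of the pipeline.

Below we discuss two other aspects of forecasting: online learning, where we want to dynamically update forecasts as new data comes in, and prediction intervals, which allow us to quantify the uncertainty of our forecasts.

Online Forecasting

For model evaluation, we sometimes want to evaluate multiple forecasts, using temporal cross-validation with a sliding window over the test data. To this end, we can leverage the forecasters in the online_learning module, which use a composite forecaster, PredictionWeightedEnsemble, to keep track of the loss accumulated by each forecaster and create a prediction weighted by the predictions of the most "accurate" forecasters.

Note that the forecasting task has changed: we make 35 predictions, since we need the first prediction to help update the weights, and we do not predict 36 steps ahead.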

[39]:
from sklearn.metrics import mean_squared_error
from sktime.forecasting.online_learning import (
    NormalHedgeEnsemble,
    OnlineEnsembleForecaster,
)

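First we need to initialise a PredictionWeightedEnsembler (here, a NormalHedgeEnsemble) that will keep track of the loss accumulated by each forecaster, and define which loss function we would like to use.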

[40]:
hedge_expert = NormalHedgeEnsemble(n_estimators=3, loss_func=mean_squared_error)

Then we can create the forecaster by defining the individual forecasters and specifying the PredictionWeightedEnsembler we are using. By fitting our forecaster and then updating and predicting with the update_predict function, we get:

[41]:
forecaster = OnlineEnsembleForecaster(
    [
        ("ses", ExponentialSmoothing(seasonal="multiplicative", sp=12)),
        (
            "holt",
            ExponentialSmoothing(
                trend="add", damped_trend=False, seasonal="multiplicative", sp=12
            ),
        ),
        (
            "damped",
            ExponentialSmoothing(
                trend="add", damped_trend=True, seasonal="multiplicative", sp=12
            ),
        ),
    ],
    ensemble_algorithm=hedge_expert,
)
forecaster.fit(y_train)
y_pred = forecaster.update_predict(y_test)
plot_series(y_train, y_test, y_pred, labels=["y_train", "y_test", "y_pred"])
smape_loss(y_test[1:], y_pred)
[41]:
0.04998488843486813

For a single update, you can use the update method.
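The idea of tracking each forecaster's loss and shifting weight towards the more accurate ones can be sketched with a basic multiplicative-weights (Hedge-style) update. Note that this is deliberately not the NormalHedge algorithm used above; the learning rate `eta`, the squared-error loss and the toy numbers are all assumptions made for illustration.

```python
import numpy as np


def update_weights(weights, forecasts, y_observed, eta=0.1):
    """Multiplicative-weights step: shrink each expert's weight by its squared error."""
    losses = (np.asarray(forecasts, dtype=float) - y_observed) ** 2
    new_weights = weights * np.exp(-eta * losses)
    return new_weights / new_weights.sum()


weights = np.full(3, 1 / 3)  # start with equal trust in three experts
expert_forecasts = np.array([[10.0, 12.0, 15.0],
                             [11.0, 13.0, 16.0]])  # one row per time step
observed = [10.0, 11.0]

for forecasts, y_obs in zip(expert_forecasts, observed):
    combined = weights @ forecasts  # forecast issued before seeing y_obs
    weights = update_weights(weights, forecasts, y_obs)

print(weights.argmax())  # 0: the first expert had the smallest errors
```

The combined forecast at each step uses only the weights learned from earlier observations, which is why the first prediction cannot be scored in the same way as the rest.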

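Prediction intervals

So far, we have only looked at point forecasts. In many cases, we are also interested in prediction intervals. sktime's interface supports prediction intervals, but we have not implemented them for all algorithms yet.

Here, we use the Theta forecasting algorithm: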

[42]:
forecaster = ThetaForecaster(sp=12)
forecaster.fit(y_train)
alpha = 0.05 # 95% prediction intervals
y_pred, pred_ints = forecaster.predict(fh, return_pred_int=True, alpha=alpha)
smape_loss(y_test, y_pred)
[42]:
0.08661467699983212
[43]:
fig, ax = plot_series(y_train, y_test, y_pred, labels=["y_train", "y_test", "y_pred"])
ax.fill_between(
    ax.get_lines()[-1].get_xdata(),
    pred_ints["lower"],
    pred_ints["upper"],
    alpha=0.2,
    color=ax.get_lines()[-1].get_c(),
    label=f"{1 - alpha:.0%} prediction intervals",  # renders as "95%"
)
ax.legend();
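To show how alpha relates to the width of an interval, the sketch below builds a symmetric interval around a point forecast under an assumed Gaussian forecast error with known standard deviation. This is a generic textbook construction with made-up numbers, not necessarily how ThetaForecaster computes its intervals.

```python
from statistics import NormalDist


def normal_interval(point_forecast, sigma, alpha=0.05):
    """Symmetric (1 - alpha) interval under an assumed Gaussian error."""
    z = NormalDist().inv_cdf(1 - alpha / 2)  # about 1.96 for alpha = 0.05
    return point_forecast - z * sigma, point_forecast + z * sigma


lower, upper = normal_interval(400.0, sigma=20.0, alpha=0.05)
print(round(lower, 1), round(upper, 1))  # 360.8 439.2
```

A smaller alpha gives a wider interval: the price of higher coverage is less precision.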

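Summary

As we have seen, in order to make forecasts, we first need to specify (or build) a model, then fit it to the training data, and finally call predict to generate forecasts for the given forecasting horizon.

Further resources

For more details, see our paper on forecasting with sktime, in which we discuss the forecasting API in more detail and use it to replicate and extend the M4 study.

For a good introduction to forecasting, see Hyndman, Rob J., and George Athanasopoulos. Forecasting: principles and practice. OTexts, 2018 (https://otexts.com/fpp2/).

For comparative benchmarking studies and forecasting competitions, see the M4 competition and the currently running M5 competition.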
