Seasonality, holiday effects and regressors (2023)

Modeling of holidays and special events

If you have holidays or other recurring events that you'd like to model, you must create a dataframe for them. It has two columns (holiday and ds) and a row for each occurrence of the holiday. It must include all occurrences of the holiday, both in the past (as far back as the historical data go) and in the future (as far out as the forecast is being made). If a holiday does not recur in the future period, Prophet will model it and then not include it in the forecast.

You can also include columns lower_window and upper_window which extend the holiday out to [lower_window, upper_window] days around the date. For instance, if you wanted to include Christmas Eve in addition to Christmas, you would include lower_window=-1, upper_window=0. If you wanted to include Black Friday in addition to Thanksgiving, you would include lower_window=0, upper_window=1. You can also include a column prior_scale to set the prior scale separately for each holiday, as described below.
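
The window columns can be read as a range of day offsets around each holiday date. The following is a minimal sketch, using only the Python standard library, that lists the dates a single holiday row would cover. The helper expand_window is hypothetical and written for illustration; it is not part of Prophet.

```python
from datetime import date, timedelta

def expand_window(ds, lower_window=0, upper_window=0):
    """Return every date covered by one holiday row.

    Hypothetical helper for illustration only (not a Prophet function).
    lower_window is expected to be <= 0 and upper_window >= 0.
    """
    return [ds + timedelta(days=offset)
            for offset in range(lower_window, upper_window + 1)]

# Christmas plus Christmas Eve: lower_window=-1, upper_window=0
christmas = expand_window(date(2016, 12, 25), lower_window=-1, upper_window=0)
print(christmas)  # [datetime.date(2016, 12, 24), datetime.date(2016, 12, 25)]

# Thanksgiving plus Black Friday: lower_window=0, upper_window=1
thanksgiving = expand_window(date(2016, 11, 24), lower_window=0, upper_window=1)
print(thanksgiving)  # [datetime.date(2016, 11, 24), datetime.date(2016, 11, 25)]
```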

Here we create a dataframe that contains the dates of all of Peyton Manning's playoff appearances:

# R
library(dplyr)
# One row per playoff appearance; the effect also covers the following day.
playoffs <- tibble(
  holiday = 'playoff',
  ds = as.Date(c('2008-01-13', '2009-01-03', '2010-01-16',
                 '2010-01-24', '2010-02-07', '2011-01-08',
                 '2013-01-12', '2014-01-12', '2014-01-19',
                 '2014-02-02', '2015-01-11', '2016-01-17',
                 '2016-01-24', '2016-02-07')),
  lower_window = 0,
  upper_window = 1
)
superbowls <- tibble(
  holiday = 'superbowl',
  ds = as.Date(c('2010-02-07', '2014-02-02', '2016-02-07')),
  lower_window = 0,
  upper_window = 1
)
holidays <- bind_rows(playoffs, superbowls)

# Python
import pandas as pd

# One row per playoff appearance; the effect also covers the following day.
playoffs = pd.DataFrame({
    'holiday': 'playoff',
    'ds': pd.to_datetime(['2008-01-13', '2009-01-03', '2010-01-16',
                          '2010-01-24', '2010-02-07', '2011-01-08',
                          '2013-01-12', '2014-01-12', '2014-01-19',
                          '2014-02-02', '2015-01-11', '2016-01-17',
                          '2016-01-24', '2016-02-07']),
    'lower_window': 0,
    'upper_window': 1,
})
superbowls = pd.DataFrame({
    'holiday': 'superbowl',
    'ds': pd.to_datetime(['2010-02-07', '2014-02-02', '2016-02-07']),
    'lower_window': 0,
    'upper_window': 1,
})
holidays = pd.concat((playoffs, superbowls))

Above we have included the superbowl days as both playoff games and superbowl games. This means that the superbowl effect is an additional additive bonus on top of the playoff effect.

Once the table is created, holiday effects are included in the forecast by passing them in with the holidays argument. Here we do it with the Peyton Manning data from the Quick Start:

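# R
# df and future are the dataframes built in the Quick Start.
m <- prophet(df, holidays = holidays)
forecast <- predict(m, future)

# Python
from prophet import Prophet

# df and future are the dataframes built in the Quick Start.
m = Prophet(holidays=holidays)
forecast = m.fit(df).predict(future)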

The holiday effect can be seen in the forecast dataframe:

# R
forecast %>%
  select(ds, playoff, superbowl) %>%
  filter(abs(playoff + superbowl) > 0) %>%
  tail(10)

# Python
forecast[(forecast['playoff'] + forecast['superbowl']).abs() > 0][
        ['ds', 'playoff', 'superbowl']][-10:]

             ds   playoff  superbowl
2190 2014-02-02  1.223965   1.201517
2191 2014-02-03  1.901742   1.460471
2532 2015-01-11  1.223965   0.000000
2533 2015-01-12  1.901742   0.000000
2901 2016-01-17  1.223965   0.000000
2902 2016-01-18  1.901742   0.000000
2908 2016-01-24  1.223965   0.000000
2909 2016-01-25  1.901742   0.000000
2922 2016-02-07  1.223965   1.201517
2923 2016-02-08  1.901742   1.460471
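
Because the Super Bowl dates are listed under both holiday names, the two columns add up on those days. A quick arithmetic check with the values printed above for 2014-02-02 (the only assumption is that the two effects combine by simple addition, as the text describes):

```python
# Values copied from the table above for 2014-02-02 (a Super Bowl Sunday).
playoff_effect = 1.223965
superbowl_effect = 1.201517

# The superbowl effect is an additive bonus on top of the playoff effect.
combined = playoff_effect + superbowl_effect
print(round(combined, 6))  # 2.425482
```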

The holiday effects will also show up in the component chart, where we see that there is a spike in the days surrounding playoff appearances, with a particularly large spike for the Superbowl:

# R
prophet_plot_components(m, forecast)

# Python
fig = m.plot_components(forecast)

Seasonality, holiday effects and regressors (1)

Individual holidays can also be plotted using the plot_forecast_component function (imported from prophet.plot in Python), like plot_forecast_component(m, forecast, 'superbowl') to plot just the superbowl holiday component.

Built-in country holidays

You can use a built-in collection of country-specific holidays with the add_country_holidays method (Python) or function (R). You specify the name of the country, and the major holidays for that country are then included in addition to any holidays given via the holidays argument described above:

# R
m <- prophet(holidays = holidays)
m <- add_country_holidays(m, country_name = 'US')
m <- fit.prophet(m, df)

# Python
m = Prophet(holidays=holidays)
m.add_country_holidays(country_name='US')
m.fit(df)

You can see which holidays were included by looking at the train_holiday_names (Python) or train.holiday.names (R) attribute of the model:

# R
m$train.holiday.names

 [1] "playoff"                     "superbowl"
 [3] "New Year's Day"              "Martin Luther King Jr. Day"
 [5] "Washington's Birthday"       "Memorial Day"
 [7] "Independence Day"            "Labor Day"
 [9] "Columbus Day"                "Veterans Day"
[11] "Veterans Day (observed)"     "Thanksgiving"
[13] "Christmas Day"               "Independence Day (observed)"
[15] "Christmas Day (observed)"    "New Year's Day (observed)"

# Python
m.train_holiday_names

0                         playoff
1                       superbowl
2                  New Year's Day
3      Martin Luther King Jr. Day
4           Washington's Birthday
5                    Memorial Day
6                Independence Day
7                       Labor Day
8                    Columbus Day
9                    Veterans Day
10                   Thanksgiving
11                  Christmas Day
12       Christmas Day (observed)
13        Veterans Day (observed)
14    Independence Day (observed)
15      New Year's Day (observed)
dtype: object

The holidays for each country are provided by the holidays package in Python. A list of available countries, and the country name to use, is available on their page: https://github.com/dr-prodigy/python-holidays. In addition to those countries, Prophet includes holidays for these countries: Brazil (BR), Indonesia (ID), India (IN), Malaysia (MY), Vietnam (VN), Thailand (TH), Philippines (PH), Pakistan (PK), Bangladesh (BD), Egypt (EG), China (CN), Russia (RU), Korea (KR), Belarus (BY), and United Arab Emirates (AE).

In Python, most holidays are computed deterministically and so are available for any date range. A warning is raised if dates fall outside the range supported by that country. In R, holiday dates are computed for 1995 through 2044 and stored in the package as data-raw/generated_holidays.csv. If a wider date range is needed, this script can be used to replace that file with a different date range: https://github.com/facebook/prophet/blob/main/python/scripts/generate_holidays_file.py.

As above, the country-level holidays are then displayed in the component chart:

# R
forecast <- predict(m, future)
prophet_plot_components(m, forecast)

# Python
forecast = m.predict(future)
fig = m.plot_components(forecast)

Seasonality, holiday effects and regressors (2)

Fourier order for seasonality

Seasonalities are estimated using a partial Fourier sum. See the paper for complete details, and this figure on Wikipedia for an illustration of how a partial Fourier sum can approximate an arbitrary periodic signal. The number of terms in the partial sum (the order) is a parameter that determines how quickly the seasonality can change. To illustrate this, consider the Peyton Manning data from the Quick Start. The default Fourier order for yearly seasonality is 10, which produces this fit:

# R
m <- prophet(df)
prophet:::plot_yearly(m)

# Python
from prophet.plot import plot_yearly
m = Prophet().fit(df)
a = plot_yearly(m)

Seasonality, holiday effects and regressors (3)

The default values are often appropriate, but they can be increased when the seasonality needs to fit higher-frequency changes, at the cost of a generally less smooth fit. The Fourier order can be specified for each built-in seasonality when instantiating the model; here it is increased to 20:

# R
m <- prophet(df, yearly.seasonality = 20)
prophet:::plot_yearly(m)

# Python
from prophet.plot import plot_yearly
m = Prophet(yearly_seasonality=20).fit(df)
a = plot_yearly(m)

Seasonality, holiday effects and regressors (4)

Increasing the number of Fourier terms allows the seasonality to fit faster-changing cycles, but can also lead to overfitting: N Fourier terms correspond to 2N variables used to model the cycle.
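
To make the 2N count concrete, here is a minimal sketch of the sine and cosine features behind a partial Fourier sum of order N. The function fourier_features is a hypothetical helper written for this illustration, assuming one sine and one cosine term per harmonic; it is not Prophet's internal code.

```python
import math

def fourier_features(t_days, period, order):
    """Sine/cosine features of a partial Fourier sum at time t_days.

    Illustrative sketch: returns 2 * order values, one sin and one cos
    term for each harmonic n = 1..order.
    """
    features = []
    for n in range(1, order + 1):
        angle = 2.0 * math.pi * n * t_days / period
        features.append(math.sin(angle))
        features.append(math.cos(angle))
    return features

# Yearly seasonality at the default order (10) and at the increased order (20).
yearly_default = fourier_features(t_days=100.0, period=365.25, order=10)
yearly_flexible = fourier_features(t_days=100.0, period=365.25, order=20)
print(len(yearly_default), len(yearly_flexible))  # 20 40
```

Each feature gets its own coefficient when the model is fitted, which is why doubling the order doubles the number of variables available to fit (or overfit) the cycle.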

Setting custom seasonalities

By default, Prophet fits weekly and yearly seasonalities if the time series is more than two cycles long. It also fits daily seasonality for a sub-daily time series. You can add other seasonalities (monthly, quarterly, hourly) using the add_seasonality method (Python) or function (R).

The inputs to this function are a name, the period of the seasonality in days, and the Fourier order for the seasonality. For reference, by default Prophet uses a Fourier order of 3 for weekly seasonality and 10 for yearly seasonality. An optional input to add_seasonality is the prior scale for that seasonal component, which is discussed below.
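
Since periods are given in days, sub-daily cycles become fractions of a day. The sketch below writes out the three inputs for the seasonalities mentioned above as plain dictionaries. The Fourier orders are illustrative choices, not Prophet defaults, and the quarterly period assumes an average year of 365.25 days.

```python
# Periods are expressed in days, so sub-daily cycles are fractions of a day.
# The Fourier orders below are illustrative choices, not Prophet defaults.
custom_seasonalities = [
    {"name": "monthly", "period": 30.5, "fourier_order": 5},
    {"name": "quarterly", "period": 365.25 / 4, "fourier_order": 5},
    {"name": "hourly", "period": 1 / 24, "fourier_order": 3},
]

for spec in custom_seasonalities:
    # With a model m, each spec would be passed as m.add_seasonality(**spec).
    n_variables = 2 * spec["fourier_order"]
    print(f'{spec["name"]}: period={spec["period"]:.4f} days, '
          f'{n_variables} Fourier variables')
```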

As an example, here we fit the Peyton Manning data from the Quick Start, but replace the weekly seasonality with monthly seasonality. The monthly seasonality then appears in the components plot:

# R
m <- prophet(weekly.seasonality=FALSE)
m <- add_seasonality(m, name='monthly', period=30.5, fourier.order=5)
m <- fit.prophet(m, df)
forecast <- predict(m, future)
prophet_plot_components(m, forecast)

# Python
m = Prophet(weekly_seasonality=False)
m.add_seasonality(name='monthly', period=30.5, fourier_order=5)
forecast = m.fit(df).predict(future)
fig = m.plot_components(forecast)

Seasonality, holiday effects and regressors (5)

Seasonalities that depend on other factors

In some cases the seasonality may depend on other factors, such as a weekly seasonal pattern that is different during the summer than during the rest of the year, or a daily seasonal pattern that is different on weekends than on weekdays. These types of seasonalities can be modeled using conditional seasonalities.

Consider the Peyton Manning example from the Quick Start. The default weekly seasonality assumes that the weekly pattern is the same throughout the year, but we would expect it to differ between the on-season (when there are games every Sunday) and the off-season. We can use conditional seasonalities to construct separate on-season and off-season weekly seasonalities.

First, let's add a boolean column to the dataframe that indicates whether each date is in season or in the off-season:

# R
is_nfl_season <- function(ds) {
  dates <- as.Date(ds)
  month <- as.numeric(format(dates, '%m'))
  # On-season runs from September through January.
  return(month > 8 | month < 2)
}
df$on_season <- is_nfl_season(df$ds)
df$off_season <- !is_nfl_season(df$ds)

# Python
def is_nfl_season(ds):
    date = pd.to_datetime(ds)
    # On-season runs from September through January.
    return (date.month > 8 or date.month < 2)

df['on_season'] = df['ds'].apply(is_nfl_season)
df['off_season'] = ~df['ds'].apply(is_nfl_season)

Then we disable the built-in weekly seasonality and replace it with two weekly seasonalities that have these columns specified as a condition. This means that each seasonality is only applied to dates where its condition_name column is True. We must also add the columns to the future dataframe for which we are making predictions.

# R
m <- prophet(weekly.seasonality=FALSE)
m <- add_seasonality(m, name='weekly_on_season', period=7, fourier.order=3, condition.name='on_season')
m <- add_seasonality(m, name='weekly_off_season', period=7, fourier.order=3, condition.name='off_season')
m <- fit.prophet(m, df)

future$on_season <- is_nfl_season(future$ds)
future$off_season <- !is_nfl_season(future$ds)
forecast <- predict(m, future)
prophet_plot_components(m, forecast)

# Python
m = Prophet(weekly_seasonality=False)
m.add_seasonality(name='weekly_on_season', period=7, fourier_order=3, condition_name='on_season')
m.add_seasonality(name='weekly_off_season', period=7, fourier_order=3, condition_name='off_season')

future['on_season'] = future['ds'].apply(is_nfl_season)
future['off_season'] = ~future['ds'].apply(is_nfl_season)
forecast = m.fit(df).predict(future)
fig = m.plot_components(forecast)

Seasonality, holiday effects and regressors (6)

Both seasonalities are now shown in the component charts above. We can see that during the on-season when games are played every Sunday, there are big increases on Sunday and Monday that are completely absent during the off-season.
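
In this example the two condition columns are complements of each other, so every date receives exactly one of the two weekly seasonalities. A small standalone sketch of that month rule, using only the standard library and a few sample dates chosen here for illustration:

```python
from datetime import date

def is_nfl_season(ds):
    """True for September through January, matching the month rule above."""
    return ds.month > 8 or ds.month < 2

sample_dates = [date(2015, 1, 11), date(2015, 6, 14), date(2015, 9, 13)]
for ds in sample_dates:
    on_season = is_nfl_season(ds)
    off_season = not on_season
    # Exactly one of the two conditions holds for every date.
    print(ds, "on_season" if on_season else "off_season")
```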

Prior scale for holidays and seasonality

If you find that the holidays are overfitting, you can adjust their prior scale to smooth them using the parameter holidays_prior_scale. By default this parameter is 10, which provides very little regularization. Reducing this parameter dampens holiday effects:

# R
m <- prophet(df, holidays = holidays, holidays.prior.scale = 0.05)
forecast <- predict(m, future)
forecast %>%
  select(ds, playoff, superbowl) %>%
  filter(abs(playoff + superbowl) > 0) %>%
  tail(10)

# Python
m = Prophet(holidays=holidays, holidays_prior_scale=0.05).fit(df)
forecast = m.predict(future)
forecast[(forecast['playoff'] + forecast['superbowl']).abs() > 0][
    ['ds', 'playoff', 'superbowl']][-10:]

             ds   playoff  superbowl
2190 2014-02-02  1.206086   0.964914
2191 2014-02-03  1.852077   0.992634
2532 2015-01-11  1.206086   0.000000
2533 2015-01-12  1.852077   0.000000
2901 2016-01-17  1.206086   0.000000
2902 2016-01-18  1.852077   0.000000
2908 2016-01-24  1.206086   0.000000
2909 2016-01-25  1.852077   0.000000
2922 2016-02-07  1.206086   0.964914
2923 2016-02-08  1.852077   0.992634

The magnitude of the holiday effect has been reduced compared to before, especially for superbowls, which had the fewest observations. There is a parameter seasonality_prior_scale which similarly adjusts the extent to which the seasonality model will fit the data.
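
The reason an effect with fewer observations shrinks more can be seen in a toy calculation. The sketch below is not Prophet's fitting procedure: it assumes a single effect with a zero-mean normal prior whose standard deviation is the prior scale, observed with normal noise, and computes the posterior mean. The function name, noise scale and observation values are all hypothetical.

```python
def shrunk_effect(observations, prior_scale, noise_scale=1.0):
    """Posterior mean of an effect under a Normal(0, prior_scale^2) prior.

    Toy model for intuition only: each observation is the true effect plus
    Normal(0, noise_scale^2) noise. This is not Prophet's fitting procedure.
    """
    n = len(observations)
    data_precision = n / noise_scale ** 2
    prior_precision = 1.0 / prior_scale ** 2
    weighted_sum = sum(observations) / noise_scale ** 2
    return weighted_sum / (data_precision + prior_precision)

many = [1.2] * 14   # an effect observed 14 times, like the playoff dates
few = [1.2] * 3     # the same effect observed 3 times, like the superbowls

for scale in (10.0, 0.5):
    print(scale,
          round(shrunk_effect(many, scale), 3),
          round(shrunk_effect(few, scale), 3))
```

With a large prior scale both estimates stay close to the observed value; with a small one both are pulled toward zero, and the effect with only three observations is pulled further.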

Prior scales can be set separately for individual holidays by including a column prior_scale in the holidays dataframe. Prior scales for individual seasonalities can be passed as an argument to add_seasonality. For instance, the prior scale for just weekly seasonality can be set using:

# R
m <- prophet()
m <- add_seasonality(
  m, name='weekly', period=7, fourier.order=3, prior.scale=0.1)

# Python
m = Prophet()
m.add_seasonality(
    name='weekly', period=7, fourier_order=3, prior_scale=0.1)

Additional regressors

Additional regressors can be added to the linear part of the model using the add_regressor method or function. A column with the regressor value must be present in both the fitting and prediction dataframes. For example, we can add an additional effect on Sundays during the NFL season. On the components plot, this effect shows up in the extra_regressors plot:

# R
nfl_sunday <- function(ds) {
  dates <- as.Date(ds)
  month <- as.numeric(format(dates, '%m'))
  # 1 on Sundays from September through January, otherwise 0.
  as.numeric((weekdays(dates) == "Sunday") & (month > 8 | month < 2))
}
df$nfl_sunday <- nfl_sunday(df$ds)

m <- prophet()
m <- add_regressor(m, 'nfl_sunday')
m <- fit.prophet(m, df)

future$nfl_sunday <- nfl_sunday(future$ds)

forecast <- predict(m, future)
prophet_plot_components(m, forecast)

# Python
def nfl_sunday(ds):
    date = pd.to_datetime(ds)
    # weekday() == 6 is Sunday; on-season is September through January.
    if date.weekday() == 6 and (date.month > 8 or date.month < 2):
        return 1
    else:
        return 0
df['nfl_sunday'] = df['ds'].apply(nfl_sunday)

m = Prophet()
m.add_regressor('nfl_sunday')
m.fit(df)

future['nfl_sunday'] = future['ds'].apply(nfl_sunday)

forecast = m.predict(future)
fig = m.plot_components(forecast)

Seasonality, holiday effects and regressors (7)

NFL Sundays could also have been handled using the "holidays" interface described above, by creating a list of past and future NFL Sundays. The add_regressor function provides a more general interface for defining extra linear regressors, and in particular does not require that the regressor be a binary indicator. Another time series could be used as a regressor, although its future values would have to be known.

This notebook shows an example of using weather factors as extra regressors in a forecast of bicycle usage, and provides a good illustration of how other time series can be included as extra regressors.

The add_regressor function has optional arguments for specifying the prior scale (the holiday prior scale is used by default) and whether or not the regressor is standardized; see the docstring with help(Prophet.add_regressor) in Python and ?add_regressor in R. Note that regressors must be added prior to model fitting. Prophet will also raise an error if the regressor is constant throughout the history, since there is nothing to fit from it.
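
The requirements above (the column exists in both dataframes and is not constant in the history) can be checked before fitting. The following is a minimal sketch of such a check; check_regressor is a hypothetical helper, and plain dictionaries of lists stand in for the fitting and prediction dataframes so the example runs without Prophet or pandas.

```python
def check_regressor(history, future, name):
    """Pre-flight checks for an extra regressor (hypothetical helper).

    history and future are plain dicts of column name -> list of values,
    standing in for the fitting and prediction dataframes.
    """
    problems = []
    if name not in history:
        problems.append(f"{name!r} missing from the fitting data")
    elif len(set(history[name])) < 2:
        problems.append(f"{name!r} is constant in the fitting data")
    if name not in future:
        problems.append(f"{name!r} missing from the prediction data")
    return problems

history = {"y": [8.1, 9.3, 8.4], "nfl_sunday": [0, 1, 0]}
future = {"nfl_sunday": [0, 1, 0, 0]}
print(check_regressor(history, future, "nfl_sunday"))  # []
print(check_regressor(history, {}, "nfl_sunday"))
```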

The extra regressor must be known for both the history and for future dates. It thus must either be something that has known future values (such as nfl_sunday), or something that has separately been forecasted elsewhere. The weather regressors used in the notebook linked above are a good example of an extra regressor that has forecasts that can be used for future values. One can also use as a regressor another time series that has been forecasted with a time series model, such as Prophet. For instance, if r(t) is included as a regressor for y(t), Prophet can be used to forecast r(t) and then that forecast can be plugged in as the future values when forecasting y(t). A note of caution around this approach: this will probably not be useful unless r(t) is somehow easier to forecast than y(t). This is because error in the forecast of r(t) will produce error in the forecast of y(t). One setting where this can be useful is in hierarchical time series, where there is a top-level forecast that has a higher signal-to-noise ratio and is thus easier to forecast. Its forecast can be included in the forecast for each lower-level series.
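
The way an error in the forecast of r(t) carries over into y(t) can be sketched with simple arithmetic. The example assumes an additive regressor whose contribution is its coefficient times its value; the coefficient and the regressor values are made-up numbers for illustration.

```python
def regressor_contribution(beta, r_value):
    """Additive contribution of a regressor with coefficient beta."""
    return beta * r_value

beta = 0.8            # hypothetical fitted coefficient
r_true = 10.0         # value r(t) actually takes in the future
r_forecast = 11.5     # value predicted for r(t) by a separate model

error_in_r = r_forecast - r_true
error_in_y = (regressor_contribution(beta, r_forecast)
              - regressor_contribution(beta, r_true))
print(error_in_r, round(error_in_y, 6))  # 1.5 1.2
```

Under these assumptions the error passed on to y(t) is the coefficient times the error in r(t), on top of whatever error the model for y(t) already has.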

Additional regressors are put in the linear component of the model, so the underlying model is that the time series depends on the extra regressor as either an additive or a multiplicative factor (see the next section for multiplicativity).

Coefficients of additional regressors

To extract the beta coefficients of the extra regressors, use the utility function regressor_coefficients (from prophet.utilities import regressor_coefficients in Python, prophet::regressor_coefficients in R) on the fitted model. The estimated beta coefficient for each regressor roughly represents the increase in the predicted value for a unit increase in the regressor value (note that the coefficients returned are always on the scale of the original data). If mcmc_samples is specified, it also returns a credible interval for each coefficient, which can help identify whether each regressor is "statistically significant".

Article information

Author: Francesca Jacobs Ret

Last Updated: 18/09/2023