In this notebook we present an example of how to use a combination of Bayesian hierarchical models and a non-parametric method, namely Bayesian Additive Regression Trees (BART), to model product life cycles. This approach is motivated by previous work in cohort analysis, see here.
As a case study we use the Google search index (Google Trends) data for iPhones worldwide. We use the data of four different models to predict the development of the latest iPhone. The model presented for this specific example can easily be extended to other products and product life cycle structures.
Prepare Notebook
import arviz as az
import holidays
import matplotlib.pyplot as plt
import numpy as np
import pandas as pd
import preliz as pz
import pymc as pm
import pymc_bart as pmb
import pytensor.tensor as pt
import seaborn as sns
from pymc_bart.split_rules import ContinuousSplitRule, OneHotSplitRule, SubsetSplitRule
az.style.use("arviz-darkgrid")
plt.rcParams["figure.figsize"] = [12, 7]
plt.rcParams["figure.dpi"] = 100
plt.rcParams["figure.facecolor"] = "white"
%load_ext autoreload
%autoreload 2
%config InlineBackend.figure_format = "retina"
seed: int = sum(map(ord, "iphone_trends"))
rng: np.random.Generator = np.random.default_rng(seed=seed)
Read Data
We provide the data downloaded from Google Trends for the following search terms (weekly granularity):
raw_df = pd.read_csv(
"https://raw.githubusercontent.com/juanitorduz/website_projects/master/data/iphone_trends.csv",
parse_dates=["week"]
)
raw_df.tail()
 | week | iphone_11 | iphone_12 | iphone_13 | iphone_14 | iphone_15 |
---|---|---|---|---|---|---|
256 | 2023-10-22 | 10 | 10 | 14 | 16 | 23 |
257 | 2023-10-29 | 10 | 9 | 13 | 15 | 22 |
258 | 2023-11-05 | 11 | 10 | 15 | 17 | 22 |
259 | 2023-11-12 | 11 | 10 | 16 | 17 | 24 |
260 | 2023-11-19 | 12 | 11 | 17 | 17 | 24 |
We see a clear structure in the search development for each model:
- Steep peak at the beginning of the product life cycle (maximum on the release week)
- A slow long term decay with a mild yearly seasonality.
In order to capture the peak at the beginning of the product life cycle, we collect the release dates for these iPhone models:
releases_df = pd.DataFrame(
data=[
{"model": "iphone_11", "release_date": "September 20, 2019"},
{"model": "iphone_12", "release_date": "October 23, 2020"},
{"model": "iphone_13", "release_date": "September 24, 2021"},
{"model": "iphone_14", "release_date": "September 16, 2022"},
{"model": "iphone_15", "release_date": "September 22, 2023"},
]
)
# We use the same week-starting date as the Google Trends data
release_dates_df = releases_df.assign(
release_week=lambda x: (
pd.to_datetime(x["release_date"]).dt.to_period("W-SAT").dt.start_time
- pd.DateOffset(days=7)
)
)
release_dates_df.head()
 | model | release_date | release_week |
---|---|---|---|
0 | iphone_11 | September 20, 2019 | 2019-09-08 |
1 | iphone_12 | October 23, 2020 | 2020-10-11 |
2 | iphone_13 | September 24, 2021 | 2021-09-12 |
3 | iphone_14 | September 16, 2022 | 2022-09-04 |
4 | iphone_15 | September 22, 2023 | 2023-09-10 |
Feature Engineering
To start, we collect two holiday events that clearly show up in the search data: Black Friday (captured via the Thanksgiving week) and Christmas.
holidays_list = ["Christmas Day", "Thanksgiving"]
holidays_df = (
pd.DataFrame.from_dict(
data=holidays.country_holidays(
country="US", years=[2019, 2020, 2021, 2022, 2023, 2024]
),
orient="index",
columns=["holiday"],
)
.query("holiday in @holidays_list")
.reset_index(drop=False)
.rename(columns={"index": "date"})
.assign(
week=lambda x: pd.to_datetime(x["date"]).dt.to_period("W-SAT").dt.start_time
)
)
holidays_df.head(12)
 | date | holiday | week |
---|---|---|---|
0 | 2019-11-28 | Thanksgiving | 2019-11-24 |
1 | 2019-12-25 | Christmas Day | 2019-12-22 |
2 | 2020-11-26 | Thanksgiving | 2020-11-22 |
3 | 2020-12-25 | Christmas Day | 2020-12-20 |
4 | 2021-11-25 | Thanksgiving | 2021-11-21 |
5 | 2021-12-25 | Christmas Day | 2021-12-19 |
6 | 2022-11-24 | Thanksgiving | 2022-11-20 |
7 | 2022-12-25 | Christmas Day | 2022-12-25 |
8 | 2023-11-23 | Thanksgiving | 2023-11-19 |
9 | 2023-12-25 | Christmas Day | 2023-12-24 |
10 | 2024-11-28 | Thanksgiving | 2024-11-24 |
11 | 2024-12-25 | Christmas Day | 2024-12-22 |
Next, motivated by the features used in the previous work on cohort analysis, we create the following features:
- age: Number of weeks between the given week and the most recent week in the data.
- model_age: Number of weeks since the release of the corresponding iPhone model. Note that this feature is negative for weeks before the release.
- is_release: Binary feature indicating whether the week is the release week of the iPhone model.
- month: Month of the year.
data_df = (
raw_df.copy()
.melt(id_vars=["week"], var_name="model", value_name="search")
.merge(right=release_dates_df, on="model")
.drop(columns=["release_date"])
.assign(
age=lambda x: (x["week"].max() - x["week"]).dt.days / 7,
model_age=lambda x: (x["week"] - x["release_week"]).dt.days / 7,
is_release=lambda x: (x["model_age"] == 0).astype(float),
month=lambda x: x["week"].dt.month,
# holidays
is_christmas=lambda x: x["week"]
.isin(holidays_df.query("holiday == 'Christmas Day'")["week"])
.astype(float),
is_thanksgiving=lambda x: x["week"]
.isin(holidays_df.query("holiday == 'Thanksgiving'")["week"])
.astype(float),
)
.query("model_age >= -50")  # Keep at most ~50 weeks (about one year) before each release, as earlier data is mainly noise
.reset_index(drop=True)
)
data_df.head()
 | week | model | search | release_week | age | model_age | is_release | month | is_christmas | is_thanksgiving |
---|---|---|---|---|---|---|---|---|---|---|
0 | 2018-11-25 | iphone_11 | 0 | 2019-09-08 | 260.0 | -41.0 | 0.0 | 11 | 0.0 | 0.0 |
1 | 2018-12-02 | iphone_11 | 0 | 2019-09-08 | 259.0 | -40.0 | 0.0 | 12 | 0.0 | 0.0 |
2 | 2018-12-09 | iphone_11 | 0 | 2019-09-08 | 258.0 | -39.0 | 0.0 | 12 | 0.0 | 0.0 |
3 | 2018-12-16 | iphone_11 | 0 | 2019-09-08 | 257.0 | -38.0 | 0.0 | 12 | 0.0 | 0.0 |
4 | 2018-12-23 | iphone_11 | 0 | 2019-09-08 | 256.0 | -37.0 | 0.0 | 12 | 0.0 | 0.0 |
EDA
We plot the data for each iPhone model:
fig, ax = plt.subplots(figsize=(15, 7))
sns.lineplot(
data=data_df,
x="week",
y="search",
hue="model",
ax=ax,
)
for i, model in enumerate(release_dates_df["model"]):
release_week = release_dates_df.query(f"model == '{model}'")["release_week"].item()
ax.axvline(
release_week,
color=f"C{i}",
linestyle="--",
alpha=0.5,
label=f"release week {model}",
)
ax.legend(bbox_to_anchor=(1.05, 1), loc="upper left")
ax.set_title(
label="Google Trends for iPhone Models (Worldwide)", fontsize=18, fontweight="bold"
)
We clearly see that the peak maximum coincides with the release date.
Note that all of the models have a similar structure; however, the peak height and decay shape differ from model to model. This motivates the use of a hierarchical model.
On the other hand, if we plot the search index as a heatmap, we immediately see the similarity to the cohort analysis problem, where we use age and cohort age as key features to model revenue and retention:
fig, ax = plt.subplots()
(
data_df.assign(week=lambda x: x["week"].dt.date)[["model", "week", "search"]]
.pivot_table(index="model", columns="week", values="search")
.pipe(
(sns.heatmap, "data"),
cmap="viridis_r",
ax=ax,
)
)
ax.set_title(
label="Google Trends for iPhone Models (Worldwide)", fontsize=18, fontweight="bold"
)
This motivates the use of BART as a non-parametric component to model the seasonal and long-term behavior of the search index.
Remark: This is of course not the only possibility. We could for example use (hierarchical) Gaussian processes.
Prepare Data
We now prepare the data for the model.
# model for the holdout data
test_model = "iphone_15"
train_df = data_df.query(f"model != '{test_model}'")
test_df = data_df.query(f"model == '{test_model}'")
train_obs = train_df.index.to_numpy()
train_iphone_model_idx, train_iphone_model = train_df["model"].factorize(sort=True)
train_month_idx, train_month = train_df["month"].factorize(sort=True)
train_age = train_df["age"].to_numpy()
train_model_age = train_df["model_age"].to_numpy()
train_is_release = train_df["is_release"].to_numpy()
features = ["age", "model_age", "month", "is_christmas", "is_thanksgiving"]
x_train = train_df[features]
train_search = train_df["search"].to_numpy()
Model Specification
Main Idea
The main idea is to model the search index through a negative binomial likelihood where the mean is the sum of a release component and a long-term decay component. For the release phase we use a non-symmetric Gaussian bump function, while for the long-term decay we use a BART model.
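To build some intuition for the release component, here is a minimal NumPy sketch of such a non-symmetric bump. The parameter values below are purely illustrative (in the model they are learned per iPhone model), and the variable names are only used for this sketch:

# Illustrative (hypothetical) parameter values for the asymmetric release bump.
model_age_grid = np.arange(-20, 60)
p_minus, p_plus = 0.3, 0.25  # exponents controlling the sharpness before/after release
ell_minus, ell_plus = 0.5, 1.5  # length scales: fast rise before release, slower decay after
beta_release = 80.0  # amplitude (peak height)
bump = beta_release * (
    (model_age_grid < 0) * np.exp(-np.abs(model_age_grid) ** p_minus / ell_minus)
    + (model_age_grid >= 0) * np.exp(-np.abs(model_age_grid) ** p_plus / ell_plus)
)
fig, ax = plt.subplots()
ax.plot(model_age_grid, bump)
ax.set(
    title="Illustrative asymmetric release bump",
    xlabel="model age (weeks)",
    ylabel="bump",
)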
Remark: We use the recommended parametrization for the negative binomial distribution, see here.
Remark: We weight the \(\alpha\) parameter in the likelihood to place more importance on the search index values closer to the release date. The motivation is that we are often interested in this first phase much more than the relatively stable decay (see here).
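Concretely, with the \((\mu, \alpha)\) parametrization the variance of the negative binomial is \(\mu + \mu^2 / \alpha\), so a smaller \(\alpha\) means more over-dispersion. Dividing \(\alpha\) by \(|\text{model_age}| + 1\) therefore widens the likelihood for weeks far away from the release, effectively down-weighting them during inference.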
Mathematical Formulation
Here are the main ingredients of the model:
\[
\begin{align*}
\beta^{[m]}_{\text{release}} \sim & \: \text{HalfNormal}(\sigma_{\beta}) \\
\ell^{[m]}_{\pm} \sim & \: \text{HalfNormal}(\sigma_{\ell}) \\
\mu_{\text{release}} = & \: \beta^{[m]}_{\text{release}} \left( I\{\text{model_age} < 0\} \exp(-|\text{model_age}|^{p_{-}} / \ell^{[m]}_{-}) \right. \\
& \left. + \: I\{\text{model_age} \geq 0\} \exp(-|\text{model_age}|^{p_{+}} / \ell^{[m]}_{+}) \right) \\
\mu_{\text{decay}} = & \: \text{BART}(\text{age}, \text{model_age}, \text{month}, \text{holidays}) \\
\mu = & \: \mu_{\text{release}} + \mu_{\text{decay}} \\
\alpha = & \: \frac{1}{\sqrt{\tilde{\alpha}}} \\
\alpha_{\text{weighted}} = & \: \frac{\alpha}{|\text{model_age}| + 1} \\
\text{search} \sim & \: \text{NegativeBinomial}(\mu, \alpha_{\text{weighted}})
\end{align*}
\]
The hierarchical structure is encoded in \(\beta^{[m]}_{\text{release}}\) and \(\ell^{[m]}_{\pm}\), which share common hyper-priors across the iPhone models.
I believe this mathematical formulation is not very intuitive, so let’s go to the PyMC code:
# small number for numerical stability
eps = np.finfo(float).eps
coords = {"month": train_month, "feature": features}
with pm.Model(coords=coords) as model:
# --- Data Containers ---
model.add_coord(name="iphone_model", values=train_iphone_model, mutable=True)
model.add_coord(name="obs", values=train_obs, mutable=True)
iphone_model_idx_data = pm.MutableData(
name="iphone_model_idx_data", value=train_iphone_model_idx, dims="obs"
)
model_age_data = pm.MutableData(
name="model_age_data", value=train_model_age, dims="obs"
)
x_data = pm.MutableData(name="x_data", value=x_train, dims=("obs", "feature"))
search_data = pm.MutableData(name="search_data", value=train_search, dims="obs")
# --- Priors ---
power_minus = pm.Beta(name="power_minus", alpha=5, beta=15)
power_plus = pm.Beta(name="power_plus", alpha=5, beta=15)
sigma_ell_is_release_minus = pm.Exponential(
name="sigma_ell_is_release_minus", lam=1
)
sigma_ell_is_release_plus = pm.Exponential(
name="sigma_ell_is_release_plus", lam=1 / 4
)
sigma_is_release = pm.Exponential(name="sigma_is_release", lam=1 / 100)
ell_is_release_minus = pm.HalfNormal(
name="ell_is_release_minus",
sigma=sigma_ell_is_release_minus,
dims="iphone_model",
)
ell_is_release_plus = pm.HalfNormal(
name="ell_is_release_plus", sigma=sigma_ell_is_release_plus, dims="iphone_model"
)
beta_is_release = pm.HalfNormal(
name="beta_is_release", sigma=sigma_is_release, dims="iphone_model"
)
inv_alpha_lam = pm.HalfNormal(name="inv_alpha_lam", sigma=500)
inv_alpha = pm.Exponential(name="inv_alpha", lam=inv_alpha_lam, dims="iphone_model")
bart_mu_log = pmb.BART(
name="bart_mu_log",
X=x_data,
Y=np.log1p(train_search),
m=100,
response="mix",
split_rules=[
ContinuousSplitRule(),
ContinuousSplitRule(),
SubsetSplitRule(),
OneHotSplitRule(),
OneHotSplitRule(),
],
dims="obs",
)
# --- Parametrization ---
is_release_bump = pm.Deterministic(
name="is_release_bump",
var=beta_is_release[iphone_model_idx_data]
* (
(
(model_age_data < 0)
* pt.exp(
-pt.pow(pt.abs(model_age_data), power_minus)
/ ell_is_release_minus[iphone_model_idx_data]
)
)
+ (
(model_age_data >= 0)
* pt.exp(
-pt.pow(pt.abs(model_age_data), power_plus)
/ ell_is_release_plus[iphone_model_idx_data]
)
)
),
dims="obs",
)
mu = pm.Deterministic(
name="mu",
var=pt.exp(bart_mu_log) + is_release_bump,
dims="obs",
)
alpha = pm.Deterministic(
name="alpha",
var=1 / (pt.sqrt(inv_alpha) + eps),
dims="iphone_model",
)
alpha_weighted = pm.Deterministic(
name="alpha_weighted",
var=alpha[iphone_model_idx_data] / (pt.abs(model_age_data) + 1),
dims="obs",
)
# --- Likelihood ---
_ = pm.NegativeBinomial(
name="likelihood",
mu=mu,
alpha=alpha_weighted,
observed=search_data,
dims="obs",
)
pm.model_to_graphviz(model=model)
We can take a look at some of the prior distributions for the Gaussian bump parameters:
fig, ax = plt.subplots(
nrows=2, ncols=1, figsize=(9, 7), sharex=False, sharey=False, layout="constrained"
)
pz.Beta(alpha=5, beta=15).plot_pdf(ax=ax[0])
ax[0].set_title(r"$p_{\pm}$")
pz.Exponential(lam=1 / 4).plot_pdf(ax=ax[1])
ax[1].set_title(r"$\sigma_{\ell}$")
Prior Predictive
Let’s start with the prior predictive distribution to check the feasibility of the priors and the model specification:
with model:
prior_predictive = pm.sample_prior_predictive(samples=1_000, random_seed=rng)
fig, ax = plt.subplots()
az.plot_ppc(data=prior_predictive, group="prior", kind="kde", ax=ax)
ax.set_title(label="Prior Predictive", fontsize=18, fontweight="bold")
ax.set(xscale="log")
Overall the prior predictive looks reasonable.
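As an additional (optional) sanity check, we can compare some quantiles of the prior predictive samples against the observed range of the search index; this is just a quick sketch on top of the plot above:

# Optional sanity check: prior predictive quantiles vs. observed search index range.
prior_pred_likelihood = prior_predictive["prior_predictive"]["likelihood"]
print(np.quantile(prior_pred_likelihood, q=[0.05, 0.5, 0.95]))
print(train_search.min(), train_search.max())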
Model Fitting
The model takes around \(10\) minutes to fit (MacBook Pro, Intel Core i7).
with model:
idata = pm.sample(
tune=1_000,
target_accept=0.90,
draws=3_000,
chains=4,
random_seed=rng,
)
posterior_predictive = pm.sample_posterior_predictive(trace=idata, random_seed=rng)
Diagnostics
We look into the model diagnostics. First we check divergences:
idata["sample_stats"]["diverging"].sum().item()
0
No divergences! Let’s look into additional diagnostics:
var_names = [
"power_minus",
"power_plus",
"sigma_is_release",
"sigma_ell_is_release_minus",
"sigma_ell_is_release_plus",
"beta_is_release",
"ell_is_release_minus",
"ell_is_release_plus",
"inv_alpha_lam",
"inv_alpha",
"alpha",
]
az.summary(data=idata, var_names=var_names, round_to=3)
 | mean | sd | hdi_3% | hdi_97% | mcse_mean | mcse_sd | ess_bulk | ess_tail | r_hat |
---|---|---|---|---|---|---|---|---|---|
power_minus | 0.298 | 0.028 | 0.246 | 0.352 | 0.001 | 0.000 | 2391.351 | 4144.913 | 1.002 |
power_plus | 0.252 | 0.020 | 0.215 | 0.288 | 0.002 | 0.001 | 86.111 | 1099.446 | 1.042 |
sigma_is_release | 102.876 | 42.193 | 46.924 | 180.339 | 0.490 | 0.368 | 10416.997 | 6657.539 | 1.001 |
sigma_ell_is_release_minus | 0.676 | 0.304 | 0.267 | 1.206 | 0.004 | 0.003 | 8418.883 | 6663.825 | 1.000 |
sigma_ell_is_release_plus | 1.975 | 0.920 | 0.740 | 3.520 | 0.011 | 0.009 | 9682.738 | 7021.077 | 1.001 |
beta_is_release[iphone_11] | 87.811 | 7.197 | 74.774 | 101.641 | 0.109 | 0.077 | 4319.084 | 6775.800 | 1.002 |
beta_is_release[iphone_12] | 104.559 | 8.481 | 89.382 | 120.857 | 0.109 | 0.077 | 6011.804 | 7460.644 | 1.000 |
beta_is_release[iphone_13] | 71.477 | 6.613 | 58.896 | 83.784 | 0.097 | 0.068 | 4645.736 | 6530.297 | 1.003 |
beta_is_release[iphone_14] | 77.189 | 7.859 | 62.536 | 91.975 | 0.110 | 0.078 | 5044.312 | 7110.363 | 1.000 |
ell_is_release_minus[iphone_11] | 0.306 | 0.092 | 0.090 | 0.443 | 0.003 | 0.002 | 887.354 | 876.292 | 1.005 |
ell_is_release_minus[iphone_12] | 0.637 | 0.056 | 0.536 | 0.743 | 0.001 | 0.001 | 2138.258 | 3615.357 | 1.002 |
ell_is_release_minus[iphone_13] | 0.508 | 0.054 | 0.405 | 0.610 | 0.002 | 0.001 | 1283.886 | 2246.039 | 1.010 |
ell_is_release_minus[iphone_14] | 0.596 | 0.061 | 0.483 | 0.711 | 0.002 | 0.001 | 1061.666 | 2279.684 | 1.015 |
ell_is_release_plus[iphone_11] | 1.621 | 0.165 | 1.323 | 1.932 | 0.004 | 0.003 | 1851.762 | 3902.532 | 1.005 |
ell_is_release_plus[iphone_12] | 1.240 | 0.108 | 1.047 | 1.450 | 0.002 | 0.002 | 2247.391 | 4202.864 | 1.003 |
ell_is_release_plus[iphone_13] | 1.653 | 0.164 | 1.358 | 1.965 | 0.003 | 0.002 | 2443.428 | 4825.193 | 1.003 |
ell_is_release_plus[iphone_14] | 1.434 | 0.151 | 1.148 | 1.703 | 0.003 | 0.002 | 2339.363 | 4509.314 | 1.002 |
inv_alpha_lam | 1063.267 | 343.859 | 455.498 | 1731.624 | 2.941 | 2.080 | 12516.787 | 5576.722 | 1.001 |
inv_alpha[iphone_11] | 0.000 | 0.000 | 0.000 | 0.000 | 0.000 | 0.000 | 11882.618 | 6562.296 | 1.000 |
inv_alpha[iphone_12] | 0.000 | 0.000 | 0.000 | 0.000 | 0.000 | 0.000 | 9734.909 | 5760.883 | 1.000 |
inv_alpha[iphone_13] | 0.000 | 0.000 | 0.000 | 0.000 | 0.000 | 0.000 | 11053.589 | 5939.772 | 1.000 |
inv_alpha[iphone_14] | 0.000 | 0.000 | 0.000 | 0.000 | 0.000 | 0.000 | 11624.208 | 6221.281 | 1.000 |
alpha[iphone_11] | 145142.927 | 323221.101 | 14576.774 | 364749.684 | 4811.220 | 3442.385 | 11882.618 | 6562.296 | 1.000 |
alpha[iphone_12] | 74385.878 | 189051.386 | 5906.571 | 193317.310 | 2613.356 | 1848.025 | 9734.909 | 5760.883 | 1.000 |
alpha[iphone_13] | 44040.603 | 74711.095 | 4282.518 | 116911.536 | 995.960 | 704.287 | 11053.589 | 5939.772 | 1.000 |
alpha[iphone_14] | 15155.624 | 24059.611 | 1308.656 | 40757.582 | 369.597 | 269.143 | 11624.208 | 6221.281 | 1.000 |
Overall, it looks fine; only power_plus shows a relatively low bulk ESS and an \(\hat{R}\) slightly above \(1.01\), which is worth keeping an eye on.
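As a quick programmatic check (a sketch on top of the summary table above), we can flag parameters with a high \(\hat{R}\) or a low bulk ESS:

# Flag parameters with potentially problematic r_hat or bulk ESS values.
summary_df = az.summary(data=idata, var_names=var_names, round_to=3)
print(summary_df.query("r_hat > 1.01 or ess_bulk < 400")[["r_hat", "ess_bulk"]])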
axes = az.plot_trace(
data=idata,
var_names=var_names,
compact=True,
backend_kwargs={"figsize": (12, 15), "layout": "constrained"},
)
plt.gcf().suptitle("Model Trace", fontsize=18, fontweight="bold")
Posterior Predictive
We now look into the posterior predictive distribution:
az.plot_ppc(
data=posterior_predictive,
num_pp_samples=1_000,
observed_rug=True,
random_seed=seed,
)
It looks like the model has captured most of the variance in the data.
We can now look at each model separately:
fig, axes = plt.subplots(
nrows=train_iphone_model.size,
ncols=1,
sharex=True,
sharey=True,
figsize=(12, 15),
layout="constrained",
)
for i, iphone_model in enumerate(train_iphone_model):
ax = axes[i]
condition = train_df["model"] == iphone_model
temp_df = train_df[condition]
temp_likelihood = posterior_predictive["posterior_predictive"]["likelihood"][
:, :, condition.to_numpy()
]
sns.lineplot(
data=temp_df,
x="model_age",
y="search",
color="black",
label="Observed",
ax=ax,
)
sns.lineplot(
x=temp_df["model_age"],
y=temp_likelihood.mean(dim=("chain", "draw")),
color=f"C{i}",
label="Posterior Predictive Mean",
ax=ax,
)
az.plot_hdi(
x=temp_df["model_age"],
y=temp_likelihood,
hdi_prob=0.94,
color=f"C{i}",
smooth=False,
fill_kwargs={"alpha": 0.3, "label": r"$94\%$ HDI"},
ax=ax,
)
az.plot_hdi(
x=temp_df["model_age"],
y=temp_likelihood,
hdi_prob=0.50,
color=f"C{i}",
smooth=False,
fill_kwargs={"alpha": 0.6, "label": r"$50\%$ HDI"},
ax=ax,
)
ax.legend(loc="upper right")
ax.set(title=iphone_model, xlabel="Model Age (weeks)", ylabel="Search")
We indeed see that the model has captured the release peak, the decay and the yearly seasonality 🙌!
Life Cycle Components
We now split the posterior predictive mean into the release peak and decay components:
fig, ax = plt.subplots(figsize=(15, 7))
sns.lineplot(
data=train_df,
x="week",
y="search",
hue="model",
ax=ax,
)
for i, iphone_model in enumerate(
release_dates_df.query(f"model != '{test_model}'")["model"]
):
condition = (train_df["model"] == iphone_model).to_numpy()
release_week = release_dates_df.query(f"model == '{iphone_model}'")[
"release_week"
].item()
ax.axvline(
release_week,
color=f"C{i}",
linestyle="--",
label=f"release week {iphone_model}",
)
sns.lineplot(
x=train_df["week"][condition],
y=idata["posterior"]["is_release_bump"].mean(dim=("chain", "draw"))[condition],
color=f"C{i}",
alpha=0.5,
label="Release Bump: (mean)",
ax=ax,
)
sns.lineplot(
x=train_df["week"][condition],
y=np.exp(
idata["posterior"]["bart_mu_log"].mean(dim=("chain", "draw"))[condition]
),
color=f"C{i}",
alpha=0.5,
linestyle="-.",
label="Decay component: (mean)",
ax=ax,
)
ax.legend(loc="upper center", bbox_to_anchor=(0.5, -0.12), ncol=3)
ax.set_title(
label="Google Trends for iPhone Models (Worldwide)", fontsize=18, fontweight="bold"
)
Most of the variance is explained by the release Gaussian bump components. The decay component acts as a small correction on top of the initial peak decay.
We can further decompose the decay component by looking at the partial dependence plots:
axes = pmb.plot_pdp(
bartrv=bart_mu_log,
X=x_train,
Y=train_search,
func=np.exp,
xs_interval="insample",
samples=1_000,
grid="long",
color="C0",
color_mean="C0",
var_discrete=[2, 3, 4],
figsize=(10, 12),
random_seed=seed,
)
plt.gcf().suptitle(
"Partial Dependency Plots (PDP)",
fontsize=18,
fontweight="bold",
y=1.05,
)
Here are some remarks from these partial dependence plots:
- Overall, the contribution of the decay component is small (as expected from the plot above).
- The age contribution decays monotonically.
- The variable model_age presents a bump around week \(50\), which seems to be related to the release of the next iPhone model. We could add this as a feature to the model to capture this behavior (see the sketch after this list).
- We see a mild seasonality contribution from the month variable, attaining a maximum around September, when the new iPhone models are usually released.
- We also see a mild contribution from the holiday variables. A better encoding would be a multiplicative interaction with the model age (also sketched below).
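As a sketch of the two extensions mentioned above (hypothetical features, not used in this notebook), one could add the number of weeks until the next model's release and encode the holidays as a multiplicative interaction with a decaying function of the model age:

# Hypothetical feature-engineering sketch (not part of the model above).
next_release_df = release_dates_df.sort_values("release_week").assign(
    next_release_week=lambda x: x["release_week"].shift(-1)
)
extended_df = data_df.merge(
    right=next_release_df[["model", "next_release_week"]], on="model", how="left"
).assign(
    # Weeks until the next model's release (NaN for the latest model).
    weeks_to_next_release=lambda x: (x["next_release_week"] - x["week"]).dt.days / 7,
    # Holiday effects that fade as the model gets older.
    christmas_x_model_age=lambda x: x["is_christmas"] / (np.abs(x["model_age"]) + 1),
    thanksgiving_x_model_age=lambda x: x["is_thanksgiving"] / (np.abs(x["model_age"]) + 1),
)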
Out-of-Sample Predictions
We now generate out-of-sample predictions for the unseen iPhone model:
test_obs = test_df.index.to_numpy()
test_iphone_model_idx, test_iphone_model = test_df["model"].factorize(sort=True)
test_month_idx, test_month = test_df["month"].factorize(sort=True)
test_age = test_df["age"].to_numpy()
test_model_age = test_df["model_age"].to_numpy()
test_is_release = test_df["is_release"].to_numpy()
x_test = test_df[features]
test_search = test_df["search"].to_numpy()
with model:
pm.set_data(
new_data={
"iphone_model_idx_data": test_iphone_model_idx,
"model_age_data": test_model_age,
"x_data": x_test,
"search_data": np.ones_like(test_search), # Dummy data to make coords work!
# We are not using this at prediction time!
},
coords={"iphone_model": test_iphone_model, "obs": test_obs},
)
idata.extend(
pm.sample_posterior_predictive(
trace=idata,
var_names=["is_release_bump", "bart_mu_log", "likelihood"],
idata_kwargs={
"coords": {"iphone_model": test_iphone_model, "obs": test_obs}
},
random_seed=rng,
)
)
Finally, we plot the posterior predictive distribution for the unseen iPhone model:
fig, ax = plt.subplots()
test_likelihood = idata["posterior_predictive"]["likelihood"]
i = 4
sns.lineplot(
data=test_df,
x="model_age",
y="search",
color="black",
label="Observed",
ax=ax,
)
sns.lineplot(
x=test_df["model_age"],
y=test_likelihood.mean(dim=("chain", "draw")),
color=f"C{i}",
label="Posterior Predictive Mean",
ax=ax,
)
az.plot_hdi(
x=test_df["model_age"],
y=test_likelihood,
hdi_prob=0.94,
color=f"C{i}",
smooth=False,
fill_kwargs={"alpha": 0.3, "label": r"$94\%$ HDI"},
ax=ax,
)
az.plot_hdi(
x=test_df["model_age"],
y=test_likelihood,
hdi_prob=0.50,
color=f"C{i}",
smooth=False,
fill_kwargs={"alpha": 0.6, "label": r"$50\%$ HDI"},
ax=ax,
)
sns.lineplot(
x=test_df["model_age"],
y=idata["posterior_predictive"]["is_release_bump"].mean(dim=("chain", "draw")),
color="C7",
linewidth=2,
alpha=0.7,
label="Release Bump: Posterior Predictive Mean",
ax=ax,
)
sns.lineplot(
x=test_df["model_age"],
y=np.exp(idata["posterior_predictive"]["bart_mu_log"].mean(dim=("chain", "draw"))),
color="C7",
linewidth=2,
linestyle="-.",
alpha=0.7,
label="Decay component: Posterior Predictive Mean",
ax=ax,
)
ax.legend(loc="upper left")
ax.set(title=test_model, xlabel="Model Age (weeks)", ylabel="Search")
We see the model has captured the release peak and the decay within the \(94\%\) HDI. In particular, the predicted peak maximum is very close to the observed one.
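As a quick numerical check (a sketch using the objects defined above), we can compare the observed peak with the peak of the posterior predictive mean:

# Compare the observed peak with the posterior predictive mean peak.
pp_mean = test_likelihood.mean(dim=("chain", "draw"))
print(f"Observed peak: {test_df['search'].max()}")
print(f"Posterior predictive mean peak: {pp_mean.max().item():.1f}")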
Final Remarks
This example shows how to build a custom model for a family of products with similar life cycle structures. The model is flexible enough to capture the release peak and the long-term decay, and it can easily be extended to other products and life cycle structures; here we just illustrated some of the possibilities.