Measuring Upper-Funnel Impact with PyMC‑Marketing#

This notebook is a practical, end-to-end walkthrough for quantifying the impact of upper-funnel channels on a downstream business outcome using Bayesian Marketing Mix Modeling (MMM). We will combine a causal DAG, a transparent data-generating process, and a PyMC-Marketing implementation that respects time dynamics (adstock) and nonlinear response (saturation). The emphasis is on causal reasoning first, modeling second.

What you’ll build (at a glance)#

A PyMC-Marketing MMM with Geometric Adstock and Michaelis–Menten saturation for the outcome.
A mediator model that links upper-funnel activity to downstream exposure.
A simulation-based effect estimator that converts “what-if” changes in the upper funnel into outcome deltas, with uncertainty.

Why this matters#

Upper-funnel signals rarely act instantaneously or directly to our target variable. Treating them as simple features in a single regression collapses mediation, ignores carryover, and obscures the operating point where marginal returns live. A causal-first MMM clarifies where effects originate.

Who this is for#

Analysts and data scientists who need a defensible way to attribute upper-funnel investments to outcomes, and who are comfortable with Bayesian inference, time-series structure, and basic causal diagrams.

By the end, you’ll have a repeatable template for answering the “eternal question” of upper-funnel measurement—not by asking MMM to do everything, but by aligning MMM with your causal understanding of the marketing system.

What this notebook covers#

Framing the problem: Why upper-funnel measurement is hard and how naive regressions misattribute lift.
Causal scaffold: A minimal DAG that separates drivers, mediators, and outcomes, clarifying what the total effect means.
Dynamic response modeling: How adstock (carryover) and saturation (diminishing returns) shape observable performance.
Two-block estimation strategy:
1. A mediator block to translate upper-funnel shifts into lower-funnel impressions.
2. An outcome block to map those impressions into business impact.
Counterfactual thinking: Turning interventions on upper-funnel channels into predicted changes in the outcome (g-computation).
Model checking: Posterior predictive checks and sanity tests that prevent self-deception.
Reporting effects: Impulse responses, cumulative lift, and state-dependent elasticities that decision-makers can use.

Business Challenge: Untangling Upper → Mid → Lower Funnel Effects#

A consumer brand invests heavily in awareness (video, influencers, PR). Leadership sees short-term sales plateauing and asks: “Are our upper-funnel dollars doing anything?” Standard dashboards show weak contemporaneous relations, and channel teams argue that “it all pays back later.” We need a defensible way to quantify how upper-funnel exposure propagates through mid-funnel engagement to lower-funnel conversions—without relying on feelings or opinions only.

We start by sketching the world we believe we operate in: a small causal story that will guide every modeling decision that follows.

import graphviz

# Create the causal DAG
causal_dag = graphviz.Digraph("causal_dag")
causal_dag.attr(rankdir="LR")
causal_dag.attr("node", fontsize="12")

# Observed nodes
causal_dag.node("x1", "impressions_x1")
causal_dag.node("x2", "impressions_x2")
causal_dag.node("x3", "impressions_x3")
causal_dag.node("x4", "impressions_x4")
causal_dag.node("holidays", "holidays", shape="ellipse", style="dashed")
causal_dag.node("exogenous", "exogenous", shape="ellipse", style="dashed")
causal_dag.node("y", "new users")

# Exogenous vectors (unobserved)
causal_dag.node("u1", shape="ellipse", style="dashed")
causal_dag.node("u2", shape="ellipse", style="dashed")
causal_dag.node("u3", shape="ellipse", style="dashed")
causal_dag.node("u4", shape="ellipse", style="dashed")

# Exogenous parents
causal_dag.edge("u1", "x1", style="dashed")
causal_dag.edge("u2", "x2", style="dashed")
causal_dag.edge("u3", "x3", style="dashed")
causal_dag.edge("u4", "x4", style="dashed")

# Structural edges with coefficient labels
causal_dag.edge("x1", "x2")
causal_dag.edge("x1", "x3")
causal_dag.edge("x2", "x4")
causal_dag.edge("x3", "x4")

# add x4 to y, and holidays to y, and exogenous to y
causal_dag.edge("x4", "y")
causal_dag.edge("holidays", "y", style="dashed")
causal_dag.edge("exogenous", "y", style="dashed")

# holiday to x1
causal_dag.edge("holidays", "x1", style="dashed")

causal_dag

../../_images/b869c59a4c35c71e97ed695216a28ccbfcd8a753facc9d217b19e55c4a6ec874.svg

The Ecosystem based on the Causal DAG#

The previous DAG shows three types of effects on it.

\(X1\) Upper funnel (Awareness impressions):
Broad reach media (e.g., online video, display, influencer bursts). Drives latent consideration and brand memory, not immediate purchases.
\(X2\) & \(X3\) Mid funnel (Consideration touchpoints):
- \(X2\): Social media impressions / site landings (people start looking).
- \(X3\): Remarketing impressions (people re-engage after a first touch).
  These are mediators—they translate awareness into actionable intent.
\(X4\) Lower funnel (High-intent exposures):
Brand search. These are the proximal drivers of the commercial outcome.
U1, U4 Exogenous shocks:
Competitor moves, Budgets allocated to teams, or market dynamics changes can perturb each node separately.

Traditional MMMs often toss all channels into one regression and look for a global fit. For upper funnel channels, that approach can:

Under-credit X1 (because its effect is indirect and delayed)
Overfit to X4 (because it is nearest to the outcome)

The Business Questions#

Q1. How much does a marginal change in awareness (X1) ultimately shift the outcome, via the funnel?
Q2. Where are we on the response curve (are we close to saturation)?
Q3. What is the expected cumulative lift from a planned upper-funnel flight?

Imports#

We’ll work with PyMC-Marketing for MMM components, ArviZ for diagnostics, and Pytensor to keep our simulator honest.

import warnings

import arviz as az
import matplotlib.pyplot as plt
import numpy as np
import pandas as pd
import preliz as pz
import pytensor.tensor as pt
import xarray as xr
from pymc_extras.prior import Censored, Prior

from pymc_marketing.mmm import (
    GeometricAdstock,
    MichaelisMentenSaturation,
    NoAdstock,
    NoSaturation,
)
from pymc_marketing.mmm.multidimensional import MMM

OMP: Info #276: omp_set_nested routine deprecated, please use omp_set_max_active_levels instead.
/Users/carlostrujillo/Documents/GitHub/pymc-marketing/pymc_marketing/mmm/multidimensional.py:216: FutureWarning: This functionality is experimental and subject to change. If you encounter any issues or have suggestions, please raise them at: https://github.com/pymc-labs/pymc-marketing/issues/new
  warnings.warn(warning_msg, FutureWarning, stacklevel=1)

Notebook settings#

To keep figures consistent and runs reproducible, we lock a seed and a plotting style.

SEED = 142
n_observations = 1050

warnings.filterwarnings("ignore")

# Set the style
az.style.use("arviz-darkgrid")
plt.rcParams["figure.figsize"] = [8, 4]
plt.rcParams["figure.dpi"] = 100
plt.rcParams["axes.labelsize"] = 6
plt.rcParams["xtick.labelsize"] = 6
plt.rcParams["ytick.labelsize"] = 6

%config InlineBackend.figure_format = "retina"

Data generation process#

Based on the previous DAG, we’ll start to simulate data using pytensor, during the simulation we’ll need to state how the precise relationships (arrows between nodes) are represented and as consequence understand by the model.

Here, we’ll assume a fully additive model, and will understand the interactions between channels as additive models as well, by consequence the structural causal equation to build will be:

Data generation process#

Based on the previous DAG, we’ll simulate data in pytensor. We make the relationships additive at each node and allow nonlinear response only in the final mapping from lower-funnel exposure to outcome. Time is indexed by \(t=1,\dots,T\).

Structural causal equations#

Let \(X_1\) (upper funnel), \(X_2,X_3\) (mid funnel), \(X_4\) (lower funnel), and \(Y\) (business outcome). Exogenous shocks \(U_j\) enter additively.

\[\begin{split} \begin{aligned} X_{1,t} &= \mu_{1,t} + \beta_{0}\ E_t \\ X_{2,t} &= \mu_{2,t} + \beta_{12}\,X_{1,t} \\ X_{3,t} &= \mu_{3,t} + \beta_{13}\,X_{1,t} \\ X_{4,t} &= \mu_{4,t} + \beta_{24}\,X_{2,t} + \beta_{34}\,X_{3,t} \end{aligned} \end{split}\]

\(\mu_{j,t}\) are the exogenous components of the impressions (e.g: those changes not dependent on other nodes).

Outcome variable#

Let \(f(\cdot)\) denote the nonlinear response function (e.g., Geometric Adstock, Michaelis–Menten). The outcome combines transformed lower-funnel pressure with additive controls:

\[ Y_t \;=\; f\!\big(X_{4_t};\,\theta\big) \;+\; \text{Trend}_t \;+\; \text{Events}_t \;+\; \varepsilon_t, \]

\(\text{Trend}_t\): smooth baseline evolution.
\(\text{Events}_t\): superposition of localized event basis functions.
\(\theta\): parameters of the nonlinear response function (e.g., saturation/adstock).

Summary: We assume additive structure at each node, linear mediation \(X_1\!\to\!(X_2,X_3)\!\to\!X_4\), dynamic carryover on \(X_4\) via adstock, and a nonlinear mapping \(f\) from adstocked exposure to \(Y\), plus additive trend/events and noise.

Our story unfolds day by day; we create a daily timeline with a few simple calendar features.

min_date = pd.to_datetime("2022-01-01")
max_date = min_date + pd.Timedelta(days=n_observations)

date_range = pd.date_range(start=min_date, end=max_date, freq="D")

df = pd.DataFrame(data={"date_week": date_range}).assign(
    year=lambda x: x["date_week"].dt.year,
    month=lambda x: x["date_week"].dt.month,
    dayofyear=lambda x: x["date_week"].dt.dayofyear,
)

Using Pytensor, we can symbolically represent the causal Directed Acyclic Graph (DAG) in an abstract way before diving into any actual computation. This framework allows us to clearly define the relationships that our data generation process must adhere to. Pytensor simplifies this task and enables us to visualize the resulting graphical model—though it’s a computational DAG rather than a causal one—helping us to confirm that the entire process aligns with our expectations.

# Trend
trend = pt.vector("trend")
# Noise
global_noise = pt.vector("global_noise")
# Events
pt_event_signal = pt.vector("event_signal")  # raw signal
pt_event_contributions = pt.vector("event_contributions")  # contribution to y
# Outcome
y = trend + global_noise + pt_event_contributions
y.dprint();

Add [id A]
 ├─ Add [id B]
 │  ├─ trend [id C]
 │  └─ global_noise [id D]
 └─ event_contributions [id E]

Take a quick glance at the computational graph so we don’t fool ourselves: the ingredients combine exactly as intended. Next, we write down how awareness spills into consideration and then into high‑intent exposures. These links are simple—by design—so we can focus on the causal path rather than curve fitting.

beta_event_x1 = pt.scalar("beta_event_x1")
impressions_x1 = pt.vector("impressions_x1") + (pt_event_signal * beta_event_x1)
beta_x1_x2 = pt.scalar("beta_x1_x2")
beta_x1_x3 = pt.scalar("beta_x1_x3")
impressions_x2 = pt.vector("impressions_x2") + (impressions_x1 * beta_x1_x2)
impressions_x3 = pt.vector("impressions_x3") + (impressions_x1 * beta_x1_x3)
beta_x2_x4 = pt.scalar("beta_x2_x4")
beta_x3_x4 = pt.scalar("beta_x3_x4")
impressions_x4 = (
    pt.vector("impressions_x4")
    + (impressions_x2 * beta_x2_x4)
    + (impressions_x3 * beta_x3_x4)
)

impressions_x4.dprint();

Add [id A]
 ├─ Add [id B]
 │  ├─ impressions_x4 [id C]
 │  └─ Mul [id D]
 │     ├─ Add [id E]
 │     │  ├─ impressions_x2 [id F]
 │     │  └─ Mul [id G]
 │     │     ├─ Add [id H]
 │     │     │  ├─ impressions_x1 [id I]
 │     │     │  └─ Mul [id J]
 │     │     │     ├─ event_signal [id K]
 │     │     │     └─ ExpandDims{axis=0} [id L]
 │     │     │        └─ beta_event_x1 [id M]
 │     │     └─ ExpandDims{axis=0} [id N]
 │     │        └─ beta_x1_x2 [id O]
 │     └─ ExpandDims{axis=0} [id P]
 │        └─ beta_x2_x4 [id Q]
 └─ Mul [id R]
    ├─ Add [id S]
    │  ├─ impressions_x3 [id T]
    │  └─ Mul [id U]
    │     ├─ Add [id H]
    │     │  └─ ···
    │     └─ ExpandDims{axis=0} [id V]
    │        └─ beta_x1_x3 [id W]
    └─ ExpandDims{axis=0} [id X]
       └─ beta_x3_x4 [id Y]

Creating the data for our symbolic place holders.#

First, we kick-off filling in the values for our event and trend variables. Then we craft a gently rising baseline, sprinkle in Gaussian event pulses, and add a whisper of noise.

np_trend = (np.linspace(start=0.0, stop=0.40, num=n_observations) + 0.06) ** (0.1 / 0.2)
pz_global_noise = pz.Normal(mu=0, sigma=0.005).rvs(
    size=n_observations, random_state=SEED
)

event_dates = ["24-12", "09-07"]  # List of events as month-day strings
std_devs = [25, 15]  # List of standard deviations for each event
events_coefficients = [0.080, 0.010]

signals_independent = []

# Initialize the event effect array
event_signal = np.zeros(len(date_range))
event_contributions = np.zeros(len(date_range))

# Generate event signals
for event, std_dev, event_coef in zip(
    event_dates, std_devs, events_coefficients, strict=False
):
    # Find all occurrences of the event in the date range
    event_occurrences = date_range[date_range.strftime("%d-%m") == event]

    for occurrence in event_occurrences:
        # Calculate the time difference in days
        time_diff = (date_range - occurrence).days

        # Generate the Gaussian basis for the event
        _event_signal = np.exp(-0.5 * (time_diff / std_dev) ** 2)

        # Add the event signal to the event effect
        signals_independent.append(_event_signal)
        event_signal += _event_signal

        event_contributions += _event_signal * event_coef

np_event_signal = event_signal
np_event_contributions = event_contributions

We plot the pieces before we mix them.

plt.plot(pz_global_noise, label="Global Noise")
plt.plot(np_trend, label="Trend")
plt.plot(np_event_signal, label="Event Contributions")
plt.title("Components of the Time Series Model")
plt.xlabel("Time (days)")
plt.ylabel("Value")
plt.legend()
plt.grid(True, alpha=0.3)
plt.show()

../../_images/fa4b074f67c22850516743bf6a419e34d1ae39d7b1ba91e83adc1821b7046a18.png

We’ll simulate our media variables (e.g: Impressions) as random walks. Upper, mid, and lower‑funnel exposures don’t typically jump randomly; they meander around a consistent value. A bounded random walk captures that inertia.

def random_walk(mu, sigma, steps, lower=None, upper=None, seed=None):
    """
    Generate a bounded random walk with specified mean and standard deviation.

    Parameters
    ----------
    mu : float
        Target mean of the random walk
    sigma : float
        Target standard deviation of the random walk
    steps : int
        Number of steps in the random walk
    lower : float, optional
        Lower bound for the random walk values
    upper : float, optional
        Upper bound for the random walk values
    seed : int, optional
        Random seed for reproducibility

    Returns
    -------
    np.ndarray
        Random walk array with specified mean, std, and bounds
    """
    # if seed none then set 123
    if seed is None:
        seed = 123
    # Create a random number generator with the given seed
    rng = np.random.RandomState(seed)

    # Start from the target mean
    walk = np.zeros(steps)
    walk[0] = mu

    # Generate the walk step by step with bounds checking
    for i in range(1, steps):
        # Generate a random increment using the seeded RNG
        increment = rng.normal(0, sigma * 0.1)  # Scale increment size

        # Propose next value
        next_val = walk[i - 1] + increment

        # Apply bounds if specified
        if lower is not None and next_val < lower:
            # Reflect off lower bound
            next_val = lower + (lower - next_val)
        if upper is not None and next_val > upper:
            # Reflect off upper bound
            next_val = upper - (next_val - upper)

        # Final bounds check (hard clipping as backup)
        if lower is not None:
            next_val = max(next_val, lower)
        if upper is not None:
            next_val = min(next_val, upper)

        walk[i] = next_val

    # Adjust to match target mean and std while respecting bounds
    current_mean = np.mean(walk)
    current_std = np.std(walk)

    if current_std > 0:
        # Center around zero, scale to target std, then shift to target mean
        walk_centered = (walk - current_mean) / current_std * sigma + mu

        # Apply bounds again after scaling
        if lower is not None:
            walk_centered = np.maximum(walk_centered, lower)
        if upper is not None:
            walk_centered = np.minimum(walk_centered, upper)

        walk = walk_centered

    return walk

We let each stream wander within bounds and take a first look at the behavior.

impressions_x1_var = random_walk(
    mu=0.41, sigma=0.21, steps=n_observations, lower=0, upper=1, seed=SEED + 1
)
impressions_x2_var = random_walk(
    mu=0.1, sigma=0.01, steps=n_observations, lower=0, upper=1, seed=SEED + 2
)
impressions_x3_var = random_walk(
    mu=0.2, sigma=0.05, steps=n_observations, lower=0, upper=1, seed=SEED - 3
)
impressions_x4_var = random_walk(
    mu=0.05, sigma=0.05, steps=n_observations, lower=0, upper=1, seed=SEED - 1
)

fig, axs = plt.subplots(2, 2, figsize=(10, 8))
axs[0, 0].plot(impressions_x1_var, color="blue")
axs[0, 0].set_title("impressions_x1_var")
axs[0, 1].plot(impressions_x2_var, color="red")
axs[0, 1].set_title("impressions_x2_var")
axs[1, 0].plot(impressions_x3_var, color="green")
axs[1, 0].set_title("impressions_x3_var")
axs[1, 1].plot(impressions_x4_var, color="orange")
axs[1, 1].set_title("impressions_x4_var")
plt.show()

../../_images/3540f87dc2f31497bee77b1e258060a8a2f027276a7eeab4d6e097aa59893694.png

Now, we need to define the strength of the relationships, those will be the betas of every variable over the other. Doing so, we can create all missing nodes.

beta_x1_x2_var = 0.02
beta_x1_x3_var = 0.03
beta_x2_x4_var = 0.04
beta_x3_x4_var = 0.05
beta_event_x1_var = 0.01

impressions_x1_eval = impressions_x1.eval(
    {
        "impressions_x1": impressions_x1_var,
        "event_signal": np_event_signal[:-1],
        "beta_event_x1": beta_event_x1_var,
    }
)

impressions_x2_eval = impressions_x2.eval(
    {
        "impressions_x1": impressions_x1_var,
        "event_signal": np_event_signal[:-1],
        "beta_event_x1": beta_event_x1_var,
        "impressions_x2": impressions_x2_var,
        "beta_x1_x2": beta_x1_x2_var,
    }
)

impressions_x3_eval = impressions_x3.eval(
    {
        "impressions_x1": impressions_x1_var,
        "event_signal": np_event_signal[:-1],
        "beta_event_x1": beta_event_x1_var,
        "impressions_x3": impressions_x3_var,
        "beta_x1_x3": beta_x1_x3_var,
    }
)

impressions_x4_eval = impressions_x4.eval(
    {
        "impressions_x1": impressions_x1_var,
        "event_signal": np_event_signal[:-1],
        "beta_event_x1": beta_event_x1_var,
        "impressions_x2": impressions_x2_var,
        "impressions_x3": impressions_x3_var,
        "impressions_x4": impressions_x4_var,
        "beta_x1_x2": beta_x1_x2_var,
        "beta_x1_x3": beta_x1_x3_var,
        "beta_x2_x4": beta_x2_x4_var,
        "beta_x3_x4": beta_x3_x4_var,
    }
)

Only X4 reaches the outcome in our story. That’s where we add memory (adstock) and diminishing returns (saturation).

fig, axs = plt.subplots(2, 2, figsize=(10, 8))
axs[0, 0].plot(impressions_x1_eval, color="blue")
axs[0, 0].set_title("impressions_x1_eval")
axs[0, 1].plot(impressions_x2_eval, color="red")
axs[0, 1].set_title("impressions_x2_eval")
axs[1, 0].plot(impressions_x3_eval, color="green")
axs[1, 0].set_title("impressions_x3_eval")
axs[1, 1].plot(impressions_x4_eval, color="orange")
axs[1, 1].set_title("impressions_x4_eval")

Text(0.5, 1.0, 'impressions_x4_eval')

../../_images/1c79d7c5b3203cc9082b812288de77f2e2810bf24c801078725e4fe4dadb4b4f.png

# Creating forward pass for impressions
def forward_pass(x, adstock_alpha, saturation_lam, saturation_alpha):
    # return type pytensor.tensor.variable.TensorVariable
    return MichaelisMentenSaturation.function(
        MichaelisMentenSaturation,
        x=GeometricAdstock(l_max=24, normalize=False).function(
            x=x,
            alpha=adstock_alpha,
        ),
        lam=saturation_lam,
        alpha=saturation_alpha,
    )


x4_sat_lam = 0.2
x4_sat_alpha = 0.7
x4_adstock_alpha = 0.3

x4_sat_lam_scalar = pt.scalar("x4_sat_lam")
x4_sat_alpha_scalar = pt.scalar("x4_sat_alpha")
x4_adstock_alpha_scalar = pt.scalar("x4_adstock_alpha")

# Apply forward pass to impressions
impressions_x4_forward = forward_pass(
    impressions_x4, x4_adstock_alpha_scalar, x4_sat_lam_scalar, x4_sat_alpha_scalar
)

y += impressions_x4_forward
y.dprint();

Add [id A]
 ├─ Add [id B]
 │  ├─ Add [id C]
 │  │  ├─ trend [id D]
 │  │  └─ global_noise [id E]
 │  └─ event_contributions [id F]
 └─ True_div [id G]
    ├─ Mul [id H]
    │  ├─ ExpandDims{axis=0} [id I]
    │  │  └─ x4_sat_alpha [id J]
    │  └─ Blockwise{Convolve1d, (n),(k),()->(o)} [id K]
    │     ├─ Join [id L]
    │     │  ├─ -1 [id M]
    │     │  ├─ Alloc [id N]
    │     │  │  ├─ 0.0 [id O]
    │     │  │  └─ Sub [id P]
    │     │  │     ├─ Subtensor{i} [id Q]
    │     │  │     │  ├─ Shape [id R]
    │     │  │     │  │  └─ Pow [id S]
    │     │  │     │  │     ├─ ExpandDims{axis=0} [id T]
    │     │  │     │  │     │  └─ Check{0 <= alpha <= 1} [id U]
    │     │  │     │  │     │     ├─ x4_adstock_alpha [id V]
    │     │  │     │  │     │     └─ All{axes=None} [id W]
    │     │  │     │  │     │        └─ MakeVector{dtype='bool'} [id X]
    │     │  │     │  │     │           └─ All{axes=None} [id Y]
    │     │  │     │  │     │              └─ MakeVector{dtype='bool'} [id Z]
    │     │  │     │  │     │                 ├─ Ge [id BA]
    │     │  │     │  │     │                 │  ├─ x4_adstock_alpha [id V]
    │     │  │     │  │     │                 │  └─ 0 [id BB]
    │     │  │     │  │     │                 └─ Le [id BC]
    │     │  │     │  │     │                    ├─ x4_adstock_alpha [id V]
    │     │  │     │  │     │                    └─ 1 [id BD]
    │     │  │     │  │     └─ ARange{dtype='float64'} [id BE]
    │     │  │     │  │        ├─ 0 [id BF]
    │     │  │     │  │        ├─ 24 [id BG]
    │     │  │     │  │        └─ 1 [id BH]
    │     │  │     │  └─ -1 [id BI]
    │     │  │     └─ 1 [id BJ]
    │     │  └─ Add [id BK]
    │     │     ├─ Add [id BL]
    │     │     │  ├─ impressions_x4 [id BM]
    │     │     │  └─ Mul [id BN]
    │     │     │     ├─ Add [id BO]
    │     │     │     │  ├─ impressions_x2 [id BP]
    │     │     │     │  └─ Mul [id BQ]
    │     │     │     │     ├─ Add [id BR]
    │     │     │     │     │  ├─ impressions_x1 [id BS]
    │     │     │     │     │  └─ Mul [id BT]
    │     │     │     │     │     ├─ event_signal [id BU]
    │     │     │     │     │     └─ ExpandDims{axis=0} [id BV]
    │     │     │     │     │        └─ beta_event_x1 [id BW]
    │     │     │     │     └─ ExpandDims{axis=0} [id BX]
    │     │     │     │        └─ beta_x1_x2 [id BY]
    │     │     │     └─ ExpandDims{axis=0} [id BZ]
    │     │     │        └─ beta_x2_x4 [id CA]
    │     │     └─ Mul [id CB]
    │     │        ├─ Add [id CC]
    │     │        │  ├─ impressions_x3 [id CD]
    │     │        │  └─ Mul [id CE]
    │     │        │     ├─ Add [id BR]
    │     │        │     │  └─ ···
    │     │        │     └─ ExpandDims{axis=0} [id CF]
    │     │        │        └─ beta_x1_x3 [id CG]
    │     │        └─ ExpandDims{axis=0} [id CH]
    │     │           └─ beta_x3_x4 [id CI]
    │     ├─ Pow [id S]
    │     │  └─ ···
    │     └─ TensorFromScalar [id CJ]
    │        └─ False [id CK]
    └─ Add [id CL]
       ├─ ExpandDims{axis=0} [id CM]
       │  └─ x4_sat_lam [id CN]
       └─ Blockwise{Convolve1d, (n),(k),()->(o)} [id K]
          └─ ···

With everything in place, we can evaluate the target variable.

# Eval target_var and plot
target_var_eval = y.eval(
    {
        "impressions_x1": impressions_x1_var,
        "event_signal": np_event_signal[:-1],
        "beta_event_x1": beta_event_x1_var,
        "impressions_x2": impressions_x2_var,
        "impressions_x3": impressions_x3_var,
        "impressions_x4": impressions_x4_var,
        "beta_x1_x2": beta_x1_x2_var,
        "beta_x1_x3": beta_x1_x3_var,
        "beta_x2_x4": beta_x2_x4_var,
        "beta_x3_x4": beta_x3_x4_var,
        "x4_adstock_alpha": x4_adstock_alpha,
        "x4_sat_lam": x4_sat_lam,
        "x4_sat_alpha": x4_sat_alpha,
        "event_contributions": np_event_contributions[:-1],
        "trend": np_trend,
        "global_noise": pz_global_noise,
    }
)

plt.plot(target_var_eval, linewidth=2)
plt.title("Target Variable Over Time", fontsize=14)
plt.xlabel("Time Period", fontsize=12)
plt.ylabel("Target Value", fontsize=12)
plt.grid(True, alpha=0.3)
plt.tight_layout()
plt.show()

../../_images/b986ac6a7a9af359e1091976f213687c8f6a54a9f466df12c01792b043352e7f.png

We package our story into a single pandas dataframe with features, controls, and the outcome.

scaler_factor_for_target = 1
scaler_factor_for_impressions = 1
data = pd.DataFrame(
    {
        "date": date_range[:-1],
        "target_var": np.round(target_var_eval * scaler_factor_for_target, 4),
        "impressions_x1": np.round(
            impressions_x1_eval * scaler_factor_for_impressions, 4
        ),
        "impressions_x2": np.round(
            impressions_x2_eval * scaler_factor_for_impressions, 4
        ),
        "impressions_x3": np.round(
            impressions_x3_eval * scaler_factor_for_impressions, 4
        ),
        "impressions_x4": np.round(
            impressions_x4_eval * scaler_factor_for_impressions, 4
        ),
        "event_2020_09": np.round(signals_independent[0][:-1], 4),
        "event_2020_12": np.round(signals_independent[1][:-1], 4),
        "event_2021_09": np.round(signals_independent[2][:-1], 4),
        "event_2021_12": np.round(signals_independent[3][:-1], 4),
        "event_2022_09": np.round(signals_independent[4][:-1], 4),
    }
)
data["trend"] = data.index
data.head()

	date	target_var	impressions_x1	impressions_x2	impressions_x3	impressions_x4	trend
0	2022-01-01	0.4913	0.1565	0.1147	0.2182	0.1097	0
1	2022-01-02	0.5267	0.1377	0.1133	0.2227	0.1104	1
2	2022-01-03	0.5603	0.1656	0.1138	0.2233	0.1205	2
3	2022-01-04	0.5715	0.1097	0.1127	0.2207	0.1192	3
4	2022-01-05	0.5635	0.0863	0.1137	0.2191	0.1127	4

Train & Test split#

We keep a holdout to test whether our counterfactual story travels beyond the training window.

# Split data into train and test sets
train_idx = 879

X_train = data.iloc[:train_idx].drop(columns=["target_var"])
X_test = data.iloc[train_idx:].drop(columns=["target_var"])
y_train = data.iloc[:train_idx]["target_var"]
y_test = data.iloc[train_idx:]["target_var"]

Conclusion#

In this tutorial we showed that it is possible to recover high‑level, upper‑funnel impacts even when the observed target is a lower‑funnel KPI. By explicitly modeling mediation and then propagating that signal through the outcome model, we captured the causal contribution of upper‑funnel activity from low‑level targets.

Estimate mediated effect: quantify the link from awareness to a downstream mediator and recover its contribution.
Propagate to outcomes: translate the mediated effect into expected pressure on the final target via the MMM.
Causal counterfactuals: compare against do‑operator counterfactuals to measure upper‑funnel lift credibly.

This makes upper‑funnel value measurable alongside performance channels, improving attribution, forecasting, and budget decisions while keeping the model grounded in causal structure.

%load_ext watermark
%watermark -n -u -v -iv -w -p pytensor,pymc,pymc_marketing,numpyro

Last updated: Fri Oct 03 2025

Python implementation: CPython
Python version       : 3.12.11
IPython version      : 9.4.0

pytensor      : 2.31.7
pymc          : 5.25.1
pymc_marketing: 0.16.0
numpyro       : 0.19.0

graphviz      : 0.21
arviz         : 0.22.0
pytensor      : 2.31.7
pymc_extras   : 0.4.0
pandas        : 2.3.1
matplotlib    : 3.10.3
xarray        : 2025.7.1
pymc_marketing: 0.16.0
numpy         : 2.2.6
preliz        : 0.20.0

Watermark: 2.5.0

Measuring Upper-Funnel Impact with PyMC‑Marketing#

What you’ll build (at a glance)#

Why this matters#

Who this is for#

What this notebook covers#

Business Challenge: Untangling Upper → Mid → Lower Funnel Effects#

The Ecosystem based on the Causal DAG#

The Business Questions#

Imports#

Notebook settings#

Data generation process#

Data generation process#

Structural causal equations#

Outcome variable#

Creating the data for our symbolic place holders.#

Train & Test split#

All-in causal MMM#

Model configuration#

Fit model#

Mediator model#

Conclusion#