
mParticle Paid Media Integration #15581

Open
andresilva-guardian wants to merge 20 commits into main from afs/mparticle-paid-media-integration

Conversation

@andresilva-guardian
Contributor

@andresilva-guardian andresilva-guardian commented Mar 24, 2026

What does this change?

Adds a new client-side module src/client/mparticle/ that syncs a user's GDPR consent state to the mParticle backend. The sync fires for all users (anonymous and signed-in) whenever their consent state or auth state has changed since the last successful call. When the user is signed in, a Bearer token is also attached so the backend can immediately link the record to the user's identity_id; for anonymous users the call is made without auth and the backend records consent against bwid for later identity resolution.
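As a rough illustration of the request shape described above, the sketch below builds the PATCH request and attaches the Bearer token only when one is supplied. The helper name `buildConsentRequest`, its parameters, and the `ConsentRequest` type are hypothetical and are not taken from the PR's code; the path and body fields follow the flow described in this PR.

```typescript
// Hypothetical helper for illustration only; not the PR's actual implementation.
type ConsentRequest = {
	url: string;
	init: {
		method: 'PATCH';
		headers: Record<string, string>;
		body: string;
	};
};

const buildConsentRequest = (
	apiUrl: string,
	browserId: string,
	consented: boolean,
	pageViewId: string,
	accessToken?: string, // assumed to be defined only for signed-in users
): ConsentRequest => {
	const headers: Record<string, string> = {
		'Content-Type': 'application/json',
	};
	// Anonymous users send no Authorization header at all.
	if (accessToken !== undefined) {
		headers['Authorization'] = `Bearer ${accessToken}`;
	}
	return {
		url: `${apiUrl}/consents/${encodeURIComponent(browserId)}`,
		init: {
			method: 'PATCH',
			headers,
			body: JSON.stringify({ consented, pageViewId }),
		},
	};
};
```

The returned `url` and `init` could then be passed straight to `fetch(url, init)`.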

New files:

| File | Purpose |
| --- | --- |
| src/client/mparticle/mparticle-consent.ts | Registers an onConsentChange callback; runs all guard checks before calling the API |
| src/client/mparticle/mparticleConsentApi.ts | Makes the PATCH /consents/{browserId} call; attaches Bearer token only for signed-in users |
| src/client/mparticle/cookies/mparticleConsentSynced.ts | Manages two browser storage entries: gu.mparticle.lastSynced (localStorage: fingerprint of last successful PATCH) and gu.mparticle.sessionAttempted (sessionStorage: session-scoped retry cap) |
| src/client/mparticle/mparticle-consent.test.ts | 12 unit tests covering all guard conditions and happy paths |

Modified files:

| File | Change |
| --- | --- |
| src/client/main.web.ts | Adds the switch-gated startup('mparticleConsentSync', …) entry |
| src/model/guardian.ts | Declares mparticleApiUrl?: string in config.page |
| fixtures/config.js | Adds mparticleApiUrl for local development |

Also bumps @guardian/libs from 30.1.1 to 31.0.0. This release adds mparticle to the VendorIDs registry (csnx PR #2347), making getConsentFor('mparticle', state) runtime-safe.

Why?

MRR (Marketing Reader Revenue) want to connect mParticle to Meta (Facebook Ads) and Google Ads audiences. To legally send user data to those platforms, mParticle must hold a current record of each user's GDPR consent state. The browser is the source of truth for consent (via Sourcepoint / our CMP), so dotcom-rendering is responsible for forwarding that state to the backend when it changes.

The call is made for all users, not just signed-in ones, because the backend uses the bwid browser ID for identity resolution in the data lake: anonymous consent records are linked to an identity_id overnight once the user signs in or is resolved. If an anonymous user's consent state is already stored as "anonymous:false" and they then sign in, the fingerprint changes to "signed-in:false" and the call fires again, this time with a Bearer token, so the backend can immediately associate the record with their identity without waiting for overnight resolution.

How the new module fits into the startup pipeline

The new startup task runs concurrently with bootCmp and userFeatures, all at critical priority. Because onConsentChange only fires after cmp.init() resolves inside bootCmp, the mParticle callback always receives a real consent state, regardless of startup ordering.

flowchart TD
    A["Browser loads page\nwindow.guardian.config is set"] --> B["main.web.ts - startup() × N"]

    B --> C["bootCmp - priority: critical"]
    B --> D["userFeatures - priority: critical"]
    B --> E["abTesting / sentryLoader / islands / …\npriority: critical"]
    B --> NEW["🆕 mparticleConsentSync - priority: critical\nswitch-gated - off by default"]

The new flow in detail

flowchart TD
    MW["main.web.ts\nif switches.mparticleConsentSync\nstartup('mparticleConsentSync')"] --> SYNC["mparticle-consent.ts\nsyncMparticleConsent()"]

    SYNC --> REG["onConsentChange(callback)\nfires immediately on page load\nand again on any privacy-modal change"]
    REG --> CB["callback(state) fires"]

    CB --> BWID{"getCookie('bwid')"}
    BWID -- "No bwid cookie" --> NOOP["↩ return - no API call"]
    BWID -- "Has bwid" --> AUTH["getAuthStatus()\n→ SignedIn | SignedOut"]

    AUTH --> FP["buildFingerprint(consented, isSignedIn)\ne.g. 'signed-in:true' or 'anonymous:false'"]

    FP --> STALE{"mparticleConsentNeedsSync()?\ncompare fingerprint vs\ngu.mparticle.lastSynced (localStorage)"}
    STALE -- "Fingerprint matches - skip" --> NOOP2["↩ return - no API call"]
    STALE -- "Fingerprint differs" --> SESSION{"sessionAttemptExists()?\ngu.mparticle.sessionAttempted (sessionStorage)"}

    SESSION -- "Already attempted this session - skip" --> NOOP3["↩ return - no API call"]
    SESSION -- "No attempt yet" --> ATTEMPT["markSessionAttempt(fingerprint)\nsessionStorage - cleared on tab close"]

    ATTEMPT --> API["PATCH mparticle-consent.guardianapis.com\n/consents/{browserId}\n{ consented, pageViewId }\nAuthorization: Bearer … (signed-in only)"]

    API -- "200 OK" --> MARK["markMparticleConsentSynced(consented, isSignedIn)\nsets gu.mparticle.lastSynced in localStorage (persistent)"]
    API -- "error" --> ERR["throw Error → surfaces in Sentry\ngu.mparticle.lastSynced NOT updated\n→ will retry next session"]

Rate-limiting approach

Instead of a time-based TTL, the module uses a consent-state fingerprint. After a successful PATCH, the fingerprint is persisted to localStorage (key gu.mparticle.lastSynced) as a string encoding both the consent value and auth state — e.g. "anonymous:false" or "signed-in:true". On each onConsentChange callback, the current fingerprint is computed and compared. If it matches, no call is made.
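A minimal sketch of the fingerprint idea, assuming the string format shown in the examples above. The function names mirror those in the flowchart, but the signatures and bodies here are illustrative guesses, not the PR's code.

```typescript
// Illustrative only: encodes auth state and consent value into one string.
const buildFingerprint = (consented: boolean, isSignedIn: boolean): string =>
	`${isSignedIn ? 'signed-in' : 'anonymous'}:${String(consented)}`;

// lastSynced is whatever was read from storage; null means nothing was
// stored (or storage was unavailable), which always counts as "needs sync".
const needsSync = (current: string, lastSynced: string | null): boolean =>
	current !== lastSynced;
```

Because auth state is part of the string, a sign-in changes the fingerprint even when the consent value itself is unchanged.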

A second entry in sessionStorage (key gu.mparticle.sessionAttempted) records that a PATCH was attempted for the current fingerprint in this browser session. If the API call fails, gu.mparticle.lastSynced is not updated (so the state still looks unsynced) but the sessionStorage entry prevents a retry on every subsequent page load within the same tab. The retry happens on the next browser session.
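The session-scoped cap can be sketched against a small key-value interface. The `KeyValueStore` interface and the in-memory store below are stand-ins invented for this example (the PR uses storage.session from @guardian/libs), and comparing the stored value against the current fingerprint is an assumption about how the check works.

```typescript
// Stand-in for a storage wrapper that returns null when nothing is stored.
interface KeyValueStore {
	get(key: string): string | null;
	set(key: string, value: string): void;
}

// In-memory store so the example runs outside a browser.
const createMemoryStore = (): KeyValueStore => {
	const data = new Map<string, string>();
	return {
		get: (key) => data.get(key) ?? null,
		set: (key, value) => {
			data.set(key, value);
		},
	};
};

const SESSION_KEY = 'gu.mparticle.sessionAttempted';

// True only if an attempt was already recorded for this exact fingerprint.
const sessionAttemptExists = (
	session: KeyValueStore,
	fingerprint: string,
): boolean => session.get(SESSION_KEY) === fingerprint;

// Recorded before the PATCH is sent, so a failed call is not retried
// on later page loads within the same session.
const markSessionAttempt = (
	session: KeyValueStore,
	fingerprint: string,
): void => session.set(SESSION_KEY, fingerprint);
```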

Using localStorage/sessionStorage rather than cookies means these values are never sent to the server on HTTP requests. Both are written via storage.local/storage.session from @guardian/libs, which handle blocked storage gracefully (returning null rather than throwing).

This means:

  • Same state, same session, repeated page loads → skipped (fingerprint matches)
  • Consent changes → new fingerprint → fires
  • User signs in with previously rejected consent → fingerprint changes from "anonymous:false" to "signed-in:false" → fires with Bearer token
  • API failure → retried on next session, not every page load
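The four cases above can be walked through with a small pure function. This is a sketch of the guard ordering described in this PR; `decideSync` and the `SyncDecision` labels are hypothetical names chosen for the example.

```typescript
type SyncDecision =
	| 'skip-already-synced'
	| 'skip-attempted-this-session'
	| 'send';

// Inputs are the current fingerprint plus the two stored values
// (null when nothing is stored).
const decideSync = (
	current: string,
	lastSynced: string | null,
	sessionAttempted: string | null,
): SyncDecision => {
	if (lastSynced === current) return 'skip-already-synced';
	if (sessionAttempted === current) return 'skip-attempted-this-session';
	return 'send';
};

// Same state on a repeated page load.
const repeated = decideSync('anonymous:false', 'anonymous:false', null);
// Consent changes from rejected to accepted.
const consentChanged = decideSync('anonymous:true', 'anonymous:false', null);
// User signs in with previously rejected consent.
const signedIn = decideSync('signed-in:false', 'anonymous:false', null);
// The PATCH failed earlier in this session: lastSynced was never written.
const afterFailure = decideSync('anonymous:true', null, 'anonymous:true');
// Next session: the sessionStorage entry is gone, so the call is retried.
const nextSession = decideSync('anonymous:true', null, null);
```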

This PR is safe to merge before the Scala and backend changes are ready

The entire feature is gated behind window.guardian.config.switches.mparticleConsentSync:

// src/client/main.web.ts
if (window.guardian.config.switches.mparticleConsentSync) {
    void startup(
        'mparticleConsentSync',
        () =>
            import(/* webpackMode: 'eager' */ './mparticle/mparticle-consent')
                .then(({ syncMparticleConsent }) => syncMparticleConsent()),
        { priority: 'critical' },
    );
}

The Switches interface uses an open index signature ([key: string]: boolean | undefined). The Scala frontend repo does not emit mparticleConsentSync, so the value is undefined at runtime, the if block is never entered, no module is imported, and no API call is ever made.
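To make the gating concrete, here is a small sketch of how an open index signature behaves when a key is absent. The `Switches` shape matches the signature quoted above; the `isSwitchOn` helper and the sample switch name are invented for the example.

```typescript
// Same open index signature as described above.
interface Switches {
	[key: string]: boolean | undefined;
}

// Hypothetical helper: treats a missing switch exactly like one set to false.
const isSwitchOn = (switches: Switches, name: string): boolean =>
	switches[name] === true;

// A config where the server has not emitted mparticleConsentSync.
const switchesFromServer: Switches = { someOtherSwitch: true };

// Reading the missing key yields undefined, which is falsy, so an
// `if (switches.mparticleConsentSync)` block would not be entered.
const gateOpen = isSwitchOn(switchesFromServer, 'mparticleConsentSync');
```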

The remaining work (adding the switch in Scala frontend, injecting mparticleApiUrl with the correct per-environment URL, and deploying the backend endpoint to mparticle-consent.guardianapis.com) is tracked in docs/mparticle-work-tracking.md.

Design doc

Full architecture, sequence diagrams, mParticle identity model, fingerprint storage rationale, and manual testing guide: docs/mparticle-paid-media-integration.md.

Screenshots

No visual changes — this is a backend API integration. The only observable frontend artefacts are a network request and two browser storage entries (localStorage and sessionStorage), visible in DevTools → Application once the switch is enabled.

| Before | After |
| --- | --- |
| No mParticle consent sync | PATCH /consents/{bwid} fires for all users when their consent/auth state has genuinely changed (when switch is enabled) |

@andresilva-guardian andresilva-guardian marked this pull request as ready for review April 7, 2026 09:18
@github-actions

github-actions Bot commented Apr 7, 2026

Hello 👋! When you're ready to run Chromatic, please apply the run_chromatic label to this PR.

You will need to reapply the label each time you want to run Chromatic.

Click here to see the Chromatic project.

@andresilva-guardian andresilva-guardian requested a review from a team as a code owner April 7, 2026 09:37
@github-actions

github-actions Bot commented Apr 7, 2026

@andresilva-guardian andresilva-guardian added run_chromatic Runs chromatic when label is applied feature Departmental tracking: work on a new feature labels Apr 7, 2026
@github-actions github-actions Bot removed the run_chromatic Runs chromatic when label is applied label Apr 7, 2026
@andresilva-guardian andresilva-guardian changed the title Afs/mparticle paid media integration mParticle Paid Media Integration Apr 7, 2026
@andresilva-guardian andresilva-guardian added the run_chromatic Runs chromatic when label is applied label Apr 13, 2026
@github-actions github-actions Bot removed the run_chromatic Runs chromatic when label is applied label Apr 13, 2026
@johnduffell
Member

Thanks Andre, I've reviewed the description so far and it sounds great. A couple of questions/points:

  • why cookies rather than local storage? I think cookies would unnecessarily be sent to the server on every request, wouldn't they?
  • could there be any issue with browser storage not being available/blocked? if there are limitations for a portion of browsers, could that cause excessive traffic or other issues?

@andresilva-guardian
Contributor Author

Thanks Andre, I've reviewed the description so far and it sounds great. A couple of questions/points:

  • why cookies rather than local storage? I think cookies would unnecessarily be sent to the server on every request, wouldn't they?
  • could there be any issue with browser storage not being available/blocked? if there are limitations for a portion of browsers, could that cause excessive traffic or other issues?

Good catch on both points, John.

1. Cookies → browser storage

You're right, these two values are purely client-side deduplication state and have no business being sent to the server on every request. We'll switch:

  • gu_mparticle_last_synced (persistent, 1 year) → storage.local
  • gu_mparticle_session_attempted (session-scoped) → storage.session

Both are already available via @guardian/libs, and this aligns with the pattern used everywhere else in the codebase for this kind of client-side state (banner timestamps, Braze user flag, epic view log, etc.).

2. Storage blocked/unavailable

The storage.local and storage.session utilities in @guardian/libs handle blocked storage gracefully: they catch SecurityError and return null rather than throwing, so there's no risk of a crash.

The behavioural consequence when storage is blocked is that:

  • mparticleConsentNeedsSync always returns true (no record of a previous sync)
  • sessionAttemptExists always returns false (no record of a previous attempt)

So affected users would trigger a PATCH on every page load rather than once. However, I believe that the population with storage fully blocked is very small, the mParticle endpoint is idempotent, and this is the same failure mode you'd have with cookies if cookies were blocked too, so it shouldn't cause any meaningful increase in traffic or issues. No special handling is needed for this case.

…age/sessionStorage for fingerprint management
@andresilva-guardian andresilva-guardian added the run_chromatic Runs chromatic when label is applied label Apr 17, 2026
import(
/* webpackMode: 'eager' */ './mparticle/mparticle-consent'
).then(({ syncMparticleConsent }) => syncMparticleConsent()),
{ priority: 'critical' },
Contributor


Why does this module need to load with critical priority? The module's logic seems to be triggered after other asynchronous tasks have been completed, so can it be lazy-loaded after user-critical chunks?

@johnduffell
Member

@andresilva-guardian thanks for looking into the cookie vs local storage issue.

Having researched a bit more, and looked into how Sourcepoint itself works, it looks like:

  • local storage is strictly per origin (i.e. www.theguardian.com doesn't share with support.theguardian.com)
  • cookies can be set up to be shared across subdomains, i.e. all of *.theguardian.com

Sourcepoint works by storing the main information in localStorage and also on the server, then using cookies to share the UUID and last updated between domains, as a kind of cache-validity check.

Does that affect your decision about what to use? There's a tradeoff between traffic on every request vs traffic as people switch between domains.

@johnduffell
Member

Regarding storage being blocked, I'm not sure what you say makes sense; it was more a point for consideration than something that needed a quick reassurance.

the mParticle endpoint is idempotent

I realise that, it's more like the amount of potential traffic that it could receive - given that the ratio of page views to consent changes is very high, only a small group of browsers could have a big impact. If it's a risk then we may need a mitigation on the backend.

and this is the same failure mode you'd have with cookies if cookies were blocked too, so it shouldn't cause any meaningful increase in traffic or issues

That would only make sense if we already had a cookie based solution in place and working. My question was intended to cover either case.


Labels

feature Departmental tracking: work on a new feature run_chromatic Runs chromatic when label is applied

3 participants