diff --git a/README.md b/README.md index c9bd48e..ffae97b 100644 --- a/README.md +++ b/README.md @@ -1,9 +1,29 @@ # GD Lead Source Tracker -**Version:** 1.0.0 -**Author:** Garrett Digital +**Version:** 1.1.0-wpengine +**Author:** Garrett Digital **Type:** WordPress Must-Use Plugin (mu-plugin) +> **You are on the `wp-engine` branch.** This is the WP Engine adaptation of the plugin. It moves all cookie capture logic to client-side JavaScript to work around WP Engine's aggressive full-page cache. For the standard PHP-based version, see the `main` branch. + +## What's Different on This Branch + +WP Engine caches pages at the server level. When a cached page is served, PHP hooks like `template_redirect` don't fire. This means the server-side `gd_ls_capture()` function never runs for the majority of visitors. + +**This branch removes the `template_redirect` hook and `gd_ls_capture()` function entirely.** All cookie capture logic has been moved into the inline JavaScript that runs in the footer. The JS handles: + +- Parsing `window.location.search` for UTM parameters and gclid +- Reading `document.referrer` for referrer classification +- Classifying the referrer against the same search/social/AI domain lists (ported from PHP to JS) +- Writing cookies via `document.cookie` with the same `gd_ls_` prefix and 30-day expiration +- Following the same first-touch / last-touch attribution rules + +Everything else is unchanged: the PHP shortcodes, form plugin integrations (Formidable, CF7, Gravity Forms), and cookie names all work identically. PHP shortcodes read from `$_COOKIE`, which is populated by JS-set cookies on the next page request. + +**First-visit behavior:** On a visitor's very first pageview, JS sets cookies and populates form fields in the same page load. If the visitor submits a form on that very first page, the JS form population path is the reliable one. 
Shortcode-based hidden field defaults (e.g., `[gd_ls_source]` as a Formidable default value) will be empty on that first visit but populated correctly on every subsequent page. + +--- + ## What It Does Captures traffic attribution data (UTM parameters, gclid, referrer, landing page) on a visitor's first pageview and stores it in cookies. When the visitor fills out a form, the plugin populates hidden fields with that attribution data so you know where each lead came from. @@ -16,16 +36,16 @@ Upload `gd-lead-source-tracker.php` to `/wp-content/mu-plugins/`. MU-plugins loa ## How It Works -### Cookie Capture (Server-Side, PHP) +### Cookie Capture (Client-Side, JavaScript — WP Engine Adaptation) -On every front-end page load (`template_redirect`), the plugin runs this logic: +On every front-end page load, the inline JS runs this logic: -1. **Guard checks** skip admin pages, AJAX, cron, REST API, CLI, 404s, RSS feeds, and logged-in editors/admins. -2. **UTM parameters** in the URL (`utm_source`, `utm_medium`, `utm_campaign`, `utm_term`, `utm_content`, `gclid`) are read and sanitized. +1. **Guard checks** skip logged-in editors/admins (applied in PHP before the script outputs). +2. **UTM parameters** in the URL (`utm_source`, `utm_medium`, `utm_campaign`, `utm_term`, `utm_content`, `gclid`) are read from `window.location.search` and sanitized. 3. **gclid auto-classification** sets source to "Google" and medium to "cpc" when gclid is present but UTMs are missing. -4. **Referrer classification** kicks in when no UTMs are present. The plugin checks the HTTP referrer against known domain lists for search engines, social platforms, and AI tools, then assigns source/medium accordingly. -5. **First-touch vs. last-touch** behavior differs by channel type. Organic/social/referral sources only write cookies if no source cookie exists yet (first-touch). UTM-tagged visits always overwrite (last-touch for paid campaigns). -6. 
**Referrer, landing page, and timestamp** are captured once and never overwritten. +4. **Referrer classification** kicks in when no UTMs are present. The JS reads `document.referrer` and checks it against the same domain lists for search engines, social platforms, and AI tools, then assigns source/medium accordingly. +5. **First-touch vs. last-touch** behavior is identical to the PHP version. Organic/social/referral sources only write cookies if no source cookie exists yet (first-touch). UTM-tagged visits always overwrite (last-touch for paid campaigns). +6. **Referrer, landing page (`window.location.href`), and timestamp** are captured once and never overwritten. ### Cookie Names @@ -41,11 +61,11 @@ All cookies use the `gd_ls_` prefix: | `gd_ls_gclid` | Google Ads click ID | When gclid param present | | `gd_ls_referrer` | Raw referrer URL or "(direct)" | First visit only | | `gd_ls_landing_page` | Full URL of first page visited | First visit only | -| `gd_ls_timestamp` | Date/time of first visit (site timezone) | First visit only | +| `gd_ls_timestamp` | Date/time of first visit (WordPress site timezone) | First visit only | Cookie duration: **30 days** (configurable via `GD_LS_COOKIE_DAYS` constant). -Cookies are set with `httpOnly = false` so the client-side JavaScript can read them for form population. +Cookies are set with `SameSite=Lax` and, on HTTPS sites, the `Secure` flag. `HttpOnly` is not set (JavaScript cannot set it via `document.cookie`), so the cookies stay readable by the form-population script; PHP reads them from `$_COOKIE` on subsequent requests. ### Form Population (Client-Side, JavaScript) @@ -68,7 +88,7 @@ The plugin classifies referrers into four channels: | Other external | `referral` | Any domain not in the lists above (uses bare hostname as source) | | No referrer | `none` | Direct traffic (source = "direct") | -To add new domains, edit the arrays in `gd_ls_get_channel_lists()`. +The domain lists are defined twice: once in PHP (`gd_ls_get_channel_lists()`) and once in JavaScript. If you add a domain to one, add it to the other. 
### Shortcodes @@ -113,19 +133,28 @@ define( 'GD_LS_COOKIE_DAYS', 30 ); ## Referrer List Maintenance -When new search engines, social platforms, or AI tools gain meaningful traffic share, add them to `gd_ls_get_channel_lists()`. The format is `'domain.fragment' => 'Display Name'`. Matching uses `strpos` against the referrer hostname, so `'google.'` matches `google.com`, `google.co.uk`, etc. +When new search engines, social platforms, or AI tools gain meaningful traffic share, add them to `gd_ls_get_channel_lists()` in PHP **and** to the corresponding JS objects (`SEARCH_ENGINES`, `SOCIAL_PLATFORMS`, `AI_TOOLS`) in `gd_ls_inline_script()`. Both must stay in sync. + +## Testing on WP Engine + +1. Enable WP Engine's page cache in staging. +2. Visit with UTM params (e.g., `?utm_source=google&utm_medium=cpc&utm_campaign=test`). Verify cookies are set via browser DevTools → Application → Cookies. +3. Visit from Google organic, for example by following a link to the site from a Google results page (`document.referrer` is read-only, so it cannot simply be assigned in DevTools). Verify `gd_ls_source=Google` and `gd_ls_medium=organic`. +4. Fill out a form. Verify hidden fields contain the correct values. +5. Verify shortcodes in email notifications pull correct data (requires a form submission so the cookie is available server-side). ## Known Limitations 1. **Cookie-based tracking** means data is lost if the user clears cookies or uses a different browser/device. -2. **No server-side form integration for WP Engine** or other hosts with aggressive page caching. The PHP cookie-setting runs on `template_redirect`, which gets bypassed on cached pages. See the WP Engine adaptation notes below. -3. **30-day window** means a visitor who returns after 31 days starts fresh. -4. **No cross-domain tracking.** If you run multiple domains, cookies are scoped per domain. +2. **30-day window** means a visitor who returns after 31 days starts fresh. +3. **No cross-domain tracking.** Cookies are scoped per domain. +4. 
**First-visit shortcode gap**: PHP shortcodes read `$_COOKIE`, which is populated by JS cookies on subsequent requests. A form submitted on the visitor's very first pageview will have shortcode values empty server-side; JS form field population handles this case client-side. -## WP Engine Considerations +## Branch Strategy -WP Engine's page caching serves static HTML for most visitors, which means the PHP `template_redirect` hook never fires on cached pages. The cookies won't get set server-side for the majority of visits. - -**Recommended approach:** Move all cookie capture logic to JavaScript. The JS version would read UTM params from `window.location.search`, read the referrer from `document.referrer`, classify the source client-side, and set cookies via `document.cookie`. The form population logic already works client-side, so that part stays the same. +``` +main ← stable, standard PHP version (for non-cached environments) +└── wp-engine ← permanent parallel branch for WP Engine (JS-based capture) +``` -See the `wp-engine` branch for this adaptation. +The `wp-engine` branch is a long-lived permanent branch, not a feature branch to merge back. When `main` gets improvements that also apply here (e.g., new referrer domains, security fixes), cherry-pick or merge them into `wp-engine` and keep both domain lists in sync. diff --git a/gd-lead-source-tracker.php b/gd-lead-source-tracker.php index 7379247..5cefe65 100644 --- a/gd-lead-source-tracker.php +++ b/gd-lead-source-tracker.php @@ -4,13 +4,21 @@ * Description: Captures UTM parameters, gclid, referrer, and landing page in cookies. * Classifies organic traffic source/medium from the referrer. * Populates hidden form fields in Formidable Forms and Contact Form 7. - * Version: 1.0.0 + * Version: 1.1.0-wpengine * Author: Garrett Digital * * INSTALLATION: * Upload this file to /wp-content/mu-plugins/gd-lead-source-tracker.php * MU-plugins load automatically. No activation step needed. 
* + * WP ENGINE ADAPTATION: + * This branch moves all cookie capture logic to client-side JavaScript. + * WP Engine's aggressive page caching serves static HTML for most visitors, + * which means the PHP `template_redirect` hook never fires on cached pages. + * Cookie capture is therefore handled entirely in JS on page load. + * The PHP shortcodes, form integrations, and helper functions are unchanged — + * they read from $_COOKIE which is populated by JS cookies on subsequent requests. + * * COOKIE PREFIX: gd_ls_ * COOKIE DURATION: 30 days (configurable below) * @@ -36,7 +44,7 @@ define( 'GD_LS_COOKIE_DAYS', 30 ); define( 'GD_LS_PREFIX', 'gd_ls_' ); -define( 'GD_LS_VERSION', '1.0.0' ); +define( 'GD_LS_VERSION', '1.1.0-wpengine' ); // Fields we track. The cookie name is GD_LS_PREFIX + key. // "param" is the URL query parameter that maps to this field (if any). @@ -61,6 +69,9 @@ * Returns arrays of known domain fragments for each channel. * Matching is done with strpos against the referrer hostname. * Add or remove entries as needed. + * + * Note: These lists are also mirrored in the inline JS below. + * If you add a domain here, add it to the JS arrays too. */ function gd_ls_get_channel_lists() { return array( @@ -160,134 +171,26 @@ function gd_ls_classify_referrer( $referrer_url ) { // ────────────────────────────────────────────── -// SERVER-SIDE: CAPTURE & SET COOKIES +// WP ENGINE ADAPTATION NOTE +// ────────────────────────────────────────────── +// +// The template_redirect hook and gd_ls_capture() PHP function have been +// removed on this branch. WP Engine's aggressive full-page cache serves +// static HTML to most visitors, which means PHP hooks like template_redirect +// never fire on cached pages. Server-side cookie capture is therefore +// unreliable on WP Engine. +// +// All cookie capture logic has been moved to inline JavaScript (see +// gd_ls_inline_script() below). 
The JS runs on every page load — cached +// or not — and handles UTM parsing, referrer classification, and cookie +// writing client-side using the same naming conventions, attribution rules, +// and domain lists as the PHP version. +// +// The PHP shortcodes, Formidable Forms integration, Contact Form 7 integration, +// and Gravity Forms integration are unchanged. They read from $_COOKIE, which +// will be populated by JS-set cookies on subsequent page requests. +// // ────────────────────────────────────────────── - -add_action( 'template_redirect', 'gd_ls_capture', 1 ); - -function gd_ls_capture() { - - // 1. Original guard: don't run in admin, AJAX, cron, REST API, or CLI. - if ( is_admin() || wp_doing_ajax() || wp_doing_cron() || defined( 'REST_REQUEST' ) || ( defined( 'WP_CLI' ) && WP_CLI ) ) { - return; - } - - // 2. Ghost-proof guard: Only capture data on real web pages. - // Ignore calls to files, RSS feeds, or unexpected internal processes. - if ( ! is_singular() && ! is_front_page() && ! is_archive() && ! is_home() && ! is_search() ) { - return; - } - - // 3. Ignore 404 errors (e.g., when the browser looks for an apple-touch-icon.png that doesn't exist) - if ( is_404() ) { - return; - } - - // Don't run for logged-in admins/editors (avoids polluting data) - if ( is_user_logged_in() && current_user_can( 'edit_posts' ) ) { - return; - } - - $cookie_duration = time() + ( DAY_IN_SECONDS * GD_LS_COOKIE_DAYS ); - $cookie_domain = gd_ls_get_cookie_domain(); - $is_secure = is_ssl(); - - // ── Step 1: Check for UTM parameters and gclid in the URL. 
── - - $has_utm = false; - $utm_data = array(); - $param_map = array( - 'source' => 'utm_source', - 'medium' => 'utm_medium', - 'campaign' => 'utm_campaign', - 'term' => 'utm_term', - 'content' => 'utm_content', - 'gclid' => 'gclid', - ); - - foreach ( $param_map as $field_key => $query_param ) { - if ( isset( $_GET[ $query_param ] ) && $_GET[ $query_param ] !== '' ) { - $utm_data[ $field_key ] = sanitize_text_field( wp_unslash( $_GET[ $query_param ] ) ); - if ( $field_key !== 'gclid' ) { - $has_utm = true; - } - } - } - - // If gclid is present but no explicit utm_medium, set medium to cpc. - if ( ! empty( $utm_data['gclid'] ) && empty( $utm_data['medium'] ) ) { - $utm_data['medium'] = 'cpc'; - } - if ( ! empty( $utm_data['gclid'] ) && empty( $utm_data['source'] ) ) { - $utm_data['source'] = 'Google'; - } - - // ── Step 2: If no UTMs, classify from referrer. ── - - if ( ! $has_utm && empty( $utm_data['gclid'] ) ) { - $referrer = isset( $_SERVER['HTTP_REFERER'] ) ? esc_url_raw( wp_unslash( $_SERVER['HTTP_REFERER'] ) ) : ''; - $classified = gd_ls_classify_referrer( $referrer ); - - if ( $classified !== null ) { - // Only write source/medium if we don't already have a cookie. - // This preserves the original source across internal page navigations. - $existing_source = isset( $_COOKIE[ GD_LS_PREFIX . 'source' ] ) ? $_COOKIE[ GD_LS_PREFIX . 'source' ] : ''; - - if ( empty( $existing_source ) ) { - $utm_data['source'] = $classified['source']; - $utm_data['medium'] = $classified['medium']; - } - } - } else { - // UTMs present: always overwrite (last-touch attribution for paid campaigns). - // This means if someone first came from organic and later clicks a Google Ad, - // the source updates to the ad. This is intentional for paid campaign tracking. - } - - // ── Step 3: Capture referrer URL (raw, always on first visit). ── - - if ( ! isset( $_COOKIE[ GD_LS_PREFIX . 
'referrer' ] ) ) { - // If there's a referrer we save it; if empty we explicitly save the string "(direct)". - $referrer = isset( $_SERVER['HTTP_REFERER'] ) && ! empty( $_SERVER['HTTP_REFERER'] ) ? esc_url_raw( wp_unslash( $_SERVER['HTTP_REFERER'] ) ) : '(direct)'; - $utm_data['referrer'] = $referrer; - } - - // ── Step 4: Capture landing page (first page visited, set once). ── - - if ( ! isset( $_COOKIE[ GD_LS_PREFIX . 'landing_page' ] ) ) { - $protocol = is_ssl() ? 'https://' : 'http://'; - $utm_data['landing_page'] = $protocol . wp_parse_url( home_url(), PHP_URL_HOST ) . esc_url_raw( wp_unslash( $_SERVER['REQUEST_URI'] ) ); - } - - // ── Step 5: Capture timestamp (set once). ── - - if ( ! isset( $_COOKIE[ GD_LS_PREFIX . 'timestamp' ] ) ) { - $utm_data['timestamp'] = current_time( 'Y-m-d H:i:s' ); - } - - // ── Step 6: Write cookies for any new data. ── - - foreach ( $utm_data as $key => $value ) { - if ( $value === '' && isset( $_COOKIE[ GD_LS_PREFIX . $key ] ) ) { - continue; // Don't overwrite existing cookie with empty value. - } - - $cookie_name = GD_LS_PREFIX . $key; - - setcookie( - $cookie_name, - $value, - $cookie_duration, - '/', - $cookie_domain, - $is_secure, - false // httpOnly = false so JS can read it for form population - ); - - // Make it available to PHP in the same request. - $_COOKIE[ $cookie_name ] = $value; - } -} // ────────────────────────────────────────────── @@ -312,7 +215,7 @@ function gd_ls_get_cookie_domain() { // ────────────────────────────────────────────── -// CLIENT-SIDE: JAVASCRIPT TO POPULATE FORM FIELDS +// CLIENT-SIDE: JAVASCRIPT FOR COOKIE CAPTURE AND FORM POPULATION // ────────────────────────────────────────────── add_action( 'wp_enqueue_scripts', 'gd_ls_enqueue_scripts' ); @@ -341,13 +244,21 @@ function gd_ls_inline_script() { if ( is_user_logged_in() && current_user_can( 'edit_posts' ) ) { return; } + + // Pass PHP values into JS safely. 
+ $cookie_domain = gd_ls_get_cookie_domain(); + $site_host = strtolower( wp_parse_url( home_url(), PHP_URL_HOST ) ); + $site_host_bare = preg_replace( '/^www\./', '', $site_host ); + $gmt_offset = (float) get_option( 'gmt_offset' ); // Hours offset from UTC, e.g. -6 for CST. ?>