7 changes: 3 additions & 4 deletions README.md
@@ -496,15 +496,14 @@ pre-commit run --all-files
pytest tests/
```

## Anonymized Telemetry
## Telemetry

This project collects fully anonymous usage telemetry with an option to opt-out of any telemetry or opt-in to extended telemetry.
This project collects usage telemetry with an option to opt out.

The data is used exclusively to help us provide stability to the relevant products and compute environments and guide future improvements.

- **No personal data is collected**
- **Personal data is collected only if the user has provided consent and accepted the terms of service**
- **No code, model inputs, or outputs are ever sent**
- **Data is strictly anonymous and cannot be linked to individuals**

For details on telemetry, please see our [Telemetry Reference](https://github.com/PriorLabs/TabPFN/blob/main/TELEMETRY.md) and our [Privacy Policy](https://priorlabs.ai/privacy-policy/).

86 changes: 38 additions & 48 deletions TELEMETRY.md
@@ -1,71 +1,61 @@
# 📊 Telemetry
# Telemetry

This project includes lightweight, anonymous telemetry to help us improve TabPFN.
We've designed this with two goals in mind:
TabPFN includes lightweight, optional telemetry that helps us understand how the library is used and where to focus development. This page explains exactly what is collected, how it's handled, and how to opt out.

1. ✅ Be **fully GDPR-compliant** (no personal data, no sensitive data, no surprises)
2. ✅ Be **OSS-friendly and transparent** about what we track and why
## What we collect

If you'd rather not send telemetry, you can always opt out (see **Opting out**).
We gather high-level usage signals - enough to guide development, never enough to expose your data or code.

---
**Events**

## 🔍 What we collect
- `session` - sent when a TabPFN estimator is initialized
- `ping` - liveness check on model initialization
- `model_load` - sent when a model is loaded from disk or cache
- `fit_called` / `predict_called` - sent when you call `fit` or `predict`

We only gather **very high-level usage signals** — enough to guide development, never enough to identify you or your data.
**Metadata (all events)**

Here's the full list:
- `tabpfn_version`, `python_version`, `numpy_version`, `pandas_version` - software versions
- `gpu_type` - GPU type TabPFN is running on
- `timestamp` - time of the event
- `install_date` - date TabPFN was first used (year-month-day)
- `install_id` - random, locally generated installation identifier (see "Privacy" below)
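An installation identifier like this is typically a random UUID generated once and persisted to a local file so it stays stable across sessions. A minimal sketch of the idea (the path, filename, and format TabPFN actually uses may differ):

```python
import uuid
from pathlib import Path


def get_install_id(cache_dir: Path) -> str:
    """Return a persistent random install ID, creating it on first use.

    The ID is a random UUID: it contains no personal data and cannot
    be traced back to a user or machine.
    """
    id_file = cache_dir / "install_id"
    if id_file.exists():
        return id_file.read_text().strip()
    install_id = str(uuid.uuid4())  # random, generated locally
    cache_dir.mkdir(parents=True, exist_ok=True)
    id_file.write_text(install_id)
    return install_id
```

Because the file is only written on first use, every later call returns the same ID.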

### Events
- `ping` – sent when models initialize, used to check liveness
- `fit_called` – sent when you call `fit`
- `predict_called` – sent when you call `predict`
- `session` - sent whenever a user initializes a TabPFN estimator.
**Additional metadata (fit / predict only)**

### Metadata (all events)
- `python_version` – version of Python you're running
- `tabpfn_version` – TabPFN package version
- `timestamp` – time of the event
- `numpy_vesion` - local Numpy version
- `pandas_version` - local Pandas version
- `gpu_type` - type of GPU TabPFN is running on.
- `install_date` - `year-month-day` when TabPFN was used for the first time
- `install_id` - unique, random and anonymous installation ID.
- `task` - classification or regression
- `num_rows`, `num_columns` - dataset shape, rounded into ranges (exact values are never recorded)
- `duration_ms` - wall-clock time of the call

### Extra metadata (`fit` / `predict` only)
- `task` – whether the call was for **classification** or **regression**
- `num_rows` – *rounded* number of rows in your dataset
- `num_columns` – *rounded* number of columns in your dataset
- `duration_ms` – time it took to complete the call
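The rounding of dataset dimensions can be pictured as snapping each value up to a coarse bucket boundary so that exact sizes are never recorded. A hypothetical sketch consistent with the `(953, 17) -> (1000, 20)` example (the actual bucketing scheme is not specified here):

```python
def round_to_range(n: int) -> int:
    """Round a dataset dimension up to a coarse bucket.

    Illustrative scheme: round up to one significant figure,
    e.g. 953 -> 1000 and 17 -> 20, so exact shapes are hidden.
    """
    if n <= 0:
        return 0
    magnitude = 10 ** (len(str(n)) - 1)  # e.g. 953 -> 100, 17 -> 10
    return -(-n // magnitude) * magnitude  # ceiling division, then rescale
```

With this kind of bucketing, many distinct datasets map to the same reported shape, which is what makes linking a telemetry event back to a specific dataset infeasible.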
## What we never collect

---
Regardless of account status, we never collect:

## 🛡️ How we protect your privacy
- Training data, features, labels, or model outputs
- File paths, environment variables, or hostnames
- Exact dataset dimensions
- Code of any kind

- **No inputs, no outputs, no code** ever leave your machine.
- **No personal data** is collected.
- Dataset shapes are **rounded into ranges** (e.g. `(953, 17)` → `(1000, 20)`) so exact dimensionalities can't be linked back to you.
- The data is strictly anonymous — it cannot be tied to individuals, projects, or datasets.
No inputs, outputs, or model weights ever leave your machine.

This approach lets us understand dataset *patterns* (e.g. "most users run with ~1k features") while ensuring no one's data is exposed.
## Privacy

---
TabPFN operates in two modes with different privacy properties:

## 🤔 Why collect telemetry?
**Without an account (anonymous).** Telemetry is tied only to a random `install_id` generated locally on first use. This ID is not linked to any personal information and cannot be traced back to you.

Open-source projects don't get much feedback unless people file issues. Telemetry helps us:
- See which parts of TabPFN are most used (fit vs predict, classification vs regression)
- Detect performance bottlenecks and stability issues
- Prioritize improvements that benefit the most users
**With an account (pseudonymous).** If you create a TabPFN account, your `user_id` is included in telemetry events.

This information goes directly into **making TabPFN better** for the community.
For further details, see our [privacy policy](https://priorlabs.ai/privacy-policy).

---
## Opting out

## 🚫 Opting out

Don't want to send telemetry? No problem — just set the environment variable:
Set one environment variable to disable all telemetry:

```bash
export TABPFN_DISABLE_TELEMETRY=1
```
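If exporting the variable in your shell is inconvenient, it can also be set from Python before `tabpfn` is imported (a usage sketch; only the variable name comes from the docs above):

```python
import os

# Must be set before tabpfn is imported, so telemetry is never initialized.
os.environ["TABPFN_DISABLE_TELEMETRY"] = "1"

# import tabpfn  # safe to import afterwards; telemetry stays disabled
```

Setting it in the process environment this way only affects the current Python process, which is handy for notebooks and tests.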

## Why collect telemetry?

Open-source projects get limited feedback unless people file issues. Telemetry helps us see which parts of TabPFN are most used, detect performance bottlenecks, and prioritize improvements that benefit the most users.
3 changes: 2 additions & 1 deletion pyproject.toml
@@ -21,8 +21,9 @@ dependencies = [
# Once Python 3.10 is the minimum version, this can be removed.
"eval-type-backport>=0.2.2",
"joblib>=1.2.0",
"tabpfn-common-utils[telemetry-interactive]>=0.2.13",
"tabpfn-common-utils[telemetry-interactive]>=0.2.19",
"filelock>=3.11.0",
"pyjwt>=2.12.1",
]
requires-python = ">=3.9"
authors = [
11 changes: 0 additions & 11 deletions src/tabpfn/base.py
@@ -14,7 +14,6 @@
from sklearn.base import (
check_is_fitted,
)
from tabpfn_common_utils.telemetry.interactive import capture_session, ping

# --- TabPFN imports ---
from tabpfn.constants import (
@@ -418,16 +417,6 @@ def estimator_to_device(
return byte_size


def initialize_telemetry() -> None:
"""Initialize telemetry and acknowledge anonymous session.

If user opted out of telemetry using `TABPFN_DISABLE_TELEMETRY`,
no action is taken.
"""
ping()
capture_session()


def get_embeddings(
model: TabPFNClassifier | TabPFNRegressor,
X: XType,
8 changes: 5 additions & 3 deletions src/tabpfn/classifier.py
@@ -30,7 +30,6 @@
import torch
from sklearn import config_context
from sklearn.base import BaseEstimator, ClassifierMixin, check_is_fitted
from tabpfn_common_utils.telemetry import track_model_call

from tabpfn.base import (
ClassifierModelSpecs,
@@ -39,7 +38,6 @@
estimator_to_device,
get_embeddings,
initialize_model_variables_helper,
initialize_telemetry,
)
from tabpfn.constants import (
PROBABILITY_EPSILON_ROUND_ZERO,
@@ -81,6 +79,10 @@
from tabpfn.preprocessing.ensemble import TabPFNEnsemblePreprocessor
from tabpfn.preprocessing.label_encoder import TabPFNLabelEncoder
from tabpfn.preprocessing.modality_detection import detect_feature_modalities
from tabpfn.telemetry import (
init as init_telemetry,
track_model_call,
)
from tabpfn.utils import (
DevicesSpecification,
balance_probas_by_class_counts,
@@ -482,7 +484,7 @@ class in Fine-Tuning. The fit_from_preprocessed() function sets this
self.n_preprocessing_jobs = n_preprocessing_jobs
self.eval_metric = eval_metric
self.tuning_config = tuning_config
initialize_telemetry()
init_telemetry()

# Only anonymously record `fit_mode` usage
log_model_init_params(self, {"fit_mode": self.fit_mode})
4 changes: 2 additions & 2 deletions src/tabpfn/model_loading.py
@@ -26,7 +26,6 @@
import joblib
import torch
from filelock import FileLock
from tabpfn_common_utils.telemetry import set_model_config
from torch import nn

from tabpfn.architectures import ARCHITECTURES
@@ -36,6 +35,7 @@
from tabpfn.inference import InferenceEngine
from tabpfn.inference_config import InferenceConfig
from tabpfn.settings import settings
from tabpfn.telemetry import set_model_config

if TYPE_CHECKING:
from sklearn.base import BaseEstimator
@@ -767,7 +767,7 @@ def log_model_init_params(
# We conditionally import here to avoid introducing breaking changes as
# this interface was introduced in tabpfn_common_utils 0.2.13 and not all
# users have upgraded to this version yet.
from tabpfn_common_utils.telemetry import set_init_params # noqa: PLC0415
from tabpfn.telemetry import set_init_params # noqa: PLC0415

set_init_params(logged_params)
except ImportError:
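The conditional import in the hunk above follows a common graceful-degradation pattern: the telemetry helper is resolved at call time, and a missing or outdated dependency silently turns the call into a no-op instead of breaking user code. A standalone sketch of the pattern (the module name `hypothetical_telemetry` is illustrative, not TabPFN's actual API):

```python
def log_params_safely(params: dict) -> None:
    """Forward params to an optional telemetry backend; no-op if unavailable."""
    try:
        # Deferred import: older installs may not ship this module at all.
        from hypothetical_telemetry import set_init_params  # noqa: PLC0415
    except ImportError:
        return  # telemetry backend absent -- silently skip
    set_init_params(params)
```

Deferring the import to the call site is what keeps the package importable even when the optional telemetry dependency is absent or too old.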
8 changes: 5 additions & 3 deletions src/tabpfn/regressor.py
@@ -35,7 +35,6 @@
TransformerMixin,
check_is_fitted,
)
from tabpfn_common_utils.telemetry import track_model_call

from tabpfn.architectures.base.bar_distribution import FullSupportBarDistribution
from tabpfn.base import (
@@ -45,7 +44,6 @@
estimator_to_device,
get_embeddings,
initialize_model_variables_helper,
initialize_telemetry,
)
from tabpfn.constants import REGRESSION_CONSTANT_TARGET_BORDER_EPSILON, ModelVersion
from tabpfn.errors import TabPFNValidationError, handle_oom_errors
@@ -70,6 +68,10 @@
from tabpfn.preprocessing.steps import (
get_all_reshape_feature_distribution_preprocessors,
)
from tabpfn.telemetry import (
init as init_telemetry,
track_model_call,
)
from tabpfn.utils import (
DevicesSpecification,
convert_batch_of_cat_ix_to_schema,
@@ -466,7 +468,7 @@ class in Fine-Tuning. The fit_from_preprocessed() function sets this
)
self.n_jobs = n_jobs
self.n_preprocessing_jobs = n_preprocessing_jobs
initialize_telemetry()
init_telemetry()

# Only anonymously record `fit_mode` usage
log_model_init_params(self, {"fit_mode": self.fit_mode})