diff --git a/THIRD-PARTY-NOTICES.md b/THIRD-PARTY-NOTICES.md new file mode 100644 index 000000000..0e417bca7 --- /dev/null +++ b/THIRD-PARTY-NOTICES.md @@ -0,0 +1,39 @@ +# Third-Party Notices + +This file documents third-party code adapted into this repository, with +upstream attribution preserved. Transitive dependencies installed via +`pip` are governed by their own licenses (see `pyproject.toml` for the +canonical list). + +--- + +## Summary + +| Upstream | Local path | Upstream license | +|---|---|---| +| skrub — `SquashingScaler` | `src/tabpfn/preprocessing/steps/squashing_scaler_transformer.py`<br>`src/tabpfn/preprocessing/torch/torch_squashing_scaler.py` | BSD-3-Clause | + +--- + +## Per-upstream notices + +### skrub — SquashingScaler + +**Upstream:** https://github.com/skrub-data/skrub +**Local paths:** +- `src/tabpfn/preprocessing/steps/squashing_scaler_transformer.py` — CPU/scikit-learn implementation +- `src/tabpfn/preprocessing/torch/torch_squashing_scaler.py` — PyTorch port of the same algorithm, with explicit-state fit/apply semantics + +**License:** BSD-3-Clause +**Copyright:** Copyright (c) 2018-2023, The dirty_cat developers, 2023-2026 the skrub developers. All rights reserved. (per the skrub `LICENSE.txt`) +**Modifications:** Adapted to fit TabPFN's preprocessing pipeline; algorithmic logic preserved across both implementations. Upstream does not ship a per-file copyright header; attribution is carried in this NOTICE plus the in-file blocks. + +--- + +## Adding new entries + +When vendoring or adapting third-party code: + +1. Preserve any upstream per-file copyright and license header verbatim. If the upstream does not ship a per-file header, add an attribution block citing the upstream URL, copyright holder, and SPDX license identifier. +2. When vendoring a whole directory of upstream code, also vendor the upstream `LICENSE` / `NOTICE` file alongside it. For single-file adaptations, the in-file attribution plus the entry in this NOTICE file is sufficient. +3. Add a row to the summary table and a per-upstream notice to this file, including the upstream copyright line when one is published. diff --git a/changelog/964.added.md b/changelog/964.added.md new file mode 100644 index 000000000..edbea8220 --- /dev/null +++ b/changelog/964.added.md @@ -0,0 +1 @@ +Add `THIRD-PARTY-NOTICES.md` at repo root documenting third-party code adapted into TabPFN (currently: skrub's `SquashingScaler`, used by both the CPU and PyTorch preprocessing implementations) with upstream attribution preserved.
diff --git a/src/tabpfn/preprocessing/steps/squashing_scaler_transformer.py b/src/tabpfn/preprocessing/steps/squashing_scaler_transformer.py index 7ff176afe..6b3c0abdd 100644 --- a/src/tabpfn/preprocessing/steps/squashing_scaler_transformer.py +++ b/src/tabpfn/preprocessing/steps/squashing_scaler_transformer.py @@ -1,6 +1,11 @@ """Implementation of the SquashingScaler, adapted from skrub. -See https://skrub-data.org/stable/reference/generated/skrub.SquashingScaler.html +Adapted from skrub: https://github.com/skrub-data/skrub + reference: https://skrub-data.org/stable/reference/generated/skrub.SquashingScaler.html + +Copyright (c) 2018-2023, The dirty_cat developers, 2023-2026 the skrub developers. +All rights reserved. +SPDX-License-Identifier: BSD-3-Clause This preprocessing is used e.g. in RealMLP, see https://arxiv.org/abs/2407.04491 """ diff --git a/src/tabpfn/preprocessing/torch/torch_squashing_scaler.py b/src/tabpfn/preprocessing/torch/torch_squashing_scaler.py index 356e02bb5..19a14ee26 100644 --- a/src/tabpfn/preprocessing/torch/torch_squashing_scaler.py +++ b/src/tabpfn/preprocessing/torch/torch_squashing_scaler.py @@ -3,10 +3,16 @@ """Torch implementation of SquashingScaler with NaN handling. Mirrors the CPU -:class:`tabpfn.preprocessing.steps.squashing_scaler_transformer.SquashingScaler`: -robust median-centering with quartile scaling (or a min-max fallback when the -inter-quartile range collapses), followed by an injective soft-clip into -``[-max_absolute_value, +max_absolute_value]``. +:class:`tabpfn.preprocessing.steps.squashing_scaler_transformer.SquashingScaler`, +which is itself adapted from skrub: + https://github.com/skrub-data/skrub +The algorithmic logic (robust median-centering with quartile scaling, min-max +fallback, soft-clip) is derived from skrub's ``SquashingScaler``. + +Original skrub attribution: + Copyright (c) 2018-2023, The dirty_cat developers, 2023-2026 the skrub developers. + All rights reserved. 
+ SPDX-License-Identifier: BSD-3-Clause The state is returned explicitly from ``fit`` rather than stored on the instance, matching the rest of ``preprocessing/torch``.
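For orientation, the algorithm the two adapted files share (robust median-centering with quartile scaling, a min-max fallback when the inter-quartile range collapses, then an injective soft-clip into `[-max_absolute_value, +max_absolute_value]`) can be sketched roughly as below. This is a hypothetical single-column NumPy illustration, not the actual skrub or TabPFN code; in particular the soft-clip formula `x / sqrt(1 + (x/B)^2)` is an assumption about how the injective squashing is realized:

```python
import numpy as np


def squashing_scale_sketch(col: np.ndarray, max_abs: float = 3.0) -> np.ndarray:
    """Hypothetical sketch of the SquashingScaler idea for one column.

    NaNs are excluded from the fitted statistics and propagate through the
    transform unchanged.
    """
    finite = col[np.isfinite(col)]
    median = np.median(finite)
    q1, q3 = np.percentile(finite, [25, 75])
    iqr = q3 - q1
    if iqr > 0:
        scale = 1.0 / iqr  # robust quartile scaling
    else:
        # min-max fallback when the inter-quartile range collapses
        lo, hi = finite.min(), finite.max()
        scale = 2.0 / (hi - lo) if hi > lo else 1.0
    x = (col - median) * scale
    # injective soft-clip into (-max_abs, +max_abs): strictly monotone,
    # so distinct inputs never collide, unlike a hard clip
    return x / np.sqrt(1.0 + (x / max_abs) ** 2)
```

A PyTorch port following the explicit-state convention described above would split this into a `fit` that returns `(median, scale)` and an `apply` that takes that state as an argument, rather than storing it on the instance.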