38 changes: 24 additions & 14 deletions c/include/cuvs/preprocessing/pca.h
@@ -85,6 +85,9 @@ CUVS_EXPORT cuvsError_t cuvsPcaParamsDestroy(cuvsPcaParams_t params);
* Computes the principal components, explained variances, singular values, and column means
* from the input data.
*
* The layout of `input` (C-contiguous / row-major or F-contiguous / col-major) is detected
* from its DLPack strides; `components` must use the same layout as `input`.
*
* @code {.c}
* #include <cuvs/core/c_api.h>
* #include <cuvs/preprocessing/pca.h>
@@ -98,9 +101,9 @@ CUVS_EXPORT cuvsError_t cuvsPcaParamsDestroy(cuvsPcaParams_t params);
* cuvsPcaParamsCreate(&params);
* params->n_components = 2;
*
* // Assume populated DLManagedTensor objects (col-major, float32, device memory)
* DLManagedTensor input; // [n_rows x n_cols]
* DLManagedTensor components; // [n_components x n_cols]
* // Assume populated DLManagedTensor objects (float32, device memory)
* DLManagedTensor input; // [n_rows x n_cols] (C- or F-contiguous)
* DLManagedTensor components; // [n_components x n_cols] (same layout as input)
* DLManagedTensor explained_var; // [n_components]
* DLManagedTensor explained_var_ratio; // [n_components]
* DLManagedTensor singular_vals; // [n_components]
@@ -117,8 +120,8 @@ CUVS_EXPORT cuvsError_t cuvsPcaParamsDestroy(cuvsPcaParams_t params);
*
* @param[in] res cuvsResources_t opaque C handle
* @param[in] params PCA parameters
* @param[inout] input input data [n_rows x n_cols] (col-major, float32, device)
* @param[out] components principal components [n_components x n_cols] (col-major, float32, device)
* @param[inout] input input data [n_rows x n_cols] (C- or F-contiguous, float32, device)
* @param[out] components principal components [n_components x n_cols] (same layout as input)
* @param[out] explained_var explained variances [n_components] (float32, device)
* @param[out] explained_var_ratio explained variance ratios [n_components] (float32, device)
* @param[out] singular_vals singular values [n_components] (float32, device)
@@ -142,12 +145,14 @@ CUVS_EXPORT cuvsError_t cuvsPcaFit(cuvsResources_t res,
* @brief Perform PCA fit and transform in a single operation.
*
* Computes the principal components and transforms the input data into the eigenspace.
* The layout of `input` (C- or F-contiguous) is detected from its DLPack strides; all
* other matrix tensors must use the same layout.
*
* @param[in] res cuvsResources_t opaque C handle
* @param[in] params PCA parameters
* @param[inout] input input data [n_rows x n_cols] (col-major, float32, device)
* @param[out] trans_input transformed data [n_rows x n_components] (col-major, float32, device)
* @param[out] components principal components [n_components x n_cols] (col-major, float32, device)
* @param[inout] input input data [n_rows x n_cols] (C- or F-contiguous, float32, device)
* @param[out] trans_input transformed data [n_rows x n_components] (same layout as input)
* @param[out] components principal components [n_components x n_cols] (same layout as input)
* @param[out] explained_var explained variances [n_components] (float32, device)
* @param[out] explained_var_ratio explained variance ratios [n_components] (float32, device)
* @param[out] singular_vals singular values [n_components] (float32, device)
@@ -172,14 +177,16 @@ CUVS_EXPORT cuvsError_t cuvsPcaFitTransform(cuvsResources_t res,
* @brief Perform PCA transform operation.
*
* Transforms the input data into the eigenspace using previously computed principal components.
* The layout of `input` (C- or F-contiguous) is detected from its DLPack strides; all other
* matrix tensors must use the same layout.
*
* @param[in] res cuvsResources_t opaque C handle
* @param[in] params PCA parameters
* @param[inout] input data to transform [n_rows x n_cols] (col-major, float32, device)
* @param[in] components principal components [n_components x n_cols] (col-major, float32, device)
* @param[inout] input data to transform [n_rows x n_cols] (C- or F-contiguous, float32, device)
* @param[in] components principal components [n_components x n_cols] (same layout as input)
* @param[in] singular_vals singular values [n_components] (float32, device)
* @param[in] mu column means [n_cols] (float32, device)
* @param[out] trans_input transformed data [n_rows x n_components] (col-major, float32, device)
* @param[out] trans_input transformed data [n_rows x n_components] (same layout as input)
* @return cuvsError_t
*/
CUVS_EXPORT cuvsError_t cuvsPcaTransform(cuvsResources_t res,
@@ -194,14 +201,17 @@ CUVS_EXPORT cuvsError_t cuvsPcaTransform(cuvsResources_t res,
* @brief Perform PCA inverse transform operation.
*
* Transforms data from the eigenspace back to the original space.
* The layout of `trans_input` (C- or F-contiguous) is detected from its DLPack strides;
* all other matrix tensors must use the same layout.
*
* @param[in] res cuvsResources_t opaque C handle
* @param[in] params PCA parameters
* @param[in] trans_input transformed data [n_rows x n_components] (col-major, float32, device)
* @param[in] components principal components [n_components x n_cols] (col-major, float32, device)
* @param[in] trans_input transformed data [n_rows x n_components] (C- or F-contiguous,
* float32, device)
* @param[in] components principal components [n_components x n_cols] (same layout as trans_input)
* @param[in] singular_vals singular values [n_components] (float32, device)
* @param[in] mu column means [n_cols] (float32, device)
* @param[out] output reconstructed data [n_rows x n_cols] (col-major, float32, device)
* @param[out] output reconstructed data [n_rows x n_cols] (same layout as trans_input)
* @return cuvsError_t
*/
CUVS_EXPORT cuvsError_t cuvsPcaInverseTransform(cuvsResources_t res,
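
Note on the layout detection described in the doc comments above: DLPack reports strides in elements (not bytes), and a null `strides` pointer denotes a compact row-major tensor. A minimal sketch of how C- versus F-contiguity can be derived from shape and strides follows. `sketch::TensorDesc`, `sketch::is_c_contiguous`, `sketch::is_f_contiguous`, and `sketch::detect_layout` are hypothetical stand-ins written for illustration; they are not the `cuvs::core` helpers used in this PR, whose edge-case behavior may differ.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <vector>

namespace sketch {

// Hypothetical stand-in for the DLTensor fields that matter for layout detection.
struct TensorDesc {
  std::vector<int64_t> shape;
  std::vector<int64_t> strides;  // in elements; empty models DLPack's strides == NULL
};

// Row-major: the last dimension varies fastest. Extent-1 dimensions are ignored
// because their stride never affects addressing.
inline bool is_c_contiguous(const TensorDesc& t)
{
  if (t.strides.empty()) { return true; }  // NULL strides mean compact row-major
  int64_t expected = 1;
  for (std::size_t i = t.shape.size(); i-- > 0;) {
    if (t.shape[i] != 1 && t.strides[i] != expected) { return false; }
    expected *= t.shape[i];
  }
  return true;
}

// Column-major: the first dimension varies fastest.
inline bool is_f_contiguous(const TensorDesc& t)
{
  if (t.strides.empty()) {
    // A compact row-major tensor is also column-major only when at most one
    // extent differs from 1 (for example a 1 x n or n x 1 matrix).
    int n_non_unit = 0;
    for (int64_t extent : t.shape) {
      if (extent != 1) { ++n_non_unit; }
    }
    return n_non_unit <= 1;
  }
  int64_t expected = 1;
  for (std::size_t i = 0; i < t.shape.size(); ++i) {
    if (t.shape[i] != 1 && t.strides[i] != expected) { return false; }
    expected *= t.shape[i];
  }
  return true;
}

enum class Layout { col_major, row_major, non_contiguous };

// Checks F-order first, mirroring the order of the branches in this PR's dispatch.
inline Layout detect_layout(const TensorDesc& t)
{
  if (is_f_contiguous(t)) { return Layout::col_major; }
  if (is_c_contiguous(t)) { return Layout::row_major; }
  return Layout::non_contiguous;
}

inline void examples()
{
  TensorDesc row_major_4x3{{4, 3}, {3, 1}};
  TensorDesc col_major_4x3{{4, 3}, {1, 4}};
  TensorDesc sliced_4x3{{4, 3}, {6, 1}};  // every other row of an 8 x 3 buffer
  assert(detect_layout(row_major_4x3) == Layout::row_major);
  assert(detect_layout(col_major_4x3) == Layout::col_major);
  assert(detect_layout(sliced_4x3) == Layout::non_contiguous);
}

}  // namespace sketch
```

Under these definitions a single-row or single-column matrix satisfies both predicates, so checking F-order first routes it to the col-major path. Because the layout is read from `input` (or `trans_input`) only, the caller remains responsible for giving the other matrix tensors the same layout.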
125 changes: 83 additions & 42 deletions c/src/preprocessing/pca.cpp
@@ -40,6 +40,7 @@ cuvs::preprocessing::pca::params to_cpp_params(const cuvsPcaParams& c_params)
return cpp_params;
}

template <typename LayoutT>
void _fit(cuvsResources_t res,
const cuvsPcaParams& params,
DLManagedTensor* input_tensor,
@@ -54,12 +55,12 @@ void _fit(cuvsResources_t res,
auto res_ptr = reinterpret_cast<raft::resources*>(res);
auto cpp_params = to_cpp_params(params);

using matrix_type = raft::device_matrix_view<float, int64_t, raft::col_major>;
using matrix_type = raft::device_matrix_view<float, int64_t, LayoutT>;
using vector_type = raft::device_vector_view<float, int64_t>;
using scalar_type = raft::device_scalar_view<float, int64_t>;

auto input = cuvs::core::from_dlpack<matrix_type>(input_tensor);
auto components = cuvs::core::from_dlpack<matrix_type>(components_tensor);
auto input = cuvs::core::from_dlpack<matrix_type>(input_tensor);
auto components = cuvs::core::from_dlpack<matrix_type>(components_tensor);
auto explained_var = cuvs::core::from_dlpack<vector_type>(explained_var_tensor);
auto explained_var_ratio = cuvs::core::from_dlpack<vector_type>(explained_var_ratio_tensor);
auto singular_vals = cuvs::core::from_dlpack<vector_type>(singular_vals_tensor);
@@ -78,6 +79,7 @@ void _fit(cuvsResources_t res,
flip_signs_based_on_U);
}

template <typename LayoutT>
void _fit_transform(cuvsResources_t res,
const cuvsPcaParams& params,
DLManagedTensor* input_tensor,
@@ -93,13 +95,13 @@ void _fit_transform(cuvsResources_t res,
auto res_ptr = reinterpret_cast<raft::resources*>(res);
auto cpp_params = to_cpp_params(params);

using matrix_type = raft::device_matrix_view<float, int64_t, raft::col_major>;
using matrix_type = raft::device_matrix_view<float, int64_t, LayoutT>;
using vector_type = raft::device_vector_view<float, int64_t>;
using scalar_type = raft::device_scalar_view<float, int64_t>;

auto input = cuvs::core::from_dlpack<matrix_type>(input_tensor);
auto trans_input = cuvs::core::from_dlpack<matrix_type>(trans_input_tensor);
auto components = cuvs::core::from_dlpack<matrix_type>(components_tensor);
auto input = cuvs::core::from_dlpack<matrix_type>(input_tensor);
auto trans_input = cuvs::core::from_dlpack<matrix_type>(trans_input_tensor);
auto components = cuvs::core::from_dlpack<matrix_type>(components_tensor);
auto explained_var = cuvs::core::from_dlpack<vector_type>(explained_var_tensor);
auto explained_var_ratio = cuvs::core::from_dlpack<vector_type>(explained_var_ratio_tensor);
auto singular_vals = cuvs::core::from_dlpack<vector_type>(singular_vals_tensor);
@@ -119,6 +121,7 @@ void _fit_transform(cuvsResources_t res,
flip_signs_based_on_U);
}

template <typename LayoutT>
void _transform(cuvsResources_t res,
const cuvsPcaParams& params,
DLManagedTensor* input_tensor,
@@ -130,7 +133,7 @@ void _transform(cuvsResources_t res,
auto res_ptr = reinterpret_cast<raft::resources*>(res);
auto cpp_params = to_cpp_params(params);

using matrix_type = raft::device_matrix_view<float, int64_t, raft::col_major>;
using matrix_type = raft::device_matrix_view<float, int64_t, LayoutT>;
using vector_type = raft::device_vector_view<float, int64_t>;

auto input = cuvs::core::from_dlpack<matrix_type>(input_tensor);
@@ -143,6 +146,7 @@ void _transform(cuvsResources_t res,
*res_ptr, cpp_params, input, components, singular_vals, mu, trans_input);
}

template <typename LayoutT>
void _inverse_transform(cuvsResources_t res,
const cuvsPcaParams& params,
DLManagedTensor* trans_input_tensor,
@@ -154,7 +158,7 @@ void _inverse_transform(cuvsResources_t res,
auto res_ptr = reinterpret_cast<raft::resources*>(res);
auto cpp_params = to_cpp_params(params);

using matrix_type = raft::device_matrix_view<float, int64_t, raft::col_major>;
using matrix_type = raft::device_matrix_view<float, int64_t, LayoutT>;
using vector_type = raft::device_vector_view<float, int64_t>;

auto trans_input = cuvs::core::from_dlpack<matrix_type>(trans_input_tensor);
@@ -205,19 +209,32 @@ extern "C" cuvsError_t cuvsPcaFit(cuvsResources_t res,
"PCA input must be float32 (kDLFloat, 32 bits)");
RAFT_EXPECTS(cuvs::core::is_dlpack_device_compatible(input->dl_tensor),
"PCA input must be device-accessible memory");
RAFT_EXPECTS(cuvs::core::is_f_contiguous(input),
"PCA input must be col-major (Fortran-contiguous)");

_fit(res,
*params,
input,
components,
explained_var,
explained_var_ratio,
singular_vals,
mu,
noise_vars,
flip_signs_based_on_U);

if (cuvs::core::is_f_contiguous(input)) {
_fit<raft::col_major>(res,
*params,
input,
components,
explained_var,
explained_var_ratio,
singular_vals,
mu,
noise_vars,
flip_signs_based_on_U);
} else if (cuvs::core::is_c_contiguous(input)) {
_fit<raft::row_major>(res,
*params,
input,
components,
explained_var,
explained_var_ratio,
singular_vals,
mu,
noise_vars,
flip_signs_based_on_U);
} else {
RAFT_FAIL("PCA input must be contiguous (C- or F-order)");
}
});
}

@@ -239,20 +256,34 @@ extern "C" cuvsError_t cuvsPcaFitTransform(cuvsResources_t res,
"PCA input must be float32 (kDLFloat, 32 bits)");
RAFT_EXPECTS(cuvs::core::is_dlpack_device_compatible(input->dl_tensor),
"PCA input must be device-accessible memory");
RAFT_EXPECTS(cuvs::core::is_f_contiguous(input),
"PCA input must be col-major (Fortran-contiguous)");

_fit_transform(res,
*params,
input,
trans_input,
components,
explained_var,
explained_var_ratio,
singular_vals,
mu,
noise_vars,
flip_signs_based_on_U);

if (cuvs::core::is_f_contiguous(input)) {
_fit_transform<raft::col_major>(res,
*params,
input,
trans_input,
components,
explained_var,
explained_var_ratio,
singular_vals,
mu,
noise_vars,
flip_signs_based_on_U);
} else if (cuvs::core::is_c_contiguous(input)) {
_fit_transform<raft::row_major>(res,
*params,
input,
trans_input,
components,
explained_var,
explained_var_ratio,
singular_vals,
mu,
noise_vars,
flip_signs_based_on_U);
} else {
RAFT_FAIL("PCA input must be contiguous (C- or F-order)");
}
});
}

@@ -270,10 +301,14 @@ extern "C" cuvsError_t cuvsPcaTransform(cuvsResources_t res,
"PCA input must be float32 (kDLFloat, 32 bits)");
RAFT_EXPECTS(cuvs::core::is_dlpack_device_compatible(input->dl_tensor),
"PCA input must be device-accessible memory");
RAFT_EXPECTS(cuvs::core::is_f_contiguous(input),
"PCA input must be col-major (Fortran-contiguous)");

_transform(res, *params, input, components, singular_vals, mu, trans_input);
if (cuvs::core::is_f_contiguous(input)) {
_transform<raft::col_major>(res, *params, input, components, singular_vals, mu, trans_input);
} else if (cuvs::core::is_c_contiguous(input)) {
_transform<raft::row_major>(res, *params, input, components, singular_vals, mu, trans_input);
} else {
RAFT_FAIL("PCA input must be contiguous (C- or F-order)");
}
});
}

@@ -291,9 +326,15 @@ extern "C" cuvsError_t cuvsPcaInverseTransform(cuvsResources_t res,
"PCA trans_input must be float32 (kDLFloat, 32 bits)");
RAFT_EXPECTS(cuvs::core::is_dlpack_device_compatible(trans_input->dl_tensor),
"PCA trans_input must be device-accessible memory");
RAFT_EXPECTS(cuvs::core::is_f_contiguous(trans_input),
"PCA trans_input must be col-major (Fortran-contiguous)");

_inverse_transform(res, *params, trans_input, components, singular_vals, mu, output);
if (cuvs::core::is_f_contiguous(trans_input)) {
_inverse_transform<raft::col_major>(
res, *params, trans_input, components, singular_vals, mu, output);
} else if (cuvs::core::is_c_contiguous(trans_input)) {
_inverse_transform<raft::row_major>(
res, *params, trans_input, components, singular_vals, mu, output);
} else {
RAFT_FAIL("PCA trans_input must be contiguous (C- or F-order)");
}
});
}
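
Note on the dispatch pattern in `pca.cpp`: each entry point repeats its full argument list once per layout branch. One possible way to keep the runtime-to-compile-time layout selection in a single place is a small helper that passes a layout tag to a generic lambda. The sketch below assumes C++17 and uses only hypothetical names (`sketch::row_major`, `sketch::col_major`, `sketch::layout`, `sketch::dispatch_layout`, `sketch::fit_impl`, `sketch::fit`); none of them are RAFT or cuVS APIs, and the PR does not use this helper.

```cpp
#include <cassert>
#include <stdexcept>
#include <string>
#include <type_traits>

namespace sketch {

// Hypothetical layout tags standing in for raft::row_major / raft::col_major.
struct row_major {};
struct col_major {};

// Hypothetical result of inspecting the input tensor's strides at runtime.
enum class layout { c_contiguous, f_contiguous, strided };

// Converts the runtime layout into a compile-time tag and forwards it to fn.
// fn must return the same type for both tags.
template <typename Fn>
auto dispatch_layout(layout l, Fn&& fn)
{
  switch (l) {
    case layout::f_contiguous: return fn(col_major{});
    case layout::c_contiguous: return fn(row_major{});
    default: throw std::invalid_argument("input must be contiguous (C- or F-order)");
  }
}

// Stand-in for a templated implementation such as _fit<LayoutT>; it only reports
// which instantiation was selected.
template <typename LayoutT>
std::string fit_impl(int n_components)
{
  const char* name = std::is_same_v<LayoutT, col_major> ? "col_major" : "row_major";
  return std::string(name) + ":" + std::to_string(n_components);
}

// The argument list is written once; the lambda is instantiated per layout tag.
inline std::string fit(layout l, int n_components)
{
  return dispatch_layout(l, [&](auto tag) {
    using LayoutT = decltype(tag);
    return fit_impl<LayoutT>(n_components);
  });
}

inline void examples()
{
  assert(fit(layout::f_contiguous, 2) == "col_major:2");
  assert(fit(layout::c_contiguous, 2) == "row_major:2");
}

}  // namespace sketch
```

The trade-off is an extra level of indirection in exchange for one call site per entry point; the explicit `if`/`else if`/`else` branches in the diff are more verbose but easier to step through in a debugger.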
6 changes: 3 additions & 3 deletions cpp/cmake/thirdparty/get_raft.cmake
@@ -1,6 +1,6 @@
# =============================================================================
# cmake-format: off
# SPDX-FileCopyrightText: Copyright (c) 2023-2025, NVIDIA CORPORATION.
# SPDX-FileCopyrightText: Copyright (c) 2023-2026, NVIDIA CORPORATION.
# SPDX-License-Identifier: Apache-2.0
# cmake-format: on

Expand Down Expand Up @@ -60,8 +60,8 @@ endfunction()
# To use a different RAFT locally, set the CMake variable
# CPM_raft_SOURCE=/path/to/local/raft
find_and_configure_raft(VERSION ${RAFT_VERSION}.00
FORK ${RAFT_FORK}
PINNED_TAG ${RAFT_PINNED_TAG}
FORK aamijar
PINNED_TAG pca-row-major
Member Author

Revert later

Comment on lines +63 to +64

⚠️ Potential issue | 🔴 Critical | ⚡ Quick win

Critical: Revert to default RAFT fork/tag before merge.

These hard-coded values point to a temporary development branch and must be reverted before this PR is merged. Merging as-is would break the build for all users once the pca-row-major branch is removed.

Required before merge:

  1. Ensure RAFT PR #3036 is merged to the main RAFT repository
  2. Revert these lines to use the default variables:
    FORK                     ${RAFT_FORK}
    PINNED_TAG               ${RAFT_PINNED_TAG}

This configuration is appropriate for testing during development, but the author's own comment "Revert later" confirms it must not reach main.

🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

In `@cpp/cmake/thirdparty/get_raft.cmake` around lines 63 - 64, Replace the
temporary hard-coded RAFT fork/tag values in the get_raft.cmake snippet with the
project variables so the build uses the configured defaults: change the FORK and
PINNED_TAG entries that currently read as literal "aamijar" and "pca-row-major"
to reference ${RAFT_FORK} and ${RAFT_PINNED_TAG} respectively; ensure this
change is applied where FORK and PINNED_TAG are defined in the same block so the
build picks up the RAFT_FORK/RAFT_PINNED_TAG variables (after confirming the
upstream RAFT PR `#3036` is merged).

ENABLE_MNMG_DEPENDENCIES OFF
ENABLE_NVTX OFF
BUILD_STATIC_DEPS ${CUVS_STATIC_RAPIDS_LIBRARIES}