Skip to content

PCA C and Python API#1987

Open
aamijar wants to merge 10 commits intorapidsai:mainfrom
aamijar:pca-c-api
Open

PCA C and Python API#1987
aamijar wants to merge 10 commits intorapidsai:mainfrom
aamijar:pca-c-api

Conversation

@aamijar
Copy link
Copy Markdown
Member

@aamijar aamijar commented Apr 2, 2026

Resolves #1977 and Resolves #1994

@aamijar aamijar self-assigned this Apr 2, 2026
@aamijar aamijar added non-breaking Introduces a non-breaking change feature request New feature or request labels Apr 2, 2026
@aamijar aamijar moved this to In Progress in Unstructured Data Processing Apr 2, 2026
*/
enum cuvsPcaSolver {
/** Covariance + divide-and-conquer eigen decomposition */
PCA_COV_EIG_DQ = 0,
Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

enums should be CUVS prefixed

Copy link
Copy Markdown
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Addressed in 27f73ac

#endif

/**
* @defgroup preprocessing_c_pca C API for PCA (Principal Component Analysis)
Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Add to documentation

output_tensor.manager_ctx = NULL;
output_tensor.deleter = NULL;

cuvsPcaFitTransform(res,
Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Can you add an option to test fit() and predict() independantly too?

cudaMemcpyDeviceToDevice,
stream));

run_pca(n_rows,
Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Sync stream before calling this function as there are no guarantee that the inputs are populated before pca::fit() is called.
Or you can create the C resource in this function and pass it to run_pca

raft::random::RngState rng(1234ULL);
raft::random::uniform(handle, rng, input.data(), n_rows * n_cols, -1.0f, 1.0f);

RAFT_CUDA_TRY(cudaMemcpyAsync(input_copy.data(),
Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Use C++ wrapper for memcpy and streamsync

Comment on lines +33 to +47
DLManagedTensor input_tensor;
input_tensor.dl_tensor.data = input_data;
input_tensor.dl_tensor.device.device_type = kDLCUDA;
input_tensor.dl_tensor.device.device_id = 0;
input_tensor.dl_tensor.ndim = 2;
input_tensor.dl_tensor.dtype.code = kDLFloat;
input_tensor.dl_tensor.dtype.bits = 32;
input_tensor.dl_tensor.dtype.lanes = 1;
int64_t input_shape[2] = {n_rows, n_cols};
int64_t input_strides[2] = {1, n_rows};
input_tensor.dl_tensor.shape = input_shape;
input_tensor.dl_tensor.strides = input_strides;
input_tensor.dl_tensor.byte_offset = 0;
input_tensor.manager_ctx = NULL;
input_tensor.deleter = NULL;
Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Can you use cuvs::core::detail::to_dlpack() for those? That should reduce a lot the number of lines so this function could fit in the same test file

res, params, &trans_tensor, &comp_tensor, &sv_tensor, &mu_tensor, &output_tensor);

cuvsPcaParamsDestroy(params);
cuvsResourcesDestroy(res);
Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Sync stream before destroying the resource, there is no guarantee that the work is complete

Copy link
Copy Markdown
Contributor

@robertmaynard robertmaynard left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Approving CMake changes

@aamijar aamijar requested a review from a team as a code owner April 7, 2026 23:45
@aamijar aamijar requested a review from a team as a code owner April 8, 2026 00:18
@aamijar aamijar changed the title PCA C API PCA C and Python API Apr 8, 2026
@aamijar
Copy link
Copy Markdown
Member Author

aamijar commented Apr 13, 2026

Hi @lowener, I think all your comments have been addressed.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

C feature request New feature or request non-breaking Introduces a non-breaking change Python

Projects

Status: In Progress

Development

Successfully merging this pull request may close these issues.

[FEA] Python API for PCA [FEA] C API for PCA

3 participants