chore(examples): Use QuantileDMatrix for histogram tree method in XGBoost example by sunalawa · Pull Request #3376 · kubeflow/trainer

sunalawa · 2026-03-23T10:04:00Z

Replace DMatrix with QuantileDMatrix in the distributed XGBoost training example when using the histogram tree method. This reduces memory usage and aligns with XGBoost best practices for distributed workloads.

What this PR does / why we need it:

Updates the distributed XGBoost example to use QuantileDMatrix instead of DMatrix.
This reduces memory usage and follows XGBoost best practices for distributed training workloads.
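
As a rough illustration of where the memory saving comes from: the histogram tree method only needs to know which quantile bin each feature value falls into, so a quantized matrix can store small bin indices instead of raw floats. The sketch below is plain Python and is not XGBoost's implementation; the helper names `quantile_cuts` and `quantize`, the synthetic data, and the choice of 256 bins are assumptions made for illustration only.

```python
from array import array
from bisect import bisect_right
import random


def quantile_cuts(values, max_bin):
    """Return up to max_bin - 1 interior cut points at evenly spaced quantiles."""
    ordered = sorted(values)
    n = len(ordered)
    cuts = []
    for i in range(1, max_bin):
        cut = ordered[min(n - 1, i * n // max_bin)]
        if not cuts or cut > cuts[-1]:  # keep cut points strictly increasing
            cuts.append(cut)
    return cuts


def quantize(values, cuts):
    """Map each value to its bin index (0 .. len(cuts)), one byte per value."""
    return bytes(bisect_right(cuts, v) for v in values)


random.seed(0)
train = [random.gauss(0.0, 1.0) for _ in range(10_000)]
valid = [random.gauss(0.0, 1.0) for _ in range(2_000)]

# Cut points are derived from the training column only.
cuts = quantile_cuts(train, max_bin=256)
train_bins = quantize(train, cuts)
# The validation column reuses the training cuts, which is the role
# ref=dtrain plays when building a validation QuantileDMatrix.
valid_bins = quantize(valid, cuts)

# Size of the same column stored as 32-bit floats, for comparison.
raw_bytes = len(array("f", train)) * array("f").itemsize
print(f"float32 column: {raw_bytes} bytes, binned column: {len(train_bins)} bytes")
```

The real library uses its own sketching and storage format, so the exact savings in the example notebook will differ from this toy comparison.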

Which issue(s) this PR fixes:

Fixes #3300

Checklist:

Docs included if any changes are user facing

…oost example

Replace DMatrix with QuantileDMatrix in distributed XGBoost training example when using histogram tree method. This reduces memory usage and aligns with XGBoost best practices for distributed workloads.

Fixes kubeflow#3300

Signed-off-by: Suyash Nalawade <sunalawa@redhat.com>

review-notebook-app · 2026-03-23T10:04:05Z

Check out this pull request on ReviewNB

See visual diffs & provide feedback on Jupyter Notebooks.

Powered by ReviewNB

google-oss-prow · 2026-03-23T10:04:05Z

[APPROVALNOTIFIER] This PR is NOT APPROVED

This pull-request has been approved by:
Once this PR has been reviewed and has the lgtm label, please assign johnugeorge for approval. For more information see the Kubernetes Code Review Process.

The full list of commands accepted by this bot can be found here.

Details

Needs approval from an approver in each of these files:

OWNERS

Approvers can indicate their approval by writing /approve in a comment
Approvers can cancel approval by writing /approve cancel in a comment

github-actions · 2026-03-23T10:04:09Z

🎉 Welcome to the Kubeflow Trainer! 🎉

Thanks for opening your first PR! We're happy to have you as part of our community 🚀

Here's what happens next:

If you haven't already, please check out our Contributing Guide for repo-specific guidelines and the Kubeflow Contributor Guide for general community standards.
Our team will review your PR soon! cc @kubeflow/kubeflow-trainer-team

Join the community:

Slack: Join our #kubeflow-trainer Slack channel.
Meetings: Attend the Kubeflow AutoML and Training Working Group bi-weekly meetings.

Feel free to ask questions in the comments if you need any help or clarification!
Thanks again for contributing to Kubeflow! 🙏

Copilot

Pull request overview

Updates the distributed XGBoost training example notebook to follow XGBoost best practices for histogram-based training by switching to QuantileDMatrix, reducing memory overhead in distributed runs.

Changes:

Replace xgb.DMatrix with xgb.QuantileDMatrix for training and validation datasets (with ref=dtrain for validation).
Explicitly set tree_method to "hist" to match QuantileDMatrix’s intended usage.

sunalawa · 2026-03-23T10:10:44Z

Local Testing:

1. make test

      ✅ All tests passed - Every package shows "ok" status
      ✅ Good test coverage - Most packages have decent coverage (27-100%)
      ✅ Tests used cache - "(cached)" indicates tests ran quickly using previous results
      
      Key observations:
      
      - No failures or errors - All 24 packages passed
      - Coverage ranges from 27.9% to 100% across different components
      - Some packages show 0.0% coverage or [no test files] - this is normal for:
        - Constants packages (no logic to test)
        - Utility packages that may not have tests yet
        - Framework packages that are interfaces

2. make test-integration

      Ran 44 of 44 Specs in 32.347 seconds
      SUCCESS! -- 44 Passed | 0 Failed | 0 Pending | 0 Skipped
      PASS
      
      Ginkgo ran 2 suites in 1m26.659692667s
      Test Suite Passed

3. make test-e2e

TrainJob e2e when Creating TrainJob with runtime status server instrumentation should inject runtime configuration which allows the runtime status endpoint to be called
/Users/sunalawa/PycharmProjects/opensource/kubeflow-trainer/test/e2e/e2e_test.go:404
  STEP: Create a TrainJob that will call the runtime-status endpoint @ 03/23/26 14:09:43.142
  STEP: Verify trainerStatus is updated with runtime status information @ 03/23/26 14:09:43.149
  [FAILED] in [It] - /Users/sunalawa/PycharmProjects/opensource/kubeflow-trainer/test/e2e/e2e_test.go:443 @ 03/23/26 14:20:15.906
• [FAILED] [600.026 seconds]
TrainJob e2e when Creating TrainJob with runtime status server instrumentation [It] should inject runtime configuration which allows the runtime status endpoint to be called
/Users/sunalawa/PycharmProjects/opensource/kubeflow-trainer/test/e2e/e2e_test.go:404

  [FAILED] Timed out after 600.001s.
  The function passed to Eventually failed at /Users/sunalawa/PycharmProjects/opensource/kubeflow-trainer/test/e2e/e2e_test.go:424 with:
  Expected
      <*v1alpha1.TrainerStatus | 0x0>: nil
  not to be nil
  In [It] at: /Users/sunalawa/PycharmProjects/opensource/kubeflow-trainer/test/e2e/e2e_test.go:443 @ 03/23/26 14:20:15.906
------------------------------

Summarizing 1 Failure:
  [FAIL] TrainJob e2e when Creating TrainJob with runtime status server instrumentation [It] should inject runtime configuration which allows the runtime status endpoint to be called
  /Users/sunalawa/PycharmProjects/opensource/kubeflow-trainer/test/e2e/e2e_test.go:443

Ran 6 of 7 Specs in 1148.464 seconds
FAIL! -- 5 Passed | 1 Failed | 0 Pending | 1 Skipped
--- FAIL: TestAPIs (1148.46s)
FAIL

Ginkgo ran 1 suite in 19m10.399196375s

Test Suite Failed
make: *** [test-e2e] Error 1

Copilot AI review requested due to automatic review settings March 23, 2026 10:04

google-oss-prow bot requested review from akshaychitneni and jinchihe March 23, 2026 10:04

google-oss-prow bot added the size/XS label Mar 23, 2026

Copilot started reviewing on behalf of sunalawa March 23, 2026 10:04

Copilot AI reviewed Mar 23, 2026

sunalawa wants to merge 1 commit into kubeflow:master from sunalawa:chore/xgboost-quantile-dmatrix-update

2 participants
