cuML 25.08 Release Notes
What's New
Highlights
- Spectral Embedding: New algorithm for dimensionality reduction and manifold learning (#6581) @aamijar
- cuml.accel Profiler: Added profiling capabilities for Zero Code Change Acceleration (#7021) @jcrist
- cuml.accel LinearSVC/LinearSVR: New support for linear support vector classification and regression (#6866) @viclafargue
- cuml.accel set_output/get_feature_names_out: Added support for scikit-learn output configuration (#6942) @jcrist
Major Improvements
UMAP Enhancements
- Multi-GPU KNN graph building support (#7019) @jinsolp
- Improved handling of identical vectors in distance calculations (#6904) @jinsolp
- Disabled non-determinism on small datasets for better reproducibility (#7004) @viclafargue
FIL (Forest Inference Library) Improvements
- Support for wide data inference (#7014) @hcho3
- Better handling of empty categorical nodes (#6924) @hcho3
- Improved GPU context handling (#6987) @hcho3
- Restored legacy threshold behavior (#6922) @hcho3
Zero Code Change Acceleration (cuml.accel)
- Added profilers for better performance analysis (#7021) @jcrist
- Enhanced logging for proxy estimators (#6957) @csadorf
- Implemented metadata routing support (#6950) @jcrist
- New support for LinearSVC and LinearSVR (#6866) @viclafargue
- Added KernelRidge to supported algorithms (#6917) @jcrist
- Improved CLI with -c and - options support (#6852) @jcrist
Algorithm Enhancements
- DBSCAN: Now computes components_ attribute (#6976) @jcrist
- LogisticRegression: Exposed n_iter_ attribute for iteration tracking (#6911) @betatim
- RandomForest: Fixed default max_features parameter (#6862) @jcrist
- TSNE: Added fallback support for unsupported metrics (#6992) @jcrist
- Ridge: Better handling of underdetermined systems (#7003) @betatim
Developer Experience
- Testing: Enhanced CI with upstream test suites for HDBSCAN, UMAP, and other algorithms (#6995, #6989, #6986) @jcrist
- Documentation: Comprehensive updates to Python developer guide and API documentation (#6843) @csadorf
- Dependencies: Updated to CUDA 12.9 and added support for scikit-learn 1.4 (#6944, #6845) @jakirkham, @betatim
Breaking Changes
Deprecated Parameters & Functions
- UMAP: data_on_host parameter is deprecated (#6953) @jinsolp
- HDBSCAN:
- SGD Algorithms: penalty='none' is deprecated in MBSGDClassifier, MBSGDRegressor, and SGD (#6926) @jcrist
- KMeans: random_state default changed to None (#6884) @jcrist
Removed Components
- Experimental FIL: experimental.fil Python module removed (#6899) @hcho3
- Legacy FIL: Removed from libcuml (#6844) @hcho3
- CUDA 11 Support: Removed from dependencies and CI (#6847, #6885) @KyleFromNVIDIA, @dantegd
- Package Distribution: Stopped uploading to downloads.rapids.ai (#6803) @jameslamb
API Changes
- Neighbors: Ported to InteropMixin/ProxyBase (#6851) @jcrist
- Ensemble: Ported to InteropMixin/ProxyBase (#6863) @jcrist
- UMAP: Ported to InteropMixin/ProxyBase (#6840) @jcrist
Bug Fixes
Algorithm Fixes
- UMAP: Improved handling of identical vectors in UMAP distance calculations (#6904) @jinsolp
- TSNE: Relaxed tolerance for sparse input tests (#7033) @jinsolp
- RandomForest: Fixed default max_features parameter (#6862) @jcrist
- HDBSCAN: Rewrote Python wrapper for better stability (#6913) @jcrist
- Logistic Regression: Increased tolerance in Dask tests (#6848) @csadorf
Compatibility & Dependencies
- Fixed compatibility with scikit-learn 1.7.0 and Python 3.13.4 (#6865) @csadorf
- Unxfailed tests affected by numba compilation errors (#6905) @csadorf
Documentation Updates
User Documentation
- Supported Versions: Added comprehensive version compatibility documentation (#7040) @csadorf
- Zero Code Change Acceleration: Updated title and reorganized documentation (#7030, #7026) @csadorf, @jcrist
- UMAP: Added multi-GPU KNN graph building documentation (#7019) @jinsolp
- TSNE: Fixed FFT TSNE documentation (#6967) @jinsolp
- Limitations: Revamped cuml.accel limitations documentation (#6965) @jcrist
Developer Documentation
- Python Developer Guide: Comprehensive updates (#6843) @csadorf
- CI Workflow: Added documentation for workflow inputs (#6952) @jameslamb
- Async Operations: Removed outdated async operation section (#6980) @csadorf
Migration Guide
For Users
- UMAP: Remove data_on_host parameter from your code
- HDBSCAN: Update to use new prediction function signatures
- SGD: Replace penalty='none' with appropriate alternatives
- KMeans: Be aware that random_state=None is now the default
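The user-facing items above can be sketched as a small keyword-argument rewrite. This is an illustration only, not part of cuML: migrate_kwargs and its seed argument are hypothetical names, and the choice of penalty=None as the replacement for penalty='none' is an assumption based on the scikit-learn convention, so check the cuML documentation before relying on it. The HDBSCAN signature change is not covered because these notes do not describe it.

```python
from typing import Any, Dict

# Estimators named in the deprecation notice for penalty='none'.
SGD_ESTIMATORS = {"MBSGDClassifier", "MBSGDRegressor", "SGD"}


def migrate_kwargs(estimator: str, kwargs: Dict[str, Any], seed: int = 0) -> Dict[str, Any]:
    """Return a copy of `kwargs` adjusted for the 25.08 deprecations.

    Hypothetical helper for illustration; it only edits a plain dict and
    never imports or calls cuML.
    """
    updated = dict(kwargs)
    if estimator == "UMAP":
        # data_on_host is deprecated: stop passing it.
        updated.pop("data_on_host", None)
    elif estimator in SGD_ESTIMATORS:
        # Assumption: None is the replacement for the deprecated 'none' string.
        if updated.get("penalty") == "none":
            updated["penalty"] = None
    elif estimator == "KMeans":
        # The default is now None, so pin a seed when reproducible
        # clustering matters. An explicit caller value is left untouched.
        updated.setdefault("random_state", seed)
    return updated
```

For example, migrate_kwargs("UMAP", {"n_neighbors": 15, "data_on_host": True}) returns {"n_neighbors": 15}. Pinning random_state is optional; code that does not need reproducible clustering can simply accept the new default.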
For Developers
- CUDA 11: Update your development environment to CUDA 12.9
- FIL: Update imports from experimental.fil to fil
- Dependencies: Update to supported versions as documented
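The FIL import change can be applied mechanically across a codebase. The sketch below is a hypothetical helper (rewrite_fil_imports is not a cuML function), and it assumes the full module paths are cuml.experimental.fil and cuml.fil. It only rewrites the dotted path, so a statement such as "from cuml.experimental import fil" would still need a manual edit.

```python
import re

# Match the old dotted module path as a whole token, so that a longer name
# such as "cuml.experimental.filters" would be left alone.
_OLD_FIL_PATH = re.compile(r"\bcuml\.experimental\.fil\b")


def rewrite_fil_imports(source: str) -> str:
    """Replace references to the removed experimental FIL module path.

    Assumption: the replacement module is importable as `cuml.fil`.
    """
    return _OLD_FIL_PATH.sub("cuml.fil", source)
```

Running it over the text of each Python file and reviewing the resulting diff keeps the change auditable.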
Summary
This release improves cuML's performance, stability, and developer experience. The highlight is the new Spectral Embedding algorithm, alongside architectural work on HDBSCAN (migration to cuVS cluster primitives and a rewritten Python wrapper) and UMAP (multi-GPU KNN graph building and the port to InteropMixin/ProxyBase). Zero Code Change Acceleration (cuml.accel) continues to expand, adding LinearSVC, LinearSVR, and KernelRidge support along with a profiler and an improved CLI.
The breaking changes are primarily focused on cleaning up deprecated APIs and improving the overall codebase structure. Users are encouraged to review the migration guide and update their code accordingly.
Internal & Technical Changes
Architecture Improvements
- HDBSCAN Migration: Migrated to cuVS cluster primitives from raft::cluster for better performance (#6560) @tarang-jain
- Module Porting: Ported cuml.neighbors, cuml.ensemble, and UMAP to InteropMixin/ProxyBase (#6851, #6863, #6840) @jcrist
- Base Class Cleanup: Removed deprecated base classes and functions (#6919, #6888) @jcrist
- Cython Optimization: De-Cythonized several modules for better maintainability (#6920) @jcrist
Performance Optimizations
- Binary Size: Reduced ARIMA kernels binary size (#6997) @jcrist
- Memory Usage: Instantiated only specific RAFT kernels (#6900, #6780) @aamijar, @divyegala
- Kernel Optimization: Used
cuvs::neighbors::knn_merge_parts
(#7005) @jcrist
Infrastructure & CI/CD
- Added GH_TOKEN pass-through for job summarization (#6894) @msarahan
- Fixed various CI test failures and flaky tests (#6906, #6877) @csadorf
- CUDA 12.9: Updated across all environments (#6944) @jakirkham
- Dependencies: Removed NVIDIA and Dask channels (#6935) @vyasr
- CI Images: Added versioned CI image tags (#7016) @jameslamb
- Testing: Enhanced with upstream test suites and better organization (#6995, #6989, #6986) @jcrist
Code Quality
- Shell Scripts: Fixed all shellcheck warnings and errors (#6901) @gforsyth
- Linting: Updated cython-lint and fixed long lines (#6969) @jcrist
- Documentation: Comprehensive updates to developer guides (#6843) @csadorf
For detailed information about specific changes, please refer to the individual pull requests referenced by number in each entry.