v25.08.00

@AyodeAwe released this 06 Aug 20:19
b081fcd

cuML 25.08 Release Notes

🎉 What's New

⭐ Highlights

  • Spectral Embedding: New algorithm for dimensionality reduction and manifold learning (#6581) @aamijar
  • cuML.accel Profiler: Added profiling capabilities for Zero Code Change Acceleration (#7021) @jcrist
  • cuML.accel LinearSVC/LinearSVR: New support for linear support vector classification and regression (#6866) @viclafargue
  • cuML.accel set_output/get_feature_names_out: Added support for scikit-learn output configuration (#6942) @jcrist

🔧 Major Improvements

UMAP Enhancements

  • Multi-GPU KNN graph building support (#7019) @jinsolp
  • Improved handling of identical vectors in distance calculations (#6904) @jinsolp
  • Disabled non-determinism on small datasets for better reproducibility (#7004) @viclafargue

FIL (Forest Inference Library) Improvements

Zero Code Change Acceleration (cuml.accel)

Algorithm Enhancements

  • DBSCAN: Now computes components_ attribute (#6976) @jcrist
  • LogisticRegression: Exposed n_iter_ attribute for iteration tracking (#6911) @betatim
  • RandomForest: Fixed default max_features parameter (#6862) @jcrist
  • TSNE: Added fallback support for unsupported metrics (#6992) @jcrist
  • Ridge: Better handling of underdetermined systems (#7003) @betatim
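
The `components_` and `n_iter_` attributes named above use scikit-learn's attribute names, so a CPU sketch can show what to expect from them. The example below uses scikit-learn itself; that cuML returns values with the same meaning is an assumption here, not something these notes state. The data and parameter values are made up for illustration.

```python
import numpy as np
from sklearn.cluster import DBSCAN
from sklearn.linear_model import LogisticRegression

# Two tight, well-separated blobs (illustrative data only).
rng = np.random.default_rng(0)
X = np.vstack([rng.normal(0.0, 0.1, (50, 2)), rng.normal(3.0, 0.1, (50, 2))])
y = np.repeat([0, 1], 50)

db = DBSCAN(eps=0.5, min_samples=5).fit(X)
# In scikit-learn, components_ is a copy of the core samples found by fit.
same = np.array_equal(db.components_, X[db.core_sample_indices_])

logreg = LogisticRegression(max_iter=200).fit(X, y)
# n_iter_ records how many solver iterations were actually run.
iterations = logreg.n_iter_

print(same, db.components_.shape, iterations)
```

To try this against cuML directly, the equivalent classes are `cuml.cluster.DBSCAN` and `cuml.linear_model.LogisticRegression`; running the script unchanged under `python -m cuml.accel` is the other option.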

Developer Experience

  • Testing: Enhanced CI with upstream test suites for HDBSCAN, UMAP, and other algorithms (#6995, #6989, #6986) @jcrist
  • Documentation: Comprehensive updates to Python developer guide and API documentation (#6843) @csadorf
  • Dependencies: Updated to CUDA 12.9 and added support for scikit-learn 1.4 (#6944, #6845) @jakirkham, @betatim

🚨 Breaking Changes

Deprecated Parameters & Functions

  • UMAP: data_on_host parameter is deprecated (#6953) @jinsolp
  • HDBSCAN:
    • Prediction functions in cuml.cluster namespace are deprecated (#6943) @jcrist
    • connectivity parameter is deprecated (#6936) @jcrist
  • SGD Algorithms: penalty='none' is deprecated in MBSGDClassifier, MBSGDRegressor, and SGD (#6926) @jcrist
  • KMeans: random_state default changed to None (#6884) @jcrist

Removed Components

API Changes

🐛 Bug Fixes

Algorithm Fixes

  • UMAP: Improved handling of identical vectors in UMAP distance calculations (#6904) @jinsolp
  • TSNE: Relaxed tolerance for sparse input tests (#7033) @jinsolp
  • RandomForest: Fixed default max_features parameter (#6862) @jcrist
  • HDBSCAN: Rewrote Python wrapper for better stability (#6913) @jcrist
  • Logistic Regression: Increased tolerance in Dask tests (#6848) @csadorf

Compatibility & Dependencies

  • Fixed compatibility with scikit-learn 1.7.0 and Python 3.13.4 (#6865) @csadorf
  • Re-enabled tests that had been marked xfail because of numba compilation errors (#6905) @csadorf

Other

📖 Documentation Updates

User Documentation

  • Supported Versions: Added comprehensive version compatibility documentation (#7040) @csadorf
  • Zero Code Change Acceleration: Updated title and reorganized documentation (#7030, #7026) @csadorf, @jcrist
  • UMAP: Added multi-GPU KNN graph building documentation (#7019) @jinsolp
  • TSNE: Fixed FFT TSNE documentation (#6967) @jinsolp
  • Limitations: Revamped cuml.accel limitations documentation (#6965) @jcrist

Developer Documentation

  • Python Developer Guide: Comprehensive updates (#6843) @csadorf
  • CI Workflow: Added documentation for workflow inputs (#6952) @jameslamb
  • Async Operations: Removed outdated async operation section (#6980) @csadorf

🔄 Migration Guide

For Users

  1. UMAP: Remove data_on_host parameter from your code
  2. HDBSCAN: Stop using the deprecated prediction functions in the cuml.cluster namespace and remove the connectivity parameter
  3. SGD: Replace penalty='none' with appropriate alternatives
  4. KMeans: Be aware that random_state=None is now the default
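
The user-facing steps above can be sketched as a small keyword-argument cleanup applied before constructing an estimator. `migrate_params` is a hypothetical helper, not part of cuML, and it takes the estimator name as a plain string so it has no cuML dependency. Mapping `penalty='none'` to `None` mirrors scikit-learn's convention and is an assumption; check the cuML API documentation for the recommended replacement.

```python
from typing import Any, Optional

# Estimators named in the SGD deprecation entry.
SGD_ESTIMATORS = {"MBSGDClassifier", "MBSGDRegressor", "SGD"}


def migrate_params(
    estimator: str, params: dict[str, Any], seed: Optional[int] = None
) -> dict[str, Any]:
    """Return a copy of params adjusted for the 25.08 deprecations (hypothetical helper)."""
    out = dict(params)  # never mutate the caller's dict
    if estimator == "UMAP":
        # data_on_host is deprecated: drop it.
        out.pop("data_on_host", None)
    elif estimator == "HDBSCAN":
        # connectivity is deprecated: drop it.
        out.pop("connectivity", None)
    elif estimator in SGD_ESTIMATORS and out.get("penalty") == "none":
        # Assumption: None is the replacement, as in scikit-learn.
        out["penalty"] = None
    elif estimator == "KMeans" and seed is not None:
        # The default is now None, so pin a seed only when one is requested.
        out.setdefault("random_state", seed)
    return out


print(migrate_params("UMAP", {"n_neighbors": 15, "data_on_host": True}))
print(migrate_params("KMeans", {"n_clusters": 3}, seed=42))
```

Passing an explicit `random_state` is only needed when reproducible KMeans results matter; otherwise the new default can be left alone.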

For Developers

  1. CUDA 11: Update your development environment to CUDA 12.9
  2. FIL: Update imports from experimental.fil to fil
  3. Dependencies: Update to supported versions as documented

📊 Summary

The highlight of this release is the new Spectral Embedding algorithm. HDBSCAN moves to cuVS cluster primitives and gets a rewritten Python wrapper, and UMAP gains multi-GPU KNN graph building. Zero Code Change Acceleration (cuml.accel) continues to expand, adding a profiler, LinearSVC/LinearSVR support, and set_output/get_feature_names_out.

The breaking changes mostly deprecate parameters and functions and change the KMeans random_state default. Review the migration guide and update your code accordingly.

🔧 Internal & Technical Changes

Architecture Improvements

  • HDBSCAN Migration: Migrated to cuVS cluster primitives from raft::cluster for better performance (#6560) @tarang-jain
  • Module Porting: Ported cuml.neighbors, cuml.ensemble, and UMAP to InteropMixin/ProxyBase (#6851, #6863, #6840) @jcrist
  • Base Class Cleanup: Removed deprecated base classes and functions (#6919, #6888) @jcrist
  • Cython Optimization: De-Cythonized several modules for better maintainability (#6920) @jcrist

Performance Optimizations

Infrastructure & CI/CD

Code Quality

  • Shell Scripts: Fixed all shellcheck warnings and errors (#6901) @gforsyth
  • Linting: Updated cython-lint and fixed long lines (#6969) @jcrist
  • Documentation: Comprehensive updates to developer guides (#6843) @csadorf

For details on a specific change, refer to the pull request number given in its entry.