Lightweight, production-ready AutoML built entirely on scikit-learn and modernized for Python 3.12+.
# From Test PyPI (current releases)
pip install --index-url https://test.pypi.org/simple/ --extra-index-url https://pypi.org/simple/ yaaml
# From PyPI (when available)
pip install yaaml
# From GitHub (latest development)
pip install git+https://github.com/JordanRex/yaaml.git
# From GitHub releases
pip install https://github.com/JordanRex/yaaml/archive/refs/tags/v0.1.2.tar.gz
import pandas as pd
from yaaml import YAAMLAutoML
from sklearn.model_selection import train_test_split
# Load your data into X (features) and y (target) before splitting
X_train, X_test, y_train, y_test = train_test_split(X, y, test_size=0.2)
# AutoML in 3 lines
automl = YAAMLAutoML(mode='classification', max_evals=20, feature_engineering=True)
automl.fit(X_train, y_train)
accuracy = automl.score(X_test, y_test)
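The quick start above assumes `X` and `y` already exist. As a sketch of one way to produce them, the snippet below loads a dataset bundled with scikit-learn and splits it reproducibly. The dataset choice, `stratify`, and `random_state` are illustrative assumptions, not requirements of yaaml.

```python
from sklearn.datasets import load_breast_cancer
from sklearn.model_selection import train_test_split

# Example data only: a binary classification dataset bundled with scikit-learn.
# as_frame=True returns a pandas DataFrame (features) and Series (target).
X, y = load_breast_cancer(return_X_y=True, as_frame=True)

# stratify keeps the class ratio the same in both splits;
# random_state makes the split repeatable between runs.
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.2, stratify=y, random_state=42
)

print(X_train.shape, X_test.shape)
```

The resulting `X_train`, `X_test`, `y_train`, and `y_test` can then be passed to `fit` and `score` as in the quick start.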
- 🚀 Minimal Dependencies: Built entirely on scikit-learn, with no additional ML frameworks
- 🎯 Multi-Task: Classification and regression
- 🔄 Complete Pipeline: End-to-end ML with intelligent preprocessing
- 🧠 Smart Automation: Feature engineering, selection, hyperparameter optimization
- ⚡ Lightweight: Fast training, minimal resource footprint
- 🔍 Transparent: Full visibility into decisions and transformations
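To make the "Smart Automation" item more concrete, here is a rough sketch of feature selection combined with hyperparameter search, written in plain scikit-learn. It illustrates the general technique only and is not yaaml's internal implementation; the model, search space, and dataset are assumptions chosen for the example.

```python
from sklearn.datasets import load_breast_cancer
from sklearn.ensemble import RandomForestClassifier
from sklearn.feature_selection import SelectKBest, f_classif
from sklearn.model_selection import RandomizedSearchCV
from sklearn.pipeline import Pipeline

# Example data only: a dataset bundled with scikit-learn.
X, y = load_breast_cancer(return_X_y=True)

# Feature selection and the model live in one pipeline, so selection
# is re-fitted inside every cross-validation fold (no leakage).
pipeline = Pipeline([
    ("select", SelectKBest(score_func=f_classif)),
    ("model", RandomForestClassifier(random_state=0)),
])

# Illustrative search space: how many features to keep, and model settings.
param_distributions = {
    "select__k": [5, 10, 20],
    "model__n_estimators": [25, 50],
    "model__max_depth": [None, 4, 8],
}

# n_iter plays a role loosely similar to an evaluation budget.
search = RandomizedSearchCV(
    pipeline,
    param_distributions=param_distributions,
    n_iter=5,
    cv=3,
    random_state=0,
)
search.fit(X, y)

print(search.best_params_)
print(f"best cross-validated accuracy: {search.best_score_:.3f}")
```

Keeping selection inside the pipeline is the key design choice: the search then scores each candidate on folds that the selector never saw.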
import numpy as np

# Handle mixed data types automatically
automl = YAAMLAutoML(
    mode='classification',
    imputation_strategy='iterative',  # Advanced missing-value imputation
    encoding_method='target',         # Target encoding
    sampling_strategy='balanced',     # Handle imbalanced data
    feature_selection=True,           # Intelligent feature selection
    max_evals=50,
    cv_folds=10
)
# Works with messy real-world data
mixed_data = pd.DataFrame({
    'numeric': [1.5, 2.3, np.nan, 4.1],
    'categorical': ['A', 'B', 'A', 'C'],
    'text': ['good', 'excellent', 'poor', 'good']
})
# target: labels aligned with the rows of mixed_data
# new_data: a DataFrame with the same columns as mixed_data
automl.fit(mixed_data, target)
predictions = automl.predict(new_data)
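For readers curious what handling mixed data types involves, the sketch below builds a comparable preprocessing pipeline by hand in scikit-learn. It deliberately swaps in simpler techniques (median imputation and one-hot encoding instead of iterative imputation and target encoding) and does not reflect how yaaml works internally. All data and names here are invented for illustration.

```python
import numpy as np
import pandas as pd
from sklearn.compose import ColumnTransformer
from sklearn.impute import SimpleImputer
from sklearn.linear_model import LogisticRegression
from sklearn.pipeline import Pipeline
from sklearn.preprocessing import OneHotEncoder, StandardScaler

# Toy data, invented for illustration.
data = pd.DataFrame({
    "numeric": [1.5, 2.3, np.nan, 4.1, 3.3, 0.7],
    "categorical": ["A", "B", "A", "C", "B", "A"],
})
labels = pd.Series([0, 1, 0, 1, 1, 0])

# Numeric columns: fill missing values with the median, then standardize.
# Categorical columns: one-hot encode, ignoring categories unseen at fit time.
preprocess = ColumnTransformer([
    ("num", Pipeline([
        ("impute", SimpleImputer(strategy="median")),
        ("scale", StandardScaler()),
    ]), ["numeric"]),
    ("cat", OneHotEncoder(handle_unknown="ignore"), ["categorical"]),
])

model = Pipeline([
    ("preprocess", preprocess),
    ("classify", LogisticRegression()),
])
model.fit(data, labels)

# New rows may contain missing values and previously unseen categories.
new_rows = pd.DataFrame({"numeric": [np.nan, 3.9], "categorical": ["A", "D"]})
predictions = model.predict(new_rows)
print(predictions)
```

Each column type gets its own preprocessing branch, and the fitted pipeline applies the same transformations at prediction time.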
Current: v0.1.2 - Production ready with comprehensive Python 3.12+ modernization
- ✅ Complete: Classification, regression, feature engineering, hyperparameter optimization
- ✅ Tested: Realistic test suite achieving 70% accuracy on challenging datasets
- ✅ Modern: Python 3.12+ codebase with full type hints, union types, and the walrus operator
- ✅ Validated: Enhanced CI/CD pipeline with performance benchmarking
See docs/status.md for the detailed roadmap and docs/testing.md for the testing infrastructure.
- Testing Infrastructure: Comprehensive test suite details
- Project Status: Current status and roadmap
- Contributing: Development guidelines
- Examples: Usage examples and tutorials
MIT License - see LICENSE for details.