# Quickstart
This guide will show you how to use Gunz-ML to interact with the Juno tracking infrastructure and perform basic analysis tasks.
## 1. Connecting to Juno

The core of the library is the `TrackingManager`, which provides a unified interface to MLflow and Optuna.
```python
from gunz_ml.management import TrackingManager

# Initialize the manager (defaults to the Juno production server)
tm = TrackingManager(
    mlflow_uri="http://juno.tnt.uni-hannover.de:5200",
    optuna_db="mysql+pymysql://optuna:optuna@juno:3311/optuna",
)
```
## 2. Finding the Best Run

You can retrieve the top-performing run of an experiment by metric:
```python
best_run = tm.get_best_run(
    experiment_name="3DR-Exp08-V2",
    metric_name="metric-spearmann_r",
)

if best_run:
    print(f"Best Run ID: {best_run.run_id}")
    print(f"Spearman R: {best_run.metrics['metric-spearmann_r']:.4f}")
```
## 3. High-Performance Logging

In your training scripts, use the `safe_set_experiment` utility to prevent race conditions when running high-concurrency Slurm jobs:
```python
import mlflow

from gunz_ml.integrations.mlflow import safe_set_experiment

# Ensures the experiment exists and is active before logging
safe_set_experiment("3DR-NewStudy")

with mlflow.start_run(run_name="worker-01"):
    mlflow.log_param("learning_rate", 0.001)
    mlflow.log_metric("accuracy", 0.95)
```
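The race in question arises when many workers check for an experiment, find it missing, and all try to create it at once. We have not inspected the internals of `safe_set_experiment`, but a plausible sketch of the check-then-create-with-retry pattern it likely implements looks like this (`FakeRegistry` and `AlreadyExists` are illustrative stand-ins, not part of gunz_ml or MLflow):

```python
import threading


# Illustrative stand-ins -- NOT part of gunz_ml or mlflow.
class AlreadyExists(Exception):
    """Raised when a second worker tries to create an existing experiment."""


class FakeRegistry:
    """Minimal experiment store whose create() rejects duplicates."""

    def __init__(self):
        self._lock = threading.Lock()
        self._names = set()

    def get(self, name):
        return name if name in self._names else None

    def create(self, name):
        with self._lock:
            if name in self._names:
                raise AlreadyExists(name)
            self._names.add(name)


def safe_set_experiment_sketch(registry, name, retries=3):
    """Check-then-create with retry: losing the creation race is not an error."""
    for _ in range(retries):
        if registry.get(name) is not None:
            return name              # already exists; just use it
        try:
            registry.create(name)
            return name              # we created it first
        except AlreadyExists:
            continue                 # another worker won the race; re-check
    raise RuntimeError(f"could not ensure experiment {name!r}")
```

The key design point is that the "already exists" error is swallowed and turned into a re-check: when twenty Slurm workers race, exactly one `create()` wins and the other nineteen fall through to the existence check and proceed normally.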
## 4. Calculating Pareto Frontiers
If you have multiple objectives (e.g., Accuracy vs. Latency), you can find the Pareto frontier efficiently:
```python
import pandas as pd
from optuna.study import StudyDirection

from gunz_ml.methods import find_pareto_frontier_fast

# Load your results
df = pd.DataFrame({
    'accuracy': [0.92, 0.94, 0.88, 0.95],
    'latency_ms': [15, 20, 10, 25],
})

# Find models that maximize accuracy and minimize latency
indices = find_pareto_frontier_fast(
    df, 'accuracy', 'latency_ms',
    StudyDirection.MAXIMIZE, StudyDirection.MINIMIZE,
)
print(df.iloc[indices])
```
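The dominance test behind a Pareto frontier is compact enough to write out directly, which is useful for sanity-checking results. Below is a minimal NumPy sketch under the same maximize/minimize convention; `pareto_frontier` here is our own illustration, not the library function, and it uses a simple O(n²) scan rather than whatever makes `find_pareto_frontier_fast` fast:

```python
import numpy as np
import pandas as pd


def pareto_frontier(df, maximize_col, minimize_col):
    """Return index positions of non-dominated rows.

    A row is dominated if some other row is at least as good on both
    objectives and strictly better on at least one.
    """
    # Negate the maximized column so "smaller is better" on both axes.
    pts = np.column_stack([
        -df[maximize_col].to_numpy(),
        df[minimize_col].to_numpy(),
    ])
    keep = np.ones(len(pts), dtype=bool)
    for i in range(len(pts)):
        if not keep[i]:
            continue  # already dominated; its dominator covers its victims
        # Rows that are <= on both coordinates and < on at least one.
        dominated_by = np.all(pts <= pts[i], axis=1) & np.any(pts < pts[i], axis=1)
        if dominated_by.any():
            keep[i] = False
    return np.flatnonzero(keep)


df = pd.DataFrame({
    'accuracy': [0.92, 0.94, 0.88, 0.95],
    'latency_ms': [15, 20, 10, 25],
})
# In this sample every point trades accuracy against latency,
# so all four rows are non-dominated.
print(pareto_frontier(df, 'accuracy', 'latency_ms'))
```

Note that all four sample rows sit on the frontier: each model that gains accuracy also pays latency, so none dominates another.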