Quickstart

This guide shows how to use Gunz-ML to interact with the Juno tracking infrastructure and perform basic analysis tasks.

1. Connecting to Juno

The core of the library is the TrackingManager, which provides a unified interface for MLflow and Optuna.

from gunz_ml.management import TrackingManager

# Initialize the manager, pointing it at the Juno production servers
tm = TrackingManager(
    mlflow_uri="http://juno.tnt.uni-hannover.de:5200",
    optuna_db="mysql+pymysql://optuna:optuna@juno:3311/optuna"
)

2. Finding the Best Run

You can easily find the top-performing run for a specific study:

best_run = tm.get_best_run(
    experiment_name="3DR-Exp08-V2",
    metric_name="metric-spearmann_r"
)

if best_run:
    print(f"Best Run ID: {best_run.run_id}")
    print(f"Spearman R: {best_run.metrics['metric-spearmann_r']:.4f}")
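Conceptually, picking a "best run" is just a maximum over a metric across a study's runs. The sketch below illustrates that idea with made-up run data; the dict layout and values are assumptions for illustration, not the actual objects TrackingManager returns.

```python
# Illustrative run records (fabricated values, not real Juno data)
runs = [
    {"run_id": "a1", "metrics": {"metric-spearmann_r": 0.71}},
    {"run_id": "b2", "metrics": {"metric-spearmann_r": 0.78}},
    {"run_id": "c3", "metrics": {"metric-spearmann_r": 0.74}},
]

# Keep the run with the highest value of the target metric
best = max(runs, key=lambda r: r["metrics"]["metric-spearmann_r"])
print(best["run_id"])  # b2
```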

3. High-Performance Logging

In your training scripts, use the safe_set_experiment utility to prevent race conditions when running high-concurrency Slurm jobs:

from gunz_ml.integrations.mlflow import safe_set_experiment
import mlflow

# Ensures the experiment exists and is active before logging
safe_set_experiment("3DR-NewStudy")

with mlflow.start_run(run_name="worker-01"):
    mlflow.log_param("learning_rate", 0.001)
    mlflow.log_metric("accuracy", 0.95)
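The race a helper like safe_set_experiment guards against is that "check whether the experiment exists, then create it" is not atomic: two Slurm workers can both see it missing and both try to create it. The standard fix is create-or-attach, where "already exists" is treated as success. The sketch below demonstrates that pattern against a stand-in backend; FakeRegistry, AlreadyExists, and the function name are illustrative, not Gunz-ML's actual internals.

```python
import threading

class AlreadyExists(Exception):
    """Raised when an experiment name is already registered."""
    pass

class FakeRegistry:
    """Stand-in for an MLflow tracking backend."""
    def __init__(self):
        self._lock = threading.Lock()
        self.experiments = {}

    def create(self, name):
        with self._lock:
            if name in self.experiments:
                raise AlreadyExists(name)
            self.experiments[name] = {"name": name}

def safe_set_experiment_sketch(registry, name):
    try:
        registry.create(name)   # first worker wins the race
    except AlreadyExists:
        pass                    # every other worker attaches to the same experiment
    return registry.experiments[name]

# Eight workers racing on the same experiment name: none of them crash,
# and exactly one experiment is created.
reg = FakeRegistry()
workers = [
    threading.Thread(target=safe_set_experiment_sketch, args=(reg, "3DR-NewStudy"))
    for _ in range(8)
]
for t in workers:
    t.start()
for t in workers:
    t.join()
print(len(reg.experiments))  # 1
```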

4. Calculating Pareto Frontiers

If you have multiple objectives (e.g., Accuracy vs. Latency), you can find the Pareto frontier efficiently:

import pandas as pd
from gunz_ml.methods import find_pareto_frontier_fast
from optuna.study import StudyDirection

# Load your results
df = pd.DataFrame({
    'accuracy': [0.92, 0.94, 0.88, 0.95],
    'latency_ms': [15, 20, 10, 25]
})

# Find models that maximize accuracy and minimize latency
indices = find_pareto_frontier_fast(
    df, 'accuracy', 'latency_ms',
    StudyDirection.MAXIMIZE, StudyDirection.MINIMIZE
)

print(df.iloc[indices])
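For intuition, the Pareto frontier keeps every row that no other row dominates, i.e. no other row is at least as good on both objectives and strictly better on one. A naive O(n²) reference version is sketched below, assuming only pandas and NumPy; this is not the library's implementation, only a readable check of what it computes.

```python
import numpy as np
import pandas as pd

def pareto_frontier_sketch(df, col_max, col_min):
    """Return indices of rows not dominated by any other row.

    Row j dominates row i if it is >= on col_max and <= on col_min,
    with at least one of the two inequalities strict.
    """
    a = df[col_max].to_numpy()
    b = df[col_min].to_numpy()
    keep = []
    for i in range(len(df)):
        dominated = np.any(
            (a >= a[i]) & (b <= b[i]) & ((a > a[i]) | (b < b[i]))
        )
        if not dominated:
            keep.append(i)
    return keep

df = pd.DataFrame({
    'accuracy': [0.92, 0.94, 0.88, 0.95],
    'latency_ms': [15, 20, 10, 25],
})
print(pareto_frontier_sketch(df, 'accuracy', 'latency_ms'))  # [0, 1, 2, 3]
```

On this sample every row is non-dominated: each model trades some latency for some accuracy, so all four sit on the frontier.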