# Core Concepts

**Gunz-ML** is more than a logging utility; it is a research-centric SDK designed to manage the entire lifecycle of deep learning experiments in a distributed environment.

## 1. The Research SDK Philosophy

Most experiment trackers are passive sinks for data. Gunz-ML acts as a **bridge**. It allows you to:

* **Write:** Log high-frequency metrics and artifacts during training.
* **Read:** Query past results to inform the current HPO (Hyperparameter Optimization) loop.
* **Extract:** Programmatically download artifacts from Juno for downstream analysis in notebooks.

## 2. Tracking vs. Management

The library distinguishes between two levels of operation:

* **Tracking (`gunz_ml.integrations`):** Low-level logic that ensures metrics reach MLflow and Optuna without database locks or race conditions.
* **Management (`gunz_ml.management`):** High-level logic (e.g., `TrackingManager`) used to find the best runs, prune failed trials, and generate comparison reports across studies.

## 3. Distributed Safety

In a Slurm-based cluster environment, multiple workers often try to initialize the same study simultaneously. Gunz-ML implements an **Initialization-First Policy**:

* Studies are pre-scaffolded using the `gunz-ml init` CLI.
* Workers call `safe_set_experiment` to verify the environment is ready before starting, preventing lock errors and duplicate-initialization races against the MariaDB backend.

## 4. The Juno Ecosystem

Gunz-ML is designed to communicate with **Juno**, the unified experiment infrastructure:

* **MLflow:** Stores run metadata, parameters, and time-series metrics.
* **Optuna (MariaDB):** Stores the relational data for HPO trials.
* **MinIO (S3):** Stores large binary artifacts (model checkpoints, `.pt` files, and plots).

By standardizing on these backends, Gunz-ML ensures that your research is reproducible, queryable, and persistent.
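The initialization-first flow in Distributed Safety can be sketched with a stdlib-only stand-in. Everything here is illustrative, not the Gunz-ML API: `wait_until_ready` and its retry parameters are hypothetical names, and the `study_is_ready` probe stands in for whatever check `safe_set_experiment` performs against the backend.

```python
import time

def wait_until_ready(study_is_ready, retries=5, base_delay=0.01):
    """Initialization-first guard (hypothetical sketch): a worker polls
    until the pre-scaffolded study is visible, rather than racing other
    workers to create it and triggering backend lock errors."""
    for attempt in range(retries):
        if study_is_ready():
            return True
        # Back off exponentially between polls so a slow `gunz-ml init`
        # is not hammered by every Slurm worker at once.
        time.sleep(base_delay * (2 ** attempt))
    raise RuntimeError("study not initialized; run `gunz-ml init` first")
```

In practice the probe would be a cheap backend query (for instance, checking that the Optuna study row exists); the worker only proceeds to training once the probe succeeds.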
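The management level's "find the best runs, prune failed trials" behavior can likewise be illustrated with a small self-contained sketch. The function and the run-record shape below are assumptions for illustration only, not `TrackingManager`'s actual interface:

```python
def best_runs(runs, metric="val_loss", k=3, minimize=True):
    """Management-level query sketch (hypothetical): rank finished runs
    by a metric, silently skipping failed trials and runs that never
    logged the metric -- the kind of filtering a TrackingManager-style
    helper performs across a study."""
    finished = [
        r for r in runs
        if r["status"] == "FINISHED" and metric in r["metrics"]
    ]
    # Sort ascending for losses, descending for scores like accuracy.
    finished.sort(key=lambda r: r["metrics"][metric], reverse=not minimize)
    return finished[:k]
```

Keeping the query read-only like this is what makes the read side of the bridge safe to call from inside a live HPO loop: it informs the next trial without mutating tracking state.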