38 changes: 38 additions & 0 deletions README.md
@@ -46,6 +46,44 @@ https://www.kaggle.com/models/emilcode/nightingale

After downloading the folders unzip them.

### How to set up a remote tracking server on your local host system
https://mlflow.org/docs/3.1.3/ml/tracking/tutorials/remote-server/

https://mlflow.org/docs/3.1.3/ml/tracking/server/

TODO!!!!

### Default tracking server
If you don't set MLFLOW_TRACKING_URI to a specific location, MLflow creates an mlruns directory in the current working directory (CWD) of your Python process (e.g. next to the notebook, if the MLflow run was started from that notebook).
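
The fallback described above can be sketched in plain Python. This is only an illustration of the behaviour, not MLflow's actual implementation: the helper name `resolve_tracking_uri` is hypothetical, and MLflow reports the default location in its own URI form.

```python
import os
from pathlib import Path


def resolve_tracking_uri(environ=None, cwd=None):
    """Return the configured tracking URI, or the default mlruns location.

    Hypothetical helper that mirrors the fallback described above.
    It is not part of MLflow.
    """
    environ = os.environ if environ is None else environ
    cwd = Path.cwd() if cwd is None else Path(cwd)

    configured = environ.get("MLFLOW_TRACKING_URI")
    if configured:
        return configured
    # Nothing configured: runs end up in an mlruns directory under the CWD.
    return str(cwd / "mlruns")
```

In the notebook, comparing `os.getenv("MLFLOW_TRACKING_URI")` with `mlflow.get_tracking_uri()` shows which location is actually in effect.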

### Use .env file to configure mlflow tracking server
MLFLOW_TRACKING_URI=<your-tracking-uri> (local file system or url to remote tracking server)
MLFLOW_EXPERIMENT_ID=<experiment_id>
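
The notebook in this change loads the file with `load_dotenv()` from python-dotenv. What that amounts to can be sketched with the standard library alone. `load_env_file` below is a hypothetical stand-in that assumes simple `KEY=VALUE` lines; it does not handle quoting or variable expansion the way python-dotenv does.

```python
import os


def load_env_file(path, environ=None):
    """Read KEY=VALUE lines from a .env file into environ (hypothetical helper)."""
    environ = os.environ if environ is None else environ
    with open(path, encoding="utf-8") as handle:
        for raw_line in handle:
            line = raw_line.strip()
            # Skip blank lines, comments, and anything that is not KEY=VALUE.
            if not line or line.startswith("#") or "=" not in line:
                continue
            key, value = line.split("=", 1)
            # Keep values that are already set, so the shell or container wins.
            environ.setdefault(key.strip(), value.strip())
    return environ
```

Like `load_dotenv()` with its default settings, this sketch does not override variables that are already present in the environment. Note that with this sketch a trailing explanation in parentheses would become part of the value, so the real file should contain only the value itself.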

### Set tracking server during container startup
Optional: To change the MLFLOW_TRACKING_URI during container startup add something like
"containerEnv": {
"MLFLOW_TRACKING_URI": "./mlruns"
},
to the .devcontainer.json file.
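
As a rough illustration of where the entry sits in the configuration, the following sketch adds it to a config held as a Python dict. The helper `with_tracking_uri` and the starting config are hypothetical. A real devcontainer file may contain comments, which the `json` module does not parse, so editing the file by hand is usually simpler.

```python
import json


def with_tracking_uri(config, tracking_uri="./mlruns"):
    """Return a copy of a devcontainer config dict with MLFLOW_TRACKING_URI set."""
    updated = dict(config)
    # Copy the nested dict too, so other containerEnv entries are preserved
    # and the caller's config is left untouched.
    container_env = dict(updated.get("containerEnv", {}))
    container_env["MLFLOW_TRACKING_URI"] = tracking_uri
    updated["containerEnv"] = container_env
    return updated


# Hypothetical starting configuration; a real file will have more keys.
original = {"name": "example-container"}
example = with_tracking_uri(original)
print(json.dumps(example, indent=2))
```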

### Set up MLflow with Databricks
If you have a Databricks account and prefer to track your experiments and register your models with the Databricks tracking server and Unity Catalog, you can set up the connection as follows.

https://docs.databricks.com/aws/en/mlflow3/genai/getting-started/tracing/tracing-ide?language=Use+a+.env+file

Create an .env file in your project root and add:
```
DATABRICKS_TOKEN=<your-databricks-token>
# something like: https://abc-a1234b12-a123.cloud.databricks.com
DATABRICKS_HOST=<host-url-of-your-databricks-cloud>
MLFLOW_TRACKING_URI=databricks
# for Unity Catalog
MLFLOW_REGISTRY_URI=databricks-uc
MLFLOW_EXPERIMENT_ID=<experiment_id>
```
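
Before starting a run it can help to confirm that these entries actually reached the environment. The check below is a minimal sketch: the helper name `missing_databricks_vars` is hypothetical, and the list of names simply mirrors the entries above rather than anything MLflow or Databricks enforces.

```python
import os

# Names taken from the .env entries above; MLFLOW_REGISTRY_URI only matters
# when models are registered in Unity Catalog.
EXPECTED_DATABRICKS_VARS = (
    "DATABRICKS_TOKEN",
    "DATABRICKS_HOST",
    "MLFLOW_TRACKING_URI",
    "MLFLOW_REGISTRY_URI",
    "MLFLOW_EXPERIMENT_ID",
)


def missing_databricks_vars(environ=None):
    """Return the expected variable names that are unset or empty."""
    environ = os.environ if environ is None else environ
    return [name for name in EXPECTED_DATABRICKS_VARS if not environ.get(name)]
```

Calling `missing_databricks_vars()` right after `load_dotenv()` in the notebook would list anything that is still missing, without printing the token itself.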

The .env file is ignored by git by default, but make sure you do NOT commit and push it: it contains credentials that are not meant to be public.


### Run mlflow tracking server
Open the bash/Git bash and change the directory to the unzipped folder containing the model.
```bash
80 changes: 41 additions & 39 deletions notebooks/train_nightingale.ipynb
@@ -285,64 +285,66 @@
"metadata": {},
"outputs": [],
"source": [
"# import mlflow\n",
"# from mlflow import MlflowClient\n",
"\n",
"# TRACKING_URI_LOCAL = \"http://host.docker.internal:5757\"\n",
"\n",
"# client = MlflowClient(tracking_uri=TRACKING_URI_LOCAL)\n",
"\n",
"import mlflow\n",
"from mlflow import MlflowClient\n",
"# from mlflow import MlflowClient\n",
"\n",
"TRACKING_URI_LOCAL = \"http://host.docker.internal:5757\"\n",
"# At the beginning of your Python script\n",
"from dotenv import load_dotenv\n",
"\n",
"client = MlflowClient(tracking_uri=TRACKING_URI_LOCAL)"
"# Load environment variables from .env file\n",
"load_dotenv()"
]
},
{
"cell_type": "markdown",
"id": "348bf482",
"cell_type": "code",
"execution_count": null,
"id": "ea50b3d5",
"metadata": {},
"outputs": [],
"source": [
"### Create experiment \n",
"RUN THE FOLLOWING CODE BLOCK ONLY ONCE FOR INITIAL EXPERIMENT SETUP!!"
"import os\n",
"print(\"Env:\", os.getenv(\"MLFLOW_TRACKING_URI\"))\n",
"print(\"From MLflow:\", mlflow.get_tracking_uri())"
]
},
{
"cell_type": "code",
"execution_count": null,
"id": "1dcec8ff",
"cell_type": "markdown",
"id": "348bf482",
"metadata": {},
"outputs": [],
"source": [
"experiment_description = (\n",
" \"Nightingale is a bird call classification project.\"\n",
")\n",
"\n",
"experiment_tags = {\n",
" \"project_name\": \"nightingale\",\n",
" \"mlflow.note.content\": experiment_description,\n",
"}\n",
"\n",
"# only run following command once to create the experiment after the server has been started for the first time\n",
"client.create_experiment(name=\"Nightingale Bird Call Classification\", tags=experiment_tags)"
"### Create experiment \n",
"RUN THE FOLLOWING CODE BLOCK ONLY ONCE FOR INITIAL EXPERIMENT SETUP!!"
]
},
{
"cell_type": "code",
"execution_count": null,
"id": "a5010b9c",
"id": "1dcec8ff",
"metadata": {},
"outputs": [],
"source": [
"\n",
"\n",
"# Use the fluent API to set the tracking uri and the active experiment\n",
"mlflow.set_tracking_uri(TRACKING_URI_LOCAL)\n",
"\n",
"# Sets the current active experiment to the \"Nightingale_Bird_Call_Classification\" experiment and returns the Experiment metadata\n",
"nightingale_experiment = mlflow.set_experiment(\"Nightingale Bird Call Classification\")\n",
"\n",
"# Define a run name for this iteration of training.\n",
"# If this is not set, a unique name will be auto-generated for your run.\n",
"# run_name = \"nightingale_classifier_test\"\n",
"\n",
"# Define an artifact path that the model will be saved to.\n",
"# artifact_path = \"classifier_nightingale\""
"# experiment_description = (\n",
"# \"Nightingale is a bird call classification project.\"\n",
"# )\n",
"\n",
"# experiment_tags = {\n",
"# \"project_name\": \"nightingale\",\n",
"# \"mlflow.note.content\": experiment_description,\n",
"# }\n",
"\n",
"# # only run following command once to create the experiment after the server has been started for the first time\n",
"# # client.create_experiment(name=\"Nightingale Bird Call Classification\", tags=experiment_tags)\n",
"# mlflow.set_experiment(\n",
"# experiment_name=\"/Workspace/Users/ephraim.eckl@posteo.de/nightingale\",\n",
"# experiment_id=\"2165278269360514\"\n",
"# )"
]
},
{
@@ -427,7 +429,7 @@
"\n",
" print(\"Shape of input_example:\", sample_input.shape)\n",
" # Log an instance of the trained model for later use\n",
" model_info = mlflow.keras.log_model(model=bird_class_model, name = \"Bird-Call-Classifier-Head\", signature=signature, pip_requirements=['keras==3.10.0'], registered_model_name=\"Reg-Bird-Call-Classifier-Head\")\n",
" model_info = mlflow.keras.log_model(model=bird_class_model, name = \"Bird-Call-Classifier-Head\", signature=signature, pip_requirements=['keras==3.10.0'], registered_model_name=\"nightingale-dev.default.Reg-Bird-Call-Classifier-Head\")\n",
"# # mlflow.sklearn.log_model(sk_model=rf, input_example=X_val, name=artifact_path)\n",
" "
]
@@ -462,7 +464,7 @@
"metadata": {},
"outputs": [],
"source": [
"bird_class_model.save('bird_classifier_head.keras')"
"# bird_class_model.save('bird_classifier_head.keras')"
]
}
],