2 changes: 1 addition & 1 deletion manuscript/01.1-Introduction-Machine_Learning_Workflow.Rmd
@@ -55,7 +55,7 @@ Model Tracking tools are often used at the development and testing stages of the

Model Serving refers to the process of deploying a machine learning model in a production environment so it can be used to make predictions on new data. This includes tasks such as scaling the model to handle large amounts of data, deploying the model to different environments, and monitoring the performance of the deployed model. Model serving tools are specifically used at the deployment stage of the machine learning workflow and handle the aforementioned tasks.

There are multiple tools that integrate the functionality of serving models, each differing in its specific use cases, for example *TensorFlow*, *Kubernetes*, *DataRobot*, or the already mentioned tools *MLflow* and *Airflow*.
There are multiple tools that integrate the functionality of serving models, each differing in its specific use cases, for example *KServe*, *BentoML*, *Seldon*, or already mentioned tools like *MLflow*.


### Developing Machine Learning Models
10 changes: 4 additions & 6 deletions manuscript/07-ML-Project_Design.Rmd
@@ -14,14 +14,12 @@ The infrastructure will be maintained using the Infrastructure as Code tool *Ter

The following chapters give an introductory tutorial on each of the previously introduced tools. A machine learning workflow using Airflow is set up on the deployed infrastructure, including data preprocessing, model training, and model deployment, as well as tracking the experiment and deploying the model into production using MLflow.

The necessary AWS infrastructure is set up using Terraform. This includes creating an AWS EKS cluster and the associated resources like a virtual private cloud (VPC), subnets, security groups, and IAM roles, as well as further AWS resources needed to deploy Airflow and MLflow.
The necessary AWS infrastructure is set up using Terraform. This includes creating an AWS EKS cluster and the associated resources like a virtual private cloud (VPC), subnets, security groups, and IAM roles, as well as further AWS resources needed to deploy custom modules. Networking is handled by an AWS Application Load Balancer or an Ingress controller that routes traffic to the correct service/pod in the cluster.
Once the EKS cluster is set up, Kubernetes can be used to deploy and manage applications on the cluster. Helm, a package manager for Kubernetes, is used to manage the deployment of Airflow and MLflow. The EKS cluster allows for easy scalability and management of the platforms. The code is made public in a GitHub repository, and GitHub Actions is used to automate the deployment of the infrastructure following CI/CD principles.

Once the infrastructure is set up, machine learning models can be deployed to the EKS cluster as Kubernetes pods, using Airflow's scheduling processes. Airflow's ability to scan local directories or Git repositories will be used to import the relevant machine learning code from a second GitHub repository.
Similarly to building Airflow workflows, the machine learning code will also include using the MLflow API to allow for model tracking. GitHub Actions is used as a CI/CD pipeline to automatically build, test, and deploy machine learning models in this repository, just as it is in the infrastructure repository.
Once the infrastructure is set up, machine learning models can be trained on the EKS cluster as Kubernetes pods, using Airflow's scheduling processes. Airflow's ability to scan local directories or Git repositories will be used to import the relevant machine learning code from a second GitHub repository.
Similarly to building Airflow workflows, the machine learning code will also include using the MLflow API to allow for model tracking and storage. GitHub Actions is used as a CI/CD pipeline to automatically build, test, and deploy the machine learning model code in this repository, just as it is in the infrastructure repository.
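
A minimal sketch of what such tracking code could look like — the tracking URI, experiment name, hyperparameters, and registered model name are illustrative assumptions, not values from this project:

```python
import mlflow
import mlflow.tensorflow
import tensorflow as tf

# Hypothetical in-cluster tracking server and experiment name
mlflow.set_tracking_uri("http://mlflow-service:5000")
mlflow.set_experiment("skin-cancer-detection")

# Stand-in model; the real pipeline would train its own network
model = tf.keras.Sequential([tf.keras.layers.Dense(1, input_shape=(4,))])

with mlflow.start_run():
    mlflow.log_param("epochs", 10)       # example hyperparameter
    mlflow.log_metric("accuracy", 0.93)  # example evaluation metric
    # Log and register the model so it can later be served from the registry
    mlflow.tensorflow.log_model(
        model,
        artifact_path="model",
        registered_model_name="cnn-skin-cancer",  # assumed registry name
    )
```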

<!---
Monitoring and logging would be achieved using CloudWatch to monitor the health and performance of the EKS cluster and its components, such as worker nodes and Kubernetes pods, and the ELK stack or similar for logging of the system and applications. Networking would be handled by the AWS Elastic Load Balancing service or an Ingress controller to route traffic to the correct service/pod in the cluster.
-->
Model serving is done via Seldon, which allows for automatic scaling and integrates seamlessly with MLflow. Monitoring and logging are achieved using Prometheus and Grafana to monitor the health and performance of the EKS cluster and its components, such as worker nodes and Kubernetes pods, as well as to monitor the deployed models as applications.
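
On the model side, the serving application could expose custom metrics for Prometheus to scrape via the `prometheus_client` library. A hedged sketch — the metric names and port are assumptions:

```python
from prometheus_client import Counter, Histogram, start_http_server

# Hypothetical metrics for a model-serving application
PREDICTIONS_TOTAL = Counter(
    "model_predictions_total", "Number of predictions served"
)
PREDICTION_LATENCY = Histogram(
    "model_prediction_latency_seconds", "Prediction latency in seconds"
)

def predict_with_metrics(model, batch):
    """Wrap a model call so Prometheus can track volume and latency."""
    with PREDICTION_LATENCY.time():
        result = model.predict(batch)
    PREDICTIONS_TOTAL.inc()
    return result

if __name__ == "__main__":
    start_http_server(8000)  # assumed metrics port scraped by Prometheus
```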

Whereas the deployment of the infrastructure would be taken care of by MLOps, DevOps, and Data Engineers, the development of the Airflow workflows, including MLflow, would be handled by Data Scientists and ML Engineers.
16 changes: 16 additions & 0 deletions manuscript/08.3-Deployment-Infrastructure_Modules.Rmd
@@ -547,6 +547,20 @@ resource "helm_release" "jupyterhub" {
}
```


<!---

### Seldon

Seldon is used for model deployment.

Installed via a Helm chart.

--->

<!---

### Monitoring
@@ -557,5 +571,7 @@ Prometheus is employed for [specific purpose], while Grafana is used for [specif

#### Prometheus
#### Grafana

Grafana GitHub OAuth and Ingress configuration
--->

2 changes: 1 addition & 1 deletion manuscript/09.2-Deployment-Usage_Pipeline-Workflow.Rmd
@@ -1,6 +1,6 @@


## Pipeline Workflow
## Training Pipeline Workflow

The code and machine learning pipeline have been modularized into distinct steps, including preprocessing, model training, model comparison, and model serving. Airflow serves as the model workflow tool, generating DAGs for managing the pipeline. MLflow is integrated to facilitate model tracking, registry, and serving functionalities. To ensure portability and scalability, the codebase has been containerized using Docker, allowing it to be executed in Docker and/or Kubernetes environments.
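
A hedged sketch of how such a DAG might wire the steps together as Kubernetes pods — the DAG id, task names, images, and schedule are illustrative assumptions:

```python
from datetime import datetime

from airflow import DAG
from airflow.providers.cncf.kubernetes.operators.kubernetes_pod import (
    KubernetesPodOperator,
)

# Illustrative DAG: names, images, and schedule are assumptions
with DAG(
    dag_id="skin_cancer_training_pipeline",
    start_date=datetime(2023, 1, 1),
    schedule_interval=None,
    catchup=False,
) as dag:
    preprocess = KubernetesPodOperator(
        task_id="preprocessing",
        name="preprocessing",
        image="cnn-skin-cancer/preprocess:latest",  # hypothetical image
        cmds=["python", "preprocess.py"],
    )
    train = KubernetesPodOperator(
        task_id="model_training",
        name="model-training",
        image="cnn-skin-cancer/train:latest",  # hypothetical image
        cmds=["python", "train.py"],
    )
    compare = KubernetesPodOperator(
        task_id="model_comparison",
        name="model-comparison",
        image="cnn-skin-cancer/compare:latest",  # hypothetical image
        cmds=["python", "compare.py"],
    )

    # Run the containerized steps in sequence
    preprocess >> train >> compare
```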

@@ -1,5 +1,5 @@

## Building the Pipeline Steps
## Training Pipeline Steps

As mentioned previously, the machine learning pipeline for this particular use case comprises three primary stages: preprocessing, training, and serving. Furthermore, only the model that achieves the highest accuracy is chosen for deployment, which introduces an additional step for model comparison. Each of these steps will be further explained in the upcoming sections.
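
A hedged sketch of how the comparison step might query MLflow for the best run — the tracking URI, experiment name, and metric name are assumptions:

```python
import mlflow
from mlflow.tracking import MlflowClient

mlflow.set_tracking_uri("http://mlflow-service:5000")  # hypothetical address
client = MlflowClient()

experiment = client.get_experiment_by_name("skin-cancer-detection")  # assumed name
runs = client.search_runs(
    experiment_ids=[experiment.experiment_id],
    order_by=["metrics.accuracy DESC"],  # pick the run with the highest accuracy
    max_results=1,
)
best_run = runs[0]
print(f"Best run {best_run.info.run_id} with accuracy "
      f"{best_run.data.metrics['accuracy']:.4f}")
```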

82 changes: 6 additions & 76 deletions manuscript/09.4-Deployment-Usage_Model-Serving.Rmd
@@ -8,6 +8,11 @@ The concept involves running a Docker container that serves the pre-trained Ten

### Model Serving

Model serving is done via Seldon. A detailed explanation is TBD.
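
Until that explanation lands, a hedged sketch of how a client might call a model deployed with Seldon Core's v1 prediction protocol — the host, namespace, deployment name, and input shape are all assumptions:

```python
import requests

# Hypothetical in-cluster Seldon endpoint (v1 prediction protocol)
SELDON_URL = (
    "http://seldon-ingress/seldon/default/skin-cancer-model/api/v1.0/predictions"
)

payload = {"data": {"ndarray": [[0.1, 0.2, 0.3]]}}  # illustrative input shape
response = requests.post(SELDON_URL, json=payload, timeout=30)
print(response.json())
```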

<!---
### Model Serving

The model serving application is built on FastAPI, which includes various endpoints catering to our use case. The primary endpoint, `predict`, allows for multiple predictions to be made, while additional maintenance endpoints such as `info` or `health` provide relevant information about the app itself.

To initiate the prediction process, the model is first retrieved from the MLflow registry. The specific model to be fetched and the location of the MLflow server are specified through environment variables. Once the model is loaded, it is used to generate predictions based on the provided input data. The API call returns the predictions in the form of a Python list.
@@ -147,79 +152,4 @@ The following code snippet contains the actual logic for the `/predict` endpoin
raise HTTPException(status_code=400, detail="Invalid file format. Only JPG Files accepted.")

```

### Streamlit App

The Streamlit app offers a simple interface for performing inferences on the served model. The user interface enables users to upload a `jpg` image. Upon clicking the `Start Prediction` button, the image is sent to the model serving app, where a prediction is made. The prediction results are then returned as JSON, which can be downloaded upon request.

**Importing Dependencies**
This section imports the necessary dependencies for the code, including libraries for file handling, JSON processing, working with images, making HTTP requests, and creating the Streamlit application.

```python
# Imports necessary packages
import io
import json
import os

import requests
import streamlit as st
from PIL import Image

```

#### Setting Up the Streamlit Application {.unlisted .unnumbered}
First, the header and subheader for the Streamlit application are set. Afterward, the FastAPI serving IP and port are retrieved from environment variables. They are used to construct the FastAPI endpoint URL, to which a POST request is later sent.

```python
st.header("MLOps Engineering Project")
st.subheader("Skin Cancer Detection")

# FastAPI endpoint
FASTAPI_SERVING_IP = os.getenv("FASTAPI_SERVING_IP")
FASTAPI_SERVING_PORT = os.getenv("FASTAPI_SERVING_PORT")
FASTAPI_ENDPOINT = f"http://{FASTAPI_SERVING_IP}:{FASTAPI_SERVING_PORT}/predict"

```

#### Uploading test image {.unlisted .unnumbered}
The `st.file_uploader` allows the user to upload a test image in JPG format using the Streamlit file uploader widget. The type of the uploaded file is limited to `.jpg`. If a test image has been uploaded, the image is processed by opening it with PIL and creating a file-like object.


```python
test_image = st.file_uploader("", type=["jpg"], accept_multiple_files=False)

if test_image:
image = Image.open(test_image)
image_file = io.BytesIO(test_image.getvalue())
files = {"file": image_file}

```

#### Displaying the uploaded image and performing prediction {.unlisted .unnumbered}
A two-column layout is created in the Streamlit app that displays the uploaded image in the first column. In the second column, a button for the user to start the prediction process is displayed. When the button is clicked, a POST request with the uploaded image file is sent to the FastAPI endpoint. The prediction results are displayed as JSON and can be downloaded as a JSON file.

```python
col1, col2 = st.columns(2)

with col1:
# Display the uploaded image in the first column
st.image(test_image, caption="", use_column_width="always")

with col2:
if st.button("Start Prediction"):
with st.spinner("Prediction in Progress. Please Wait..."):
# Send a POST request to FastAPI for prediction
output = requests.post(FASTAPI_ENDPOINT, files=files, timeout=8000)
st.success("Success! Click the Download button below to retrieve prediction results (JSON format)")
# Display the prediction results in JSON format
st.json(output.json())
# Add a download button to download the prediction results as a JSON file
st.download_button(
label="Download",
data=json.dumps(output.json()),
file_name="cnn_skin_cancer_prediction_results.json",
)

```

--->
82 changes: 82 additions & 0 deletions manuscript/09.5-Deployment-Usage_Model-Inferencing.Rmd
@@ -0,0 +1,82 @@

## Model Serving & Inferencing

The process of serving the model and making inferences uses Docker containers running within Kubernetes pods.

The concept involves running a Docker container that serves the pre-trained TensorFlow model using FastAPI. This containerized model is responsible for providing predictions and responses to incoming requests. Additionally, a Streamlit app is used to interact with the served model, enabling users to make inferences by sending input data to the model and receiving the corresponding predictions.
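
A hedged sketch of such a serving app — the model URI, environment variable names, input size, and endpoint set follow the description above, but the exact values are assumptions:

```python
import io
import os

import mlflow.pyfunc
import numpy as np
from fastapi import FastAPI, File, HTTPException, UploadFile
from PIL import Image

app = FastAPI()

# Model location is assumed to be injected via environment variables
MODEL_URI = os.getenv("MLFLOW_MODEL_URI", "models:/cnn-skin-cancer/Production")
model = mlflow.pyfunc.load_model(MODEL_URI)

@app.get("/health")
def health():
    return {"status": "ok"}

@app.post("/predict")
async def predict(file: UploadFile = File(...)):
    if not file.filename.lower().endswith(".jpg"):
        raise HTTPException(status_code=400, detail="Only JPG files accepted.")
    # Decode, resize, and scale the image before handing it to the model
    image = Image.open(io.BytesIO(await file.read())).resize((224, 224))
    batch = np.expand_dims(np.asarray(image) / 255.0, axis=0)
    prediction = model.predict(batch)
    return {"prediction": np.asarray(prediction).tolist()}
```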

### Streamlit App

The Streamlit app offers a simple interface for performing inferences on the served model. The user interface enables users to upload a `jpg` image. Upon clicking the `Start Prediction` button, the image is sent to the model serving app, where a prediction is made. The prediction results are then returned as JSON, which can be downloaded upon request.

**Importing Dependencies**
This section imports the necessary dependencies for the code, including libraries for file handling, JSON processing, working with images, making HTTP requests, and creating the Streamlit application.

```python
# Imports necessary packages
import io
import json
import os

import requests
import streamlit as st
from PIL import Image

```

#### Setting Up the Streamlit Application {.unlisted .unnumbered}
First, the header and subheader for the Streamlit application are set. Afterward, the FastAPI serving IP and port are retrieved from environment variables. They are used to construct the FastAPI endpoint URL, to which a POST request is later sent.

```python
st.header("MLOps Engineering Project")
st.subheader("Skin Cancer Detection")

# FastAPI endpoint
FASTAPI_SERVING_IP = os.getenv("FASTAPI_SERVING_IP")
FASTAPI_SERVING_PORT = os.getenv("FASTAPI_SERVING_PORT")
FASTAPI_ENDPOINT = f"http://{FASTAPI_SERVING_IP}:{FASTAPI_SERVING_PORT}/predict"

```

#### Uploading test image {.unlisted .unnumbered}
The `st.file_uploader` allows the user to upload a test image in JPG format using the Streamlit file uploader widget. The type of the uploaded file is limited to `.jpg`. If a test image has been uploaded, the image is processed by opening it with PIL and creating a file-like object.


```python
test_image = st.file_uploader("", type=["jpg"], accept_multiple_files=False)

if test_image:
image = Image.open(test_image)
image_file = io.BytesIO(test_image.getvalue())
files = {"file": image_file}

```

#### Displaying the uploaded image and performing prediction {.unlisted .unnumbered}
A two-column layout is created in the Streamlit app that displays the uploaded image in the first column. In the second column, a button for the user to start the prediction process is displayed. When the button is clicked, a POST request with the uploaded image file is sent to the FastAPI endpoint. The prediction results are displayed as JSON and can be downloaded as a JSON file.

```python
col1, col2 = st.columns(2)

with col1:
# Display the uploaded image in the first column
st.image(test_image, caption="", use_column_width="always")

with col2:
if st.button("Start Prediction"):
with st.spinner("Prediction in Progress. Please Wait..."):
# Send a POST request to FastAPI for prediction
output = requests.post(FASTAPI_ENDPOINT, files=files, timeout=8000)
st.success("Success! Click the Download button below to retrieve prediction results (JSON format)")
# Display the prediction results in JSON format
st.json(output.json())
# Add a download button to download the prediction results as a JSON file
st.download_button(
label="Download",
data=json.dumps(output.json()),
file_name="cnn_skin_cancer_prediction_results.json",
)

```

3 changes: 2 additions & 1 deletion manuscript/_bookdown.yml
@@ -43,8 +43,9 @@ rmd_files:
"09-Deployment-Usage_Overview.Rmd",
"09.1-Deployment-Usage_IDE.Rmd",
"09.2-Deployment-Usage_Pipeline-Workflow.Rmd",
"09.3-Deployment-Usage_Building-Model-Pipeline.Rmd",
"09.3-Deployment-Usage_Training-Model-Pipeline.Rmd",
"09.4-Deployment-Usage_Model-Serving.Rmd",
"09.5-Deployment-Usage_Model-Inferencing.Rmd",
"10-Acknowledgements.Rmd"
]
