2 changes: 1 addition & 1 deletion manuscript/01.1-Introduction-Machine_Learning_Workflow.Rmd
@@ -55,7 +55,7 @@ Model Tracking tools are often used at the development and testing stages of the

Model Serving refers to the process of deploying a machine learning model in a production environment so it can be used to make predictions on new data. This includes tasks such as scaling the model to handle large amounts of data, deploying the model to different environments, and monitoring the performance of the deployed model. Model serving tools are specifically used at the deployment stage of the machine learning workflow and handle the aforementioned tasks.

There are multiple tools that integrate the functionality of serving models, each differing in its specific use cases, for example *TensorFlow*, *Kubernetes*, *DataRobot*, or the already mentioned tools *MLflow* and *Airflow*.
There are multiple tools that integrate the functionality of serving models, each differing in its specific use cases, for example *KServe*, *BentoML*, *Seldon*, or already mentioned tools like *MLflow*.


### Developing Machine Learning Models
10 changes: 4 additions & 6 deletions manuscript/07-ML-Project_Design.Rmd
@@ -14,14 +14,12 @@ The infrastructure will be maintained using the Infrastructure as Code tool *Ter

The following chapters give an introductory tutorial on each of the previously introduced tools. A machine learning workflow using Airflow is set up on the deployed infrastructure, including data preprocessing, model training, and model deployment, as well as tracking the experiment and deploying the model into production using MLflow.

The necessary AWS infrastructure is set up using Terraform. This includes creating an AWS EKS cluster and the associated resources like a virtual private cloud (VPC), subnets, security groups, and IAM roles, as well as further AWS resources needed to deploy Airflow and MLflow.
The necessary AWS infrastructure is set up using Terraform. This includes creating an AWS EKS cluster and the associated resources like a virtual private cloud (VPC), subnets, security groups, and IAM roles, as well as further AWS resources needed to deploy custom modules. Networking is handled by an AWS Application Load Balancer or an Ingress controller that routes traffic to the correct service/pod in the cluster.
Once the EKS cluster is set up, Kubernetes can be used to deploy and manage applications on the cluster. Helm, a package manager for Kubernetes, is used to manage the deployment of Airflow and MLflow. The EKS cluster allows for easy scalability and management of the platforms. The code is made public in a GitHub repository, and GitHub Actions is used to automate the deployment of the infrastructure following CI/CD principles.

Once the infrastructure is set up, machine learning models can be deployed to the EKS cluster as Kubernetes pods, using Airflow's scheduling processes. Airflow's ability to scan local directories or Git repositories will be used to import the relevant machine learning code from a second GitHub repository.
Similarly to building Airflow workflows, the machine learning code will also include using the MLflow API to allow for model tracking. GitHub Actions is used as a CI/CD pipeline to automatically build, test, and deploy machine learning models in this repository, just as it is in the infrastructure repository.
Once the infrastructure is set up, machine learning models can be trained on the EKS cluster as Kubernetes pods, using Airflow's scheduling processes. Airflow's ability to scan local directories or Git repositories will be used to import the relevant machine learning code from a second GitHub repository.
Similarly to building Airflow workflows, the machine learning code will also include using the MLflow API to allow for model tracking and storage. GitHub Actions is used as a CI/CD pipeline to automatically build, test, and deploy the machine learning model code in this repository, just as it is in the infrastructure repository.
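
A minimal sketch of what such tracking code could look like — the tracking URI, experiment name, hyperparameters, and registered model name are illustrative assumptions, not values from this project:

```python
import mlflow
import mlflow.tensorflow
import tensorflow as tf

# Hypothetical in-cluster tracking server and experiment name
mlflow.set_tracking_uri("http://mlflow-service:5000")
mlflow.set_experiment("skin-cancer-detection")

# Stand-in model; the real pipeline would train its own network
model = tf.keras.Sequential([tf.keras.layers.Dense(1, input_shape=(4,))])

with mlflow.start_run():
    mlflow.log_param("epochs", 10)       # example hyperparameter
    mlflow.log_metric("accuracy", 0.93)  # example evaluation metric
    # Log and register the model so it can later be served from the registry
    mlflow.tensorflow.log_model(
        model,
        artifact_path="model",
        registered_model_name="cnn-skin-cancer",  # assumed registry name
    )
```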

<!---
Monitoring and logging would be achieved using CloudWatch to monitor the health and performance of the EKS cluster and its components, such as worker nodes and Kubernetes pods, and the ELK stack or similar for logging of the system and applications. Networking would be handled by the AWS Elastic Load Balancing service or an Ingress controller to route traffic to the correct service/pod in the cluster.
-->
Model serving is done via Seldon, which allows for automatic scaling and integrates seamlessly with MLflow. Monitoring and logging are achieved using Prometheus and Grafana to monitor the health and performance of the EKS cluster and its components, such as worker nodes and Kubernetes pods, as well as to monitor the deployed models as applications.
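
On the model side, the serving application could expose custom metrics for Prometheus to scrape via the `prometheus_client` library. A hedged sketch — the metric names and port are assumptions:

```python
from prometheus_client import Counter, Histogram, start_http_server

# Hypothetical metrics for a model-serving application
PREDICTIONS_TOTAL = Counter(
    "model_predictions_total", "Number of predictions served"
)
PREDICTION_LATENCY = Histogram(
    "model_prediction_latency_seconds", "Prediction latency in seconds"
)

def predict_with_metrics(model, batch):
    """Wrap a model call so Prometheus can track volume and latency."""
    with PREDICTION_LATENCY.time():
        result = model.predict(batch)
    PREDICTIONS_TOTAL.inc()
    return result

if __name__ == "__main__":
    start_http_server(8000)  # assumed metrics port scraped by Prometheus
```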

Whereas the deployment of the infrastructure would be taken care of by MLOps, DevOps, and Data Engineers, the development of the Airflow workflows, including MLflow, would be handled by Data Scientists and ML Engineers.
16 changes: 16 additions & 0 deletions manuscript/08.3-Deployment-Infrastructure_Modules.Rmd
@@ -547,6 +547,20 @@ resource "helm_release" "jupyterhub" {
}
```


<!---

### Seldon

Seldon is used for model deployment.

Installed via a Helm chart.

--->

<!---

### Monitoring
@@ -557,5 +571,7 @@ Prometheus is employed for [specific purpose], while Grafana is used for [specif

#### Prometheus
#### Grafana

Grafana GitHub OAuth and Ingress configuration
--->

2 changes: 1 addition & 1 deletion manuscript/09.2-Deployment-Usage_Pipeline-Workflow.Rmd
@@ -1,6 +1,6 @@


## Pipeline Workflow
## Training Pipeline Workflow

The code and machine learning pipeline have been modularized into distinct steps, including preprocessing, model training, model comparison, and model serving. Airflow serves as the model workflow tool, generating DAGs for managing the pipeline. MLflow is integrated to facilitate model tracking, registry, and serving functionalities. To ensure portability and scalability, the codebase has been containerized using Docker, allowing it to be executed in Docker and/or Kubernetes environments.
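
A hedged sketch of how such a DAG might wire the steps together as Kubernetes pods — the DAG id, task names, images, and schedule are illustrative assumptions:

```python
from datetime import datetime

from airflow import DAG
from airflow.providers.cncf.kubernetes.operators.kubernetes_pod import (
    KubernetesPodOperator,
)

# Illustrative DAG: names, images, and schedule are assumptions
with DAG(
    dag_id="skin_cancer_training_pipeline",
    start_date=datetime(2023, 1, 1),
    schedule_interval=None,
    catchup=False,
) as dag:
    preprocess = KubernetesPodOperator(
        task_id="preprocessing",
        name="preprocessing",
        image="cnn-skin-cancer/preprocess:latest",  # hypothetical image
        cmds=["python", "preprocess.py"],
    )
    train = KubernetesPodOperator(
        task_id="model_training",
        name="model-training",
        image="cnn-skin-cancer/train:latest",  # hypothetical image
        cmds=["python", "train.py"],
    )
    compare = KubernetesPodOperator(
        task_id="model_comparison",
        name="model-comparison",
        image="cnn-skin-cancer/compare:latest",  # hypothetical image
        cmds=["python", "compare.py"],
    )

    # Run the containerized steps in sequence
    preprocess >> train >> compare
```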

@@ -1,5 +1,5 @@

## Building the Pipeline Steps
## Training Pipeline Steps

As mentioned previously, the machine learning pipeline for this particular use case comprises three primary stages: preprocessing, training, and serving. Furthermore, only the model that achieves the highest accuracy is chosen for deployment, which introduces an additional step for model comparison. Each of these steps will be further explained in the upcoming sections.
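
A hedged sketch of how the comparison step might query MLflow for the best run — the tracking URI, experiment name, and metric name are assumptions:

```python
import mlflow
from mlflow.tracking import MlflowClient

mlflow.set_tracking_uri("http://mlflow-service:5000")  # hypothetical address
client = MlflowClient()

experiment = client.get_experiment_by_name("skin-cancer-detection")  # assumed name
runs = client.search_runs(
    experiment_ids=[experiment.experiment_id],
    order_by=["metrics.accuracy DESC"],  # pick the run with the highest accuracy
    max_results=1,
)
best_run = runs[0]
print(f"Best run {best_run.info.run_id} with accuracy "
      f"{best_run.data.metrics['accuracy']:.4f}")
```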

82 changes: 6 additions & 76 deletions manuscript/09.4-Deployment-Usage_Model-Serving.Rmd
@@ -8,6 +8,11 @@ The concept involves running a Docker container that serves the pre-trained Ten

### Model Serving

Model serving is done via Seldon. A detailed explanation is TBD.
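
Until that explanation lands, a hedged sketch of how a client might call a model deployed with Seldon Core's v1 prediction protocol — the host, namespace, deployment name, and input shape are all assumptions:

```python
import requests

# Hypothetical in-cluster Seldon endpoint (v1 prediction protocol)
SELDON_URL = (
    "http://seldon-ingress/seldon/default/skin-cancer-model/api/v1.0/predictions"
)

payload = {"data": {"ndarray": [[0.1, 0.2, 0.3]]}}  # illustrative input shape
response = requests.post(SELDON_URL, json=payload, timeout=30)
print(response.json())
```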

<!---
### Model Serving

The model serving application is built on FastAPI, which includes various endpoints catering to our use case. The primary endpoint, `predict`, allows for multiple predictions to be made, while additional maintenance endpoints such as `info` or `health` provide relevant information about the app itself.

To initiate the prediction process, the model is first retrieved from the MLflow registry. The specific model to be fetched and the location of the MLflow server are specified through environment variables. Once the model is loaded, it is used to generate predictions based on the provided input data. The API call returns the predictions in the form of a Python list.
@@ -147,79 +152,4 @@ The following code snippet contains the actual logic for the `/predict` endpoin
raise HTTPException(status_code=400, detail="Invalid file format. Only JPG Files accepted.")

```

### Streamlit App

The Streamlit app offers a simple interface for performing inferences on the served model. The user interface enables users to upload a `jpg` image. Upon clicking the `Start Prediction` button, the image is sent to the model serving app, where a prediction is made. The prediction results are then returned as JSON, which can be downloaded upon request.

**Importing Dependencies**
This section imports the necessary dependencies for the code, including libraries for file handling, JSON processing, working with images, making HTTP requests, and creating the Streamlit application.

```python
# Imports necessary packages
import io
import json
import os

import requests
import streamlit as st
from PIL import Image

```

#### Setting Up the Streamlit Application {.unlisted .unnumbered}
First, the header and subheader for the Streamlit application are set. Afterward, the FastAPI serving IP and port are retrieved from environment variables. They are used to construct the FastAPI endpoint URL, to which a POST request is later sent.

```python
st.header("MLOps Engineering Project")
st.subheader("Skin Cancer Detection")

# FastAPI endpoint
FASTAPI_SERVING_IP = os.getenv("FASTAPI_SERVING_IP")
FASTAPI_SERVING_PORT = os.getenv("FASTAPI_SERVING_PORT")
FASTAPI_ENDPOINT = f"http://{FASTAPI_SERVING_IP}:{FASTAPI_SERVING_PORT}/predict"

```

#### Uploading test image {.unlisted .unnumbered}
The `st.file_uploader` allows the user to upload a test image in JPG format using the Streamlit file uploader widget. The type of the uploaded file is limited to `.jpg`. If a test image has been uploaded, the image is processed by opening it with PIL and creating a file-like object.


```python
test_image = st.file_uploader("", type=["jpg"], accept_multiple_files=False)

if test_image:
image = Image.open(test_image)
image_file = io.BytesIO(test_image.getvalue())
files = {"file": image_file}

```

#### Displaying the uploaded image and performing prediction {.unlisted .unnumbered}
A two-column layout is created in the Streamlit app that displays the uploaded image in the first column. In the second column, a button for the user to start the prediction process is displayed. When the button is clicked, a POST request with the uploaded image file is sent to the FastAPI endpoint. The prediction results are displayed as JSON and can be downloaded as a JSON file.

```python
col1, col2 = st.columns(2)

with col1:
# Display the uploaded image in the first column
st.image(test_image, caption="", use_column_width="always")

with col2:
if st.button("Start Prediction"):
with st.spinner("Prediction in Progress. Please Wait..."):
# Send a POST request to FastAPI for prediction
output = requests.post(FASTAPI_ENDPOINT, files=files, timeout=8000)
st.success("Success! Click the Download button below to retrieve prediction results (JSON format)")
# Display the prediction results in JSON format
st.json(output.json())
# Add a download button to download the prediction results as a JSON file
st.download_button(
label="Download",
data=json.dumps(output.json()),
file_name="cnn_skin_cancer_prediction_results.json",
)

```

--->
82 changes: 82 additions & 0 deletions manuscript/09.5-Deployment-Usage_Model-Inferencing.Rmd
@@ -0,0 +1,82 @@

## Model Serving & Inferencing

The process of serving the model and making inferences uses Docker containers running within Kubernetes pods.

The concept involves running a Docker container that serves the pre-trained TensorFlow model using FastAPI. This containerized model is responsible for providing predictions and responses to incoming requests. Additionally, a Streamlit app is used to interact with the served model, enabling users to make inferences by sending input data to the model and receiving the corresponding predictions.
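
A hedged sketch of such a serving app — the model URI, environment variable names, input size, and endpoint set follow the description above, but the exact values are assumptions:

```python
import io
import os

import mlflow.pyfunc
import numpy as np
from fastapi import FastAPI, File, HTTPException, UploadFile
from PIL import Image

app = FastAPI()

# Model location is assumed to be injected via environment variables
MODEL_URI = os.getenv("MLFLOW_MODEL_URI", "models:/cnn-skin-cancer/Production")
model = mlflow.pyfunc.load_model(MODEL_URI)

@app.get("/health")
def health():
    return {"status": "ok"}

@app.post("/predict")
async def predict(file: UploadFile = File(...)):
    if not file.filename.lower().endswith(".jpg"):
        raise HTTPException(status_code=400, detail="Only JPG files accepted.")
    # Decode, resize, and scale the image before handing it to the model
    image = Image.open(io.BytesIO(await file.read())).resize((224, 224))
    batch = np.expand_dims(np.asarray(image) / 255.0, axis=0)
    prediction = model.predict(batch)
    return {"prediction": np.asarray(prediction).tolist()}
```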

### Streamlit App

The Streamlit app offers a simple interface for performing inferences on the served model. The user interface enables users to upload a `jpg` image. Upon clicking the `Start Prediction` button, the image is sent to the model serving app, where a prediction is made. The prediction results are then returned as JSON, which can be downloaded upon request.

**Importing Dependencies**
This section imports the necessary dependencies for the code, including libraries for file handling, JSON processing, working with images, making HTTP requests, and creating the Streamlit application.

```python
# Imports necessary packages
import io
import json
import os

import requests
import streamlit as st
from PIL import Image

```

#### Setting Up the Streamlit Application {.unlisted .unnumbered}
First, the header and subheader for the Streamlit application are set. Afterward, the FastAPI serving IP and port are retrieved from environment variables. They are used to construct the FastAPI endpoint URL, to which a POST request is later sent.

```python
st.header("MLOps Engineering Project")
st.subheader("Skin Cancer Detection")

# FastAPI endpoint
FASTAPI_SERVING_IP = os.getenv("FASTAPI_SERVING_IP")
FASTAPI_SERVING_PORT = os.getenv("FASTAPI_SERVING_PORT")
FASTAPI_ENDPOINT = f"http://{FASTAPI_SERVING_IP}:{FASTAPI_SERVING_PORT}/predict"

```

#### Uploading test image {.unlisted .unnumbered}
The `st.file_uploader` allows the user to upload a test image in JPG format using the Streamlit file uploader widget. The type of the uploaded file is limited to `.jpg`. If a test image has been uploaded, the image is processed by opening it with PIL and creating a file-like object.


```python
test_image = st.file_uploader("", type=["jpg"], accept_multiple_files=False)

if test_image:
image = Image.open(test_image)
image_file = io.BytesIO(test_image.getvalue())
files = {"file": image_file}

```

#### Displaying the uploaded image and performing prediction {.unlisted .unnumbered}
A two-column layout is created in the Streamlit app that displays the uploaded image in the first column. In the second column, a button for the user to start the prediction process is displayed. When the button is clicked, a POST request with the uploaded image file is sent to the FastAPI endpoint. The prediction results are displayed as JSON and can be downloaded as a JSON file.

```python
col1, col2 = st.columns(2)

with col1:
# Display the uploaded image in the first column
st.image(test_image, caption="", use_column_width="always")

with col2:
if st.button("Start Prediction"):
with st.spinner("Prediction in Progress. Please Wait..."):
# Send a POST request to FastAPI for prediction
output = requests.post(FASTAPI_ENDPOINT, files=files, timeout=8000)
st.success("Success! Click the Download button below to retrieve prediction results (JSON format)")
# Display the prediction results in JSON format
st.json(output.json())
# Add a download button to download the prediction results as a JSON file
st.download_button(
label="Download",
data=json.dumps(output.json()),
file_name="cnn_skin_cancer_prediction_results.json",
)

```

3 changes: 2 additions & 1 deletion manuscript/_bookdown.yml
@@ -43,8 +43,9 @@ rmd_files:
"09-Deployment-Usage_Overview.Rmd",
"09.1-Deployment-Usage_IDE.Rmd",
"09.2-Deployment-Usage_Pipeline-Workflow.Rmd",
"09.3-Deployment-Usage_Building-Model-Pipeline.Rmd",
"09.3-Deployment-Usage_Training-Model-Pipeline.Rmd",
"09.4-Deployment-Usage_Model-Serving.Rmd",
"09.5-Deployment-Usage_Model-Inferencing.Rmd",
"10-Acknowledgements.Rmd"
]
