The goal of this project is to provide plug-and-play, containerised deployment and serving of trained Machine Learning models.
It uses (as one possible choice among many) the ONNX format for models, FastAPI for handling requests, Redis for keeping current data, and Docker for keeping everything nice and tight. For now the toy model is a CNN MNIST classifier, but the setup can easily be extended to other models, regressors, or other model types.
The project contains three independent Docker containers:

- `web_api` - the front-end container (CLI or Streamlit UI) of the project, exposed to the outside world for users' requests/input. Its main role is to pass a request to the proper, defined model waiting within `model_api` and return the results. Take a look into `README.md` in `web_api` for more details.
- `model_api` - responsible for model initialization (using an ONNX Runtime session). It communicates only with `web_api` for inputs and results, and with `db_api` to store current results in the Redis DB and persist them. Take a look into `README.md` in `model_api` for more details.
- `db_api` - the database of the project, responsible for keeping current (or all) model results. If the same input is requested again for the same model, `db_api` returns its previously stored value instead of running inference on the input again. It is basically a wrapper around the Redis image, configured for the current setup within the network bridge.

These containers can communicate only within the specified Docker network bridge `stack_api`, exposing only one port, on `web_api`, for communication with the outside world.
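The lookup-before-inference behaviour of `db_api` can be sketched roughly as follows. This is a minimal, hypothetical Python sketch: the key derivation via `hashlib` mirrors the hexdigest idea described in this README, and a plain dict stands in for the real Redis client (which offers the same dict-like `get`/set shape).

```python
import hashlib
import json

def cache_key(model_name: str, payload: bytes) -> str:
    # Identical (model, input) pairs map to the same key, so a repeated
    # request can be answered from the store instead of re-running inference.
    return hashlib.sha256(model_name.encode() + payload).hexdigest()

def cached_inference(store, model_name, payload, infer):
    """Return a cached result if present, otherwise infer and cache.

    `store` is anything with dict-like get/set semantics; in the real
    project it would be the Redis client living inside db_api.
    """
    key = cache_key(model_name, payload)
    cached = store.get(key)
    if cached is not None:
        return json.loads(cached), True    # cache hit: no inference run
    result = infer(payload)
    store[key] = json.dumps(result)        # persist for the next request
    return result, False

# Example with an in-memory dict standing in for Redis:
calls = []
def fake_infer(payload):
    calls.append(payload)
    return {"label": 7, "confidence": 0.99}

store = {}
first, hit1 = cached_inference(store, "mnist", b"image-bytes", fake_infer)
second, hit2 = cached_inference(store, "mnist", b"image-bytes", fake_infer)
# first == second, only the second call is a cache hit,
# and fake_infer has run exactly once
```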
- Install docker, docker-compose etc. if necessary.
- Create the network bridge for the containers:

  ```
  docker network create stack_api
  ```

- Run:

  ```
  chmod a+x build_all.sh start_all.sh check_health.sh stop_all.sh
  ```

  to make the `sh` files executable.
- Build the images:

  ```
  ./build_all.sh
  ```

- Start the containers:

  ```
  ./start_all.sh APP_TYPE MODEL_NAME
  ```

  where the `APP_TYPE` and `MODEL_NAME` variables are defined below. Basically it starts:
  - `model_api` - container with `MODEL_NAME` ready for inference in a GPU (or CPU) ONNX session.
  - `db_api` - `redis` container which saves inference results keyed by a hexdigest of the `APP_TYPE` and the selected image name, so that if we run inference twice on the same image, only one prediction is computed and the other is taken from Redis, saving time.
  - `web_api` - `fast_api` (if `APP_TYPE=cli`) or `streamlit` (if `APP_TYPE=ui`) container for posting inference requests and storing/getting them to/from the DB. Runs on port `5000` locally. If:
    - `APP_TYPE=ui` is selected, you can go to `localhost:5000` to use the UI.
    - `APP_TYPE=cli` is selected, post requests via the CLI (for example: `python web_api/tests/test_{MODEL_NAME}_cli_request.py {IMAGE_PATH}.png`).
- You can check whether the containers are running properly with:

  ```
  ./check_health.sh
  ```

  or peek into them with `docker logs CONTAINER_NAME`.
- To test the containers separately, just run `python tests/{TEST_NAME}.py` within the container dirs.
- To stop the containers, just run:

  ```
  ./stop_all.sh
  ```
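A CLI request like the `test_{MODEL_NAME}_cli_request.py` scripts mentioned above boils down to an HTTP POST against `web_api` on port 5000. Below is a stand-alone standard-library sketch; note that the `/predict/{model}` route and the JSON payload shape are assumptions for illustration, not the project's actual API (check `web_api`'s `README.md` for the real endpoint).

```python
import base64
import json
import urllib.request

def build_predict_request(image_bytes: bytes, model_name: str = "mnist"):
    # Hypothetical route and payload shape -- the real field names live
    # in web_api; only the overall POST-to-port-5000 flow is from the README.
    payload = json.dumps({
        "model": model_name,
        "image": base64.b64encode(image_bytes).decode("ascii"),
    }).encode("utf-8")
    return urllib.request.Request(
        url=f"http://localhost:5000/predict/{model_name}",
        data=payload,
        headers={"Content-Type": "application/json"},
        method="POST",
    )

req = build_predict_request(b"\x89PNG...fake-image-bytes", "mnist")
# Actually sending it only works while the containers are up:
#   with urllib.request.urlopen(req) as resp:
#       print(json.load(resp))
```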
- `MODEL_NAME` is for now available in two options: `mnist` and `leukemia`; if a different name is specified the container won't start.
- `APP_TYPE` is for now available in two options: `ui` (Streamlit) and `cli` (usual cmd); if a different name is specified the container won't start.
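The "container won't start" rule above amounts to a simple guard on the two arguments. A hypothetical Python version of that check (the real one may live in `start_all.sh` or the containers' entrypoints):

```python
ALLOWED_APP_TYPES = {"ui", "cli"}            # streamlit vs. plain cmd
ALLOWED_MODEL_NAMES = {"mnist", "leukemia"}

def validate_args(app_type: str, model_name: str) -> None:
    """Refuse to start on unknown values, mirroring the rule above."""
    if app_type not in ALLOWED_APP_TYPES:
        raise ValueError(
            f"unknown APP_TYPE {app_type!r}; use one of {sorted(ALLOWED_APP_TYPES)}")
    if model_name not in ALLOWED_MODEL_NAMES:
        raise ValueError(
            f"unknown MODEL_NAME {model_name!r}; use one of {sorted(ALLOWED_MODEL_NAMES)}")

validate_args("ui", "mnist")      # OK, returns None
# validate_args("web", "mnist")   # would raise ValueError
```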
If, while starting, you get an error from `model_api` like `[...]CUDA failure 999: unknown error[...]`, do the following:

```
sudo rmmod nvidia_uvm
sudo rmmod nvidia
sudo modprobe nvidia
sudo modprobe nvidia_uvm
```

and try starting the containers again.
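The GPU-or-CPU session mentioned earlier (and the reason CUDA errors can surface from `model_api` at all) comes down to ONNX Runtime execution providers. The sketch below shows a common fallback policy; the provider names are real ONNX Runtime identifiers, but this exact policy is an assumption, not necessarily what `model_api` does.

```python
def pick_providers(available):
    """Prefer CUDA when present, otherwise fall back to CPU.

    `available` would come from onnxruntime.get_available_providers();
    the chosen list is then passed to
    onnxruntime.InferenceSession(model_path, providers=chosen).
    """
    preferred = ["CUDAExecutionProvider", "CPUExecutionProvider"]
    chosen = [p for p in preferred if p in available]
    return chosen or ["CPUExecutionProvider"]

# On a CUDA-capable machine:
gpu_box = pick_providers(["CUDAExecutionProvider", "CPUExecutionProvider"])
# On a CPU-only machine:
cpu_box = pick_providers(["CPUExecutionProvider"])
```

Keeping `CPUExecutionProvider` last in the preference list means a broken CUDA setup degrades to slow-but-working inference instead of a hard failure.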