-
Notifications
You must be signed in to change notification settings - Fork 867
Commit
This commit does not belong to any branch on this repository, and may belong to a fork outside of the repository.
Example showing how to deploy MNIST using kserve with minikube
- Loading branch information
Showing
1 changed file
with
163 additions
and
0 deletions.
There are no files selected for viewing
This file contains bidirectional Unicode text that may be interpreted or compiled differently than what appears below. To review, open the file in an editor that reveals hidden Unicode characters.
Learn more about bidirectional Unicode characters
Original file line number | Diff line number | Diff line change |
---|---|---|
@@ -0,0 +1,163 @@ | ||
# Digit recognition model with MNIST dataset using a Kubernetes cluster | ||
|
||
In this example, we show how to use a pre-trained custom MNIST model to perform real time Digit recognition with TorchServe. | ||
We will be serving the model using Kserve deployed using [minikube](https://minikube.sigs.k8s.io/docs/start/). | ||
|
||
The inference service would return the digit inferred by the model in the input image. | ||
|
||
|
||
## Install kserve | ||
|
||
Start minikube cluster | ||
|
||
``` | ||
minikube start | ||
``` | ||
|
||
Run the commands given in following steps from the parent directory of the root of the repository. For example, if you cloned the repository into /home/my_path/kserve, run the steps from /home/my_path/kserve | ||
|
||
Run the following for quick install of kserve | ||
``` | ||
./hack/quick_install.sh | ||
``` | ||
|
||
Make sure kserve is installed on minikube cluster using | ||
|
||
``` | ||
kubectl get pods -n kserve | ||
``` | ||
|
||
This should result in | ||
``` | ||
NAME READY STATUS RESTARTS AGE | ||
kserve-controller-manager-57574b4878-rnsjn 2/2 Running 0 17s | ||
``` | ||
|
||
TorchServe supports KServe V1 and V2 protocol. We show how to deploy with both for Mnist. | ||
|
||
## KServe V1 protocol | ||
|
||
Deploy `InferenceService` with Kserve V1 protocol | ||
|
||
``` | ||
kubectl apply -f docs/samples/v1beta1/torchserve/v1/torchserve.yaml | ||
``` | ||
|
||
results in | ||
|
||
``` | ||
inferenceservice.serving.kserve.io/torchserve created | ||
``` | ||
|
||
We need to wait till the pod is up | ||
|
||
``` | ||
kubectl get pods | ||
NAME READY STATUS RESTARTS AGE | ||
torchserve-predictor-00001-deployment-8d66f9c-dkdhr 2/2 Running 0 8m19s | ||
``` | ||
|
||
We need to set the following | ||
|
||
``` | ||
MODEL_NAME=mnist | ||
SERVICE_HOSTNAME=$(kubectl get inferenceservice torchserve -o jsonpath='{.status.url}' | cut -d "/" -f 3) | ||
``` | ||
|
||
``` | ||
export INGRESS_HOST=localhost | ||
export INGRESS_PORT=8080 | ||
``` | ||
|
||
``` | ||
INGRESS_GATEWAY_SERVICE=$(kubectl get svc --namespace istio-system --selector="app=istio-ingressgateway" --output jsonpath='{.items[0].metadata.name}') | ||
kubectl port-forward --namespace istio-system svc/${INGRESS_GATEWAY_SERVICE} 8080:80 & | ||
``` | ||
|
||
Make an inference request | ||
|
||
``` | ||
curl -H "Content-Type: application/json" -H "Host: ${SERVICE_HOSTNAME}" http://${INGRESS_HOST}:${INGRESS_PORT}/v1/models/${MODEL_NAME}:predict -d @./docs/samples/v1beta1/torchserve/v1/mnist.json | ||
``` | ||
|
||
Expected output is | ||
|
||
``` | ||
{"predictions":[2]} | ||
``` | ||
|
||
## KServe V2 protocol | ||
|
||
Deploy `InferenceService` with Kserve V2 protocol | ||
|
||
``` | ||
kubectl apply -f docs/samples/v1beta1/torchserve/v2/mnist.yaml | ||
``` | ||
|
||
results in | ||
|
||
``` | ||
inferenceservice.serving.kserve.io/torchserve-mnist-v2 created | ||
``` | ||
|
||
We need to check the pod is running with | ||
|
||
``` | ||
kubectl get pods | ||
NAME READY STATUS RESTARTS AGE | ||
torchserve-mnist-v2-predictor-00001-deployment-6c8c684dcb-4mfmr 2/2 Running 0 2m37s | ||
``` | ||
|
||
Inspecting the logs of the pods to check the version of TorchServe | ||
|
||
``` | ||
kubectl logs torchserve-mnist-v2-predictor-00001-deployment-6c8c684dcb-4mfmr | ||
Defaulted container "kserve-container" out of: kserve-container, queue-proxy, storage-initializer (init) | ||
WARNING: sun.reflect.Reflection.getCallerClass is not supported. This will impact performance. | ||
2023-10-12T20:50:39,466 [WARN ] main org.pytorch.serve.util.ConfigManager - Your torchserve instance can access any URL to load models. When deploying to production, make sure to limit the set of allowed_urls in config.properties | ||
2023-10-12T20:50:39,468 [INFO ] main org.pytorch.serve.servingsdk.impl.PluginsManager - Initializing plugins manager... | ||
2023-10-12T20:50:39,659 [INFO ] main org.pytorch.serve.metrics.configuration.MetricConfiguration - Successfully loaded metrics configuration from /home/venv/lib/python3.9/site-packages/ts/configs/metrics.yaml | ||
2023-10-12T20:50:39,779 [INFO ] main org.pytorch.serve.ModelServer - | ||
Torchserve version: 0.8.2 | ||
TS Home: /home/venv/lib/python3.9/site-packages | ||
Current directory: /home/model-server | ||
Temp directory: /home/model-server/tmp | ||
Metrics config path: /home/venv/lib/python3.9/site-packages/ts/configs/metrics.yaml | ||
``` | ||
|
||
We need to set the following | ||
|
||
``` | ||
MODEL_NAME=mnist | ||
SERVICE_HOSTNAME=$(kubectl get inferenceservice torchserve-mnist-v2 -o jsonpath='{.status.url}' | cut -d "/" -f 3) | ||
``` | ||
|
||
``` | ||
export INGRESS_HOST=localhost | ||
export INGRESS_PORT=8080 | ||
``` | ||
|
||
``` | ||
INGRESS_GATEWAY_SERVICE=$(kubectl get svc --namespace istio-system --selector="app=istio-ingressgateway" --output jsonpath='{.items[0].metadata.name}') | ||
kubectl port-forward --namespace istio-system svc/${INGRESS_GATEWAY_SERVICE} 8080:80 & | ||
``` | ||
|
||
Make an inference request with tensor input | ||
|
||
``` | ||
curl -v -H "Content-Type: application/json" -H "Host: ${SERVICE_HOSTNAME}" http://${INGRESS_HOST}:${INGRESS_PORT}/v2/models/${MODEL_NAME}/infer -d @./docs/samples/v1beta1/torchserve/v2/tensor_conv/mnist_v2.json | ||
``` | ||
|
||
Expected output is | ||
|
||
``` | ||
{"model_name":"mnist","model_version":null,"id":"d3b15cad-50a2-4eaf-80ce-8b0a428bd298","parameters":null,"outputs":[{"name":"input-0","shape":[1],"datatype":"INT64","parameters":null,"data":[1]}]} | ||
``` | ||
|
||
## Stop and Delete the cluster | ||
|
||
``` | ||
minikube stop | ||
minikube delete | ||
``` |