Machine Learning/LiftWing/ML-Sandbox/Configuration
Installation + Configuration script for ML-Sandbox.
Summary
This is a guide for installing the KServe stack locally using WMF tools and images. The install steps diverge from the official KServe quick_install script in order to run on WMF infrastructure. All upstream changes to YAML configs were first published in the KServe chart’s README for the deployment-charts repository. In deployment-charts/custom_deploy.d/istio/ml-serve there is the config.yaml that we apply in production.
Minikube
We are running a small cluster using Minikube, which can be installed with the following command:
curl -LO https://storage.googleapis.com/minikube/releases/latest/minikube-linux-amd64
sudo install minikube-linux-amd64 /usr/local/bin/minikube
To match production, we want to make sure we set our k8s version to v1.16.15:
# if needed
minikube stop
minikube delete
# start minikube
minikube config set memory 24576
minikube config set cpus 4
minikube start --kubernetes-version=v1.16.15
If you see an issue related to something like HOST_LOCK_JUJU, you can do the following hack:
sudo chown root:root /tmp/juju-mk*
sudo sysctl fs.protected_regular=0
You will also need to install kubectl, or you can use the one provided by minikube with an alias:
alias kubectl="minikube kubectl --"
Helm
First, install helm3. It is available in the WMF APT repository (https://wikitech.wikimedia.org/wiki/APT_repository) for Debian buster; see https://apt-browser.toolforge.org/buster-wikimedia/main/:
sudo apt install helm
Also ensure that it is helm3:
helm version
version.BuildInfo{Version:"v3.7.1", GitCommit:"1d11fcb5d3f3bf00dbe6fe31b8412839a96b3dc4", GitTreeState:"clean", GoVersion:"go1.16.9"}
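The version check above can also be scripted. The following is a minimal sketch, not part of the original setup: the function names helm_major_version and require_helm3 are hypothetical helpers, and they only parse the text that helm version prints.

```shell
# helm_major_version TEXT: print the major version number found in the
# output of `helm version` (passed in as the first argument).
# Hypothetical helper, not part of helm itself.
helm_major_version() {
  # Take the first vMAJOR.MINOR.PATCH token and keep only MAJOR.
  printf '%s\n' "$1" \
    | grep -oE 'v[0-9]+\.[0-9]+\.[0-9]+' \
    | head -n 1 \
    | sed -E 's/^v([0-9]+)\..*/\1/'
}

# require_helm3 TEXT: succeed only when the reported major version is 3.
require_helm3() {
  [ "$(helm_major_version "$1")" = "3" ]
}
```

A possible use is: require_helm3 "$(helm version)" || echo "helm 3 is required" >&2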
Now download the deployment-charts repo and use the templates to create “dev” charts:
Note: the charts contain some resources (such as NetworkPolicy instances) that may not be supported out of the box. The helm template approach outlined here also does not take into account any values.yaml files.
########################################
# Create dev charts via helm template
########################################
git clone "https://gerrit.wikimedia.org/r/operations/deployment-charts"
cd deployment-charts
helm template "charts/knative-serving" > dev-knative-serving.yaml
helm template "charts/kserve" > dev-kserve.yaml
There will be a number of references to “RELEASE-NAME” in the new yaml files, so we will need to replace it with a name like “dev”:
# replace all references to "RELEASE-NAME" with "dev"
sed -i 's/RELEASE-NAME/dev/g' dev-knative-serving.yaml
sed -i 's/RELEASE-NAME/dev/g' dev-kserve.yaml
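Since a leftover placeholder would end up in resource names, it can help to confirm that the substitution caught everything. The sketch below assumes GNU sed (for sed -i, as on Debian); count_placeholders and rename_release are hypothetical helper names introduced only for illustration.

```shell
# count_placeholders FILE: print how many lines still contain RELEASE-NAME.
count_placeholders() {
  # grep -c exits non-zero when the count is 0, so ignore its status.
  grep -c 'RELEASE-NAME' "$1" || true
}

# rename_release FILE [NAME]: replace RELEASE-NAME with NAME (default "dev")
# in place, and fail if any placeholder is still present afterwards.
rename_release() {
  sed -i "s/RELEASE-NAME/${2:-dev}/g" "$1"
  ! grep -q 'RELEASE-NAME' "$1"
}
```

For example, rename_release dev-kserve.yaml dev followed by count_placeholders dev-kserve.yaml should report no remaining placeholder lines.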
Istio
Istio is installed using the istioctl package, which is available in the WMF APT repository for Debian buster (https://wikitech.wikimedia.org/wiki/APT_repository). See: https://apt-browser.toolforge.org/buster-wikimedia/main/. We want to install Istio 1.9.5 (istioctl: 1.9.5-1).
For Wikimedia servers and Cloud VPS instances, the repositories are automatically configured via Puppet. You can install it as follows:
sudo apt install istioctl -y
Now we need to create the istio-system namespace:
#######################
# Istio Installation
#######################
cat <<EOF | kubectl apply -f -
apiVersion: v1
kind: Namespace
metadata:
  name: istio-system
  labels:
    istio-injection: disabled
EOF
Next you will need to create a file called istio-minimal-operator.yaml:
apiVersion: install.istio.io/v1beta1
kind: IstioOperator
spec:
  values:
    global:
      proxy:
        autoInject: disabled
      useMCP: false
      # The third-party-jwt is not enabled on all k8s.
      # See: https://istio.io/docs/ops/best-practices/security/#configure-third-party-service-account-tokens
      jwtPolicy: first-party-jwt
  meshConfig:
    accessLogFile: /dev/stdout
  addonComponents:
    pilot:
      enabled: true
  components:
    ingressGateways:
      - name: istio-ingressgateway
        enabled: true
      - name: cluster-local-gateway
        enabled: true
        label:
          istio: cluster-local-gateway
          app: cluster-local-gateway
        k8s:
          service:
            type: ClusterIP
            ports:
              - port: 15020
                targetPort: 15021
                name: status-port
              - port: 80
                name: http2
                targetPort: 8080
              - port: 443
                name: https
                targetPort: 8443
Next you can apply the manifest using istioctl:
/usr/bin/istioctl-1.9.5 manifest apply -f ../istio-minimal-operator.yaml -y
Knative
We are currently running Knative Serving v0.18.1.
First, let’s create a namespace for knative-serving:
#########################
# Knative Installation
#########################
cat <<EOF | kubectl apply -f -
apiVersion: v1
kind: Namespace
metadata:
  name: knative-serving
  labels:
    serving.knative.dev/release: "v0.18.1"
EOF
Now let’s install the Knative serving-crds.yaml. The CRDs are copied from upstream: https://github.com/knative/serving/releases/download/v0.18.1/serving-crds.yaml
We have them included in our deployment-charts repo: https://gerrit.wikimedia.org/r/plugins/gitiles/operations/deployment-charts/+/refs/heads/master/charts/knative-serving-crds/templates/crds.yaml
You can install using the following command (in the deployment-charts repo):
kubectl apply -f charts/knative-serving-crds/templates/crds.yaml
We can now apply the Knative “dev” chart that we generated using helm:
kubectl apply -f dev-knative-serving.yaml
Next, we need to update the config-deployment ConfigMap so that Knative skips tag resolving for the listed registries:
# update config-deployment to skip tag resolving
cat <<EOF | kubectl apply -f -
apiVersion: v1
kind: ConfigMap
metadata:
  name: config-deployment
  namespace: knative-serving
data:
  queueSidecarImage: docker-registry.wikimedia.org/knative-serving-queue:0.18.1-4
  registriesSkippingTagResolving: "kind.local,ko.local,dev.local,docker-registry.wikimedia.org,index.docker.io"
EOF
Images
- Webhook: https://docker-registry.wikimedia.org/knative-serving-webhook/tags/
- Queue: https://docker-registry.wikimedia.org/knative-serving-queue/tags/
- Controller: https://docker-registry.wikimedia.org/knative-serving-controller/tags/
- Autoscaler: https://docker-registry.wikimedia.org/knative-serving-autoscaler/tags/
- Activator: https://docker-registry.wikimedia.org/knative-serving-activator/tags/
- Net-istio webhook: https://docker-registry.wikimedia.org/knative-net-istio-webhook/tags/
- Net-istio controller: https://docker-registry.wikimedia.org/knative-net-istio-controller/tags/
KServe
Let’s create the namespace kserve:
########################
# KServe Installation
########################
cat <<EOF | kubectl apply -f -
apiVersion: v1
kind: Namespace
metadata:
  labels:
    control-plane: kserve-controller-manager
    controller-tools.k8s.io: "1.0"
    istio-injection: disabled
  name: kserve
EOF
Now we can install the “dev” chart we created with helm template:
kubectl apply -f dev-kserve.yaml
This should install everything we need to run KServe. However, we still need to deal with the TLS certificate. We will use the self-signed-ca hack outlined in the kserve repo: https://github.com/kserve/kserve/blob/master/hack/self-signed-ca.sh
First, delete the existing secrets:
# delete existing certs
kubectl delete secret kserve-webhook-server-cert -n kserve
kubectl delete secret kserve-webhook-server-secret -n kserve
Now copy that script and execute it:
curl -LJ0 https://raw.githubusercontent.com/kserve/kserve/master/hack/self-signed-ca.sh > self-signed-ca.sh
chmod +x self-signed-ca.sh
./self-signed-ca.sh
Verify that you now have a new webhook-server-cert:
kubectl get secrets -n kserve
NAME                         TYPE                                  DATA   AGE
default-token-ccsk4          kubernetes.io/service-account-token   3      5d1h
kserve-webhook-server-cert   Opaque                                2      30s
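If you prefer to script this verification, a listing like the one above can be checked for the expected secret name. This is only a sketch: has_secret is a hypothetical helper that inspects the text of kubectl get secrets, it does not talk to the cluster itself.

```shell
# has_secret LISTING NAME: succeed if NAME appears in the first column of
# the text printed by `kubectl get secrets`. Hypothetical helper.
has_secret() {
  printf '%s\n' "$1" \
    | awk -v name="$2" '$1 == name { found = 1 } END { exit !found }'
}
```

For example: has_secret "$(kubectl get secrets -n kserve)" kserve-webhook-server-cert || echo "webhook cert is missing" >&2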
Lastly, let’s set up a namespace to deploy our inference services to:
kubectl create namespace kserve-test
Images
- KServe agent: https://docker-registry.wikimedia.org/kserve-agent/tags/
- KServe controller: https://docker-registry.wikimedia.org/kserve-controller/tags/
- KServe storage-initializer: https://docker-registry.wikimedia.org/kserve-storage-initializer/tags/
Minio
This is an optional step for using minio for model storage in your development cluster. In production, we use Thanos Swift to store our model binaries; for local development, we can use something more ad hoc.
This will mostly follow the document here: https://github.com/kserve/website/blob/main/docs/modelserving/kafka/kafka.md
First we create a file called minio.yaml, with the following contents:
---
apiVersion: apps/v1
kind: Deployment
metadata:
  labels:
    app: minio
  name: minio
  namespace: kserve-test
spec:
  progressDeadlineSeconds: 600
  replicas: 1
  revisionHistoryLimit: 10
  selector:
    matchLabels:
      app: minio
  strategy:
    type: Recreate
  template:
    metadata:
      labels:
        app: minio
    spec:
      containers:
        - args:
            - server
            - /data
          env:
            - name: MINIO_ACCESS_KEY
              value: minio
            - name: MINIO_SECRET_KEY
              value: minio123
          image: minio/minio:RELEASE.2020-10-18T21-54-12Z
          imagePullPolicy: IfNotPresent
          name: minio
          ports:
            - containerPort: 9000
              protocol: TCP
---
apiVersion: v1
kind: Service
metadata:
  labels:
    app: minio
  name: minio-service
spec:
  ports:
    - port: 9000
      protocol: TCP
      targetPort: 9000
  selector:
    app: minio
  type: ClusterIP
Next, you can install the minio test instance to your cluster:
kubectl apply -f minio.yaml -n kserve-test
Now we need to install the Minio client (mc):
curl -LJ0 https://dl.min.io/client/mc/release/linux-amd64/mc > mc
chmod +x mc
./mc --help
Now we need to port-forward our minio test app in a different terminal window:
# Run port forwarding command in a different terminal
kubectl port-forward $(kubectl get pod -n kserve-test --selector="app=minio" --output jsonpath='{.items[0].metadata.name}') 9000:9000 -n kserve-test
Now let’s add our test instance and create a bucket for model storage:
./mc config host add myminio http://127.0.0.1:9000 minio minio123
./mc mb myminio/wmf-ml-models
Now we need to create an s3 secret for minio and attach it to a service account. Save the following as s3-secret.yaml:
apiVersion: v1
kind: Secret
metadata:
  name: storage-secret
  annotations:
    serving.kserve.io/s3-endpoint: minio-service.kserve-test:9000 # replace with your s3 endpoint
    serving.kserve.io/s3-usehttps: "0" # by default 1, for testing with minio you need to set to 0
    serving.kserve.io/s3-verifyssl: "0"
    serving.kserve.io/s3-region: us-east-1
type: Opaque
stringData:
  AWS_ACCESS_KEY_ID: minio
  AWS_SECRET_ACCESS_KEY: minio123
---
apiVersion: v1
kind: ServiceAccount
metadata:
  name: sa
secrets:
  - name: storage-secret
---
We can apply it as follows:
kubectl apply -f s3-secret.yaml -n kserve-test
You should be able to upload a model binary file as follows:
./mc cp model.bin myminio/wmf-ml-models/
You can use the model_upload.sh script to handle model uploads to minio. First you need to create an s3cmd config file called ~/.s3cfg:
# Setup endpoint
host_base = 127.0.0.1:9000
host_bucket = 127.0.0.1:9000
bucket_location = us-east-1
use_https = False

# Setup access keys
access_key = minio
secret_key = minio123

# Enable S3 v4 signature APIs
signature_v2 = False
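If you rebuild the sandbox often, the config file can be generated instead of typed by hand. The sketch below writes the same settings shown above; write_s3cfg is a hypothetical helper, and its defaults (the port-forwarded endpoint and the minio test credentials) are taken from this guide rather than from s3cmd.

```shell
# write_s3cfg PATH [ENDPOINT] [ACCESS_KEY] [SECRET_KEY]
# Write an s3cmd config pointing at the port-forwarded minio instance.
# Hypothetical helper; defaults mirror the values used in this guide.
write_s3cfg() {
  s3cfg_path="$1"
  s3cfg_endpoint="${2:-127.0.0.1:9000}"
  s3cfg_access="${3:-minio}"
  s3cfg_secret="${4:-minio123}"
  cat > "$s3cfg_path" <<EOF
# Setup endpoint
host_base = ${s3cfg_endpoint}
host_bucket = ${s3cfg_endpoint}
bucket_location = us-east-1
use_https = False

# Setup access keys
access_key = ${s3cfg_access}
secret_key = ${s3cfg_secret}

# Enable S3 v4 signature APIs
signature_v2 = False
EOF
}
```

Calling write_s3cfg ~/.s3cfg with no further arguments produces the file for the local minio test instance.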
Now you can download the model_upload script and use it on the ml-sandbox:
curl -LJ0 https://gitlab.wikimedia.org/accraze/ml-utils/-/raw/main/model_upload.sh > model_upload.sh
chmod +x model_upload.sh
./model_upload.sh model.bin articlequality enwiki wmf-ml-models ~/.s3cfg
Finally, when you create an InferenceService, you can point it at the new minio bucket (s3://wmf-ml-models). Just make sure to set the serviceAccountName “sa” on the predictor whose container has a storage URI.
Example Inference Service spec:
apiVersion: serving.kserve.io/v1beta1
kind: InferenceService
metadata:
  name: enwiki-goodfaith
  annotations:
    sidecar.istio.io/inject: "false"
spec:
  predictor:
    serviceAccountName: sa
    containers:
      - name: kfserving-container
        image: docker-registry.wikimedia.org/wikimedia/machinelearning-liftwing-inference-services-editquality:2021-07-28-204847-production
        env:
          # TODO: https://phabricator.wikimedia.org/T284091
          - name: STORAGE_URI
            value: "s3://wmf-ml-models/"
          - name: INFERENCE_NAME
            value: "enwiki-goodfaith"
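Once the service is ready, it can be queried over HTTP. The sketch below only builds the request URL, using the /v1/models/<name>:predict path of the KServe V1 inference protocol. The helper name predict_url is hypothetical, and it is an assumption here that the model is registered under the INFERENCE_NAME value and that the Istio ingress gateway is reachable on the host and port you pass in.

```shell
# predict_url HOST PORT MODEL: print a KServe V1 prediction URL.
# Hypothetical helper; HOST and PORT are wherever the ingress gateway is
# reachable from your machine, MODEL is the served model name.
predict_url() {
  printf 'http://%s:%s/v1/models/%s:predict\n' "$1" "$2" "$3"
}
```

One way to use it, assuming the gateway has been forwarded with kubectl port-forward -n istio-system svc/istio-ingressgateway 8080:80, is to send a POST with curl to "$(predict_url 127.0.0.1 8080 enwiki-goodfaith)" and set the Host header to the hostname reported by kubectl get inferenceservice enwiki-goodfaith -n kserve-test -o jsonpath='{.status.url}'. The request body depends on the model server and is not specified in this guide.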
Notes
Delete cluster
Sometimes you might need to destroy the cluster and rebuild. Here is a helpful command:
minikube delete --purge --all
minikube start --kubernetes-version=v1.16.15 --cpus 4 --memory 8192 --driver=docker --force