Kubernetes Deployment Tutorial for GenAI Chat App
This tutorial guides you through deploying Aviata-chatbot on Kubernetes. Aviata-chatbot is a basic GenAI chat application. The app consists of three components:
- Weaviate (a vector database)
- A backend built using Python and FastAPI
- A frontend using NGINX to host an HTML and JavaScript UI.
Workshop Setup
This workshop uses the Aviata Cloud Security Flight Simulator. You will need to launch an instance in your personal AWS Account.
Follow the Getting Started Guide to launch your instance.
Running Costs
Running the Cloud Security Flight Simulator in your AWS Account will cost about $5.00 USD per day.
Make sure to follow the Cleanup instructions to avoid ongoing expenses.
TASK 1: Setting up Minikube
Starting Minikube
Minikube is already installed on your VM. If a previous cluster exists, delete it by executing minikube delete before starting Minikube.
Next, start Minikube. We also need to enable MetalLB, which allows us to map and expose our services externally.
minikube delete
minikube start --subnet='192.168.49.0/24' --kubernetes-version=v1.28.4 --cni=calico
minikube addons enable metallb
minikube status
kubectl cluster-info
kubectl get nodes
Expected Output:
student@ace135.sans.labs ~ $ minikube status
minikube
type: Control Plane
host: Running
kubelet: Running
apiserver: Running
kubeconfig: Configured
student@ace135.sans.labs ~ $ kubectl cluster-info
Kubernetes control plane is running at https://192.168.49.2:8443
CoreDNS is running at https://192.168.49.2:8443/api/v1/namespaces/kube-system/services/kube-dns:dns/proxy
To further debug and diagnose cluster problems, use 'kubectl cluster-info dump'.
student@ace135.sans.labs ~ $ kubectl get nodes
NAME STATUS ROLES AGE VERSION
minikube Ready control-plane 78m v1.28.4
student@ace135.sans.labs ~ $
This is the first time we use the kubectl command-line tool. kubectl is the tool that communicates with the Kubernetes API server to execute Kubernetes commands.
Now that we have Minikube running, let us clone the aviata-chatbot app repository. This repository contains the application source code as well as the Kubernetes deployment scripts that we need.
Execute the following commands to clone the repo and configure MetalLB:
cd ~/code
git clone https://github.com/Ahmed-AG/aviata-chatbot.git
cd aviata-chatbot
kubectl apply -f deploy-k8s/metallb-config.yaml
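MetalLB needs to know which IP range it may hand out to LoadBalancer services. The exact contents of deploy-k8s/metallb-config.yaml are in the repo; for the minikube addon it likely resembles the standard layer2 address-pool ConfigMap below (a sketch only; the pool name and address range are assumptions consistent with the 192.168.49.0/24 subnet and the external IPs seen later in this lab):

```yaml
apiVersion: v1
kind: ConfigMap
metadata:
  name: config
  namespace: metallb-system
data:
  config: |
    address-pools:
    - name: default
      protocol: layer2
      addresses:
      - 192.168.49.200-192.168.49.250
```

With this in place, every Service of type LoadBalancer gets the next free address from the pool, which is why the backend and frontend later appear on 192.168.49.200 and 192.168.49.201.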
TASK 2: Creating the essentials
Creating a Namespace in Kubernetes
The first step is to create a namespace. Namespaces help in organizing and managing resources within your Kubernetes cluster. They logically separate your applications so that you can apply security controls later on. In Kubernetes, namespaces provide a way to divide cluster resources between multiple users or teams.
To create a namespace for your application, use the following command:
kubectl create namespace aviata-chatbot
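The same namespace can also be created declaratively, which fits better in version-controlled repos. A minimal manifest (a sketch; not part of the aviata-chatbot repo) would be:

```yaml
apiVersion: v1
kind: Namespace
metadata:
  name: aviata-chatbot
```

Applying it with kubectl apply -f has the same effect as the create command above, and re-applying it is a no-op rather than an error.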
Creating a secret in Kubernetes
Secrets in Kubernetes are the standard mechanism for managing sensitive information like passwords, API tokens, and SSH keys within the cluster. Secret values are stored base64-encoded and protected through Kubernetes access controls (RBAC); encryption at rest can additionally be enabled on the API server. Secrets allow applications to interact with external services without embedding credentials in container images or manifests, and they decouple secret management from application deployment, streamlining the management of containerized applications at scale.
For our specific use case, we require a secret key to interface with the OpenAI APIs. The OpenAI key will be provided by the instructor. Replace <OPENAI_API_KEY> with the correct key and execute the following commands to create this secret in Kubernetes:
echo "<OPENAI_API_KEY>" > ~/openai_api.key
kubectl create secret -n aviata-chatbot generic openai --from-literal=openai-key=$(cat ~/openai_api.key)
Different components of the application will utilize this key for communication with OpenAI.
To verify your work, execute the following:
kubectl -n aviata-chatbot get secrets
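A quick way to see why access control on Secrets matters: by default the stored value is only base64-encoded, not encrypted. The round trip can be demonstrated locally with a placeholder value (NOT a real OpenAI key):

```shell
# Kubernetes stores Secret values base64-encoded in etcd; they are
# access-controlled via RBAC, not encrypted, unless encryption at rest
# is enabled. Demonstrate the encoding round trip with a placeholder:
PLACEHOLDER="sk-example-123"
ENCODED=$(printf '%s' "$PLACEHOLDER" | base64)
echo "stored in the Secret object as: $ENCODED"

# Anyone who can read the Secret object can decode it just as easily:
printf '%s' "$ENCODED" | base64 -d
```

This is why RBAC on Secrets is important: anyone permitted to read the Secret object can trivially recover the key.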
Creating a ConfigMap
In Kubernetes, not all shared data between application components needs to be kept secret; a ConfigMap provides an alternative method for sharing such information. Execute the following commands to apply the configuration defined in the configmap.yaml file within the aviata-chatbot namespace:
cd ~/code/aviata-chatbot
kubectl -n aviata-chatbot apply -f deploy-k8s/configmap.yaml
Examine the contents of deploy-k8s/configmap.yaml:
apiVersion: v1
kind: ConfigMap
metadata:
name: backend-deployment
data:
db_url: http://weaviate.aviata-chatbot.svc.cluster.local:8080
backend_url: http://aviata-backend.sans.labs:8000
In this ConfigMap we have two key-value pairs:
- db_url: Specifies the URL for accessing the database.
- backend_url: Specifies the URL for accessing the backend service.
These key-value pairs represent configuration parameters that can be used by various components of the application, such as backend services or pods, allowing them to dynamically access the specified URLs without hardcoding them into the application code. This flexibility enables easier configuration management and updates, as configuration changes can be made centrally in the ConfigMap without requiring changes to application code or container images.
Once applied, you can inspect your work using the following commands:
kubectl -n aviata-chatbot get configmap
kubectl -n aviata-chatbot describe configmap backend-deployment
These commands allow you to retrieve ConfigMaps within the namespace and provide detailed information about a specific ConfigMap named "backend-deployment". ConfigMaps facilitate the efficient sharing of non-sensitive data across different components of the application.
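Inside a pod, ConfigMap entries typically surface as ordinary environment variables, so the application reads them like any other env var. A local simulation of what the backend container sees at runtime (assuming, as the backend manifest in Task 3 does, that db_url is mapped to an env var named DB_URL):

```shell
# Simulate the backend container's view of the ConfigMap: the deployment
# maps the db_url key to the DB_URL environment variable, and the app
# simply reads it instead of hardcoding the database endpoint.
export DB_URL="http://weaviate.aviata-chatbot.svc.cluster.local:8080"
echo "Vector DB endpoint: $DB_URL"
```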
TASK 3: Deploying your application
Applying the VectorDB (Weaviate)
Now that we have the essential resources deployed, let us begin by deploying our first Deployment and Service, the Weaviate database:
cd ~/code/aviata-chatbot
kubectl -n aviata-chatbot apply -f deploy-k8s/weaviate-vectordb.yaml
Take a look at deploy-k8s/weaviate-vectordb.yaml. This file defines two Kubernetes resources. The first one is a Deployment, which specifies the configuration for the containers running within the pods. Within the Deployment, you'll find settings like the container image to use, the ports that the pods will be listening on, and a name for the containers.
spec:
containers:
- name: weaviate-db
image: semitechnologies/weaviate:1.23.9
ports:
- containerPort: 8080
- containerPort: 50051
The deployment also includes environment variables that are essential for the Weaviate container to operate. Notably, there's the OpenAI key, which is crucial for the functionality of Weaviate.
env:
- name: OPENAI_KEY
valueFrom:
secretKeyRef:
name: openai
key: openai-key
- name: QUERY_DEFAULTS_LIMIT
value: "25"
- name: AUTHENTICATION_ANONYMOUS_ACCESS_ENABLED
value: "true"
- name: PERSISTENCE_DATA_PATH
value: "/var/lib/weaviate"
- name: DEFAULT_VECTORIZER_MODULE
value: "text2vec-openai"
- name: ENABLE_MODULES
value: "text2vec-openai,generative-openai"
- name: CLUSTER_HOSTNAME
value: "node1"
The env block above defines an environment variable named OPENAI_KEY in the container, retrieving its value from the Secret named openai, using the key openai-key.
The second resource that will be created is a Service.
apiVersion: v1
kind: Service
metadata:
name: weaviate
spec:
selector:
app: weaviate
ports:
- protocol: TCP
port: 8080
targetPort: 8080
The above configuration defines a Kubernetes Service named weaviate that routes traffic to Pods labeled with app: weaviate on port 8080.
You can inspect your work using the following commands:
kubectl get service weaviate -n aviata-chatbot
kubectl get service weaviate -n aviata-chatbot -o json
Applying The Backend
The backend for the Aviata-chatbot is a Python-based API server that listens to requests from the Frontend, processes them, communicates with OpenAI (using the OpenAI key), and interacts with the Weaviate pods if necessary. To deploy the backend, execute the following command:
cd ~/code/aviata-chatbot
kubectl -n aviata-chatbot apply -f deploy-k8s/backend.yaml
Examine the spec section in deploy-k8s/backend.yaml:
spec:
containers:
- name: backend
image: ahmedag/aviata-backend
ports:
- containerPort: 8000
env:
- name: OPENAI_KEY
valueFrom:
secretKeyRef:
name: openai
key: openai-key
- name: DB_URL
valueFrom:
configMapKeyRef:
name: backend-deployment
key: db_url
Notice that we have two environment variables being created: one for the OpenAI Secret Key and the other for the Weaviate Database URL.
You can inspect your work using the following commands:
kubectl -n aviata-chatbot get service
kubectl -n aviata-chatbot get po
Sample Output:
% kubectl -n aviata-chatbot get service
NAME TYPE CLUSTER-IP EXTERNAL-IP PORT(S) AGE
backend LoadBalancer 10.101.46.162 192.168.49.200 8000:30318/TCP 25m
weaviate ClusterIP 10.101.169.19 <none> 8080/TCP 27m
% kubectl -n aviata-chatbot get po
NAME READY STATUS RESTARTS AGE
backend-deployment-7c795694-652cp 1/1 Running 0 54s
weaviate-db-97f46c9b6-ps8sr 1/1 Running 0 109s
Applying the Frontend
Lastly, we'll deploy our frontend, which consists of an Nginx server running basic HTML and JavaScript, serving as our user interface.
Execute the command:
kubectl -n aviata-chatbot apply -f deploy-k8s/frontend.yaml
Within our JavaScript code, there's a reference to our backend server that requires updating. Feel free to examine that HTML/JS code here: https://github.com/Ahmed-AG/aviata-chatbot/blob/main/src/frontend/index.html
You can inspect your work using the following commands:
kubectl -n aviata-chatbot get service
Sample output should look like this:
student@ace135.sans.labs ~/code/aviata-chatbot (main)$ kubectl -n aviata-chatbot get service
NAME TYPE CLUSTER-IP EXTERNAL-IP PORT(S) AGE
backend LoadBalancer 10.101.46.162 192.168.49.200 8000:30318/TCP 25m
frontend LoadBalancer 10.99.1.241 192.168.49.201 80:30000/TCP 65s
weaviate ClusterIP 10.101.169.19 <none> 8080/TCP 27m
student@ace135.sans.labs ~/code/aviata-chatbot (main)$
Notice that the backend and the frontend both have external IP addresses, while the weaviate DB does not.
Great work so far! Let us review what we have created:
kubectl -n aviata-chatbot get all
kubectl -n aviata-chatbot get configmap,secrets
student@ace135.sans.labs ~ $ kubectl -n aviata-chatbot get all
NAME READY STATUS RESTARTS AGE
pod/backend-deployment-74876d5497-hnqg9 1/1 Running 0 14h
pod/frontend-deployment-6fbfd49487-4xnh9 1/1 Running 0 14h
pod/weaviate-db-97f46c9b6-zhmfh 1/1 Running 0 14h
NAME TYPE CLUSTER-IP EXTERNAL-IP PORT(S) AGE
service/backend LoadBalancer 10.97.230.127 192.168.49.200 8000:30131/TCP 14h
service/frontend LoadBalancer 10.104.69.115 192.168.49.201 80:30000/TCP 14h
service/weaviate ClusterIP 10.110.116.78 <none> 8080/TCP 14h
NAME READY UP-TO-DATE AVAILABLE AGE
deployment.apps/backend-deployment 1/1 1 1 14h
deployment.apps/frontend-deployment 1/1 1 1 14h
deployment.apps/weaviate-db 1/1 1 1 14h
NAME DESIRED CURRENT READY AGE
replicaset.apps/backend-deployment-74876d5497 1 1 1 14h
replicaset.apps/frontend-deployment-6fbfd49487 1 1 1 14h
replicaset.apps/weaviate-db-97f46c9b6 1 1 1 14h
student@ace135.sans.labs ~ $ kubectl -n aviata-chatbot get configmap,secrets
NAME DATA AGE
configmap/backend-deployment 2 14h
configmap/kube-root-ca.crt 1 14h
NAME TYPE DATA AGE
secret/openai Opaque 1 14h
student@ace135.sans.labs ~ $
Testing connectivity
With all application components deployed, it's time to test the connectivity between them. We'll start by accessing a shell on the frontend container and attempting to reach the backend.
kubectl -n aviata-chatbot get po -o wide
Make a note of the IP address associated with the backend container. Once done, execute the following to get shell access on the frontend container:
FRONTEND_POD_NAME=$(kubectl -n aviata-chatbot get po -o json |jq -r '.items[].metadata.name' | grep frontend)
echo $FRONTEND_POD_NAME
kubectl -n aviata-chatbot exec -ti $FRONTEND_POD_NAME -- /bin/sh
Once you are inside the container, execute the following command, replacing <BACKEND_IP> with the correct IP address:
curl http://<BACKEND_IP>:8000/api/llm?q=who%20are%20you?
Sample output:
{"message":"I am a helpful assistant here to provide you with information and assistance to the best of my abilities. How can I help you today?","Your prompt is":"who are you?"}
If you received a message like that, congratulations! This indicates that your frontend successfully accessed your backend, and your backend was able to communicate successfully with the OpenAI API.
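Note that the prompt in the URL is percent-encoded (spaces become %20). For arbitrary prompts, a small helper can do the encoding; this is a sketch that handles only spaces, not other reserved characters, and the urlencode_spaces name is ours, not part of the lab:

```shell
# Minimal URL-encoding helper: replaces spaces with %20 so the prompt
# survives as a single query-string parameter.
urlencode_spaces() { printf '%s' "$1" | sed 's/ /%20/g'; }

PROMPT="tell me a joke"
# Print the command you would run (keep <BACKEND_IP> as your actual pod IP):
echo "curl \"http://<BACKEND_IP>:8000/api/llm?q=$(urlencode_spaces "$PROMPT")\""
```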
Exit the container by executing:
exit
Set DNS configuration
To be able to access Aviata-chatbot's UI, run the following command and open the link in your browser:
export BACKEND_IP=$(kubectl -n aviata-chatbot get service backend -o json |jq -r '.status.loadBalancer.ingress[].ip')
grep -v 'aviata-backend.sans.labs' /etc/hosts > /tmp/be
echo "$BACKEND_IP aviata-backend.sans.labs" >> /tmp/be
sudo cp /tmp/be /etc/hosts
export FRONTEND_IP=$(kubectl -n aviata-chatbot get service frontend -o json |jq -r '.status.loadBalancer.ingress[].ip')
grep -vE 'aviata-chatbot.sans.labs' /etc/hosts > /tmp/fe
echo "${FRONTEND_IP} aviata-chatbot.sans.labs" >> /tmp/fe
sudo cp /tmp/fe /etc/hosts
Take a look at /etc/hosts:
cat /etc/hosts
This sequence of commands retrieves the IP addresses for the backend and frontend services in the aviata-chatbot namespace from their respective Kubernetes load balancers and stores them in the BACKEND_IP and FRONTEND_IP variables. It then updates the /etc/hosts file with these IP addresses, associating them with the hostnames aviata-backend.sans.labs and aviata-chatbot.sans.labs respectively, using sudo to gain the necessary permissions.
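The filter-then-append pattern above (grep -v drops any stale entry, then the fresh mapping is appended) is idempotent: re-running it never duplicates lines. It can be exercised safely against a throwaway file instead of /etc/hosts:

```shell
# Demonstrate the idempotent hosts-update pattern on a scratch file.
DEMO=/tmp/hosts.demo
printf '127.0.0.1 localhost\n192.168.49.123 aviata-backend.sans.labs\n' > "$DEMO"

# Drop any stale entry for the hostname, then append the current IP:
grep -v 'aviata-backend.sans.labs' "$DEMO" > "$DEMO.new"
echo "192.168.49.200 aviata-backend.sans.labs" >> "$DEMO.new"
cat "$DEMO.new"
```

However many times you repeat the last three commands, the file ends up with exactly one aviata-backend.sans.labs line carrying the latest IP.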
Use your browser to access http://aviata-chatbot.sans.labs
You can also access the backend going to http://aviata-backend.sans.labs:8000/api/llm?q=tell%20me%20a%20story
TASK 4: Network control
Kubernetes is a powerful platform that allows multiple teams to run various applications, workloads, and systems efficiently. Effective segmentation between different applications, namespaces, services, and Pods is a cornerstone of network access control within Kubernetes. This segmentation ensures that each component operates within its defined boundaries, enhancing security and reducing the risk of unauthorized access. By isolating resources and managing permissions at a granular level, Kubernetes facilitates a secure and organized environment where teams can deploy and manage their applications without interference, maintaining the integrity and performance of the overall system.
There are multiple ways to apply network access controls in Kubernetes. We will walk through one example that uses apiVersion: networking.k8s.io/v1.
Let us begin creating a new namespace for a second application:
kubectl create namespace application2
Next, let us add a pod there. We don't have to use a Kubernetes YAML file; we can do this quickly via the command line:
kubectl -n application2 create deployment app2-deployment --image=nginx
kubectl -n application2 get po -o wide
Now, let us test network access between app2 and the aviata-chatbot app components.
Get the IP address associated with backend-deployment pod:
kubectl -n aviata-chatbot get po -o wide |grep backend-deployment
Let us get a shell on our newly created app2-deployment pod:
APP2_POD_NAME=$(kubectl -n application2 get po -o json |jq -r '.items[].metadata.name' | grep app2)
echo $APP2_POD_NAME
kubectl -n application2 exec -ti $APP2_POD_NAME -- /bin/sh
Once you are inside the container, try to reach the aviata-chatbot backend:
curl http://<BACKEND_IP>:8000/api/llm?q=who%20are%20you?
Sample output:
student@ace135.sans.labs ~/code/aviata-chatbot (main)$ kubectl -n aviata-chatbot get po -o wide |grep backend-deployment
backend-deployment-74876d5497-mqngg 1/1 Running 0 4h44m 10.244.0.8 minikube <none> <none>
student@ace135.sans.labs ~/code/aviata-chatbot (main)$ APP2_POD_NAME=$(kubectl -n application2 get po -o json |jq -r '.items[].metadata.name' | grep app2)
student@ace135.sans.labs ~/code/aviata-chatbot (main)$ echo $APP2_POD_NAME
app2-deployment-58667fc458-kvk5l
student@ace135.sans.labs ~/code/aviata-chatbot (main)$ kubectl -n application2 exec -ti $APP2_POD_NAME -- /bin/sh
# curl http://10.244.0.8:8000/api/llm?q=who%20are%20you?
{"message":"Hello! I am Aviata-chatbot, your helpful assistant. How can I assist you today?","Your prompt is":"who are you?"}#
# exit
student@ace135.sans.labs ~/code/aviata-chatbot (main)$
Alright, even though app2 lives in a separate namespace, that is only a logical separation. To truly isolate the applications, we can utilize a NetworkPolicy:
cd ~/code/aviata-chatbot
kubectl apply -f deploy-k8s/network-policy-deny-all.yaml
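The sample output below shows the policy is named default-deny-ingress, so the manifest likely resembles the standard deny-all-ingress policy sketched here (an assumption; check the repo file for the exact contents, including which namespace it targets):

```yaml
apiVersion: networking.k8s.io/v1
kind: NetworkPolicy
metadata:
  name: default-deny-ingress
  namespace: aviata-chatbot
spec:
  podSelector: {}        # empty selector: applies to every pod in the namespace
  policyTypes:
    - Ingress            # no ingress rules listed, so all inbound traffic is denied
```

Because the podSelector is empty and no ingress rules are defined, every pod in the namespace rejects all inbound connections, which is what blocks app2's curl in the test below. Note that NetworkPolicy only takes effect when the cluster runs a CNI that enforces it, which is why this lab started Minikube with --cni=calico.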
Now try to access aviata-chatbot's backend from the app2 pod again:
kubectl -n aviata-chatbot get po -o wide |grep backend-deployment
APP2_POD_NAME=$(kubectl -n application2 get po -o json |jq -r '.items[].metadata.name' | grep app2)
echo $APP2_POD_NAME
kubectl -n application2 exec -ti $APP2_POD_NAME -- /bin/sh
Once you are inside the container, try to reach the aviata-chatbot backend:
curl http://<BACKEND_IP>:8000/api/llm?q=who%20are%20you?
Sample Output:
student@ace135.sans.labs ~/code/aviata-chatbot (main)$ kubectl apply -f deploy-k8s/network-policy-deny-all.yaml
networkpolicy.networking.k8s.io/default-deny-ingress created
student@ace135.sans.labs ~/code/aviata-chatbot (main)$ kubectl -n aviata-chatbot get po -o wide |grep backend-deployment
backend-deployment-74876d5497-hnqg9 1/1 Running 0 22m 10.244.120.69 minikube <none> <none>
student@ace135.sans.labs ~/code/aviata-chatbot (main)$ APP2_POD_NAME=$(kubectl -n application2 get po -o json |jq -r '.items[].metadata.name' | grep app2)
student@ace135.sans.labs ~/code/aviata-chatbot (main)$ echo $APP2_POD_NAME
app2-deployment-58667fc458-tr2gj
student@ace135.sans.labs ~/code/aviata-chatbot (main)$ kubectl -n application2 exec -ti $APP2_POD_NAME -- /bin/sh
# curl http://10.244.120.69:8000/api/llm?q=who%20are%20you?
^C
#
Is our app still working?
Feel free to delete the policy and try again:
kubectl delete -f deploy-k8s/network-policy-deny-all.yaml
TASK 5: Cleanup
kubectl -n aviata-chatbot delete -f deploy-k8s/weaviate-vectordb.yaml
kubectl -n aviata-chatbot delete -f deploy-k8s/backend.yaml
kubectl -n aviata-chatbot delete -f deploy-k8s/frontend.yaml
kubectl -n aviata-chatbot delete -f deploy-k8s/configmap.yaml
kubectl delete namespace aviata-chatbot