Kubernetes Deployment Tutorial for GenAI Chat App

This tutorial guides you through deploying Aviata-chatbot, a basic GenAI chat application, on Kubernetes. The app consists of three components:

  • Weaviate (a vector database)
  • A backend built using Python and FastAPI
  • A frontend using NGINX to host an HTML and JavaScript UI

Workshop Setup

This workshop uses the Aviata Cloud Security Flight Simulator. You will need to launch an instance in your personal AWS Account.

Follow the Getting Started Guide to launch your instance.

Running Costs

Running the Cloud Security Flight Simulator in your AWS Account will cost about $5.00 USD per day.

Make sure to follow the Cleanup instructions to avoid ongoing expenses.

TASK 1: Setting up Minikube

Starting Minikube

Minikube is already installed on your VM. If a cluster from a previous session exists, delete it with minikube delete before starting a new one. Then start Minikube with the command below. We also need to enable MetalLB, which will allow us to expose our services externally.

minikube delete
minikube start --subnet='192.168.49.0/24' --kubernetes-version=v1.28.4 --cni=calico

Enable MetalLB:

minikube addons enable metallb

Check your work:

minikube status
kubectl cluster-info
kubectl get nodes

Expected Output:

student@ace135.sans.labs ~ $ minikube status
minikube
type: Control Plane
host: Running
kubelet: Running
apiserver: Running
kubeconfig: Configured

student@ace135.sans.labs ~ $ kubectl cluster-info
Kubernetes control plane is running at https://192.168.49.2:8443
CoreDNS is running at https://192.168.49.2:8443/api/v1/namespaces/kube-system/services/kube-dns:dns/proxy

To further debug and diagnose cluster problems, use 'kubectl cluster-info dump'.
student@ace135.sans.labs ~ $ kubectl get nodes
NAME       STATUS   ROLES           AGE   VERSION
minikube   Ready    control-plane   78m   v1.28.4
student@ace135.sans.labs ~ $

This is the first time we have used kubectl, the command-line tool that communicates with the Kubernetes API server to run commands against the cluster.

Now that we have Minikube running, let us clone the aviata-chatbot app repository. This repository contains the application source code as well as the Kubernetes deployment scripts that we need.

Execute the following commands to clone the repo and configure MetalLB:

cd ~/code
git clone https://github.com/Ahmed-AG/aviata-chatbot.git
cd aviata-chatbot
kubectl apply -f deploy-k8s/metallb-config.yaml

Congratulations, your environment is ready!

TASK 2: Creating the essentials

Creating a Namespace in Kubernetes

The first step is to create a namespace. Namespaces organize resources within your Kubernetes cluster and provide a way to divide cluster resources between multiple users, teams, or applications. This logical separation also gives you a boundary to which you can apply security controls later on.

To create a namespace for your application, use the following command:

kubectl create namespace aviata-chatbot

Creating a secret in Kubernetes

Secrets in Kubernetes let you keep sensitive values such as passwords, API tokens, and SSH keys out of container images and Pod specifications. Applications reference a Secret by name, and access to it can be restricted with Kubernetes access controls, so applications can talk to external services without the credentials being hardcoded. Note that Secret values are only base64-encoded: by default they are stored unencrypted in the cluster's data store (etcd) unless encryption at rest is configured.

For our specific use case, we require a secret key to interface with the OpenAI APIs. Your instructor will provide the OpenAI key. Replace <OPENAI_API_KEY> with the correct key and execute the following commands to create this secret in Kubernetes:

echo "<OPENAI_API_KEY>" > ~/openai_api.key
kubectl create secret -n aviata-chatbot generic openai --from-literal=openai-key="$(cat ~/openai_api.key)"

Different components of the application will utilize this key for communication with OpenAI.

To verify your work, execute the following:

kubectl -n aviata-chatbot get secrets
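
Keep in mind that kubectl -n aviata-chatbot get secret openai -o yaml would display the value base64-encoded rather than encrypted. The sketch below uses a made-up placeholder value (not a real key) to show that this encoding is trivially reversible with the standard base64 tool:

```shell
# Placeholder only; this is not a real OpenAI key.
sample_value='sk-example-not-a-real-key'

# Encode the value the way it appears in a Secret's data field.
encoded=$(printf '%s' "$sample_value" | base64)

# Anyone who can read the Secret object can decode it again.
decoded=$(printf '%s' "$encoded" | base64 -d)

echo "encoded: $encoded"
echo "decoded: $decoded"
```

This is why read access to Secrets should be limited to the users and workloads that genuinely need it.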

Creating a ConfigMap

In Kubernetes, not all shared data between application components needs to be kept secret; ConfigMap provides an alternative method for sharing information. Execute the following command to apply the configuration defined in the configmap.yaml file within the aviata-chatbot namespace:

cd ~/code/aviata-chatbot
kubectl -n aviata-chatbot apply -f deploy-k8s/configmap.yaml

Examine the contents of deploy-k8s/configmap.yaml:

apiVersion: v1
kind: ConfigMap
metadata:
  name: backend-deployment
data:
  db_url: http://weaviate.aviata-chatbot.svc.cluster.local:8080
  backend_url: http://aviata-backend.sans.labs:8000

In this ConfigMap we have two key-value pairs:

  • db_url: Specifies the URL for accessing the database.
  • backend_url: Specifies the URL for accessing the backend service.

These key-value pairs represent configuration parameters that can be used by various components of the application, such as backend services or pods, allowing them to dynamically access the specified URLs without hardcoding them into the application code. This flexibility enables easier configuration management and updates, as configuration changes can be made centrally in the ConfigMap without requiring changes to application code or container images.
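
The db_url value follows the in-cluster DNS naming pattern for Services: <service>.<namespace>.svc.<cluster-domain>. As a minimal sketch, assuming the default cluster domain cluster.local and a helper name (service_url) invented for this example, the URL can be assembled like this:

```shell
# Hypothetical helper: build the in-cluster URL of a Service.
# Assumes the default cluster domain "cluster.local".
service_url() {
  service=$1
  namespace=$2
  port=$3
  printf 'http://%s.%s.svc.cluster.local:%s\n' "$service" "$namespace" "$port"
}

service_url weaviate aviata-chatbot 8080
# prints http://weaviate.aviata-chatbot.svc.cluster.local:8080
```

The backend_url value, by contrast, is a hostname that is resolved outside the cluster, which is why this tutorial later adds it to /etc/hosts.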

Once applied, you can inspect your work using the following commands:

kubectl -n aviata-chatbot get configmap
kubectl -n aviata-chatbot describe configmap backend-deployment

These commands allow you to retrieve ConfigMaps within the namespace and provide detailed information about a specific ConfigMap named "backend-deployment". ConfigMaps facilitate the efficient sharing of non-sensitive data across different components of the application.

TASK 3: Deploying your application

Applying the VectorDB (Weaviate)

Now that we have the essential resources in place, let us deploy the first Deployment and Service: the Weaviate database.

cd ~/code/aviata-chatbot
kubectl -n aviata-chatbot apply -f deploy-k8s/weaviate-vectordb.yaml

Take a look at deploy-k8s/weaviate-vectordb.yaml. This file defines two Kubernetes resources. The first one is a Deployment, which specifies the configuration for the containers running within the pods. Within the Deployment, you'll find settings like the container image to use, the ports that the pods will be listening on, and a name for the containers.

    spec:
      containers:
      - name: weaviate-db
        image: semitechnologies/weaviate:1.23.9
        ports:
        - containerPort: 8080
        - containerPort: 50051

The deployment also includes environment variables that are essential for the Weaviate container to operate. Notably, there's the OpenAI key, which is crucial for the functionality of Weaviate.

        env:
        - name: OPENAI_KEY
          valueFrom:
            secretKeyRef:
              name: openai
              key: openai-key
        - name: QUERY_DEFAULTS_LIMIT
          value: "25"
        - name: AUTHENTICATION_ANONYMOUS_ACCESS_ENABLED
          value: "true"
        - name: PERSISTENCE_DATA_PATH
          value: "/var/lib/weaviate"
        - name: DEFAULT_VECTORIZER_MODULE
          value: "text2vec-openai"
        - name: ENABLE_MODULES
          value: "text2vec-openai,generative-openai"
        - name: CLUSTER_HOSTNAME
          value: "node1"

The OpenAI key is passed in by creating an environment variable named OPENAI_KEY in the container, whose value is read from the key openai-key of the secret named openai.

The second resource that will be created is a Service.

apiVersion: v1
kind: Service
metadata:
  name: weaviate
spec:
  selector:
    app: weaviate
  ports:
    - protocol: TCP
      port: 8080
      targetPort: 8080

The above configuration defines a Kubernetes Service named weaviate that routes traffic to Pods labeled with app: weaviate on port 8080.

You can inspect your work using the following commands:

kubectl get service weaviate -n aviata-chatbot
kubectl get service weaviate -n aviata-chatbot -o json

Applying The Backend

The backend for Aviata-chatbot is a Python-based API server that listens for requests from the frontend, processes them, communicates with OpenAI (using the OpenAI key), and interacts with the Weaviate pods if necessary. To deploy the backend, execute the following commands:

cd ~/code/aviata-chatbot
kubectl -n aviata-chatbot apply -f deploy-k8s/backend.yaml

Examine the spec section in deploy-k8s/backend.yaml:

    spec:
      containers:
      - name: backend
        image: ahmedag/aviata-backend
        ports:
        - containerPort: 8000
        env:
        - name: OPENAI_KEY
          valueFrom:
            secretKeyRef:
              name: openai
              key: openai-key
        - name: DB_URL
          valueFrom:
            configMapKeyRef:
              name: backend-deployment
              key: db_url

Notice that we have two environment variables being created: one for the OpenAI Secret Key and the other for the Weaviate Database URL.
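
Inside the container these values are ordinary environment variables. A minimal sketch of a startup check follows; the helper name require_env is an assumption made for this example and is not part of the aviata-chatbot source. It fails fast when a variable is missing:

```shell
# Hypothetical helper: return non-zero if any named variable is unset or empty.
require_env() {
  missing=0
  for name in "$@"; do
    # Indirectly read the variable whose name is stored in $name.
    eval "value=\${$name:-}"
    if [ -z "$value" ]; then
      echo "missing environment variable: $name" >&2
      missing=1
    fi
  done
  return "$missing"
}

# Example values standing in for what Kubernetes would inject.
OPENAI_KEY='sk-example-not-a-real-key'
DB_URL='http://weaviate.aviata-chatbot.svc.cluster.local:8080'

require_env OPENAI_KEY DB_URL && echo "environment looks complete"
```

A check like this turns a missing Secret or ConfigMap key into an immediate, readable error instead of a failure on the first request.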

You can inspect your work using the following commands:

kubectl -n aviata-chatbot get service

kubectl -n aviata-chatbot get po

Sample Output:

% kubectl -n aviata-chatbot  get service

NAME       TYPE           CLUSTER-IP      EXTERNAL-IP      PORT(S)          AGE
backend    LoadBalancer   10.101.46.162   192.168.49.200   8000:30318/TCP   25m
weaviate   ClusterIP      10.101.169.19   <none>           8080/TCP         27m

% kubectl -n aviata-chatbot  get po

NAME                                READY   STATUS    RESTARTS   AGE
backend-deployment-7c795694-652cp   1/1     Running   0          54s
weaviate-db-97f46c9b6-ps8sr         1/1     Running   0          109s

Applying the Frontend

Lastly, we'll deploy our frontend, which consists of an Nginx server running basic HTML and JavaScript, serving as our user interface.

Execute the command:

kubectl -n aviata-chatbot apply -f deploy-k8s/frontend.yaml

Within our JavaScript code, there's a reference to our backend server that requires updating. Feel free to examine that HTML/JS code here: https://github.com/Ahmed-AG/aviata-chatbot/blob/main/src/frontend/index.html

You can inspect your work using the following commands:

kubectl -n aviata-chatbot get service

Sample output should look like this:

student@ace135.sans.labs ~/code/aviata-chatbot (main)$ kubectl -n aviata-chatbot get service

NAME       TYPE           CLUSTER-IP      EXTERNAL-IP      PORT(S)          AGE
backend    LoadBalancer   10.101.46.162   192.168.49.200   8000:30318/TCP   25m
frontend   LoadBalancer   10.99.1.241     192.168.49.201   80:30000/TCP     65s
weaviate   ClusterIP      10.101.169.19   <none>           8080/TCP         27m

student@ace135.sans.labs ~/code/aviata-chatbot (main)$

Notice that the backend and the frontend both have external IP addresses. However, the weaviate DB does not.
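
The difference comes from the Service type: the LoadBalancer Services received an external IP (in this setup, presumably from the MetalLB address pool), while a ClusterIP Service is reachable only from inside the cluster. As an illustration, the following sketch filters the sample output above with awk to list only the externally exposed Services. The function name external_services is made up for this example:

```shell
# Hypothetical helper: read `kubectl get service` output on stdin and print
# "name external-ip" for every Service that has an external IP assigned.
external_services() {
  awk 'NR > 1 && $4 != "<none>" && $4 != "<pending>" { print $1, $4 }'
}

# Sample output copied from this tutorial.
external_services <<'EOF'
NAME       TYPE           CLUSTER-IP      EXTERNAL-IP      PORT(S)          AGE
backend    LoadBalancer   10.101.46.162   192.168.49.200   8000:30318/TCP   25m
frontend   LoadBalancer   10.99.1.241     192.168.49.201   80:30000/TCP     65s
weaviate   ClusterIP      10.101.169.19   <none>           8080/TCP         27m
EOF
```

On a live cluster the same filter could be fed with kubectl -n aviata-chatbot get service | external_services.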

Great work! Let us review what we have created so far:

kubectl -n aviata-chatbot get all
kubectl -n aviata-chatbot get configmap,secrets

Sample output:

student@ace135.sans.labs ~ $ kubectl -n aviata-chatbot get all
NAME                                       READY   STATUS    RESTARTS   AGE
pod/backend-deployment-74876d5497-hnqg9    1/1     Running   0          14h
pod/frontend-deployment-6fbfd49487-4xnh9   1/1     Running   0          14h
pod/weaviate-db-97f46c9b6-zhmfh            1/1     Running   0          14h

NAME               TYPE           CLUSTER-IP      EXTERNAL-IP      PORT(S)          AGE
service/backend    LoadBalancer   10.97.230.127   192.168.49.200   8000:30131/TCP   14h
service/frontend   LoadBalancer   10.104.69.115   192.168.49.201   80:30000/TCP     14h
service/weaviate   ClusterIP      10.110.116.78   <none>           8080/TCP         14h

NAME                                  READY   UP-TO-DATE   AVAILABLE   AGE
deployment.apps/backend-deployment    1/1     1            1           14h
deployment.apps/frontend-deployment   1/1     1            1           14h
deployment.apps/weaviate-db           1/1     1            1           14h

NAME                                             DESIRED   CURRENT   READY   AGE
replicaset.apps/backend-deployment-74876d5497    1         1         1       14h
replicaset.apps/frontend-deployment-6fbfd49487   1         1         1       14h
replicaset.apps/weaviate-db-97f46c9b6            1         1         1       14h
student@ace135.sans.labs ~ $ kubectl -n aviata-chatbot get configmap,secrets
NAME                           DATA   AGE
configmap/backend-deployment   2      14h
configmap/kube-root-ca.crt     1      14h

NAME            TYPE     DATA   AGE
secret/openai   Opaque   1      14h
student@ace135.sans.labs ~ $

Testing connectivity

With all application components deployed, it's time to test the connectivity between them. We'll start by accessing a shell on the frontend container and attempting to reach the backend.

kubectl -n aviata-chatbot get po -o wide

Make a note of the IP address associated with the backend pod. Once done, execute the following to get shell access on the frontend container:

FRONTEND_POD_NAME=$(kubectl -n aviata-chatbot get po -o json |jq -r '.items[].metadata.name' | grep frontend)
echo $FRONTEND_POD_NAME

kubectl -n aviata-chatbot exec -ti $FRONTEND_POD_NAME -- /bin/sh
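
The grep filter above returns every pod whose name contains frontend, so the variable could hold several names, for example while a rollout is in progress. A small sketch of a stricter selection follows, using sample pod names from this tutorial and a hypothetical helper named first_match:

```shell
# Hypothetical helper: print the first line on stdin that starts with a prefix.
first_match() {
  grep -- "^$1" | head -n 1
}

# Pod names as printed by: kubectl get po -o json | jq -r '.items[].metadata.name'
first_match frontend <<'EOF'
backend-deployment-74876d5497-hnqg9
frontend-deployment-6fbfd49487-4xnh9
weaviate-db-97f46c9b6-zhmfh
EOF
# prints frontend-deployment-6fbfd49487-4xnh9
```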

Once you are inside the container, execute the following command replacing <BACKEND_IP> with the correct IP address:

curl "http://<BACKEND_IP>:8000/api/llm?q=who%20are%20you?"

Sample output:

{"message":"I am a helpful assistant here to provide you with information and assistance to the best of my abilities. How can I help you today?","Your prompt is":"who are you?"}

If you received a message like that, congratulations! This indicates that your frontend successfully accessed your backend, and your backend was able to communicate successfully with the OpenAI API.
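
The %20 sequences in the URL are percent-encoded spaces. If you want to try other prompts, the following sketch shows one way to encode a query string in plain POSIX shell. The function name urlencode is our own, and it handles ASCII input only:

```shell
# Hypothetical helper: percent-encode an ASCII string for use in a URL query.
urlencode() {
  rest=$1
  encoded=''
  while [ -n "$rest" ]; do
    # Split off the first character of the remaining input.
    char=${rest%"${rest#?}"}
    rest=${rest#?}
    case $char in
      # Unreserved characters are copied through unchanged.
      [a-zA-Z0-9.~_-]) encoded="$encoded$char" ;;
      # Everything else becomes %XX, the character code in hexadecimal.
      *) encoded="$encoded$(printf '%%%02X' "'$char")" ;;
    esac
  done
  printf '%s\n' "$encoded"
}

urlencode 'who are you?'
# prints who%20are%20you%3F
```

If your curl build supports it, you can also let curl do the encoding: curl -G --data-urlencode 'q=who are you?' "http://<BACKEND_IP>:8000/api/llm".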

Exit the container by executing:

exit

Set DNS configuration

To access Aviata-chatbot's UI from your browser, first map the service hostnames to their load balancer IP addresses by running the following commands:

export BACKEND_IP=$(kubectl -n aviata-chatbot get service backend -o json |jq -r '.status.loadBalancer.ingress[].ip')
grep -v 'aviata-backend.sans.labs' /etc/hosts > /tmp/be
echo "$BACKEND_IP  aviata-backend.sans.labs" >> /tmp/be
sudo cp /tmp/be /etc/hosts
export FRONTEND_IP=$(kubectl -n aviata-chatbot get service frontend -o json |jq -r '.status.loadBalancer.ingress[].ip')
grep -vE 'aviata-chatbot.sans.labs'  /etc/hosts > /tmp/fe
echo "${FRONTEND_IP} aviata-chatbot.sans.labs" >> /tmp/fe
sudo cp /tmp/fe /etc/hosts

Take a look at /etc/hosts:

cat /etc/hosts

This sequence of commands retrieves the IP addresses for the backend and frontend services in the aviata-chatbot namespace from their respective Kubernetes load balancers and stores them in the BACKEND_IP and FRONTEND_IP variables. It then updates the /etc/hosts file with these IP addresses, associating them with the hostnames aviata-backend.sans.labs and aviata-chatbot.sans.labs respectively, using sudo to gain the necessary permissions.
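
The pattern used above (filter out the old line, append the new one, copy the result back) keeps the update idempotent, so rerunning it does not pile up duplicate entries. The sketch below wraps that pattern in a function and runs it against a scratch file rather than the real /etc/hosts. The name set_host_entry is an assumption made for this example:

```shell
# Hypothetical helper: make sure a hosts-style file maps exactly one IP to a name.
set_host_entry() {
  hosts_file=$1
  ip=$2
  name=$3
  tmp=$(mktemp)
  # Drop any existing line for this name (fixed-string match, so dots are literal).
  grep -vF -- "$name" "$hosts_file" > "$tmp" || true
  printf '%s  %s\n' "$ip" "$name" >> "$tmp"
  # Overwrite in place so the file keeps its owner and permissions.
  cat "$tmp" > "$hosts_file"
  rm -f "$tmp"
}

# Demonstrate on a scratch copy instead of the real /etc/hosts.
demo=$(mktemp)
printf '127.0.0.1  localhost\n' > "$demo"
set_host_entry "$demo" 192.168.49.200 aviata-backend.sans.labs
set_host_entry "$demo" 192.168.49.200 aviata-backend.sans.labs
cat "$demo"
rm -f "$demo"
```

Running the function twice, as the demo does, still leaves a single line for aviata-backend.sans.labs.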

Use your browser to access http://aviata-chatbot.sans.labs

You can also access the backend directly by going to http://aviata-backend.sans.labs:8000/api/llm?q=tell%20me%20a%20story

TASK 4: Network control

Kubernetes is a powerful platform that allows multiple teams to run various applications, workloads, and systems efficiently. Effective segmentation between applications, namespaces, services, and Pods is a cornerstone of network access control within Kubernetes. It keeps each component within its defined boundaries, which enhances security and reduces the risk of unauthorized access, so teams can deploy and manage their applications without interfering with one another.

There are multiple ways to control network access in Kubernetes. We will walk through one example that uses a NetworkPolicy (apiVersion: networking.k8s.io/v1). NetworkPolicies are enforced by the cluster's network plugin; Calico, which we selected with --cni=calico when starting Minikube, supports them.

Let us begin by creating a new namespace for a second application:

kubectl create namespace application2

Next, let us add a pod there. We don't need a Kubernetes YAML file for this; we can do it quickly from the command line:

kubectl -n application2 create deployment app2-deployment --image=nginx
kubectl -n application2 get po -o wide

Notice the IP address assigned to the new pod we created. The pod lives in the same subnet as our aviata-chatbot app components.

Get the IP address associated with backend-deployment pod:

kubectl -n aviata-chatbot get po -o wide |grep backend-deployment

Let us get a shell on our newly created app2-deployment pod:

APP2_POD_NAME=$(kubectl -n application2 get po -o json |jq -r '.items[].metadata.name' | grep app2)
echo $APP2_POD_NAME
kubectl -n application2 exec -ti $APP2_POD_NAME -- /bin/sh

Once you are inside the container, try to reach the aviata-chatbot backend, replacing <BACKEND_IP> with the IP address you noted above:

curl "http://<BACKEND_IP>:8000/api/llm?q=who%20are%20you?"

Sample output:

student@ace135.sans.labs ~/code/aviata-chatbot (main)$ kubectl -n aviata-chatbot get po -o wide |grep backend-deployment
backend-deployment-74876d5497-mqngg    1/1     Running   0          4h44m   10.244.0.8   minikube   <none>           <none>
student@ace135.sans.labs ~/code/aviata-chatbot (main)$ APP2_POD_NAME=$(kubectl -n application2 get po -o json |jq -r '.items[].metadata.name' | grep app2)
student@ace135.sans.labs ~/code/aviata-chatbot (main)$ echo $APP2_POD_NAME
app2-deployment-58667fc458-kvk5l
student@ace135.sans.labs ~/code/aviata-chatbot (main)$ kubectl -n application2 exec -ti $APP2_POD_NAME -- /bin/sh
# curl http://10.244.0.8:8000/api/llm?q=who%20are%20you?
{"message":"Hello! I am Aviata-chatbot, your helpful assistant. How can I assist you today?","Your prompt is":"who are you?"}#
# exit
student@ace135.sans.labs ~/code/aviata-chatbot (main)$

Even though app2 lives in a separate namespace, that is only a logical separation; as the test above shows, pods in different namespaces can still reach each other over the network by default. To truly isolate the application, we can use a NetworkPolicy:

cd ~/code/aviata-chatbot
kubectl apply -f deploy-k8s/network-policy-deny-all.yaml
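
The contents of network-policy-deny-all.yaml are not reproduced in this tutorial. As a rough sketch, a default-deny ingress policy for the aviata-chatbot namespace typically looks like the manifest below, written here to a scratch file. Treat the exact fields as an assumption rather than a copy of the repository file:

```shell
# Write an assumed default-deny ingress policy to a scratch file.
policy_file=$(mktemp)
cat > "$policy_file" <<'EOF'
apiVersion: networking.k8s.io/v1
kind: NetworkPolicy
metadata:
  name: default-deny-ingress
  namespace: aviata-chatbot
spec:
  podSelector: {}   # empty selector: applies to every pod in the namespace
  policyTypes:
  - Ingress         # no ingress rules are listed, so all inbound traffic is denied
EOF

cat "$policy_file"
```

Because such a policy selects every pod and lists no allowed sources, inbound traffic from application2, and from any other source, is dropped.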

Now try to access aviata-chatbot's backend from the app2 pod again:

kubectl -n aviata-chatbot get po -o wide |grep backend-deployment
APP2_POD_NAME=$(kubectl -n application2 get po -o json |jq -r '.items[].metadata.name' | grep app2)
echo $APP2_POD_NAME
kubectl -n application2 exec -ti $APP2_POD_NAME -- /bin/sh

Once you are inside the container, try to reach the aviata-chatbot backend again. This time the request should hang until you interrupt it with Ctrl+C:

curl "http://<BACKEND_IP>:8000/api/llm?q=who%20are%20you?"

Sample Output:

student@ace135.sans.labs ~/code/aviata-chatbot (main)$ kubectl apply -f deploy-k8s/network-policy-deny-all.yaml
networkpolicy.networking.k8s.io/default-deny-ingress created
student@ace135.sans.labs ~/code/aviata-chatbot (main)$ kubectl -n aviata-chatbot get po -o wide |grep backend-deployment
backend-deployment-74876d5497-hnqg9    1/1     Running   0          22m   10.244.120.69   minikube   <none>           <none>
student@ace135.sans.labs ~/code/aviata-chatbot (main)$ APP2_POD_NAME=$(kubectl -n application2 get po -o json |jq -r '.items[].metadata.name' | grep app2)
student@ace135.sans.labs ~/code/aviata-chatbot (main)$ echo $APP2_POD_NAME
app2-deployment-58667fc458-tr2gj
student@ace135.sans.labs ~/code/aviata-chatbot (main)$ kubectl -n application2 exec -ti $APP2_POD_NAME -- /bin/sh
# curl http://10.244.120.69:8000/api/llm?q=who%20are%20you?
^C
#

Is our app still working?

Feel free to delete the policy and try again:

kubectl delete -f deploy-k8s/network-policy-deny-all.yaml

TASK 5: Cleanup

Remove the resources created in this tutorial:

cd ~/code/aviata-chatbot
kubectl -n aviata-chatbot delete -f deploy-k8s/weaviate-vectordb.yaml
kubectl -n aviata-chatbot delete -f deploy-k8s/backend.yaml
kubectl -n aviata-chatbot delete -f deploy-k8s/frontend.yaml
kubectl -n aviata-chatbot delete -f deploy-k8s/configmap.yaml
kubectl delete namespace aviata-chatbot
kubectl delete namespace application2