This is Part 2 in a series outlining monitoring and instrumentation in Kubernetes. Part 1 discussed how to implement a Zipkin, OpenTracing, and Prometheus (ZOP) stack in Kubernetes to monitor Kubernetes and some of its underlying components like Docker, to monitor the ZOP stack itself, and to provide metrics and tracing for your own applications. Part 2 focuses on taking that ZOP stack into a production environment and making the Prometheus time series data persistent through the use of external volumes.

If you are interested in kicking the tires and giving this a try, this blog post builds upon the ZOP stack that can be deployed from https://github.com/dvonthenen/zop-stack.

 

Storage Persistence in Kubernetes with ScaleIO

In case you missed it, there was a great blog post written by Vladimir Vivien titled Kubernetes Adds Native ScaleIO Support that covers how to use ScaleIO natively in a Kubernetes environment. We will be using this Kubernetes+ScaleIO configuration to provide the storage persistence for our ZOP stack. That said, any of the native storage drivers available in Kubernetes will work; please check the Kubernetes documentation for their configuration instructions.

Why use persistent storage? Placing anything into a production environment requires persistent data for our applications. In the default configuration, Prometheus keeps all of its time series data (think database) inside the container itself. If the Kubernetes node running Prometheus hits a problem, such as a hardware fault, Prometheus gets rescheduled onto a different node, and without an external volume mount like our ScaleIO volume, all of that data is lost along with the old container. The ScaleIO volume provides the persistence that lets the Prometheus time series data move around with the container.

There are two methods for providing that persistence: using a pre-allocated volume or using a Kubernetes Persistent Volume Claim. Let’s take a look at how we do that below.

Simple Storage Persistence

To deploy Prometheus with a previously allocated ScaleIO volume in Kubernetes, you can simply use prometheus-simple.yaml, which mounts that ScaleIO volume as the backing store for the Prometheus time series data. If you already have Prometheus deployed, delete the existing deployment and run kubectl create -f prometheus-simple.yaml.

 

apiVersion: apps/v1beta1
kind: Deployment
metadata:
  name: prometheus
  labels:
    k8s-app: prometheus
    version: v1.7.1
  namespace: kube-system
spec:
  replicas: 1
  template:
    metadata:
      labels:
        app: prometheus
        k8s-app: prometheus
        version: v1.7.1
    spec:
      containers:
        - name: prometheus
          image: "prom/prometheus:v1.7.1"
          args:
            - "-config.file=/etc/prometheus/config.yaml"
            - "-storage.local.path=/var/lib/prometheus"
            - "-web.listen-address=0.0.0.0:9090"
          volumeMounts:
            - name: config
              mountPath: /etc/prometheus
            - name: test-data
              mountPath: /var/lib/prometheus
      volumes:
        - name: config
          configMap:
            name: prometheus
        - name: test-data
          scaleIO:
            gateway: https://10.138.0.6:443/api
            system: scaleio
            volumeName: prometheus
            secretRef:
              name: sio-secret

NOTE: Replace the ScaleIO system name, volumeName, and sio-secret with the actual names used in your configuration.
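The secretRef above points to a Kubernetes Secret holding the ScaleIO gateway credentials. If that Secret does not exist yet in your cluster, something along these lines should create it (a minimal sketch, assuming the kube-system namespace; substitute your own gateway username and password):

# Create the ScaleIO gateway credentials referenced by secretRef above.
# Replace the username and password with your actual gateway credentials.
kubectl create secret generic sio-secret -n kube-system \
  --type="kubernetes.io/scaleio" \
  --from-literal=username=<gateway-username> \
  --from-literal=password=<gateway-password>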

Dynamic Storage Reservation

If you haven’t already set this up, you will need to create a couple of definitions in Kubernetes (a StorageClass and a Persistent Volume Claim) to allow for dynamic storage reservations. For convenience, you can use the yaml found in the GitHub repo and run the following commands:

# This creates the storage class to provide Kubernetes access to ScaleIO.
# If the system name and sio-secret are named differently, that will need to
# be changed in this file.
kubectl create -f storageclass.yaml

# To allow for volume creation in the kube-system namespace.
# In this blog, this is the namespace where Prometheus is deployed to.
kubectl create -f persistvolumeclaim-kubesystem.yaml
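
If you want to see (or adapt) what those two definitions contain, a minimal sketch is below. The StorageClass name, protection domain, storage pool, and 8Gi request are assumptions for illustration; match them, along with the gateway, system, and secret names, to your ScaleIO setup and to the yaml in the repo. Note that the ScaleIO provisioner parameter names can differ slightly between Kubernetes releases, so double-check them against the documentation for your version.

# storageclass.yaml (sketch)
apiVersion: storage.k8s.io/v1
kind: StorageClass
metadata:
  name: sio-small
provisioner: kubernetes.io/scaleio
parameters:
  gateway: https://10.138.0.6:443/api
  system: scaleio
  protectionDomain: default
  storagePool: default
  secretRef: sio-secret
  fsType: xfs

# persistvolumeclaim-kubesystem.yaml (sketch)
apiVersion: v1
kind: PersistentVolumeClaim
metadata:
  name: pvc-sio-small
  namespace: kube-system
spec:
  storageClassName: sio-small
  accessModes:
    - ReadWriteOnce
  resources:
    requests:
      storage: 8Gi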

To deploy Prometheus using the Dynamic Storage Reservation functionality in Kubernetes, you can use prometheus-dynamic.yaml below, which provisions a persistent volume from ScaleIO and uses it as the backing store for the Prometheus time series data. If you already have Prometheus deployed, delete the existing deployment and run kubectl create -f prometheus-dynamic.yaml.

apiVersion: apps/v1beta1
kind: Deployment
metadata:
  name: prometheus
  labels:
    k8s-app: prometheus
    version: v1.7.1
  namespace: kube-system
spec:
  replicas: 1
  template:
    metadata:
      labels:
        app: prometheus
        k8s-app: prometheus
        version: v1.7.1
    spec:
      containers:
        - name: prometheus
          image: "prom/prometheus:v1.7.1"
          args:
            - "-config.file=/etc/prometheus/config.yaml"
            - "-storage.local.path=/var/lib/prometheus"
            - "-web.listen-address=0.0.0.0:9090"
          volumeMounts:
            - name: config
              mountPath: /etc/prometheus
            - name: test-data
              mountPath: /var/lib/prometheus
      volumes:
        - name: config
          configMap:
            name: prometheus
        - name: test-data
          persistentVolumeClaim:
            claimName: pvc-sio-small
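
Once Prometheus is redeployed, a quick sanity check confirms that the claim bound and that the data really does survive a reschedule (a sketch using the labels from the deployment above):

# Confirm the claim was provisioned/bound and that Prometheus is running.
kubectl get pvc -n kube-system
kubectl get pods -n kube-system -l k8s-app=prometheus

# Optional: delete the pod and let the Deployment reschedule it. The
# ScaleIO-backed volume is reattached to the new pod, so the time series
# data collected so far moves with it.
kubectl delete pod -n kube-system -l k8s-app=prometheus
kubectl get pods -n kube-system -l k8s-app=prometheus -w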

To Production and Beyond

Part 1 talked about the benefits of metrics and instrumentation in relation to containers, container orchestrators, and microservices and why a solution is needed. This post covered how to deploy the ZOP stack in a best-practices configuration so that you can monitor and trace in a production environment. Now that the ZOP stack is running in your Kubernetes cluster, leverage it and integrate metrics and tracing capabilities into your own applications!