That time my k8s master broke and I deleted my volumes…
OBSERVE! This post should not be seen as a tutorial or best practices; the fixes I do in this post are probably bad practice, and had I kept a better backup routine on this cluster, the issue would have been a whole lot smaller! Read it as a rant, as sort of a funny story, don’t follow it and then hold me liable for any data loss!!
So, today I woke up to a k8s cluster which had a broken node. This is usually not a huge issue; you can restart the node and it will often re-join the cluster pretty much automatically. But in this case it didn’t…
So what is the problem then? Well, as it turns out, the master node had some issues. In this cluster, which is a very small cluster, the master node is untainted and runs pods on itself; it also runs the etcd servers internally. The scheduler had broken and the server itself was behaving badly…
Well, smart as I am, instead of figuring this out from the start I tried to restart the containers… and when that didn’t work
I actually deleted a couple of daemonsets and deployments… and well, you can guess that it didn’t end up as I wanted…
After messing around with restarting and deleting I decided to restart the node (which I had not done yet, as I didn’t know it was broken…).
It rebooted and kubernetes didn’t start.
When kubernetes doesn’t start and you don’t know why, the best thing to do is to check some of the logs. The easiest way to check when it comes to the startup process is through journalctl, just run:
> sudo journalctl -u kubelet --since "1 hour ago"
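If you want to keep an eye on the log while poking around, following it live works as well (the -f flag just tails new entries as they arrive):
> sudo journalctl -u kubelet -f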
and you will get all your kubelet logs in the terminal right away. In this case, it was quite an easy fix: the server had swap enabled (something that kubernetes DOES NOT LIKE!), so turning that off and changing the entry in the /etc/fstab file fixed that and kubelet started fine. In the fstab file, you will find an entry which has the type swap; just put a # (comment) in front of it and it will not load during boot.
> cat /etc/fstab
# Hah! I masked my UUIDs!
UUID=xxxxxxxx-xxxx-4d2c-xxxx-7bdxx94e39da / ext4 noatime,errors=remount-ro 0 1
# /boot was on /dev/sda1 during installation
UUID=xxxxxxxx-a569-xxxx-xxxx-1405f7axx148 /boot ext4 noatime 0 2
# swap was on /dev/sda3 during installation
UUID=xxxxxxxx-4008-xxxx-a9db-2ff3bxxxxxx7 none swap sw 0 0
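After putting a # in front of the swap entry, it would look something like this (same masked UUID as above):
# UUID=xxxxxxxx-4008-xxxx-a9db-2ff3bxxxxxx7 none swap sw 0 0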
To turn it off right away without having to reboot, run [sudo] swapoff -a. This turns off the swap, but it will be back after a restart if you don’t change the fstab file.
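For completeness, a minimal sketch of the whole swap fix, assuming a systemd-managed kubelet like mine:
> sudo swapoff -a
> swapon --show        # no output means no swap is active
> sudo systemctl restart kubelet
> sudo systemctl status kubelet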
So, now my little minion was up and running again, but did that fix the issues with the pods that were broken? No… It did not… not only that… daemonsets wouldn’t spawn new pods, containers didn’t have their Ceph mounts working… Fun fun, I thought.
That is when I understood that the master must be misbehaving, so I tried to reboot it… and it didn’t work… Not at all…
The command returned nothing and the server was still up…
Luckily my bare-metal provider has a control panel with the ability to force-reboot machines. This is pretty much like
pulling the plug and turning it on again, something I do NOT like to do, but this time I had to… Waited and waited…
Finally, the server came back online (this was like half a minute, but felt like hours)!
So, did this fix my issue? Well yes, mostly! The containers went back up, all daemonsets spawned new pods, all
networking and Ceph storage was just as it should be… yay!… err… Right, I forgot that I actually killed a couple of
deployments with kubectl delete -f ...yml… And that - smart as I am sometimes - those deployments had their
storage claims in the same yml file as the deployment… so the pods had created new persistent volumes for their claims!
Doh!
Now I had a couple of deployments with the wrong volumes, so it was as if they were all brand new… The containers were my OAuth provider and my Issue Tracker, stuff that you really don’t want to lose!
This story ends quite well (I hope… for now it seems good at the least!), because a lost PV is just a reference, not actually a lost volume, as long as you use the Retain reclaim policy on them!
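If a PV does not have Retain already, you can patch it in place (standard kubectl patch syntax; <pv-name> is just a placeholder for the volume name):
> kubectl patch pv <pv-name> -p '{"spec":{"persistentVolumeReclaimPolicy":"Retain"}}'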
So, what to do when you lose a PV from your pod? When the PVC is pointing to a new volume instead?
Well, first off, you need to find the volume that you need.
To fetch all your PVs from kubernetes just run:
> kubectl get pv
NAME CAPACITY ACCESS MODES RECLAIM POLICY STATUS CLAIM STORAGECLASS REASON AGE
pvc-00632722-2e2d-11e9-bc92-0007cb040cb0 20Gi RWX Retain Bound jitesoft/minio-storage-minio-2 rook-ceph-block 96d
pvc-127342a6-2e2d-11e9-bc92-0007cb040cb0 20Gi RWX Retain Bound jitesoft/minio-storage-minio-3 rook-ceph-block 96d
pvc-141f359f-552c-11e9-bc92-0007cb040cb0 10Gi RWX Retain Bound kube-system/consul-storage-consul-1 rook-ceph-block 47d
pvc-193f8ac5-552c-11e9-bc92-0007cb040cb0 10Gi RWX Retain Bound kube-system/consul-storage-consul-2 rook-ceph-block 47d
pvc-1defdb0c-5b73-11e9-bc92-0007cb040cb0 5Gi RWX Retain Bound monitoring/grafana-storage rook-ceph-block 39d
pvc-2649d5ae-6a65-11e9-bc92-0007cb040cb0 10Gi RWX Retain Released jitesoft/mongo-persistent-storage-mongo-0 rook-ceph-block 20d
pvc-2d9e264b-6a63-11e9-bc92-0007cb040cb0 10Gi RWX Retain Released jitesoft/mongo-persistent-storage-mongo-1 rook-ceph-block 20d
pvc-2fb2a7d4-2af7-11e9-a529-0007cb040cb0 1Gi RWO Retain Released default/jiteeu.isso.persistent-volume.claim rook-ceph-block 101d
pvc-3df9fba6-552b-11e9-bc92-0007cb040cb0 10Gi RWX Retain Bound kube-system/consul-storage-consul-0 rook-ceph-block 47d
As you can see above, I use Rook-Ceph as my storage engine; it works great, I like it a lot!
What you need from the above view is the NAME of the PV. The names of the ones above are all auto-generated, so let’s for example just take the mongo-1 claim, as it is released!
pvc-2d9e264b-6a63-11e9-bc92-0007cb040cb0
… a long and nice name, something you really don’t have to remember though, hehe…
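If the list is long, a plain grep narrows it down to the released ones:
> kubectl get pv | grep Released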
The first thing we want to do is to edit the volume:
kubectl edit pv pvc-2d9e264b-6a63-11e9-bc92-0007cb040cb0
This will - depending on the system you are using - open a text editor. The file will look something like this:
# Please edit the object below. Lines beginning with a '#' will be ignored,
# and an empty file will abort the edit. If an error occurs while saving this file will be
# reopened with the relevant failures.
#
apiVersion: v1
kind: PersistentVolume
metadata:
  annotations:
    pv.kubernetes.io/provisioned-by: ceph.rook.io/block
  creationTimestamp: 2019-04-29T09:43:01Z
  finalizers:
  - kubernetes.io/pv-protection
  name: pvc-2d9e264b-6a63-11e9-bc92-0007cb040cb0
  resourceVersion: "11233174"
  selfLink: /api/v1/persistentvolumes/pvc-2d9e264b-6a63-11e9-bc92-0007cb040cb0
  uid: 2e120442-6a63-11e9-bc92-0007cb040cb0
spec:
  accessModes:
  - ReadWriteMany
  capacity:
    storage: 10Gi
  claimRef:
    apiVersion: v1
    kind: PersistentVolumeClaim
    name: mongo-persistent-storage-mongo-1
    namespace: jitesoft
    resourceVersion: "11231795"
    uid: 2d9e264b-6a63-11e9-bc92-0007cb040cb0
  flexVolume:
    driver: ceph.rook.io/rook-ceph-system
    options:
      clusterNamespace: rook-ceph
      dataBlockPool: ""
      image: pvc-2d9e264b-6a63-11e9-bc92-0007cb040cb0
      pool: replicapool
      storageClass: rook-ceph-block
  persistentVolumeReclaimPolicy: Retain
  storageClassName: rook-ceph-block
  volumeMode: Filesystem
status:
  phase: Released
The part that we care about is the claimRef entry. To be sure that we can re-mount it on a new PVC, we ought to remove the claimRef. So delete the lines of the claimRef object, save the file and close it.
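If you would rather skip the editor, a JSON patch should remove the whole claimRef object in one go (same PV name as above):
> kubectl patch pv pvc-2d9e264b-6a63-11e9-bc92-0007cb040cb0 --type json -p '[{"op": "remove", "path": "/spec/claimRef"}]'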
When that is done, the PV can be claimed by any pod in the system, so now we go to our deployment yaml file.
On the spec object of the PersistentVolumeClaim, we add a volumeName property and set the NAME of the volume as the value:
apiVersion: v1
kind: PersistentVolumeClaim
metadata:
  namespace: jitesoft
  name: mongo-persistent-storage-mongo-1
spec:
  # HERE!
  volumeName: pvc-2d9e264b-6a63-11e9-bc92-0007cb040cb0
  storageClassName: rook-ceph-block
  accessModes:
  - ReadWriteMany
  resources:
    requests:
      storage: 10Gi
We remove the currently running pod and the volume claim that we are now editing, and when that is done, we redeploy!
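Roughly like this; the pod name and file name are placeholders for whatever your deployment actually uses, only the claim name and namespace come from the manifest above:
> kubectl delete pod mongo-1 -n jitesoft
> kubectl delete pvc mongo-persistent-storage-mongo-1 -n jitesoft
> kubectl apply -f mongo.yml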
When the pod is back up, it will now use the correct volume and everything is awesome!
So, that was a short story about what I like to do during my Sunday afternoons! What are your hobbies?!