Tag Archives: Troubleshooting


Kubernetes Troubleshooting – Kubelet Unable to attach or mount volumes – timed out waiting for the condition

The Issue

When I updated my Kasten application in my Kubernetes cluster, I found that one of the pods was stuck in “init” status.

dean@dean [ ~ ] (⎈ |tkg-wld-01-admin@tkg-wld-01:default) # k get pods -n kasten-io -w
NAME READY STATUS RESTARTS AGE
aggregatedapis-svc-78564d4697-wl9wg 1/1 Running 0 3m9s
auth-svc-7977b9684b-zph27 1/1 Running 0 3m11s
catalog-svc-7ff7779b75-kmvsr 0/2 Init:0/2 0 2m43s


Running kubectl describe on that pod showed that the volume could not be attached.

Events:
Type Reason Age From Message
---- ------ ---- ---- -------
Normal Scheduled 2m58s default-scheduler Successfully assigned kasten-io/catalog-svc-7ff7779b75-kmvsr to tkg-wld-01-md-0-54598b8d99-rpqjf
Warning FailedMount 55s kubelet Unable to attach or mount volumes: unmounted volumes=[catalog-persistent-storage], unattached volumes=[k10-k10-token-lbqpw catalog-persistent-storage]: timed out waiting for the condition
The Cause

Somewhere along the line, my cluster had been left with stale VolumeAttachments linked to Kubernetes nodes that no longer exist. This appears to confuse the cluster about which node should be attaching the volume.

The image below shows how I tracked this down:

  • Find the Persistent Volume name linked to the claim behind the failure in the pod events
  • Map this to the available VolumeAttachments
  • Cross-reference the node on each VolumeAttachment against the nodes currently in the cluster
    • I’ve highlighted the missing node in the red box

kubectl get pv - get volumeattachment - get nodes
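
The cross-referencing in the list above can also be scripted. The following is a minimal sketch, not the method used in the original troubleshooting: the function name find_stale_attachments and the file names nodes.txt and attachments.txt are made up for illustration. It compares the node recorded on each VolumeAttachment with the nodes currently registered, and prints the attachments that point at a node that is gone.

```shell
#!/bin/sh
# Hypothetical helper (not from the original post): print the name of every
# VolumeAttachment whose node is missing from the current node list.
#   $1  file with one node name per line
#   $2  file with one "<attachment> <pv> <node>" triple per line
find_stale_attachments() {
  # Refuse to run without a readable node list; otherwise every
  # attachment would be reported as stale.
  [ -r "$1" ] || return 1
  awk -v nodes_file="$1" '
    BEGIN {
      # Load the known node names before reading any attachment lines.
      while ((getline line < nodes_file) > 0) {
        if (line != "") known[line] = 1
      }
      close(nodes_file)
    }
    # Third column is the node the attachment claims to live on.
    NF >= 3 && !($3 in known) { print $1 }
  ' "$2"
}

# Example usage against a live cluster (assumes kubectl access):
#   kubectl get nodes -o name | sed 's|^node/||' > nodes.txt
#   kubectl get volumeattachment --no-headers \
#     -o custom-columns=NAME:.metadata.name,PV:.spec.source.persistentVolumeName,NODE:.spec.nodeName \
#     > attachments.txt
#   find_stale_attachments nodes.txt attachments.txt
```

Keeping the comparison separate from the kubectl calls means the input files can be reviewed by eye before anything is acted on.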

The Fix

The fix is to remove the stale VolumeAttachment.

kubectl delete volumeattachment [volumeattachment_name]


After this, your pod should eventually retry the mount and start. Alternatively, you can delete the pod and let Kubernetes replace it for you (so long as it’s managed by a Deployment or another controller).
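
If several stale attachments turn up, the deletion can be wrapped in a small dry-run helper. This is a sketch under assumptions: delete_stale_attachments and the DRY_RUN variable are names invented here, and only the kubectl delete volumeattachment command comes from the fix above. By default it just prints the commands it would run, so they can be reviewed first.

```shell
#!/bin/sh
# Hypothetical helper: read VolumeAttachment names from stdin, one per line.
# With DRY_RUN=1 (the default) it only prints each command; set DRY_RUN=0 to
# actually run kubectl.
delete_stale_attachments() {
  while IFS= read -r name; do
    # Skip blank lines in the input.
    [ -n "$name" ] || continue
    if [ "${DRY_RUN:-1}" = "1" ]; then
      echo "kubectl delete volumeattachment $name"
    else
      kubectl delete volumeattachment "$name"
    fi
  done
}

# Example usage (the attachment name is a placeholder):
#   echo "csi-example-attachment" | delete_stale_attachments            # preview
#   echo "csi-example-attachment" | DRY_RUN=0 delete_stale_attachments  # delete
# To force a retry afterwards, the stuck pod from the events above could be
# removed so that its controller recreates it:
#   kubectl delete pod catalog-svc-7ff7779b75-kmvsr -n kasten-io
```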

Regards

Dean Lewis


How to Elevate to Root Access with WinSCP

I was using WinSCP to transfer logs from a VMware CloudBuilder appliance to troubleshoot a failed lab deployment. However, the files wouldn’t transfer, because reading them requires root access, and on this appliance you can only elevate to root after login.

The good news is that WinSCP can elevate to root after login. In your connection settings pane:

  1. Click Advanced
  2. Under “Environment” select “SCP/Shell”
  3. For the shell value, enter your command to elevate (for example, “sudo su -”, if that is how you elevate on the appliance)
  4. Save the configuration

When you next connect to your appliance, the command will be sent after login.


The official pages are here.

Regards

Dean


[Quick Post] Sharing a generic Troubleshooting poster I found online #vDM30in30

I found this post on Reddit, where a user had created a generic troubleshooting poster. It’s not aligned to any particular technology, but it can be very useful, especially for those getting started in IT.


Source: Reddit post

Regards

Dean


Applying the TSHOOT methodology to everything

Last exam of the year

For the past few months I’ve forced my head into revision for the three CCNP exams. With two down, I have TSHOOT left to do, and I’m aiming to pass before Christmas Day. That’s a nice goal to have, but it means that when I have to recertify next year, it’ll be around 24th December. Currently, though, Cisco only requires you to pass one of the three exams to keep the CCNP.

TSHOOT is an interesting exam compared with the others. It contains a small number of multiple-answer questions, but the bulk of the exam is built around a predefined, publicly available topology, where you troubleshoot various support tickets within a simulator.


There’s even an online mock exam provided by Cisco, so you can get used to the way the simulator works before you sit the exam. To me, it’s very close to an open-book exam. However, with a busy work schedule and only a small amount of time to complete the exams, I won’t have much time to sit down and learn the topology inside out. So fingers crossed.

Using the TSHOOT study guide for something else

The first two chapters of the official TSHOOT study guide are actually a really good blueprint for infrastructure maintenance and troubleshooting, one that can be applied beyond networking.
So that’s what I’m going to touch upon in this blog post.
Let’s start with infrastructure maintenance. This includes not only your network devices but also server and client hardware. The following are a few instances:

Continue reading Applying the TSHOOT methodology to everything


#VMworld Day 3 & 4 – Final Notes

So I didn’t get a chance to write up Day 3 or Day 4, but now that I’m home I can get into it. Writing blog posts and uploading pictures from an iPad to WordPress is pretty horrific, to be honest; I was getting ready to launch said iPad towards the Hang Space screen during the keynote. Maybe it’s time to update to the new iPad.

My sessions covered EUC, VVOLS, Troubleshooting and more. Day 4 was the best day of the whole week, and I actually attended every session I was booked into.

Here’s a view of the sun outside the conference center, in case you’re interested.

2014-10-16 12.47.23

Continue reading #VMworld Day 3 & 4 – Final Notes