# Node management

## Change kernel parameters

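Setting kernel parameters (sysctls) through a Pod’s `securityContext` will most likely result in `SysctlForbidden` errors, because the kubelet rejects sysctls that are not explicitly allowed on the node.

As a workaround, create a DaemonSet that sets the parameter on every node from a privileged init container.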

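```yaml
# apps/v1 replaces extensions/v1beta1, which was removed in Kubernetes 1.16.
apiVersion: apps/v1
kind: DaemonSet
metadata:
  name: more-fs-watchers
  namespace: kube-system
  labels:
    app: more-fs-watchers
spec:
  # apps/v1 requires a selector that matches the pod template labels.
  selector:
    matchLabels:
      name: more-fs-watchers
  template:
    metadata:
      labels:
        name: more-fs-watchers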
    spec:
      hostNetwork: true
      hostPID: true
      hostIPC: true
      initContainers:
        - command:
            - sh
            - -c
            - sysctl -w fs.inotify.max_user_watches=524288;
          image: alpine:3.6
          imagePullPolicy: IfNotPresent
          name: sysctl
          resources: {}
          securityContext:
            privileged: true
          volumeMounts:
            - name: sys
              mountPath: /sys
      containers:
        - resources:
            requests:
              cpu: 0.01
          image: alpine:3.6
          name: sleepforever
          command: ["tail"]
          args: ["-f", "/dev/null"]
      volumes:
        - name: sys
          hostPath:
            path: /sys
```
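
To roll the DaemonSet out and confirm the new value, something like the following may help. This is a minimal sketch, not a definitive procedure: the manifest file name `more-fs-watchers.yaml` and the helper name `apply_and_check` are assumptions. It relies on the pod template label `name=more-fs-watchers` from the manifest above, and it assumes the pod shares the host's user namespace (the default), so the value read inside the pod is the node's value.

```bash
#!/usr/bin/env bash
# Hypothetical helper: apply the manifest, wait for the rollout, then read
# the current limit from the first pod of the DaemonSet.
apply_and_check() {
  local manifest="${1:-more-fs-watchers.yaml}"   # assumed file name
  kubectl apply -f "$manifest" || return 1
  kubectl -n kube-system rollout status daemonset/more-fs-watchers || return 1

  local pod
  pod="$(kubectl -n kube-system get pods -l name=more-fs-watchers \
    -o jsonpath='{.items[0].metadata.name}')" || return 1

  # Prints the effective fs.inotify.max_user_watches value.
  kubectl -n kube-system exec "$pod" -- cat /proc/sys/fs/inotify/max_user_watches
}
```

Calling `apply_and_check more-fs-watchers.yaml` should end by printing the value set by the init container.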

## Reboot node

A node can be rebooted in one of the following ways:

* Manually, through the Azure portal or the Azure CLI.
* By upgrading your AKS cluster. Cluster upgrades [cordon and drain nodes](https://kubernetes.io/docs/tasks/administer-cluster/safely-drain-node/) automatically and then bring a new node online with the latest Ubuntu image and a new patch version or a minor Kubernetes version. For more information, see [Upgrade an AKS cluster](https://docs.microsoft.com/en-us/azure/aks/upgrade-cluster).
* By using [Kured](https://github.com/weaveworks/kured), an open-source reboot daemon for Kubernetes. Kured runs as a [DaemonSet](https://kubernetes.io/docs/concepts/workloads/controllers/daemonset/) and monitors each node for the presence of a file that indicates that a reboot is required. Across the cluster, OS reboots are managed by the same [cordon and drain process](https://kubernetes.io/docs/tasks/administer-cluster/safely-drain-node/) as a cluster upgrade.
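
The manual option can be sketched with the Azure CLI roughly as follows. This is an illustration under assumptions: the helper name `reboot_node` is hypothetical, and it assumes the Kubernetes node name is also the VM name in the node resource group, which may not hold for scale-set based node pools. Depending on your workloads, `kubectl drain` may need additional flags.

```bash
#!/usr/bin/env bash
# Hypothetical helper: drain a node, restart its VM, then make it
# schedulable again.
# Arguments: <node-resource-group> <node-name>
reboot_node() {
  local rg="${1:-}" node="${2:-}"
  if [ -z "$rg" ] || [ -z "$node" ]; then
    echo "usage: reboot_node <node-resource-group> <node-name>" >&2
    return 2
  fi

  # Evict pods first; DaemonSet-managed pods are left in place.
  kubectl drain "$node" --ignore-daemonsets || return 1

  # Assumes the VM name matches the node name.
  az vm restart --resource-group "$rg" --name "$node" || return 1

  # Allow scheduling on the node again.
  kubectl uncordon "$node"
}
```

Draining before the restart mirrors the cordon and drain process that cluster upgrades and Kured use.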

**References**

<https://docs.microsoft.com/en-us/azure/aks/faq#are-security-updates-applied-to-aks-agent-nodes>

## SSH to nodes

Set your subscription.

```bash
az account set --subscription "MY-SUBSCRIPTION"
```

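Set an environment variable with the name of your cluster's node resource group, the one that holds the node VMs. By default, AKS names it `MC_<resource-group>_<cluster-name>_<location>`.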

```bash
CLUSTER_RESOURCE_GROUP=MC_my-aks-name
```
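
If you are not sure of the node resource group name, it can usually be looked up with the Azure CLI. A minimal sketch, assuming a hypothetical helper name `node_resource_group` and placeholder cluster names:

```bash
#!/usr/bin/env bash
# Hypothetical helper: print the node resource group of an AKS cluster.
# Arguments: <cluster-resource-group> <cluster-name>
node_resource_group() {
  az aks show --resource-group "$1" --name "$2" \
    --query nodeResourceGroup --output tsv
}

# Example with placeholder names:
# CLUSTER_RESOURCE_GROUP="$(node_resource_group my-resource-group my-aks-name)"
```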

Add your public SSH key to the node.

```bash
az vm user update \
    --resource-group "$CLUSTER_RESOURCE_GROUP" \
    --name PUT-YOUR-NODE-NAME-HERE \
    --username azureuser \
    --ssh-key-value ~/.ssh/id_rsa.pub
```

Get your node IP.

```bash
az vm list-ip-addresses --resource-group "$CLUSTER_RESOURCE_GROUP" -o table
```
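
The node names and internal IPs can also be read from the Kubernetes API instead of Azure. The following is a sketch only; the helper name `list_node_ips` is hypothetical.

```bash
#!/usr/bin/env bash
# Hypothetical helper: print "<node-name><TAB><internal-ip>" for every node.
list_node_ips() {
  kubectl get nodes \
    -o jsonpath='{range .items[*]}{.metadata.name}{"\t"}{.status.addresses[?(@.type=="InternalIP")].address}{"\n"}{end}'
}
```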

Run a temporary pod inside the cluster to use as a jump host.

```bash
kubectl run -it --rm aks-ssh --image=debian
```

Install the SSH client inside the pod.

```bash
apt-get update && apt-get install openssh-client vim -y
```

Set up the `id_rsa` file with your private key.

```bash
mkdir ~/.ssh
vi ~/.ssh/id_rsa
# Paste your id_rsa
chmod 600 ~/.ssh/id_rsa
```
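
As an alternative to pasting the key by hand, it can be copied into the pod with `kubectl cp`. This is a sketch under assumptions: the helper name `copy_key_to_pod` is hypothetical, it must run from a second terminal on your workstation while the `aks-ssh` pod is still running, and it relies on the `run=aks-ssh` label that `kubectl run` applies to the pod.

```bash
#!/usr/bin/env bash
# Hypothetical helper: copy a private key from the workstation into the
# running aks-ssh pod as /id_rsa.
# Argument: path to the private key (defaults to ~/.ssh/id_rsa)
copy_key_to_pod() {
  local key="${1:-$HOME/.ssh/id_rsa}"

  local pod
  pod="$(kubectl get pod -l run=aks-ssh \
    -o jsonpath='{.items[0].metadata.name}')" || return 1

  kubectl cp "$key" "$pod:/id_rsa"
}
```

Inside the pod, the copied key would then need `chmod 600 /id_rsa` and can be passed to `ssh` with `-i /id_rsa`.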

SSH to your node.

```bash
ssh azureuser@PUT.YOUR.NODE.IP.HERE
```

