Recovering CNPG Apps after Reboot
Apps with a PostgreSQL database that were updated to the new CNPG common sometimes don’t survive a reboot of TrueNAS Scale. The App then hangs on DEPLOYING and pods are in state Completed or TaintToleration.
Symptoms
If you have rebooted and your Apps are hanging on DEPLOYING, run the command k3s kubectl get all -n ix-<app-name> and check whether you see pods in state Completed or TaintToleration and the app's main pod in state Init.
Examples:
    k3s kubectl get all -n ix-home-assistant
    NAME                                  READY   STATUS            RESTARTS   AGE
    pod/home-assistant-cnpg-main-1        0/1     TaintToleration   0          12h
    pod/home-assistant-cnpg-main-2        0/1     TaintToleration   0          12h
    pod/home-assistant-85865456d5-tc8h4   0/1     TaintToleration   0          12h
    pod/home-assistant-85865456d5-kl96x   0/1     Init:0/2          0          12h

    k3s kubectl get all -n ix-home-assistant
    NAME                                               READY   STATUS      RESTARTS   AGE
    pod/home-assistant-cnpg-main-2                    0/1     Completed   0          22m
    pod/home-assistant-cnpg-main-rw-df9bcbccc-s8z2n   0/1     Completed   0          23m
    pod/home-assistant-cnpg-main-rw-df9bcbccc-ptltn   0/1     Completed   0          23m
    pod/home-assistant-cnpg-main-rw-df9bcbccc-jbbcj   1/1     Running     0          12m
    pod/home-assistant-5867d984d9-gfznd               0/1     Completed   0          23m
    pod/home-assistant-cnpg-main-1                    0/1     Completed   0          23m
    pod/home-assistant-cnpg-main-rw-df9bcbccc-q2w2d   1/1     Running     0          12m
    pod/home-assistant-5867d984d9-vcp6x               0/1     Init:0/2    0          12m

Logs from the cnpg-wait container in the main app pod show something like this:

    Testing database on url:  home-assistant-cnpg-main-rw
    home-assistant-cnpg-main-rw:5432 - no response

Recovery Steps
To recover your app, you need to first stop it (do not click the Stop button!), delete the hanging pods and then restart the app.
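As a rough sketch only, the stop, delete and restart sequence could be wrapped in a small shell script. The function names stuck_pods and recover_app are hypothetical and not part of HeavyScript or TrueNAS. The 180-second pause is an assumption based on the 2-3 minute wait in the steps listed afterwards, and the script assumes that every pod still reported as Completed or TaintToleration after the stop is safe to delete.

```shell
#!/bin/sh
# Hypothetical helper (not part of HeavyScript or TrueNAS).
# Reads `kubectl get pods --no-headers` output on stdin
# (columns: NAME READY STATUS RESTARTS AGE) and prints the names
# of pods whose STATUS is Completed or TaintToleration.
stuck_pods() {
    awk '$3 == "Completed" || $3 == "TaintToleration" { print $1 }'
}

# Hypothetical wrapper around the manual recovery steps.
# Usage: recover_app <app-name>
recover_app() {
    app="$1"
    ns="ix-$app"

    # Stop the app with HeavyScript (not with the Stop button).
    heavyscript app --stop "$app"

    # Assumed wait of 3 minutes so the pods can shut down.
    sleep 180

    # Delete every pod that is still hanging.
    k3s kubectl get pods -n "$ns" --no-headers | stuck_pods |
    while read -r pod; do
        k3s kubectl delete pods -n "$ns" "$pod"
    done

    # Start the app again.
    heavyscript app --start "$app"
}
```

After defining the functions in the SCALE GUI Shell, a call such as recover_app home-assistant would run the whole sequence. Running the steps by hand remains the safer choice if you want to inspect the pod list before deleting anything.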
- Stop the app either by checking “Stop All” in the app settings or with HeavyScript via the SCALE GUI Shell as below

      heavyscript app --stop <app-name>

- Wait 2-3 minutes
- Delete any still hanging pods with the below command

      k3s kubectl delete pods -n ix-<app-name> <pod name>

  e.g. k3s kubectl delete pods -n ix-home-assistant home-assistant-85865456d5-tc8h4

- Start the app either by unchecking “Stop All” in the app settings or with HeavyScript as below

      heavyscript app --start <app-name>

- If you unchecked “Stop All” you might have to click the Start button on the GUI (Start is safe, Stop is NOT). There also might be a task that gets stuck in TrueNAS under Jobs (top right). You can get rid of those by restarting the TrueNAS GUI with the below command

      systemctl restart middlewared

- Wait 2-3 minutes
- Check that the app and all of its pods are running. In the third section of the command output there should be no deployment.apps with 0 AVAILABLE

  Example:

      k3s kubectl get all -n ix-home-assistant
      NAME                                          READY   UP-TO-DATE   AVAILABLE   AGE
      deployment.apps/home-assistant-cnpg-main-rw   0/0     0            0           14h
      deployment.apps/home-assistant                1/1     1            1           14h

- If a deployment shows 0 AVAILABLE, you can scale it up manually to 1 replica, or if it’s a cnpg-main-rw deployment you might want 2 replicas

      k3s kubectl scale deploy <deployment.apps-name> -n ix-<app-name> --replicas=1

  e.g. k3s kubectl scale deploy home-assistant-cnpg-main-rw -n ix-home-assistant --replicas=2

Credit
Thanks to Zasx from the TrueCharts team for the steps used to create this guide.
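The final check in the recovery steps (no deployment.apps with 0 AVAILABLE) could be sketched in shell as follows. The function names unavailable_deployments and scale_up_unavailable are hypothetical. The sketch assumes the default column layout of k3s kubectl get deploy (NAME, READY, UP-TO-DATE, AVAILABLE, AGE) and always scales to 1 replica, so a cnpg-main-rw deployment that should have 2 replicas would still need the manual command from the steps above.

```shell
#!/bin/sh
# Hypothetical helper (not part of HeavyScript or TrueNAS).
# Reads `kubectl get deploy --no-headers` output on stdin
# (assumed columns: NAME READY UP-TO-DATE AVAILABLE AGE) and prints
# the names of deployments whose AVAILABLE column is 0.
unavailable_deployments() {
    awk 'NF >= 4 && $4 == "0" { print $1 }'
}

# Hypothetical wrapper: scales every deployment with 0 AVAILABLE
# in the app's namespace up to 1 replica.
# Usage: scale_up_unavailable <app-name>
scale_up_unavailable() {
    ns="ix-$1"
    k3s kubectl get deploy -n "$ns" --no-headers | unavailable_deployments |
    while read -r deploy; do
        k3s kubectl scale deploy "$deploy" -n "$ns" --replicas=1
    done
}
```

Note that this would also scale up a deployment that was deliberately set to 0 replicas, so it is worth reviewing the output of unavailable_deployments before scaling anything.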