Technical Notes: How to remove a Juju application in an error state

I have recently been using Juju to deploy applications as part of my current job. Most of the time, I just need to run `juju deploy application` and Juju takes care of all the settings, no matter which cloud I am using. In some cases, however, Juju refuses to follow my orders, which is really frustrating.

For example, the following deployment will definitely fail, because revision 24 of the grafana charm was not updated to match changes in upstream Grafana.

$ juju deploy grafana-24

The deployment gets stuck, and `juju status` shows the following:

$ juju status
Model       Controller  Cloud/Region  Version  SLA          Timestamp
experiment  stark-kvm   stark-kvm     2.6.5    unsupported  16:53:31+09:00

App      Version  Status  Scale  Charm    Store       Rev  OS      Notes
grafana           error       1  grafana  jujucharms   24  ubuntu

Unit        Workload  Agent  Machine  Public address  Ports  Message
grafana/0*  error     idle   0                                hook failed: "install"

Machine  State    DNS          Inst id    Series  AZ       Message
0        started               tidy-tick  bionic  default  Deployed

When Juju is stuck at this stage, the hook error prevents us from removing the application. Worse, if you try `juju remove-application grafana` without the `--force` or `--no-wait` flags, any subsequent command will also fail. In other words, the application refuses to be removed until you resolve its internal errors.
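Before attempting a removal, it can help to confirm which units are actually in the error state. The following is a minimal sketch that filters the tabular `juju status` output shown above. The function name `errored_units` is my own, and it assumes the column layout of the status output in this post (unit name first, workload status second).

```shell
# Print the names of units whose workload status is "error".
# Reads tabular `juju status` output on stdin.
# Assumption: unit rows are the only rows whose first column contains a "/".
errored_units() {
  awk '$1 ~ /\// && $2 == "error" {
    sub(/\*$/, "", $1)   # drop the leader marker, e.g. "grafana/0*" becomes "grafana/0"
    print $1
  }'
}
```

It would be used as `juju status | errored_units`.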

My colleagues suggested three ways to resolve this situation.

1. Fix the underlying problem, then run `juju resolve grafana/0`.
2. Operational hack 1: replace the failing hook script with a shell script that always returns a success status (exit code 0).
3. Operational hack 2: run `juju debug-hook grafana/0`, wait for the hook context to load, and exit immediately. This reports success to the Juju controller, which then performs the next action (our removal command).
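As a rough illustration of hack 1, here is a sketch of how the hook could be replaced from the client machine. The helper names `hook_path` and `neutralize_hook` are hypothetical. The sketch assumes the unit agent uses the default data directory `/var/lib/juju`, that `sudo` is available on the unit, and that the hook is a regular file rather than a symlink shared with other hooks. I have not verified it against every charm layout.

```shell
# Map a unit name such as "grafana/0" to the path of one of its hook scripts.
# Assumption: default agent data directory /var/lib/juju.
hook_path() {
  unit_tag=$(printf '%s' "$1" | tr '/' '-')
  printf '/var/lib/juju/agents/unit-%s/charm/hooks/%s\n' "$unit_tag" "$2"
}

# Hack 1: overwrite the failing hook with a script that always exits 0,
# then mark the unit as resolved so the hook is retried and succeeds.
neutralize_hook() {
  unit=$1
  hook=$2
  path=$(hook_path "$unit" "$hook")
  # The remote shell receives: printf '#!/bin/sh\nexit 0\n' | sudo tee <path> ...
  juju ssh "$unit" "printf '#!/bin/sh\nexit 0\n' | sudo tee '$path' >/dev/null && sudo chmod +x '$path'"
  juju resolve "$unit"
}
```

For the failure in this post, the call would be `neutralize_hook grafana/0 install`, followed by `juju remove-application grafana`. Keep in mind that this destroys the original hook, so it only makes sense for a unit you intend to throw away.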

I believe that users should be able to remove an application whenever they want, no matter what problem is occurring. It seems the Juju developers think the same, since they have triaged several related bugs. However, as of this writing, the issue has not been fixed (my version, 2.6.5-bionic-amd64, still has it). Until it is fixed, the only options are to actually resolve the error or to "hack" around it.

