Technical Note: LXD Database and patch SQL

I deleted some ZFS storage pools that looked unused, without realizing that LXD was still using them. Today, after the machine rebooted, LXD refused to start, and the following messages were written to lxd.log:

t=2019-10-07T23:02:43+0900 lvl=info msg="Initializing storage pools"
t=2019-10-07T23:02:43+0900 lvl=eror msg="Failed to start the daemon: ZFS storage pool \"juju-zfs\" could not be imported: "
t=2019-10-07T23:02:43+0900 lvl=info msg="Starting shutdown sequence"

LXD settings are stored in a dqlite database (distributed SQLite) at /var/snap/lxd/common/lxd/database/global/db.bin, so I opened it to check the relevant records.

sqlite> .tables
certificates                        networks
config                              networks_config
images                              networks_nodes
images_aliases                      nodes
images_nodes                        operations
images_properties                   profiles
images_source                       profiles_config
instances                           profiles_config_ref
instances_backups                   profiles_devices
instances_config                    profiles_devices_config
instances_config_ref                profiles_devices_ref
instances_devices                   profiles_used_by_ref
instances_devices_config            projects
instances_devices_ref               projects_config
instances_profiles                  projects_config_ref
instances_profiles_ref              projects_used_by_ref
instances_snapshots                 schema
instances_snapshots_config          storage_pools
instances_snapshots_config_ref      storage_pools_config
instances_snapshots_devices         storage_pools_nodes
instances_snapshots_devices_config  storage_volumes
instances_snapshots_devices_ref     storage_volumes_config
sqlite> select * from storage_pools;
sqlite> select * from storage_pools_config;

It seems that the storage pool settings are stored in two tables: storage_pools and storage_pools_config.
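To see which settings belong to a given pool, the two tables can be joined. The following is only a minimal sketch using Python's sqlite3 module against an in-memory stand-in database. The column names (id, name, driver, storage_pool_id, key, value) and the sample rows are my assumptions for illustration, not a dump of the real LXD schema, and the helper pool_settings is hypothetical.

```python
import sqlite3


def pool_settings(conn, pool_name):
    """Return the (key, value) config rows of the named pool, sorted by key."""
    rows = conn.execute(
        """
        SELECT c.key, c.value
          FROM storage_pools AS p
          JOIN storage_pools_config AS c ON c.storage_pool_id = p.id
         WHERE p.name = ?
         ORDER BY c.key
        """,
        (pool_name,),
    )
    return rows.fetchall()


# In-memory stand-in; the schema and rows below are assumed, not taken from LXD.
conn = sqlite3.connect(":memory:")
conn.executescript(
    """
    CREATE TABLE storage_pools (
        id INTEGER PRIMARY KEY, name TEXT, driver TEXT);
    CREATE TABLE storage_pools_config (
        id INTEGER PRIMARY KEY, storage_pool_id INTEGER, key TEXT, value TEXT);
    INSERT INTO storage_pools VALUES (1, 'default', 'zfs'), (2, 'juju-zfs', 'zfs');
    INSERT INTO storage_pools_config VALUES
        (1, 1, 'source', 'default'),
        (2, 2, 'source', 'juju-zfs'),
        (3, 2, 'zfs.pool_name', 'juju-zfs');
    """
)

for key, value in pool_settings(conn, "juju-zfs"):
    print(key, "=", value)
```

Running a query like this against a copy of the database, rather than the live file, avoids touching anything LXD is using.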

I tried to delete the related records from these tables and restarted the LXD process, but LXD still failed with the same error. I went back to the database and confirmed that the records were still there even though I had deleted them. It seems that LXD recovers those records from its log files. I could have read the code to see how that recovery works, but it would have taken time, so I decided to look for database documentation in the LXD source code while opening a new topic to ask the LXD community for help.

I skimmed through the LXD database documentation and found that I could create a patch SQL file to remove the unnecessary records, since its SQL statements are run at a very early stage of LXD startup. I created the file with statements to remove the unneeded settings and started LXD.
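The cleanup statements could look roughly like the ones below. This is a sketch under assumptions, not the exact file I used: the column names and the sample rows are assumed, and the statements are tried out here on an in-memory SQLite database through Python's sqlite3 module instead of on the real LXD database. Other tables from the listing (for example storage_pools_nodes) may also hold rows for the pool; they are not covered here.

```python
import sqlite3

# Hypothetical patch statements; table columns are assumptions for illustration.
# Config rows are removed first, then the pool row they point to.
PATCH_SQL = """
DELETE FROM storage_pools_config
 WHERE storage_pool_id IN (SELECT id FROM storage_pools WHERE name = 'juju-zfs');
DELETE FROM storage_pools WHERE name = 'juju-zfs';
"""


def apply_patch(conn, sql):
    """Run every statement in sql against conn and commit."""
    conn.executescript(sql)
    conn.commit()


def demo():
    """Apply the patch to a stand-in database and report what is left."""
    conn = sqlite3.connect(":memory:")
    conn.executescript(
        """
        CREATE TABLE storage_pools (
            id INTEGER PRIMARY KEY, name TEXT, driver TEXT);
        CREATE TABLE storage_pools_config (
            id INTEGER PRIMARY KEY, storage_pool_id INTEGER, key TEXT, value TEXT);
        INSERT INTO storage_pools VALUES (1, 'default', 'zfs'), (2, 'juju-zfs', 'zfs');
        INSERT INTO storage_pools_config VALUES
            (1, 1, 'source', 'default'),
            (2, 2, 'source', 'juju-zfs');
        """
    )
    apply_patch(conn, PATCH_SQL)
    pools = [row[0] for row in conn.execute(
        "SELECT name FROM storage_pools ORDER BY name")]
    config_rows = conn.execute(
        "SELECT COUNT(*) FROM storage_pools_config").fetchone()[0]
    return pools, config_rows


print(demo())
```

Trying the statements on a throwaway database first makes it easier to confirm that only the rows of the deleted pool are matched before LXD runs them at startup.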

And the LXD process started again, with all my in-development containers!

Lesson learned: before removing anything, look for all its usages.

