October 7, 2019

Technical Note: LXD Database and patch SQL

I deleted some unused ZFS storage pools without realizing that LXD was still using them. Today, after the machine rebooted, LXD refused to start, and the following messages appeared in lxd.log:

t=2019-10-07T23:02:43+0900 lvl=info msg="Initializing storage pools"
t=2019-10-07T23:02:43+0900 lvl=eror msg="Failed to start the daemon: ZFS storage pool \"juju-zfs\" could not be imported: "
t=2019-10-07T23:02:43+0900 lvl=info msg="Starting shutdown sequence"

LXD stores its settings in a dqlite (distributed SQLite) database at /var/snap/lxd/common/lxd/database/global/db.bin, so I opened it in the sqlite shell to check the stored records.


sqlite> .tables
certificates                        networks
config                              networks_config
images                              networks_nodes
images_aliases                      nodes
images_nodes                        operations
images_properties                   profiles
images_source                       profiles_config
instances                           profiles_config_ref
instances_backups                   profiles_devices
instances_config                    profiles_devices_config
instances_config_ref                profiles_devices_ref
instances_devices                   profiles_used_by_ref
instances_devices_config            projects
instances_devices_ref               projects_config
instances_profiles                  projects_config_ref
instances_profiles_ref              projects_used_by_ref
instances_snapshots                 schema
instances_snapshots_config          storage_pools
instances_snapshots_config_ref      storage_pools_config
instances_snapshots_devices         storage_pools_nodes
instances_snapshots_devices_config  storage_volumes
instances_snapshots_devices_ref     storage_volumes_config
sqlite> select * from storage_pools;
1|lxd|zfs||1
2|juju-zfs|zfs||1
3|juju-btrfs|btrfs||1
sqlite> select * from storage_pools_config;
3|1|1|zfs.pool_name|lxd
4|1|1|source|lxd
5|1|1|volatile.initial_source|lxd
7|2|1|size|21GB
8|2|1|source|/var/snap/lxd/common/lxd/disks/juju-zfs.img
9|2|1|zfs.pool_name|juju-zfs
11|3|1|size|21GB
12|3|1|source|/var/snap/lxd/common/lxd/disks/juju-btrfs.img


It seems that the storage pool settings are stored in two tables: storage_pools and storage_pools_config.
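To make the relationship between the two tables concrete, here is a minimal Python sketch that rebuilds the rows shown above in an in-memory SQLite database and joins them by pool. The column names (id, name, driver, description, state, storage_pool_id, node_id, key, value) are my assumptions inferred from the shape of the query output, not taken from the LXD schema, and the helper names build_sample_db and pool_config are made up for illustration.

```python
import sqlite3

# Column names are assumptions inferred from the query output above;
# the real LXD schema may name or constrain them differently.
SCHEMA = """
CREATE TABLE storage_pools (
    id INTEGER PRIMARY KEY,
    name TEXT NOT NULL,
    driver TEXT NOT NULL,
    description TEXT,
    state INTEGER
);
CREATE TABLE storage_pools_config (
    id INTEGER PRIMARY KEY,
    storage_pool_id INTEGER NOT NULL,
    node_id INTEGER,
    key TEXT NOT NULL,
    value TEXT
);
"""

# Rows copied from the sqlite session above.
POOLS = [
    (1, "lxd", "zfs", "", 1),
    (2, "juju-zfs", "zfs", "", 1),
    (3, "juju-btrfs", "btrfs", "", 1),
]
POOL_CONFIG = [
    (3, 1, 1, "zfs.pool_name", "lxd"),
    (4, 1, 1, "source", "lxd"),
    (5, 1, 1, "volatile.initial_source", "lxd"),
    (7, 2, 1, "size", "21GB"),
    (8, 2, 1, "source", "/var/snap/lxd/common/lxd/disks/juju-zfs.img"),
    (9, 2, 1, "zfs.pool_name", "juju-zfs"),
    (11, 3, 1, "size", "21GB"),
    (12, 3, 1, "source", "/var/snap/lxd/common/lxd/disks/juju-btrfs.img"),
]


def build_sample_db():
    """Create a throwaway in-memory database holding the sample rows."""
    conn = sqlite3.connect(":memory:")
    conn.executescript(SCHEMA)
    conn.executemany("INSERT INTO storage_pools VALUES (?, ?, ?, ?, ?)", POOLS)
    conn.executemany(
        "INSERT INTO storage_pools_config VALUES (?, ?, ?, ?, ?)", POOL_CONFIG
    )
    conn.commit()
    return conn


def pool_config(conn, pool_name):
    """Return the key/value settings attached to one pool, looked up by name."""
    rows = conn.execute(
        "SELECT c.key, c.value FROM storage_pools_config AS c "
        "JOIN storage_pools AS p ON p.id = c.storage_pool_id "
        "WHERE p.name = ? ORDER BY c.id",
        (pool_name,),
    ).fetchall()
    return dict(rows)


if __name__ == "__main__":
    conn = build_sample_db()
    for key, value in pool_config(conn, "juju-zfs").items():
        print(key, "=", value)
```

Read this way, the second column of storage_pools_config points back at the pool's id in storage_pools, so the pool named in the error message owns the config rows 7, 8 and 9.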

I tried deleting the related records from those tables and restarted the LXD process, but LXD still failed with the same error. I went back to the database and found that the records were still there even though I had deleted them. It seems that LXD restores those records from its log files. I could have read the code to see how that recovery works, but that would take time, so I decided to look for database documentation in the LXD source tree while also opening a new topic to ask the LXD community for help.

I skimmed through the LXD database documentation and found that I could create a patch.global.sql file to remove the unnecessary records, because the SQL statements in that file are run at a very early stage of LXD startup. I created a file called patch.global.sql with statements to remove the unneeded settings and started LXD.
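I did not keep the exact statements, so the following is only a sketch of what such a patch could look like for the pool named in the error message, checked against a scratch in-memory database rather than the real one. The column names are assumptions inferred from the query output earlier in this note, the helper names make_scratch_db and remaining_pools are made up for illustration, and the sketch does not touch the other tables in the listing that may also reference a pool (storage_pools_nodes, storage_volumes).

```python
import sqlite3

# Hypothetical contents for patch.global.sql: remove the config rows of the
# missing pool first, then the pool row itself. Column names are assumed.
PATCH_SQL = """
DELETE FROM storage_pools_config
 WHERE storage_pool_id = (SELECT id FROM storage_pools WHERE name = 'juju-zfs');
DELETE FROM storage_pools WHERE name = 'juju-zfs';
"""


def make_scratch_db():
    """Build a tiny in-memory stand-in for the two tables (assumed columns)."""
    conn = sqlite3.connect(":memory:")
    conn.executescript(
        """
        CREATE TABLE storage_pools (
            id INTEGER PRIMARY KEY, name TEXT, driver TEXT,
            description TEXT, state INTEGER
        );
        CREATE TABLE storage_pools_config (
            id INTEGER PRIMARY KEY, storage_pool_id INTEGER,
            node_id INTEGER, key TEXT, value TEXT
        );
        INSERT INTO storage_pools VALUES
            (1, 'lxd', 'zfs', '', 1),
            (2, 'juju-zfs', 'zfs', '', 1);
        INSERT INTO storage_pools_config VALUES
            (3, 1, 1, 'zfs.pool_name', 'lxd'),
            (9, 2, 1, 'zfs.pool_name', 'juju-zfs');
        """
    )
    return conn


def remaining_pools(conn):
    """List the pool names still present, ordered by id."""
    return [name for (name,) in conn.execute(
        "SELECT name FROM storage_pools ORDER BY id"
    )]


if __name__ == "__main__":
    conn = make_scratch_db()
    conn.executescript(PATCH_SQL)   # dry run of the patch on the scratch copy
    print(remaining_pools(conn))
    print(PATCH_SQL.strip())        # the text that would go into the patch file
```

Trying the statements on a scratch copy first is cheap insurance, since a typo in the WHERE clause could remove the settings of a pool that is still in use.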

And the LXD process started again, with all my in-development containers!

Lesson learned: before removing anything, look for all its usages.
