From 6f95e78d9e4e5cc4cfaa4ffc0b4b50d3e35e9915 Mon Sep 17 00:00:00 2001
From: John Kohl
Date: Sat, 11 Nov 2023 13:15:16 +0200
Subject: [PATCH] luci-app-statistics: Add backup/restore for RRD statistics

Add a backup/restore capability for rrd data storage in
luci_statistics. The data storage is typically in /tmp and does not
survive reboot or sysupgrade. This adds an option for the administrator
to configure the RRD plugin, so that the RRD data are preserved with a
backup copy in the overlay file system. This works for shutdown/reboot
and sysupgrade (backup config files, restore config files, and true
sysupgrade).

Also fix a bug where starting luci_statistics for the first time would
not restart a running collectd: during install of the package when it
is not included in the base flashed image, collectd might be started
when it got installed/configured before this package gets
installed/configured. So we need to check if it's running, and restart
it to use the luci_statistics configuration.

Signed-off-by: John Kohl
(cherry picked from commit ad98af3a2be6c87b1f36cec05c8c3529831b7787)
---
 applications/luci-app-statistics/README.md    | 136 +++++++++++++++++
 .../view/statistics/plugins/rrdtool.js        |   5 +
 .../root/etc/init.d/luci_statistics           | 138 +++++++++++++++++-
 .../root/etc/luci_statistics/README.backups   |   3 +
 .../upgrade/luci_statistics-add-conffiles.sh  |   8 +
 5 files changed, 288 insertions(+), 2 deletions(-)
 create mode 100644 applications/luci-app-statistics/README.md
 create mode 100644 applications/luci-app-statistics/root/etc/luci_statistics/README.backups
 create mode 100755 applications/luci-app-statistics/root/lib/upgrade/luci_statistics-add-conffiles.sh

diff --git a/applications/luci-app-statistics/README.md b/applications/luci-app-statistics/README.md
new file mode 100644
index 0000000000..9b315847c4
--- /dev/null
+++ b/applications/luci-app-statistics/README.md
@@ -0,0 +1,136 @@
+# Backups
+
+The backup scheme implemented in `/etc/init.d/luci_statistics` aims to
+limit writes to stable storage, to preserve flash memory lifetime.
+(Flash-memory based routers may have a limited lifetime of write cycles,
+and we want to conserve those.) While it would be simpler to run a periodic
+backup as a cron job, you'd risk wearing out the flash memory. This
+scheme only writes backups to flash during shutdowns/reboots and
+upgrades.
+
+The backup is only enabled if the administrator sets
+`luci_statistics.collectd_rrdtool.backup=1`.
+
+We only want to restore a sysupgrade backup file if:
+
+1. It was installed by `sysupgrade -r` (restore configuration files), and
+   we have rebooted. In this case, there is an orderly shutdown that calls
+   the shutdown methods. We do not want to overwrite the
+   restored sysupgrade backup file during shutdown, but after reboot we
+   do want to restore it.
+
+1. It was generated during a true sysupgrade, and we are rebooting into
+   the new image: `sysupgrade` with any or none of `-o`, `-c`, `-f`,
+   `-u`, resulting in a new image being installed and a config file
+   being preserved for processing after reboot. In this case we do not
+   want to overwrite the backup while rebooting during the upgrade.
+   `sysupgrade` in this case stores a `.tgz` archive of all preserved
+   files where it can be found after rebooting into the new image, and
+   it does not run the shutdown scripts before rebooting.
+
+When the administrator runs `sysupgrade -b` (command line or LuCI), we
+create a sysupgrade backup file and it is included in the combined
+backup. Then the system continues running. When we later stop or
+restart or reboot (orderly conditions, when
+`/etc/init.d/luci_statistics` is called to shut down), we do not want to
+use the saved sysupgrade backup. If we had a control path after
+`sysupgrade -b` that would allow us to remove the sysupgrade backup, this
+would be simple. But we don't!
+
+What we *can* do is arrange that a sysupgrade backup contains enough
+information to indicate if it should be restored.
+
+1. True sysupgrade is straightforward: we arrange that the backed-up
+   file list only includes the sysupgrade backup file and one twin file
+   (see below). The next starting of `/etc/init.d/luci_statistics`
+   after a sysupgrade will restore the sysupgrade backup.
+
+1. Continued system operation after `sysupgrade -b`: next time we stop the
+   service (during reboot or during other init script actions), we check
+   for a stale sysupgrade backup, and if we detect it we remove it.
+
+1. `sysupgrade -r` only unpacks the backup files; it does not erase
+   other non-backed-up files still in the overlay. Its intended use is
+   to then immediately reboot, which will run an orderly shutdown/normal
+   backup. We must ensure the orderly shutdown in this case preserves
+   the sysupgrade backup, unlike the previous case.
+
+To implement these cases, we use a pair of twinned files, only one of
+which is included in the list of files preserved by sysupgrade. If we
+detect mismatched files (or only one file present) during service
+shutdown or startup, we trust that the sysupgrade backup should be kept
+and restored. If the files are matched, that indicates that we have not
+restored files since the sysupgrade backup, and the current normal
+backup should be used instead.
+
+## During sysupgrade backup
+
+`/etc/init.d/luci_statistics sysupgrade_backup` is invoked by sysupgrade
+for true upgrade or for the `-l` or `-b` flags. We detect the list flag
+(`-l`) by checking the process environment, and if found, we only
+generate a list: we don't actually do a backup. For all cases, we edit
+the list of files listed already and remove any other mentions of
+`/etc/luci_statistics` to ensure that only the backup file and one of
+the twin files are in the backup list.
+
+## During sysupgrade
+
+During a true sysupgrade, only the sysupgrade backup file and one of the
+twin files are restored after the image reboots, so the first running of
+startup scripting will restore the sysupgrade backup. This could be at
+the time of first boot, if the image has been built to include this
+package, or it could be later when the package is downloaded, installed,
+and the service is started.
+
+## During backup (including orderly shutdown)
+
+During backup (run during shutdown), if there is a matched set of twins,
+then we know that sometime since the last service start the
+administrator ran `sysupgrade -b` and had the chance to copy the
+resulting backup. We can now erase the twins and the sysupgrade backup.
+
+If there is a mismatched set of twins, then someone restored a backup
+such as with `sysupgrade -r` and we should now be rebooting, so we
+should leave the sysupgrade files alone to be processed on service
+restart (after reboot).
+
+If someone takes a `sysupgrade -b` backup and then restores it before
+they reboot or restart statistics, the twins will still match, and we
+then don't keep the statistics from the restored backup; we instead take
+a new backup from current data and use that on reboot.
+
+## During startup
+
+If there are matched twin files (the normal case for shutdown/reboot
+without sysupgrade), then the sysupgrade backup is ignored and the
+regular backup is restored. If there are mismatched twin files, then
+the sysupgrade backup is restored.
+
+## During disorderly reboot
+
+In a system crash or other disorderly reboot, the shutdown scripts do
+not run. What remains on the system is the previous contents of
+`/etc/luci_statistics`.
+
+* If the system never started luci_statistics, or it was cleanly shut
+  down before the crash, then there is no difference in behavior from
+  normal startup: we restore either the sysupgrade backup (if
+  luci_statistics had never run) or the regular backup (if
+  luci_statistics was cleanly stopped).
+
+* If luci_statistics and collectd were running at the time of the crash,
+  there could be a regular backup and a sysupgrade backup present, plus
+  volatile data in /tmp (which are lost in the crash). The regular
+  backup would be from the most recent time the system cleanly stopped
+  luci_statistics. During the subsequent reboot/service startup:
+
+  * If there is a sysupgrade backup on disk from having run `sysupgrade
+    -b`, with both twin files matching (meaning the administrator had
+    taken a backup sometime during the life of the system, before the
+    crash), they are ignored and a regular backup (if any) is restored.
+
+  * If the sysupgrade backup has mismatched twin files or only one twin,
+    then it is used to restore state. This would be the case if a
+    sysupgrade restored configuration (`sysupgrade -r`), whether or not
+    it did an orderly shutdown/reboot, or if the file system were
+    damaged in a crash and only one of the twin files survived.
diff --git a/applications/luci-app-statistics/htdocs/luci-static/resources/view/statistics/plugins/rrdtool.js b/applications/luci-app-statistics/htdocs/luci-static/resources/view/statistics/plugins/rrdtool.js
index e971e2c6c4..7e2704a2c3 100644
--- a/applications/luci-app-statistics/htdocs/luci-static/resources/view/statistics/plugins/rrdtool.js
+++ b/applications/luci-app-statistics/htdocs/luci-static/resources/view/statistics/plugins/rrdtool.js
@@ -16,6 +16,11 @@ return baseclass.extend({
 		o.default = '/tmp/rrd';
 		o.depends('enable', '1');
 
+		o = s.option(form.Flag, 'backup', _('Backup RRD statistics'),
+			_('Backup and restore RRD statistics to/from non-volatile storage around shutdown, reboot, and/or sysupgrade'));
+		o.default = '0';
+		o.depends('enable', '1');
+
 		o = s.option(form.Value, 'StepSize', _('RRD step interval'), _('Seconds'));
 		o.placeholder = '30';
 		o.datatype = 'uinteger';
diff --git a/applications/luci-app-statistics/root/etc/init.d/luci_statistics b/applications/luci-app-statistics/root/etc/init.d/luci_statistics
index 3684bc1834..5513ace2c2 100755
--- a/applications/luci-app-statistics/root/etc/init.d/luci_statistics
+++ b/applications/luci-app-statistics/root/etc/init.d/luci_statistics
@@ -1,8 +1,37 @@
 #!/bin/sh /etc/rc.common
 
+# run luci_statistics before collectd starts (80) and stop after
+# collectd stops (10):
 START=79
+STOP=11
 
 USE_PROCD=1
 
+BACKUP_DIR="/etc/luci_statistics"
+BACKUP_FILE="${BACKUP_DIR}/rrdbackup.tgz"
+SYSUPGRADE_BACKUP_FILE="${BACKUP_DIR}/rrdbackup.sysupgrade.tgz"
+SYSUPGRADE_BACKUP_TWIN_A="${BACKUP_DIR}/sysupgrade.trustme.txt"
+SYSUPGRADE_BACKUP_TWIN_B="${BACKUP_DIR}/sysupgrade.dont.trustme.txt"
+EXTRA_COMMANDS="backup sysupgrade_backup"
+EXTRA_HELP="\ backup Backup
+current rrd database if configured to do so\n\ sysupgrade_backup Take
+a special backup for sysupgrade/configuration saving"
+
+TRACE=0
+
+doing_backups() {
+	### Determine if we should do backups/restores
+	local rrd_enabled=$(uci -q get luci_statistics.collectd_rrdtool.enable)
+	local rrd_backups_enabled=$(uci -q get luci_statistics.collectd_rrdtool.backup)
+	rrd_dir=$(uci -q get luci_statistics.collectd_rrdtool.DataDir)
+
+	[ "$rrd_enabled" = "1" \
+		-a "$rrd_backups_enabled" = "1" \
+		-a -n "$rrd_dir" ] && {
+		return 0
+	}
+	return 1
+}
+
 service_triggers() {
 	procd_add_reload_trigger "luci_statistics"
@@ -21,6 +50,83 @@ start_service() {
 
 	### workaround broken permissions on /tmp
 	chmod 1777 /tmp
+
+	### restore if necessary
+	rrd_restore
+
+	### stop collectd if it was running before us
+	/etc/init.d/collectd status >/dev/null 2>&1 && /etc/init.d/collectd stop >/dev/null 2>&1
+	### always start it so we have functioning statistics
+	/etc/init.d/collectd start
+}
+
+matched_twins() {
+	cmp -s "${SYSUPGRADE_BACKUP_TWIN_A}" "${SYSUPGRADE_BACKUP_TWIN_B}"
+}
+
+remove_sysupgrade_backup() {
+	[ ${TRACE} -gt 0 ] && logger -t ${0##*/} -- luci_statistics removing stale sysupgrade backup
+	rm -f "${SYSUPGRADE_BACKUP_FILE}"
+	rm -f "${SYSUPGRADE_BACKUP_TWIN_A}" "${SYSUPGRADE_BACKUP_TWIN_B}"
+}
+
+rrd_restore() {
+	[ ${TRACE} -gt 0 ] && logger -t ${0##*/} -- luci_statistics rrd_restore
+	doing_backups && {
+		### Restore backup if backups enabled and we have a
+		### nonzero backup file and the twins are unequal
+		### (absent or one missing or both present but
+		### mismatched).
+		[ ${TRACE} -gt 0 ] && logger -t ${0##*/} -- luci_statistics checking sysupgrade backup
+		[ -s "${SYSUPGRADE_BACKUP_FILE}" ] && ! matched_twins && {
+			### restore sysupgrade file to replace any
+			### backup temporarily in place during various
+			### upgrades or reboots
+			[ ${TRACE} -gt 0 ] && logger -t ${0##*/} -- luci_statistics restoring sysupgrade backup
+			mv -f "${SYSUPGRADE_BACKUP_FILE}" "${BACKUP_FILE}"
+		}
+		[ -s "${BACKUP_FILE}" ] && {
+			[ ${TRACE} -gt 0 ] && logger -t ${0##*/} -- luci_statistics restoring backup
+			### unpack only files/directories under the configured rrd_dir
+			data_relative=${rrd_dir#/}
+			[ ${TRACE} -gt 0 ] && logger -t ${0##*/} -- luci_statistics restoring only ${data_relative}
+			tar -xzf "${BACKUP_FILE}" -C / ${data_relative}
+		}
+	}
+}
+
+rrd_backup() {
+	[ ${TRACE} -gt 0 ] && logger -t ${0##*/} -- luci_statistics rrd_backup
+	doing_backups && [ -d "$rrd_dir" ] && {
+		[ ${TRACE} -gt 0 ] && logger -t ${0##*/} -- luci_statistics making backup
+		local tmp_file=$(mktemp -u)
+		tar -czf "$tmp_file" -C / "$rrd_dir" 2>/dev/null
+		mkdir -p "${BACKUP_DIR}"
+		mv "$tmp_file" "${BACKUP_FILE}"
+		rm -f "$tmp_file"
+
+		### remove backup if it's stale
+		matched_twins && remove_sysupgrade_backup
+	}
+}
+
+backup() {
+	[ ${TRACE} -gt 0 ] && logger -t ${0##*/} -- luci_statistics backup
+	/etc/init.d/collectd status >/dev/null 2>&1 && {
+		[ ${TRACE} -gt 0 ] && logger -t ${0##*/} -- luci_statistics stopping collectd
+		collectd_restart=yes
+		/etc/init.d/collectd stop >/dev/null 2>&1
+	}
+	rrd_backup
+	[ "$collectd_restart" = "yes" ] && {
+		[ ${TRACE} -gt 0 ] && logger -t ${0##*/} -- luci_statistics starting collectd
+		/etc/init.d/collectd start >/dev/null 2>&1
+	}
+}
+
+stop_service() {
+	/etc/init.d/collectd stop
+	backup
 }
 
 reload_service() {
@@ -28,9 +134,37 @@ reload_service() {
 }
 
 restart() {
+	### Stop data collection (and make a backup if configured)
+	stop
+
 	### regenerate config / prepare environment
 	start
+}
+
+copy_backup_for_sysupgrade() {
+	local backup_date=$(date -Iseconds)
+	cp -p ${BACKUP_FILE} ${SYSUPGRADE_BACKUP_FILE}
+	echo ${backup_date} >${SYSUPGRADE_BACKUP_TWIN_A}
+	echo ${backup_date} >${SYSUPGRADE_BACKUP_TWIN_B}
+}
 
-	### restart collectd
-	/etc/init.d/collectd restart
+sysupgrade_backup() {
+	local filelist="$1"
+	[ ${TRACE} -gt 0 ] && logger -t ${0##*/} -- luci_statistics sysupgrade_backup CONF_BACKUP_LIST=${CONF_BACKUP_LIST}
+	doing_backups && {
+		### CONF_BACKUP_LIST=1 means we are generating the
+		### list, so we don't make the actual backup.
+		[ "$CONF_BACKUP_LIST" != "1" ] && {
+			### backup now if running
+			status >/dev/null 2>&1 && backup
+			### Copy the backup to use for sysupgrade
+			copy_backup_for_sysupgrade
+		}
+		### Edit the backup file list to remove everything else
+		sed -i -e /${BACKUP_DIR//\//\\/}/d $filelist
+		### Add only the files we need to ensure proper
+		### restore behavior
+		echo ${SYSUPGRADE_BACKUP_FILE} >>$filelist
+		echo ${SYSUPGRADE_BACKUP_TWIN_A} >>$filelist
+	}
 }
diff --git a/applications/luci-app-statistics/root/etc/luci_statistics/README.backups b/applications/luci-app-statistics/root/etc/luci_statistics/README.backups
new file mode 100644
index 0000000000..11615c1e43
--- /dev/null
+++ b/applications/luci-app-statistics/root/etc/luci_statistics/README.backups
@@ -0,0 +1,3 @@
+This directory is used by luci-app-statistics to manage backups and restores of
+rrdtool statistics data. Any other files you include here manually will not
+survive a sysupgrade.
diff --git a/applications/luci-app-statistics/root/lib/upgrade/luci_statistics-add-conffiles.sh b/applications/luci-app-statistics/root/lib/upgrade/luci_statistics-add-conffiles.sh
new file mode 100755
index 0000000000..9024644c93
--- /dev/null
+++ b/applications/luci-app-statistics/root/lib/upgrade/luci_statistics-add-conffiles.sh
@@ -0,0 +1,8 @@
+add_luci_statistics_conffiles()
+{
+	local filelist="$1"
+	# get list of our files (and create a backup if needed)
+	/etc/init.d/luci_statistics sysupgrade_backup $filelist
+}
+
+sysupgrade_init_conffiles="$sysupgrade_init_conffiles add_luci_statistics_conffiles"
-- 
2.30.2
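
Reviewer note: the twin-file rule from the README can be tried outside a router with a small stand-alone sketch. The helper name `choose_backup` and the scratch directory are hypothetical and are not part of this patch; only the file names are taken from the variables in the init script. It only reports which archive a service start would pick, assuming backups are enabled, and it does not move or unpack anything.

```sh
#!/bin/sh
# Hypothetical helper, not part of the patch: print which archive a service
# start would restore under the twin-file rule described in README.md.
choose_backup() {
	dir="$1"
	regular="$dir/rrdbackup.tgz"
	upgrade="$dir/rrdbackup.sysupgrade.tgz"
	twin_a="$dir/sysupgrade.trustme.txt"
	twin_b="$dir/sysupgrade.dont.trustme.txt"

	# cmp -s exits non-zero when the twins differ or when either is missing,
	# so "not matched" covers the mismatched, single-twin and no-twin cases.
	if [ -s "$upgrade" ] && ! cmp -s "$twin_a" "$twin_b" 2>/dev/null; then
		echo sysupgrade
	elif [ -s "$regular" ]; then
		echo regular
	else
		echo none
	fi
}

# Demo in a scratch directory standing in for /etc/luci_statistics.
demo=$(mktemp -d)
echo data >"$demo/rrdbackup.tgz"
echo data >"$demo/rrdbackup.sysupgrade.tgz"
echo stamp >"$demo/sysupgrade.trustme.txt"
choose_backup "$demo"	# only one twin present, as after a true sysupgrade
cp "$demo/sysupgrade.trustme.txt" "$demo/sysupgrade.dont.trustme.txt"
choose_backup "$demo"	# twins match, as after "sysupgrade -b" on a live system
rm -rf "$demo"
```

The first call prints `sysupgrade` and the second prints `regular`, which lines up with the "During startup" section of the README.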