How do filesystem snapshots work on Linux?

To perform valid backups of your database, it is important to suspend the database. This prevents modifications of files during the backup process. By taking a point-in-time snapshot of your database files, your backup program will be capturing a “frozen” database instead of an “in motion” database.

Our standard backup script uses database suspension with snapshots to create point-in-time images of your database files. The snapshot script itself is located at /u2/UTILS/bin/ (with a symbolic link at /bin/save for backwards compatibility).

The snapshot script is typically scheduled via crontab to run at regular intervals, creating a fresh set of filesystem snapshots each time. Here’s an example crontab entry that runs the snapshot backup script every night at 12:59 AM:

[root@eclipse ~]# crontab -l
59 0 * * * /u2/UTILS/bin/

After the script runs, the snapshot filesystems are mounted under /snap, allowing read-only access by backup software. For example, the snapshot of the /u2/eclipse/LEDGER file would be located at /snap/u2/eclipse/LEDGER. When configuring backup software, it is recommended to back up every file under /snap/u2.
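The path mapping described above can be sketched as a small shell helper. This is an illustration only, not part of the snapshot script: the function name snap_path and the SNAP_ROOT variable are hypothetical, and the only behavior assumed is the /snap prefix described in this article.

```shell
# Map a live file path to its read-only copy under the snapshot mount point.
# snap_path and SNAP_ROOT are hypothetical names used for illustration;
# SNAP_ROOT defaults to /snap, the mount point described in this article.
snap_path() {
    local live_path="$1"
    local snap_root="${SNAP_ROOT:-/snap}"
    case "$live_path" in
        /*) printf '%s%s\n' "$snap_root" "$live_path" ;;
        *)  echo "snap_path: expected an absolute path: $live_path" >&2
            return 1 ;;
    esac
}

# Example: snap_path /u2/eclipse/LEDGER
```

Given /u2/eclipse/LEDGER, the helper prints /snap/u2/eclipse/LEDGER, which is the path the backup software should read.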

Since every change (delta) between the snapshot and the “live” filesystem must be recorded, the snapshots have a finite lifespan. By default, the snapshot script is configured to hold 1GB of changes before requiring a refresh. On busier systems, or on systems where the snapshots must be retained for a longer period of time to accommodate a slow backup process, the snapshot volume size may be increased by editing the snapshot backup script. You may check the status of the snapshots using the “lvs” command, which shows a usage percentage for each snapshot volume.

[root@eclipse ~]# lvs
  LV       VG     Attr   LSize   Origin   Snap%  Move Log Copy%  Convert
  eclipse  datavg owi-ao  26.00G
  ereports datavg owi-ao   1.00G
  lvol0    datavg swi-ao   1.00G u2        45.85
  lvol4    datavg swi-ao   1.00G uvtmp      0.00
  lvol5    datavg swi-a-   1.00G ereports   0.00
  lvol6    datavg swi-ao   1.00G eclipse    0.85
  u2       datavg owi-ao   4.00G
  uvtmp    datavg owi-ao   4.00G
  esupport rootvg -wi-ao   6.00G
  root     rootvg -wi-ao  20.00G
  swap     rootvg -wi-ao   4.00G

When the Snap% value reaches 100%, the snapshot volume has reached its maximum capacity for tracking changes and must be recreated by running the snapshot script again.
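The Snap% check can also be scripted. The following is a minimal sketch, not part of the standard snapshot script: the function name snapshots_over and the default threshold of 80 are illustrative assumptions, and the parsing assumes the default lvs column layout shown above (Attr in the third field, Origin in the fifth, Snap% in the sixth).

```shell
# Print "name origin percent" for each snapshot LV at or above a usage threshold.
# Reads default-format lvs output on stdin. Snapshot LVs are the rows whose
# Attr field starts with "s" (or "S" once the snapshot has become invalid).
# snapshots_over is a hypothetical helper name; 80 is an arbitrary default.
snapshots_over() {
    local threshold="${1:-80}"
    awk -v limit="$threshold" '
        $3 ~ /^[sS]/ && ($6 + 0) >= (limit + 0) {
            print $1, $5, $6
        }'
}

# Example (run as root): lvs | snapshots_over 40
```

Against the sample output above, a threshold of 40 would report only lvol0, the snapshot of u2 at 45.85%. A check like this could be run from cron between snapshots to give early warning on busy systems.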

For troubleshooting purposes, a log of the snapshot backup script is kept at /tmp/snapsave.log. Information regarding the creation, removal and expiration of snapshot LVs is also recorded in the system log (/var/log/messages).

How do I check the status of my ABS backup on my Linux server?

Every successful backup of an Eclipse database server, regardless of the backup software being used, requires the successful completion of two separate processes:

  • Prepare: the snapshot process that prepares a point-in-time, frozen “picture” of the database and application files. For an overview of this process, see How do filesystem snapshots work on Linux?
  • Capture: the backup software process(es) that take this snapshot data and transfer it to archival media (disk, tape, online vault, etc.)

For ABS customers, a successful backup requires a successful database snapshot and a successful CrashPlan backup running to completion between snapshot intervals. This article will review how to check the status of both processes.

Part 1: Snapshot Verification

First, we must verify that the snapshot process has completed successfully.

Log into the server as root via command line or GUI.

Open /tmp/snapsave.log in your preferred text editor or pager. For example, using less from the command line:

less /tmp/snapsave.log

Review the log for the following items:

  • Date/time the snapshot was created
  • Whether or not the snapshot operation was successful, or if any warnings/errors were generated
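As a rough aid to the manual review, the log can be scanned for suspicious lines. The sketch below is an assumption-heavy illustration: the exact wording of messages in snapsave.log is not documented here, so the keywords (warn, error, fail) are guesses, and the function name snapsave_problems is hypothetical. A clean result does not replace reading the log.

```shell
# Print numbered log lines that look like warnings or errors.
# Returns 0 if none were found, 1 if any were printed, 2 if the log is unreadable.
# The keyword list is an assumption; adjust it to match the actual log wording.
snapsave_problems() {
    local log="${1:-/tmp/snapsave.log}"
    [ -r "$log" ] || { echo "snapsave_problems: cannot read $log" >&2; return 2; }
    if grep -n -i -E 'warn|error|fail' "$log"; then
        return 1    # at least one suspicious line was printed
    fi
    return 0
}

# Example: snapsave_problems /tmp/snapsave.log
```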

Because each snapshot volume is of a fixed size, it’s possible for snapshots to run out of room. You may check the current snapshot status using the lvs command and noting the values in the “Snap%” column:

[root@firestorm ~]# lvs
  LV       VG     Attr   LSize  Origin   Snap%  Move Log Copy%  Convert
  eclipse  datavg owi-ao 50.00G                                        
  ereports datavg owi-ao  1.00G                                        
  lvol0    datavg swi-ao  1.00G u2         6.01                        
  lvol1    datavg swi-ao  1.00G eclipse    8.52                        
  lvol2    datavg swi-ao  1.00G ereports   0.00                        
  u2       datavg owi-ao  2.00G                                        
  uvtmp    datavg -wi-ao  4.00G                                        
  backup   rootvg -wi-ao 50.00G                                        
  esupport rootvg -wi-ao 40.00G                                        
  root     rootvg -wi-ao 40.00G                                        
  swap     rootvg -wi-ao  4.00G

If any of these values are 100%, the snapshot became invalid at some point by holding too many changes. Since we’re attempting to verify the backup was successful, we’ll want to note the date/time that the snapshot became invalid, to verify that the CrashPlan process completed prior to the snapshot becoming invalid. The /var/log/messages file contains timestamped entries for all snapshot events, so the following command will show you any relevant snapshot-related messages, including when a snapshot is no longer being monitored:

grep snapshot /var/log/messages

As an alternative to manually checking the snapsave.log file on a daily basis, you may opt to configure your system to automatically email this log to your regular address using the instructions found in How do I forward root’s mail to another address?

Part 2: CrashPlan Verification

Next, we must verify that CrashPlan was able to archive all of the snapshot data successfully between the time the previous snapshot completed and the next one was started (typically 24 hours).

If you prefer using the command line, you may review the most recent backups using the following command:

grep -E "Starting|Completed" /usr/local/crashplan/log/history.log.0 | tail

There are a few things to note in this output:

  • The backup process should have started after the snapshot was created
  • The backup process should have completed before the next snapshot was scheduled to be created, or before the snapshot filled to capacity
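The two timing rules above amount to a simple ordering check. The following sketch assumes you have already read the four times from the logs and converted them to epoch seconds (for example with GNU date -d "..." +%s); the function name backup_within_window is hypothetical, and the format of timestamps in history.log.0 is not assumed here.

```shell
# Succeed only if the backup ran entirely inside the snapshot's valid window.
# All arguments are epoch seconds:
#   $1 = time the snapshot was created
#   $2 = time the backup started
#   $3 = time the backup completed
#   $4 = end of the window: the next scheduled snapshot, or the time the
#        snapshot filled to capacity, whichever came first
backup_within_window() {
    local snap_created="$1" backup_start="$2" backup_end="$3" window_end="$4"
    [ "$backup_start" -ge "$snap_created" ] &&
    [ "$backup_end" -ge "$backup_start" ] &&
    [ "$backup_end" -le "$window_end" ]
}

# Example:
#   backup_within_window "$snap" "$start" "$end" "$next" && echo "backup OK"
```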

If you prefer to use the GUI, this same historical information is visible from the CrashPlan Desktop interface.

To review the backup status in the GUI:

  • Log into the server’s GUI via VNC, the DRAC, or the local console
  • Double-click the CrashPlanDesktop icon to launch the CrashPlan client interface. If there is no shortcut, follow these instructions to create one.
  • The CrashPlan GUI opens to the “Backup” dashboard, and the current status of each separate backup job is displayed
  • For additional information, select the History tab and scroll through the detailed history.

What is ABS?

The Epicor Automated Backup Solution (ABS) uses CrashPlan PRO to provide Eclipse customers with data backup both on-site and at a secure off-site location. The ABS Failover offering uses this off-site backup to offer hosting for customers in a disaster situation.

The product is currently available for Eclipse customers running Linux and Windows servers. The ABS software is installed and configured by Epicor.

There are three levels of backup for the ABS product:

  • Tier 1 – CrashPlan PRO software alone: provides local backup copies within the customer network
  • Tier 2 – ABS Off-Site Backup: adds the protection of a secure, off-site backup
  • Tier 3 – ABS Failover: adds the ability to failover to Epicor’s hosting environment

The CrashPlan PRO software used by ABS offers:

  • Local or off-site data backup storage in a secure location
  • Byte differential backups, using compression and de-duplication for efficient transportation and archiving
  • Incremental backup versions
  • Email backup alerts and reports
  • Encryption of data during transport and in back-end storage
  • Failover for Epicor Eclipse systems (Eclipse Database, Forms, Imaging and Internet Gateway servers)

Contact your inside sales representative for more information and pricing.