Collaboration in the Enterprise from the perspective of Anthony Holmes, an IBM Accelerated Value Program Leader (Premium Support Program).

Incremental Backups: Overview

Anthony Holmes  24 May 2010 12:39:10 PM
Here's a summary of how incremental backup programs use Transaction Logs.


The behaviour of point-in-time restores should be very predictable (although one restore may differ greatly from another, depending on the time since the last full backup, the time selected for the restore, and what happened on the server in between). This behaviour isn't significantly different between Domino 7 and 8, unless you have turned on the Domino Attachment and Object Service (DAOS).


Here's the theory about a point in time restoration:

Through the backup product's interface, you select what needs to be restored and the point in time to restore it to. Let's say you decide to restore a database called ABC.NSF as at 1pm Friday 21st May 2010.

The Backup software finds the last full backup of ABC.NSF. Let's assume that it was a 1GB database backed up at 10pm on Sunday 16th May 2010.
  • The backup software reads off the tape(s) that backed up that server at the time of the last full backup.  There may be 500GB of files in that backup. It will read through some or all of that 500GB to find ABC.NSF.
  • ABC.NSF has a record of the last Transaction Log ID and Sequence Number that applied at the moment of the full backup (This is recorded as part of its Database Instance ID).
  • The backup program will then need to read through every transaction log between the moment ABC.NSF was backed up on Sunday and 1pm Friday 21st May 2010. It doesn't matter that you are only restoring one database: every single transaction log between the last full backup and the restore point will need to be read off tape and written to hard disk. The backed-up copy of ABC.NSF will record (in its DBIID) that the last Transaction Log at the time of backup was (say) S0000500.TXN.
  • The backup program then reads through every transaction that happened on the server looking for changes to ABC.NSF. More specifically, it looks for transactions labelled with ABC.NSF's DBIID number. These are replayed and written in chronological order (S0000501.TXN, S0000502.TXN, S0000503.TXN, etc.) so that ultimately ABC.NSF is identical to the way it appeared at the selected point in time.
  • The volume of transaction records that needs to be restored to disk is exactly equal to the volume of transaction logs backed up between the last full backup and the point of restore time.
  • On most days, the number of Transaction Log files written will be fairly similar. But it might spike if there is an unusually large volume of work: e.g. a Design Replace of all mail files, or an email with a 20MB attachment sent to 1000 people on a server that isn't running DAOS.
  • The backup programs call the Domino Backup API, which has a small set of commands on how to read from a Transaction Log and write to a .NSF. The logic of the Domino Backup API is quite simple.
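
To make the replay steps above concrete, here is a minimal sketch in Python. It is not the Domino Backup API and does not read real .TXN files: the Transaction record, its fields, and the replay function are hypothetical names invented for illustration. It follows the simplified description above, where everything logged after the extent recorded at backup time is a candidate for replay.

```python
from dataclasses import dataclass
from datetime import datetime


@dataclass(frozen=True)
class Transaction:
    """One logged change. All fields are hypothetical, for illustration only."""
    extent: int          # log extent sequence number, e.g. 501 for S0000501.TXN
    timestamp: datetime  # when the change was logged
    dbiid: str           # DBIID of the database the change belongs to
    change: str          # stand-in for the actual change data


def extent_name(seq: int) -> str:
    """Format a sequence number in the style used above: 501 -> 'S0000501.TXN'."""
    return f"S{seq:07d}.TXN"


def replay(transactions, dbiid, last_backed_up_extent, restore_point):
    """Return the changes to apply to the restored copy, oldest first."""
    wanted = [
        t for t in transactions
        if t.extent > last_backed_up_extent   # logged after the full backup
        and t.timestamp <= restore_point      # not later than the restore point
        and t.dbiid == dbiid                  # belongs to the database being restored
    ]
    # Replay must be chronological: order by extent, then by time within it.
    wanted.sort(key=lambda t: (t.extent, t.timestamp))
    return [t.change for t in wanted]


# Made-up log content for the ABC.NSF example (DBIID shortened to "ABC").
log = [
    Transaction(503, datetime(2010, 5, 21, 14, 0), "ABC", "too late"),
    Transaction(501, datetime(2010, 5, 17, 9, 5), "XYZ", "other database"),
    Transaction(502, datetime(2010, 5, 20, 15, 0), "ABC", "edit 2"),
    Transaction(500, datetime(2010, 5, 16, 21, 0), "ABC", "already in backup"),
    Transaction(501, datetime(2010, 5, 17, 9, 0), "ABC", "edit 1"),
]

restored = replay(log, "ABC", 500, datetime(2010, 5, 21, 13, 0))
print(restored)  # ['edit 1', 'edit 2']
```

Note that the whole log has to be scanned even though only two of the five transactions belong to ABC.NSF within the restore window. That is why the restore cost depends on the total volume of transaction logs, not on the size of the one database being restored.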

Incremental vs Full Backups: Pros/Cons, Consequences

Incremental backups give you huge benefits in terms of faster backup times. They also let you run backups more than once each day (so you don't need to squeeze them into a window of time overnight). You get point-in-time restores as another benefit.

The price you pay for this is that restores are significantly slower than from a full backup, especially if the full backups are infrequent.

If faster backups/restores are needed, Domino can easily provide them through appropriate infrastructure. Just as Exchange provides its Volume Shadow Copy Service (VSS) that uses redundant storage to take a copy that is then backed up, Domino allows you to run a cluster server specifically for backup purposes (or dual purpose high availability and backups). The backup server is then backed up using more frequent full backups and fewer incremental backups, allowing faster restorations. For both Exchange VSS and Domino Replica Server backup, there's a (similar) additional hardware cost to provide a higher service level.

Another thing that you can do to speed up restore times is to ensure that your transaction log drive is large enough to comfortably store the full range of Transaction Logs between your last full backup and the time of the restoration. That way the Transaction Logs will still be on the Transaction Log drive and only the .NSF will need to be read from the last full backup. There will be no need to retrieve Transaction Log files from tape. (The transactions will still need to be read off hard disk and replayed, which may still take some time, but that will be much faster than restoring them from tape as well.)
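
As a rough sizing aid for that drive, the arithmetic can be sketched as below. The function name, the 4GB per day figure and the 1.5x headroom factor are all assumptions for illustration, not IBM recommendations. Measure your own daily Transaction Log volume and allow extra room for the kind of spikes described earlier.

```python
import math


def log_drive_size_gb(daily_log_gb, days_between_full_backups, headroom=1.5):
    """Estimate the drive size (whole GB) needed to keep every Transaction Log
    written between two full backups on disk.

    headroom is a hypothetical safety factor for unusually busy days.
    """
    if daily_log_gb < 0 or days_between_full_backups < 0:
        raise ValueError("volumes and intervals must not be negative")
    if headroom < 1:
        raise ValueError("headroom must be at least 1")
    return math.ceil(daily_log_gb * days_between_full_backups * headroom)


# Example with made-up numbers: 4GB of logs per day, weekly full backups.
print(log_drive_size_gb(4, 7))  # 42
```

Shortening the interval between full backups shrinks this number in direct proportion, which is the same trade-off that makes restores faster when full backups are more frequent.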