A customer wants to move off an on-premise installation of IBM Informix Database Server (IDS) onto a cloud based platform and they are looking for the best technique to achieve this. In the short term, they want to replicate the data into a system like AWS RedShift and point power users to do their reporting off that system, but they need some help in determining the best way to do that. So not only replicate the data the first time but also keep the database server and the new system in sync (on a daily basis).
- The best way to do a full one-time data dump, with little or no database server downtime.
- The best approach to keep the exported data dump synchronized daily.
- IBM Informix Database Server (IDS): v11.50
- Logical size: 1000+ tables & multiple views (across multiple dbspaces)
- Physical size: ~1TB (spanning multiple raw disks)
- Concurrent users: 50
The two (2) objectives mentionied in section 1.1. are really just asking for how to do a: full backup and an incremental backup on an ongoing basis with little or no downtime for other database users. Figure 1 summarizes the different backup levels that are available in Informix.
Level | Description |
---|---|
Level 0 | Backs up all used pages that contain data for the specified storage spaces. |
Level 1 | Backs up only data that has changed since the last level-0 backup of the specified storage spaces. |
Level 2 | Backs up only data that has changed since the last level-1 backup of the specified storage spaces. |
Based on Figure 1, a full backup can be thought of as a level-0 backup while an incremental backup can be thought of as either a level-1 or a level-2 backup. Level-1 and level-2 backups will typically have shorter backup windows than a level-0 backup. Since the client is requesting for daily incremental backups, we could use the backup plan in Figure 2 below to closely model the client's requirements.
Backup Level | Backup Frequency | Backup Schedule |
---|---|---|
Full backup (level-0) | Monthly | Last Sunday at 6pm |
Incremental backup (level-1) | Weekly | On Sunday at 12am |
Incremental backup (level-2) | Daily | At 12am |
In order to figure out how to make this backup plan work for the client, the next section will highlight and discuss different backup types; recommend a suitable backup type for each backup frequency; then recommend which backup utilities are best suited to realizing the backup plan.
Broadly speaking, there are two (2) types of backups: physical backups and logical backups.
Physical backups involve making raw copies of the directories and files that are used to store a database's data files. This means copying the actual data files -- in binary -- that constitute the database directly from disk. As a result, physical backups offer the fastest speeds for duplicating very large databases but they have a disadvantage of being machine and platform dependent -- the backup might only be recoverable on a machine running an identical version of the database server. Physical backups are also referred to as cold backups.
Logical backups involve reproducing a database's table structure and data (records) usually into a delimited-text file format. Textual backups are usually machine and platform (i.e. operating system) independent but they can take several hours for very large databases. Logical backups are also know as hot backups.
The table below summarizes the characteristics of physical backups when compared with logical backups.
Characteristic | Physical | Logical | |
---|---|---|---|
1 | Backup Data | Exact copies of files containing actual data. | Logical representation of data in storage. |
2 | Backup Granularity | Storage spaces, logical logs, config files etc. | Database- and table-level granularity only. |
3 | Backup Mode | Offline: server may not accept connections. | Online: server may still accept connections. |
4 | Backup Output | Binary format. Compact. | Delimited-text format. Verbose. |
5 | Backup Portability | Machine and platform dependent. | Machine independent and highly portable across platforms. |
6 | Backup Speed | Directly from disk. Very fast. | Requires table locks. Slow for large databases. |
In order to identify which database backup utilities will be useful in accomplishing the backup plan, the next section will present a quick summary of several Informix database utilities, then use the characteristics above to highlight differences between them.
Informix offers several data migration utilites which can be used to perform backups as listed here but there are a few other utilities like High Performance Unload (HPL)
and ontape
which will be included in this review.
-
dbexport/dbimport -
dbexport
anddbimport
utilities import and export a database and its schema to disk or tape.dbexport
locks the database in exclusive mode. To executedbexport
, no users are allowed to be connected to the database. -
onunload/onload -
onunload
andonload
utilities provide the fastest way to move data between computers that use identical IDS versions. Theonunload
andonload
utilities are faster thandbimport
,dbload
, or LOAD but are much less flexible and do not permit modifications to the database schema nor does it allow mixing and matching between different OS or database server versions. -
High Performance Unload (HPL) - HPL is similar to the
dbexport
utility with the additional benefit of allowing data compression during the export to conserve disk space. The main difference withdbexport
is locks -- HPL does not lock out other users from the database, as a result, HPL does not force consistency of the exported data because no exclusive lock is set. The only way to guarantee consistency is to manually prevent the data from being updated while the unload is running. -
onbar -
onbar
can back up and verify storage spaces -- dbspaces, blobspaces, and sbspaces (physical backup) and logical-log files (logical backup).onbar
, does not have direct control of the backup data unlike other utilities. Instead, it delegates read and writes to a Storage Manager (SM) which manages storage devices that might not be as fast as disks. Many Storage Managers will allow backups to disk in place of a tape device. -
ontape -
ontape
utility supports backing up of storage spaces (physical backup) and logical-log files (logical backup). Althoughontape
was designed to use tapes for archiving backups, it equally supports backing up to disks.
In order to implement the full and incremental backups in the backup plan, a suitable tool must support the creation of backups at different backup levels. Level-0 and level-1/level-2 backups, in most cases correspond to the creation of physical and logical backups respectively.
Out of the utilities reviewed, only two (2) utilities support the backup creation at different backup levels: onbar
and ontape
, some feature highlights along with a comparison between them is in order.
In addition to supporting different backup levels, the "v" command line option of the onbar
utility: onbar -v can be used to verify a whole-system or physical-only backup. In other words, it provides backup integrity checking of physical backups but it cannot be used for verifying logical logs.
The onbar -v
command runs the archecker
utility to verify that all pages required to restore a backup exist on the media in the correct form.
For instance, this command verifies the backed-up storage spaces that are listed in the file bkup1
:
onbar -v -f /usr/backups/bkup1
Starting from IDS version 11.70, ontape
supports "back up and restore Informix database data to or from cloud storage". In other words, it can be used to back up and restore data to or from the Amazon Simple Storage Service (S3).
If the client is willing to consider upgrading IDS from 11.5 to 11.7, they will be able take advantage of a vendor supported mechanism of exporting natively to S3, which is essentially replication to AWS with fewer moving parts.
For instance, you can use the ifxbkpcloud.jar
utility to create and name a storage device in the region where you intend to store Informix data.
java -jar ifxbkpcloud.jar CREATE_DEVICE amazon mytapedevice US_Standard
# Set the TAPEDEV and LTAPEDEV configuration parameters in the onconfig file to the cloud storage location.
TAPEDEV '/opt/IBM/informix/tapedev_dir, keep = yes, cloud = amazon,
url = https://mytapedevice.s3.amazonaws.com'
LTAPEDEV '/opt/IBM/informix/ltapedev_dir, keep = yes, cloud = amazon,
url = https://mylogdevice.s3.amazonaws.com'
# Back up data to the cloud storage device by using the ontape utility.
ontape -s -L 0
The exported data in AWS S3 can be imported using the COPY command which appears to be the most efficient way to load data into tables on AWS Redshift.
Most of the information here comes directly from the IBM documentation that compares onbar
with ontape
but note that the backup that ontape
and onbar
produce are not compatible. You cannot create a backup with ontape
and restore it with onbar
, or vice versa.
onbar Backs up and restores storage spaces (dbspaces) and logical files, by using a storage manager to track backups and storage media. Use this utility when you need to:
Select specific storage spaces
Back up to a specific point in time
Perform separate physical and logical restores
Back up and restore different storage spaces in parallel
Use multiple tape drives concurrently for backups and restores
Perform imported restores
Perform external backups and restores
ontape Logs, backs up, and restores data, and enables you to change the logging status of a database. It does not use a storage manager. Use this utility when you need to:
Back up and restore data without a storage manager
Back up without selecting storage spaces
Change the logging mode for databases