Your business-critical data and application software could not be in a more stable and secure environment than your System i and iSeries servers. Even so, many factors are out of your control, which makes good system backup and restore procedures vital. These procedures need to be thoroughly tested to ensure that you can completely restore your system in the event of a disaster.
Because we all want our servers up and running as close to 24x7 as possible, it is important to analyze your system backup methods to minimize downtime and the impact on your end users. Long-running daily backup jobs and occasional tape or device failures may be placing an unnecessary burden on your operations.
Buy more hardware!
Creating a 2nd or 3rd ASP on your system and doubling the amount of DASD can be a great investment when it comes to improving system backup procedures. Consider creating a new library called BACKUP on this new ASP, then write 3 simple CL programs.
The first program accepts one parameter, the library name. It creates a save file in the BACKUP library using the name passed in, then runs the SAVLIB command to back up that library into the save file. You can use the 'save-while-active' parameters if desired.
The second program runs a DSPOBJD OBJTYPE(*LIB) command to an output file for all of the libraries you want to save. It then reads through this file and issues a SBMJOB command for each library, calling the first program and passing the library name as a parameter. Creating a new BACKUP job queue and submitting these jobs to it lets you control the number of simultaneous backup jobs via the 'Max Active' parameter on the job queue entry.
The third program simply does a SAVLIB of the BACKUP library to tape during normal business hours.
With this approach, tape or device failures can no longer interfere with your critical nightly job stream. Last night's backup is retained on the system for 24 hours for easier restores. You are also actually using the capacity of your system to get your backups done as quickly as possible: multiple processors, disk and memory resources are put to work instead of a single-threaded backup-to-tape procedure.
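The effect of the 'Max Active' setting on total backup time can be estimated with a small model. The following Python sketch is only an illustration, not part of any CL program: the function name, the per-library save times and the assumption that concurrent saves do not slow each other down are all hypothetical. It releases the jobs in queue order and never lets more than max_active run at once.

```python
import heapq

def backup_makespan(save_minutes, max_active):
    """Estimate total elapsed minutes for a queue of library saves.

    save_minutes: hypothetical save time for each library, in queue order.
    max_active:   how many jobs the job queue lets run at the same time.
    Assumes (optimistically) that concurrent jobs do not slow each other down.
    """
    if max_active < 1:
        raise ValueError("max_active must be at least 1")
    running = []  # min-heap holding the finish time of each active job
    for minutes in save_minutes:
        if len(running) < max_active:
            start = 0.0  # a slot is free, so the job starts immediately
        else:
            start = heapq.heappop(running)  # wait for the earliest job to end
        heapq.heappush(running, start + minutes)
    return max(running) if running else 0.0

# Made-up save times for six libraries, in minutes.
library_minutes = [40, 25, 25, 10, 10, 10]
for max_active in (1, 2, 4):
    print(max_active, backup_makespan(library_minutes, max_active))
```

With these made-up numbers the single-threaded run takes the sum of all save times, while higher 'Max Active' values approach the time of the single longest save. On a real system the jobs compete for CPU, memory and disk, so measured times will differ from this idealized estimate.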
What else can be done to improve the duration of a backup procedure?
Analyzing your existing backup job to identify the longest steps in the procedure is a good place to start. Make sure that your current process runs with the 'LOG(4 00 *SECLVL) LOGCLPGM(*YES)' message logging parameters. This ensures that all CL commands are logged in the joblog and that the joblog is always retained on the system for analysis. Find the longest-running steps in the process and make sure that they run as multiple batch jobs in a multi-threaded job queue. Running a single-threaded backup, whether via a CL program or third-party software, guarantees that your backup runs as long as possible while using as few resources as possible. Don't be afraid to bury the system during a backup. It can handle it! If CPU, memory and disk are at capacity, your backup will finish sooner.
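Finding the longest-running steps comes down to measuring the time between consecutive joblog messages. The Python sketch below shows the idea under stated assumptions: the entries have already been extracted into (timestamp, text) pairs, the timestamp format is an assumed one rather than the real joblog layout, and the library names and times are made up for illustration.

```python
from datetime import datetime

# Assumed timestamp format for this sketch; a real joblog is laid out differently.
TIME_FORMAT = "%Y-%m-%d %H:%M:%S"

def rank_steps_by_elapsed(entries):
    """Rank joblog entries by the time that passed before the next entry.

    entries: list of (timestamp_string, message_text) in joblog order.
    Returns (elapsed_seconds, message_text) tuples, longest first.
    The last entry has no following message, so it is not ranked.
    """
    stamps = [datetime.strptime(ts, TIME_FORMAT) for ts, _ in entries]
    gaps = []
    for i in range(len(entries) - 1):
        elapsed = (stamps[i + 1] - stamps[i]).total_seconds()
        gaps.append((elapsed, entries[i][1]))
    return sorted(gaps, key=lambda gap: gap[0], reverse=True)

# Hypothetical joblog entries; LIBA, LIBB and LIBC are invented names.
sample = [
    ("2024-01-15 01:00:00", "SAVLIB LIB(LIBA)"),
    ("2024-01-15 01:42:10", "SAVLIB LIB(LIBB)"),
    ("2024-01-15 01:47:30", "SAVLIB LIB(LIBC)"),
    ("2024-01-15 03:05:00", "Backup complete"),
]
for elapsed, text in rank_steps_by_elapsed(sample):
    print(f"{elapsed / 60:6.1f} min  {text}")
```

The entries at the top of the ranked list are the first candidates to split out into separate batch jobs.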
Is 'save-while-active' safe?
When it comes to reducing the duration of system backups and nightly job streams, using this capability can significantly improve system uptime. Consider creating another job queue called QUERIES or REPORTS. This job queue remains HELD during normal business hours. Your end users and automated job scheduler submit long-running batch reporting jobs to this job queue all day long. If you perform your nightly backups using the 'save-while-active' parameters, you can release this job queue at the beginning of the backups. Because only reporting jobs run from it, no updates to database files will occur. Make sure that you adjust the 'Max Active' parameter on this job queue to the point where your system is buried. Now, rather than having a single-threaded backup followed by a bunch of single-threaded reporting jobs, you are actually using the capacity of the system to get all of this work done sooner.
What about our full system save where all batch subsystems are ended?
You won't be able to multi-thread batch backup jobs while your system is in a restricted state. You can still analyze the backup joblog and possibly clean up unnecessary objects on your system. Limiting the number of multi-member database files can be of great help. One member with 30,000 records will get backed up a lot quicker than 30,000 members with 1 record each. If multiple-member database files are used extensively throughout your application, consider eliminating them. They are not necessary. Consider using a 'break handling program' on your console message queue to automate the system save process. Using a 2nd or 3rd ASP as a temporary staging area for your full system backups is still a good idea. You should find that saving to disk on a secondary ASP is much quicker than saving directly to tape.
The Workload Performance Series software consists of seven different tools that focus on various aspects of performance and systems management. The Workload Navigator, Disk Navigator and Spool Navigator tools can each play a role in addressing the system backup issues mentioned in this publication. The Disk Navigator tool can be used to easily identify unnecessary objects on your system. Any DASD cleanup will directly affect the duration of your backup procedures. The Spool Navigator tool can be used for disk cleanup as well. It provides easy drill-down access to all spooled files on the system, including your joblogs. The Workload Navigator tool provides easy access to completion history for every job on your system. Each of these tools has a 'hook' into one of our newest features. 'Workload Navigator (Joblog Analysis)' features have been added to the software to provide an easy method for analyzing any joblog on the system. Just find the joblog from your last backup and select it with our new joblog analysis options. This takes you to the new set of functions, where you can sort the entries by 'Elapsed'. Data can be dynamically sorted as desired anywhere in our software using the 'F16=Resequence' function key. It is of great help here to sort a joblog by the amount of time that has elapsed between messages in the joblog. You can quickly find the longest-running steps of your backup procedure and start to make some of the changes mentioned earlier. You may instead want to sort the joblog by date and time and then use the 'F21=Print list' function key to generate a much smaller and simpler version of any joblog on the system. If you want to see the additional message information, just use the '8=Details' option on a selected joblog message.
This new set of features provides an online analytical application for accessing the details of any joblog. You can find the information much more quickly and easily than by scanning through a huge joblog spooled file, and far more easily than by printing hundreds or thousands of pages of joblog and working through a hard copy by hand.