Thursday 26 September 2013

DBA Morning Checklist


Backups
- Verify that the Network Backups are good by checking the backup emails. If a backup did not complete, contact _____ in the networking group, and send an email to the DBA group.
- Check the SQL Server backups. If a backup failed, research the cause of the failure and ensure that it is scheduled to run tonight.
- Check the database backup run duration on all production servers. Verify that the average time is within the normal range. Email the networking group about any significant increase in backup duration and request an explanation. This matters because networking copies the database backups to tape at set times, and if the copy starts before the DBA backups have finished, the tape copy will be bad.
- Verify that all databases were backed up. If any new databases were not backed up, create a backup maintenance plan for them and check the current schedule to determine a backup time.
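
The duration and coverage checks above can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the function names, the 1.5x threshold, and the sample figures are hypothetical, and the input data is assumed to have been collected already (for example from the backup history that SQL Server keeps in msdb.dbo.backupset).

```python
from statistics import mean


def slow_backups(history, latest, factor=1.5):
    """Return {server: (latest_minutes, baseline_minutes)} for backups that
    ran longer than `factor` times their historical average."""
    flagged = {}
    for server, minutes in latest.items():
        past = history.get(server, [])
        if not past:
            continue  # no baseline yet, nothing to compare against
        baseline = mean(past)
        if minutes > factor * baseline:
            flagged[server] = (minutes, baseline)
    return flagged


def missing_backups(all_databases, backed_up, excluded=("tempdb",)):
    """Return databases that exist but have no backup from the last run.
    tempdb is excluded because it cannot be backed up."""
    done = {name.lower() for name in backed_up}
    return sorted(
        name for name in all_databases
        if name.lower() not in done and name.lower() not in excluded
    )


# Hypothetical sample data, for illustration only.
history = {"PROD1": [40, 42, 38], "PROD2": [15, 14, 16]}
latest = {"PROD1": 41, "PROD2": 30}

print(slow_backups(history, latest))
print(missing_backups(["Sales", "Billing", "NewApp", "tempdb"],
                      ["Sales", "Billing"]))
```

Any server returned by slow_backups is a candidate for the email to the networking group, and any name returned by missing_backups needs a backup maintenance plan.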


Disk Space
- Verify the free space on each drive of each server. If free space differs significantly from the day before, research the cause and resolve it if necessary. Log files often grow because of monthly jobs.
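
A day-over-day comparison like the one described above might look like the following sketch. The 10 percent threshold, the drive labels, and the figures are assumptions chosen for illustration. How the free-space numbers are collected is left out.

```python
def free_space_changes(yesterday_gb, today_gb, threshold_pct=10.0):
    """Return {drive: percent_change} for drives whose free space moved by
    more than threshold_pct since yesterday (negative means space was lost)."""
    changes = {}
    for drive, before in yesterday_gb.items():
        after = today_gb.get(drive)
        if after is None or before == 0:
            continue  # drive not measured today, or no usable baseline
        pct = (after - before) / before * 100.0
        if abs(pct) > threshold_pct:
            changes[drive] = round(pct, 1)
    return changes


# Hypothetical free space in GB, for illustration only.
yesterday = {"PROD1 D:": 200.0, "PROD1 L:": 80.0}
today = {"PROD1 D:": 198.0, "PROD1 L:": 52.0}

print(free_space_changes(yesterday, today))
```

Here only the log drive is reported, which is the pattern to expect after a monthly job has grown a log file.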


Job Failures
- Check for failed jobs by connecting to each SQL Server, selecting "job activity", and filtering on failed jobs. If a job failed, resolve the issue, contacting the owner of the job if necessary.
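
When many servers are involved, the same filter can be applied to job history rows that have already been collected. The sketch below assumes rows shaped like those in msdb.dbo.sysjobhistory, where step_id 0 is the overall job outcome and run_status 0 means the run failed. The dictionary keys, the owner field, and the sample rows are hypothetical.

```python
FAILED = 0       # run_status value for a failed run
JOB_OUTCOME = 0  # step_id of the row that records the overall job result


def failed_jobs(history_rows):
    """Return sorted (server, job, owner) tuples for jobs whose overall
    outcome was a failure."""
    failures = {
        (row["server"], row["job"], row["owner"])
        for row in history_rows
        if row["step_id"] == JOB_OUTCOME and row["run_status"] == FAILED
    }
    return sorted(failures)


# Hypothetical rows, for illustration only.
rows = [
    {"server": "PROD1", "job": "Nightly ETL", "owner": "etl_team",
     "step_id": 0, "run_status": 0},
    {"server": "PROD1", "job": "Nightly ETL", "owner": "etl_team",
     "step_id": 2, "run_status": 0},
    {"server": "PROD2", "job": "Index Rebuild", "owner": "dba",
     "step_id": 0, "run_status": 1},
]

for server, job, owner in failed_jobs(rows):
    print(f"{server}: {job} failed, owner {owner}")
```

Keeping the owner next to each failure makes it easy to see whom to contact.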


System Checks
- Check SQL logs on each server. In the event of a critical error, notify the DBA group and come to an agreement on how to resolve the problem.
- Check Application log on each server. In the event of a critical or unusual error, notify the DBA group and the networking group to determine what needs to be done to fix the error.
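
Reading the logs by eye scales poorly, so a first pass can be automated. The sketch below scans SQL Server error log text for the "Severity: N" marker that accompanies error entries and keeps those at or above a threshold. Treating severity 17 and higher as critical is an assumption to adjust locally, and the sample lines are made up for illustration.

```python
import re

SEVERITY = re.compile(r"Severity:\s*(\d+)")


def critical_entries(log_lines, min_severity=17):
    """Return (severity, line) pairs for log lines at or above min_severity."""
    hits = []
    for line in log_lines:
        match = SEVERITY.search(line)
        if match and int(match.group(1)) >= min_severity:
            hits.append((int(match.group(1)), line.strip()))
    return hits


# Made-up log lines, for illustration only.
sample_log = [
    "2013-09-26 02:14:07.12 Logon   Error: 18456, Severity: 14, State: 8.",
    "2013-09-26 03:01:44.90 spid61  Error: 9002, Severity: 17, State: 2.",
    "2013-09-26 03:05:10.33 spid7s  This is an informational message only.",
]

for severity, line in critical_entries(sample_log):
    print(severity, line)
```

Anything this returns still needs a person to read it and bring it to the DBA group.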


Performance
- Check performance statistics for all servers using the monitoring tool, then research and resolve any issues.
- Check Performance Monitor on all production servers and verify that all counters are within the normal range.
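
"Within the normal range" is easier to verify when the ranges are written down. The sketch below compares counter readings against a table of expected bounds. The counter names are real Performance Monitor counters, but the bounds and readings shown are hypothetical and should come from each server's own baseline.

```python
def out_of_range(readings, normal_ranges):
    """Return (counter, value, low, high) for readings outside their range.
    Counters without a defined range are skipped."""
    alerts = []
    for name, value in readings.items():
        bounds = normal_ranges.get(name)
        if bounds is None:
            continue
        low, high = bounds
        if not (low <= value <= high):
            alerts.append((name, value, low, high))
    return alerts


# Hypothetical bounds and readings, for illustration only.
normal_ranges = {
    "% Processor Time": (0, 80),
    "Page life expectancy": (300, float("inf")),
}
readings = {"% Processor Time": 92, "Page life expectancy": 4500}

print(out_of_range(readings, normal_ranges))
```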


Connectivity
- Log into the Customer application and verify that it can connect to the database and pull up data. Verify that it is performing at an acceptable speed. In the event of a failure, email the Customer Support Group, DBA group, and the DBA manager, before proceeding to resolve the issue.
- Log into the Billing application and verify that it can connect to the database and pull up data. Verify that it is performing at an acceptable speed. In the event of a failure, email the Billing Support Group, DBA group, and the DBA manager, before proceeding to resolve the issue.
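
The two application checks above share one pattern: run a probe, time it, and classify the result. A minimal sketch follows. The probe callable is a stand-in for whatever actually logs in and pulls up data, and both it and the time limit are hypothetical.

```python
import time


def timed_check(probe, max_seconds):
    """Run probe() and return (status, elapsed_seconds, error_text).
    Status is "OK", "SLOW" (worked but exceeded max_seconds), or "FAILED"."""
    start = time.perf_counter()
    try:
        probe()
    except Exception as exc:
        return ("FAILED", time.perf_counter() - start, str(exc))
    elapsed = time.perf_counter() - start
    status = "OK" if elapsed <= max_seconds else "SLOW"
    return (status, elapsed, "")


def broken_probe():
    # Stand-in for an application login that cannot reach the database.
    raise ConnectionError("login timeout")


print(timed_check(lambda: None, max_seconds=5.0)[0])
print(timed_check(broken_probe, max_seconds=5.0)[0])
```

A "FAILED" result corresponds to the case where the support group, the DBA group, and the DBA manager are emailed before work on the issue starts.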


Replication
- Check replication on each server by checking each publication to make sure the distributor is running for each subscription.
- When replication is stopped, or changes to replication are made, send an email to the DBA group. For example, if a DBA stops the distributor, let the other DBAs know when it is stopped and again when it is restarted.
- Check for any emails from the SQL jobs that monitor row counts on major tables on the publisher and subscriber. If a wide variance occurs, send an email message to the DBAs and any appropriate IS personnel.
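
What counts as a "wide variance" can be made explicit with a small comparison like the one below. It is a sketch only: the 1 percent tolerance, the table names, and the counts are assumptions, and the row counts are taken as already gathered from the publisher and subscriber.

```python
def row_count_variance(publisher, subscriber, tolerance_pct=1.0):
    """Return {table: (publisher_rows, subscriber_rows, percent_diff)} for
    tables whose counts differ by more than tolerance_pct of the publisher."""
    report = {}
    for table, pub_rows in publisher.items():
        sub_rows = subscriber.get(table, 0)  # missing table counts as empty
        if pub_rows == 0:
            pct = 0.0 if sub_rows == 0 else 100.0
        else:
            pct = abs(pub_rows - sub_rows) / pub_rows * 100.0
        if pct > tolerance_pct:
            report[table] = (pub_rows, sub_rows, round(pct, 2))
    return report


# Hypothetical row counts, for illustration only.
publisher = {"Orders": 1_000_000, "Invoices": 250_000}
subscriber = {"Orders": 999_950, "Invoices": 200_000}

print(row_count_variance(publisher, subscriber))
```

A small lag such as the one on Orders is normal while replication catches up, so the tolerance should be loose enough to ignore it.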

