Tuesday, 24 September 2013

Troubleshooting Database Mirroring Error: 1418

Error 1418 is a common, hard-to-troubleshoot error. It is common enough to have its own page in Books Online: MSSQLSERVER_1418. Unfortunately, that page lists only a few troubleshooting steps. Over the years, I’ve compiled a longer list, and I’ve encountered every one of these issues at some point.

The full text of the error: The server network address “%.*ls” can not be reached or does not exist. Check the network address name and that the ports for the local and remote endpoints are operational.
  1. Verify service accounts
    1. The SQL Server service account requires CONNECT permission on each partner’s endpoint
    2. Best option is to use domain accounts for all partners
    3. If using Local Service or Local System, must use certificate authentication
    4. If using Network Service, must use the computer account (domain\Computer$)
    5. Windows permissions are irrelevant. Do NOT add the account to the local Administrators group
    6. Check the SQL Server error log for errors
  2. Verify ports
    1. Use Telnet to test that the port is open and something is listening
      1. Good for verifying that the port is not blocked by a firewall
      2. Does not verify that SQL Server is the process on the other end
    2. Use the netstat command to determine that the SQL instance is listening on that port
      1. Get the process ID of the instance using SERVERPROPERTY('ProcessID')
      2. On Windows Server 2003 SP1+: netstat -abn
      3. On Windows Server 2003 pre-SP1: netstat -ano
      4. Verify the process ID if there is more than 1 instance on the server
    3. Confirm that there are no other SQL instances on the server
      1. If other instances, check for mirroring endpoints
      2. If other mirroring endpoints, make sure they are using different ports and/or IP addresses
        1. IP address and port must be a unique combination
    4. Make sure the firewall is not blocking the port
    5. Make sure the firewall allows traffic in both directions
  3. Verify all instances can access every other partner
    1. Ping each partner from each of the other partners
    2. Perform a tracert to each partner from each of the other partners
    3. If ping or tracert fails, double-check the ports (see above)
    4. Verify that the IP addresses the names resolve to are the correct ones (don’t trust them)
    5. Verify that you are using the fully qualified domain name
      1. Do not use the IP address; there are bugs with this
      2. Do not use just the server name
  4. Verify endpoints
    1. Make sure all endpoints are started
      1. Use sys.database_mirroring_endpoints
    2. Ensure that all endpoints are using the same encryption algorithm
      1. Use sys.database_mirroring_endpoints
    3. Ensure that the OS supports the chosen encryption algorithm on all partners
      1. Windows 2000 does not support the default encryption algorithm (RC4)
    4. Make sure the queue for the endpoint is running
      1. Queue can be stopped by too many failures
      2. Drop and recreate the endpoint to start the queue
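
Most of the checks in steps 2 through 4 can be read straight from the catalog views. The queries below are a minimal sketch of that, not a complete diagnostic script: run them on every partner (and the witness, if there is one) and compare the output side by side. The column aliases are my own; the views and columns are the standard ones.

```sql
-- Process ID of this instance; match it against the PID column of netstat -ano.
SELECT SERVERPROPERTY('ProcessID') AS sql_server_pid;

-- State, port, authentication, and encryption settings of each mirroring endpoint.
SELECT e.name,
       e.state_desc,                 -- should be STARTED
       e.role_desc,
       t.port,
       t.ip_address,
       e.connection_auth_desc,
       e.is_encryption_enabled,
       e.encryption_algorithm_desc   -- compare across all partners
FROM sys.database_mirroring_endpoints AS e
JOIN sys.tcp_endpoints AS t
  ON t.endpoint_id = e.endpoint_id;

-- Partner and witness addresses already configured on this instance.
-- Each should be a fully qualified name such as TCP://server.domain.com:5022.
SELECT DB_NAME(database_id) AS database_name,
       mirroring_role_desc,
       mirroring_partner_name,
       mirroring_witness_name
FROM sys.database_mirroring
WHERE mirroring_guid IS NOT NULL;
```

The last query returns rows only for databases where a partner has already been set, so on the principal it may come back empty until SET PARTNER succeeds.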

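For step 1, the CONNECT permission can be checked and granted with T-SQL. This is a sketch under assumptions: the endpoint name Mirroring and the login CONTOSO\sqlsvc are placeholders, not names from this post. Substitute the service account of the partner instance, or its computer account (for example CONTOSO\SERVERB$) if that partner runs as Network Service.

```sql
-- Who currently holds permissions on the mirroring endpoint?
SELECT e.name  AS endpoint_name,
       p.name  AS grantee,
       sp.permission_name,
       sp.state_desc
FROM sys.server_permissions AS sp
JOIN sys.database_mirroring_endpoints AS e
  ON sp.class_desc = 'ENDPOINT'
 AND sp.major_id   = e.endpoint_id
JOIN sys.server_principals AS p
  ON p.principal_id = sp.grantee_principal_id;

-- Placeholder login: the partner's service account must exist as a login here.
IF NOT EXISTS (SELECT 1 FROM sys.server_principals WHERE name = N'CONTOSO\sqlsvc')
    CREATE LOGIN [CONTOSO\sqlsvc] FROM WINDOWS;

-- Placeholder endpoint name: use the name returned by the query above.
GRANT CONNECT ON ENDPOINT::Mirroring TO [CONTOSO\sqlsvc];
```

Run this on each instance for the accounts of the other partners, so that every instance can connect to every other one.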

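If step 4 turns up a stopped endpoint or a stalled queue, the fix looks roughly like this. It is a hedged sketch: the endpoint name Mirroring, port 5022, and the authentication and encryption settings are assumptions for illustration. Copy the real values from sys.database_mirroring_endpoints before dropping anything, and keep the algorithm identical to the one the other partners use.

```sql
-- Start an endpoint that is merely stopped.
ALTER ENDPOINT Mirroring STATE = STARTED;

-- Otherwise drop and recreate it with the same settings as before.
DROP ENDPOINT Mirroring;

CREATE ENDPOINT Mirroring
    STATE = STARTED
    AS TCP (LISTENER_PORT = 5022)              -- assumed port
    FOR DATABASE_MIRRORING (
        AUTHENTICATION = WINDOWS NEGOTIATE,    -- assumed; match the old endpoint
        ENCRYPTION = REQUIRED ALGORITHM RC4,   -- assumed; match the other partners
        ROLE = PARTNER
    );
```

As far as I know, permissions granted on the old endpoint go away with it, so re-grant CONNECT to the partners' service accounts after recreating it.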