We just rebuilt our domain controllers because the hardware was dated. Our setup is a DNS cluster of 3 dedicated DNS servers configured as secondaries to our 2 DCs, which are the primary DNS servers. Intermittently, the zone transfer from a primary to the secondaries fails, which causes a DNS outage for all of our users.
The error message we see on the Primary (DC) server is this:
The transfer of version 9918764 of zone domain.edu by the DNS server was aborted by the server at 10.X.X.3. To restart the transfer of the zone, you must initiate transfer at the secondary server.
The error message we see on the Secondary servers is this:
Note that we don't see both of those messages at the same time. It's either one or the other.
We did try rebooting everything, manually updating the zones, etc. It will work for a few minutes and then fail again. We will see several successful transfers in the logs in between failures.
There doesn't seem to be any networking issue between the servers. They can all ping each other just fine while the problem is occurring.
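Ping only exercises ICMP, while zone transfers normally run over TCP port 53, so a closer connectivity check would look something like the sketch below. It is plain Python using only the standard library; the function name `tcp_port_open` and the command-line usage are illustrative, and the hosts are whatever addresses are passed in.

```python
import socket
import sys

DNS_PORT = 53  # zone transfers (AXFR) normally use TCP on this port


def tcp_port_open(host, port=DNS_PORT, timeout=3.0):
    """Return True if a TCP connection to host:port can be established."""
    try:
        # create_connection performs the full TCP handshake, unlike ping
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # covers refused connections, timeouts, and unreachable hosts
        return False


if __name__ == "__main__":
    # Hosts are taken from the command line; nothing is hard-coded here.
    for host in sys.argv[1:]:
        state = "open" if tcp_port_open(host) else "unreachable"
        print(f"{host}: TCP {DNS_PORT} {state}")
```

Run from a secondary against each primary (and the other way around) while a failure is happening. A successful connect only shows that the port is reachable, not that the transfer itself is permitted.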
I built a brand new DNS server, and it has the same problem when trying to copy the zone from the primary servers.
The SOA on the primaries looks like this:
Refresh 10 min
Retry 1 min
Expire 1 day
Min TTL 5 Hours
TTL for SOA Record 1 Hour
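To put those timers in perspective, here is a small sketch of the arithmetic, assuming the standard SOA semantics from RFC 1035: a secondary checks the primary every Refresh interval, retries every Retry interval after a failed check, and stops answering for the zone once Expire has passed since the last successful check. The dictionary and function names are made up for illustration, and this does not model how Windows DNS actually schedules its attempts.

```python
# SOA timer values from the list above, converted to seconds.
SOA = {
    "refresh": 10 * 60,        # 10 minutes
    "retry": 1 * 60,           # 1 minute
    "expire": 24 * 60 * 60,    # 1 day
    "minimum_ttl": 5 * 60 * 60,  # 5 hours
}


def retry_window_seconds(soa):
    """Seconds between the first failed refresh and zone expiry."""
    return soa["expire"] - soa["refresh"]


def retries_before_expiry(soa):
    """Approximate number of retry intervals that fit in that window."""
    return retry_window_seconds(soa) // soa["retry"]


if __name__ == "__main__":
    print("retry window (s):", retry_window_seconds(SOA))
    print("retries before expiry:", retries_before_expiry(SOA))
```

With these values a secondary should, in principle, keep serving its last good copy of the zone for about a day of failed transfers before the data expires.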
The new DCs are Server 2008 R2. The DNS cluster nodes are Server 2008 Core.
Any ideas?