Environment: W2K Server, all available service packs and critical updates installed. SERVER01 is the PDC for our single-site AD domain. One additional W2K server, SERVER02, is the only other DC; it holds the Domain Naming role and is the Global Catalog.
Domain Operational Mode is "Mixed Mode". We have 5 W2K3 member servers, 50+ XP Pro PCs, and 30 Linux workstations that access the W2K3 servers via SFU.
All servers are multi-homed with one NIC connecting to internal network (192.168.x.x). Second NIC had, until recently, connected to the public Internet.
10 days or so ago we implemented a firewall. By yesterday we had all servers, except the DCs, moved from public to DMZ (10.0.0.x).
Yesterday morning (7am) I moved the DCs to the DMZ as well. This included changing the NICs' IP addresses from public addresses to DMZ addresses.
All seemed well to begin with but around 5pm we started throwing authentication errors.
I've been digging through everything I can find to determine and fix the problem. Every test with DCDIAG succeeds on both DCs, except the OutboundSecureChannels test:
SERVER01:
* The Outbound Secure Channels test Could not Check secure channel from SERVER01 to domainname.com: The specified domain either does not exist or could not be contacted. Could not Query Trusted Domain :The system cannot find the file specified.
* Secure channel from [SERVER02] to [\\server01.domainname.com] is working properly. Could not Query Trusted Domain :The system cannot find the file specified.
......................... SERVER01 failed test OutboundSecureChannels
The "* Secure channel from [SERVER02] to [\\server01.domainname.com] is working properly." message confuses me. Isn't this saying that SERVER01 CAN talk to SERVER02?
SERVER02:
* The Outbound Secure Channels test Could not Check secure channel from SERVER01 to domainname.com: Win32 Error 1355 Could not Query Trusted Domain :Win32 Error 2
* Secure channel from [SERVER02] to [\\server01.domainname.com] is working properly. Could not Query Trusted Domain :Win32 Error 2
......................... SERVER02 failed test OutboundSecureChannels
Both DCs' TCP/IP settings point only to themselves (127.0.0.1) for the DNS server, on both NICs.
DNS is active ONLY on the internal NIC (since all of the servers and workstations that need to authenticate are on that subnet). I also wanted to avoid listing two A records for each DC (one at 192.168.1.x and another at 10.0.0.x). The firewall does allow unrestricted traffic between Trust and DMZ: I can ping the DCs' 10.0.0.x addresses from a workstation on 192.168.1.x. I just don't want to burden the firewall with unnecessary traffic between the two subnets. Is my thinking wrong here?
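To double-check which subnet each DC name actually resolves into, something like the following could be run from a workstation. This is only a rough sketch in Python: the prefix lengths are guesses (I only know the ranges as 192.168.x.x and 10.0.0.x), the sample addresses are made up, and the function names are hypothetical helpers, not part of any Microsoft tool.

```python
import ipaddress
import socket

# Assumed ranges: the setup is only described as 192.168.x.x (internal)
# and 10.0.0.x (DMZ), so these prefix lengths are guesses.
INTERNAL = ipaddress.ip_network("192.168.0.0/16")
DMZ = ipaddress.ip_network("10.0.0.0/24")


def classify(addr):
    """Label an IPv4 address as 'internal', 'dmz', or 'other'."""
    ip = ipaddress.ip_address(addr)
    if ip in INTERNAL:
        return "internal"
    if ip in DMZ:
        return "dmz"
    return "other"


def resolve_ipv4(hostname):
    """Return the sorted, de-duplicated IPv4 addresses the resolver gives."""
    infos = socket.getaddrinfo(hostname, None, socket.AF_INET)
    return sorted({info[4][0] for info in infos})


def report(hostname):
    """Print each resolved address for hostname with its subnet label."""
    for addr in resolve_ipv4(hostname):
        print(hostname, addr, classify(addr))


# Offline demonstration with made-up sample addresses:
for sample in ("192.168.1.10", "10.0.0.10"):
    print(sample, classify(sample))

# On the real network one would call, for example:
# report("server01.domainname.com")
```

If a DC name ever comes back labelled "dmz", that would mean a 10.0.0.x A record has been registered for it despite DNS listening only on the internal NIC.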
It's Friday night and the weekends are our busiest period of the week. I'm desperate to find a solution.
Somewhere in my digging today I ran across mention of the fact that machine SIDs (GUIDs) are based on network cards. These SIDs are the aliases listed in the _msdcs branch under the domain name in DNS (true??)
The two _msdcs aliases DO match the values returned by some of the DCDIAG tests for each DC, and each points to the correct server name (which resolves to the correct 192.168.1.x address for that DC).
The Secure Channels issue seems to be the problem, right? Am I correct in thinking that I need to reset the passwords between the two DCs so they can communicate? I just don't understand why the PDC, itself, can't communicate with the domain that it controls....
If this is correct is netdom.exe the right tool to use to get things working again? If so, which machine first? SERVER01 (PDC) or SERVER02?
And if this is the correct action, are the following command lines likely to do the trick?:
On SERVER01:
netdom resetpwd /server:server02 /userd:domainname\administrator /passwordd:*
On SERVER02:
netdom resetpwd /server:server01 /userd:domainname\administrator /passwordd:*
Do I execute the command on SERVER01 and then reboot it followed by SERVER02 and another reboot?
Note that I am throwing Event ID: 6702 errors on both DCs in the DNS Event logs. This seems to be saying that the machines cannot update the other DNS servers in the domain with their A records. Is this because the domain is configured to only allow secure updates and the secure channels between the two DCs are hosed?
Event ID 6702 partial text:
DNS Server has updated its own host (A) records. In order to insure that its DS-integrated peer DNS servers are able to replicate with this server, an attempt was made to update them with the new records through dynamic update. An error was encountered during this update, the record data is the error code.
0000: b4 05 00 00
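Since the event says the record data is the error code, the four bytes can presumably be read as a little-endian 32-bit value. A minimal sketch in Python of that decoding, assuming the little-endian layout; the function name and the lookup table are my own. The 1355 and 2 entries pair up with the messages in the two DCDIAG outputs above, while the 1460 entry comes from the standard Win32 error list rather than from anything in the event text.

```python
import struct

# Win32 error codes relevant here. 1355 and 2 match the DCDIAG messages;
# 1460 (ERROR_TIMEOUT) is taken from the standard Win32 error list.
WIN32_ERRORS = {
    2: "ERROR_FILE_NOT_FOUND: The system cannot find the file specified.",
    1355: "ERROR_NO_SUCH_DOMAIN: The specified domain either does not exist "
          "or could not be contacted.",
    1460: "ERROR_TIMEOUT: This operation returned because the timeout "
          "period expired.",
}


def decode_event_data(hex_bytes):
    """Interpret four space-separated hex bytes as a little-endian DWORD."""
    raw = bytes.fromhex(hex_bytes.replace(" ", ""))
    (code,) = struct.unpack("<I", raw)
    return code


code = decode_event_data("b4 05 00 00")
print(code, WIN32_ERRORS.get(code, "unknown"))
```

Read this way, the data decodes to 1460, a timeout, which would suggest the dynamic update never reached the peer DNS server rather than being refused by it.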
HELP! I've got 12 hours to get this working....
TIA, Rob