Cross-Forest Kerberos / SPNEGO issues

Recently I was confronted with an interesting Kerberos authentication issue, so I thought: why not share it for posterity’s sake?

First a bit of background on the architecture…

We were working on migrating a fairly large customer to our Online Workspace environment. The customer had separate account domains for development, testing, acceptance and production (DTAP). Acceptance and production were part of the same forest; the other two domains were separate forests. There were also separate resource domains for each DTAP stage, all single-domain forests. The resource forests were hosted and maintained at KPN’s Utility Hosting, while the account domains were managed by the customer itself.

We set up 1-way incoming Forest Trusts between our AD and the 4 resource forests, but unfortunately the customer’s AD admin insisted on External Trusts to the acceptance and production domains.

When the first users were migrated to our AD and started testing their applications, they noticed that Single Sign-On (SSO) wasn’t working for their SAP applications. With the SAP GUI it worked fine, but connecting to the web applications resulted in a login page.

The web application was configured to use SPNEGO (Simple and Protected GSS-API Negotiation Mechanism) for SSO. After reproducing the issue while running a Wireshark trace, I noticed the client was requesting a Kerberos TGS ticket for a different ServicePrincipalName (SPN) than the one we put in the address bar:

A DNS lookup for the URL revealed they had configured it as a CNAME record instead of an A record. So when the client resolved the FQDN in the URL, it followed the CNAME and requested a TGS ticket for the real (canonical) FQDN instead. No biggie, as long as there is a corresponding SPN configured on the account hosting the application.
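To make the CNAME effect concrete, here is a minimal Python sketch of how the requested SPN ends up differing from the name in the address bar. The host names and the `CNAME_RECORDS` table are made up for illustration (the real names are not in this post), and the sketch only models the canonicalisation step, not what any particular browser does internally.

```python
from urllib.parse import urlsplit

# Hypothetical DNS data: alias -> canonical name (not the customer's real records).
CNAME_RECORDS = {
    "portal.acc.example.com": "sapweb01.res-acc.example.net",
}


def canonical_name(host, cnames=CNAME_RECORDS):
    """Follow CNAME records until a name without an alias is reached."""
    seen = set()
    host = host.lower().rstrip(".")
    while host in cnames and host not in seen:
        seen.add(host)  # guard against CNAME loops
        host = cnames[host].lower().rstrip(".")
    return host


def spn_for_url(url, cnames=CNAME_RECORDS):
    """Return the HTTP SPN a client would request after canonicalising the host."""
    host = urlsplit(url).hostname
    if host is None:
        raise ValueError(f"no host in URL: {url!r}")
    return "HTTP/" + canonical_name(host, cnames)


if __name__ == "__main__":
    # The alias in the address bar and the server's own name yield the same SPN.
    print(spn_for_url("https://portal.acc.example.com/"))
    print(spn_for_url("https://sapweb01.res-acc.example.net/"))
```

With these assumed records, both URLs produce an SPN built on the server's own FQDN, which is why the ticket request in the trace did not match the address we typed.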

As you can see in the packet trace above, the client (x.x.4.3) contacts the local DC (x.x.1.2), receives a referral ticket, then contacts the DC in the acceptance resource domain (x.y.1.13) where the server resides and gets a valid Kerberos ticket. So from a Kerberos point of view it all works fine: because we have a Forest Trust to the resource domains, our local AD has the Name Suffix Routing information for these domains (stored in the Trusted Domain Object, or TDO) and is able to issue the appropriate referral tickets. However, SSO wasn’t working and the user was still prompted for credentials.

After talking to the SAP administrator I learned that the application was running under a service account in the production account domain. (The reason they configured it this way was probably that in the past they had also configured the trusts between account and resource domains as External Trusts and couldn’t get Kerberos working between them.) So the website FQDN was created in DNS as a CNAME pointing to the server in the acceptance resource domain, but the application was running under a user account in the production account domain. (Are you still following me?)

I first tried adding an entry to the local HOSTS file pointing the website address directly to the server IP, bypassing the CNAME to A-record translation. As a result the client now requested a TGS ticket for an SPN matching the website address, but our KDC wasn’t able to issue a referral ticket: it had no Name Suffix Routing information for the account domains, because External Trusts do not fully support Kerberos authentication. If you would like to know more about this, I would really recommend reading this (6 part!) article:

Packet trace with altered HOSTS file :

(The local KDC has no routing information for the name suffix, so it returns a ‘PRINCIPAL_UNKNOWN’ error.)
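The difference between the two traces can be sketched as a simple routing decision. This is a deliberately simplified model, not how a Windows KDC is actually implemented: the realm names, the suffix table and the `route_tgs_request` function are all hypothetical. It only shows that a referral is possible when a suffix of the SPN's host name is known from a Forest Trust, and that the request fails with `KDC_ERR_S_PRINCIPAL_UNKNOWN` otherwise.

```python
# Hypothetical name suffix routing table, as Forest Trusts would provide it
# via the trusted domain objects: DNS suffix -> realm to refer the client to.
FOREST_TRUST_SUFFIXES = {
    "res-acc.example.net": "RES-ACC.EXAMPLE.NET",
    "res-prd.example.net": "RES-PRD.EXAMPLE.NET",
}

LOCAL_REALM = "WORKSPACE.EXAMPLE.ORG"  # assumed name for our own AD
LOCAL_SPNS = set()  # SPNs registered in the local domain (none in this sketch)


def route_tgs_request(spn, local_spns=LOCAL_SPNS, suffixes=FOREST_TRUST_SUFFIXES):
    """Decide how a simplified KDC would answer a TGS request for `spn`."""
    if spn.lower() in {s.lower() for s in local_spns}:
        return ("TICKET", LOCAL_REALM)

    _service, _, host = spn.partition("/")
    labels = host.lower().split(".")
    # Try the longest suffix first: a.b.c, then b.c, then c.
    for i in range(len(labels)):
        suffix = ".".join(labels[i:])
        if suffix in suffixes:
            return ("REFERRAL", suffixes[suffix])

    # External Trusts contribute no suffixes, so there is nothing to match.
    return ("ERROR", "KDC_ERR_S_PRINCIPAL_UNKNOWN")


if __name__ == "__main__":
    print(route_tgs_request("HTTP/sapweb01.res-acc.example.net"))
    print(route_tgs_request("HTTP/portal.acc.example.com"))
```

In this model the server's FQDN gets a referral because its suffix is routed over the Forest Trust, while the website address (in the account domain namespace, reachable only via an External Trust) matches nothing.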

Now there is one more thing I should mention: even if the trust had been a Forest Trust, this probably wouldn’t have worked, since the website address falls within the namespace of the acceptance account domain, while the application runs as a user from the production account domain on a server in the acceptance resource domain. (It keeps amazing me how difficult some admins make this shit.)

After reading through Jorge’s blog I decided to try the Kerberos Forest Search Order (KFSO) GPO. Since I wasn’t sure how setting this GPO on the KDC would affect name suffix routing over the existing Forest Trusts, I decided to configure it on the clients instead. Unfortunately, this didn’t do jack nor shit.

So I went through the other Kerberos settings in the GPO and noticed this one: ‘Define host name-to-Kerberos realm mappings’.

I couldn’t find much on this setting through Google, so I decided just to give it a try:

In our case we only needed to enable this for a handful of servers, so we added the individual servers to the specified realm. If you need this to work for dozens of servers, I would just enter the entire DNS suffix instead.
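The idea behind the setting can be sketched in a few lines of Python. This is only a model of the lookup, under assumptions: the realm and host names are hypothetical, I treat entries starting with a dot as DNS suffixes and everything else as exact host names, and letting the most specific entry win is a choice made for this sketch, not documented Windows behaviour.

```python
# Hypothetical policy content: realm -> list of host names and DNS suffixes.
# Assumption: entries starting with "." are DNS suffixes, others are exact hosts.
PER_HOST_MAPPING = {
    "PRD.EXAMPLE.COM": ["sapweb01.res-acc.example.net", "sapweb02.res-acc.example.net"],
}
SUFFIX_MAPPING = {
    "PRD.EXAMPLE.COM": [".res-acc.example.net"],
}


def realm_for_host(host, mappings):
    """Return the realm mapped to `host`, or None if no entry applies."""
    host = host.lower().rstrip(".")
    best = None  # (length of matching entry, realm)
    for realm, entries in mappings.items():
        for entry in entries:
            entry = entry.lower()
            if entry.startswith("."):
                matched = host.endswith(entry)
            else:
                matched = host == entry
            # Sketch-only rule: the longest (most specific) entry wins.
            if matched and (best is None or len(entry) > best[0]):
                best = (len(entry), realm)
    return best[1] if best else None


if __name__ == "__main__":
    print(realm_for_host("sapweb01.res-acc.example.net", PER_HOST_MAPPING))
    print(realm_for_host("sapweb09.res-acc.example.net", PER_HOST_MAPPING))
    print(realm_for_host("sapweb09.res-acc.example.net", SUFFIX_MAPPING))
```

With the per-host list only the named servers are sent to the mapped realm; with the suffix entry every host under that DNS suffix is, which is the trade-off between the two approaches mentioned above.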

After configuring this GPO setting and running ‘gpupdate’ it worked! (Well, at least after I removed the entry from the HOSTS file, because as it turned out they had never configured the URL FQDN as an SPN, only the server’s FQDN.)

Voila, a valid Kerberos ticket:

And the packet trace shows that the client is sending the TGS request straight to the correct KDC:

Lessons learned:

  • Avoid using CNAMEs when configuring Kerberos
  • Avoid using External Trusts; if at all possible, always go for a Forest Trust
  • Keep shit simple! Publishing an application running on a server in Domain A, with a service account from Domain B and using the namespace from Domain C is, IMHO, asking for trouble!
  • Enable Kerberos logging (in Windows Server 2012 R2 this is no longer enabled by reg_dword LogLevel=1, but instead in eventvwr under ‘Applications and Services Logs \ Microsoft \ Windows \ Security-Kerberos’: right-click, enable)
  • Use Wireshark (portable) and the built-in ‘kerberos’ display filter.
  • Before reproducing or tracing, always clear your existing Kerberos tickets with the ‘klist purge’ command!