Federation stops working after enabling a HostingProvider with “EnabledSharedAddressSpace” turned on

The hosting provider could be, for example, Lync Online or Exchange Online (Office 365).

Recently, federation broke while we were implementing “voicemail in the cloud” with Microsoft’s Exchange Online. Microsoft Exchange was configured in hybrid mode and was fully operational: users could be moved back and forth and use either Exchange on-premises or Exchange Online.

Unified Messaging was already enabled and configured in Lync 2013 on-premises. We then extended the configuration to use Exchange Online, for example by following this blog article: http://blog.msgeneral.nl/2011/11/configuring-voicemail-in-cloud-with.html.

Issue

Directly afterwards, however, forwarding calls to voicemail in Exchange Online did not work. On the Lync Front End server we found, among other errors, event log entries stating that no server in the “Dial Plan” had accepted the call.

[Image: ESAP-Issue_EventLog-1]

At about the same time, we found that we had lost the connection to all federated contacts, yet we could not find any related errors in the Event Viewer on the Lync servers. Strange!

Of course, when federation stops working, Microsoft Lync cannot connect to Microsoft Office 365 correctly, so calls to the UM services fail. That part makes sense, but WHY did we lose contact with all federated partners and providers?

The issues arose after configuring the SIP hosting provider for Microsoft Office 365 / Exchange (UM) Online, the only “global” configuration change we made on the Lync side. A hosting provider has only a few settings. One in particular stood out: the “EnabledSharedAddressSpace” switch.

[Image: ESAP-Issue_PowerShell-1]

 

Enabled Shared Address Space

“EnabledSharedAddressSpace” is a switch that controls whether the SIP domain namespaces are “shared” between the on-premises Lync environment and Microsoft Office 365. In other words, it determines whether users in Office 365 can use the same namespace as on-premises users, or whether separate namespaces must be used (“user1@onpremise.contoso.com” and “user2@online.contoso.com”, for example). This switch, at least in our customer’s environment, causes a major change in how federation and related communications work.

In short, it now becomes more important that the Microsoft Lync on-premises environment can look up certain external DNS records. It is crucial that the SRV record _sipfederationtls._tcp.<sip-domain> resolves successfully to the external Access Edge address.

In our customer’s environment, the Edge server uses an internal DNS server for name resolution instead of querying a public DNS server directly. This works well in the usual design: both internal server records and external addresses can be looked up and resolved.

[Image: ESAP-Issue_DNSWrong-1]

However, because the internal DNS server is also authoritative for the SIP namespaces in use, it does not look up EXTERNAL DNS records for these domains. If a record is not defined in its zone, it simply does not resolve, and the Edge server (like any internal server) treats it as nonexistent. This is exactly what happens to our lookups for _sipfederationtls._tcp.<sip-domain>, as well as for the A record of the external Edge IP address that it points to!
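The behaviour described above can be illustrated with a small toy model. This is only a sketch of the reasoning, not a real DNS client: the domain names, host names and IP addresses are made-up examples, and the two dictionaries merely stand in for the public DNS and for the zone held by the internal DNS server.

```python
# Toy model of the split-DNS situation. All names and addresses are hypothetical.

# Records as they exist on the public internet.
PUBLIC_DNS = {
    ("_sipfederationtls._tcp.contoso.com", "SRV"): "0 0 5061 sip.contoso.com",
    ("sip.contoso.com", "A"): "203.0.113.10",
    ("_sipfederationtls._tcp.fabrikam.com", "SRV"): "0 0 5061 sip.fabrikam.com",
}

# Zones the internal DNS server is authoritative for, with the records it holds.
INTERNAL_ZONES = {
    "contoso.com": {
        ("fe01.contoso.com", "A"): "10.0.0.11",
    },
}


def internal_lookup(name, rtype):
    """Resolve a name the way an authoritative internal server would."""
    for zone, records in INTERNAL_ZONES.items():
        if name == zone or name.endswith("." + zone):
            # Authoritative for this zone: answer from local data only.
            # A missing record is reported as nonexistent (None here).
            return records.get((name, rtype))
    # Not authoritative: the query is forwarded to public DNS.
    return PUBLIC_DNS.get((name, rtype))


# A partner domain is forwarded and resolves normally.
print(internal_lookup("_sipfederationtls._tcp.fabrikam.com", "SRV"))
# The own SIP domain is answered locally, and the record is missing.
print(internal_lookup("_sipfederationtls._tcp.contoso.com", "SRV"))
```

In this model, adding the two missing records to the "contoso.com" entry in INTERNAL_ZONES makes the second lookup succeed, which mirrors the fix found during testing.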

After some testing, we found that once the external SRV record and the A record for the external Edge are resolvable from the Lync environment, at the very least from the Lync Edge server, the federation issue is resolved! Both external organisations and hosting providers become “available” again, federation starts working again, and the connection to Office 365 is restored. Our newly created voicemail configuration started working as well.

Solution

Now that we know the cause, we can work towards a solution. There are two ways to do this. Either way, add a large dose of patience:

1. Configure the Microsoft Lync Edge server to use external DNS servers (and use, for instance, a HOSTS file for internal name resolution).

This way, both external organizations and your own SIP domains are looked up on the public internet side, and the public SRV and A records are returned successfully.

2. If that configuration is not desired, add the external SRV record and A record to the internal DNS server(s) as an alternative.

This way, the required DNS records are resolvable for the Lync servers. This was the solution we chose.

[Images: ESAP-Issue_DNSRight-1, ESAP-Issue_DNSRight-2]
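To make the second option concrete, here is a minimal sketch that prints the two records in zone-file style notation for one SIP domain. Everything specific in it is an assumption for illustration: the domain "contoso.com", the host name "sip", the documentation-range IP address, the priority, weight and TTL values, and port 5061 (the port commonly used for Access Edge federation; check your own Edge configuration). On a Windows DNS server you would enter the same values through the DNS management tools rather than a zone file.

```python
def federation_records(sip_domain, edge_host, edge_ip, port=5061, ttl=3600):
    """Return zone-file style lines for the SRV and A record of one SIP domain.

    All arguments are placeholders to be replaced with real values.
    """
    edge_fqdn = f"{edge_host}.{sip_domain}."
    return [
        # SRV record: priority 0, weight 0, then port and target host.
        f"_sipfederationtls._tcp.{sip_domain}. {ttl} IN SRV 0 0 {port} {edge_fqdn}",
        # A record for the target host, pointing at the external Edge address.
        f"{edge_fqdn} {ttl} IN A {edge_ip}",
    ]


# Hypothetical example values.
for line in federation_records("contoso.com", "sip", "203.0.113.10"):
    print(line)
```

If several SIP domains are in use, the function can simply be called once per domain.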

Be aware that this could mean that the last fall-back lookup at clients, sip.<sip-domain>, also returns the public IP address of the Edge. This applies only when sip.<sip-domain> is used as the external Access Edge address and all other automatic lookup methods (Lyncdiscover, SRV record lookup and “sipinternal”) fail.

After applying a solution, give your environment time to pick up the changes. You can of course force a refresh of the DNS cache by running “ipconfig /flushdns”, but this is not enough. Even restarting the Lync (Edge) services might not be sufficient. Allow at least 15-30 minutes after applying the DNS changes and flushing the DNS cache on the servers before expecting any change in behaviour.
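Rather than retrying by hand during that waiting period, the check can be polled. The sketch below is one possible approach under stated assumptions: the function names are hypothetical, the host name in the usage comment is an example, and the helper only checks an A record because the Python standard library has no SRV lookup. A successful lookup shows only that DNS resolves on that server, not that federation itself has recovered.

```python
import socket
import time


def resolve_a(name):
    """Return the IPv4 address for name, or None if it does not resolve."""
    try:
        return socket.gethostbyname(name)
    except OSError:
        return None


def wait_until_resolvable(resolve, name, timeout_s=30 * 60, interval_s=60,
                          sleep=time.sleep, clock=time.monotonic):
    """Poll resolve(name) until it returns an answer or the timeout passes."""
    deadline = clock() + timeout_s
    while True:
        answer = resolve(name)
        if answer:
            return answer
        if clock() >= deadline:
            return None
        sleep(interval_s)


# Example call (hypothetical host name), to be run on the Edge server:
# wait_until_resolvable(resolve_a, "sip.contoso.com")
```

The resolver is passed in as an argument so that a different lookup, for example one that queries the SRV record through an external tool, can be substituted without changing the polling loop.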

 

2 Comments

  1. Hey thanks for your post.

    I have the same issue but the external dns of sip.***.com on the local DNS Server will not work. Have you another solution?

     

     
