One of the customer’s goals for the migration to Amazon Web Services was enabling SSO (Single Sign-On) – it’s quite convenient, after all. A quick review of the options showed that we could leverage ADFS. The customer already had ADFS deployed for other services, so there was no need to convince the Security Team that it should be used.
After a few days of struggling with the complexity of PingFederate, we managed to enable SSO and get a nice redirection flow through the aforementioned ADFS. The returned SAML assertion had everything we needed, and Windchill said “Error 500”.
A more thorough study of the documentation revealed that SSO with ADFS is definitely supported by Windchill, but creating a new user based on an ADFS SAML response is not. Windchill needs a user base for comparison, which can be provided by querying the domain controller over LDAP, preferably with SSL.
We had infrastructure deployed in AWS and domain controllers in the customer’s server rooms. Exposing the controllers to the world would never get past the Security Team, but we had to query them somehow.
So: connect to the customer over VPN and query the controllers through the tunnel with secure LDAPS. But to keep things from being too easy, a few environments were already deployed in AWS, each in a different VPC, and each of them had to reach AD with its queries. Each VPC used a different subnet, overlapping with the customer’s on-premises networks.
What to do? How to manage?
There were a lot of options (the following are just a few of them):
- Use Transit Gateway and do NAT on the customer’s side
Unfortunately, the already deployed VPCs used the same IP address ranges as the customer’s network. It’s not that it couldn’t be dealt with, but time was running out, and reworking routing in a global organization could take years.
- Add a new VPC with an IP address range the customer does not use yet, set up routing and peering from the already deployed VPCs to the new one, attach a VPN Gateway to this VPC, and voilà.
A short PowerPoint presentation, which shed some light on the problem for the customer, resulted in a decision: create a new VPC, set up the routing, peer the VPCs, connect the VPN, and deploy AWS Directory Service in the shared VPC.
The great mystery of querying one domain for objects from another across an established Trust Relationship remained a mystery. A small failure. Not to mention the settings of the Trust Relationship between the domains.
In the meantime, so as not to block the development of some additional tools, we set up a standard Amazon Linux instance in the shared VPC and, using simple SSH tunneling, worked around a few limitations related to the lack of transit VPC support in AWS.
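The SSH tunneling trick can be sketched like this (the hostnames and ports are illustrative placeholders, not the customer’s real ones):

```shell
# A sketch: forward a local port on the shared-VPC jump host through the
# VPN to the customer's domain controller. Instances in the peered app
# VPCs then talk to shared-vpc-host:1636 instead of reaching AD directly.
ssh -f -N \
    -L 0.0.0.0:1636:dc01.customer.internal:636 \
    ec2-user@shared-vpc-host
```

The `-f -N` flags background the session without running a remote command, so the tunnel can be kept alive from a startup script.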
Another attempt was setting up a read-only domain controller in AWS. This solution would keep the application accessible even if the VPN tunnels had problems. Setting up the VPN went surprisingly well – it basically worked on the first try, so it was left that way.
There we had it: a read-only domain controller that could be queried over LDAP, and everything was fine – but not for the customer’s Security team, which had agreed to this scenario earlier. What can you do – Security knows best, and they would not give us the public key for the internal CA, because it’s not a public key, just an internal one – oh, the great mystery of PKI.
Driven by emotion, we decided we’d had enough of this ‘dance’. A quick review of the situation:
- VPC connected via VPN to the customer’s network – check
- Credentials to log in to AD over LDAP – check
What’s missing?
Something like an LDAP proxy for Active Directory. We set something like that up in the shared VPC, send our LDAP queries to the proxy, it forwards them smoothly to the customer’s AD, and we’re done.
Another 5 hours of attempts and ta-dah! – it works! OpenLDAP does the job. A quick verification in Windchill and the problem’s gone. An EC2 instance with OpenLDAP works and… actually, it’s boring.
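The proxy itself boils down to a few lines of slapd configuration using OpenLDAP’s ldap backend – a sketch along these lines, where the suffix and URI are placeholders for the customer’s real values:

```
# slapd.conf – minimal pass-through LDAP proxy (OpenLDAP back-ldap);
# suffix and URI below are illustrative placeholders
include     /etc/openldap/schema/core.schema
pidfile     /var/run/openldap/slapd.pid
moduleload  back_ldap

database    ldap
suffix      "dc=customer,dc=example"
# every query (and bind) is forwarded over the VPN to the customer's DC
uri         "ldaps://dc01.customer.example:636"
```

In this pass-through setup clients bind to the proxy with their own AD credentials and the proxy forwards the bind, so no service-account password has to be stored on the AWS side.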
Hmm, maybe we should put it into a docker container?
Hmm, maybe we should run the container in ECS?
Hmm, we generally don’t need persistent storage, so maybe Fargate?
Building a container with the LDAP proxy on CentOS really takes just a few lines. Then a quick push to ECR, and the task definition and service can be created.
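Those few lines look roughly like this (a sketch – it assumes a slapd.conf with the proxy configuration sits next to the Dockerfile, and TLS certificate directives are omitted for brevity):

```
# Minimal LDAP-proxy image on CentOS – a sketch, not a hardened build
FROM centos:7
RUN yum install -y openldap-servers && yum clean all
COPY slapd.conf /etc/openldap/slapd.conf
EXPOSE 636
# -d 0 keeps slapd in the foreground (required for a container entrypoint)
CMD ["/usr/sbin/slapd", "-d", "0", "-h", "ldaps:///", "-f", "/etc/openldap/slapd.conf"]
```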
But how do we direct traffic to the service? The IP will be different every time the container restarts. Maybe through a Load Balancer? – nope.
When defining the service, the “Service Discovery” option can be used, which creates a Hosted Zone in Route 53 and keeps an A record pointing at the service up to date – brilliant.
Now it’s enough to associate the VPC with the Hosted Zone and send LDAPS queries to the name registered there.
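With the AWS CLI, the association and a test query look roughly like this (the zone ID, VPC ID, region, DNS name and bind account are placeholders):

```shell
# Associate an app VPC with the private hosted zone created by Service Discovery
aws route53 associate-vpc-with-hosted-zone \
    --hosted-zone-id Z0000000000000 \
    --vpc VPCRegion=eu-west-1,VPCId=vpc-0abc123

# From inside that VPC, query AD through the proxy by its registered name
ldapsearch -H ldaps://ldap-proxy.internal.local:636 \
    -D "svc-windchill@customer.example" -W \
    -b "dc=customer,dc=example" "(sAMAccountName=jdoe)"
```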
Finally, so we don’t have to pay special attention to it later, it’s enough to define a simple health check that makes ECS replace a broken container with a new one – within a minute.
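In the task definition, the health check can be as simple as probing the LDAP port from inside the container (a sketch; once the configured retries fail, ECS kills the task and the service starts a fresh one):

```
"healthCheck": {
    "command": ["CMD-SHELL",
        "ldapsearch -H ldaps://localhost:636 -x -b '' -s base >/dev/null || exit 1"],
    "interval": 30,
    "timeout": 5,
    "retries": 2
}
```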
What about the cost (without VPN)?
- First option, with AWS Directory Service – approximately $90/month
- Second option, with an EC2 instance running the read-only domain controller – approximately $80/month
- Third option, with a container on Fargate – approximately $10/month (0.25 vCPU, 1 GB RAM)