[Azure] Application Gateway certificate gotchas
At my current assignment, my team is using the Azure Application Gateway to securely expose some services within Azure, such as API Management and Web Apps. Up to a couple of weeks ago, we were using the “old” (what’s old, right?) version of the gateway to do this. Then a production outage woke us up. Let me describe what happened.
End-to-end SSL
The Application Gateway allows you to configure a listening URL that differs from the URL your back-end is using. In our case, some of our back-ends simply use the *.azurewebsites.net certificate, but our front-ends use custom URLs on the customer’s domain. This effectively means that the gateway terminates the “outside” SSL and switches to the internal back-end certificate for internal communication. This way, the entire connection is still secure and thus we have end-to-end SSL.
The V1 gateways have a restriction: you can either rely on the platform-managed certificate for this, or you can provide a custom one, but not both. We were using the latter because our API Management endpoints also run a custom cert for internal traffic (which in turn is a ‘restriction’ / requirement of API Management instances).
Certificate updating
The issue we identified boiled down to the fact that Microsoft had updated their *.azurewebsites.net certificate, but this update didn’t make it to the application gateway instances. So when the back-end hosts started to deploy the new certificate, the gateways started marking the hosts as unhealthy due to an invalid certificate: “BackendServerCertificateNotWhitelisted”. Whoops. It took us a while to find out what was happening, as the configuration hadn’t changed and the certificate itself seemed fine to us. Eventually, forcing an update of the gateway config somehow triggered a refresh of the certificates, which resolved the issue. Microsoft Support confirmed we were not the only ones to have this issue.
Gateway V2: the importance of the certificate chain
After fixing the above issue, support indicated that we might want to consider moving to the V2 SKU of the application gateway. This does not have the limitation of having to pick between a platform-managed certificate and custom certificates; it can mix both. It should also be more resilient to updates of the platform certificates, which I guess we just have to believe then.
And so we updated to V2, only to run into the next certificate-based issue.
sun.security.validator.ValidatorException: PKIX path building failed: sun.security.provider.certpath.SunCertPathBuilderException: unable to find valid certification path to requested target
Great, so now what? We noticed a part of our Java-based landscape falling over with the new gateways in place. Certificate issues, even though we were using the exact same certificates as before. After a bit more investigation we found (using ssllabs.com) that the V2 gateway was returning only the primary certificate, whereas the V1 gateway was returning the full chain. I again got in touch with Microsoft support, who pointed me to https://docs.microsoft.com/en-us/azure/application-gateway/ssl-overview#end-to-end-ssl-with-the-v2-sku.
The problem was that the PFX we were using did not contain the full certificate chain. For the V1 gateway this didn’t matter, but for V2 it does. So again the fix was a lot simpler than finding the actual issue: we exported the certificate again using the “Include all certificates in the certification path if possible” option. This creates a PFX that includes the certificate chain.
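If you would rather check from a terminal than through ssllabs.com, the same observation can probably be made with openssl. The sketch below is not what we ran at the time; the function names are made up for illustration and the host name in the usage note is a placeholder.

```shell
# Count the PEM certificates found on stdin (prints 0 when there are none).
count_pem_certs() {
  grep -c -- '-----BEGIN CERTIFICATE-----' || true
}

# Print what a TLS endpoint sends during the handshake, including every
# certificate it presents (-showcerts).
# $1 = host name (also sent as SNI), $2 = port (assumed to be 443 if omitted).
presented_chain() {
  openssl s_client -connect "$1:${2:-443}" -servername "$1" -showcerts \
    </dev/null 2>/dev/null
}
```

Running `presented_chain gateway.example.com | count_pem_certs` against your own listener should give a quick indication: a count of 1 suggests that only the primary certificate is being sent, without any intermediates.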
After uploading these new certs to KeyVault and updating the gateway instance, everything started working again!
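To see whether a PFX actually carries the chain before uploading it, you can count the certificates inside it. The following is a minimal sketch using a throwaway CA and leaf certificate generated on the spot; all file names, subjects and the password `demo` are placeholders, not our real setup.

```shell
#!/bin/sh
set -eu
work=$(mktemp -d)
cd "$work"

# Throwaway CA and leaf certificate, purely for demonstration.
openssl req -x509 -newkey rsa:2048 -nodes -keyout ca.key -out ca.pem \
  -subj "/CN=Demo CA" -days 1 2>/dev/null
openssl req -newkey rsa:2048 -nodes -keyout leaf.key -out leaf.csr \
  -subj "/CN=demo.example.com" 2>/dev/null
openssl x509 -req -in leaf.csr -CA ca.pem -CAkey ca.key -CAcreateserial \
  -out leaf.pem -days 1 2>/dev/null

# One PFX with only the leaf, one with the issuing certificate added.
openssl pkcs12 -export -out leaf-only.pfx -inkey leaf.key -in leaf.pem \
  -passout pass:demo
openssl pkcs12 -export -out fullchain.pfx -inkey leaf.key -in leaf.pem \
  -certfile ca.pem -passout pass:demo

# Count the certificates stored in a PFX (private keys are skipped).
count_certs() {
  openssl pkcs12 -in "$1" -nokeys -passin pass:demo 2>/dev/null \
    | grep -c 'BEGIN CERTIFICATE'
}

leaf_only_count=$(count_certs leaf-only.pfx)
fullchain_count=$(count_certs fullchain.pfx)
echo "leaf-only.pfx: $leaf_only_count certificate(s)"
echo "fullchain.pfx: $fullchain_count certificate(s)"
```

For a real export, only the counting step is needed. A PFX that contains a single certificate is the kind that worked for us on V1 but not on V2.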
October 9, 2019 at 1:06 pm |
Thanks a lot for sharing your information about ‘Gateway V2: the importance of the certificate chain’. Solved my issue after struggling for two days.
October 9, 2019 at 9:58 pm |
You’re very welcome, great to hear you have resolved the issue!
November 15, 2019 at 8:17 am |
For reasons I don’t fully comprehend, getting V2 to output the intermediate certificate seems impossible.
In my dev environment, I have a trivial Let’s Encrypt cert. Getting AppGw to output the cross-signed X3 intermediate during the TLS handshake eludes me. At this point I’ve tried every single thing I can find on the net, including a PFX with the entire chain up to the root, but to no avail. Only a single certificate gets output.
I don’t know whether something changed in AppGw or whether there is a trivial mistake in my attempts.
December 3, 2019 at 10:35 pm |
That is strange. I would recommend contacting support on this issue. With the correct PFX file (fullchain) it should work.
June 2, 2020 at 1:00 pm |
Do you have an update on this? I am trying to fix an expired Root/intermediate cert and have generated a new full chain pfx but requests to AppGw via curl still fail
March 16, 2020 at 2:19 pm |
Well, just wanted to point out that I had this issue, and doing it like this didn’t help. When I contacted Microsoft support, they used the openssl tool to solve the issue.
Here are the commands they shared with me:
#Convert the PFX to PEM
openssl pkcs12 -in certificate.pfx -out certificate.pem -nodes
#See the placement of the Certs in the Bundle
cat certificate.pem
#Convert correct Root from DER form to PEM
openssl x509 -inform der -in test.cer -out test.pem
#Text editor to add 2 correct Roots in and remove incorrect Root
nano certificate.pem
#Command to convert the PEM bundle to PFX
openssl pkcs12 -export -out new.pfx -in certificate.pem
March 17, 2020 at 8:19 am |
Thanks a lot for sharing that solution!
April 8, 2020 at 12:01 am |
The information about the V2 App Gateway not returning the full chain certificate was very helpful. THANK YOU x 1000.
To resolve the issue I generated a PFX file with my full certificate chain by putting the intermediary certificate and website certificate in a txt file using notepad (you just paste them one right after another). Then I ran the command:
openssl pkcs12 -export -out "PFXFile.pfx" -inkey PrivateKey.pem -in "Intermediary_And_ActualCertInOneFile.crt"
And it worked.
In V1 this wasn’t necessary. You could just install the actual certificate. Why did Microsoft change this?
April 8, 2020 at 10:54 am |
Great to hear this helped, Jeff! And thanks for adding some additional info. I’m not sure why they changed it.
July 2, 2021 at 1:58 pm |
I have exported the .pfx file with “Include all certificates in the certification path if possible”, but when I try to upload the exported PFX file through the Renew Application Gateway certificates option in the Azure portal, it gives me a “password incorrect” error when saving the setting. I am using the same password that I used when exporting the certificate and have tried several times, but no luck. What could be the reason?