Switchboard does not wait for DNS changes to propagate #26

Open
justSem opened this issue Aug 4, 2022 · 4 comments

justSem commented Aug 4, 2022

While testing this in one of our environments we've run into an issue regarding DNS propagation.

In this case we're running a K8S cluster on DigitalOcean, which also manages the domain.
We haven't changed any DNS-related configuration from the DigitalOcean defaults.

The behavior we're observing is that cert-manager tries to request TLS certificates before DigitalOcean has processed the DNS changes, resulting in cert-manager receiving NXDOMAIN responses until resolver caches have been cleared.

This increases the wait for the whole process to complete from under 60 seconds to more than 3 hours.

Unfortunately none of our developers have done anything with Go, so implementing the changes ourselves would be time-consuming if we want to do it properly (so it's actually production-worthy).

We feel like it'd help to either:

  • Have an option to wait a configurable amount of time for DNS changes to propagate before issuing a certificate request to cert-manager.
    or
  • Have some kind of automation that waits for DigitalOcean to successfully process the DNS changes.

Obviously the first one is the easiest to implement, and would be more than sufficient for most use cases.

Owner

borchero commented Aug 4, 2022

Hmm, how exactly do you issue TLS certificates? Do you add a TXT record for the ACME challenge? If so, cert-manager should handle this problem by itself and I would argue that this issue should rather be redirected to cert-manager than Switchboard.

In my experience, cert-manager has coped quite well with DNS propagation taking a few minutes, but I have only used it on Google Cloud.

Author

justSem commented Aug 5, 2022

We handle ACME certs with standard TLS challenges because we're used to that being faster than waiting for a TXT record (the default certbot scripts from our monolith days used a 100s wait time).

I could try a DNS-based issuer to see if that helps. I'll get back to you on that, but of course the problem with the TLS challenges still exists in that case.

Author

justSem commented Aug 14, 2022

To follow up: using the DNS-01 solver does indeed solve our issue. However, as stated before, the behavior persists whenever the HTTP-01 solver has to be used.

@borchero
Owner

Thanks for the follow-up @justSem! I get the issue now, but I'm unsure whether Switchboard is the right place to solve it.

One thing that would be possible for cert-manager is to bypass DNS caches by querying your authoritative server directly. In fact, I found an open issue that attempts to tackle the problem you describe if I'm not mistaken (see cert-manager/cert-manager#4246).

I'm a bit reluctant to put this into Switchboard as delaying interactions with enabled integrations adds quite some complexity.
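
As a sketch of the cache-bypassing idea mentioned above (not how cert-manager or Switchboard implement anything), Go's standard `net.Resolver` can be pointed at a specific nameserver through its `Dial` hook. The helper names `authoritativeResolver` and `nameserverAddr` and the domain names are made up for illustration, and the approach assumes Go's built-in resolver is available on the platform.

```go
package main

import (
	"context"
	"fmt"
	"net"
	"strings"
	"time"
)

// authoritativeResolver (hypothetical helper) returns a resolver that sends
// every query to nsAddr ("host:port") instead of the system's resolvers.
func authoritativeResolver(nsAddr string) *net.Resolver {
	return &net.Resolver{
		PreferGo: true, // the Dial hook is only used by Go's built-in resolver
		Dial: func(ctx context.Context, network, _ string) (net.Conn, error) {
			d := net.Dialer{Timeout: 3 * time.Second}
			return d.DialContext(ctx, network, nsAddr)
		},
	}
}

// nameserverAddr turns an NS host such as "ns1.example.net." into
// "ns1.example.net:53".
func nameserverAddr(nsHost string) string {
	return net.JoinHostPort(strings.TrimSuffix(nsHost, "."), "53")
}

func main() {
	ctx, cancel := context.WithTimeout(context.Background(), 5*time.Second)
	defer cancel()

	// Find the zone's nameservers through the normal (caching) resolver.
	nss, err := net.DefaultResolver.LookupNS(ctx, "example.com")
	if err != nil || len(nss) == 0 {
		fmt.Println("could not find nameservers:", err)
		return
	}

	// Ask the first nameserver directly. The trailing dot marks the name as
	// fully qualified so resolv.conf search domains are not appended.
	r := authoritativeResolver(nameserverAddr(nss[0].Host))
	addrs, err := r.LookupHost(ctx, "app.example.com.")
	if err != nil {
		fmt.Println("not visible on the authoritative server yet:", err)
		return
	}
	fmt.Println("authoritative answer:", addrs)
}
```

Because the query skips recursive resolvers, a negative answer here is not cached anywhere in between, which is the point of asking the authoritative server first.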
