Tim Chase on Nostr:
If you're a managed-services provider (MSP) and you plan to create a failover name/alias for a company's primary and secondary servers, make sure that both the primary and secondary servers have that failover name/alias as part of the TLS certs' SAN.
Otherwise, when applications attempt to connect via that name/alias, they will hit TLS certificate failures on the primary, triggering failover to the secondary, which will also produce TLS certificate failures, taking down the entire business for 2+ days.
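For the curious, here is a minimal sketch of how one might check this before pointing anything at the alias. It assumes Python's standard `ssl` module, servers whose certificate chain validates against the system trust store, and made-up hostnames (`db.example.com`, `db1.example.com`, `db2.example.com`). The wildcard handling is deliberately simplified, and the CN is ignored on the assumption that modern TLS clients generally match against the SAN only.

```python
import socket
import ssl


def san_dns_names(cert):
    """Return the DNS entries from the subjectAltName of a getpeercert()-style dict."""
    return [value.lower() for kind, value in cert.get("subjectAltName", ()) if kind == "DNS"]


def name_matches(pattern, hostname):
    """Exact match, or a '*.' wildcard covering exactly the leftmost label (simplified)."""
    pattern, hostname = pattern.lower(), hostname.lower()
    if pattern == hostname:
        return True
    if pattern.startswith("*."):
        _, _, rest = hostname.partition(".")
        return rest != "" and rest == pattern[2:]
    return False


def covers(cert, hostname):
    """True if any DNS SAN entry in the cert matches the given hostname."""
    return any(name_matches(p, hostname) for p in san_dns_names(cert))


def fetch_cert(server, alias, port=443, timeout=5.0):
    """Connect to a specific server, sending the alias as SNI, and return its cert dict.

    The chain is still verified; only the built-in hostname check is switched
    off so that a name mismatch can be reported instead of aborting the handshake.
    """
    ctx = ssl.create_default_context()
    ctx.check_hostname = False
    with socket.create_connection((server, port), timeout=timeout) as sock:
        with ctx.wrap_socket(sock, server_hostname=alias) as tls:
            return tls.getpeercert()


def servers_missing_alias(alias, servers, port=443):
    """Return the servers whose certificate does not cover the failover alias."""
    return [s for s in servers if not covers(fetch_cert(s, alias, port), alias)]
```

With the hypothetical names above, `servers_missing_alias("db.example.com", ["db1.example.com", "db2.example.com"])` should come back as an empty list before the alias goes live. Anything else means the failover path would fail in the same way as the primary.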
As an MSP, you might be so fortunate as to have a competent geek notifying you within hours of the noticed downtime of the exact nature of the issue and how to resolve it. But it's up to you to decide whether to actually fix the issues and reissue the certs with the correct CN/SAN values. Or maybe instead you choose to require every application to switch to referring directly to the primary server, and then have to switch everything back another day.
#TalesFromTheDayjob
Published at 2025-02-20 16:17:31
Event JSON
{
"id": "2455a22e453b93276f01effd84626540dbbdb72349c866f9baf88702411c5bd1",
"pubkey": "9d1fe9f29c7a1e42464c3985f7185fe112b286140d32b8586dd34c6f92d6d9ee",
"created_at": 1740068251,
"kind": 1,
"tags": [
[
"t",
"talesfromthedayjob"
],
[
"content-warning",
"a totally and completely fictional cautionary-tale about certificate management and failover configurations"
],
[
"proxy",
"https://mastodon.bsd.cafe/users/gumnos/statuses/114037112939817311",
"activitypub"
]
],
"content": "If you're a managed-services provider (MSP) and you plan to create a failover name/alias for a company's primary and secondary servers, make sure that both the primary and secondary servers have the that failover name/alias as part of the TLS certs' SAN.\n\nOtherwise, when applications attempt to connect, it will trigger TLS certificate failures on the primary, triggering the failover mode to the secondary, which will also trigger TLS certificate failures, taking down the entire business for 2+ days.\n\nAs a MSP, you might be so fortunate as to have a competent geek notifying you within hours of the noticed downtime of the exact nature of the issue and how to resolve it. But it's up to you to decide whether to actually fix the issues and reissue the certs with the correct CN/SAN values. Or maybe instead you choose to require every application to switch to referring directly to the primary server, and then have to switch everything back another day.\n\n#TalesFromTheDayjob",
"sig": "5b8deeb5b66682ee79a002f53670d6c26ddf2354cb5ee0284036e6a37ab055e5f376d5150902f3ec74b405eb240888d9839c6c44cca4d9f46cf750c0c94714f7"
}