Secure HTTPS Against Unknown Hosts
When the hostname field in TLS certificates becomes unnecessary for API clients. Examples with Python requests library.
When establishing an HTTPS connection, verifying the identity of the server is essential to maintaining security. Without identity verification, a malicious actor might be able to route the traffic to a rogue HTTPS server. In web browsers, verification is done by comparing the hostname in the URL with the hostname stated in the TLS/SSL certificate sent by the server. If the hostnames match and the certificate is trustworthy, the server has proven its identity. To trust certificates across arbitrary websites on the internet in a seamless manner, web browsers use a “chain of trust” that relies on a public key infrastructure (PKI).
API clients have a slightly different access pattern than browsers. They usually talk to an API server under a single hostname that is already known when the client is written. Unlike a person surfing the web, an API client rarely accesses many URLs with different hostnames.
The HTTP standard states that the hostname in the requested URL MUST be verified against the hostname stated in the certificate. This prevents illicit HTTPS servers from using a valid but wrong certificate. But an API client that only talks to one server might already know the server certificate beforehand. Luckily, the HTTP standard mentions one exemption:
A client might be specially configured to accept an alternative form
of server identity verification. For example, a client might be
connecting to a server whose address and hostname are dynamic, with
an expectation that the service will present a specific certificate
(or a certificate matching some externally defined reference
identity) rather than one matching the target URI’s origin.
Source: HTTP Semantics, https Certificate Verification (RFC 9110 section 4.3.4)
Can we apply this exemption to our client by verifying that our servers “present a specific certificate”?
This would mean that we can avoid the extra complexity of relying on a public key infrastructure. Below I show two ways of using this exemption. The examples are written using the popular requests library in Python.
Match Against Certificate Fingerprint
A fingerprint is a hash of the certificate, often SHA-1-based. The client can simply compare the fingerprint of the server’s certificate with a certificate fingerprint that the client already trusts:
import requests

class IgnoreHostnameCert(requests.adapters.HTTPAdapter):
    def cert_verify(self, conn, url, verify, cert):
        # Pin the expected certificate fingerprint on the connection pool.
        conn.assert_fingerprint = "720f8dd92888b9971a687c555edcf1d48cc03a74"
        return super().cert_verify(conn, url, verify, cert)

session = requests.Session()
session.mount("https://", IgnoreHostnameCert())
session.get("https://10.0.0.1/my-endpoint")
The requests library allows for advanced TLS verification by subclassing the HTTPAdapter and mounting it to a session before making the request. When conn.assert_fingerprint is set, the library automatically skips checking the certificate’s hostname, so we don’t have to disable it ourselves.
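To pin a fingerprint, the client first needs the value to pin. The following is a minimal sketch of how it could be computed with only the standard library; the function names fingerprint_from_pem and fetch_fingerprint are illustrative and not part of requests. It assumes the fingerprint is the hash of the DER-encoded certificate, written as a hex string.

```python
import hashlib
import ssl


def fingerprint_from_pem(pem_cert: str, algorithm: str = "sha256") -> str:
    """Return the lowercase hex digest of the DER-encoded certificate."""
    # The fingerprint is computed over the binary (DER) form, not the PEM text.
    der_cert = ssl.PEM_cert_to_DER_cert(pem_cert)
    return hashlib.new(algorithm, der_cert).hexdigest()


def fetch_fingerprint(host: str, port: int = 443, algorithm: str = "sha256") -> str:
    """Download the server certificate WITHOUT validating it and hash it.

    Only run this over a network path you already trust, or compare the
    result with a value obtained out of band.
    """
    pem_cert = ssl.get_server_certificate((host, port))
    return fingerprint_from_pem(pem_cert, algorithm)


# Hypothetical usage (needs a reachable server, so it is left commented out):
# print(fetch_fingerprint("10.0.0.1", 443, "sha1"))
```

Passing "sha1" yields a 40-character value like the one used above, while the default "sha256" yields 64 characters. As far as I know, the underlying urllib3 library infers the hash algorithm from the length of the pinned string, so either form should be usable.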
Pinning a fingerprint is a simple way of trusting the server certificate. When we achieve trust like this, the server can use a self-signed certificate. Hence, we avoid dealing with external Certificate Authorities (CA).
One limitation of this solution is that the client needs a fixed list of allowed fingerprints. If the client should trust additional certificates, for example after a certificate is renewed, the client needs an update. The next section shows how this limitation can be mitigated.
Use Self-Signed Certificate Authority
For more flexibility, we can generate our own Certificate Authority (CA) and tell the client to allow all certificates signed by this CA:
import requests

class IgnoreHostnameCert(requests.adapters.HTTPAdapter):
    def cert_verify(self, conn, url, verify, cert):
        # Skip hostname matching; the certificate chain is still verified.
        conn.assert_hostname = False
        return super().cert_verify(conn, url, verify, cert)

session = requests.Session()
session.mount("https://", IgnoreHostnameCert())
# Trust only certificates signed by our own CA.
session.verify = "ca-public-key.pem"
session.get("https://10.0.0.1/my-endpoint")
We disable hostname checking with conn.assert_hostname = False. This connection is still safe, as we are restricting the session to certificates signed by our CA with session.verify = "ca-public-key.pem".
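The same two settings, no hostname check but a mandatory signature from our CA, can be sketched with only the standard library ssl module. This is an illustration of the idea rather than what requests does internally; make_ca_only_context is a hypothetical helper name.

```python
import ssl
from typing import Optional


def make_ca_only_context(ca_file: Optional[str] = None) -> ssl.SSLContext:
    """Build a client context that trusts only ca_file and ignores hostnames."""
    # PROTOCOL_TLS_CLIENT starts with an empty trust store and with
    # certificate verification (CERT_REQUIRED) switched on.
    ctx = ssl.SSLContext(ssl.PROTOCOL_TLS_CLIENT)
    # Skip hostname matching only; the chain must still verify against our CA.
    ctx.check_hostname = False
    if ca_file is not None:
        ctx.load_verify_locations(cafile=ca_file)
    # With no CA loaded, every handshake fails, which is the safe default.
    return ctx


# Hypothetical usage with the file name from the example above:
# import http.client
# ctx = make_ca_only_context("ca-public-key.pem")
# conn = http.client.HTTPSConnection("10.0.0.1", context=ctx)
# conn.request("GET", "/my-endpoint")
```

The important detail is that verify_mode stays at CERT_REQUIRED. Disabling the hostname check and certificate verification at the same time would accept any server.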
Using this solution, our CA’s private key can be used to generate new server certificates which the client automatically accepts without updates. cfssl gencert is one tool that can generate self-signed CAs and signed certificates.
Use Cases for Disabled Hostname Checking
No need to put the hostname in certificates. If the server switches domain name or IP, its certificate is independent and needs no update. Traffic routing and authentication get separated from each other.
Dynamic IPs without DNS. The client might have another way of finding the IP of the server than DNS, for example through peer-to-peer sharing or network scanning. The client can still connect securely to arbitrary IPs when using one of the solutions mentioned above.
SSH connection forwarding works. The -L flag of SSH is a convenient way of establishing a connection from where the SSH server is running and “tunneling” it to the SSH client. This requires binding a port on localhost that then forwards the traffic. With hostname checking disabled, HTTPS connections to the server passing through localhost will still work.
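The dynamic-IP use case above can be sketched as follows: given a list of candidate addresses, for example from a network scan, keep the first one whose certificate matches a pinned SHA-256 fingerprint. The names matches_pin and find_pinned_server are hypothetical, and the sketch uses only the standard library rather than requests.

```python
import hashlib
import hmac
import socket
import ssl


def matches_pin(der_cert: bytes, pinned_sha256: str) -> bool:
    """Compare the SHA-256 of a DER certificate with a pinned hex string."""
    actual = hashlib.sha256(der_cert).hexdigest()
    expected = pinned_sha256.replace(":", "").lower()
    return hmac.compare_digest(actual, expected)


def find_pinned_server(candidates, pinned_sha256, timeout=2.0):
    """Return the first (host, port) presenting the pinned certificate, else None."""
    # Chain and hostname checks are off here; the fingerprint comparison
    # below is the only identity check, and no request data is sent.
    ctx = ssl.SSLContext(ssl.PROTOCOL_TLS_CLIENT)
    ctx.check_hostname = False
    ctx.verify_mode = ssl.CERT_NONE
    for host, port in candidates:
        try:
            with socket.create_connection((host, port), timeout=timeout) as sock:
                with ctx.wrap_socket(sock) as tls:
                    der_cert = tls.getpeercert(binary_form=True)
        except OSError:
            continue  # unreachable host or failed handshake: try the next one
        if der_cert and matches_pin(der_cert, pinned_sha256):
            return host, port
    return None
```

This probe only tells the client where the expected server currently lives. The actual API request opens a new connection, so it should pin the fingerprint again, for example with the adapter shown earlier. The same applies to an SSH tunnel, where the candidate would simply be localhost and the forwarded port.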
Conclusion
Verifying the hostname of a server is not needed for authentication if we expect the server to provide a specific certificate, as mentioned in the HTTP standard. For random web surfing, this authentication strategy is impractical, as we might access websites that are not known beforehand and for which we don’t know which “specific certificate” to expect. There we need to rely on the public key infrastructure to gain trust, and check the hostname to make sure we visit the website we expect to visit. But for API clients, checking the hostname is unnecessary when matching against the server certificate.