The Domain Name System (DNS)
What is the Domain Name System?
The Domain Name System (DNS) is sometimes described as the Internet’s phonebook. In abstract terms, it is the mechanism that allows us to address an Internet computer by name, rather than having to use an Internet Protocol (IP) address. IP addresses are essential in order for Internet computers to be able to talk to one another, but they are difficult for human beings to remember. Remembering IP addresses is going to become even more difficult as 32-bit IPv4 addresses are gradually replaced by 128-bit IPv6 addresses. The changeover officially began on 6th June 2012 - the date declared by the Internet Society as "World IPv6 Launch Day". Thanks to the DNS, however, we don’t need to know the IP address of an Internet computer in order to access the services it provides.
Let’s say we want to access a document on a particular website. We type the document’s URL into our browser’s address bar and press ENTER. The browser creates an HTTP request packet containing the URL of the document to be retrieved. But where does our computer send the HTTP request packet? We’ve given it a URL containing the website’s domain name, but in order to send the request packet to the computer associated with that domain name, it needs an IP address. Which is where the Domain Name System comes in.
In (very) simple terms, when you ask your web browser to retrieve a document from a website, it sends a query containing the website’s domain name to the Domain Name System asking for the IP address associated with that domain. Once the browser has the IP address, it sends the HTTP request packet to that address. The web server software on the target computer responds by sending the requested document to your browser. The Domain Name System essentially translates the user-friendly domain names used by human beings into the IP addresses used by Internet computers. In order to do so, the DNS must maintain a database containing a list of registered domain names, together with the IP addresses associated with each of those names.
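The lookup step described above can be sketched in a few lines of Python. This is only an illustration, not a description of what any particular browser does internally: it hands the name to the operating system's resolver via the standard library function socket.getaddrinfo. The function name lookup_addresses and its resolver parameter are assumptions made for this example.

```python
import socket


def lookup_addresses(hostname, resolver=socket.getaddrinfo):
    """Return the sorted, de-duplicated IP addresses found for hostname.

    The resolver parameter is a hypothetical hook that lets the function be
    exercised without network access. By default the operating system's
    resolver is used.
    """
    results = resolver(hostname, None)
    # Each result is a 5-tuple whose last item is the socket address.
    # The first element of the socket address is the IP address string.
    return sorted({entry[4][0] for entry in results})
```

Calling lookup_addresses("technologyuk.net") on a machine with Internet access should return whichever addresses are currently published for that name.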
A domain name may be associated with more than one IP address. Popular websites, for example, are usually hosted on multiple servers because the number of HTTP requests sent to that website would far exceed the capacity of a single web server. Various load-balancing methods are used to ensure that traffic is distributed evenly across the available servers. On the other hand, a single server may host multiple websites. This is often the case for web servers operated by web hosting services. Incoming HTTP requests include an HTTP header field called Host that includes the website’s hostname, allowing the server to direct the HTTP request to the correct location.
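To illustrate how the Host header lets one server distinguish between several websites, here is a minimal sketch. The hostnames, directory paths and function names are all invented for the example, and a real web server does considerably more (it would, for instance, deal with a port number appended to the hostname).

```python
def build_request(hostname, path="/"):
    """Build a minimal HTTP/1.1 GET request carrying a Host header."""
    return f"GET {path} HTTP/1.1\r\nHost: {hostname}\r\nConnection: close\r\n\r\n"


def host_from_request(request):
    """Extract the Host header value from a raw HTTP request string."""
    # Skip the request line, then look for the Host header by name.
    for line in request.split("\r\n")[1:]:
        name, _, value = line.partition(":")
        if name.strip().lower() == "host":
            return value.strip().lower()
    return None


# Hypothetical table mapping hostnames to document roots on one shared server.
SITES = {
    "www.example.org": "/srv/example-org",
    "www.example.net": "/srv/example-net",
}


def select_site(request, sites=SITES):
    """Return the document root for the site named in the Host header."""
    return sites.get(host_from_request(request))
```

Two requests arriving at the same IP address are thus routed to different websites purely on the basis of the hostname they carry.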
This all sounds relatively straightforward until you consider just how many registered domain names there are. According to the Domain Name Industry Brief for the second quarter of 2019 issued by Verisign - a major provider of Internet infrastructure services - there are now in excess of 350 million registered domains across all top-level domains (TLDs), with the number currently growing at a rate of around 18 million registrations per year. So, when we say that the DNS must “maintain a database”, we are speaking somewhat figuratively. The DNS can best be described as a worldwide distributed directory service that runs on literally thousands of DNS name servers scattered around the globe.
Without the Domain Name System, the World Wide Web as we know it could not exist. In fact, none of the internet services we take for granted, like email or Internet telephony or bulk file transfer, would work without it. Any failure within the DNS system has the potential to severely disrupt the operation of the many businesses that rely on the Internet to market or distribute their goods or services. Thankfully, although failures do occur from time to time, there is considerable redundancy built into the DNS system to allow for such occurrences, and their impact is usually limited.
How the DNS is organised
The DNS namespace has an inverted tree structure, at the top of which is the DNS root zone. The root zone itself does not have a name, and is represented by an empty string (“”). The next layer in the DNS hierarchy consists of the top-level domains (TLDs). In 1985, there were three country code TLDs (us, uk and il) together with six generic top-level domains:
- com - commercial organisations
- edu - U.S. educational institutions
- gov - U.S. national and state government agencies
- mil - U.S. military
- net - networks
- org - organisations
The edu, gov and mil TLDs were (and still are) only available to U.S. educational institutions, government agencies, and the U.S. military respectively. The int domain label was introduced in 1988, and is similarly restricted to organisations, offices and programs that are endorsed by a treaty between two or more nations. 1986 saw the introduction of eight more country code TLDs (au, de, fi, fr, jp, kr, nl, and se).
The remaining TLDs are available to anybody who wishes to register a domain name, although they were originally intended for use by organisations matching specific criteria. The com TLD was intended for commercial (i.e. for-profit) business organisations; net was intended for domains representing distributed computer networks; and org was intended for non-commercial (i.e. non-profit) organisations.
Today, there are literally hundreds of top-level domains, most of which you have probably never heard of, and a number of which are subject to restrictions of one kind or another in relation to hostname registration. Examples of TLDs you might have come across include biz (created as an alternative to com for commercial organisations), and info (intended for websites whose primary function is to provide information of some kind).
As of May 2017, there were 255 country code TLDs (for example uk representing the United Kingdom, or de representing Germany). You can see an alphabetical listing of these TLDs here. There are also a growing number of top-level domains related to specific geographical locations, such as berlin, london and paris, and to brand names such as cartier, panasonic and walmart.
The overall structure of the Domain Name System has not changed significantly since the publication of RFC 1591 in March 1994, which carries the title “Domain Name System Structure and Delegation”. RFC 1591 has this to say on the way in which names are organised in the DNS:
“. . . there is a hierarchy of names. The root of system is unnamed. There are a set of what are called "top-level domain names" (TLDs). These are the generic TLDs (EDU, COM, NET, ORG, GOV, MIL, and INT), and the two letter country codes from ISO-3166. It is extremely unlikely that any other TLDs will be created.
Under each TLD may be created a hierarchy of names. Generally, under the generic TLDs the structure is very flat. That is, many organizations are registered directly under the TLD, and any further structure is up to the individual organizations.
In the country TLDs, there is a wide variation in the structure, in some countries the structure is very flat, in others there is substantial structural organization. In some country domains the second levels are generic categories (such as, AC, CO, GO, and RE), in others they are based on political geography, and in still others, organization names are listed directly under the country code.”
The words “It is extremely unlikely that any other TLDs will be created” must rank alongside Thomas J. Watson’s 1943 alleged statement that “ . . . there is a world market for maybe five computers" or Bill Gates’s supposed utterance at a trade show in 1981 in defence of the IBM PC’s 640K usable RAM limit that “640K ought to be enough for anybody”.
At the time of writing (September 2019) there are over 1,500 top-level domains. Just under three hundred of these domains are country code TLDs. Of the remainder, the vast majority are generic TLDs. A current list of TLDs, including the name of the organisation responsible for managing each TLD, is available here.
The structure of the domain name system below the top-level domains is relatively flat, with most registered names belonging to either second-level or third-level domains. This website (technologyuk.net) is an example of a second-level domain. Many of the third-level domains are sub-domains of a generic second-level domain such as co or ac (e.g. google.co.uk). Many third- or fourth-level domains are sub-domains of a registered domain name.
The Domain Name System has a hierarchical structure
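The idea of domain levels can be made concrete by splitting a name into its labels and reading them from right to left, i.e. from the top of the tree downwards. The following is a small sketch only, and the function names are our own invention.

```python
def labels_from_root(domain_name):
    """Split a domain name into labels, ordered from the TLD downwards."""
    # A trailing dot (representing the unnamed root) is ignored.
    labels = domain_name.rstrip(".").lower().split(".")
    return list(reversed(labels))


def domain_level(domain_name):
    """Return the depth of a name in the tree (a TLD is level 1)."""
    return len(labels_from_root(domain_name))
```

Using these functions, technologyuk.net has two labels (net, technologyuk) and is therefore a second-level domain, whereas google.co.uk has three (uk, co, google) and is a third-level domain.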
The root name servers
The root name servers are a collection of several hundred name servers located at strategic locations around the world. Their purpose is to act as name servers for the root zone of the DNS. Each root name server maintains a copy of the root zone file - a relatively small file (currently just over 2 MB) containing a list of the names and IP addresses for the authoritative name servers for all top-level domains (you can obtain a copy of the root zone file here).
If name servers working at a lower level in the DNS hierarchy are asked to resolve a DNS query, but do not currently hold any information about the top-level domain specified in the query, they will request the information from a root name server. The root name server will respond by returning a list of the authoritative name servers for that top-level domain.
There are ostensibly just thirteen root name servers, which seems an awfully small number when you consider the number of requests they could be asked to handle. The number of DNS queries generated in a single day worldwide is staggering - several trillion per day! Even if only a fraction of these queries were sent to the root name servers, the workload would be too much for thirteen servers to handle.
All is not as it seems, however. Each of the thirteen root server names actually represents a cluster of root name servers, each of which has its own copy of the root zone file, and each of which is capable of handling DNS requests. As of August 2019, there are over one thousand root name servers. But why do they sit behind just thirteen server names?
The answer to that question is that, back in the early days of the DNS system, a decision was taken that it should be possible to transmit the server names, IP addresses, and configuration data for all of the root name servers in a single data packet in order to avoid the additional overhead of transmitting multiple packets.
Furthermore, it was decided to use User Datagram Protocol (UDP) in preference to the Transmission Control Protocol (TCP) because UDP is a connectionless protocol that involves far less overhead than TCP, which is a connection-oriented protocol. Don’t worry if you are unfamiliar with these protocols. The important thing to understand is that, for reasons of efficiency, all of the root name server details were to be transmitted in a single UDP datagram.
Unfortunately, the DNS protocol (which we’ll talk about later) effectively limits the payload of the UDP datagram to 512 bytes (the actual maximum length of a UDP datagram, including header and data, is 65,535 bytes). The server name, IP address and configuration data for each root name server required, on average, 32 bytes of data.
Consequently, it was decided to limit the number of root name servers to thirteen, accounting for 416 bytes of payload space. The remaining 96 bytes were reserved for future use in case additional supporting data was required, and leaving open the option of adding additional root name servers at some future date.
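The arithmetic behind these figures is easy to check. The short sketch below simply restates the numbers given above (the variable names are our own):

```python
PAYLOAD_LIMIT = 512      # bytes available in a classic DNS message over UDP
BYTES_PER_SERVER = 32    # average per root name server, as stated above
ROOT_SERVER_NAMES = 13   # number of root server names

used = ROOT_SERVER_NAMES * BYTES_PER_SERVER   # space taken by server details
spare = PAYLOAD_LIMIT - used                  # space left for future use
print(used, spare)  # prints: 416 96
```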
A DNS server that initiates DNS requests on behalf of a client is known as a DNS recursive resolver (or just resolver). In order to function, the resolver must know the names and addresses of the root name servers. This information is stored in the resolver’s root hints file, which on Microsoft Windows DNS servers is by default a file called Cache.dns (other DNS server software uses different file names). The initial version of this file is provided by the server software vendor or distributor. Over time, however, the contents of the root hints file may become outdated.
The resolver requests an updated list of the root name servers each time it boots up, or after some pre-defined period of time has elapsed. It sends a request called a priming query to the first root name server listed in its root hints file asking for updated information. If it receives no response, it sends the same request to the next root name server. The resolver will repeat this procedure with successive root name servers until it is successful in getting an up-to-date list. The process of priming a DNS resolver with priming queries is described by RFC 8109.
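The retry procedure described above might be sketched as follows. This is a simplified illustration rather than the algorithm specified in RFC 8109: the function send_priming_query is a hypothetical stand-in for the code that actually sends the query, and the decision to fall back on the existing hints if no server responds is an assumption made for the example.

```python
def prime_resolver(root_hints, send_priming_query):
    """Try each root server address in turn until one answers.

    root_hints is a list of server addresses taken from the root hints file.
    send_priming_query is a caller-supplied function (hypothetical here) that
    returns the updated server list, or None if the server did not respond.
    """
    for address in root_hints:
        updated = send_priming_query(address)
        if updated is not None:
            return updated
    # Assumption: if no root server answered, keep using the existing hints.
    return list(root_hints)
```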
IANA’s current root hints file can be downloaded here.
The thirteen root server clusters are administered by a total of twelve administrative entities (Verisign administers two clusters). A list of the root server names, IP addresses (IPv4 and IPv6), and the names of the entities responsible for administering each server cluster can be found here. You can see an interactive map showing the distribution of the root name servers on the Root Servers Website.
The image below is a screenshot of the interactive map at the Root Servers Website showing root name server distribution in the London area. Clicking on the pushpin symbol for an individual server on the map opens an information box showing the location, operator, IPv4 and IPv6 addresses, and autonomous system number (ASN).
Root name server distribution in the London area
The letter on the pushpin symbol relates to the server name associated with the root name server cluster to which the server belongs (in this case k.root-servers.net). The root server names have been standardised so that they all use the same format (<letter>.root-servers.net), where the first part of the name is a single letter of the alphabet, starting with “a” and ending with “m”.
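Because the names follow such a regular pattern, the full list can be generated rather than typed out. A minimal sketch:

```python
import string

# The first thirteen letters of the alphabet are "a" to "m".
root_server_names = [
    f"{letter}.root-servers.net" for letter in string.ascii_lowercase[:13]
]
```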
The addition of IPv6 addresses to the information to be returned in response to a priming query obviously means that the 512-byte payload limit for a UDP datagram carrying a DNS response will be exceeded. In order to get around this, the Extension Mechanisms for DNS (EDNS) were developed to allow for a larger payload.
Note that a DNS server can handle requests from clients that don’t support the EDNS extension, but it cannot use EDNS to do so. The response to such a query will be inserted into a default 512-byte message. If the response exceeds 512 bytes, the message is marked as a truncated result. The client then has the option to retry using TCP.
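A client can tell that a response has been truncated by inspecting a single bit in the message header. The sketch below assumes the standard DNS header layout from RFC 1035 (a 12-byte header in which the second 16-bit field holds the flags, with the TC bit at mask 0x0200). The function names are our own.

```python
import struct

TC_FLAG = 0x0200          # truncation bit in the DNS header flags field
CLASSIC_UDP_LIMIT = 512   # payload limit in bytes without EDNS


def is_truncated(message):
    """Return True if the TC bit is set in a raw DNS message."""
    if len(message) < 12:
        raise ValueError("a DNS message header is 12 bytes long")
    # The header starts with a 16-bit ID followed by 16 bits of flags,
    # both in network (big-endian) byte order.
    _ident, flags = struct.unpack("!HH", message[:4])
    return bool(flags & TC_FLAG)


def should_retry_over_tcp(message):
    """A client may choose to retry over TCP after a truncated UDP response."""
    return is_truncated(message)
```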
Each of the root name servers (except one) has its own informational home page, which you can access by typing the URL of the page in the form http://<letter>.root-servers.org, where <letter> is in the range a – m, into your browser’s address bar. The exception is the server g.root-servers.net, which is administered by the U.S. Department of Defense. The screenshot below shows the home page for root name server b.root-servers.net (http://b.root-servers.org), which is administered by the University of Southern California.
The home page for root name server at b.root-servers.net
With regard to the administration of the DNS namespace, RFC 1591 states:
“The Internet Assigned Numbers Authority (IANA) is responsible for the overall coordination and management of the Domain Name System (DNS), and especially the delegation of portions of the name space called top-level domains.
. . . A central Internet Registry (IR) has been selected and designated to handle the bulk of the day-to-day administration of the Domain Name System. Applications for new top-level domains . . . are handled by the IR with consultation with the IANA.
The central IR is INTERNIC.NET. Second level domains in COM, EDU, ORG, NET, and GOV are registered by the Internet Registry at the InterNIC . . . .
. . . . While all requests for new top-level domains must be sent to the Internic . . . the regional registries are often enlisted to assist in the administration of the DNS, especially in solving problems with a country administration.”
InterNIC is the name commonly used to refer to the Network Information Center (NIC), which was the organisation primarily responsible for DNS domain name allocation. Responsibility for running the InterNIC was taken over by the Internet Corporation for Assigned Names and Numbers (ICANN) in 1998. The InterNIC still provides reference documents and information related to domain registration, but the registration of domain names has now been delegated to commercial domain registrars.
Of the 1500 plus top-level domains now in existence, some thirty or so are currently not assigned to an administrative entity. The remaining TLDs are administered by a total of almost 850 separate companies and organisations. Most of these administrative entities are responsible for either a single domain or a small group of domains. In some cases, an entity may administer a much larger number of domains (examples include Amazon Registry Services, Inc. and Binky Moon, LLC).
According to ICANN:
“The role of the registry operator within the Internet ecosystem is to keep the master database of all domain names registered in each top-level domain (TLD) and generate the "zone file" that allows computers to route Internet traffic to and from TLDs anywhere in the world.”
A DNS zone is some portion of the DNS namespace under the administrative control of a single organisation, company or individual. These administrative zones form a hierarchy, at the top of which is the DNS root zone. As of 2016, the day-to-day management of the root zone is the responsibility of the Internet Assigned Numbers Authority (IANA) acting on behalf of the Internet Corporation for Assigned Names and Numbers (ICANN).
Other organisations contributing to the root zone management process include the National Telecommunications and Information Administration (NTIA) and the root server operators (see above), including ICANN and Verisign. The primary role of the root server operators is to maintain the operational status and security of the root name servers, to publish the root zone file, and to ensure that the contents of the root zone file are always accurate and up to date.
The next level of administration consists of the zones associated with each top-level domain (TLD). Management of each TLD is delegated to a single organisation, which is then responsible for maintaining the registry for that TLD. The registry is the database of all domain names registered under the TLD. Registry operators are responsible for providing registry services and publishing the zone file for the TLD.
The zone file for a TLD does not contain the DNS records for every domain registered under that TLD. Instead, it maps active second-level domain names for the TLD to the IP addresses of the authoritative name servers for those domains. Authoritative name servers are typically owned and operated by large organisations like Google and Microsoft, or ISPs such as Virgin Media, BT Group Plc., and Comcast, although they can also be operated on behalf of such organisations by an external provider.
Some educational institutions operate their own authoritative name servers. In the UK, for example, both the University of Oxford and the University of Cambridge have their own name servers, as do many universities in the USA. Government organisations also tend to operate their own name servers, or delegate the task to government-sponsored organisations.
The primary authoritative name server for the www.gov.uk website is operated by Jisc (formerly the Joint Information Systems Committee), a UK non-profit company that supports tertiary education, and both undertakes and supports research and development work related to new technologies. Funding is provided by UK further and higher education funding bodies, and through contributions received from higher education institutions.
Administration of the country code top-level domains (ccTLDs) is delegated to national registries. In the United Kingdom, for example, administration of the uk ccTLD is the responsibility of Nominet UK.
An authoritative name server for a domain maintains information about that domain, which it stores in its zone file as a set of resource records (RRs). This information is used to respond to any DNS queries received for the domain. When a domain name is registered with a TLD or transferred to a different administrative entity, the names of at least two authoritative name servers for the domain must also be provided (most DNS zones use at least four name servers).
The authoritative name servers for a zone are usually configured as either masters or slaves. The master is the primary authoritative name server that stores the original (i.e. master) version of the zone file. Each secondary authoritative name server is a slave that maintains a copy of the zone file. The copy is automatically updated when changes are made to the master file - a process known as a zone transfer.
The overall structure of a zone file is defined in RFC 1034 and RFC 1035 (note that although these RFCs were written in 1987, and have been updated by a number of other RFCs since that time, the overall format of the zone file has remained essentially the same). The zone file itself is a text file, the bulk of which consists of resource records. Each resource record starts on its own line, and consists of several fields separated by white space, as shown here:
The structure of a DNS resource record
The following list describes the content of each field in more detail:
- name - the alphanumeric identifier for the DNS record. If left blank, it will inherit its value from the previous record.
- ttl - the time to live value. Specifies the time the record data should be kept in the DNS client’s cache. If no value is specified, the global time to live value specified at the beginning of the zone file is used.
- record class - indicates the namespace to which the record data belongs. This will almost always be IN (the Internet namespace).
- record type - the DNS record type (see below).
- record data - the contents of this field will depend on the record type.
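To make the field layout concrete, the following sketch parses a single resource record line into the five fields listed above. It is deliberately simplified: it assumes that all five fields are present on one line, and so does not handle a blank name, an omitted ttl, comments or multi-line records. The sample record used in the test is invented for illustration.

```python
from collections import namedtuple

ResourceRecord = namedtuple(
    "ResourceRecord", "name ttl record_class record_type data"
)


def parse_record(line):
    """Parse a one-line resource record with all five fields present."""
    # Split on runs of white space, at most four times, so that the record
    # data field keeps any internal spaces (e.g. "10 mail.example.org.").
    name, ttl, record_class, record_type, data = line.split(None, 4)
    return ResourceRecord(name, int(ttl), record_class, record_type, data.strip())
```

For example, parsing the line "example.org. 3600 IN MX 10 mail.example.org." yields a record whose type is MX and whose data is "10 mail.example.org.".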
The record type field in a resource record takes one of the following values:
- A - the record contains an IPv4 address for the domain.
- AAAA - the record contains an IPv6 address for the domain.
- CNAME - the record contains a canonical name that establishes the domain as an alias for the domain specified by the CNAME value, to which the DNS client will be re-directed.
- NS - a name server record that contains the name of an authoritative name server for the domain. The records for a domain usually include multiple name server records. One will point to a primary authoritative name server that holds the master zone file for the domain. The remainder will point to secondary authoritative name servers, each of which holds a copy of that zone file.
- SOA - a start of authority record that identifies the administrative entity to whom the domain has been assigned. The SOA record includes the name of the primary name server for the domain and the email address of the zone administrator. Note that the @ sign is replaced with a dot (“.”).
- PTR - a reverse DNS lookup record. Whereas an A record resolves a domain name to an IPv4 address, a PTR record resolves an IPv4 address to a domain name. Note that if the domain shares an IP address with other domains - often the case in a shared web hosting environment – the domain name returned will be that of the host server.
- MX - the mail exchanger record contains the name of an SMTP mail server to which email for the domain should be routed. The name is preceded by a preference value. If multiple mail servers are available, email will be sent to the mail server with the lowest preference value. If the send operation fails, the message is sent to the mail server with the next lowest preference value.
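The way in which MX preference values determine the order of delivery attempts can be sketched as follows. The function name and the tuple format are assumptions made for the example. Note that servers with equal preference values are simply ordered by name here, whereas a real mail system would typically share the load between them.

```python
def delivery_order(mx_records):
    """Return mail server names in the order they should be tried.

    mx_records is a list of (preference, server_name) tuples. The server
    with the lowest preference value is tried first.
    """
    return [server for _preference, server in sorted(mx_records)]
```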
If you want to find out which authoritative name servers host the DNS records for a particular domain, there are many free online tools available that allow you to do just that. The output below was generated using the NsLookup network tool provided by the CentralOps.net website.
Data returned by the CentralOps.net NsLookup network tool for technologyuk.net
As you may have noticed, there is currently no AAAA record for technologyuk.net. At the time of writing fewer than 20% of active websites worldwide have an IPv6 address, but the number is growing (estimates vary from 10% to 15% annually). There is also no CNAME record. All of the other record types mentioned above are present, however.
The A record indicates that the IPv4 address for technologyuk.net is 220.127.116.11. As you can see, there are four name server (NS) records, indicating that technologyuk.net has four authoritative name servers. The SOA record indicates that the primary authoritative name server is ns1062.ui-dns.org, and gives the administrator email address as email@example.com.
The names of two mail servers are provided by two mail exchanger (MX) records, although the preference value is the same for both, which means that mail will be distributed evenly to both servers, probably using some form of load balancing algorithm. The remainder of the data, starting with the PTR record, relates to reverse DNS lookup.
The technologyuk.net website does not currently reside on a dedicated server, so the PTR record resolves the IPv4 address to the name of the shared server or (more likely) server cluster, which in this case is elastic-ssl.ui-r.com. All of the resource records that follow the PTR record relate to the domain elastic-ssl.ui-r.com.
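A reverse lookup works by querying for a PTR record under a special name derived from the IP address, in which the octets of an IPv4 address are reversed and the suffix in-addr.arpa is appended. Python's standard ipaddress module can construct this name for us. The sketch below uses an address from a range reserved for documentation, and the function name is our own.

```python
import ipaddress


def reverse_lookup_name(address):
    """Return the name that is queried for PTR records for an IP address."""
    return ipaddress.ip_address(address).reverse_pointer


print(reverse_lookup_name("192.0.2.10"))  # prints: 10.2.0.192.in-addr.arpa
```

To perform the lookup itself, the standard library function socket.gethostbyaddr can be used on a machine with Internet access.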
Domain name registration
Before a domain name can be used to access a website, it must be registered with the Domain Name System. Registering a domain is actually relatively easy. Probably easier, in fact, than finding the perfect domain name. Before we look at how you might go about registering a domain name, however, it might be informative to briefly look at some of the relevant history.
Way back in the 1990s (which to some of us doesn’t seem all that long ago . . . ), life was relatively simple. Up until 1999 a company called Network Solutions Inc. (NSI) was not only responsible for operating the registries for the com, net and org top-level domains, it was also the sole registrar for these domains. It would not be long, however, before NSI’s monopoly in this respect was challenged.
To cut a long story short, other companies wanted a slice of the action, and despite the failure of legal manoeuvrings that included an anti-trust suit against NSI, the pressure that was created would lead to a restructuring of the domain name market in which domain name registration services would be shared between multiple competing commercial entities. Since 1999, literally hundreds of companies have entered the market as domain name registrars.
So how do you go about registering a domain name? The first step - assuming you’ve dreamed up the perfect name for your website - is to make sure that the domain name you want is available, and hasn’t already been registered by someone else. There are numerous websites that allow you to check the availability of a domain name. For the purposes of illustrating how easy the process is, we’ll use the Domain Check facility provided by CentralOps.net.
Let’s assume we are avid fans of the canine species and want to set up a website with the URL www.ilovepuppies.com. The Domain Check facility comes back and tells us the domain is taken. Sure enough, when we type the URL into a browser’s address bar and hit ENTER, we are taken to the following page:
The current home page of www.ilovepuppies.com
This is quite obviously a “parked” domain that somebody expects to make money out of at some point. Lesson number one: if you’ve thought of the perfect domain name for your project, the odds are better than average that someone else got there before you - either in a speculative bid to cash in on a desirable domain name or because they had the same idea as you.
All is not lost, however. It doesn’t have to be a dot.com name, after all. In fact, a website about cute puppies is not really something you would associate with a commercial organisation, which is what the com domain was intended for. There are other, perhaps more suitable options. How about www.ilovepuppies.info? Not so trendy, maybe, but being trendy is far less important than providing good content. And this time, the Domain Check facility tells us the domain name is available!
Businesses and organisations large enough to run their own local or wide area network will already have one or more registered domain names. The rest of us need to use the services of a domain name registrar. Most Internet Service Providers (ISPs) offer comprehensive web hosting and domain name registration services, and there are plenty of independent web hosting services like GoDaddy.com, LLC or 123-Reg Limited that offer domain name registration as part of their product package.
The cost of setting up a web site very much depends on what you want to do with the web site and on the top-level domain you choose. For example, 1&1 IONOS currently offer a basic package including one free domain name, one wildcard SSL certificate, 25 email accounts with 2GB storage each, 100GB of web space, and 25 databases. The price currently is £1.00 per month for the first six months and thereafter £5.00 per month.
Many companies offer this kind of package, which is ideal if you just need a basic website. The total annual cost is usually affordable, and domain name registration is taken care of by the service provider. Bear in mind, however, that to get the domain name you actually want, you may have to pay a bit more.
All companies offering domain name registration services must be registered with ICANN. You can see a list of all ICANN-accredited domain name registrars here. As part of the registration process, the registrar must provide the registry with the names of at least two authoritative name servers for the domain.
The annual charges which the customer (the domain registrant) pays to an ISP or web hosting service include a contribution to the fees paid annually by the registrar to the registry operator. An important point to note here is that a domain registrant does not own a domain name, but they have the exclusive right to use that domain name during the period for which the domain is registered to them. This is usually two years, although domain names may be registered for up to ten years.
In most cases, domain name registration is renewed automatically at the end of a registration period unless the registrant has given notice that the domain name registration is not to be renewed, or has opted out of the automatic re-registration process.
If a domain name expires and the option to renew the registration is not exercised, the domain name will eventually become available once more, although this may take a while because registrars usually allow a grace period (anything from two weeks to a year, depending on the registrar) to give the registrant a chance to renew the registration.
When the grace period expires, there is a further redemption period of up to 30 days, during which the registrant may still renew the domain on payment of a redemption fee plus the standard renewal fee for the domain. If the redemption period expires without renewal, the domain name may be put up for auction. If there are still no takers, the registry operator is notified and the domain name will (at the discretion of the registry operator) be available once more.
The registration details for a given domain are publicly available using the WHOIS protocol, which allows you to retrieve basic information about a domain, such as when it was registered, when it is due to expire, the name of the registrar, and the contact details of the registrant. You don’t even need to know how the protocol works in order to use it. There are numerous online tools that will retrieve the WHOIS data for you.
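For the curious, the protocol itself is very simple: the client opens a TCP connection to port 43 on a WHOIS server, sends the domain name followed by a carriage return and line feed, and reads the reply until the server closes the connection. The following is a minimal sketch only. The function names are our own, and which WHOIS server holds the data for a given domain is something the caller must supply.

```python
import socket

WHOIS_PORT = 43  # TCP port assigned to the WHOIS protocol


def build_query(domain):
    """A WHOIS query is the domain name followed by CR LF."""
    return domain.strip().encode("ascii") + b"\r\n"


def whois(domain, server, timeout=10):
    """Send a WHOIS query to server and return the response as text."""
    with socket.create_connection((server, WHOIS_PORT), timeout=timeout) as conn:
        conn.sendall(build_query(domain))
        chunks = []
        # The server signals the end of the response by closing the
        # connection, at which point recv() returns an empty byte string.
        while True:
            data = conn.recv(4096)
            if not data:
                break
            chunks.append(data)
    return b"".join(chunks).decode("utf-8", errors="replace")
```

A call such as whois("technologyuk.net", "whois.verisign-grs.com") should return the raw record, assuming that Verisign's WHOIS server for the net TLD is still reachable under that name.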
The data below was returned by the WHOIS facility at Whois.com. Note that the contact details for the domain (registrant contact, administrative contact and technical contact) are all pointing to the same email address (firstname.lastname@example.org) which effectively hides the real contact details. This is to safeguard the privacy of the registrant in accordance with the EU’s General Data Protection Regulation (GDPR) which came into force on 25th May 2018.
WHOIS data for technologyuk.net
The data above has been tidied up and presented in an easy-to-read format. The image below shows the raw data returned by the WHOIS protocol. You can see here the full extent of the data redaction due to the requirements of the GDPR.
Raw WHOIS data for technologyuk.net
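For readers who are curious about what the online tools are doing behind the scenes, the WHOIS protocol is very simple: the client opens a TCP connection to port 43 on a WHOIS server, sends the domain name followed by a carriage return and line feed, and reads the reply until the server closes the connection. The Python sketch below illustrates the idea. The function names are our own, the default server (whois.verisign-grs.com, the registry WHOIS server for com and net) is an assumption that only suits those two domains, and the parser only understands simple "Key: value" lines.

```python
import socket

def whois_query(domain, server="whois.verisign-grs.com", port=43, timeout=10.0):
    """Send a WHOIS query and return the raw text of the reply."""
    with socket.create_connection((server, port), timeout=timeout) as sock:
        sock.sendall(domain.encode("ascii") + b"\r\n")
        chunks = []
        while True:
            data = sock.recv(4096)
            if not data:              # the server closes the connection when done
                break
            chunks.append(data)
    return b"".join(chunks).decode("utf-8", errors="replace")

def parse_whois(text):
    """Collect 'Key: value' lines into a dict of lists (keys may repeat)."""
    fields = {}
    for line in text.splitlines():
        key, sep, value = line.strip().partition(":")
        if sep and key and value.strip():
            fields.setdefault(key.strip(), []).append(value.strip())
    return fields
```

Calling parse_whois(whois_query("technologyuk.net")) should return a dictionary from which items such as the registrar name or the name servers can be picked out, although the exact field names depend on the server that answers.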
The domain registrar may edit or delete the information held for a domain in the registry database. If a registrant decides to move their website to a different hosting service for some reason, a domain transfer process will be initiated, during which the domain will be transferred from the existing registrar to the new registrar using established domain transfer procedures. This process usually entails a change in the domain’s IP address and the names of the authoritative name servers for the domain.
The address resolution mechanism
The process of address resolution starts when an Internet client wants to contact an Internet server. For example, when you click on a link in a web page, your browser will attempt to retrieve the resource that the link points to (usually another web page) using the HTTP protocol. The HTTP request will contain (among other things) the website’s hostname, the filename of the document being requested, and the location of that file in the website’s directory structure.
Before the HTTP request can be sent, however, the client computer needs to know the IP address of the domain hosting the website. We will assume for the purposes of this discussion that there is no data in the browser cache, or in any cache residing at any other level of the DNS system, pertaining to the domain of interest, which means we won’t be able to skip any steps in the domain name resolution process.
Let’s examine what happens when you click on a hypertext link in a web page. The first thing that happens is that your computer will send a DNS query containing the domain name in the link’s URL to something called a recursive resolver. The resolver bit is fairly self-explanatory. The resolver’s job is to resolve a domain name to an IP address, and then send the IP address to your computer so that it can send the HTTP packet to the right place.
The recursive bit refers to the fact that the query sent by the client is a recursive query: the client expects the resolver to come back with a complete answer. Unless it has the required data in its cache, the recursive resolver cannot answer the client’s query itself, and must therefore query other DNS servers at a higher level in the DNS hierarchy on the client’s behalf.
The queries generated by the recursive resolver are iterative in nature, because it will need to query more than one higher-level DNS server in order to obtain the information it requires. The response to each iterative query will either be the required IP address or the address of another DNS server. In other words, each query brings the resolver one step closer to obtaining the IP address.
The number of DNS servers involved in resolving a DNS query will depend on what kind of network the client is attached to, and how that network is configured. For the purposes of this discussion, we are going to assume that the client computer is attached to a typical home network. The client computer will be connected to a home router of some kind, either wirelessly or via an Ethernet network cable.
Home computers are typically connected to a router which has an Internet connection to an ISP
Let’s assume that the user has clicked on a link on a web page that points to the technologyuk.net home page. Before the user’s browser can send an HTTP request packet to technologyuk.net, it needs the IP address. Let’s also assume that there are no cached records on the client computer, or on the router, or on the ISP’s DNS server (we’ll be looking at the caching of DNS records in more detail shortly).
The client computer will send a DNS query to the ISP’s DNS server via the router. The ISP’s DNS server (acting in the capacity of a recursive resolver) will send an iterative DNS query to one of the thirteen root name servers whose IP addresses are stored in its root hints file. The root server will respond with the DNS records of the authoritative domain name servers for the appropriate top-level domain - in this case, the net domain.
The records returned will include the names and IP addresses of the authoritative name servers for the net domain as follows:
a.gtld-servers.net A 192.5.6.30
a.gtld-servers.net AAAA 2001:503:a83e:0:0:0:2:30
b.gtld-servers.net A 192.33.14.30
b.gtld-servers.net AAAA 2001:503:231d:0:0:0:2:30
c.gtld-servers.net A 192.26.92.30
c.gtld-servers.net AAAA 2001:503:83eb:0:0:0:0:30
d.gtld-servers.net A 192.31.80.30
d.gtld-servers.net AAAA 2001:500:856e:0:0:0:0:30
e.gtld-servers.net A 192.12.94.30
e.gtld-servers.net AAAA 2001:502:1ca1:0:0:0:0:30
f.gtld-servers.net A 192.35.51.30
f.gtld-servers.net AAAA 2001:503:d414:0:0:0:0:30
g.gtld-servers.net A 192.42.93.30
g.gtld-servers.net AAAA 2001:503:eea3:0:0:0:0:30
h.gtld-servers.net A 192.54.112.30
h.gtld-servers.net AAAA 2001:502:8cc:0:0:0:0:30
i.gtld-servers.net A 192.43.172.30
i.gtld-servers.net AAAA 2001:503:39c1:0:0:0:0:30
j.gtld-servers.net A 192.48.79.30
j.gtld-servers.net AAAA 2001:502:7094:0:0:0:0:30
k.gtld-servers.net A 192.52.178.30
k.gtld-servers.net AAAA 2001:503:d2d:0:0:0:0:30
l.gtld-servers.net A 192.41.162.30
l.gtld-servers.net AAAA 2001:500:d937:0:0:0:0:30
m.gtld-servers.net A 192.55.83.30
m.gtld-servers.net AAAA 2001:501:b1f9:0:0:0:0:30
As with the DNS root servers, there appear to be only thirteen authoritative name servers for the top-level domain net. If this were true, it would be even more remarkable given that these same servers are also the authoritative name servers for the com domain. As you’ve probably guessed, however, each server name represents a cluster of name servers, each of which has its own copy of the TLD zone files for com and net.
The recursive resolver will now send an iterative query to one of the net domain’s authoritative name servers. The response will not contain the IP address itself, but will refer the resolver to the authoritative name servers for the technologyuk.net domain (the name servers supplied to the registry when the domain was registered). The resolver then makes one final iterative query, this time addressed to one of those name servers. All being well, the response will contain the IP address of the server hosting the technologyuk.net website, which the resolver will pass back to the client computer via the router. The client computer can now send its HTTP request packet to the web server using the IP address provided.
The entire process usually happens very fast - typically no more than a fraction of a second, although the time can vary depending on the prevailing network conditions. Subsequent requests for the same domain will usually receive an even faster response because the address data for the domain is typically stored in a local DNS cache (see DNS record caching below). A typical DNS lookup sequence is illustrated below.
A complete DNS lookup sequence
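The way a recursive resolver follows referrals can be sketched with a toy model in Python. Nothing here touches the network: the three "servers" are just dictionaries, and every server name, domain name and address is made up for illustration (the address comes from a range reserved for documentation). In this model the root refers the resolver to a TLD server, the TLD server refers it to a name server for the domain, and that name server supplies the answer.

```python
# Toy model of the DNS hierarchy (all names and addresses are hypothetical).
SERVERS = {
    "root": {"referrals": {"net": "tld-net"}},
    "tld-net": {"referrals": {"example.net": "ns-example"}},
    "ns-example": {"answers": {"www.example.net": "192.0.2.10"}},
}

def query(server, name):
    """One iterative query: ('answer', ip), ('referral', server) or ('nxdomain', None)."""
    zone = SERVERS[server]
    if name in zone.get("answers", {}):
        return "answer", zone["answers"][name]
    # Otherwise refer the resolver to the server for the longest matching suffix.
    labels = name.split(".")
    for i in range(len(labels)):
        suffix = ".".join(labels[i:])
        if suffix in zone.get("referrals", {}):
            return "referral", zone["referrals"][suffix]
    return "nxdomain", None

def resolve(name):
    """Follow referrals from the root until an answer (or a dead end) is reached."""
    server, path = "root", []
    while True:
        path.append(server)
        kind, value = query(server, name)
        if kind == "answer":
            return value, path
        if kind == "nxdomain":
            return None, path
        server = value                # each referral takes us one step closer

print(resolve("www.example.net"))
```

The second item returned by resolve() is the list of servers that were queried, which makes the step-by-step nature of the process easy to see.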
DNS record caching
Not every DNS query generated by a web client will result in the sequence of steps illustrated above. DNS servers, routers and even client computers often store the DNS records they receive in response to a DNS query in a DNS cache. The information is retained in the cache for a pre-determined period of time, enabling the device on which it is stored to respond more quickly to a repeat request for the IP address of a given domain.
Assuming the requested information is stored somewhere in the DNS lookup chain, then the response time will depend on the location of the DNS cache relative to the source of the DNS request. The closer the DNS cache is to the source of the request, the faster the response time will be. Another benefit of DNS caching is improved utilisation of internet bandwidth, since fewer DNS query messages will have to be generated in order to resolve a DNS query.
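The basic idea behind a DNS cache can be expressed in a few lines of Python. The class below is only a sketch under our own assumptions (the class and method names are invented, and real caches also limit their size and deal with many record types), but it shows the essential behaviour: an entry is stored together with an expiry time, and is discarded once that time has passed.

```python
import time

class DnsCache:
    """Hold (name, record type) -> value entries until their time to live runs out."""

    def __init__(self, clock=time.monotonic):
        self._clock = clock           # injectable so the example can be tested
        self._entries = {}

    def put(self, name, rtype, value, ttl):
        if ttl > 0:                   # a TTL of zero means "do not cache"
            self._entries[(name.lower(), rtype)] = (value, self._clock() + ttl)

    def get(self, name, rtype):
        key = (name.lower(), rtype)
        entry = self._entries.get(key)
        if entry is None:
            return None               # cache miss: a real lookup is needed
        value, expires_at = entry
        if self._clock() >= expires_at:
            del self._entries[key]    # expired: forget it and report a miss
            return None
        return value
```

A caller would try get() first and only send a DNS query if it returns None, storing the result with put() for next time.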
The DNS cache closest to the source of a DNS query is the web browser’s own DNS cache. Most modern browsers are configured to cache DNS records by default, although the amount of time for which a DNS record will be stored, known as the time to live (TTL), varies from one browser to another. In Mozilla Firefox, for example, the TTL is set to 60 seconds by default, although it is possible to change this using the Firefox configuration editor.
If the browser’s DNS cache does not hold a copy of the required DNS record, the next place to check is the operating system’s DNS Resolver cache. The contents of the Windows DNS Resolver cache can be displayed in the Command Prompt window using the following command:
ipconfig /displaydns
The information displayed will include recently retrieved DNS records and any IP address-to-hostname mappings specified in the Windows hosts file. This is a simple text file in which each address-to-hostname mapping appears on its own line, and consists of an IP address, followed by one or more whitespace characters, followed by one or more hostnames. If multiple hostnames are specified, they must be separated by whitespace characters. The same hostname can be mapped to both an IPv4 and an IPv6 IP address.
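The hosts file format is simple enough to parse in a few lines of Python. The function below is a minimal sketch of our own (it is not how Windows itself reads the file), and the sample entries are hypothetical. It also assumes that anything following a # character is a comment.

```python
def parse_hosts(text):
    """Return {hostname: [addresses]} from the text of a hosts file."""
    mappings = {}
    for line in text.splitlines():
        line = line.split("#", 1)[0].strip()    # drop comments and blank lines
        if not line:
            continue
        address, *hostnames = line.split()      # address first, then one or more names
        for hostname in hostnames:
            mappings.setdefault(hostname.lower(), []).append(address)
    return mappings

# Hypothetical entries, using names and addresses reserved for documentation.
SAMPLE = """\
# Sample entries
127.0.0.1   localhost
::1         localhost
192.0.2.10  www.example.test example.test   # two names, one address
"""

print(parse_hosts(SAMPLE))
```

Note how localhost ends up mapped to both an IPv4 and an IPv6 address, as the format allows.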
The Windows DNS Resolver cache can be cleared using the following command in the Command Prompt window (DNS entries in the Windows hosts file will not be affected):
ipconfig /flushdns
The default Windows hosts file is shown below. As you can see, there are no active address-to-hostname mappings. Both the IPv4 and the IPv6 mappings for localhost are present in the file, but the entries have been commented out. The rest of the file consists of comments that describe the purpose of the hosts file and specify the format to be used for address-to-hostname mappings.
The default Windows hosts file for Windows 10
In theory, you can add as many entries as you like to the hosts file to speed up access to frequently used websites. In practice, it is highly questionable whether you would see any significant improvement in performance. You do need to edit the hosts file, however, if you are hosting a local version of a website. We test all new content for technologyuk.net using a local desktop computer which has been configured as a web server using XAMPP (an Apache distribution containing MariaDB, PHP, and Perl). The hosts file is as shown above except for the last two lines, which are replaced with the following:
Assuming the Windows DNS Resolver cache for our local web server has been cleared with the ipconfig /flushdns command, running the ipconfig /displaydns command will produce the following:
The contents of the Windows DNS Resolver cache on our local web server
The operating system process that handles DNS requests is commonly referred to as a stub resolver. It checks the operating system’s DNS cache to see if it holds the required DNS record. If it does, the DNS query can be resolved immediately. Otherwise, it will be passed on to a DNS recursive resolver operated by a third-party service provider (typically a local Internet Service Provider).
The third-party recursive resolver will have its own DNS cache, and may be able to resolve the DNS query immediately without having to refer to DNS servers at a higher level in the DNS hierarchy. Even if it does not hold the DNS record for the domain of interest, it may hold the DNS records for the authoritative nameservers for that domain, and can query those servers directly without reference to either the root name servers or the TLD name servers.
If the recursive resolver doesn’t hold DNS records for the domain of interest or the authoritative nameservers for that domain, but does hold the DNS records for the appropriate top-level domain, it can still resolve the DNS query without reference to the root name servers. Failing that, it will go through the complete DNS lookup sequence described above, starting with an iterative DNS query to the root name servers (in most cases, this will not be necessary unless the recursive resolver’s DNS cache has been purged).
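The decision a recursive resolver makes about where to start can be sketched as follows. This is a simplified illustration under our own assumptions: the function name and the two dictionaries standing in for the resolver’s cache are hypothetical, and a real resolver would also take account of TTL values. The resolver uses a cached answer if it has one, otherwise it starts with the most specific zone for which it has cached name server details, and only falls back on the root name servers if nothing useful is cached.

```python
def best_starting_point(name, cached_answers, cached_nameservers):
    """Decide where resolution can start, given what is already cached.

    cached_answers:     {domain name: IP address}
    cached_nameservers: {zone name: name server to ask}
    """
    name = name.lower().rstrip(".")
    if name in cached_answers:
        return "answer", cached_answers[name]
    labels = name.split(".")
    # Try the most specific zone first: the name itself, then its parent,
    # and so on up to the top-level domain.
    for i in range(len(labels)):
        zone = ".".join(labels[i:])
        if zone in cached_nameservers:
            return "ask", cached_nameservers[zone]
    return "ask", "root"              # nothing useful cached: start at the root

cached_ns = {"net": "a.gtld-servers.net", "example.net": "ns1.example.net"}
print(best_starting_point("www.example.net", {}, cached_ns))
print(best_starting_point("www.other.net", {}, cached_ns))
print(best_starting_point("www.example.org", {}, cached_ns))
```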
The TTL of DNS records on DNS servers can be configured by the server administrator, and its duration may vary from zero to 2^31 - 1 seconds (approximately sixty-eight years). In practice, TTL values tend to range from 300 seconds (five minutes) up to 86,400 seconds (twenty-four hours), with a typical default value of 3,600 seconds (one hour). The trade-off is that, while longer TTL values will reduce the workload of DNS servers, shorter TTL values will ensure that changes to DNS records will propagate through the DNS system more quickly.
One final point to mention here is that, in addition to the DNS caches described above, many home networking routers have their own DNS cache. Clearing the operating system’s DNS cache will not affect the contents of the router’s DNS cache. If you need to clear the router’s DNS cache for any reason (to remove out of date or incorrect entries for example), you will probably need to reboot the router.
The DNS protocol
In order for the Domain Name System to function, DNS clients need to communicate with DNS servers, and DNS servers need to communicate with other DNS servers. Two kinds of message are exchanged during this communication - a DNS query message which is used to request information from a DNS server, and a DNS response message which is used to respond to a request. The DNS protocol is an application layer protocol that specifies both the data structure of DNS messages and the mechanism by which they are exchanged.
All DNS messages consist of a header followed by four sections: a question section, an answer section, an authority section, and an additional space section. The format of the header is shown below.
The DNS message header format
The DNS message header consists of the following fields:
- Identifier - a 16-bit number generated by the device that initiates the DNS query. If a DNS server responds to the query, it will use the same identifier. This enables the initiating device to determine which DNS query the response refers to.
- Flags and codes - this 16-bit field contains a number of sub-fields containing flags and control codes. The purpose of each sub-field will be described in more detail below.
- Question count - a 16-bit integer value that specifies the number of questions in the question section of the message.
- Answer record count - a 16-bit integer value that specifies the number of resource records in the answer section of the message.
- Name server (authority record) count - a 16-bit integer value that specifies the number of resource records in the authority section of the message.
- Additional record count - a 16-bit integer value that specifies the number of resource records in the additional space section of the message.
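Because the header consists of six 16-bit fields, it is always twelve bytes long, and can be built or taken apart with Python’s struct module. The sketch below uses function and key names of our own choosing, and treats the flags and codes field as a single 16-bit number. Multi-byte values in DNS messages are sent most significant byte first, which is what the "!" in the format string specifies.

```python
import struct

HEADER_FORMAT = "!6H"     # six unsigned 16-bit fields in network byte order

def pack_header(identifier, flags, qdcount, ancount=0, nscount=0, arcount=0):
    """Build the 12-byte DNS message header."""
    return struct.pack(HEADER_FORMAT, identifier, flags,
                       qdcount, ancount, nscount, arcount)

def unpack_header(message):
    """Split the first 12 bytes of a DNS message into named header fields."""
    fields = struct.unpack(HEADER_FORMAT, message[:12])
    names = ("identifier", "flags", "question_count", "answer_count",
             "authority_count", "additional_count")
    return dict(zip(names, fields))

# A query header: arbitrary identifier, "recursion desired" set, one question.
header = pack_header(0x1A2B, 0x0100, 1)
print(header.hex())
print(unpack_header(header))
```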
The flags and control codes take up 16 bits in total and occupy the third and fourth octets in the DNS message header, as shown below.
The DNS flags and control codes occupy the third and fourth octets in the DNS message header
The flags and control codes found in the DNS message header are described below.
- Query/response flag - a 1-bit field that indicates whether the message is a query (0) or a response (1).
- Operation code - a 4-bit number that specifies the type of query the message is carrying. Any response will carry the same operation code. The values this field can take and their meanings are as follows:
  - 0 QUERY - a standard query.
  - 1 IQUERY - an inverse query (no longer used).
  - 2 STATUS - a server status request.
  - 3 reserved (not used).
  - 4 NOTIFY - used by a master authoritative server to notify secondary authoritative servers of changes to the zone data and prompt them to request a zone transfer.
  - 5 UPDATE - a special message type that allows resource records to be added, deleted or updated selectively.
- Authoritative answer flag - a 1-bit field used in a DNS response message to indicate whether or not the responding server is authoritative for the zone in which the domain name specified in the question section resides (1 = authoritative, 0 = non-authoritative).
- Truncation flag - a 1-bit field which, if set, indicates that the message has been truncated because it is longer than the maximum permitted for the transport protocol used (DNS messages sent over UDP were originally limited to a maximum length of 512 bytes).
- Recursion desired - a 1-bit field which, if set, indicates that the server receiving the query should attempt to answer the query recursively. The field will have the same value in the response.
- Recursion available - a 1-bit field used in a DNS response message to indicate whether or not the responding server supports recursive queries.
- Zero - a 3-bit field reserved for future use (all bits are set to 0).
- Response code - a 4-bit number that is set to zero in a DNS query message. Its value in any DNS response message may be modified by the responding server to indicate that an error occurred, or that the query could not be processed for some reason. The values this field can take and their meanings are as follows:
  - 0 No Error - no error occurred.
  - 1 Format Error - the query message was incorrectly formatted.
  - 2 Server Failure - the server was unable to respond to the query due to a problem with the server itself.
  - 3 Name Error - this error code may be used by an authoritative name server for a zone to indicate that the name specified in the query does not exist in the zone.
  - 4 Not Implemented - the server does not support the type of query received.
  - 5 Refused - the server refuses to process the query as a matter of policy. For example, a primary authoritative name server for a zone will only honour a zone transfer request from a secondary authoritative name server for the same zone. Zone transfer requests from other name servers will be refused.
  - 6 YX Domain - a name exists that should not exist.
  - 7 YX RR Set - a resource record set exists that should not exist.
  - 8 NX RR Set - a resource record set that should exist does not exist.
  - 9 Not Auth - the server receiving the query is not authoritative for the zone specified.
  - 10 Not Zone - a name specified in the message is not in the zone specified in the message.
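The sub-fields listed above are packed into the 16-bit flags and codes field starting with the query/response flag in the most significant bit and ending with the response code in the four least significant bits. The following Python sketch (the function names and dictionary keys are our own) shows how the individual values can be extracted or combined using shifts and masks.

```python
def decode_flags(flags):
    """Split the 16-bit flags and codes field into its sub-fields."""
    return {
        "qr": (flags >> 15) & 0x1,       # 0 = query, 1 = response
        "opcode": (flags >> 11) & 0xF,   # operation code
        "aa": (flags >> 10) & 0x1,       # authoritative answer
        "tc": (flags >> 9) & 0x1,        # truncation
        "rd": (flags >> 8) & 0x1,        # recursion desired
        "ra": (flags >> 7) & 0x1,        # recursion available
        "z": (flags >> 4) & 0x7,         # reserved bits
        "rcode": flags & 0xF,            # response code
    }

def encode_flags(qr=0, opcode=0, aa=0, tc=0, rd=0, ra=0, rcode=0):
    """Combine sub-field values into a 16-bit number (reserved bits stay zero)."""
    return ((qr << 15) | (opcode << 11) | (aa << 10) | (tc << 9)
            | (rd << 8) | (ra << 7) | rcode)

# 0x8183: a response to a recursive query, reporting a Name Error (code 3).
print(decode_flags(0x8183))
```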
The question section of a DNS message contains one or more (usually just one) entries, each of which consists of the following fields:
- QNAME - a variable-length field containing a domain name represented as a sequence of labels, each preceded by an 8-bit integer value that specifies the length of the label. The last label in the domain name is followed by an 8-bit integer value set to zero to signify that there are no further labels.
- QTYPE - a 16-bit number that specifies the type of query. For example, a value of 1 indicates that the A (IPv4) record for the domain is required, a value of 2 indicates that the name server (NS) records for the domain are required, and a value of 28 indicates that the AAAA (IPv6) record for the domain is required.
- QCLASS - a 16-bit number that indicates the class of record being requested. The most commonly used value is 1 (IN or Internet).
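To make the format of a question entry more concrete, the Python sketch below encodes a domain name as a sequence of length-prefixed labels and appends the QTYPE and QCLASS values. The function names are our own. The check on label length reflects a limit from the DNS specification that is not discussed above: a label may be no longer than 63 bytes.

```python
import struct

def encode_qname(domain):
    """Encode a domain name as length-prefixed labels ending with a zero byte."""
    out = bytearray()
    for label in domain.rstrip(".").split("."):
        raw = label.encode("ascii")
        if not 1 <= len(raw) <= 63:
            raise ValueError("each label must be 1 to 63 bytes long")
        out.append(len(raw))          # 8-bit length of the label
        out += raw                    # the label itself
    out.append(0)                     # zero length: no further labels
    return bytes(out)

def encode_question(domain, qtype=1, qclass=1):
    """QNAME followed by 16-bit QTYPE and QCLASS (defaults: A record, class IN)."""
    return encode_qname(domain) + struct.pack("!HH", qtype, qclass)

print(encode_qname("technologyuk.net"))
print(encode_question("technologyuk.net", qtype=28).hex())
```

For technologyuk.net, the encoded name starts with a length byte of 12 followed by "technologyuk", then a length byte of 3 followed by "net", then the terminating zero.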
The answer section of a DNS message contains one or more resource records (RRs), each of which consists of the following six fields:
- NAME - this field references a QNAME in the question section and will either be a domain name in the format described for the QNAME field above, or an unsigned 16-bit value containing the offset of the relevant QNAME value from the start of the DNS message (if a pointer is used, the first two bits in the 16-bit value are set to 1 to indicate that this is the case, and the remaining bits specify the offset).
- TYPE - a 16-bit number that specifies the resource record’s type. For example, a value of 1 indicates an A (IPv4) record, and a value of 28 indicates an AAAA (IPv6) record.
- CLASS - a 16-bit number that indicates the class of the record. The most commonly used value is 1 (IN or Internet).
- TTL - an unsigned 32-bit integer value that specifies the time to live (TTL) for the record, i.e. the time (in seconds) for which the record may be cached. A value of 0 indicates that the record should not be cached.
- RDLENGTH - an unsigned 16-bit integer value that specifies the length (in bytes) of the RDATA record (see below).
- RDATA - the contents of this field will depend on the value found in the TYPE field. For example, a value of 1 indicates that an A (IPv4) record has been requested for the domain name referenced in NAME, so the RDATA field will contain an unsigned 32-bit integer value representing the IPv4 IP address associated with that domain name. If TYPE has a value of 2, the RDATA field will contain the name of a name server for the domain name, presented in the same format as for QNAME in the question section. If TYPE has a value of 28, the RDATA field will contain sixteen octets representing the IPv6 IP address associated with the domain name. And so on.
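The Python sketch below shows one way of reading a resource record from a raw DNS message, including the case where the NAME field is a pointer rather than a sequence of labels. It is an illustration only: the function names are our own, only record types 1, 2 and 28 are interpreted, and the sample message is hand-built using a made-up domain name and address.

```python
import ipaddress
import struct

def read_name(message, offset):
    """Decode a (possibly compressed) name; return (name, offset just after it)."""
    labels, next_offset, jumps = [], None, 0
    while True:
        length = message[offset]
        if length & 0xC0 == 0xC0:                  # top two bits set: a pointer
            if next_offset is None:
                next_offset = offset + 2           # reading resumes after the pointer
            offset = ((length & 0x3F) << 8) | message[offset + 1]
            jumps += 1
            if jumps > 32:                         # guard against pointer loops
                raise ValueError("too many compression pointers")
            continue
        offset += 1
        if length == 0:                            # zero-length label ends the name
            break
        labels.append(message[offset:offset + length].decode("ascii"))
        offset += length
    return ".".join(labels), (next_offset if next_offset is not None else offset)

def read_record(message, offset):
    """Decode one resource record; return (record, offset of the next record)."""
    name, offset = read_name(message, offset)
    rtype, rclass, ttl, rdlength = struct.unpack_from("!HHIH", message, offset)
    offset += 10                                   # TYPE, CLASS, TTL and RDLENGTH
    rdata = bytes(message[offset:offset + rdlength])
    if rtype in (1, 28):                           # A or AAAA: a packed IP address
        value = str(ipaddress.ip_address(rdata))
    elif rtype == 2:                               # NS: a name server's name
        value, _ = read_name(message, offset)
    else:
        value = rdata                              # anything else is left as bytes
    record = {"name": name, "type": rtype, "class": rclass, "ttl": ttl, "data": value}
    return record, offset + rdlength

# Hand-built message: an empty header, one question, and one answer whose
# NAME is a pointer to offset 12 (the QNAME in the question section).
message = bytes(12) + b"\x07example\x03net\x00" + struct.pack("!HH", 1, 1)
message += b"\xc0\x0c" + struct.pack("!HHIH", 1, 1, 3600, 4) + bytes([192, 0, 2, 10])
print(read_record(message, 29))
```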
The authority section and the additional space section of a DNS message are only relevant for DNS response messages. In DNS query messages, these sections will be empty, and the name server (authority record) count and additional record count values will both be zero. In a DNS response message, the records in the authority section will be name server (NS) records.
The additional space section can contain any kind of resource record in theory, although in practice it usually contains the A or AAAA records for the name servers defined in the authority section. Note that the name server records in the authority section and the A or AAAA records in the additional space section are in exactly the same format as records of the same type in the answer section.
As we have mentioned previously, the transport layer protocol of choice for the DNS protocol is the User Datagram Protocol (UDP), because it is a connectionless protocol that involves far less overhead than the Transmission Control Protocol (TCP). A DNS query consists of a single UDP request from the client device, which elicits a single UDP response from the server.
If the length of the response exceeds 512 bytes, and if both the client device and the DNS server support the Extension Mechanisms for DNS (EDNS), larger UDP packets may be used. If not, the client device will resend the query using TCP. TCP is used in any case for tasks that involve large amounts of data, such as zone transfers, and some resolver implementations now use TCP for all queries by default.
DNS requests are sent from an application port (i.e. a port number greater than 1023) and any response to a DNS request will be addressed to the same application port. DNS servers listen for incoming DNS requests on port 53, regardless of whether the request is sent using UDP or TCP.
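The exchange described above can be sketched in Python as follows. This is a simplified illustration under several assumptions of our own: the function names are invented, the query is assumed to have been built already, the server is addressed using IPv4, and there is no retry logic or EDNS support. One detail not covered above comes from the DNS specification: when TCP is used, each DNS message is preceded by a two-byte field giving its length.

```python
import socket
import struct

DNS_PORT = 53

def is_truncated(response):
    """True if the truncation flag (bit 9 of the flags field) is set."""
    flags = struct.unpack_from("!H", response, 2)[0]
    return bool(flags & 0x0200)

def frame_for_tcp(message):
    """Prefix a DNS message with its two-byte length for sending over TCP."""
    return struct.pack("!H", len(message)) + message

def read_exact(sock, count):
    """Read exactly 'count' bytes from a connected socket."""
    data = b""
    while len(data) < count:
        chunk = sock.recv(count - len(data))
        if not chunk:
            raise ConnectionError("connection closed early")
        data += chunk
    return data

def exchange(query, server, timeout=3.0):
    """Send a ready-made query over UDP, retrying over TCP if the reply is truncated."""
    with socket.socket(socket.AF_INET, socket.SOCK_DGRAM) as udp:
        udp.settimeout(timeout)
        udp.sendto(query, (server, DNS_PORT))     # source port is chosen by the OS
        response, _ = udp.recvfrom(4096)
    if not is_truncated(response):
        return response
    with socket.create_connection((server, DNS_PORT), timeout=timeout) as tcp:
        tcp.sendall(frame_for_tcp(query))
        (length,) = struct.unpack("!H", read_exact(tcp, 2))
        return read_exact(tcp, length)
```

A production-quality client would also check that the identifier in the response matches the one in the query before accepting the answer.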