Transport Layer Protocols
Transmission Control Protocol (TCP)
The Transmission Control Protocol (TCP) is a widely used connection-oriented transport layer protocol that provides reliable transfer of data between two end points, and includes mechanisms to handle flow-control, segmentation, error recovery, and multiplexing.
TCP provides end-to-end communication
Application data is encapsulated by an application layer protocol, and the resulting protocol data unit (PDU) is passed to TCP, which adds its own header to create a TCP protocol data unit (usually referred to as a segment). The TCP header includes a checksum and a sequence number; both are present in every segment, whether the message fits in one segment or spans several. A virtual circuit is established between the two end points. The device initiating the transaction transmits a connection request that includes a port number identifying the local process. The remote device sends a reply containing a port number that identifies the remote process. Each port number, combined with the IP address of its host, forms a socket, and these two sockets define the connection between the two devices until the virtual circuit is terminated.
Once the connection has been established, TCP passes each segment to the Internet Protocol (IP), which packages them as datagrams for onward transmission. The source and destination address, protocol identifier, and segment length are also passed to the Internet Protocol as parameters. The same values make up a 96-bit pseudoheader (for IPv4), which is used only in the checksum calculation and is not transmitted as part of the segment itself. The checksum is calculated using the contents of the segment and the pseudoheader together. At the destination device, the TCP software reassembles the original message using the sequence numbers contained in each segment, and missing or damaged segments are retransmitted by the sender. Any correctly received segments are acknowledged, and each acknowledgement carries the sequence number of the next byte the receiver expects, which acknowledges all data received in order up to that point.
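The checksum calculation described above can be sketched in a few lines of Python. This is a minimal illustration for IPv4, not a reference implementation: the function names are hypothetical, the addresses come from a range reserved for documentation, and the segment is assumed to have its checksum field set to zero before the calculation.

```python
import socket
import struct


def internet_checksum(data: bytes) -> int:
    """16-bit one's complement of the one's complement sum of 16-bit words."""
    if len(data) % 2:
        data += b"\x00"  # pad to a whole number of 16-bit words
    total = 0
    for (word,) in struct.iter_unpack("!H", data):
        total += word
    while total >> 16:
        total = (total & 0xFFFF) + (total >> 16)  # fold carries back in
    return ~total & 0xFFFF


def ipv4_pseudoheader(src_ip: str, dst_ip: str, protocol: int, length: int) -> bytes:
    """96 bits: source address, destination address, zero byte, protocol, length."""
    return (socket.inet_aton(src_ip) + socket.inet_aton(dst_ip)
            + struct.pack("!BBH", 0, protocol, length))


def tcp_checksum(src_ip: str, dst_ip: str, segment: bytes) -> int:
    """Checksum over pseudoheader + segment (protocol number 6 is TCP)."""
    pseudo = ipv4_pseudoheader(src_ip, dst_ip, 6, len(segment))
    return internet_checksum(pseudo + segment)
```

A receiver can run the same calculation over a segment whose checksum field has been filled in. If nothing was damaged in transit, the result is zero.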
TCP uses a simple sliding window mechanism to handle flow control, which allows multiple segments to be transmitted before an acknowledgement is required. Each end advertises an initial window size when the connection is opened, and the window may be varied throughout the transaction to provide a flexible flow control mechanism. A window size of zero indicates that no further data can be accepted at the present time. TCP uses timers to ensure the sending device does not wait indefinitely for an acknowledgement. If a timer expires before an acknowledgement is received, the segment is retransmitted. Copies of unacknowledged segments are held in a buffer until an acknowledgement is received. Any duplicate segments arriving at the receiving device are simply discarded.
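The interaction between the window, the retransmission buffer and duplicate handling can be illustrated with a small simulation. The sketch below is a deliberately simplified model, not real TCP: it counts whole segments rather than bytes, treats each pass of the loop as one timer period, and resends everything in the window after a loss. The function name and its parameters are hypothetical.

```python
def send_with_window(segments, window, drop_first_attempt=frozenset()):
    """Simulate a sender allowed `window` unacknowledged segments in flight.

    Segment indexes listed in `drop_first_attempt` are lost once, which
    stands in for a timer expiring before an acknowledgement arrives.
    Returns (delivered, log, duplicates), where log records every send.
    """
    if window < 1:
        raise ValueError("window must be at least 1 segment")
    base = 0                        # oldest unacknowledged segment
    attempts = [0] * len(segments)
    received = {}                   # receiver's buffer, keyed by index
    log = []
    duplicates = 0
    while base < len(segments):
        # Send (or resend) every segment the window currently allows.
        for i in range(base, min(base + window, len(segments))):
            attempts[i] += 1
            log.append(i)
            if i in drop_first_attempt and attempts[i] == 1:
                continue            # lost in transit, resent on the next pass
            if i in received:
                duplicates += 1     # receiver discards the duplicate
            else:
                received[i] = segments[i]
        # Cumulative acknowledgement: slide past all in-order segments.
        while base in received:
            base += 1
    delivered = [received[i] for i in range(len(segments))]
    return delivered, log, duplicates
```

Calling `send_with_window(list("abcde"), 3, {1})` delivers all five segments in order, with segment 1 sent twice and one duplicate of segment 2 discarded by the receiver.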
The diagram below shows the format of a TCP protocol data unit.
TCP protocol data unit format
The most important fields are described below.
- Source port - the port number used by the source process
- Destination port - the port number used by the destination process
- Sequence number - the position of the first data byte of the current segment within the overall byte stream
- Acknowledgment number - the next sequence number expected
- Window - specifies the size of the buffer available for incoming data
- Checksum - used to check for errors
As well as the window size, each device can announce the maximum size of an individual segment that it is willing to receive (the maximum segment size, or MSS), using an option carried in the header of its connection request.
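As an illustration of how the fields listed above are laid out, the following sketch unpacks the fixed 20-byte part of a TCP header and looks for a maximum segment size option. The function names and dictionary keys are hypothetical, and only the options needed to find the MSS are handled.

```python
import struct


def find_mss(options: bytes):
    """Walk the TCP options and return the MSS value, or None if absent."""
    i = 0
    while i < len(options):
        kind = options[i]
        if kind == 0:               # end of option list
            break
        if kind == 1:               # no-operation, used as padding
            i += 1
            continue
        if i + 1 >= len(options):
            break                   # truncated option
        length = options[i + 1]
        if length < 2:
            break                   # malformed option
        if kind == 2 and length == 4 and i + 4 <= len(options):
            return struct.unpack("!H", options[i + 2:i + 4])[0]
        i += length
    return None


def parse_tcp_header(segment: bytes) -> dict:
    """Decode the fixed header fields of a TCP segment."""
    (src_port, dst_port, seq, ack, offset_flags,
     window, checksum, _urgent) = struct.unpack("!HHIIHHHH", segment[:20])
    header_length = (offset_flags >> 12) * 4   # data offset is in 32-bit words
    return {
        "src_port": src_port,
        "dst_port": dst_port,
        "seq": seq,
        "ack": ack,
        "header_length": header_length,
        "fin": bool(offset_flags & 0x01),
        "syn": bool(offset_flags & 0x02),
        "ack_flag": bool(offset_flags & 0x10),
        "window": window,
        "checksum": checksum,
        "mss": find_mss(segment[20:header_length]),
        "payload": segment[header_length:],
    }
```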
TCP uses a three-way handshake to establish a connection. In a typical transaction, a client initiates a connection by sending a PDU with the initial sequence number n, and with the SYN flag set to indicate that the PDU is a connection request. The server stores the client's sequence number (n), and acknowledges the connection request with a PDU that contains the server's own initial sequence number (m), an acknowledgement number set at n+1, and with both the SYN and ACK flags set. This synchronisation process is completed when the client sends a further PDU in which the acknowledgement number is set at m+1, its own sequence number is set to n+1, and with the ACK flag set. Data transfer can then begin. The diagram below shows the synchronisation sequence that occurs when a TCP connection is established.
TCP connection setup
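The sequence and acknowledgement numbers exchanged during the handshake can be written out as a short sketch. The function name and the dictionary layout are hypothetical. Note that the server's reply carries both the SYN and ACK flags, and that sequence numbers are 32-bit values that wrap around.

```python
SEQ_MODULUS = 2 ** 32   # sequence numbers are 32-bit and wrap around


def three_way_handshake(n: int, m: int) -> list:
    """Return the three PDUs exchanged for client ISN n and server ISN m."""
    syn = {"sender": "client", "flags": {"SYN"},
           "seq": n, "ack": None}
    syn_ack = {"sender": "server", "flags": {"SYN", "ACK"},
               "seq": m, "ack": (n + 1) % SEQ_MODULUS}
    ack = {"sender": "client", "flags": {"ACK"},
           "seq": (n + 1) % SEQ_MODULUS, "ack": (m + 1) % SEQ_MODULUS}
    return [syn, syn_ack, ack]
```

With n = 100 and m = 300, for example, the server acknowledges 101 and the client's final PDU has sequence number 101 and acknowledgement number 301.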
Each block of data received by TCP from the upper layer protocols is encapsulated within a PDU and given a sequence number. The destination computer sends an acknowledgement containing the next sequence number it expects, thus acknowledging receipt of the previous block of data. The following diagram shows a two-way transfer of data.
Two-way transfer of data in TCP
The TCP transport service offers the following features:
- Full duplex communication - both ends of a connection can transmit simultaneously
- Timing - timers are used to detect lost segments and trigger retransmission, so that the sender does not wait indefinitely for an acknowledgement
- Sequencing - message blocks are given sequence numbers to enable messages to be reassembled in the correct order before being passed to the application layer protocols on the destination computer
- Flow control - the flow of data is regulated using buffers and windows
- Error handling - checksums are provided to enable transmission errors to be detected and dealt with
TCP closes a connection when asked to do so by an application layer protocol. In the diagram below, a process on Machine A asks TCP to close its connection with Machine B. A message is sent to Machine B with the FIN flag set. Machine B sends an acknowledgement, and passes the close request to the appropriate application layer protocol. When the application layer acknowledges the request (or when the request has timed out), Machine B sends a second message to Machine A with the FIN flag set. Machine A acknowledges the closure, and the connection is terminated. Note that connections can also be terminated without warning if, for example, one of the machines suffers a power failure. In this case, the other machine may only realise that the connection has been terminated when it fails to receive any further acknowledgements.
Closing a TCP connection
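The four messages of an orderly close can be listed in the same way as the handshake. In this sketch the function name and tuple layout are hypothetical, and the flags are named as in the description above (in practice a FIN segment normally has the ACK flag set as well). A FIN uses up one sequence number, just as a SYN does.

```python
def close_sequence(a_seq: int, b_seq: int) -> list:
    """Segments exchanged when Machine A closes its connection with Machine B.

    a_seq and b_seq are the next sequence numbers each machine would send.
    Each entry is (direction, flag, sequence number, acknowledgement number).
    """
    return [
        ("A->B", "FIN", a_seq, b_seq),          # A asks to close
        ("B->A", "ACK", b_seq, a_seq + 1),      # B acknowledges A's FIN
        ("B->A", "FIN", b_seq, a_seq + 1),      # B's application has closed
        ("A->B", "ACK", a_seq + 1, b_seq + 1),  # A acknowledges B's FIN
    ]
```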
User Datagram Protocol (UDP)
The User Datagram Protocol (UDP) is an unreliable, connectionless protocol that works at the transport layer of TCP/IP, and provides a datagram delivery service to applications with a minimum of overhead. UDP provides a very simple interface between the application layer and the internetwork layer. If only a small amount of data is to be transmitted, the overhead of setting up a connection-oriented, reliable delivery service can outweigh the cost of occasionally having to retransmit the data. Similarly, for a real-time application (e.g. streaming media), dropping the odd packet is preferable to having to wait for lost or damaged packets to be retransmitted. In these circumstances, UDP is a more efficient host-to-host transport layer protocol.
UDP does not provide any guarantee of delivery, nor does it provide error recovery or flow control. No connection is established, and hence no handshaking procedure is required. Packets may arrive out of order, not arrive at all, or be duplicated. Applications that use UDP are assumed either not to need flow control or error handling, or to have their own mechanisms for ensuring reliable data delivery that do not depend on the transport layer protocol. UDP is the transport protocol for a variety of application-layer protocols, including Simple Network Management Protocol (SNMP), Dynamic Host Configuration Protocol (DHCP), Routing Information Protocol (RIP), and the Domain Name System (DNS), as well as streaming media applications such as Voice over IP (VoIP).
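The absence of a connection is easy to see in code. The following sketch uses Python's standard socket module to send a single datagram over the loopback interface. The function name is hypothetical, and the timeout is there because, in general, nothing guarantees that a datagram will arrive.

```python
import socket


def udp_send_once(message: bytes) -> bytes:
    """Send one datagram to a local receiver and return what it received."""
    with socket.socket(socket.AF_INET, socket.SOCK_DGRAM) as receiver, \
         socket.socket(socket.AF_INET, socket.SOCK_DGRAM) as sender:
        receiver.bind(("127.0.0.1", 0))   # port 0: let the OS pick a free port
        receiver.settimeout(2.0)          # do not wait forever for lost data
        # No connect() and no handshake: the datagram is simply addressed
        # to the receiver's socket and sent.
        sender.sendto(message, receiver.getsockname())
        data, _sender_address = receiver.recvfrom(65535)
        return data
```

An application that needs reliability over UDP would have to add its own acknowledgements and retransmissions on top of calls like these.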
Like TCP, UDP uses sockets to identify the end points in a data transmission. The UDP packet header contains four fields, which are described below. For the purpose of calculating the checksum only, the data is treated as if padded with a zero byte where necessary to make it a multiple of 16 bits; this padding is not transmitted.
The UDP packet format
The UDP packet header fields:
- Source port - an optional 16-bit port number - if a port number is not specified because a reply is not required, this field is set to zero.
- Destination port - a 16-bit destination port number.
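- Length - a 16-bit field that specifies the length of the datagram in bytes, including header and data. The minimum length is 8 bytes (i.e. the length of the header itself). The field can hold values up to 65,535, but over IPv4 the maximum length is 65,515 bytes, because the datagram must fit inside an IP packet that also carries an IP header of at least 20 bytes.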
- Checksum - the 16-bit one's complement of the one's complement sum of a pseudoheader similar to that of TCP, the UDP header, and the data, padded with a zero byte if necessary to make a multiple of 16 bits. Over IPv4 this field is optional, and if not used is set to 0.
When used over IP version 4 (IPv4), the pseudoheader (shown below in pink) includes the 32-bit source and destination addresses from the IPv4 header. The Protocol field contains the protocol number for UDP (17), while the UDP length field contains the length of the UDP header and data in bytes.
The UDP packet with pseudoheader (IPv4)
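Putting the header fields and the IPv4 pseudoheader together, a UDP datagram can be built and checked as in the sketch below. The function names are hypothetical and the addresses are taken from a range reserved for documentation. A computed checksum of zero is sent as 0xFFFF, because a zero in the checksum field means that no checksum was used.

```python
import socket
import struct


def ones_complement_sum16(data: bytes) -> int:
    """One's complement sum of the data taken as 16-bit big-endian words."""
    if len(data) % 2:
        data += b"\x00"            # zero padding, used for the sum only
    total = sum(word for (word,) in struct.iter_unpack("!H", data))
    while total >> 16:
        total = (total & 0xFFFF) + (total >> 16)
    return total


def udp_pseudoheader_ipv4(src_ip: str, dst_ip: str, udp_length: int) -> bytes:
    """Source address, destination address, zero, protocol 17, UDP length."""
    return (socket.inet_aton(src_ip) + socket.inet_aton(dst_ip)
            + struct.pack("!BBH", 0, 17, udp_length))


def build_udp_datagram(src_ip, dst_ip, src_port, dst_port, payload: bytes) -> bytes:
    """Return the 8-byte UDP header followed by the payload."""
    length = 8 + len(payload)
    header = struct.pack("!HHHH", src_port, dst_port, length, 0)
    pseudo = udp_pseudoheader_ipv4(src_ip, dst_ip, length)
    checksum = ~ones_complement_sum16(pseudo + header + payload) & 0xFFFF
    if checksum == 0:
        checksum = 0xFFFF          # zero is reserved for "no checksum"
    return struct.pack("!HHHH", src_port, dst_port, length, checksum) + payload


def verify_udp_datagram(src_ip, dst_ip, datagram: bytes) -> bool:
    """True if the checksum matches; the sum must come out as all ones."""
    pseudo = udp_pseudoheader_ipv4(src_ip, dst_ip, len(datagram))
    return ones_complement_sum16(pseudo + datagram) == 0xFFFF
```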
When used over IP version 6 (IPv6), the pseudoheader (shown below in pink) includes the 128-bit source and destination addresses from the IPv6 header (note that in an IPv6 packet, a routing header may be included, in which case the Destination address field will contain the address in the last element of the routing header at the originating node, and the destination address in the IPv6 header at the receiving node). The Next Header field in the IPv6 pseudoheader replaces the Protocol field in the IPv4 pseudoheader, and as with IPv4 contains the protocol number for UDP (17). The UDP length field contains the length of the UDP header and data in bytes. The UDP checksum is no longer optional when UDP is used over IPv6.
The UDP packet with pseudoheader (IPv6)
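The IPv6 version of the calculation differs only in the pseudoheader. The sketch below assumes the layout used for IPv6 upper-layer checksums: two 128-bit addresses, a 32-bit length, three zero bytes and the Next Header value, 40 bytes in all. The function names are hypothetical, the addresses come from the IPv6 documentation prefix, and the datagram is assumed to have its checksum field set to zero on input.

```python
import ipaddress
import struct


def ones_complement_sum16(data: bytes) -> int:
    """One's complement sum of the data taken as 16-bit big-endian words."""
    if len(data) % 2:
        data += b"\x00"
    total = sum(word for (word,) in struct.iter_unpack("!H", data))
    while total >> 16:
        total = (total & 0xFFFF) + (total >> 16)
    return total


def ipv6_pseudoheader(src: str, dst: str, upper_length: int,
                      next_header: int = 17) -> bytes:
    """Addresses, 32-bit length, three zero bytes, Next Header (17 = UDP)."""
    return (ipaddress.IPv6Address(src).packed
            + ipaddress.IPv6Address(dst).packed
            + struct.pack("!I3xB", upper_length, next_header))


def udp_checksum_ipv6(src: str, dst: str, datagram: bytes) -> int:
    """Checksum for a UDP datagram whose checksum field is currently zero."""
    pseudo = ipv6_pseudoheader(src, dst, len(datagram))
    checksum = ~ones_complement_sum16(pseudo + datagram) & 0xFFFF
    return checksum or 0xFFFF      # the field is mandatory, so never send zero
```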