How does Data travel across the Internet?

You may have heard of the popular system design question: "What happens when you type google.com into your web browser?" In this post, I will give a detailed and complete picture of how data travels across the internet in order to accomplish tasks such as browsing the web.

In order to answer this question, we have to first understand the OSI model.

When we use any application to send data across the internet, the data has to travel from the top of the OSI model to the bottom through a process called encapsulation.

Layers 5 to 7 (L5-L7) are collectively called the Application layer in the TCP/IP model, which we use as a more practical version of the OSI model.

For example, typing google.com into our web browser and pressing enter sends the following data through the HTTP protocol:

GET / HTTP/1.1
Host: www.google.com
User-Agent: Mozilla/5.0 (X11; Linux x86_64; rv:140.0) Gecko/20100101 Firefox/140.0
Accept: text/html,application/xhtml+xml,application/xml;q=0.9,*/*;q=0.8
Accept-Language: en-US,en;q=0.5
Accept-Encoding: gzip, deflate, br
Upgrade-Insecure-Requests: 1
Sec-Fetch-Dest: document
Sec-Fetch-Mode: navigate
Sec-Fetch-Site: none
Sec-Fetch-User: ?1
Priority: u=0, i
Te: trailers
Connection: keep-alive

This data will be encapsulated as it moves down to the L4 Transport layer. In our example, since HTTP/1.1 uses the TCP protocol, a TCP header containing the source and destination ports will be added to the data.

The source port (e.g. port 1234) is chosen at random by our host machine, whereas the destination port will be 443 since we are communicating with google.com's web server via HTTPS. The Protocol Data Unit (PDU) at this layer, once the data is encapsulated with the TCP header, is called a segment. If UDP is the transport protocol, it is called a datagram.
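To make the idea of "adding a TCP header" more concrete, here is a minimal sketch in Python that packs a bare 20-byte TCP header in front of some application data. The function name `build_tcp_header`, the flag and window values, and the shortened payload are my own assumptions for illustration. The checksum is left at 0, and in a real HTTPS connection the payload would be TLS-encrypted rather than plain HTTP text.

```python
import struct

def build_tcp_header(src_port: int, dst_port: int, seq: int = 0) -> bytes:
    """Pack a minimal 20-byte TCP header (no options, checksum not computed)."""
    data_offset = 5 << 4   # header length of 5 x 32-bit words, stored in the upper 4 bits
    flags = 0x18           # PSH + ACK, common for a segment that carries data
    return struct.pack(
        "!HHIIBBHHH",
        src_port,      # source port
        dst_port,      # destination port
        seq,           # sequence number
        0,             # acknowledgement number
        data_offset,
        flags,
        65535,         # window size (arbitrary value for this sketch)
        0,             # checksum (left at 0 in this sketch)
        0,             # urgent pointer
    )

# Shortened, hypothetical application data standing in for the full request.
http_payload = b"GET / HTTP/1.1\r\nHost: www.google.com\r\n\r\n"

# Segment = TCP header + application data.
segment = build_tcp_header(1234, 443) + http_payload

# The first four bytes of the segment are the source and destination ports.
src_port, dst_port = struct.unpack("!HH", segment[:4])
print(src_port, dst_port)  # 1234 443
```

The operating system's TCP stack normally does all of this for us; the sketch only shows where the two port numbers sit in the segment.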

Now, the segment will move to the L3 Network layer, where the IP header will be added to the segment to form a packet. The IP header contains the source and destination IP addresses. The source IP is the IP of the host machine. The destination IP is obtained from the browser's or operating system's cache. If the cache does not contain the IP address for google.com, DNS resolution will be performed.
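As a rough illustration of the L3 header, the sketch below packs a 20-byte IPv4 header with the source and destination addresses and fills in the header checksum. The helper names are my own, the TTL of 64 is an arbitrary choice, and 203.0.113.10 is a documentation-range placeholder address, not google.com's real IP.

```python
import socket
import struct

IPV4_HEADER_FORMAT = "!BBHHHBBH4s4s"  # 20 bytes, no options

def ipv4_checksum(header: bytes) -> int:
    """One's-complement sum of the header's 16-bit words."""
    total = sum(struct.unpack(f"!{len(header) // 2}H", header))
    while total >> 16:
        total = (total & 0xFFFF) + (total >> 16)
    return ~total & 0xFFFF

def build_ipv4_header(src_ip: str, dst_ip: str, payload_len: int) -> bytes:
    fields = [
        (4 << 4) | 5,              # version 4, header length of 5 x 32-bit words
        0,                         # DSCP/ECN
        20 + payload_len,          # total length of header plus payload
        0,                         # identification
        0,                         # flags and fragment offset
        64,                        # TTL (arbitrary for this sketch)
        socket.IPPROTO_TCP,        # the payload is a TCP segment
        0,                         # checksum placeholder
        socket.inet_aton(src_ip),  # source IP
        socket.inet_aton(dst_ip),  # destination IP
    ]
    # Pack once with a zero checksum, compute it, then pack again with it filled in.
    fields[7] = ipv4_checksum(struct.pack(IPV4_HEADER_FORMAT, *fields))
    return struct.pack(IPV4_HEADER_FORMAT, *fields)

# Hypothetical addresses: the host from the ARP example and a placeholder server.
packet_header = build_ipv4_header("10.1.1.22", "203.0.113.10", payload_len=60)
print(len(packet_header))  # 20
```

A header with a correct checksum sums to zero when the same checksum routine is run over it again, which is how a receiver can validate it.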

This packet is now handed to the L2 Data Link layer, where an L2 header containing the source and destination MAC addresses will be added to facilitate hop-to-hop delivery. The result is called a frame. However, our host does not know the MAC address that belongs to the destination IP unless it is already present in its Address Resolution Protocol (ARP) table/cache. ARP is used to link Layer 3's IP address to Layer 2's MAC address.

ARP Table/Cache: Mapping of IP Address to MAC address. It does not need to be populated ahead of time.

ARP

Our host machine uses ARP to resolve the target's MAC address. It does this by sending an ARP request. This request includes the sender's IP and MAC address and the target's IP address. It is a broadcast sent to everyone on the network. Only an L2 header, containing the source and destination MAC addresses, is added to the ARP request. The destination MAC address is the broadcast address ffff.ffff.ffff.
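For the curious, here is a small sketch of what such an ARP request could look like on the wire for IPv4 over Ethernet. The function name is my own, and the sender MAC a2:a2:a2:a2:a2:a2 is a hypothetical full-length address, since the MACs in this post are abbreviated.

```python
import socket
import struct

BROADCAST_MAC = bytes.fromhex("ffffffffffff")
ETHERTYPE_ARP = 0x0806

def build_arp_request(sender_mac: bytes, sender_ip: str, target_ip: str) -> bytes:
    # L2 header: destination MAC (broadcast), source MAC, EtherType.
    ethernet_header = struct.pack("!6s6sH", BROADCAST_MAC, sender_mac, ETHERTYPE_ARP)
    arp_payload = struct.pack(
        "!HHBBH6s4s6s4s",
        1,                            # hardware type: Ethernet
        0x0800,                       # protocol type: IPv4
        6,                            # MAC address length in bytes
        4,                            # IPv4 address length in bytes
        1,                            # opcode 1 = request
        sender_mac,                   # sender's MAC
        socket.inet_aton(sender_ip),  # sender's IP
        bytes(6),                     # target MAC is unknown, so all zeros
        socket.inet_aton(target_ip),  # the IP we are asking about
    )
    return ethernet_header + arp_payload

frame = build_arp_request(bytes.fromhex("a2a2a2a2a2a2"), "10.1.1.22", "10.1.1.1")
print(len(frame))  # 42
```

Notice that there is no IP header anywhere in the frame: the ARP payload sits directly behind the L2 header.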

The host machine's first step when sending data is to determine whether the target IP is on the local network or a foreign network:

Foreign - ARP for Default Gateway IP
Local - ARP for Target IP

Our host machine will first look at its own IP address and subnet mask. Then, it will compare them with the IP address of our target, which will show that the target is in another network. This prompts the host machine to send the frame to its default gateway, which is the router.
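The local-versus-foreign check can be sketched with Python's `ipaddress` module. The function name `arp_target` is my own, the /24 subnet mask is an assumption (the post does not state the mask), and 203.0.113.10 is a placeholder for a remote server.

```python
import ipaddress

def arp_target(host_interface: str, default_gateway: str, destination: str) -> str:
    """Return the IP address the host should ARP for."""
    local_network = ipaddress.ip_interface(host_interface).network
    if ipaddress.ip_address(destination) in local_network:
        return destination      # local: ARP for the target IP itself
    return default_gateway      # foreign: ARP for the default gateway IP

print(arp_target("10.1.1.22/24", "10.1.1.1", "10.1.1.50"))     # 10.1.1.50
print(arp_target("10.1.1.22/24", "10.1.1.1", "203.0.113.10"))  # 10.1.1.1
```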

With the incomplete frame on hold, our host machine will send an ARP request such as:

If anyone has the IP 10.1.1.1, send me your MAC.
My IP/MAC is 10.1.1.22 / a2a2

This request will reach the router. The router will generate an ARP response with the L2 header containing its source MAC and the destination MAC of our host machine:

I am 10.1.1.1, my MAC is e5e5

This ARP response will reach our host machine, where the host machine will store this mapping in its ARP cache.
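The caching behaviour can be modelled with a plain dictionary. This is only a toy sketch: `resolve_mac` and `fake_arp_request` are hypothetical names, and the fake function stands in for the real broadcast and response on the network.

```python
from typing import Callable, Dict, List

def resolve_mac(
    target_ip: str,
    arp_cache: Dict[str, str],
    send_arp_request: Callable[[str], str],
) -> str:
    """Return the MAC for target_ip, sending an ARP request only on a cache miss."""
    if target_ip not in arp_cache:
        arp_cache[target_ip] = send_arp_request(target_ip)
    return arp_cache[target_ip]

requests_sent: List[str] = []

def fake_arp_request(target_ip: str) -> str:
    # Stand-in for the broadcast; answers the way the router in our example would.
    requests_sent.append(target_ip)
    return {"10.1.1.1": "e5e5"}[target_ip]

arp_cache: Dict[str, str] = {}
first = resolve_mac("10.1.1.1", arp_cache, fake_arp_request)
second = resolve_mac("10.1.1.1", arp_cache, fake_arp_request)
print(first, second, len(requests_sent))  # e5e5 e5e5 1
```

Only the first lookup triggers a request; the second is answered from the cache. Real ARP caches also expire entries after a while, which this sketch leaves out.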

Our L2 frame will be completed with the destination MAC address being the router's MAC address. The L2 header can now handle hop-to-hop delivery which will travel from our host machine to the router.

In the L1 Physical layer, the frame is converted to 1s and 0s and put onto the wire, which carries it to our router. When the router receives this data, the data will go through decapsulation. Its L2 header will be discarded by the router since its purpose, hop-to-hop delivery from the host machine's NIC to the router's NIC, has already been accomplished. The router will then build a new L2 header with the MAC address of the next router or the destination machine, depending on what the next hop is.

A router is a device whose primary purpose is routing, which is the process of moving data between networks. It's a node that forwards IPv4/IPv6 packets not explicitly addressed to itself.

Routers have an IP address and a MAC address on each interface they are connected to. Each router stores a routing table, which maps each network to the interface the packet should be forwarded out of. If a router receives a packet whose destination IP matches no route in its routing table (and it has no default route), it will drop the packet.

Routing Table: Mapping of IP Network to Interface/IP address of the next Router in the path (next hop). It must be populated ahead of time. A routing table always follows the Longest Prefix Match rule. This means the machine will always prefer the most specific, narrowest rule first.
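Here is a minimal sketch of a Longest Prefix Match lookup using Python's `ipaddress` module. The routes and next-hop labels are hypothetical and only loosely modelled on the addresses used later in this post. Real routers use much faster data structures (such as tries) than this linear scan.

```python
import ipaddress
from typing import List, Optional, Tuple

# Hypothetical routing table: (network prefix, where to send the packet).
ROUTES: List[Tuple[str, str]] = [
    ("0.0.0.0/0", "next hop 10.0.55.1"),
    ("10.0.0.0/8", "next hop 10.0.55.2"),
    ("10.0.66.0/24", "directly connected"),
]

def lookup(destination: str, routes: List[Tuple[str, str]]) -> Optional[str]:
    """Return the entry with the longest matching prefix, or None if nothing matches."""
    dest_ip = ipaddress.ip_address(destination)
    matches = [
        (ipaddress.ip_network(prefix), target)
        for prefix, target in routes
        if dest_ip in ipaddress.ip_network(prefix)
    ]
    if not matches:
        return None  # no matching route: the packet would be dropped
    # The most specific route is the one with the largest prefix length.
    return max(matches, key=lambda match: match[0].prefixlen)[1]

print(lookup("10.0.66.7", ROUTES))   # directly connected
print(lookup("10.9.9.9", ROUTES))    # next hop 10.0.55.2
print(lookup("8.8.8.8", ROUTES))     # next hop 10.0.55.1
print(lookup("8.8.8.8", ROUTES[1:])) # None
```

The address 10.0.66.7 matches all three routes, but the /24 wins because it is the narrowest. The last call removes the default route to show the drop case.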

A host also has a routing table with the default route of 0.0.0.0/0 mapped to the default gateway IP. This route applies to every packet the host sends to a remote destination IP, so those packets are sent to the router.

For IP addresses in the local subnet, a route for the local subnet is automatically added to the routing table alongside the default route. Because this route is more specific than the default route, the machine skips the router entirely and communicates directly with the target using MAC addresses.

The routing table can be populated in three ways:

Directly Connected - Routes for the networks which are connected to the router
Static - Routes manually configured by an administrator
Dynamic - Routes learned automatically from other routers. Dynamic routing protocols include OSPF, BGP and RIP

Let's take a look at the diagram above. There are two key points:

Any device with an IP address has an ARP table.
A routing table has to be pre-populated while an ARP table can be empty at the start.

Continuing from our example and travelling down the OSI model when sending data, Host A will send data to Host C which is google.com's web server:

1. Host A from the diagram sends the HTTP request, which is the data from L5-7.
2. The data is encapsulated with the L4 header, which has a randomly generated source port and a destination port of 443.
3. After we obtain the IP address of google.com's web server, Host A further encapsulates the data + L4 header with the L3 header containing the source and destination IP. Host A will look at its own IP address and subnet mask, and determine that the destination IP address is in a foreign network.
4. As the destination IP is in a foreign network, Host A will add the L2 header with the destination MAC address of our default gateway. However, if this is the first packet that Host A has sent, it does not know Router 1's MAC address and cannot construct the L2 header.
5. Therefore, Host A broadcasts an ARP request for 10.0.66.2. Router 1 receives the ARP request, populates its ARP table with Host A's IP to MAC address mapping and sends the ARP response with its MAC address, which Host A needs.
6. Host A receives this ARP response, populates its ARP table and is now able to construct the L2 header and send the packet to the router.
7. When Router 1 receives this packet, it will travel up the OSI model. Router 1 discards the L2 header as its purpose was to complete hop-to-hop delivery from Host A's NIC to Router 1's NIC.
8. Router 1 looks up the destination IP from the packet's L3 header in its own routing table to determine where to send the packet next. It will send the packet to 10.0.55.1, which is Router 2 as shown in its routing table.
9. Router 1 will construct the L2 header. It will send an ARP broadcast request, which will be received by Router 2. Router 2 will send the ARP response. This step is similar to steps 5 and 6. Once Router 1 receives the response, it will construct the L2 header and send the packet from its NIC to Router 2's NIC, hop-to-hop.
10. Similar to step 7, Router 2 will discard the L2 header and look up the destination IP address in its own routing table to determine where to send it next. Since the routing table shows that the destination is on a directly connected network, Router 2 knows the next hop is the packet's final hop.
11. Router 2 will likewise perform ARP to get Host C's MAC address, construct the L2 header and get the packet to Host C.
12. When Host C receives the packet, it will discard the L2 header since the packet has been delivered from Router 2's NIC to its own NIC. Then the L3 header will be discarded since end-to-end delivery has been achieved from Host A's IP to Host C's IP. Host C will then process the data and send a response.
13. Host C will send data back to Host A, except that the process will be quicker since the ARP cache has already been populated on each device along the path.

Steps 7-10 would repeat for any number of routers in the path between Router 1 and Router 2.

Each router:

Looks up destination IP in routing table to determine next-hop's IP
Adds an L2 header with the next router's MAC as the destination MAC (performing ARP if necessary)
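The per-router behaviour above can be summed up in a toy simulation: at every hop the L2 addresses are replaced, while the L3 addresses stay the same from end to end. All names, MAC labels and IP addresses below are hypothetical placeholders rather than values from the diagram, and the sketch skips the routing table lookup, ARP, and details such as the TTL decrement.

```python
from dataclasses import dataclass, replace
from typing import List, Tuple

@dataclass(frozen=True)
class Frame:
    src_mac: str  # L2: changes at every hop
    dst_mac: str  # L2: changes at every hop
    src_ip: str   # L3: stays the same end to end
    dst_ip: str   # L3: stays the same end to end

def forward(frame: Frame, router_out_mac: str, next_hop_mac: str) -> Frame:
    """Discard the old L2 header and add a new one; the L3 addresses are untouched."""
    return replace(frame, src_mac=router_out_mac, dst_mac=next_hop_mac)

# (outgoing interface MAC of the router, MAC of the next hop) for each router.
path: List[Tuple[str, str]] = [
    ("router1-out", "router2-in"),
    ("router2-out", "host-c"),
]

frame = Frame("host-a", "router1-in", "10.0.66.10", "10.0.44.10")
hops = [frame]
for router_out_mac, next_hop_mac in path:
    frame = forward(frame, router_out_mac, next_hop_mac)
    hops.append(frame)

for hop in hops:
    print(f"{hop.src_mac} -> {hop.dst_mac} | {hop.src_ip} -> {hop.dst_ip}")
```

Each printed line shows a different pair of MAC addresses but the same pair of IP addresses, which is the difference between hop-to-hop and end-to-end delivery.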

To be continued as my schedule is packed at the moment...