I was watching this amazing lightning talk1 by Karla Burnett and wanted to understand how
traceroute
works in Unix. Traceroute is a tool that shows the route of a network packet
from your computer to another computer on the internet. It also tells you how long it takes
for the packet to reach each stop along the way.
It’s useful when you want to know more about how your computer connects to other computers on the internet. For example, if you want to visit a website, your computer sends a request to the website’s server, which is another computer that hosts the website. But the request doesn’t go directly from your computer to the server. It has to pass through several other devices, such as routers, that help direct the traffic on the internet. These devices are called hops. Traceroute shows you the list of hops that your request goes through, and how long it takes for each hop to respond. This can help you troubleshoot network problems, such as slow connections or unreachable websites.
This is how you usually use traceroute
:
traceroute example.com
This returns:
traceroute to example.com (93.184.216.34), 64 hops max, 52 byte packets
1 192.168.1.1 (192.168.1.1) 2.386 ms 1.976 ms 1.703 ms
2 142-254-158-201.inf.spectrum.com (142.254.158.201) 9.970 ms 9.463 ms 9.867 ms
3 lag-63.uparohgd02h.netops.charter.com (65.25.145.149) 52.340 ms 26.224 ms 18.094 ms
4 lag-31.clmcohib01r.netops.charter.com (24.33.161.216) 24.277 ms 10.391 ms 16.529 ms
5 lag-27.rcr01clevohek.netops.charter.com (65.29.1.38) 16.485 ms 16.258 ms 16.999 ms
6 lag-416.vinnva0510w-bcr00.netops.charter.com (66.109.6.164) 23.478 ms 24.685 ms
lag-415.vinnva0510w-bcr00.netops.charter.com (66.109.6.12) 25.211 ms
7 lag-11.asbnva1611w-bcr00.netops.charter.com (66.109.6.30) 24.541 ms
lag-21.asbnva1611w-bcr00.netops.charter.com (66.109.3.24) 24.574 ms
lag-31.asbnva1611w-bcr00.netops.charter.com (107.14.18.82) 24.253 ms
8 xe-7-3-1.cr0.chi10.tbone.rr.com (209.18.36.1) 24.283 ms 26.479 ms 45.171 ms
9 ae-65.core1.dcb.edgecastcdn.net (152.195.64.129) 24.550 ms 24.753 ms 25.007 ms
10 93.184.216.34 (93.184.216.34) 23.998 ms 24.086 ms 24.180 ms
11 93.184.216.34 (93.184.216.34) 23.627 ms 24.238 ms 24.271 ms
This traceroute output draws the path of a network packet from my computer to
example.com
’s server, which has an IP address of 93.184.216.34
. It shows that the packet
goes through 11 hops before reaching the destination. The first hop is my router
(192.168.1.1
), the second hop is my ISP’s router (142.254.158.201
), and so on. The last
column shows the time it takes for each hop to respond in milliseconds (ms). The lower the
time, the faster the connection.
Some hops have multiple lines with different names or IP addresses. This means that there
are multiple routers at that hop that can handle the traffic, and traceroute
randomly
picks one of them for each packet. For example, hop 7 has three routers with names starting
with lag-11
, lag-21
, and lag-31
. These are probably load-balancing routers that
distribute the traffic among them.
The last hop (93.184.216.34
) appears twice in the output. This is because traceroute sends
three packets to each hop by default, and sometimes the last hop responds to all three
packets instead of discarding them. This is not a problem and does not affect the accuracy
of the traceroute.
This is all good and dandy but I wanted to understand how traceroute
can find out what
route a packet takes and how long it takes between each hop. So I started reading blogs like
this2 one that does an awesome job at explaining what’s going on behind the scene. The
gist of it goes as follows.
How traceroute works
Traceroute works by sending a series of ICMP (Internet Control Message Protocol) echo request packets, which are also known as pings, to the target IP address or URL that you want to reach. Each packet has an associated time-to-live (TTL) value, which is a number that indicates how many hops (or intermediate devices) the packet can pass through before it expires and is discarded by a router. Yeah, strangely, TTL doesn’t denote any time duration here.
Traceroute starts by sending a packet with a low TTL value, usually 1. This means that the packet can only make one hop before it expires. When a router receives this packet, it decreases its TTL value by 1 and checks if it is 0. If it is 0, the router discards the packet and sends back an ICMP time exceeded message to the source of the packet. This message contains the IP address of the router that discarded the packet. This is how the sender knows the IP address of the first hop (router, computer, or whatsoever).
Traceroute records the IP address and round-trip time (RTT) of each ICMP time exceeded message it receives. The RTT is the time it takes for a packet to travel from the source to the destination and back. It reflects the latency (or delay) between each hop.
Traceroute then increases the TTL value by 1 and sends another packet. This packet can make
2 hops before it expires. The process repeats until traceroute reaches the destination or a
maximum TTL value, usually 30. When the returned IP is the same as the initial destination
IP, traceroute
knows that the packet has completed the whole journey. By doing this,
traceroute can trace the route that your packets take to reach the target IP address or URL
and measure the latency between each hop. The tool prints out the associated IPs and
latencies as it jumps through different hops.
I snagged this photo from an SFU (Simon Fraser University) slide3 that I think explains
the machinery of traceroute
quite well:
Writing a crappier version of traceroute in Python
After getting a rough idea of what’s going on underneath, I wanted to write a simpler and
crappier version of traceroute
in Python. This version would roughly perform the following
steps:
- Establish a UDP socket connection that’d be used to send empty packets to the hops.
- Create an ICMP socket that’d receive ICMP time exceeded messages.
- Start a loop and use the UDP socket to send an empty byte with a TTL of 1 to the first hop.
- The TTL value of the packet would be decremented by 1 at the first hop. Once the TTL reaches 0, the packet would be discarded, and an ICMP time exceeded message would be returned to the sender through the ICMP socket. The sender would also receive the address of the first hop.
- Calculate the time delta between sending a packet and receiving the ICMP time exceeded message. Also, capture the address of the first hop and log the time delta and address to the console.
- In the subsequent iterations, the TTL value will be incremented by 1 (2, 3, 4, …) and
the steps from 1 through 5 will be repeated until it reaches the
max_hops
value, which is set at 64.
Here’s the complete self-contained implementation. I tested it on Python 3.11:
# script.py
from __future__ import annotations
import socket
import sys
import time
from collections.abc import Generator
from contextlib import ExitStack
def traceroute(
dest_addr: str, max_hops: int = 64, timeout: float = 2
) -> Generator[tuple[str, float], None, None]:
"""Traceroute implementation using UDP packets.
Args:
dest_addr (str): The destination address.
max_hops (int, optional): The maximum number of hops.
Defaults to 64.
timeout (float, optional): The timeout for receiving packets.
Defaults to 2.
Yields:
Generator[tuple[str, float], None, None]: A generator that
yields the current address and elapsed time for each hop.
"""
# ExitStack allows us to avoid multiple nested contextmanagers
with ExitStack() as stack:
# Create an ICMP socket connection for receiving packets
rx = stack.enter_context(
socket.socket(socket.AF_INET, socket.SOCK_RAW, socket.IPPROTO_ICMP)
)
# Create a UDP socket connection for sending packets
tx = stack.enter_context(
socket.socket(
socket.AF_INET, socket.SOCK_DGRAM, socket.IPPROTO_UDP
)
)
# Set the timeout for receiving packets
rx.settimeout(timeout)
# Bind the receiver socket to any available port
rx.bind(("", 0))
# Iterate over the TTL values
for ttl in range(1, max_hops + 1):
# Set the TTL value in the sender socket
tx.setsockopt(socket.IPPROTO_IP, socket.IP_TTL, ttl)
# Send an empty UDP packet to the destination address
tx.sendto(b"", (dest_addr, 33434))
try:
# Start the timer
start_time = time.perf_counter_ns()
# Receive the response packet and extract the source address
_, curr_addr = rx.recvfrom(512)
curr_addr = curr_addr[0]
# Stop the timer and calculate the elapsed time
end_time = time.perf_counter_ns()
elapsed_time = (end_time - start_time) / 1e6
except socket.error:
# If an error occurs while receiving the packet, set the
# address and elapsed time as None
curr_addr = None
elapsed_time = None
# Yield the current address and elapsed time
yield curr_addr, elapsed_time
# Break the loop if the destination address is reached
if curr_addr == dest_addr:
break
def main() -> None:
# Get the destination address from command-line argument
dest_name = sys.argv[1]
dest_addr = socket.gethostbyname(dest_name)
# Print the traceroute header
print(f"Traceroute to {dest_name} ({dest_addr})")
print(f"{'Hop':<5s}{'IP Address':<20s}{'Hostname':<50s}{'Time (ms)':<10s}")
print("-" * 90)
# Iterate over the traceroute results and print each hop information
for i, (addr, elapsed_time) in enumerate(traceroute(dest_addr)):
if addr is not None:
try:
# Get the hostname corresponding to the IP address
host = socket.gethostbyaddr(addr)[0]
except socket.error:
host = ""
# Print the hop information
print(f"{i+1:<5d}{addr:<20s}{host:<50s}{elapsed_time:<10.3f} ms")
else:
# Print "*" for hops with no response
print(f"{i+1:<5d}{'*':<20s}{'*':<50s}{'*':<10s}")
if __name__ == "__main__":
main()
Running the script will give you the following nicely formatted output:
sudo python script.py example.com
Traceroute to example.com (93.184.216.34)
Hop IP Address Hostname Time (ms)
----------------------------------------------------------------------------------------
1 192.168.1.1 1.420 ms
2 142.254.158.201 142-254-158-201.inf.spectrum.com 9.669 ms
3 65.25.145.149 lag-63.uparohgd02h.netops.charter.com 139.603 ms
4 24.33.161.216 lag-31.clmcohib01r.netops.charter.com 14.493 ms
5 65.29.1.38 lag-27.rcr01clevohek.netops.charter.com 19.221 ms
6 66.109.6.70 lag-17.vinnva0510w-bcr00.netops.charter.com 25.803 ms
7 66.109.3.24 lag-21.asbnva1611w-bcr00.netops.charter.com 24.969 ms
8 209.18.36.1 xe-7-3-1.cr0.chi10.tbone.rr.com 24.351 ms
9 152.195.64.129 ae-65.core1.dcb.edgecastcdn.net 25.114 ms
10 93.184.216.34 23.546 ms
Recent posts
- SSH saga
- Injecting Pytest fixtures without cluttering test signatures
- Explicit method overriding with @typing.override
- Quicker startup with module-level __getattr__
- Docker mount revisited
- Topological sort
- Writing a circuit breaker in Go
- Discovering direnv
- Notes on building event-driven systems
- Bash namerefs for dynamic variable referencing