Tuesday, December 22, 2015

An introduction to network packet analysis

I love that, even more than most CTFs, the 2015 SANS Holiday Hack is designed to appeal to kids. My 12-year-old daughter has shown an interest in cybersecurity, so this turned into a great way to teach her a few things. Even better, most of the lessons were in response to her own questions.

I will publish a write-up of the entire challenge (or at least as far as I am able to complete it) in January once the contest concludes; in the meantime, the early challenge goals involve some network packet analysis. My tool of choice for packet analysis is Wireshark. To understand packet analysis, though, it is useful to understand a little bit about how networks work.

Traditionally, network concepts are defined in terms of "layers." At each layer, one device talks to another, and each layer does not care what is happening at the other layers. Keep in mind that what follows is the simplified explanation I gave to my 12-year-old; Microsoft describes things in more detail in a knowledge base article, and for those who want to go deeper, Cisco has mountains of training and certifications available.

Photo credit: Luca Ghio (Wikimedia Commons)

At the bottom, or Layer 1, is the Physical Layer - the physical wires that connect one computer to another, or the air and radio waves that connect devices wirelessly. At this layer, signals are sent by one computer and received by the other. The network standard chosen will specify things such as how many wires are used, how a device physically connects to the network, and whether the data are sent via electrical signals, light pulses, or radio waves.

Layer 2 is the Data-Link Layer. This layer doesn't care about the electrical signals - it relies on the Physical Layer to handle that. The Data-Link Layer provides traffic control, defining how devices react if more than one device wishes to communicate at the same time.

Of note, network traffic at Layer 2 is specific to a single local network or "broadcast domain" (often loosely called a "collision domain"; I will ignore VPNs for the moment). You can imagine a large room: if only a few people are in the room, it is practical to simply call out to the person you wish to speak with, but if there are hundreds of people in the room, all trying to talk at once, it very quickly becomes pure chaos. A local network with too many devices on it has the same problem.

In modern networks, Layer 2 addresses typically take the form of a "MAC address," 6 pairs of hexadecimal digits such as 00-05-9A-AA-12-CC. The first 3 pairs identify the network card manufacturer, while the last 3 pairs are intended to be unique within that manufacturer - a serial number of sorts. Uniqueness is not guaranteed, as there are ways to change or imitate the MAC address, but those are beyond the scope of this lesson.
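To make the two halves of a MAC address concrete, here is a minimal Python sketch that splits the example address above into its manufacturer part and its serial-number part. The function name split_mac is my own invention for illustration, and I am assuming the address may be written with either dashes or colons between the pairs.

```python
import string

def split_mac(mac):
    """Split a MAC address into (manufacturer part, device part)."""
    # Accept either '-' or ':' between the pairs and normalize to upper case.
    pairs = mac.replace(":", "-").upper().split("-")
    if len(pairs) != 6:
        raise ValueError("a MAC address has exactly 6 pairs: %r" % mac)
    for pair in pairs:
        # Each pair must be exactly two hexadecimal digits.
        if len(pair) != 2 or any(c not in string.hexdigits for c in pair):
            raise ValueError("not a pair of hex digits: %r" % pair)
    # First 3 pairs: manufacturer. Last 3 pairs: the "serial number".
    return "-".join(pairs[:3]), "-".join(pairs[3:])

manufacturer, device = split_mac("00-05-9A-AA-12-CC")
print(manufacturer)  # 00-05-9A
print(device)        # AA-12-CC
```

Wireshark does this same split for you and, where it recognizes the manufacturer part, shows the vendor's name in place of the first three pairs.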

Layer 3, or the Network Layer, solves the crowded-room problem. At Layer 2, each computer conceptually yells out "hey you" to everyone in the room. The Network Layer separates large environments into multiple smaller local networks (broadcast domains), limiting the number of computers in any one "room" trying to talk at one time.

This layer also solves another problem in the lower layers. Much like your voice can be heard over a limited distance, the electrical or optical signals at the Physical Layer have a finite usable range beyond which it is not possible to communicate. The Network Layer solves this through "routing." At Layer 3, each computer sends any traffic bound for another network to a "default gateway," typically a router. The default gateway knows where to find devices on its own network, and if it cannot find a device locally, it in turn passes the traffic along to its own default gateway.

This enables computers to talk across the Internet, routing traffic whether across the highway or across the ocean to a distant continent. In concept, this is not all that different from the postal service: you could hand-deliver a letter to a member of your own household, but to send a letter to another home (or city, state, or country), you would put the letter in your mailbox - your "default gateway." Your local post office knows how to deliver in your city, but would hand the letter off to a distribution center - its default gateway - if the delivery address is outside the local region.
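The "deliver it myself or hand it to the gateway" decision can be sketched in a few lines of Python using the standard ipaddress module. The function name next_hop and all of the addresses are made up for illustration. The "/24" notation is something I have not covered in this lesson: here it simply means "every address that starts with 192.168.1."

```python
import ipaddress

def next_hop(destination, local_network, default_gateway):
    """Return the address a packet should be handed to first."""
    dest = ipaddress.ip_address(destination)
    network = ipaddress.ip_network(local_network)
    if dest in network:
        # Same household: deliver the letter by hand.
        return destination
    # Anywhere else: put it in the mailbox and let the gateway work it out.
    return default_gateway

print(next_hop("192.168.1.50", "192.168.1.0/24", "192.168.1.1"))  # 192.168.1.50
print(next_hop("8.8.8.8", "192.168.1.0/24", "192.168.1.1"))       # 192.168.1.1
```

A real computer makes this choice using a routing table that can hold many more entries than one local network and one gateway, but the basic idea is the same.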

Layer 3 addresses traditionally take the form of 4 numbers, each ranging from 0 to 255 (for instance, 10.0.0.1, 192.168.1.1, or 172.18.32.251). Several blocks of addresses are reserved for anyone to use on their own local network: 10.0.0.0 through 10.255.255.255, 172.16.0.0 through 172.31.255.255, and 192.168.0.0 through 192.168.255.255. The expectation is that your router or default gateway will translate your private address into an Internet-routable address. (IPv6 is beyond the scope of this lesson.)
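When looking at a packet capture, it helps to recognize at a glance whether an address is one of these private ones. Below is a small sketch, again using Python's ipaddress module, that checks an address against the three reserved blocks listed above. The name is_private_address is my own, and the "/8", "/12" and "/16" suffixes are just a compact way of writing those three ranges.

```python
import ipaddress

# The three reserved blocks described above, in compact notation.
PRIVATE_BLOCKS = [
    ipaddress.ip_network("10.0.0.0/8"),      # 10.0.0.0 - 10.255.255.255
    ipaddress.ip_network("172.16.0.0/12"),   # 172.16.0.0 - 172.31.255.255
    ipaddress.ip_network("192.168.0.0/16"),  # 192.168.0.0 - 192.168.255.255
]

def is_private_address(address):
    """Return True if the address falls inside one of the reserved blocks."""
    ip = ipaddress.ip_address(address)
    return any(ip in block for block in PRIVATE_BLOCKS)

for address in ["10.0.0.1", "192.168.1.1", "172.18.32.251", "172.32.0.1", "8.8.8.8"]:
    print(address, is_private_address(address))
```

The first three addresses (the examples from the paragraph above) are private; 172.32.0.1 falls just outside the second block, and 8.8.8.8 is an ordinary Internet address.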

Layer 4 is the Transport Layer. Its primary purpose is to ensure the data from higher layers gets to the other computer intact and in the right order. Since Layers 1 through 3 have limits on how much data they can send at a time, the Transport Layer takes whatever is being sent and breaks it into smaller pieces, then at the other end reassembles the pieces in the right order, and in the case of TCP, verifies nothing is missing or duplicated.
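Here is a toy Python sketch of that break-apart-and-reassemble idea. It is not how TCP is actually implemented (real TCP numbers the bytes rather than the pieces, and asks the sender to resend anything that goes missing); the function names and the 8-byte piece size are assumptions chosen to keep the example small.

```python
import random

def segment(data, max_size):
    """Break data into numbered pieces of at most max_size bytes."""
    return [(seq, data[start:start + max_size])
            for seq, start in enumerate(range(0, len(data), max_size))]

def reassemble(pieces, expected_count):
    """Put pieces back in order, dropping duplicates and detecting gaps."""
    by_number = {}
    for seq, chunk in pieces:
        by_number.setdefault(seq, chunk)  # a repeated number is a duplicate
    missing = [n for n in range(expected_count) if n not in by_number]
    if missing:
        raise ValueError("missing pieces: %s" % missing)
    return b"".join(by_number[n] for n in range(expected_count))

message = b"Hello from the Transport Layer!"
pieces = segment(message, 8)

# Simulate a messy network: one piece arrives twice, and all arrive out of order.
arrived = pieces + [pieces[0]]
random.shuffle(arrived)

print(reassemble(arrived, len(pieces)) == message)  # True
```

Because every piece carries its own number, the receiver can rebuild the message no matter what order the pieces arrive in, and it can tell when one never showed up.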

The Transport Layer also provides another important function: what happens if two different applications want to talk to the same computer? How does a computer distinguish between HTTP and HTTPS? The Layer 4 protocols TCP and UDP provide "ports," or a way for each application to identify itself.
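As a rough illustration of ports, the sketch below models a computer with a few applications "listening" and shows how the port number decides which one receives the traffic. The LISTENERS table and the deliver function are invented for this example; the port numbers are the conventional ones (HTTP on TCP port 80, HTTPS on TCP port 443, DNS on UDP port 53).

```python
# Hypothetical table of listening applications, keyed by (protocol, port).
LISTENERS = {
    ("TCP", 80): "web server (HTTP)",
    ("TCP", 443): "web server (HTTPS)",
    ("UDP", 53): "name server (DNS)",
}

def deliver(protocol, port):
    """Return the application that should receive traffic for this port."""
    return LISTENERS.get((protocol, port), "nobody is listening")

print(deliver("TCP", 80))    # web server (HTTP)
print(deliver("TCP", 443))   # web server (HTTPS)
print(deliver("UDP", 9999))  # nobody is listening
```

In Wireshark, the source and destination port columns are usually the quickest clue as to which application a packet belongs to.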

Layers 1 through 4 are enough for basic packet analysis, so that concludes today's lesson.