Part 2: Parse the response

In part 1 we got a response to the DNS query for example.com. Now we want to parse the response.

2.1 Define the DNSRecord struct

The DNSRecord struct:

struct DNSRecord {
    name: String, // domain name
    r#type: u16, // A, AAAA, MX, NS, TXT, etc (encoded as an integer)
    class: u16, //  always the same (1). We’ll ignore this.
    ttl: u32, //  how long to cache the query for. We’ll ignore this.
    data: Vec<u8> // the record’s content, like the IP address.
}

2.2 Parse the DNS header

In part 1, we implemented a generic ToBytes macro and trait so that we could convert any struct to bytes without needing to implement it for every struct.

Here we need to do the reverse, converting bytes into a struct. Following the same approach as before, I will implement a FromBytes macro and trait.

First I will implement the FromBytes trait for the datatypes we are currently using (String and u16). For convenience, I will use the bytebuffer crate to handle reading the correct number of bytes for each datatype from a buffer and keeping track of the position.

Each from_bytes function takes a mutable reference to the buffer: it reads the bytes it needs and advances the read position, leaving the buffer ready for the next field in the struct.

use bytebuffer::ByteReader;
pub trait FromBytes {
  fn from_bytes(buf: &mut ByteReader) -> Self;
}

impl FromBytes for u16 {
  fn from_bytes(buf: &mut ByteReader) -> Self {
      buf.read_u16().unwrap()
  }
}

impl FromBytes for String {
  fn from_bytes(buf: &mut ByteReader) -> Self {
    buf.read_string().unwrap()
  }
}
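To make the position tracking concrete, here is a minimal std-only sketch of the kind of bookkeeping a reader like ByteReader does for us. The SliceReader type and its read_u16 method are hypothetical names used for illustration only and are not part of the bytebuffer crate. The sketch assumes big-endian integers, which is what DNS uses on the wire.

```rust
/// Hypothetical stand-in for a byte reader: a slice plus a read position.
struct SliceReader<'a> {
    bytes: &'a [u8],
    pos: usize,
}

impl<'a> SliceReader<'a> {
    fn new(bytes: &'a [u8]) -> Self {
        SliceReader { bytes, pos: 0 }
    }

    /// Read a big-endian u16 and advance the position by two bytes.
    /// Returns None (and leaves the position unchanged) when fewer
    /// than two bytes remain.
    fn read_u16(&mut self) -> Option<u16> {
        let end = self.pos.checked_add(2)?;
        let chunk = self.bytes.get(self.pos..end)?;
        self.pos = end;
        Some(u16::from_be_bytes([chunk[0], chunk[1]]))
    }
}
```

Because each read moves pos forward, calling read_u16 once per field consumes the struct's fields in declaration order, which is exactly what the derived implementation relies on.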

Then I implement the FromBytes macro following the same pattern as the ToBytes macro from part 1. However, instead of building a vector, this time we build a struct, like Self { a: u16::from_bytes(buf), b: String::from_bytes(buf) }

#[proc_macro_derive(FromBytes)]
pub fn derive_from_bytes(input: TokenStream) -> TokenStream {
    // Parse the input tokens into a syntax tree
    let input = parse_macro_input!(input as DeriveInput);

    let struct_name = input.ident;

    // Generate an expression to populate each field from bytes
    let data = from_bytes(&input.data);

    let expanded = quote! {
        // The generated impl.
        impl struct_bytes::FromBytes for #struct_name {
            fn from_bytes(mut buf: &mut ByteReader) -> Self {
                #data
            }
        }
    };

    // Hand the output tokens back to the compiler.
    TokenStream::from(expanded)
}

// Generate an expression to populate each field from bytes
fn from_bytes(data: &Data) -> proc_macro2::TokenStream {
    match *data {
        Data::Struct(ref data) => {
            match data.fields {
                Fields::Named(ref fields) => {
                    let recurse = fields.named.iter().map(|f| {
                        let name = &f.ident;
                        let ty = &f.ty;
                        quote! {
                            #name: #ty::from_bytes(&mut buf)
                        }
                    });
                    quote! {
                        Self {
                            #(#recurse,)*
                        }
                    }
                }
                Fields::Unnamed(_) | Fields::Unit => unimplemented!()
            }
        }
        Data::Enum(_) | Data::Union(_) => unimplemented!(),
    }
}

Now we can annotate the struct and have it automatically populated from bytes received over the network:

#[derive(FromBytes, ToBytes)]
struct DNSHeader { ... }

One tip I found for debugging procedural macros is to use the cargo-expand tool. Once it is installed, running cargo expand prints the result of the macro expansion, which makes debugging much easier. The code generated by the procedural macro looks like this:

impl struct_bytes::FromBytes for DNSHeader {
    fn from_bytes(mut buf: &mut ByteReader) -> Self {
        Self {
            id: u16::from_bytes(&mut buf),
            flags: u16::from_bytes(&mut buf),
            num_questions: u16::from_bytes(&mut buf),
            num_answers: u16::from_bytes(&mut buf),
            num_authorities: u16::from_bytes(&mut buf),
            num_additionals: u16::from_bytes(&mut buf),
        }
    }
}

Let’s quickly test it out on the response from part 1. The first 12 bytes of the response make up the header:

let header = DNSHeader::from_bytes(&mut ByteReader::from_bytes(&buf[..12]));
println!("{:?}", &buf[..12]);
println!("{:?}", header);

Gives the following:

[214, 26, 129, 128, 0, 1, 0, 1, 0, 0, 0, 0]
DNSHeader { id: 54810, flags: 33152, num_questions: 1, num_answers: 1, num_authorities: 0, num_additionals: 0 }

Which confirms that the FromBytes macro works correctly! ✅
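As a cross-check that does not depend on the macro, the same header bytes can be decoded by hand. This is a small std-only sketch; header_fields is a hypothetical helper name, and it simply treats the 12 bytes as six consecutive big-endian u16 values in the same order as the DNSHeader fields.

```rust
/// Decode a 12-byte DNS header into its six big-endian u16 fields:
/// id, flags, num_questions, num_answers, num_authorities, num_additionals.
fn header_fields(header: &[u8; 12]) -> [u16; 6] {
    let mut fields = [0u16; 6];
    for (field, pair) in fields.iter_mut().zip(header.chunks_exact(2)) {
        *field = u16::from_be_bytes([pair[0], pair[1]]);
    }
    fields
}
```

For the bytes printed above, the first pair [214, 26] works out to 214 * 256 + 26 = 54810, matching the id in the derived output.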

2.3 Parse the domain name (wrong)

The next 21 bytes of the response are the question: the encoded domain name (17 bytes) followed by the type and class (2 bytes each):

let mut br = ByteReader::from_bytes(&buf);
let header = DNSHeader::from_bytes(
    &mut ByteReader::from_bytes(
        br.read_bytes(12)?.as_slice()
    )
);
println!("{:?}", String::from_utf8(br.read_bytes(21)?));

Which prints:

Ok("\u{3}www\u{7}example\u{3}com\0\0\u{1}\0\u{1}")

So we need to parse this to a domain name, implementing the “simple version” that doesn’t quite work from the original book. I will add it to the DNSQuestion implementation, alongside the encode_dns_name function from part 1.

The new function looks like this:

fn decode_name_simple(buf: &mut ByteReader) -> String {
    let mut parts = Vec::<String>::new();
    while let Ok(length) = buf.read_bytes(1) { // read 1 byte to get the length
        if length[0] == 0 { // remember the 0 to mark the end of the name
            break;
        } else {
            let part = String::from_utf8(
                buf.read_bytes(length[0].into()).unwrap() // read the length from the buffer
            );
            parts.push(part.unwrap());
        }
    }
    parts.join(".") // join them with "." to recreate the domain name
}
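For comparison, here is a std-only sketch of the same length-prefixed decoding over a plain byte slice. The function name decode_name_from_slice is hypothetical, and unlike the version above it returns None on truncated or non-UTF-8 input instead of unwrapping, which is an assumption about how errors might be handled rather than what the article's code does.

```rust
/// Decode an uncompressed DNS name starting at `bytes[0]`.
/// Returns the dotted name and the number of bytes consumed,
/// or None if the input is truncated or a label is not valid UTF-8.
fn decode_name_from_slice(bytes: &[u8]) -> Option<(String, usize)> {
    let mut parts: Vec<String> = Vec::new();
    let mut pos = 0;
    loop {
        // Each label starts with a one-byte length
        let length = *bytes.get(pos)? as usize;
        pos += 1;
        if length == 0 {
            break; // a zero-length label terminates the name
        }
        let label = bytes.get(pos..pos + length)?;
        parts.push(String::from_utf8(label.to_vec()).ok()?);
        pos += length;
    }
    Some((parts.join("."), pos))
}
```

Returning the number of bytes consumed plays the role of the reader position: the caller knows the type and class fields start right after the name.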

2.4 Parse the question

To use the new decode_name_simple function along with the FromBytes trait we will need to introduce a new DNSName struct and implement FromBytes for it so it can be used with the macro:

struct DNSName {
    name: String
}
impl ToBytes for DNSName {
    fn to_bytes(&self) -> Vec<u8> {
        Self::encode_dns_name(&self.name).to_bytes()
    }
}
impl FromBytes for DNSName {
    fn from_bytes(buf: &mut ByteReader) -> Self {
        Self {
            name: Self::decode_name_simple(buf)
        }
    }
}

This gets used in the DNSQuestion as follows:

#[derive(FromBytes, ToBytes)]
struct DNSQuestion {
    name: DNSName,
    r#type: u16, 
    class:  u16,
}

Now, to test it:

let mut br = ByteReader::from_bytes(&buf);
let header = DNSHeader::from_bytes(
    &mut ByteReader::from_bytes(
        br.read_bytes(12)?.as_slice()
    )
);
let question = DNSQuestion::from_bytes(&mut br);
println!("{:?}", question);

And this prints:

DNSQuestion { name: "www.example.com", type: 1, class: 1 }

Which is the correct output ✅

2.5 Parse the record

In the original article, the decode_name_simple function breaks at this point because the name in the record is compressed, so we need to implement DNS compression first.

2.6 Implement DNS compression

A DNS response often contains several copies of the same domain name. Instead of repeating the name every time it is needed, the response includes it once, and every later occurrence is replaced by a pointer to the position of that name in the DNS packet.

In the domain name section of a DNS record, a length byte whose two most significant bits are both 1 (11) marks a compression pointer rather than a label length.

We can check for this by performing a bitwise & between the length byte and the mask 11000000, then comparing the result to the mask itself. The & keeps a bit only where both operands have a 1, so the result equals the mask only when both top bits are set, e.g.:

length = 11000000
is_compressed = (length & 11000000) == 11000000

The above works because the maximum supported length of any part is 63, which is 00111111 in binary, so a real length never has either of the 2 MSBs set. The smallest byte with both bits set, 11000000, is 192 in decimal, which is too large to be a real length (as it’s > 63). That’s how it’s possible to distinguish between a real length and a compressed domain name.

After determining that the domain name is compressed, the location in the packet is decoded by taking the bottom 6 bits of the length byte and the next byte. The bottom 6 bits are taken by performing a bitwise & with 00111111.
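The two bit operations described above can be sketched as a pair of small helpers. The names POINTER_MASK, is_pointer and pointer_offset are hypothetical and only used here for illustration. This sketch requires both top bits to be set before treating a byte as a pointer.

```rust
/// Mask selecting the two most significant bits of a length byte.
const POINTER_MASK: u8 = 0b1100_0000;

/// True when both of the two most significant bits are set.
fn is_pointer(length: u8) -> bool {
    (length & POINTER_MASK) == POINTER_MASK
}

/// Combine the bottom 6 bits of the first byte with the second byte
/// to form the 14-bit offset into the packet.
fn pointer_offset(first: u8, second: u8) -> u16 {
    u16::from_be_bytes([first & 0b0011_1111, second])
}
```

A very common pointer is the byte pair 0xC0 0x0C: masking the first byte leaves 0, so the offset is 12, the position right after the 12-byte header where the question's name starts.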

Putting it all together looks like this:

fn decode_name(buf: &mut ByteReader) -> String {
    let mut parts = Vec::<String>::new();
    while let Ok(length) = buf.read_u8() { // read 1 byte to get the length
        if length == 0 {
            break;
        } else {
            if length & 0b11000000 == 0b11000000 { // both MSBs are 1, so the domain name is compressed
                parts.push(Self::decode_compressed_name(length, buf));
                break
            } else { // uncompressed domain name, decode the "simple" way
                let part = String::from_utf8(
                    buf.read_bytes(length.into()).unwrap()
                );
                parts.push(part.unwrap());
            }
        }
    }
    parts.join(".")
}

fn decode_compressed_name(length: u8, buf: &mut ByteReader) -> String {
    let pointer_bytes = vec![length & 0b00111111, buf.read_u8().unwrap()];
    let pointer = u16::from_be_bytes(pointer_bytes.try_into().unwrap());
    let pos = buf.get_rpos(); // save current reader position
    buf.set_rpos(pointer.try_into().unwrap()); // seek to pointer position in DNS packet
    let result = Self::decode_name(buf);
    buf.set_rpos(pos); // restore current reader position
    result
}
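One thing the version above does not guard against is a malicious packet whose pointer refers back to itself, which would recurse forever. Below is a std-only sketch of an iterative variant over a byte slice that caps the number of pointer jumps. The names decode_name_at and MAX_JUMPS, and the limit of 16, are assumptions chosen for illustration and do not come from the original article.

```rust
/// Arbitrary cap on pointer jumps, to stop pointer loops (an assumption).
const MAX_JUMPS: usize = 16;

/// Decode a possibly compressed DNS name starting at `start` in `packet`.
/// Returns the dotted name and the position just past the name at `start`.
fn decode_name_at(packet: &[u8], start: usize) -> Option<(String, usize)> {
    let mut parts: Vec<String> = Vec::new();
    let mut pos = start;
    let mut resume_at: Option<usize> = None; // set when the first pointer is followed
    let mut jumps = 0;
    loop {
        let length = *packet.get(pos)?;
        if length & 0b1100_0000 == 0b1100_0000 {
            let second = *packet.get(pos + 1)?;
            // Only the first pointer decides where the caller resumes.
            if resume_at.is_none() {
                resume_at = Some(pos + 2);
            }
            jumps += 1;
            if jumps > MAX_JUMPS {
                return None; // probably a pointer loop
            }
            pos = u16::from_be_bytes([length & 0b0011_1111, second]) as usize;
        } else if length == 0 {
            pos += 1; // step over the terminating zero byte
            break;
        } else if length > 63 {
            return None; // top bits 01 or 10 are not a valid label length
        } else {
            let begin = pos + 1;
            let end = begin + length as usize;
            let label = packet.get(begin..end)?;
            parts.push(String::from_utf8(label.to_vec()).ok()?);
            pos = end;
        }
    }
    Some((parts.join("."), resume_at.unwrap_or(pos)))
}
```

Tracking resume_at replaces the save and restore of the reader position: after following a pointer, parsing of the rest of the record continues two bytes past the pointer, not at the end of the name it pointed to.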

2.7 Finish our DNS Record parsing

Now we need to modify the from_bytes implementation for DNSName (which DNSRecord uses for its name field) to use the new decode_name function in place of decode_name_simple:

impl FromBytes for DNSName {
    fn from_bytes(buf: &mut ByteReader) -> Self {
        Self {
            name: Self::decode_name(buf)
        }
    }
}

And to test it:

let mut br = ByteReader::from_bytes(&buf);
let header = DNSHeader::from_bytes(
    &mut ByteReader::from_bytes(
        br.read_bytes(12)?.as_slice()
    )
);
let question = DNSQuestion::from_bytes(&mut br);
let record: DNSRecord = DNSRecord::from_bytes(&mut br);
println!("{:?}", record);

prints:

DNSRecord { name: DNSName { name: "www.example.com" }, type: 1, class: 1, ttl: 302, data: DNSData { data: [93, 184, 215, 14] } }
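To show where these values come from, here is a std-only sketch that parses the part of a record that follows the name. RecordFields and parse_record_fields are hypothetical names for illustration. The layout assumed is the standard one: a 2-byte type, 2-byte class, 4-byte TTL and 2-byte data length, all big-endian, followed by the data itself.

```rust
/// The fields that follow the name in a resource record.
#[derive(Debug, PartialEq)]
struct RecordFields {
    rtype: u16,
    class: u16,
    ttl: u32,
    data: Vec<u8>,
}

/// Parse type (2 bytes), class (2), TTL (4), data length (2), then the data.
/// Returns None if the slice is too short.
fn parse_record_fields(bytes: &[u8]) -> Option<RecordFields> {
    let fixed = bytes.get(..10)?;
    let data_len = u16::from_be_bytes([fixed[8], fixed[9]]) as usize;
    let data = bytes.get(10..10 + data_len)?.to_vec();
    Some(RecordFields {
        rtype: u16::from_be_bytes([fixed[0], fixed[1]]),
        class: u16::from_be_bytes([fixed[2], fixed[3]]),
        ttl: u32::from_be_bytes([fixed[4], fixed[5], fixed[6], fixed[7]]),
        data,
    })
}
```

Note that the TTL occupies four bytes, so reading it as a u16 would leave the reader two bytes short and misalign the data length that follows.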

2.8 Parse our DNS packet

Now we put it all together to parse the whole DNS packet.

A DNS packet actually consists of multiple records. The header specifies how many to expect in each section of the packet through the num_questions, num_answers, num_authorities, and num_additionals fields.

So let’s create a new DNSPacket struct and FromBytes implementation:

struct DNSPacket {
    header: DNSHeader,
    questions: Vec<DNSQuestion>,
    answers: Vec<DNSRecord>,
    authorities: Vec<DNSRecord>,
    additionals: Vec<DNSRecord>
}
impl FromBytes for DNSPacket {
    fn from_bytes(buf: &mut ByteReader) -> Self {
        // The header is always the first 12 bytes of the packet
        let header = DNSHeader::from_bytes(
            &mut ByteReader::from_bytes(&buf.read_bytes(12).unwrap())
        );
        // Parse each section before `header` is moved into the struct
        let questions: Vec<DNSQuestion> = (0..header.num_questions).map(|_| DNSQuestion::from_bytes(buf)).collect();
        let answers: Vec<DNSRecord> = (0..header.num_answers).map(|_| DNSRecord::from_bytes(buf)).collect();
        let authorities: Vec<DNSRecord> = (0..header.num_authorities).map(|_| DNSRecord::from_bytes(buf)).collect();
        let additionals: Vec<DNSRecord> = (0..header.num_additionals).map(|_| DNSRecord::from_bytes(buf)).collect();
        Self { header, questions, answers, authorities, additionals }
    }
}

2.9 Pretty print the IP Address

For this we can use Ipv4Addr from the Rust standard library (use std::net::Ipv4Addr) to make things easier. Ipv4Addr already implements the Display trait, so there is no need to implement anything extra to pretty print the IP. I will introduce a new struct DNSData and replace the data field in DNSRecord with this new type (it was previously a Vec<u8>):

struct DNSRecord {
    name: DNSName, // domain name // could add custom function override here e.g. `decode_name_simple`
    r#type: u16, // A, AAAA, MX, NS, TXT, etc (encoded as an integer)
    class: u16, //  always the same (1). We’ll ignore this.
    ttl: u32, //  how long to cache the query for. We’ll ignore this.
    data: DNSData // the record’s content, like the IP address.
}

struct DNSData{
    ip: Ipv4Addr
}
impl FromBytes for DNSData {
    fn from_bytes(buf: &mut ByteReader) -> Self {
        let data_len = buf.read_u16().unwrap();
        let data = buf.read_bytes(data_len.try_into().unwrap()).unwrap();
        Self {
            ip: Ipv4Addr::new(data[0], data[1], data[2], data[3])
        }
    }
}

The IP address can now be printed with:

let packet = DNSPacket::from_bytes(&mut br);
println!("{}", packet.answers[0].data.ip)

Which prints 93.184.215.14.
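The DNSData implementation above indexes data[0] through data[3] directly, which assumes the record is an A record with exactly four bytes of data. A slightly more defensive std-only sketch is shown below. The helper name ipv4_from_data is hypothetical, and returning an Option for other data lengths is an assumption about how one might handle them.

```rust
use std::net::Ipv4Addr;

/// Interpret record data as an IPv4 address only when it is exactly 4 bytes.
fn ipv4_from_data(data: &[u8]) -> Option<Ipv4Addr> {
    match data {
        [a, b, c, d] => Some(Ipv4Addr::new(*a, *b, *c, *d)),
        _ => None, // not an A record payload (wrong length)
    }
}
```

Since Ipv4Addr implements Display, the returned address can be printed with {} or turned into a string with to_string().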

2.10 Test out all our code

Here is the function to look up any domain name and pretty print the IP address:

fn lookup_domain(domain_name: String) -> std::io::Result<()> {
    let query = build_query(domain_name, TYPE_A);
    let socket = UdpSocket::bind("0.0.0.0:34254")?;
    socket.send_to(&query, "8.8.8.8:53")?;
    let mut buf = [0; 1024];
    socket.recv_from(&mut buf)?;
    let mut br = ByteReader::from_bytes(&buf);
    let packet = DNSPacket::from_bytes(&mut br);
    println!("{}", packet.answers[0].data.ip);
    Ok(())
}

fn main() -> std::io::Result<()> {
    lookup_domain(String::from("example.com"))?;
    lookup_domain(String::from("recurse.com"))?;
    lookup_domain(String::from("metafilter.com"))?;
    Ok(())
}

And the result after running:

93.184.215.14
18.165.201.88
54.203.56.158

Which matches the expected output from the Python article! 🎉