Part 2: Parse the response
In part 1 we got a response to the DNS query for example.com. Now we want to parse the response.
2.1 Define DNSRecord class
The DNSRecord struct:
struct DNSRecord {
name: String, // domain name
r#type: u16, // A, AAAA, MX, NS, TXT, etc (encoded as an integer)
class: u16, // always the same (1). We’ll ignore this.
ttl: u32, // how long to cache the query for (a 32-bit value on the wire). We’ll ignore this.
data: Vec<u8> // the record’s content, like the IP address.
}
2.2 Parse the DNS header
In part 1, we implemented a generic ToBytes macro and trait so that we could convert any struct to bytes without needing to implement it for every struct.
Here we need to do the reverse, converting bytes into a struct. Following the same approach as before, I will implement a FromBytes macro and trait.
First I will implement the FromBytes trait for the datatypes we are using currently (String and u16). For convenience, I will use the bytebuffer crate to handle the logic for reading the correct number of bytes for each datatype from a buffer and keeping track of the position.
Each from_bytes function takes a mutable reference to the buffer, because it reads the bytes it needs and advances the position before the buffer is passed on to the next field in the struct.
use bytebuffer::ByteReader;
pub trait FromBytes {
fn from_bytes(buf: &mut ByteReader) -> Self;
}
impl FromBytes for u16 {
fn from_bytes(buf: &mut ByteReader) -> Self {
buf.read_u16().unwrap()
}
}
impl FromBytes for String {
fn from_bytes(buf: &mut ByteReader) -> Self {
buf.read_string().unwrap()
}
}
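To make it clearer what the bytebuffer crate is doing for us here, below is a minimal std-only sketch of the same idea. The Reader type and its take method are hypothetical stand-ins for ByteReader and are not part of any crate. The two things it illustrates are that DNS integers are big-endian (network byte order) and that every read advances a shared position. Unlike the version above, this sketch returns an Option rather than calling unwrap().

```rust
/// Hypothetical stand-in for `bytebuffer::ByteReader`: a slice plus a read position.
pub struct Reader<'a> {
    data: &'a [u8],
    pos: usize,
}

impl<'a> Reader<'a> {
    pub fn new(data: &'a [u8]) -> Self {
        Reader { data, pos: 0 }
    }

    /// Return the next `n` bytes and advance the position,
    /// or `None` if fewer than `n` bytes remain.
    pub fn take(&mut self, n: usize) -> Option<&'a [u8]> {
        let end = self.pos.checked_add(n)?;
        let bytes = self.data.get(self.pos..end)?;
        self.pos = end;
        Some(bytes)
    }
}

/// Same shape as the trait above, but fallible instead of panicking.
pub trait FromBytes: Sized {
    fn from_bytes(buf: &mut Reader) -> Option<Self>;
}

impl FromBytes for u16 {
    fn from_bytes(buf: &mut Reader) -> Option<Self> {
        let b = buf.take(2)?;
        // Network byte order: the first byte is the most significant one.
        Some(u16::from_be_bytes([b[0], b[1]]))
    }
}
```

Reading twice from the bytes 214, 26, 129, 128 with this sketch yields 54810 and then 33152, and a third read returns None because the buffer is exhausted.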
And then implement the FromBytes macro by following the same pattern as the ToBytes macro (from part 1). However, instead of building a vector, this time we are building a struct like Self { a: u16::from_bytes(buf), b: String::from_bytes(buf) }:
#[proc_macro_derive(FromBytes)]
pub fn derive_from_bytes(input: TokenStream) -> TokenStream {
// Parse the input tokens into a syntax tree
let input = parse_macro_input!(input as DeriveInput);
let struct_name = input.ident;
// Generate an expression to populate each field from bytes
let data = from_bytes(&input.data);
let expanded = quote! {
// The generated impl.
impl struct_bytes::FromBytes for #struct_name {
fn from_bytes(mut buf: &mut ByteReader) -> Self {
#data
}
}
};
// Hand the output tokens back to the compiler.
TokenStream::from(expanded)
}
// Generate an expression to populate each field from bytes
fn from_bytes(data: &Data) -> proc_macro2::TokenStream {
match *data {
Data::Struct(ref data) => {
match data.fields {
Fields::Named(ref fields) => {
let recurse = fields.named.iter().map(|f| {
let name = &f.ident;
let ty = &f.ty;
quote! {
#name: #ty::from_bytes(&mut buf)
}
});
quote! {
Self {
#(#recurse,)*
}
}
}
Fields::Unnamed(_) | Fields::Unit => unimplemented!()
}
}
Data::Enum(_) | Data::Union(_) => unimplemented!(),
}
}
Now we can annotate the struct and have it automatically populated from bytes received over the network:
#[derive(FromBytes, ToBytes)]
struct DNSHeader { ... }
One tip I found for debugging procedural macros is to use the cargo-expand crate. Running cargo expand will output the result of the macro expansion, allowing for easy debugging. The code generated by the procedural macro looks like this:
impl struct_bytes::FromBytes for DNSHeader {
fn from_bytes(mut buf: &mut ByteReader) -> Self {
Self {
id: u16::from_bytes(&mut buf),
flags: u16::from_bytes(&mut buf),
num_questions: u16::from_bytes(&mut buf),
num_answers: u16::from_bytes(&mut buf),
num_authorities: u16::from_bytes(&mut buf),
num_additionals: u16::from_bytes(&mut buf),
}
}
}
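If you want to experiment with the generated shape without setting up a proc-macro crate, a declarative macro can produce much the same code. This is a different technique from the derive macro above, not a drop-in replacement: the from_bytes_struct! macro is a hypothetical name, and the trait in this sketch reads from a plain &mut &[u8] instead of a ByteReader so that it only needs the standard library.

```rust
/// Hypothetical trait for this sketch: parse `Self` from the front of a slice,
/// advancing the slice past the bytes that were consumed.
pub trait FromBytes: Sized {
    fn from_bytes(buf: &mut &[u8]) -> Self;
}

impl FromBytes for u16 {
    fn from_bytes(buf: &mut &[u8]) -> Self {
        // Panics if fewer than 2 bytes remain, like the `unwrap()` calls above.
        let (head, rest) = buf.split_at(2);
        *buf = rest;
        u16::from_be_bytes([head[0], head[1]])
    }
}

/// Declares a struct together with a field-by-field `FromBytes` impl.
macro_rules! from_bytes_struct {
    ($name:ident { $($field:ident : $ty:ty),* $(,)? }) => {
        #[derive(Debug, PartialEq)]
        pub struct $name {
            $(pub $field: $ty),*
        }

        impl FromBytes for $name {
            fn from_bytes(buf: &mut &[u8]) -> Self {
                // Field expressions are evaluated in the order written,
                // so bytes are consumed in declaration order.
                Self {
                    $($field: <$ty as FromBytes>::from_bytes(buf)),*
                }
            }
        }
    };
}

from_bytes_struct!(DNSHeader {
    id: u16,
    flags: u16,
    num_questions: u16,
    num_answers: u16,
    num_authorities: u16,
    num_additionals: u16,
});
```

The trade-off is that the struct has to be declared inside the macro call, whereas the derive macro can annotate an ordinary struct definition.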
Let’s quickly test it out on the response from part 1. The first 12 bytes of the response make up the header:
let header = DNSHeader::from_bytes(&mut ByteReader::from_bytes(&buf[..12]));
println!("{:?}", &buf[..12]);
println!("{:?}", header);
Gives the following:
[214, 26, 129, 128, 0, 1, 0, 1, 0, 0, 0, 0]
DNSHeader { id: 54810, flags: 33152, num_questions: 1, num_answers: 1, num_authorities: 0, num_additionals: 0 }
Which confirms that the FromBytes macro works correctly! ✅
2.3 Parse the domain name (wrong)
The next 21 bytes of the response are the question: the encoded domain name (17 bytes), followed by the type and class (2 bytes each):
let mut br = ByteReader::from_bytes(&buf);
let header = DNSHeader::from_bytes(
&mut ByteReader::from_bytes(
br.read_bytes(12)?.as_slice()
)
);
println!("{:?}", String::from_utf8(br.read_bytes(21)?));
Which prints:
Ok("\u{3}www\u{7}example\u{3}com\0\0\u{1}\0\u{1}")
So we need to parse this into a domain name. I will start with the “simple version” from the original book, which doesn’t quite work because it can’t handle DNS compression. I will add it to the DNSQuestion implementation, alongside the encode_dns_name function from part 1.
The new function looks like this:
fn decode_name_simple(buf: &mut ByteReader) -> String {
let mut parts = Vec::<String>::new();
while let Ok(length) = buf.read_bytes(1) { // read 1 byte to get the length
if length[0] == 0 { // remember the 0 to mark the end of the name
break;
} else {
let part = String::from_utf8(
buf.read_bytes(length[0].into()).unwrap() // read `length` bytes from the buffer
);
parts.push(part.unwrap());
}
}
parts.join(".") // join them with "." to recreate the domain name
}
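As a cross-check of the logic, here is a minimal std-only sketch of the same label-by-label decoding, working directly on a byte slice. It is an illustration rather than the code used in this series: it returns an Option instead of unwrapping, and it also reports how many bytes the name occupied, which is my own addition.

```rust
/// Decode an uncompressed DNS name that starts at `data[0]`.
/// Returns the dotted name and the number of bytes consumed (including the
/// terminating zero byte), or `None` if the input is truncated or not UTF-8.
pub fn decode_name_simple(data: &[u8]) -> Option<(String, usize)> {
    let mut parts: Vec<String> = Vec::new();
    let mut pos = 0;
    loop {
        // Each label starts with a one-byte length.
        let length = *data.get(pos)? as usize;
        pos += 1;
        if length == 0 {
            // A zero length marks the end of the name.
            break;
        }
        let label = data.get(pos..pos + length)?;
        parts.push(String::from_utf8(label.to_vec()).ok()?);
        pos += length;
    }
    Some((parts.join("."), pos))
}
```

Fed the 21 bytes printed above, this returns "www.example.com" and a consumed length of 17, leaving the 4 bytes of type and class unread.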
2.4 Parse the question
To use the new decode_name_simple function along with the FromBytes trait, we will need to introduce a new DNSName struct and implement FromBytes for it so it can be used with the macro:
struct DNSName {
name: String
}
impl ToBytes for DNSName {
fn to_bytes(&self) -> Vec<u8> {
Self::encode_dns_name(&self.name).to_bytes()
}
}
impl FromBytes for DNSName {
fn from_bytes(buf: &mut ByteReader) -> Self {
Self {
name: Self::decode_name_simple(buf)
}
}
}
This gets used in the DNSQuestion as follows:
#[derive(FromBytes, ToBytes)]
struct DNSQuestion {
name: DNSName,
r#type: u16,
class: u16,
}
Now, to test it:
let mut br = ByteReader::from_bytes(&buf);
let header = DNSHeader::from_bytes(
&mut ByteReader::from_bytes(
br.read_bytes(12)?.as_slice()
)
);
let question = DNSQuestion::from_bytes(&mut br);
println!("{:?}", question);
And this prints:
DNSQuestion { name: "www.example.com", type: 1, class: 1 }
Which is the correct output ✅
2.5 Parse the record
In the original, the decode_name_simple function breaks here because of DNS compression, so we need to implement DNS compression.
2.6 Implement DNS compression
A DNS response often contains the same domain name several times. Instead of repeating the name every time it is needed, the response can include it once and from that point onwards use a pointer to the position of that name in the DNS packet.
In the domain name section of a DNS record, if the first two bits of a length byte are 11, the byte is not a real length: it marks the start of a pointer to a compressed domain name.
We can check for this by performing a bitwise & on the length byte with 11000000 and comparing the result to 0. The & compares each pair of bits and sets the result bit to 1 only if both bits are 1, e.g.:
length = 11000000
is_compressed = length & 11000000 != 0
Strictly speaking, this test passes if either of the two top bits is set. That is enough here, because the 01 and 10 combinations are reserved by the DNS specification (RFC 1035); a stricter test would compare the masked value to 11000000.
The above works because the maximum supported length of any part is 63, which is represented in binary as 00111111, so the 2 MSBs of a real length are never 1. The compression marker 11000000 is 192 in decimal, which is too large to be a real length (as it’s > 63). That’s how it’s possible to distinguish between a real length and a compressed domain name.
After determining that the domain name is compressed, the location in the packet is decoded from the bottom 6 bits of the length byte followed by the 8 bits of the next byte, which together form a 14-bit offset from the start of the packet. The bottom 6 bits are taken by performing a bitwise & with 00111111.
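The bit manipulation described above can be sketched as two small helpers. The function names are hypothetical, and the check used here is the strict one (both top bits must be set).

```rust
/// True when the top two bits of a length byte are both set (0b11xx_xxxx).
pub fn is_compression_pointer(length: u8) -> bool {
    (length & 0b1100_0000) == 0b1100_0000
}

/// Combine the low 6 bits of the first byte with the whole second byte
/// into a 14-bit offset from the start of the packet.
pub fn pointer_offset(first: u8, second: u8) -> u16 {
    u16::from_be_bytes([first & 0b0011_1111, second])
}
```

For example, the byte pair 0xC0 0x0C decodes to offset 12, which is the first byte after the 12-byte header, where the name in the question starts.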
Putting it all together looks like this:
fn decode_name(buf: &mut ByteReader) -> String {
let mut parts = Vec::<String>::new();
while let Ok(length) = buf.read_u8() { // read 1 byte to get the length
if length == 0 {
break;
} else {
if length & 0b11000000 != 0 { // a top bit is set (11 in practice), so this is a pointer to a compressed name
parts.push(Self::decode_compressed_name(length, buf));
break
} else { // uncompressed domain name, decode the "simple" way
let part = String::from_utf8(
buf.read_bytes(length.into()).unwrap()
);
parts.push(part.unwrap());
}
}
}
parts.join(".")
}
fn decode_compressed_name(length: u8, buf: &mut ByteReader) -> String {
// the offset is the low 6 bits of the length byte followed by the next byte (big-endian)
let pointer = u16::from_be_bytes([length & 0b00111111, buf.read_u8().unwrap()]);
let pos = buf.get_rpos(); // save current reader position
buf.set_rpos(pointer.try_into().unwrap()); // seek to pointer position in DNS packet
let result = Self::decode_name(buf);
buf.set_rpos(pos); // restore current reader position
result
}
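One thing the version above does not guard against is a malformed packet whose pointer refers back to itself, which would recurse forever. Below is a minimal std-only sketch that follows pointers iteratively over the whole packet and gives up after a fixed number of jumps. The MAX_JUMPS value is an arbitrary choice for this sketch, and the signature (a packet slice plus a start offset) is my own assumption rather than the ByteReader-based one used in this series.

```rust
/// Decode a possibly compressed DNS name that starts at `packet[start]`.
/// Returns the dotted name and the offset just past the name at its original
/// position (a pointer occupies 2 bytes), or `None` on malformed input.
pub fn decode_name(packet: &[u8], start: usize) -> Option<(String, usize)> {
    const MAX_JUMPS: usize = 16; // arbitrary cap chosen for this sketch
    let mut parts: Vec<String> = Vec::new();
    let mut pos = start;
    let mut end: Option<usize> = None; // where parsing resumes after the name
    let mut jumps = 0;
    loop {
        let length = *packet.get(pos)?;
        if (length & 0b1100_0000) == 0b1100_0000 {
            let second = *packet.get(pos + 1)?;
            // Only the first pointer determines where the caller resumes.
            end = end.or(Some(pos + 2));
            jumps += 1;
            if jumps > MAX_JUMPS {
                return None; // most likely a pointer loop
            }
            pos = u16::from_be_bytes([length & 0b0011_1111, second]) as usize;
        } else if length == 0 {
            end = end.or(Some(pos + 1));
            break;
        } else {
            let len = length as usize;
            let label = packet.get(pos + 1..pos + 1 + len)?;
            parts.push(String::from_utf8(label.to_vec()).ok()?);
            pos += 1 + len;
        }
    }
    Some((parts.join("."), end?))
}
```

Because the offsets are relative to the start of the packet, the slice passed in has to be the whole response, header included.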
2.7 Finish our DNS Record parsing
Now we need to modify the from_bytes implementation for DNSName to use the new decode_name function in place of decode_name_simple:
impl FromBytes for DNSName {
fn from_bytes(buf: &mut ByteReader) -> Self {
Self {
name: Self::decode_name(buf)
}
}
}
And to test it:
let mut br = ByteReader::from_bytes(&buf);
let header = DNSHeader::from_bytes(
&mut ByteReader::from_bytes(
br.read_bytes(12)?.as_slice()
)
);
let question = DNSQuestion::from_bytes(&mut br);
let record: DNSRecord = DNSRecord::from_bytes(&mut br);
println!("{:?}", record);
Which prints:
DNSRecord { name: DNSName { name: "www.example.com" }, type: 1, class: 1, ttl: 302, data: DNSData { data: [93, 184, 215, 14] } }
2.8 Parse our DNS packet
Now we put it all together to parse the whole DNS packet.
A DNS packet actually consists of multiple records. The header specifies how many records to expect in each section of the packet in the num_questions, num_answers, num_authorities, and num_additionals fields.
So let’s create a new DNSPacket struct and FromBytes implementation:
struct DNSPacket {
header: DNSHeader,
questions: Vec<DNSQuestion>,
answers: Vec<DNSRecord>,
authorities: Vec<DNSRecord>,
additionals: Vec<DNSRecord>,
}
impl FromBytes for DNSPacket {
fn from_bytes(buf: &mut ByteReader) -> Self {
let header = DNSHeader::from_bytes(
&mut ByteReader::from_bytes(&buf.read_bytes(12).unwrap())
);
// Parse the sections in wire order, before `header` is moved into the struct,
// so that the record counts can still be read from it.
let questions = (0..header.num_questions).map(|_| DNSQuestion::from_bytes(buf)).collect();
let answers = (0..header.num_answers).map(|_| DNSRecord::from_bytes(buf)).collect();
let authorities = (0..header.num_authorities).map(|_| DNSRecord::from_bytes(buf)).collect();
let additionals = (0..header.num_additionals).map(|_| DNSRecord::from_bytes(buf)).collect();
Self { header, questions, answers, authorities, additionals }
}
}
2.9 Pretty print the IP Address
For this we can use std::net::Ipv4Addr from the Rust standard library to make things easier. Ipv4Addr already implements the Display trait, so there is no need to implement anything extra to pretty print the IP. I will introduce a new struct DNSData and replace the data field in the DNSRecord with this new type (this was previously a Vec<u8>):
struct DNSRecord {
name: DNSName, // domain name // could add custom function override here e.g. `decode_name_simple`
r#type: u16, // A, AAAA, MX, NS, TXT, etc (encoded as an integer)
class: u16, // always the same (1). We’ll ignore this.
ttl: u32, // how long to cache the query for. We’ll ignore this.
data: DNSData // the record’s content, like the IP address.
}
struct DNSData {
ip: Ipv4Addr
}
impl FromBytes for DNSData {
fn from_bytes(buf: &mut ByteReader) -> Self {
let data_len = buf.read_u16().unwrap();
let data = buf.read_bytes(data_len.try_into().unwrap()).unwrap();
Self {
// assumes an A record with at least 4 bytes of data; indexing panics on anything shorter
ip: Ipv4Addr::new(data[0], data[1], data[2], data[3])
}
}
}
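The DNSData above treats every record as an A record. If you want to be a bit more defensive, one option is to look at the record type before interpreting the bytes. The sketch below is only an illustration of that idea: the RecordData enum and the parse_record_data function are hypothetical names, and it works on a plain byte slice rather than a ByteReader.

```rust
use std::convert::TryFrom;
use std::net::Ipv4Addr;

/// Hypothetical alternative to `DNSData` that keeps non-A data as raw bytes.
#[derive(Debug, PartialEq)]
pub enum RecordData {
    A(Ipv4Addr),
    Other(Vec<u8>),
}

/// Record type number for A records.
const TYPE_A: u16 = 1;

/// Interpret a record's raw data according to its type.
pub fn parse_record_data(record_type: u16, data: &[u8]) -> RecordData {
    match (record_type, <[u8; 4]>::try_from(data)) {
        // An A record carries exactly four bytes: the IPv4 address.
        (TYPE_A, Ok(octets)) => RecordData::A(Ipv4Addr::from(octets)),
        // Anything else (or an A record of the wrong size) stays as raw bytes.
        _ => RecordData::Other(data.to_vec()),
    }
}
```

With this shape, a TXT or NS record in the answer section would be kept as bytes instead of causing a panic or being misread as an address.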
The IP address can now be printed with:
let packet = DNSPacket::from_bytes(&mut br);
println!("{}", packet.answers[0].data.ip);
Which prints 93.184.215.14.
2.10 Test out all our code
Here is the function to look up any domain name and pretty print the IP address:
fn lookup_domain(domain_name: String) -> std::io::Result<()> {
let query = build_query(domain_name, TYPE_A);
let socket = UdpSocket::bind("0.0.0.0:34254")?;
socket.send_to(&query, "8.8.8.8:53")?;
let mut buf = [0; 1024];
socket.recv_from(&mut buf)?;
let mut br = ByteReader::from_bytes(&buf);
let packet = DNSPacket::from_bytes(&mut br);
println!("{}", packet.answers[0].data.ip);
Ok(())
}
fn main() -> std::io::Result<()> {
lookup_domain(String::from("example.com"))?;
lookup_domain(String::from("recurse.com"))?;
lookup_domain(String::from("metafilter.com"))?;
Ok(())
}
And the result after running:
93.184.215.14
18.165.201.88
54.203.56.158
Which matches the expected output from the Python article! 🎉