# Integer Representation in HPACK

July 9, 2015

After watching my Erlang User Conference talk, "HTTP/2 and You!", I remembered that I skipped right over integer representation in HPACK because of time constraints. As I'm about to dive into the topic again at LambdaJam next week in Chicago, I figured I'd write it up here as a supplement to both talks. I feel like it might go down easier as a blog post anyway.

## The HPACK Spec

The HPACK spec, RFC-7541, goes into how integer representation works here and it's a good, if not dense, read.

### Header Types

HPACK is really stingy. It loves to save bits on the wire whenever it can. When it encodes a header in a set of bytes, it starts that set of bytes with a set of bits letting everyone know what type of header it is:

#### Indexed Header Field

```
0 1 2 3 4 5 6 7
+---+---+---+---+---+---+---+---+
| 1 | Index (7+) |
+---+---------------------------+
```

We're sending over a single integer to represent the header on the wire. The bit
prefix `1`

, is telling us that it's this type, and the following 7 bits are the
integer we're sending. You already see the problem, don't you? What if the
integer we're sending is greater than `2^7-1`

? Since that's only 127, it's a likely
situation. That's why that above figure from the HPACK spec calls Index `(7+)`

.

#### Literal Header Field with Incremental Indexing

```
0 1 2 3 4 5 6 7
+---+---+---+---+---+---+---+---+
| 0 | 1 | Index (6+) |
+---+---+-----------------------+
| H | Value Length (7+) |
+---+---------------------------+
| Value String (Length octets) |
+-------------------------------+
```

This time, the header value isn't in the encoding and decoding contexts, but we want
it to be in the future. The header name happens to already be in the contexts,
so we're going to use the bit prefix `01`

. When we use `01`

, the Index is now a
`(6+)`

. Which means we could only fit `2^6-1`

integers in that remaining 6 bits
before we run out of space.

Value Length is another integer, this time of `(7+)`

.

#### Other Prefixes

I won't deep dive into the types of headers on the wire here, except to give you a list of the other bit prefixes that an integer might follow:

`H`

- String lengths are all prefixed by a bit that specifies wether or not

the string is Huffman encoded

`0000`

- Literal Header Field without Indexing`0001`

- Literal Header Field Never Indexed`001`

- Dynamic Table Size Update

#### Integer Prefix Lengths

What that all means is that HPACK will be trying to encode integers in the following formats:

`4+`

`5+`

`6+`

`7+`

## How It Does It

Since the algorithm for encoding integers is the name for all four, we'll start
referring to the number of bits we have as `N`

.

For integers less than `2^N-1`

, it's super easy. Just stick the bits in those N
that are available and you're done.

It's for `Integer`

greater than `2^N-1`

that it gets tricky.

Step 1: Set all `N`

bits to 1.

Step 2: Since HPACK is super stingy with bits, it doesn't want to waste `N`

bits,
so it will consider `2^N-1`

of the integer you want to send already sent.

Now we're left with `Remaining = Integer - (2^N-1)`

to send over the wire and
we're going to do it in whole bytes.

The first bit in that byte is going to be a continuation bit.

`1`

means there are more bytes`0`

means this is the last one

Ok, great, 7 more bits to work with. We're just going to take the least
significant 7 bits from the `Remaining`

and put them in this next byte. If
`Remaining`

less than 128, we set the continuation bit to `0`

and we're done. If
it's greater, we set the continuation bit to `1`

, then we bitshift right `Remaing`

and do the whole thing again.

Here's how I did it in Erlang. You can see the whole integer module at hpack_integer.erl

```
-spec encode(non_neg_integer(), pos_integer()) -> binary().
encode(Int, N) when Int < (1 bsl N - 1) ->
<<Int:N>>;
encode(Int, N) ->
Prefix = 1 bsl N - 1,
Remaining = Int - Prefix,
Bin = encode_(Remaining, <<>>),
<<Prefix:N, Bin/binary>>.
```

`encode/2`

takes `Int`

that you want to encode and `N`

. It takes care of Steps 1
and 2 I described above, as well as the case where `Int < 2^N-1`

.

For numbers larger than `2^N-1`

, it calls `encode_/2`

which will just recurse
until we're out of bits to encode.

```
-spec encode_(non_neg_integer(), binary()) -> binary().
encode_(I, BinAcc) ->
LeastSigSeven = (I rem 128),
RestToEncode = I bsr 7,
case RestToEncode of
0 ->
<<BinAcc/binary, LeastSigSeven>>;
_ -> %% Adds the continuation bit
ThisByte = 128 + LeastSigSeven,
encode_(RestToEncode, <<BinAcc/binary, ThisByte>>)
end.
```

#### What Happens when Integer = 2^N-1

I've been tip-toeing around this one because I assumed it would just be setting
all `N`

bits to `1`

and done. Then I wiresharked it and it didn't work. Turns out
that all `N`

bits to `1`

is a signal that we're doing this encoding method, and
HPACK expects the rest to be encoded in the following bytes, but there is no `Rest`

.

We've got to send a zero byte to follow this up with. So, sending 31 with N = 5 looks like this:

```
0 1 2 3 4 5 6 7
+---+---+---+---+---+---+---+---+
| X | X | X | 1 | 1 | 1 | 1 | 1 |
| 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 |
+---+---+---+---+---+---+---+---+
```

Which is a lot of wasted `0`

s, which we know HPACK hates; but, it's an edge case
so HPACK still sleeps well at night.

I've created some simple clauses of `encode/2`

to handle this edge case:

```
encode( 1,1) -> << 1:1,0:8>>;
encode( 3,2) -> << 3:2,0:8>>;
encode( 7,3) -> << 7:3,0:8>>;
encode( 15,4) -> << 15:4,0:8>>;
encode( 31,5) -> << 31:5,0:8>>;
encode( 63,6) -> << 63:6,0:8>>;
encode(127,7) -> <<127:7,0:8>>;
encode(255,8) -> <<255,0>>;
```

### Examples

Appendix C.1 of the HPACK RFC-7541 has a few examples that I'd just be copying into this space, so I'll just link to them instead