TASK 1 (Compression + entropy)
1, x, x+1, 2x, 2x+1, 3x, 3x+1, 4x
1.1 For x = 10, gap-encode using Golomb with M = 8
Gaps are: +1, +9, +1, +9, +1, +9, +1, +9 (eight gaps, not seven) Golomb code for +1 : 1001 Golomb code for +9 : 01001 Encoding of the whole sequence: 1001 01001 1001 01001 1001 01001 1001 01001
1.2 No larger M gives shorter sequence
For M > 8, writing the residual (r) part alone costs >= 4 Bits. Together with the 1 separating it from the q part, this is >= 5 Bits per code. That is >= 40 Bits for the sequence above. With M = 8 we need only 36 Bits.
1.3 Function that computes Golomb code for M = 8
ArrayList<Bool> golombCode(int x) { result = new ArrayList<Bool>; int q = x / 8; int r = x % 8; for (int i = 0; i < q; i++) { result.add(0); } result.add(1); result.add(r & 4 > 0); result.add(r & 2 > 0); result.add(r & 1 > 0); return result;
1.4 Entropy-optimal code for sequence above, for abitrary x
We have two symbols, each occuring the same number of times. The entropy of the corresponding random variable (X taking on either value with probability 1/2) is 1/2 * log_2 1/2 + 1/2 * log_2 1/2 = 1. And there is indeed a (prefix-free) encoding with just 1 Bit per value:
1 : 0 x-1 : 1