number theory - Repeating patterns in sequence of binary bit counts

Saturday, 10 March 2018

number theory - Repeating patterns in sequence of binary bit counts

I was solving a problem on leetcode where you produce a list of bit counts for every number $i$ in the range $1 \le i \le n$ . The bit count is simply the amount of 1's in a numbers binary representation (e.g. 5 is 101 in binary, meaning that its bit count is 2)(Link to the problem since I'm not sure I'm explaining it clearly)

Now that thats out of the way, I believe I found something quite interesting. I'm sure its been discovered before but my google fu skills when it comes to mathematics is extremely lacklustre so I havent been able to find anything.

Let's start off with an example to explain this problem more simply. Say we are given $n = 32$ , we'd find the bit count (amount of 1s in a numbers binary representation) for each number from 1 up to 32. The bit count for the number 1 is 1 (0001 has one "1"). The bit count for the number 2 is also 1 (0010 has one "1"). The bit count for the number 3 is 2 (0011 has two "1"). We do this for all the numbers up until 32 and we are left with
this.

Within this sequence of numbers there is an interesting pattern. We can see that between the 4th and 8th indices (3rd and 4th red "1"'s respectively) that there is a sequence which is repeated in subsequent numbers; [2, 2, 3]. Since this is the first sequence we've found, we can call it "1". To make this clearer I have highlighted them here

As we can see, this sequence is shown after the introduction of a new bit (as shown by the red "1"s). Also, if we look between the 8th and 16th indices, we can see that another repeating sequence has emerged; [2, 3, 3, 4]. Lets call this sequence "2" since its the second repeating sequence we've come across. To make this pattern and future patterns a bit clearer, I will be making $n = 64$ . If we circle the repeating sequences we are left with this.

Lets continue the same naming convention for these sequences based off the order of their discovery; [3, 4, 4, 5] = "3" and [4, 5, 5, 6] = "4". There are two observations that can be made at this point:

The amount of sequences double between subsequent red "1"s. E.g. between the 4th and 8th indices there is one sequence. Between the 8th and 16th indices there are two sequences. Between the 16th and 32nd there are four sequences and between the 32nd and 64th there are eight sequences. However, I'm sure this is easily explained and isn't all that interesting.

Each sequence of length 4 follows the same pattern as the previous sequences; [n, n+1, n+1, n+2], where n is one greater than the first digit of the previous sequence. However, this is also probably easily explained and isn't the interesting thing.

To make things clearer, every digit that isn't in a sequence will be removed, which leaves us with this. As stated earlier, each sequence was given a name signifying the order of its discovery, these names and their corresponding sequences are as follows:

[2, 2, 3] = "1"

[2, 3, 3, 4] = "2"

[3, 4, 4, 5] = "3"

[4, 5, 5, 6] = "4"

If we substitute each sequence with its given name, we are left with
this colour coded sequence. If we compare this new sequence with the sequence from right at the start, we can see that it repeats the exact same pattern (starting from index 2, represented by the second red "1").

My question is, why is this happening and what is its significance?

Blog

Saturday, 10 March 2018