avatar

Derek Zeng

A programmer

Segment tree and Binary indexed tree (2)

by coderek

Binary indexed tree (BIT) is an interesting data structure. It makes calculating and updating prefix sum efficient. Using old way of keeping prefix sum, you have to keep an accumulative array c[]. Even though getting the prefix sum is O(1) but updating one element (thus the sums associated with it) is O(n). With BIT, both operation is O(lgn).

Lets take a look at the example using BIT:

We have an array a=[1,2,3,4,5] and want to calculate its sums efficiently. BIT is represented by another array of the size a.length+1, say it's bit[]. When matching BIT index to a's index, we increment a's index to get BIT's index.

When constructing BIT, we loop all element in a, call updateBIT(bit, i+1, a[i]).

void updateBIT(int[] bit, int idx, int val) {
    while (idx<=bit.length) {
        bit[idx] += val;
        idx+=idx&-idx;
    }
}

Here we used a while loop to add the val to multiple places. Let's take a look what exactly are the places updated. Let's say idx=5 initially.

The expression idx&-idx basically get the trailing 1 of the idx's binary form. So binary form of 5 is 0b101. idx&-idx should be 1. If idx is 0b10100, idx&-idx is 0b100.

We are then adding this trailing 1 to idx, so the index sequence for idx=5 becomes 0b101 => 0b110 => 0b1000, in base 10, 5=>6=>8.

Why do we do this? It's because, in BIT, we split prefix sum into several parts. Each part is exclusive to each other. For example, sum(0~5) = sum(0~4) + sum(5) or sum(0~15)=sum(0~8)+sum(9~12)+sum(13~14)+sum(15). So the sums that contains sum(5) are those index obtained by adding the trailing 1 to its index.

An easy way to remember the coverage of an index in BIT is the look at it's binary form. e.g. 12 or 0b1100 is covers 4 numbers, namely 12, 11, 10, 9. The range covered is equal to 2 to the power of no. of trailing zeros.

To get the sum, we simply do

int getBIT(int[] bit, int idx) {
    int sum = 0;
    while (idx>0) {
        sum+=bit[idx];
        idx-=idx&-idx;
    }
    return sum;
}

We can see the code for BIT is very simple while the idea is very powerful.

BIT can do more than keeping just prefix sum. Any associative operation fit into the same realm as prefix sum problem. For example max.

In the Skyline problem, when we traverse the start and end points in the order of x coordinates, we compare it with current max height. If the point being examined is higher than the max, we record it down in the skyline map.

So the problem becomes how to get the max building seen so far. An easy way is to record down the highest building we've seen so far. It's like a running max.

But the problem of the approach above is that, you don't know when the max should end. (it's the building's height, so technically it should end when building's end point is encountered) Also you need to revert to the second max when the previous max exits.

Intuitively one can solve it using a heap. But it also can be easily solved by binary indexed tree. We just need to keep a prefix max of building heights. How to do that?

We use point position as index. And whenever a start point is seen, we immediately update the max at the end point of that building. So the coverage of that height value is actaully between begining of the skyline and the end point of the building.

The trick here is that we can't use the usual way to update the BIT. The usual way update values upwards. If we do that, the max will be compared globally.

Looking at the BIT tree below (copied from topCoder, ignore the prefix sum):

None

Look at it carefully, you will notice a pattern.

None

So the values of BIT have a scope. The scope is increasing as the index increases. For the skyline problem, we want to make sure the local max has the scope that only affect prefix points. So we have to update the values downwards instead.

    void add(int[] bit, int i, int h) {
        while (i>0) {
            bit[i]=Math.max(bit[i], h);
            i-=(i&-i);
        }
    }

With this, the max I set at position 9 won't have any effect after 9. The max value will be stored inside 9 and 8 covering 0 through 7.

When requesting the current max, we need go to the scope and fetch it.This means, we go upwards and recursively take the max until we are at the end of the BIT array.

If there is a building that spans the entire length of the BIT array, its height will be considered since we will compare the max of the outmost scope.

The code for retrieving max

    int find(int[] bit, int i) {
        int h=0;
        while (i<bit.length) {
            h=Math.max(h, bit[i]);
            i+=(i&-i);
        }
        return h;
    }

This problem demonstrate the alternative way we can use binary indexed tree. When using it as max tree, it not only query/update prefix value efficiently, but also handles well when the value exits its effect zone (finds a second max efficiently).

(End of article)