Largest Rectangle in Histogram (ease in)

Posted on August 22, 2018

Problem Description

Largest Rectangle in Histogram is one of the most popular algorithm problems in interviews. Given a histogram, it asks for the maximum rectangular area that fits entirely inside the histogram.

For instance, 

The histogram [2, 1, 5, 6, 2, 3] has a maximum rectangular area of 10 (= 2*5) units. (Source: LeetCode)

Take a minute and see whether you can solve this problem.

TL;DR

Maintain a stack of bars with non-decreasing heights; each bar is pushed and popped at most once, which gives an O(N) solution (see the code at the end).

Formulate

We can first formulate this problem as:

\large \begin{align*} A &= \max_{i \le j} (S_{i,j}) \\ S_{i, j} &= (j-i+1)*\min_{i\le k \le j}hist[k] \end{align*}

 \large S_{i, j} is the area of the largest rectangle covering the range \large [i, j] (inclusive): it is the width \large (j-i+1) times the height \large \min_{i\le k \le j}hist[k] . For the example above, the best range is \large [2, 3] , which gives \large S_{2,3} = 2*5 = 10 . There are a lot of solutions available online, and most of them just tell you how to apply a greedy algorithm to solve it. I personally do not recommend those tutorials because:

  • A greedy algorithm is very problem-specific. It is hard to apply the same greedy strategy from one problem to another unless they are very similar. Even if you understand how to solve this problem with a greedy strategy, you can hardly solve the next one on your own.
  • Solving problems with greedy strategies can be “dangerous”. Proving the correctness of a greedy strategy is tedious but necessary, and blindly applying one can fail on corner cases. (e.g. some Dynamic Programming problems seem to have a greedy solution but actually do not)
  • It is hard to come up with a greedy strategy immediately. We tend to find the pattern first and then propose a strategy.

But, admittedly, a good greedy strategy can reduce the complexity greatly. I will show you how to get there step by step using reusable methods.

 

Solution I: O(N^3)

A simple brute-force solution would be:

  • Enumerate all possible ranges \large [i, j] . Takes O(N^2).
    • Find the minimum height in that range. Takes O(N).
  • Multiply the width of the range by the minimum height to get \large S .
  • Update the answer.

The total time complexity is O(N^3).
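For concreteness, here is a minimal sketch of this brute force in Python (the function name is mine, not part of the original problem):

def largest_rectangle_bruteforce(hist):
    """Try every range [i, j] and scan it for the minimum height: O(N^3)."""
    n = len(hist)
    ans = 0
    for i in range(n):
        for j in range(i, n):                     # O(N^2) ranges
            height = min(hist[i:j + 1])           # O(N) scan for the minimum
            ans = max(ans, (j - i + 1) * height)  # width * height
    return ans

largest_rectangle_bruteforce([2, 1, 5, 6, 2, 3])  # 10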

 

Solution II: O(N^2)

If I were told that this problem can be solved in O(N^2), my first intuition would be to optimize the innermost operation, which is the part where we find the minimum height in a range. Because the outer loops already take O(N^2), we have to do the min operation in O(1).

The most common O(1) operations are either looking up an indexable structure (e.g. an array or hash map) or applying a constant number of operators (e.g. a+b or min(a, b)). It is impossible to get the minimum of a variable-length range with a constant number of operators, so we should turn to the look-up method.

So assume that we have a matrix called \large h , from which we can query the minimum of any range by looking up the table, namely \large h[i][j] = \min_{i\le k \le j}hist[k] . Is it possible to pre-calculate this look-up table in O(N^2)? The answer is yes, and the solution is quite simple. I will leave that to you.
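In case you want to compare notes afterwards, one possible way (a sketch of mine, not the only one) is to grow each range one cell at a time and reuse the previous minimum, i.e. \large h[i][j] = \min(h[i][j-1], hist[j]) :

def build_min_table(hist):
    """Pre-calculate h[i][j] = min(hist[i..j]) for all i <= j in O(N^2)."""
    n = len(hist)
    h = [[None] * n for _ in range(n)]
    for i in range(n):
        h[i][i] = hist[i]                        # a range of length 1
        for j in range(i + 1, n):
            h[i][j] = min(h[i][j - 1], hist[j])  # extend the range by one cell
    return h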

In the example, our \large h looks like:

2 1 1 1 1 1
/ 1 1 1 1 1
/ / 5 5 2 2
/ / / 6 2 2
/ / / / 2 2
/ / / / / 3

Now we can also print out \large (j-i+1) as a width matrix \large w :

1 2 3 4 5 6
/ 1 2 3 4 5
/ / 1 2 3 4
/ / / 1 2 3
/ / / / 1 2
/ / / / / 1

If we do an element-wise multiplication, we get the answer matrix:

2 2 3  4 5 6
/ 1 2  3 4 5
/ / 5 10 6 8
/ / /  6 4 6
/ / /  / 2 4
/ / /  / / 3

We can calculate the three matrices step by step, and each step takes only O(N^2):

  • Calculate the \large h matrix. Takes O(N^2)
  • Calculate the \large w matrix. Takes O(N^2)
  • Calculate the answer matrix. Takes O(N^2)
  • Find the maximum in answer matrix. Takes O(N^2)

We can actually combine the last three steps into one loop, as sketched below.
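Here is a minimal sketch of that combined loop (the function name is mine). It also folds the \large h look-up into a running minimum, so none of the matrices has to be stored explicitly:

def largest_rectangle_quadratic(hist):
    """For each start i, sweep j to the right while tracking the minimum: O(N^2)."""
    n = len(hist)
    ans = 0
    for i in range(n):
        min_height = hist[i]
        for j in range(i, n):
            min_height = min(min_height, hist[j])     # h[i][j], maintained on the fly
            ans = max(ans, (j - i + 1) * min_height)  # w[i][j] * h[i][j]
    return ans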

Solution III: O(N)

So how do we move from O(N^2) to O(N)? To be clear, there is no guarantee that a grid/range-search problem can be optimized to O(N). In this case, we can observe that we do not have to go through the entire \large h matrix. For example, in the second row all elements (minimum heights) are the same (1), and we only have to consider the rightmost one because it has the largest width (5).

More formally, a cell in \large h cannot be optimal when the cell to its right has the same value (the right one has the same height but a larger width, and thus a larger area). We can apply this property wherever a value repeats contiguously in a row of \large h . In the following matrix, the non-optimal cells are removed from \large h (shown as dots):

2 . . . . 1
/ . . . . 1
/ / . 5 . 2
/ / / 6 . 2
/ / / / . 2
/ / / / / 3

Since the rows are no longer aligned into a full matrix, \large h is now a list of tuples per row rather than a matrix. We can replace each remaining cell with a tuple \large (value, index_{start}, index_{end}) covering its run:

i=0 (2, 0, 0) (1, 1, 5)
i=1 (1, 1, 5)
i=2 (5, 2, 3) (2, 4, 5)
i=3 (6, 3, 3) (2, 4, 5)
i=4 (2, 4, 5)
i=5 (3, 5, 5)

If we iterate from i=0 to i=5, we still need O(N^2) time, because i=0 takes 6 steps, i=1 takes 5 steps, and so on. But if we iterate in reverse, we can reuse the result of \large h[i+1] to calculate \large h[i] :

i=5 (3, 5, 5)
i=4 (2, 4, 5)
i=3 (6, 3, 3) (2, 4, 5)
i=2 (5, 2, 3) (2, 4, 5)
i=1 (1, 1, 5)
i=0 (2, 0, 0) (1, 1, 5)

The last row contains only the last bar. To calculate a row \large h[i] that is not at the bottom, we can assume that we have already calculated \large h[i+1] .

If \large hist[i] is smaller than the value of the top tuple in \large h[i+1] , say \large (v_0, s_0, e_0) , then this run is final: \large v_0 will never appear in \large h[k] (k\le i) again, because \large hist[i] is a smaller value. Since row \large i+1 is \large (v_0, s_0, e_0) 's last appearance, the largest rectangle of height \large v_0 spans columns \large i+1 through \large e_0 , so we can calculate \large v_0 * (e_0-i) and update the answer. We keep popping while the top value is still larger than \large hist[i] . After that, we push a tuple for \large hist[i] onto what remains of \large h[i+1] ; it starts at \large i and ends at \large s'-1 , where \large s' is the start index of the tuple now on top. If everything has been popped out, then \large hist[i] is the smallest value from \large i to the end, and its tuple ends at \large n-1 . The result is \large h[i] .

If \large hist[i] is greater than or equal to the top value in \large h[i+1] , it cannot affect the values on its right. We can just push \large (hist[i], i, i) onto \large h[i+1] to get \large h[i] .

Finally, after processing \large i=0 , each tuple still left in \large h[0] spans from column 0 to its end index and is evaluated the same way. As you can see, a single stack is enough to maintain \large h as it changes, because we only use a tuple's value and end index when popping it. Since each height \large hist[i] is pushed and popped at most once, our solution has a time complexity of O(N).
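Here is a minimal sketch of this reversed (right-to-left) iteration, written by me to match the description above; the tuples keep their start index only to mirror the tables above, since the area at pop time needs only the end index and the current row:

def largest_rectangle_reversed(hist):
    """Right-to-left version: maintain the tuples of h[i] on a stack, O(N)."""
    n = len(hist)
    ans = 0
    stack = []  # tuples (value, start, end); the largest value sits on top
    for i in range(n - 1, -1, -1):
        end = i  # the new tuple for hist[i] ends here unless it absorbs popped runs
        # Pop every tuple that is strictly taller: row i is past its last appearance.
        while stack and stack[-1][0] > hist[i]:
            value, start, e = stack.pop()
            ans = max(ans, value * (e - i))  # its rectangle spans columns i+1 .. e
            end = e                          # hist[i]'s run now extends right to e
        stack.append((hist[i], i, end))
    # Whatever is left is still present in row 0 and spans columns 0 .. e.
    while stack:
        value, start, e = stack.pop()
        ans = max(ans, value * (e + 1))
    return ans

largest_rectangle_reversed([2, 1, 5, 6, 2, 3])  # 10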

 

Here is sample code, but it is developed from the beginning (left to right), which means the stack maintains ranges that end at \large i rather than start at \large i . Trick: I appended a -1 (smaller than any possible height) to the end of the array to force the stack to pop everything and evaluate all remaining candidates.

class Solution:
    def largestRectangleArea(self, heights):
        """
        :type heights: List[int]
        :rtype: int
        """
        
        ans = 0
        h = []  # stack of (height, start index, push index)
        # The appended -1 is smaller than any height, so every bar gets popped.
        for ind, ele in enumerate(heights + [-1]):
            # Pop every bar taller than the current one: the current bar
            # bounds their rectangles on the right, at column ind - 1.
            while h and h[-1][0] > ele:
                height, start, end = h.pop(-1)
                ans = max(ans, (ind - start) * height)
                
            # The current bar extends left over all the bars just popped,
            # i.e. back to the cell right after the remaining stack top.
            start = h[-1][2] + 1 if h else 0
            h.append((ele, start, ind))
            # print(h)
        return ans

 

Takeaway:

  • A correct greedy strategy can be hard to design and prove, but it can have lower time/space complexity.
  • If you want to trade space for time, think about which values you can pre-calculate and store in a look-up table.
  • Make use of the properties of the problem.
    • In this case:
      • the minimum height of a range never increases as the range grows, so the stack stays sorted and we only need to compare against its top
        • (Otherwise, each step could cost O(N) instead of a single comparison during popping.)
      • the width grows monotonically, which lets us infer the best width for each height
        • (Otherwise, one height might need multiple copies in the stack, each requiring its own pushes and pops, and the total time complexity could no longer be guaranteed to be O(N).)