I'll do my best to explain it here on simple terms, but be warned that this topic takes my students a couple of months to finally grasp. You can find more information on the Chapter 2 of the Data Structures and Algorithms in Java book.
There is no mechanical procedure that can be used to get the BigOh.
As a "cookbook", to obtain the BigOh from a piece of code you first need to realize that you are creating a math formula to count how many steps of computations get executed given an input of some size.
The purpose is simple: to compare algorithms from a theoretical point of view, without the need to execute the code. The lesser the number of steps, the faster the algorithm.
For example, let's say you have this piece of code:
int sum(int* data, int N) {
int result = 0; // 1
for (int i = 0; i < N; i++) { // 2
result += data[i]; // 3
}
return result; // 4
}
This function returns the sum of all the elements of the array, and we want to create a formula to count the computational complexity of that function:
Number_Of_Steps = f(N)
So we have f(N)
, a function to count the number of computational steps. The input of the function is the size of the structure to process. It means that this function is called such as:
Number_Of_Steps = f(data.length)
The parameter N
takes the data.length
value. Now we need the actual definition of the function f()
. This is done from the source code, in which each interesting line is numbered from 1 to 4.
There are many ways to calculate the BigOh. From this point forward we are going to assume that every sentence that doesn't depend on the size of the input data takes a constant C
number computational steps.
We are going to add the individual number of steps of the function, and neither the local variable declaration nor the return statement depends on the size of the data
array.
That means that lines 1 and 4 takes C amount of steps each, and the function is somewhat like this:
f(N) = C + ??? + C
The next part is to define the value of the for
statement. Remember that we are counting the number of computational steps, meaning that the body of the for
statement gets executed N
times. That's the same as adding C
, N
times:
f(N) = C + (C + C + ... + C) + C = C + N * C + C
There is no mechanical rule to count how many times the body of the for
gets executed, you need to count it by looking at what does the code do. To simplify the calculations, we are ignoring the variable initialization, condition and increment parts of the for
statement.
To get the actual BigOh we need the Asymptotic analysis of the function. This is roughly done like this:
- Take away all the constants
C
.
- From
f()
get the polynomium in its standard form
.
- Divide the terms of the polynomium and sort them by the rate of growth.
- Keep the one that grows bigger when
N
approaches infinity
.
Our f()
has two terms:
f(N) = 2 * C * N ^ 0 + 1 * C * N ^ 1
Taking away all the C
constants and redundant parts:
f(N) = 1 + N ^ 1
Since the last term is the one which grows bigger when f()
approaches infinity (think on limits) this is the BigOh argument, and the sum()
function has a BigOh of:
O(N)
There are a few tricks to solve some tricky ones: use summations whenever you can.
As an example, this code can be easily solved using summations:
for (i = 0; i < 2*n; i += 2) { // 1
for (j=n; j > i; j--) { // 2
foo(); // 3
}
}
The first thing you needed to be asked is the order of execution of foo()
. While the usual is to be O(1)
, you need to ask your professors about it. O(1)
means (almost, mostly) constant C
, independent of the size N
.
The for
statement on the sentence number one is tricky. While the index ends at 2 * N
, the increment is done by two. That means that the first for
gets executed only N
steps, and we need to divide the count by two.
f(N) = Summation(i from 1 to 2 * N / 2)( ... ) =
= Summation(i from 1 to N)( ... )
The sentence number two is even trickier since it depends on the value of i
. Take a look: the index i takes the values: 0, 2, 4, 6, 8, ..., 2 * N, and the second for
get executed: N times the first one, N - 2 the second, N - 4 the third... up to the N / 2 stage, on which the second for
never gets executed.
On formula, that means:
f(N) = Summation(i from 1 to N)( Summation(j = ???)( ) )
Again, we are counting the number of steps. And by definition, every summation should always start at one, and end at a number bigger-or-equal than one.
f(N) = Summation(i from 1 to N)( Summation(j = 1 to (N - (i - 1) * 2)( C ) )
(We are assuming that foo()
is O(1)
and takes C
steps.)
We have a problem here: when i
takes the value N / 2 + 1
upwards, the inner Summation ends at a negative number! That's impossible and wrong. We need to split the summation in two, being the pivotal point the moment i
takes N / 2 + 1
.
f(N) = Summation(i from 1 to N / 2)( Summation(j = 1 to (N - (i - 1) * 2)) * ( C ) ) + Summation(i from 1 to N / 2) * ( C )
Since the pivotal moment i > N / 2
, the inner for
won't get executed, and we are assuming a constant C execution complexity on its body.
Now the summations can be simplified using some identity rules:
- Summation(w from 1 to N)( C ) = N * C
- Summation(w from 1 to N)( A (+/-) B ) = Summation(w from 1 to N)( A ) (+/-) Summation(w from 1 to N)( B )
- Summation(w from 1 to N)( w * C ) = C * Summation(w from 1 to N)( w ) (C is a constant, independent of
w
)
- Summation(w from 1 to N)( w ) = (N * (N + 1)) / 2
Applying some algebra:
f(N) = Summation(i from 1 to N / 2)( (N - (i - 1) * 2) * ( C ) ) + (N / 2)( C )
f(N) = C * Summation(i from 1 to N / 2)( (N - (i - 1) * 2)) + (N / 2)( C )
f(N) = C * (Summation(i from 1 to N / 2)( N ) - Summation(i from 1 to N / 2)( (i - 1) * 2)) + (N / 2)( C )
f(N) = C * (( N ^ 2 / 2 ) - 2 * Summation(i from 1 to N / 2)( i - 1 )) + (N / 2)( C )
=> Summation(i from 1 to N / 2)( i - 1 ) = Summation(i from 1 to N / 2 - 1)( i )
f(N) = C * (( N ^ 2 / 2 ) - 2 * Summation(i from 1 to N / 2 - 1)( i )) + (N / 2)( C )
f(N) = C * (( N ^ 2 / 2 ) - 2 * ( (N / 2 - 1) * (N / 2 - 1 + 1) / 2) ) + (N / 2)( C )
=> (N / 2 - 1) * (N / 2 - 1 + 1) / 2 =
(N / 2 - 1) * (N / 2) / 2 =
((N ^ 2 / 4) - (N / 2)) / 2 =
(N ^ 2 / 8) - (N / 4)
f(N) = C * (( N ^ 2 / 2 ) - 2 * ( (N ^ 2 / 8) - (N / 4) )) + (N / 2)( C )
f(N) = C * (( N ^ 2 / 2 ) - ( (N ^ 2 / 4) - (N / 2) )) + (N / 2)( C )
f(N) = C * (( N ^ 2 / 2 ) - (N ^ 2 / 4) + (N / 2)) + (N / 2)( C )
f(N) = C * ( N ^ 2 / 4 ) + C * (N / 2) + C * (N / 2)
f(N) = C * ( N ^ 2 / 4 ) + 2 * C * (N / 2)
f(N) = C * ( N ^ 2 / 4 ) + C * N
f(N) = C * 1/4 * N ^ 2 + C * N
And the BigOh is:
O(N²)
The stack is the memory set aside as scratch space for a thread of execution. When a function is called, a block is reserved on the top of the stack for local variables and some bookkeeping data. When that function returns, the block becomes unused and can be used the next time a function is called. The stack is always reserved in a LIFO (last in first out) order; the most recently reserved block is always the next block to be freed. This makes it really simple to keep track of the stack; freeing a block from the stack is nothing more than adjusting one pointer.
The heap is memory set aside for dynamic allocation. Unlike the stack, there's no enforced pattern to the allocation and deallocation of blocks from the heap; you can allocate a block at any time and free it at any time. This makes it much more complex to keep track of which parts of the heap are allocated or freed at any given time; there are many custom heap allocators available to tune heap performance for different usage patterns.
Each thread gets a stack, while there's typically only one heap for the application (although it isn't uncommon to have multiple heaps for different types of allocation).
To answer your questions directly:
To what extent are they controlled by the OS or language runtime?
The OS allocates the stack for each system-level thread when the thread is created. Typically the OS is called by the language runtime to allocate the heap for the application.
What is their scope?
The stack is attached to a thread, so when the thread exits the stack is reclaimed. The heap is typically allocated at application startup by the runtime, and is reclaimed when the application (technically process) exits.
What determines the size of each of them?
The size of the stack is set when a thread is created. The size of the heap is set on application startup, but can grow as space is needed (the allocator requests more memory from the operating system).
What makes one faster?
The stack is faster because the access pattern makes it trivial to allocate and deallocate memory from it (a pointer/integer is simply incremented or decremented), while the heap has much more complex bookkeeping involved in an allocation or deallocation. Also, each byte in the stack tends to be reused very frequently which means it tends to be mapped to the processor's cache, making it very fast. Another performance hit for the heap is that the heap, being mostly a global resource, typically has to be multi-threading safe, i.e. each allocation and deallocation needs to be - typically - synchronized with "all" other heap accesses in the program.
A clear demonstration:
Image source: vikashazrati.wordpress.com
Best Solution
I think there are several questions buried in this topic:
buildHeap
so it runs in O(n) time?buildHeap
runs in O(n) time when implemented correctly?How do you implement
buildHeap
so it runs in O(n) time?Often, answers to these questions focus on the difference between
siftUp
andsiftDown
. Making the correct choice betweensiftUp
andsiftDown
is critical to get O(n) performance forbuildHeap
, but does nothing to help one understand the difference betweenbuildHeap
andheapSort
in general. Indeed, proper implementations of bothbuildHeap
andheapSort
will only usesiftDown
. ThesiftUp
operation is only needed to perform inserts into an existing heap, so it would be used to implement a priority queue using a binary heap, for example.I've written this to describe how a max heap works. This is the type of heap typically used for heap sort or for a priority queue where higher values indicate higher priority. A min heap is also useful; for example, when retrieving items with integer keys in ascending order or strings in alphabetical order. The principles are exactly the same; simply switch the sort order.
The heap property specifies that each node in a binary heap must be at least as large as both of its children. In particular, this implies that the largest item in the heap is at the root. Sifting down and sifting up are essentially the same operation in opposite directions: move an offending node until it satisfies the heap property:
siftDown
swaps a node that is too small with its largest child (thereby moving it down) until it is at least as large as both nodes below it.siftUp
swaps a node that is too large with its parent (thereby moving it up) until it is no larger than the node above it.The number of operations required for
siftDown
andsiftUp
is proportional to the distance the node may have to move. ForsiftDown
, it is the distance to the bottom of the tree, sosiftDown
is expensive for nodes at the top of the tree. WithsiftUp
, the work is proportional to the distance to the top of the tree, sosiftUp
is expensive for nodes at the bottom of the tree. Although both operations are O(log n) in the worst case, in a heap, only one node is at the top whereas half the nodes lie in the bottom layer. So it shouldn't be too surprising that if we have to apply an operation to every node, we would prefersiftDown
oversiftUp
.The
buildHeap
function takes an array of unsorted items and moves them until they all satisfy the heap property, thereby producing a valid heap. There are two approaches one might take forbuildHeap
using thesiftUp
andsiftDown
operations we've described.Start at the top of the heap (the beginning of the array) and call
siftUp
on each item. At each step, the previously sifted items (the items before the current item in the array) form a valid heap, and sifting the next item up places it into a valid position in the heap. After sifting up each node, all items satisfy the heap property.Or, go in the opposite direction: start at the end of the array and move backwards towards the front. At each iteration, you sift an item down until it is in the correct location.
Which implementation for
buildHeap
is more efficient?Both of these solutions will produce a valid heap. Unsurprisingly, the more efficient one is the second operation that uses
siftDown
.Let h = log n represent the height of the heap. The work required for the
siftDown
approach is given by the sumEach term in the sum has the maximum distance a node at the given height will have to move (zero for the bottom layer, h for the root) multiplied by the number of nodes at that height. In contrast, the sum for calling
siftUp
on each node isIt should be clear that the second sum is larger. The first term alone is hn/2 = 1/2 n log n, so this approach has complexity at best O(n log n).
How do we prove the sum for the
siftDown
approach is indeed O(n)?One method (there are other analyses that also work) is to turn the finite sum into an infinite series and then use Taylor series. We may ignore the first term, which is zero:
If you aren't sure why each of those steps works, here is a justification for the process in words:
Since the infinite sum is exactly n, we conclude that the finite sum is no larger, and is therefore, O(n).
Why does heap sort require O(n log n) time?
If it is possible to run
buildHeap
in linear time, why does heap sort require O(n log n) time? Well, heap sort consists of two stages. First, we callbuildHeap
on the array, which requires O(n) time if implemented optimally. The next stage is to repeatedly delete the largest item in the heap and put it at the end of the array. Because we delete an item from the heap, there is always an open spot just after the end of the heap where we can store the item. So heap sort achieves a sorted order by successively removing the next largest item and putting it into the array starting at the last position and moving towards the front. It is the complexity of this last part that dominates in heap sort. The loop looks likes this:Clearly, the loop runs O(n) times (n - 1 to be precise, the last item is already in place). The complexity of
deleteMax
for a heap is O(log n). It is typically implemented by removing the root (the largest item left in the heap) and replacing it with the last item in the heap, which is a leaf, and therefore one of the smallest items. This new root will almost certainly violate the heap property, so you have to callsiftDown
until you move it back into an acceptable position. This also has the effect of moving the next largest item up to the root. Notice that, in contrast tobuildHeap
where for most of the nodes we are callingsiftDown
from the bottom of the tree, we are now callingsiftDown
from the top of the tree on each iteration! Although the tree is shrinking, it doesn't shrink fast enough: The height of the tree stays constant until you have removed the first half of the nodes (when you clear out the bottom layer completely). Then for the next quarter, the height is h - 1. So the total work for this second stage isNotice the switch: now the zero work case corresponds to a single node and the h work case corresponds to half the nodes. This sum is O(n log n) just like the inefficient version of
buildHeap
that is implemented using siftUp. But in this case, we have no choice since we are trying to sort and we require the next largest item be removed next.In summary, the work for heap sort is the sum of the two stages: O(n) time for buildHeap and O(n log n) to remove each node in order, so the complexity is O(n log n). You can prove (using some ideas from information theory) that for a comparison-based sort, O(n log n) is the best you could hope for anyway, so there's no reason to be disappointed by this or expect heap sort to achieve the O(n) time bound that
buildHeap
does.