2.6. Analysis of Array and Vector Operators¶
As we know, vectors use contiguous storage locations in an underlying (typically larger) array. Both array and vector elements can be accessed and traversed with the help of iterators, and they can also be accessed randomly using indexes.
However, unlike basic arrays, vectors have a dynamic size meaning that whenever a new element is inserted or deleted, their size changes automatically. A new element can be inserted into or deleted from any part of a vector, and automatic reallocation for other existing items in the vector is applied. Nevertheless, computing time for insertion and deletion might differ depending on the location of the item, and how many items need to be reallocated. For example, the last item in a vector is typically removed at a constant time, because no resizing of the vector is typically needed for this operation, while an item is removed or inserted into the beginning or the middle of a vector at a linear time because all of the remaining items to the right of that element must be shifted.
Two common operations for both arrays and vectors are indexing and assigning to an index position that already exists. Both of these operations take the same amount of time no matter how large the array or vector is. When an operation like this is independent of the size of the array or vector they are \(O(1)\).
Although not possible with basic arrays, a common programming technique is growing a vector.
As we have seen, one
way to create a longer vector is to use the push_back()
method.
The push_back()
method is typically \(O(1)\), provided
there is adequate capacity in the underlying array.
First we’ll use push_back()
method.
Listing 3 shows the code for
making our vector.
Listing 3
#include <iostream>
#include <vector>
using namespace std;
void test1(int num){
vector <int> vect;
vect.reserve(num);
for (int i = 0; i < num; i++){
vect.push_back(i);
}
}
int main() {
test1(1000);
}
To capture the time it takes for each of our functions to execute we
will use C++’s ctime
module. The ctime
module is designed
to allow C++ developers to make cross-platform timing measurements by
running functions in a consistent environment and using timing
mechanisms that are as similar as possible across operating systems.
To use ctime
you create two clock
objects. The first clock object marks
the current start time; the second clock object marks the current time after
the function runs a set number of times (the end time). To get the total runtime,
you subtract the start time from the end time to get the elapsed time.
The following session shows how long it takes to run each
of our test functions 10,000 times within a for
loop.
In the experiment above the statement that we are timing is the function
call to test1
. From the experiment, we see the amount of time taken by the push_back operation.
We can improve the runtime a bit further by setting an adequate reserve for the vector in advance. Doing this will keep us from having to move the entire vector to an adequately sized space in memory.
Now that we have seen how performance can be measured concretely you can
look at Table 2 to see the Big-O efficiency of all the
basic vector operations. When pop_back()
is called, the element
at the end of the vector is removed and it typically takes
\(O(1)\) but when erase()
is called on the first element in the vector
or anywhere in the middle it is \(O(n)\). The reason for this lies
in how C++ chooses to implement vectors. When an item is taken from the
front of the vector, in C++ implementation, all the other elements in
the vector are shifted one position closer to the beginning. This may seem
silly to you now, but if you look at Table 2 you will see
that this implementation also allows the index operation to be
\(O(1)\). This is a tradeoff that the C++ implementers thought
was a good one.
Operation | Big-O Efficiency |
---|---|
index [] | O(1) |
index assignment = | O(1) |
push_back() | typically O(1) |
pop_back() | O(1) |
erase(i) | O(n) |
insert(i, item) | O(n) |
find(srt, stp, item) | O(log n) or O(n) |
reserve() | O(1) |
begin() | O(1) |
end() | O(1) |
size() | O(1) |
The push_back() operation is \(O(1)\) unless there is inadequate capacity, in which case the entire vector is moved to a larger contiguous underlying array, which is \(O(n)\).
Note that the vector class provides a find command which can determine whether a given item is in the vector. It is is \(O(log n)\) if the vector is sorted and is \(O(n)\) otherwise. We will explain why this is in Chapter 3.
As a way of demonstrating the difference in performance between push_back
and insert, let’s do
another experiment using the ctime
module. Our goal is to be able
to verify the performance of the pop_back()
operation on a vector of a known
size when the program pops from the end of the vector using pop_back()
, and again when the
program pops from the beginning of the vector using erase()
. We will also want to
measure this time for vectors of different sizes. What we would expect to
see is that the time required to pop from the end of the vector will stay
constant even as the vector grows in size, while the time to pop from the
beginning of the vector will continue to increase as the vector grows.
Listing 4 shows one attempt to measure the difference
between the pop_back()
and erase()
.
There are a couple of things to notice about Listing 4.
This approach allows us to time just the single pop_back()
statement
and get the most accurate measure of the time for that single operation.
Because the timer repeats 10,000 times it is also important to point out
that the vector is decreasing in size by 1 each time through the loop.
Listing 4