## It can be intimidating — it doesn’t need to be

Published Jan 13, 2020 · 10 min read

As an instructor of front-end engineering, I’m often faced with the same question from junior developers when it comes to big O notation:

“Can’t you get along as an engineer without knowing these things? Especially a frontend engineer?”

Sure — there’s no stopping anyone from brute-forcing their way to a solution.

“But isn’t making it work the most important thing?”

Fair enough. Working is the first (and sometimes only) step, depending on the size of the data set or your deadline for shipping a product.

But what happens when you go from working with a collection of 100,000 elements to 100 million? Or 10 billion? When it comes to scalability, these things matter.

The fact is you can write code that works without knowing the ins and outs of big O notation. But if you have no idea what big O is, it’s difficult to tell whether it’s something you should even be considering when you’re creating an algorithm. In my opinion, dismissing computer science fundamentals that can help you make decisions and write better code is foolish at best and disastrous at worst.

Big O notation is the language we use for talking about how long an algorithm takes to run (time complexity) or how much memory an algorithm uses (space complexity). Big O notation can express the best, worst, and average-case running time of an algorithm. For our purposes, we’re going to focus primarily on big O as it relates to time complexity.

As a software engineer, you’ll find that most discussions of big O focus on the upper-bound run time of an algorithm, which is often termed the worst case. An important thing to note is that the running time when using big O notation does not directly equate to time as we know it (e.g. seconds, milliseconds, microseconds, etc.). Analysis of running times does not take certain things into account, such as the processor, the language, or the run-time environment. Instead, we can think of time as the number of operations or steps it takes to complete a problem of size *n*. In other words, big O notation is a way to track how quickly the run time grows relative to the size of the input.

When we’re thinking about the worst-case, the question becomes this: What are the *most* operations/steps that *could* happen for an input of size *n*?

`O(1)` means that an algorithm takes a constant amount of time to run, regardless of the size of the input.

Bookmarks are a great example of how constant time plays out in the real world. A bookmark lets you find the last page you read quickly and efficiently. It doesn’t matter whether the book has 30 pages or 1,000 pages: as long as you’re using a bookmark, you will find that last page in a single step.

In programming, a lot of operations are constant. Here are some examples:

- math operations
- accessing an array via the index
- accessing a hash via the key
- pushing and popping on a stack
- insertion and removal from a queue
- returning a value from a function
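A few of these in JavaScript, for illustration (the values here are arbitrary):

```javascript
const nums = [10, 20, 30];
const lookup = { a: 1, b: 2 };

// Each line completes in a fixed number of steps, no matter how
// large the array or object grows.
const sum = 2 + 2;      // math operation: O(1)
const first = nums[0];  // accessing an array via the index: O(1)
const value = lookup.a; // accessing a hash via the key: O(1)
nums.push(40);          // pushing onto a stack: O(1)
nums.pop();             // popping off a stack: O(1)
```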

Take a look at `findFirstIndex`, listed below. Passing `smallCollection` or `giganticCollection` will produce the same run-time of `O(1)` when accessing the `0` index. The return of `firstIndex` is also a constant-time operation. Regardless of the size of *n*, both of these operations will take a constant amount of time.
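A minimal sketch of what `findFirstIndex` might look like (the exact implementation and sample collections are assumptions, since the original snippet isn’t shown):

```javascript
const smallCollection = [4, 8, 15, 16];
// giganticCollection would hold many millions of elements in the
// article's example; a smaller stand-in keeps this sketch runnable.
const giganticCollection = new Array(1_000_000).fill(42);

function findFirstIndex(collection) {
  const firstIndex = collection[0]; // index access: O(1)
  return firstIndex;                // returning a value: O(1)
}

findFirstIndex(smallCollection);    // one step
findFirstIndex(giganticCollection); // still one step
```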

`O(n)` means that the run time increases at the same pace as the size of the input.

The act of reading a book is an example of how linear time plays out in the real world. Let’s assume that it takes me exactly one minute to read a single page of a large-print book. Given that, a book that has 30 pages will take me 30 minutes to read. Likewise, a book that has 1,000 pages will mean 1,000 minutes of reading time. Now, I don’t force myself to finish books that aren’t great, so there is always a chance that I won’t get through that 1,000-page book. However, before I start reading it, I know that my worst-case reading time is 1,000 minutes.

In programming, one of the most common linear-time operations is traversing an array. In JavaScript, methods like `forEach`, `map`, and `reduce` run through the entire collection of data, from start to finish.

Take a look at our `printAllValues` function below. The number of operations it takes to loop through *n* is directly related to the size of *n*. Generally speaking (but not always), seeing a loop is a good indicator that the particular chunk of code you’re examining has a run time of `O(n)`.
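A sketch of `printAllValues`, assuming the missing original simply logs each element in turn:

```javascript
// One log per element, so the number of operations grows in step
// with the size of the input: O(n).
function printAllValues(collection) {
  collection.forEach((value) => {
    console.log(value);
  });
}

printAllValues([1, 2, 3]); // 3 elements, 3 operations
```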

But what about the method `find`? Since it doesn’t always run through the entire collection, is it actually linear? In the example below, the first value that is less than three is at index `0` of the collection. Why wouldn’t this be constant time?

Remember, since we’re looking for the worst-case scenario, we must assume that the input will not be ideal and that the element or value that we seek could be the last value in the array. In the second scenario (below), you’ll see just that. Finding the number that is less than three, in the least ideal situation, would require iterating through the entire array.
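The two scenarios might look like this (the arrays are assumptions, since the original examples aren’t shown):

```javascript
const bestCase = [1, 5, 6, 7];  // a value less than three sits at index 0
const worstCase = [7, 6, 5, 1]; // the only such value is at the last index

// find stops at the first match, but in the worst case it still
// has to scan every element, so it's O(n).
const bestResult = bestCase.find((n) => n < 3);   // found after one check
const worstResult = worstCase.find((n) => n < 3); // found after four checks
```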

`O(n²)` means that an algorithm runs in quadratic time: the number of operations grows with the square of the size of the input.

In programming, many of the more basic sorting algorithms have a worst-case run time of `O(n²)`:

- Bubble Sort
- Insertion Sort
- Selection Sort

Let’s look at `countOperations` below. Here we have two nested loops that increment the `operations` variable after each iteration. If `smallCollection` is our *n*, we will end up with a count of 16 operations. Not horrible. But what if *n* is our `giganticCollection`? A billion times a billion is a quintillion, or 1,000,000,000,000,000,000. Yikes. That’s a LOT of operations. Even an array with as few as 1,000 elements would end up creating a million operations.
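A sketch of `countOperations`, assuming the two nested loops the text describes and a four-element `smallCollection` (so that 4 × 4 = 16):

```javascript
function countOperations(collection) {
  let operations = 0;
  for (let i = 0; i < collection.length; i++) {
    for (let j = 0; j < collection.length; j++) {
      operations += 1; // runs once per (i, j) pair: n * n times
    }
  }
  return operations;
}

const smallCollection = [1, 2, 3, 4];
countOperations(smallCollection); // 4 * 4 = 16 operations
```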

Generally speaking (but not always), seeing two nested loops is a good indicator that the piece of code you’re looking at has a run time of `O(n²)`. Likewise, three nested loops would indicate a run time of `O(n³)`!

`O(log n)` means that the running time grows in proportion to the logarithm of the input size. This means that the run time barely increases even as you exponentially increase the input.

Finding a word in a physical dictionary by halving my sample size is an excellent example of how logarithmic time works in the real world. For instance, when looking for the word “senior,” I could open the dictionary precisely in the middle, at which point I could determine whether the words that begin with “s” come before or after the words that I’m currently viewing. Once I determine that “s” is in the second half of the book, I can dismiss all of the pages in the first half. I then repeat the same process. By following this algorithm to the end, I would cut the number of pages I must search through in half each time until I find the word.

In programming, this act of searching through a physical dictionary is an example of a binary search operation — the most typical example used when discussing logarithmic run times.
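A minimal binary search sketch (not from the article’s original code) that mirrors the dictionary strategy above:

```javascript
// Each pass discards half of the remaining sorted array, so the
// number of steps grows with log2(n).
function binarySearch(sorted, target) {
  let low = 0;
  let high = sorted.length - 1;
  while (low <= high) {
    const mid = Math.floor((low + high) / 2);
    if (sorted[mid] === target) return mid;
    if (sorted[mid] < target) low = mid + 1; // dismiss the first half
    else high = mid - 1;                     // dismiss the second half
  }
  return -1; // not found
}

binarySearch([1, 3, 5, 7, 9], 7); // index 3, found in two steps
```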

Let’s take a look at a modified version of our `countOperations` function. Note that *n* is now a number: *n* could be an input (a number) or the size of an input (the length of an array).
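A sketch of the modified `countOperations`, assuming it halves *n* on every pass (which is what produces logarithmic growth):

```javascript
function countOperations(n) {
  let operations = 0;
  while (n >= 1) {
    operations += 1;
    n = n / 2; // discard half of the remaining input each step
  }
  return operations;
}

countOperations(2000); // 11 operations
```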

In the example above, we end up with 11 operations if *n* = 2,000. If *n* = 4,000, we end up with 12 operations. Every time we double *n*, the number of operations only increases by one. Algorithms that run in logarithmic time have big implications when it comes to larger inputs. Using our example, `O(log(7))` would return three operations, while `O(log(1000000))` would return only 20 operations!

*Note: **O(n log n)**, which is often confused with **O(log n)**, means that the running time of an algorithm is linearithmic, which is a combination of linear and logarithmic complexity. Sorting algorithms that utilize a divide-and-conquer strategy are linearithmic, such as the following:*

- *merge sort*
- *timsort*
- *heapsort*

*When looking at time complexity, **O(n log n)** lands between **O(n²)** and **O(n)**.*

One of my favorite sites to reference for big O is the Big O cheat sheet. As you can see from its chart, other run times have pretty horrible time complexity, like `O(2^n)` and `O(n!)`.

Algorithms that run in `O(2^n)` or `O(n!)` don’t fare well with anything but tiny data sets. `O(2^n)`, or exponential time, doubles the running time with each addition to the input. `O(n!)`, or factorial time, is even worse: any time *n* increases by 1, the running time increases by a factor of *n*.

It’s important to note that there is no fixed list of run times. However, the ones that we’ve covered in this post are some of the most common.

Thus far, we’ve only focused on talking about big O by isolating a few lines of code at a time. How do we calculate big O if we are dealing with an algorithm that has several parts?

## Drop the constants

Let’s start with `logEverythingTwice`.
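A sketch of `logEverythingTwice`, assuming two back-to-back loops over the same array (the original snippet isn’t shown):

```javascript
function logEverythingTwice(items) {
  for (let i = 0; i < items.length; i++) {
    console.log(items[i]); // first pass: n operations
  }
  for (let i = 0; i < items.length; i++) {
    console.log(items[i]); // second pass: another n operations
  }
}

logEverythingTwice([1, 2, 3]); // 2n = 6 logs in total
```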

Since big O is concerned with how quickly our run-time grows, the first rule that you want to remember is to drop any constants when you analyze an algorithm.

In the example above, we have two separate loops that iterate through the length of an array (linear). Each loop logs an item in the collection (constant). Since we don’t concern ourselves with operations that run in constant time (they make little difference in the overall approximation), we’re only going to take the two loops into account and add them together, which gives us `O(2n)`. Since the number two is also a constant in this case, we throw it out and call this `O(n)`.

But what if these loops were nested?
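A sketch of the nested version, assuming a five-element `items` array (which matches the log output shown further down):

```javascript
const items = [1, 2, 3, 4, 5];

function logEverythingNested(items) {
  for (let i = 0; i < items.length; i++) {
    for (let j = 0; j < items.length; j++) {
      console.log(items[i]); // each items[i] is logged n times
    }
  }
}

logEverythingNested(items); // 5 * 5 = 25 logs
```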

By nesting these loops, we are no longer logging `items[i]` twice but five times! In this case, we multiply `O(n) * O(n)` instead of adding the run times together. We do this because the execution of our log is dependent on iterating through the entirety of the second loop (logging `items[i]` five times) before we can increment *i* and move to the next index via our first loop. Here’s what our log ends up returning:

```
1
1
1
1
1
2
2
2
2
2
3
3
...
```

Just like in the first example, we still want to drop the constants from the logs. In the end, `O(n * n)` gives us `O(n²)`.

Your takeaway should not be that you *always* avoid nested loops due to performance implications. You will undoubtedly come across situations that warrant such a solution (tiny datasets, manipulating multi-dimensional arrays, etc.). However, you should be aware that there may be performance implications, depending on what you are doing.

## Drop the non-dominant terms

Let’s take a look at another example.
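A sketch of `printMultiplesThenSum`, assuming it pairs a nested-loop section with a single loop (the original snippet isn’t shown, so the exact body is a guess):

```javascript
function printMultiplesThenSum(numbers) {
  // Nested loops: O(n²)
  for (let i = 0; i < numbers.length; i++) {
    for (let j = 0; j < numbers.length; j++) {
      console.log(numbers[i] * numbers[j]);
    }
  }

  // Single loop: O(n)
  let sum = 0;
  for (let i = 0; i < numbers.length; i++) {
    sum += numbers[i];
  }
  return sum;
}
```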

After dropping the constants in `printMultiplesThenSum`, we can see that the big O notation for this function would be `O(n² + n)`. Since big O is also not concerned with non-dominant terms, we drop the *n* (quadratic wins since it is worse than linear time). Throwing out non-dominant terms is the second rule to follow when analyzing the run time of an algorithm. In the end, `O(n² + n)` gives us `O(n²)`.

As mentioned earlier, you can write code without understanding the ins and outs of big O. You should also be aware that it’s an approximation: the run time you calculate by hand may not play out in production, where performance is affected by things like environments, processors, and languages.

However, this shouldn’t stop you from learning big O. Having a good understanding of big O notation gives you more context about what does and does not matter when designing algorithms. Who wouldn’t want that?