Algorithms

Motivation

Why we need algorithms, and what they are.

Algorithms: Why

Once we solve a given problem, how can we tell people about our solution?

Consider baking; you transmit recipes, detailing how much of ingredient $x$ to add at what time.

Consider assembling furniture; you send blueprints for each step, including where screw $y$ goes and when it should be installed.

As in these scenarios, when we solve a computational problem, we want to transmit the solution to others who may make use of it. In computer science, our recipes for success are called algorithms.

Algorithms: How

To convey algorithms to others, we use pseudocode, which is intended to be read by humans, not executed by computers.

In this class, most pseudocode will bear a striking resemblance to Python, a simple, high-level, interpreted language.


              def algorithm(x: type, y: type) -> output:
                step 1
                step 2
                step 3
            

Algorithms: Implementing

We implement algorithms as functions.

Conclusions

In this class we will study the creation, implementation, and analysis of algorithms extensively.

Some questions we will answer:

  • How fast can we sort things?
  • How fast can we search for substrings in a given text?
  • How fast can we determine if an element is present in a given collection?

Searching

How to find things

Searching: Unordered Arrays of Integers

Problem: Given an unordered array $A$ of elements, how can you tell if $x \in A$?

Solution: Traverse the entirety of the array, and check each element.
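That traversal can be sketched directly (a minimal version; the name `linear_search` is my own):

```python
from typing import List

def linear_search(x: int, A: List[int]) -> bool:
    # Check every element; the worst case examines all n of them
    for element in A:
        if element == x:
            return True
    return False
```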

Searching: Ordered Arrays of Integers

Problem: Assume $A$ is now sorted; create an algorithm to tell if $x \in A$.

            
              def binary_search(x: int, A: List[int]) -> bool:
                if A == []:
                  return False
                mid = len(A) // 2
                if A[mid] == x:
                  return True
                elif A[mid] < x:
                  # exclude the middle element, or the right half never shrinks
                  return binary_search(x, A[mid + 1:])
                else:
                  return binary_search(x, A[:mid])
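The slicing version above copies part of the array at every call, and each copy itself takes linear time. A sketch that tracks indices into the original array instead (the function and variable names here are my own):

```python
from typing import List

def binary_search_idx(x: int, A: List[int]) -> bool:
    lo, hi = 0, len(A)  # search within A[lo:hi]
    while lo < hi:
        mid = (lo + hi) // 2
        if A[mid] == x:
            return True
        elif A[mid] < x:
            lo = mid + 1  # discard the left half and the middle
        else:
            hi = mid      # discard the middle and the right half
    return False
```

This keeps each step at constant cost, so only the number of halvings matters.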
            
          

Comparing our two approaches

Of our previous two approaches, which do you think is faster? Why?

Linear search has to look at all $n$ elements.

Binary search cuts the search space in half at each call.

How many elements must binary search examine?

$\log_2 n$

Which would you prefer: $n$ or $\log_2 n$?
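To make that concrete, here are the worst-case counts for a few sizes, rounding the logarithm up:

```python
import math

# Worst-case elements examined: linear search vs binary search
for n in [10, 1_000, 1_000_000]:
    print(n, math.ceil(math.log2(n)))
# 10 -> 4, 1,000 -> 10, 1,000,000 -> 20
```

A million elements need only about 20 comparisons with binary search.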

Algorithm Analysis

How to prove one algorithm is faster than another

Motivation

We need some way to decide which algorithms are better than others.

What are some ways we can measure runtimes?

  • Empirically
  • Analytically

Empirical Analysis

The stopwatch approach

Vary your inputs, timing each execution, then plot the results.
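A minimal version of the stopwatch approach, timing linear search on growing inputs (the sizes and the timed function are arbitrary choices for illustration):

```python
import time
from typing import List

def search(A: List[int], key: int) -> bool:
    for element in A:
        if element == key:
            return True
    return False

for n in [1_000, 10_000, 100_000]:
    A = list(range(n))
    start = time.perf_counter()
    search(A, -1)  # worst case: key absent, every element checked
    elapsed = time.perf_counter() - start
    print(f"n={n}: {elapsed:.6f}s")
```

Plotting `elapsed` against `n` would show the roughly linear trend.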

Pros:

  • Easy
  • Quick
  • Good for comparing similar algorithms

Cons:

  • Hardware dependent
  • Doesn't generalize
  • Have to implement the algorithm
  • Can take months to exhaustively time
  • Need input data

Asymptotic Analysis

Framing runtime as a function of an algorithm's inputs

How? Counting basic operations


            def search(A: List[int], key: int) -> bool:
              for element in A:
                if element == key:
                  return True
              return False
          

For each of the $n$ elements, search performs 1 comparison, so in the worst case $T(n) = n$.
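One way to confirm that count is to instrument the loop with a counter (a throwaway variant; the name `search_counting` is mine):

```python
from typing import List, Tuple

def search_counting(A: List[int], key: int) -> Tuple[bool, int]:
    comparisons = 0
    for element in A:
        comparisons += 1
        if element == key:
            return True, comparisons
    return False, comparisons

# When the key is absent, all n elements are compared: T(n) = n
found, count = search_counting(list(range(100)), -1)
# found == False, count == 100
```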

Asymptotic Analysis: Bounds

There are two primary ways to describe an algorithm's asymptotic performance:

  • Upper Bound: the algorithm can do no worse than this
  • Lower Bound: the algorithm can do no better than this

Big-Oh, Omega, and Theta

  • $O(?)$
    • Pronounced "Big-Oh"
    • Represents the upper bound
    • May not always be tight
  • $\Omega(?)$
    • Pronounced "Big-Omega"
    • Represents the lower bound
    • May not always be tight
  • $\Theta(?)$
    • Pronounced "Big-Theta"
    • Used when $O(T(n)) = \Omega(T(n))$ for a given algorithm
    • Is the strongest promise, since the upper and lower bounds must be equal

$O(?)$, $\Omega(?)$, $\Theta(?)$


            def f1(A: List[int], sum: int) -> bool:
              for x in A:
                for y in A:
                  if x + y == sum:
                    return True
              return False
          

$O(n^2)$ (two nested loops over $A$), $\Omega(1)$ (the very first pair checked may already hit the target sum)

$O(?)$, $\Omega(?)$, $\Theta(?)$


            def f2(x: int) -> int:
              product = 1
              for i in range(1, x):
                product *= i
              return product
          

$O(n), \Omega(n), \Theta(n)$: the loop always runs $x - 1$ iterations regardless of the input's value, so the upper and lower bounds meet.

Visual Growth

Dominant Terms

Asymptotic analysis can be thought of as a limit problem. $$O(kn^2) \rightarrow \lim_{n \rightarrow \infty} O(kn^2) = O(n^2)$$ As $n \rightarrow \infty$, we know that the constant $k$ becomes irrelevant.

Thus we will normally discuss "classes" of algorithms.
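A quick numeric check that constants and lower-order terms stop mattering (the coefficients 5 and 100 are arbitrary):

```python
# T(n) = 5n^2 + 100n: the ratio to n^2 approaches the constant 5
for n in [10, 100, 10_000]:
    t = 5 * n * n + 100 * n
    print(n, t / (n * n))
# 10 -> 15.0, 100 -> 6.0, 10,000 -> 5.01
```

The 100n term dominates the picture at n = 10 but is noise by n = 10,000.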

Ordering Algorithms by Speed

Notation Name
$O(1)$ Constant Time
$O(\log n)$ Logarithmic
$O(n)$ Linear
$O(n \cdot \log n)$ Linearithmic
$O(n^c)$ Polynomial
$O(c^n)$ Exponential
$O(n!)$ Factorial
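Evaluating each class at a small input makes the ordering vivid (here $n = 20$ and $c = 2$, both arbitrary picks):

```python
import math

n, c = 20, 2
rows = [
    ("O(1)", 1),
    ("O(log n)", math.log2(n)),
    ("O(n)", n),
    ("O(n log n)", n * math.log2(n)),
    ("O(n^c)", n ** c),
    ("O(c^n)", c ** n),          # 2^20 = 1,048,576
    ("O(n!)", math.factorial(n)),
]
for name, value in rows:
    print(f"{name:>10}  {value:,.0f}")
```

Even at n = 20, the factorial row dwarfs everything above it.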

Recursion

How we can analyze the runtime of recursive programs

Recursive Problem Solving

Breaking a large problem down into smaller instances of the same problem.


            def sum(A: List[int]) -> int:
              sum = 0
              for element in A:
                sum += element
              return sum

            def sum(A: List[int]) -> int:
              if A == []: return 0
              else: return A[0] + sum(A[1:])
          

Base Case; Recursive Call

Select the best base case


            def sum(A: List[int]) -> int:
              if len(A) == 2: return A[0] + A[1]
              else: return A[0] + sum(A[1:])

            def sum(A: List[int]) -> int:
              if len(A) == 1: return A[0]
              else: return A[0] + sum(A[1:])
            
            def sum(A: List[int]) -> int:
              if len(A) == 0: return 0
              else: return A[0] + sum(A[1:])
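Trying the edge inputs shows which base case covers everything. Here are two self-contained variants (the `_v2`/`_v3` names are mine): a singleton base case cannot handle the empty list, while an empty-list base case returning 0 handles every input:

```python
from typing import List

def sum_v2(A: List[int]) -> int:
    # Base case: singleton list; crashes on []
    if len(A) == 1: return A[0]
    else: return A[0] + sum_v2(A[1:])

def sum_v3(A: List[int]) -> int:
    # Base case: empty list; handles every input
    if len(A) == 0: return 0
    else: return A[0] + sum_v3(A[1:])

try:
    sum_v2([])  # A[0] on an empty list raises IndexError
except IndexError:
    pass
```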
          

Recursive Definitions


            def factorial(n: int) -> int:
              if n <= 1: return 1
              else: return n * factorial(n - 1)
          

Factorial of $n$: $$n! = (n - 1)! \cdot n, \ \forall n > 1; \quad 1! = 0! = 1$$

Recurrence Relations

Runtime of factorial: $$T(n) = T(n - 1) + 1; \quad T(0) = T(1) = 1$$

Again, we specify the base case by saying that computing the $n = 0$ or $n = 1$ case takes 1 operation.

Unrolling a Recurrence Relation

$$ \begin{align} T(n) &= T(n - 1) + 1; T(0) = T(1) = 1 \\ T(n - 1) &= T(n - 2) + 1 \\ T(n) &= (T(n - 2) + 1) + 1 \\ T(n - 2) &= T(n - 3) + 1 \\ T(n) &= ((T(n - 3) + 1) + 1) + 1 \Rightarrow T(n - 3) + 3 \\ &= T(n - k) + k \textit{ (Substitute k)} \\ n - k &= 0 \textit{ (Solve the base case)} \\ n &= k \\ T(n) &= T(n - n) + n \Rightarrow n \Rightarrow O(n) \end{align} $$
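The unrolled answer can be sanity-checked by computing the recurrence directly (a quick numeric check, not a proof):

```python
def T(n: int) -> int:
    # T(n) = T(n - 1) + 1, with T(0) = T(1) = 1
    if n <= 1:
        return 1
    return T(n - 1) + 1

# Matches the unrolling: T(n) = n for every n >= 1
```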

Unrolling a Recurrence Relation: 2

$$ \begin{align} T(n) &= n + T(n - 1); T(1) = 1 \\ &= n + (n - 1 + T(n - 2)) \\ &= n + (n - 1 + (n - 2 + T(n - 3))) \\ &= n + (n - 1 + (n - 2 + (n - 3 + T(n - 4)))) \\ &= n + (n - 1 + (n - 2 + (n - 3 + (n - 4 + (\dots + 1))))) \\ &= \sum_{i = 1}^n i \Rightarrow \frac{n(n + 1)}{2} \end{align} $$

Unrolling a Recurrence Relation: 3

$$ \begin{align} T(n) &= 2T(n - 1) + 3; T(1) = 1 \\ &= 2(2T(n - 2) + 3) + 3 \\ &= 2(2(2T(n - 3) + 3) + 3) + 3 \\ &= 2(2^2 T(n - 3) + 2 \cdot 3 + 3) + 3 \\ &= 2^3 T(n - 3) + 2^2 \cdot 3 + 2 \cdot 3 + 3 \\ &= 2^k T(n - k) + \sum_{i = 0}^{k - 1} 2^i \cdot 3 \\ T(n) &= 2^{n - 1} + \sum_{i = 0}^{n - 2} 2^i \cdot 3 \\ T(n) &= 2^{n - 1} + 3 (2^{n - 1} - 1) = O(2^n) \end{align} $$
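Again, the closed form can be sanity-checked numerically:

```python
def T(n: int) -> int:
    # T(n) = 2 T(n - 1) + 3, with T(1) = 1
    if n == 1:
        return 1
    return 2 * T(n - 1) + 3

def closed(n: int) -> int:
    # The form derived above: 2^(n-1) + 3 (2^(n-1) - 1)
    return 2 ** (n - 1) + 3 * (2 ** (n - 1) - 1)
```

The two agree for every small n, supporting the $O(2^n)$ conclusion.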

Summation Identities