Once we solve a given problem, how can we tell people about our solution?
Consider baking; you transmit recipes, detailing how much of ingredient $x$ to add and when.
Consider assembling furniture; you send blueprints for each step, including where screw $y$ goes and when it should be installed.
As in these scenarios, when we solve a computational problem, we want to transmit the solution to others who may make use of it. In computer science, our recipes for success are called algorithms.
To convey algorithms to others, we use pseudocode, which is intended to be read by humans, not executed by computers.
In this class, most pseudocode will bear a striking resemblance to Python, a simple, high-level, interpreted language.
```
def algorithm(x: type, y: type) -> output:
    step 1
    step 2
    step 3
```
We implement algorithms as functions.
In this class we will study the creation, implementation, and analysis of algorithms extensively.
Some questions we will answer:
Problem: Given an unordered array $A$ of elements, how can you tell if $x \in A$?
Solution: Traverse the entire array, checking each element against $x$.
Problem: Now assume $A$ is sorted; create an algorithm to tell if $x \in A$.
```python
from typing import List

def binary_search(x: int, A: List[int]) -> bool:
    if A == []:                    # base case: empty array, x cannot be present
        return False
    elif A[len(A) // 2] == x:      # found x at the midpoint
        return True
    elif A[len(A) // 2] < x:       # x can only be in the right half (exclude the midpoint)
        return binary_search(x, A[len(A) // 2 + 1:])
    else:                          # x can only be in the left half
        return binary_search(x, A[:len(A) // 2])
```
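A couple of illustrative checks (assuming the function above):

```python
A = [1, 3, 5, 7, 9, 11]
print(binary_search(7, A))   # True: 7 sits at the midpoint
print(binary_search(4, A))   # False: 4 falls between 3 and 5
```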
Of our two approaches, which do you think is faster? Why?
In the worst case, linear search has to look at all $n$ elements.
Binary search cuts the search space in half at each call.
How many elements must binary search examine?
About $\log_2 n$.
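Why logarithmic? Each call halves the remaining search space, so after $k$ calls about $n / 2^k$ elements remain; the search bottoms out when one element is left:

$$\frac{n}{2^k} = 1 \Rightarrow 2^k = n \Rightarrow k = \log_2 n$$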
Which would you prefer, $n$ or $\log_2 n$?
We need some way to decide which algorithms are better than others.
What are some ways we can measure runtimes?
The stopwatch approach
Vary your inputs, timing each execution, then plot the results.
Pros: simple to carry out; measures real performance on real inputs.
Cons: results depend on the machine, language, and chosen inputs, so they are hard to generalize or compare.
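A minimal sketch of the stopwatch approach, using `time.perf_counter` from Python's standard library (the function and input sizes here are ours, purely for illustration):

```python
import time

def linear_search(A, key):
    for element in A:            # scan until the key is found
        if element == key:
            return True
    return False

for n in [10_000, 100_000, 1_000_000]:
    A = list(range(n))
    start = time.perf_counter()
    linear_search(A, -1)         # worst case: the key is absent
    elapsed = time.perf_counter() - start
    print(f"n = {n:>9}: {elapsed:.6f} s")
```

The numbers you get depend on your machine and inputs, which is exactly the weakness noted above.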
Instead, we frame runtime as a function of the algorithm's input size.
How? By counting basic operations.
```python
def search(A: List[int], key: int) -> bool:
    for element in A:        # examine each element in turn
        if element == key:
            return True
    return False
```
For each of the $n$ elements, search performs one comparison, so in the worst case $T(n) = n$.
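To make the count concrete, here is a sketch that instruments the loop (the counter is ours, not part of the algorithm):

```python
from typing import List, Tuple

def search_counting(A: List[int], key: int) -> Tuple[bool, int]:
    comparisons = 0
    for element in A:
        comparisons += 1     # one basic operation per element examined
        if element == key:
            return True, comparisons
    return False, comparisons

print(search_counting([5, 2, 8, 1], 9))   # (False, 4): key absent, all n = 4 elements checked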
There are two primary ways to describe an algorithm's asymptotic performance: an upper bound, written $O$, and a lower bound, written $\Omega$. When the two match, we write $\Theta$.
```python
def f1(A: List[int], sum: int) -> bool:
    for x in A:              # try every pair (x, y), including x == y
        for y in A:
            if x + y == sum:
                return True
    return False
```
$O(n^2)$: in the worst case every one of the $n^2$ pairs is checked. $\Omega(1)$: in the best case the very first pair matches.
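Illustrative inputs for the two bounds (any inputs with the same structure behave the same way):

```python
# Best case: the first pair (A[0], A[0]) already hits the target,
# so f1 returns after a single check -- hence Omega(1).
print(f1([3, 7, 1], 6))     # True on the very first iteration (3 + 3 == 6)

# Worst case: no pair sums to the target, so all n^2 pairs are
# checked -- hence O(n^2).
print(f1([1, 2, 3], 100))   # False after 3 * 3 = 9 checks
```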
```python
def f2(x: int) -> int:
    product = 1
    for i in range(1, x):    # multiplies 1 * 2 * ... * (x - 1)
        product *= i
    return product
```
$O(n)$, $\Omega(n)$, and therefore $\Theta(n)$, where $n = x$: the loop always runs $x - 1$ times.
Asymptotic analysis can be thought of as a limit problem: we care about behavior as $n \rightarrow \infty$, where constant factors become irrelevant. $$O(kn^2) = O(n^2) \text{ for any constant } k > 0$$
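One common formalization: $f \in O(g)$ whenever the ratio $f / g$ stays bounded as $n \rightarrow \infty$. For example:

$$\lim_{n \rightarrow \infty} \frac{5n^2 + 3n + 7}{n^2} = 5 < \infty \Rightarrow 5n^2 + 3n + 7 \in O(n^2)$$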
Thus we will normally discuss "classes" of algorithms:
Notation | Name |
---|---|
$O(1)$ | Constant Time |
$O(\log n)$ | Logarithmic |
$O(n)$ | Linear |
$O(n \cdot \log n)$ | Linearithmic |
$O(n^c)$ | Polynomial |
$O(c^n)$ | Exponential |
$O(n!)$ | Factorial |
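To build intuition for how quickly these classes separate, a small sketch (the sizes and functions chosen here are arbitrary):

```python
import math

growth = [
    ("log n",   lambda n: math.log2(n)),
    ("n",       lambda n: float(n)),
    ("n log n", lambda n: n * math.log2(n)),
    ("n^2",     lambda n: float(n ** 2)),
    ("2^n",     lambda n: float(2 ** n)),
]

for n in [10, 20, 30]:
    row = ", ".join(f"{name} = {f(n):,.0f}" for name, f in growth)
    print(f"n = {n}: {row}")
```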
Recursion means breaking a large problem down into smaller instances of the same problem. Compare an iterative sum with a recursive one:
```python
def sum(A: List[int]) -> int:
    total = 0                # iterative: accumulate elements one by one
    for element in A:
        total += element
    return total
```
```python
def sum(A: List[int]) -> int:
    if A == []: return 0               # base case
    else: return A[0] + sum(A[1:])     # recursive call on the rest
```
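Tracing a small call by hand shows how the recursion unwinds:

$$\begin{align} \texttt{sum([3, 1, 2])} &= 3 + \texttt{sum([1, 2])} \\ &= 3 + (1 + \texttt{sum([2])}) \\ &= 3 + (1 + (2 + \texttt{sum([])})) \\ &= 3 + (1 + (2 + 0)) = 6 \end{align}$$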
Every recursive function needs two ingredients: a base case and a recursive call. Which of the following base cases are correct?
```python
def sum(A: List[int]) -> int:
    if len(A) == 2: return A[0] + A[1]   # crashes on arrays of length 0 or 1
    else: return A[0] + sum(A[1:])
```
```python
def sum(A: List[int]) -> int:
    if len(A) == 1: return A[0]          # crashes on the empty array
    else: return A[0] + sum(A[1:])
```
```python
def sum(A: List[int]) -> int:
    if len(A) == 0: return A[0]          # bug: A[0] raises IndexError on an empty array; should return 0
    else: return A[0] + sum(A[1:])
```
```python
def factorial(n: int) -> int:
    if n <= 1: return 1                  # base case: 0! = 1! = 1
    else: return n * factorial(n - 1)    # recursive call
```
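Each call does a constant amount of work and makes exactly one recursive call, which is what the runtime recurrence below captures. A hand trace:

$$\begin{align} \texttt{factorial(4)} &= 4 \cdot \texttt{factorial(3)} \\ &= 4 \cdot (3 \cdot \texttt{factorial(2)}) \\ &= 4 \cdot (3 \cdot (2 \cdot \texttt{factorial(1)})) \\ &= 4 \cdot (3 \cdot (2 \cdot 1)) = 24 \end{align}$$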
Factorial of $n$: $$n! = (n - 1)! \cdot n, \; \forall n > 1; \quad 1! = 0! = 1$$
Runtime of factorial: $$T(n) = T(n - 1) + 1; \quad T(0) = T(1) = 1$$
Again, we specify the base case by saying that computing the $n = 0$ or $n = 1$ case takes 1 operation.
$$ \begin{align} T(n) &= T(n - 1) + 1; \quad T(0) = T(1) = 1 \\ T(n - 1) &= T(n - 2) + 1 \\ T(n) &= (T(n - 2) + 1) + 1 \\ T(n - 2) &= T(n - 3) + 1 \\ T(n) &= ((T(n - 3) + 1) + 1) + 1 = T(n - 3) + 3 \\ &= T(n - k) + k \textit{ (generalize to step k)} \\ n - k &= 0 \textit{ (solve for the base case)} \\ k &= n \\ T(n) &= T(0) + n = n + 1 \Rightarrow O(n) \end{align} $$
Unrolling a recurrence that does linear work at each level, such as $T(n) = n + T(n - 1)$, yields a familiar sum: $$ \begin{align} T(n) &= n + T(n - 1) \\ &= n + (n - 1 + T(n - 2)) \\ &= n + (n - 1 + (n - 2 + T(n - 3))) \\ &= n + (n - 1 + (n - 2 + (n - 3 + T(n - 4)))) \\ &= n + (n - 1 + (n - 2 + (n - 3 + (n - 4 + (\dots + 1))))) \\ &= \sum_{i = 1}^n i = \frac{n(n + 1)}{2} \Rightarrow O(n^2) \end{align} $$
A recurrence that makes two recursive calls per step grows exponentially: $$ \begin{align} T(n) &= 2T(n - 1) + 3; \quad T(1) = 1 \\ &= 2(2T(n - 2) + 3) + 3 \\ &= 2(2(2T(n - 3) + 3) + 3) + 3 \\ &= 2(2^2 T(n - 3) + 2 \cdot 3 + 3) + 3 \\ &= 2^3 T(n - 3) + 2^2 \cdot 3 + 2 \cdot 3 + 3 \\ &= 2^k T(n - k) + \sum_{i = 0}^{k - 1} 2^i \cdot 3 \textit{ (set k = n - 1 to reach the base case)} \\ T(n) &= 2^{n - 1} + \sum_{i = 0}^{n - 2} 2^i \cdot 3 \\ &= 2^{n - 1} + 3(2^{n - 1} - 1) = O(2^n) \end{align} $$