Recursion

Recursive Functions

Definition 1 A function is recursive if it is recursive.

Definition 2 A function is recursive if it refers to itself.

It is probably easier to see some examples.
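For instance, here is a minimal sketch of a recursive function: factorial refers to itself on a smaller input, and stops recursing at a base case.

def factorial(n):
    """
    compute n! = n * (n-1) * ... * 1 recursively
    """
    # base case: 0! is defined to be 1
    if n == 0:
        return 1
    # recursive case: n! = n * (n-1)!
    return n * factorial(n - 1)

factorial(5)
120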

Divide and Conquer

A common situation in which recursion is used is in Divide and Conquer algorithms.

The basic idea is to break a problem up into two (or more) sub-problems, solve each sub-problem, and then combine the answers to get the final solution.

Finding the maximum element of a list

One way to find the maximum element of a list A is to find the maximum element of the first half of the list, find the maximum element of the second half, and then take the larger of these two values.

How do we find the maximum element of each half? We can apply the same function recursively, breaking the problem into smaller and smaller pieces.

At some point, the pieces will be trivial to solve. In this case, we don’t recurse further; we just solve the problem explicitly. Here, we’ll handle lists of length 3 or less directly using the built-in max function:

max(1,2)
2
def maximum(A):
    """
    find the maximum element of a list A recursively
    """
    # check if the list is short enough to do explicitly
    if len(A) < 4:
        return max(A)

    # else, we will use recursion to solve the problem
    n = len(A) // 2 # partition the list into two halves here
    max1 = maximum(A[:n]) # left half
    max2 = maximum(A[n:]) # right half
    
    return max(max1, max2)
    
A = list(range(100))
maximum(A)
99

Mergesort

Mergesort is an example of a sorting algorithm. The input is a list a, and the output is a list b that has the same elements as a, but in sorted order.

Python has the sorted function built-in to sort iterables like lists:

a = [2,3,1]
sorted(a)
[1, 2, 3]

There are a variety of sorting algorithms which use different techniques.

Mergesort uses the observation that if two arrays a1 and a2 are already sorted, then it is easy to merge them into a single sorted array in a single loop:

def merge(a1, a2):
    """
    merge sorted lists a1 and a2 into a sorted list b
    """
    b = []
    i1 = 0
    i2 = 0
    # insert the smallest element into b
    while (i1 < len(a1) and i2 < len(a2)):
        if a1[i1] < a2[i2]:
            b.append(a1[i1])
            i1 = i1 + 1
        else:
            b.append(a2[i2])
            i2 = i2 + 1
            
    # at most one of the following while-loops will do anything
    while (i1 < len(a1)):
        b.append(a1[i1])
        i1 = i1 + 1
        
    while (i2 < len(a2)):
        b.append(a2[i2])
        i2 = i2 + 1
        
    return b

merge([1,3,4], [0,2,5])
[0, 1, 2, 3, 4, 5]

The divide-and-conquer strategy is to start with an input list a, sort the left and right halves, and then merge the two halves to sort a. We can employ recursion by using mergesort to sort each half. By definition, a list with one or no elements is already sorted.

def mergesort(a):
    """
    sort a using merge-sort algorithm
    """
    # if a has 1 or zero elements, it is already sorted
    if len(a) < 2:
        return a
    k = len(a) // 2
    L = a[:k] # left half
    R = a[k:] # right half
    
    # recurse to sort L and R
    L = mergesort(L)
    R = mergesort(R)
    
    # now merge L and R
    a = merge(L, R)
    return a
    
    
mergesort([6,5,4,3,2,1,0])
[0, 1, 2, 3, 4, 5, 6]

Quicksort

Quicksort was named by SIAM editors as one of the top-10 algorithms of the 20th century. Like mergesort, it also uses a divide and conquer strategy.

Quicksort works by partitioning a list into two parts around a pivot. All elements less than the pivot are moved to the first part of the list, and all elements greater than the pivot are moved to the second part.

def swap(a, i, j):
    """
    swap elements i and j in-place in list a
    """
    a[i], a[j] = a[j], a[i]

a = [1,2,3]
print(a)
swap(a, 0, 2)
print(a)
[1, 2, 3]
[3, 2, 1]
from numpy.random import randint

def partition(a, lo, hi):
    """
    choose a pivot in a[lo:hi+1] randomly
    swap all elements of a[lo:hi+1] less than the pivot value to appear before the pivot
    swap all elements of a[lo:hi+1] greater than the pivot value to appear after the pivot
    """
    pi = randint(lo, hi+1) # pivot index (numpy's randint excludes the upper bound, so this is in [lo, hi])
    swap(a, pi, hi) # move the pivot value to the last position
    pivot = a[hi]
    i = lo # a[lo:i] holds the elements seen so far that are less than the pivot
    for j in range(lo, hi+1):
        if a[j] < pivot:
            swap(a, i, j)
            i = i + 1 # grow the less-than region
    swap(a, i, hi) # put pivot in its correct place
    return i
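As a quick check (the list below is just an illustrative input), partition returns the final index of the pivot; everything before that index is less than the pivot, and everything after is at least as large. The exact output depends on the random pivot choice:

a = [3, 1, 4, 1, 5, 9, 2]
i = partition(a, 0, len(a) - 1)
print(i, a) # e.g. 5 [3, 1, 4, 1, 2, 5, 9] -- output varies with the random pivot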

def quicksort(a, lo=0, hi=None):
    """
    perform quicksort algorithm on array a
    performs operations in-place
    """
    if hi is None:
        hi = len(a) - 1
        
    if lo < hi:
        i = partition(a, lo, hi)
        quicksort(a, lo, i-1) # recurse on lower half
        quicksort(a, i+1, hi) # recurse on higher half
        
    return a

quicksort([6,5,4,3,2,1,0])
[0, 1, 2, 3, 4, 5, 6]

Analysis of Recursive Algorithms

Recursive algorithms are usually easy to implement, but they aren’t always fast. You’ll see an example in homework 0.

On the other hand, mergesort has optimal asymptotic complexity for a comparison-based sorting algorithm: \(O(n\log n)\).

If quicksort is very unlucky with its choice of pivot, it can take \(O(n^2)\) time, but its expected runtime is \(O(n \log n)\). It also does all operations in-place, unlike mergesort, which uses auxiliary memory.
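To see the difference concretely (reusing the functions defined above):

a = [2, 0, 1]
b = mergesort(a)
print(a, b) # mergesort returns a new sorted list and leaves a alone
quicksort(a)
print(a)    # quicksort rearranged a itself
[2, 0, 1] [0, 1, 2]
[0, 1, 2]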

The Master Theorem

The master theorem helps us to obtain asymptotic time complexity for divide and conquer algorithms. The first step is to write down the time it takes to run a function on input size \(n\), \(T(n)\), in terms of the time it takes to run the subproblems:

\[T(n) = a\, T(n/b) + f(n)\]

where \(a\) is the number of subproblems, each of size \(n/b\), and \(f(n)\) is the work done to stitch the subproblems together.

The critical exponent is defined as \(c = \log_b a\). The amount of work to complete all trivial subproblems is \(\Theta(n^c)\). There are three regimes:

  1. \(f(n) = o(n^c)\). In this case, the work to do the subproblems dominates, and the time complexity is \(T(n) = \Theta(n^c)\).

  2. \(f(n) = \Theta(n^c)\). In this case, the work to do the subproblems and the work to combine them are comparable. The complexity is \(T(n) = \Theta(n^c \log n)\).

  3. \(f(n) = \omega(n^c)\). In this case, the work to combine subproblems dominates, and the complexity is \(T(n) = \Theta(f(n))\).

For example, in our maximum function, we split into two subproblems of size \(n/2\), so \(a = 2\), \(b = 2\), and \(c = \log_2 2 = 1\). Here \(f(n)\) is the time to take the max of two numbers, which is \(O(1)\). This means we are in case 1 above, and the time complexity of maximum is \(\Theta(n^c) = \Theta(n)\).

Exercise

Apply the master theorem to obtain the complexity of mergesort.

Mergesort makes \(a = 2\) recursive calls on halves of the list, so \(b = 2\) and \(c = \log_2 2 = 1\). The merge step is \(f(n) = \Theta(n) = \Theta(n^c)\), so we are in case 2, and the complexity is \(\Theta(n \log n)\).

Exercise

Implement the correlation coefficient defined in this paper.

Tail Recursion

TODO: tail-call optimization isn’t performed by the Python interpreter, but you can do the transformation yourself!
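In the meantime, here is a hedged sketch of the idea (the function names sum_tail and sum_loop are illustrative, not from these notes): a tail-recursive function makes its recursive call as its very last step, which means it can be rewritten as a loop by hand, avoiding Python’s recursion depth limit.

def sum_tail(a, acc=0):
    """
    sum a list with tail recursion: the recursive call is the last thing we do,
    carrying the partial result along in acc
    """
    if len(a) == 0:
        return acc
    return sum_tail(a[1:], acc + a[0])

def sum_loop(a):
    """
    the same computation with the tail call unrolled into a loop
    """
    acc = 0
    for x in a:
        acc = acc + x
    return acc

sum_tail([1, 2, 3, 4]), sum_loop([1, 2, 3, 4])
(10, 10)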

Memoization

TODO: example on Fibonacci
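In the meantime, here is a hedged sketch of what that example might look like. The naive recursive Fibonacci function solves the same subproblems over and over, so it takes exponential time; memoization caches each computed value so every subproblem is solved only once.

def fibonacci(n):
    """
    naive recursive Fibonacci: exponentially slow, because
    fibonacci(n-2) is recomputed inside fibonacci(n-1)
    """
    if n < 2:
        return n
    return fibonacci(n-1) + fibonacci(n-2)

from functools import lru_cache

@lru_cache(maxsize=None)
def fibonacci_memo(n):
    """
    the same recursion, but previously computed values are cached,
    so each value is computed only once
    """
    if n < 2:
        return n
    return fibonacci_memo(n-1) + fibonacci_memo(n-2)

fibonacci(30), fibonacci_memo(30)
(832040, 832040)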