My practice implementation of a single linked list in Python.
Category Archives: Algorithms
My implementation of popular Computer Science Algorithms.
Using The Master Theorem To Find the Big O Of Recursive Algorithms
For some persons finding the Big O of recursive algorithms can prove to be difficult especially if the algorithms are complexed. The Master Theorem provides a relatively simple way to solve recurrence relations as long as the recurrence fits the theorem constraints.
Basic definition:
T(n) = aT(n/b) + f(n^{c}) where a >= 1, b > 1, and c >= 1
T(n) = Θ(n^{c}) if c >= log_{b}a
T(n) = Θ(n^{c} log n) if c = log_{b}a
T(n) = Θ(n^{logba}) if c < log_{b}a
We can think of “b” as the number of branches and “a” as the number of recurrences done; lets take a look at the example.
If we have a function:
T(n) = 2T(n/1) + f(n^{0}) : a = 1, b = 2, c = 0
log_{2}1 = 0 therefore c = log_{b}a which satisfies Θ(n^{c}log n) = O(log n)
References:
anupcowkur.com
wikipedia.com
Introduction To Memoization
From a previous post on recursion we see that sometimes the sub problems of a recursive solution are related or overlapping in certain ways, the Fibonacci number sequence has sub-problems which overlap. From the recursion tree we notice that most calculations repeat, which causes inefficiency in the algorithm.
Recursion Tree:
The solution for Fibonacci 4, 3, 2, 1 and 0 is repeated on the left and right of the recursion tree, which makes the algorithm very inefficiently. What if there was a way we could store values for previously calculated Fibonacci numbers so that in the event that we need those solutions we could find them O(1) without doing any calculations.
Memoization
In computing, memoization is an optimization technique used primarily to speed up computer programs by having function calls avoid repeating the calculation of results for previously processed inputs. Memoization has also been used in other contexts (and for purposes other than speed gains), such as in simple mutually recursive descent parsing^{[1]} in a general top-down parsingalgorithm^{[2]}^{[3]} that accommodates ambiguity and left recursion in polynomial time and space. Although related to caching, memoization refers to a specific case of this optimization, distinguishing it from forms of caching such as buffering or page replacement. In the context of some logic programming languages, memoization is also known as tabling;^{[4]} see also lookup table more…
Original Algorithm:
def fib(num): if num < 2:return num if num > 0 :return fib(num-1) + fib(num-2)
Memoized Version:
cache = {} def fib(num): if num in cache: return cache[num] else: cache[num] = num if num < 2 else fib(num-1) + fib(num-2) return cache[num]
Pruned Recursion Tree:
The red lines on the above diagram shows the pruning on the recursion tree caused by the memoization of the algorithm. The algorithm is now much more efficient because it no longer needs to traverse the recursion tree for a value which has already been calculated. Until next time have fun 🙂
Refrences
Wikipedia.com
Recursion in Computer Science
Recursion, though important is frequently overlooked by programmers who do not understand its potential and place in algorithm design. It allows the expression of some problems in a very elegant and succinct manner, problems such as the famous Fibonacci number sequence are better understood and written with the aid of recursion. We will go through a few algorithms that employ It in an attempt to explain its function in a clear and concise manner.
I will be using python in this post because its a scripting language and therefore there is no need to recompile while testing my code. If you do not have python on you machine you can get it here python; python IDE .
“A picture is worth a thousand words” is certainly true for this photo, in order for recursion to work there has to be at least one base case. The algorithm can have multiple base cases each of which is responsible for a particular state in which the algorithm may be in at each step. So lets look at our first example of a simple recursive function written in python.
Counting backwards from 10 to 0
def countDescending(num): print num if num == 0: return countDescending(num - 1)
Analysis
- The countDescending accepts a number as input then prints that number
- Base case: When num = 0 return from the function.
- Recursive step:The function calls itself with the input being reduced by 1
NB* All recursive solutions the initial problem must be one that can be broken down into simpler problems, which eventually leads to the base case. Also to note, each function call along with scoped non-static variables are held on the stack until the base case is reached at which point the stack is unwound and the resources released.
Fibonacci Number
def fib(num): if num < 2:return num if num > 0 :return fib(num-1) + fib(num-2)
Analysis
Fibonacci Number sequence: 0, 1, 1, 2, 3, 5, 8
Recursive formula:
Base case: If input is less than 2 return the original input.
Recursive step: If input greater than 0 call fib with number – 1 and number – 2
Recursion Tree:
A recursion tree allows us to gain insight into the workings of a recursive algorithm, the above diagram shows the recursion tree for the Fibonacci number 8. From it we note that there is lot of repetition in calculation, this could be avoided with the use momoization which we will discuss in a future article.
NP-complete Problems
NP-complete problem, any of a class of computational problems for which no efficient solution algorithm has been found. Many significant computer-science problems belong to this class—e.g., thetraveling salesman problem, satisfiability problems, and graph-covering problems.
So-called easy, or tractable, problems can be solved by computer algorithms that run in polynomial time; i.e., for a problem of size n, the time or number of steps needed to find the solution is a polynomial function of n. Algorithms for solving hard, or intractable, problems, on the other hand, require times that are exponential functions of the problem size n. Polynomial-time algorithms are considered to be efficient, while exponential-time algorithms are considered inefficient, because the execution times of the latter grow much more rapidly as the problem size increases.
A problem is called NP (nondeterministic polynomial) if its solution can be guessed and verified in polynomial time; nondeterministic means that no particular rule is followed to make the guess. If a problem is NP and all other NP problems are polynomial-time reducible to it, the problem is NP-complete. Thus, finding an efficient algorithm for any NP-complete problem implies that an efficient algorithm can be found for all such problems, since any problem belonging to this class can be recast into any other member of the class. It is not known whether any polynomial-time algorithms will ever be found for NP-complete problems, and determining whether these problems are tractable or intractable remains one of the most important questions in theoretical computer science. When an NP-complete problem must be solved, one approach is to use a polynomial algorithm to approximate the solution; the answer thus obtained will not necessarily be optimal but will be reasonably close.
List of NP-complete problems (not an exhaustive list), that you may face in a Job Interview.
References:
Britanica
Big O (Omega) Analysis Explained
Recently I had a technical interview with a major IT company, during the course of that interview I was repeatedly quizzed on the runtime complexity of various algorithms (hashing, sorting, searching), how I could incorporate them into the solution of a given problem and what the resultant runtime complexity of that solution would be . Very quickly in the interview I realized that the cursory introduction to Big O notation that was taught in my algorithm analysis class was very introductory to say the least. That interview was an eye opener as to what I need to know in preparation for future interviews. After doing some research, I could not find any simple well defined implementation of Big O that was relatively easy to grasp; therefore, I decided to put together the major parts of the information I had gone through into a simple and easy to follow explanation.
Definition: In computer science, big O notation is used to classify algorithms by how they respond (i.e., in their processing time or working space requirements) to changes in input size. Big O notation characterizes functions according to their growth rates: different functions with the same growth rate may be represented using the same O notation. A description of a function in terms of big O notation usually only provides an upper bound on the growth rate of the function. Associated with big O notation are several related notations, using the symbols o, Ω, ω, and Θ, to describe other kinds of bounds on asymptotic growth rates.
Infinite asymptotics: Big O notation is useful when analyzing algorithms for efficiency. For example, the time (or the number of steps) it takes to complete a problem of size n might be found to be T(n) = 4n^{2} − 2n + 2. As n grows large, the n^{2} term will come to dominate, so that all other terms can be neglected — for instance when n = 500, the term 4n^{2} is 1000 times as large as the 2n term. Ignoring the latter would have negligible effect on the expression’s value for most purposes. Further, the coefficients become irrelevant if we compare to any other orderof expression, such as an expression containing a term n^{3} or n^{4}. Even if T(n) = 1,000,000n^{2}, if U(n) = n^{3}, the latter will always exceed the former once n grows larger than 1,000,000 (T(1,000,000) = 1,000,000^{3}= U(1,000,000)). Additionally, the number of steps depends on the details of the machine model on which the algorithm runs, but different types of machines typically vary by only a constant factor in the number of steps needed to execute an algorithm. So the big O notation captures what remains: we write either or
Orders of Growth for Common Functions: Functions are ordered by scalability and efficiency from best to worst.
Notation | Name | Example |
---|---|---|
constant | Determining if a number is even or odd; using a constant-size lookup table or hash table | |
double logarithmic | Finding an item using interpolation search in a sorted array of uniformly distributed values. | |
logarithmic | Finding an item in a sorted array with a binary search or a balanced search tree as well as all operations in a Binomial heap. | |
fractional power | Searching in a kd-tree | |
linear | Finding an item in an unsorted list or a malformed tree (worst case) or in an unsorted array; Adding two n-bit integers by ripple carry. | |
linearithmic, loglinear, or quasilinear | Performing a Fast Fourier transform; heapsort, quicksort (best and average case), or merge sort | |
quadratic | Multiplying two n-digit numbers by a simple algorithm; bubble sort (worst case or naive implementation), Shell sort, quicksort (worst case), selection sort or insertion sort | |
polynomial or algebraic | Tree-adjoining grammar parsing; maximum matching for bipartite graphs | |
L-notation or sub-exponential | Factoring a number using the quadratic sieve or number field sieve | |
exponential | Finding the (exact) solution to the travelling salesman problem using dynamic programming; determining if two logical statements are equivalent using brute-force search | |
factorial | Solving the traveling salesman problem via brute-force search; generating all unrestricted permutations of a poset; finding the determinant with expansion by minors. |
An O(1) scales better than an O(log N) algorithm,
which scales better than an O(N) algorithm,
which scales better than an O(N log N) algorithm,
which scales better than an O(N^2) algorithm,
which scales better than an O(2^N) algorithm.
NB: O(log n) = O(log2 n) | log2 n = log(n) / log(2)
That said, lets look at how we can use Big O notation to describe the runtime complexity of some code.
int SomeFunction(int n) { int p = 1; int i = 1; while (i < n) { int j = 1; while (j < i) { p = p * j; j = j + 1; } i = i + 1; } return p; }
The first thing we will do is to assign constants to each statement in the function, these constants will be used to represent the amount of time it takes to run each statement.
int SomeFunction(int n) { int p = 1; ----------------------->c1 int i = 1; ----------------------->c1 while (i < n) { ------------------>c2 int j = 1; ------------------->c1 while (j < i) { -------------->c2 p = p * j; ---------------->(c1 + c3) j = j + 1; -------------->(c1 + c4) } i = i + 1; ------------------>(c1 + c4) } return p; ------------------------>c5 }
We then multiply each constant by the number of times each statement is run.
int SomeFunction(int n) { int p = 1; ----------------------->c1 x 1 int i = 1; ----------------------->c1 x 1 while (i < n) { ------------------>c2 x n int j = 1; ------------------->c1 x (n - 1) while (j < i) { -------------->c2 x ((1/2)n^2 - (3/2)n + 2) p = p * j; ---------------->(c1 + c3) x ((1/2)n^2 - (3/2)n + 1) j = j + 1; -------------->(c1 + c4) x ((1/2)n^2 - (3/2)n + 1) } i = i + 1; ------------------>(c1 + c4) x (n - 1) } return p; ------------------------>c5 x 1 }
You may be wondering how ((1/2)n^2 – (3/2)n + 1) was derived. Well if you look closely at the code you will notice that it contains a nested dependent loop. Further analysis reveals that code inside the outer loop is run (n-1) times while code inside the inner loop is run 0, 1, 2, 3 … (n-2) times which is an arithmetic progression.
The sum of this progression would be (n-2)(n-1) to convert it to its standard form we have to divide it by 2 hence (n-2)(n-1)/2 = ((1/2) n^2 – (3/2) n + 1). Multiplying the run times by the number of times each statement is run will give the total time the function takes to run -> (c1 + (1/2)c2 + (1/2)c3 + (1/2)c4)n^2 + (-c1 – (1/2)c2 – (1/2)c3 – (1/2)c4)n + (2(c1) + 2(c2) +c3 + c5) ignoring the constants, lower order n terms and the coefficient of n^2 we are left with n^2 or O(n^2).
Related articles
- Mathematically analyzing sorting algorithms (peter-eigenschink.at)
Token Bucket Algorithm In C#
The token bucket is an algorithm used in packet switched computer networks and telecommunications networks to check that data transmissions conform to defined limits on bandwidth and burstiness (a measure of the unevenness or variations in the traffic flow).…more
using System; using System.DateTime; public class TokenBucket { float _capacity = 0; float _tokens = 0; float _fillrate = 0; DateTime _time_stamp; public TokenBucket(float tokens, float fill_rate) { _capacity = tokens; _tokens = tokens; _fill_rate = fill_rate; _time_stamp = DateTime.Now; } public bool Consume(float tokens) { if(tokens { _tokens -= tokens; }else{ return false; } return true; } public float GetTokens() { DateTime _now = DateTime.Now; if(_tokens < _capacity) { var delta = _fill_rate * (_now - _time_stamp); _tokens = Math.Min(_capacity, _tokens + delta); _time_stamp = _now; } return _tokens; } }