I have started this post since reading "Pearls of Functional Algorithm Design" by Richard Bird. It contains my random thoughts about programming puzzles(some from that book). Most of the algorithms are implemented in Haskell. Most of the code in Haskell can run in fpcomplete's online IDE.
Binary Search, the easy way
Basic case
Description:
Given a list of sorted values, find the position of some given value.
If there are multiple values in the list, return any one of them
The basic case of binary search is very easy.
import Data.Vector
import Data.Maybe
binSearch :: (Int -> Ordering) -> (Int, Int) -> Maybe Int
binSearch f (p,r) | p > r = Nothing
| otherwise = case f m of
EQ -> Just m
LT -> binSearch f (m+1,r)
GT -> binSearch f (p,m-1)
where m = (p+r) `div` 2
-- special case for vector
binSearchInVec :: (Ord a) => Vector a -> a -> (Int,Int) -> Maybe Int
binSearchInVec vec val = binSearch (\ m -> vec!m `compare` val)
main = print $ fromJust $ binSearchInVec (fromList [1,2,3,4]) 2 (0,3)
Extended case1: find the smallest/largest index
The base case is easy since the terminating condition is clear. But for some extended cases, it's harder to get it right.
The most important questions one should ask when implementing binary search is
-
When to return the result
-
When to go to upper/lower range
-
How to change the range
If we can't find a good solution for above questions, we will get trapped in "infinite loops", "out of bound error", etc.
Some claims that after changing the range from [p,r] to (p-1,r+1) or [p,r+1) or (p-1, r] will make it easier. However, I think a much better
method is:
- Add a variable
mark
to help decide when to return. - Change the range aggressively. For every loop, we should decrease the range.
Description:
Given a list of sorted values, find the position of some given value.
If there are multiple values in the list, return the smallest index.
import Data.Vector
import Data.Maybe
binSearch :: (Int -> Ordering) -> (Int, Int) -> Maybe Int
binSearch f (p,r) = go (p,r) Nothing
where go (p,r) mark | p > r = mark
| otherwise = case f m of
EQ -> go (p,m-1) (Just m)
LT -> go (m+1,r) mark
GT -> go (p,m-1) mark
-- special case for vector
binSearchInVec :: (Ord a) => Vector a -> a -> (Int,Int) -> Maybe Int
binSearchInVec vec val = binSearch (\ m -> vec!m `compare` val)
main = print $ fromJust $ binSearchInVec (fromList [1,2,2,3,4]) 2 (0,4)
mark
stores the most recent EQ value found. We can prove the above program will surely get the smallest index based on the following
inductive rule:
Suppose initial search range is [first, last], current search range is [p,r], then, mark = Nothing || Just min{x | first<=x<p || r < x <= last, f x == EQ }
Proof:
-
Initially, $ p = first , r = last $,
mark = Nothing
. -
For $ [p,r] $, suppose for $ x \in [first,p) \cup (r, last] $ there exists $ f x = EQ $, then
mark = Just x
, otherwisemark = Nothing
. -
When p > r,
mark = Nothing || Just min {first<=x<=last, f x == EQ}
.
Exercise: How to find the largest index?
Extended case 2: find turning point.
Description:
At a given point: x, we break a strictly sorted list with length >= 3 into two parts: a and b, then append a to the end of b and get another list c. Now, the question is, given c, find the index of the smallest element in c(which is the head of original sorted list
Again, we will implement it with the help of mark
variable.
import Data.Vector
import Data.Maybe
binSearch :: (Int -> Ordering) -> (Int, Int) -> Maybe Int
binSearch f (p,r) = go (p,r) Nothing
where go (p,r) mark | p > r = mark
| otherwise = case f m of
EQ -> Nothing
LT -> go (p, m-1) (Just m)
GT -> go (m+1, r) mark
where m = (p+r) `div` 2
-- special case for vector
binSearchInVec :: (Ord a) => Vector a -> (Int,Int) -> Maybe Int
binSearchInVec vec (p,r) = binSearch (\ m -> (vec!m) `compare` (vec!p)) (p,r)
main = case binSearchInVec (fromList [2,3,1]) (0,2) of
Nothing -> print "the list is not strictly increasing"
Just i -> print i
-
When to return the result?
- When
p>r
, we have considered all elements, so we should return themark
. - When
f m == EQ
, we know that eitherm == first
or there exists $ i \in [first, last] $ and $ f i = f last $, both of which contradict with our question assumption (list should be strictly sorted and have a size >= 3). So we return Nothing.
- When
-
When to go to upper/lower range?
easy question
-
How to change the range?
As I have always advocated, change the range aggressively to avoid infinite recursion.
Extended case 3
Poj has an interesting problem. My haskell implementation is as follows:
My solution:
-- lower bound binary search.
binSearch :: (Int -> Bool) -> (Int,Int) -> Maybe Int
binSearch f (p,r) = go (p,r) Nothing
where go (p,r) mark | p > r = mark
| otherwise = if f m then go (m+1,r) (Just m) else go (p,m-1) mark
where m = (p+r) `div` 2
-- Note I didn't follow the format requested by the problem.
cabalLength :: [Double] -> Int -> Double
cabalLength cables num_pieces = case binSearch validLength (1, 10000000) of
Nothing -> fromIntegral 0
Just v -> (fromIntegral v) / 100
where cabal_ints = map (floor . (*100)) cables
validLength len = (foldr (+) 0 (map (`div` len) cabal_ints)) >= num_pieces
main = print $ cabalLength [8.02,7.43,4.57,5.39] 11
Tree traversal
Tree traversal is good example for recursive program design since it's intuitive. But the non-recursive version is not easy to understand, especially if it's implemented in some imperative programming languages like C. In this chapter, I will give the tail-recursive implementation(not in continuation passing style) of tree traversal in Haskell, which is easy to reimplement in C.
First, I will provide the recursive implementation, which will be used by quick check to test the tail-recursive version.
data BinTree a = EmptyTree
| Branch a (BinTree a) (BinTree a) deriving (Show)
preOrder :: BinTree a -> [a]
preOrder (EmptyTree) = []
preOrder (Branch a left right) = [a] ++ preOrder left ++ preOrder right
inOrder :: BinTree a -> [a]
inOrder (EmptyTree) = []
inOrder (Branch a left right) = inOrder left ++ [a] ++ inOrder right
postOrder :: BinTree a -> [a]
postOrder (EmptyTree) = []
postOrder (Branch a left right) = postOrder left ++ postOrder right ++ [a]
data BinTree a = EmptyTree
| Branch a (BinTree a) (BinTree a) deriving (Show)
preOrder :: BinTree a -> [a]
preOrder (EmptyTree) = []
preOrder (Branch a left right) = [a] ++ preOrder left ++ preOrder right
inOrder :: BinTree a -> [a]
inOrder (EmptyTree) = []
inOrder (Branch a left right) = inOrder left ++ [a] ++ inOrder right
postOrder :: BinTree a -> [a]
postOrder (EmptyTree) = []
postOrder (Branch a left right) = postOrder left ++ postOrder right ++ [a]
sampleTree :: BinTree Int
sampleTree = Branch 4 (Branch 2 (Branch 1 EmptyTree EmptyTree) (Branch 3 EmptyTree EmptyTree)) (Branch 6 (Branch 5 EmptyTree EmptyTree) (Branch 7 EmptyTree EmptyTree))
main = putStr (
"preorder: " ++ (show $ preOrder sampleTree) ++ "\n" ++
"inorder: " ++ (show $ inOrder sampleTree) ++ "\n" ++
"postorder: " ++ (show $ postOrder sampleTree) ++ "\n")
Tail-recursive version:
data BinTree a = EmptyTree
| Branch a (BinTree a) (BinTree a) deriving (Show)
preOrderT :: BinTree a -> [a]
preOrderT bt = go [bt] []
where go [] xs = reverse xs
go (EmptyTree:ts) xs = go ts xs
go (Branch v left right:ts) xs = go (left:right:ts) (v:xs)
inOrderT :: BinTree a -> [a]
inOrderT bt = go [bt] [] []
where go [] [] xs = reverse xs
go (EmptyTree:ts) [] xs = go ts [] xs
go (EmptyTree:ts) (v:left_acc) xs = go ts left_acc (v:xs)
go (Branch v left right:ts) left_acc xs = go (left:right:ts) (v:left_acc) xs
-- tail recursive post order traversal
postOrderT :: BinTree a -> [a]
postOrderT bt = go [bt] []
where go [] xs = xs
go (EmptyTree:ts) xs = go ts xs
go (Branch v left right:ts) xs = go (right:left:ts) (v:xs)
sampleTree :: BinTree Int
sampleTree = Branch 4 (Branch 2 (Branch 1 EmptyTree EmptyTree) (Branch 3 EmptyTree EmptyTree)) (Branch 6 (Branch 5 EmptyTree EmptyTree) (Branch 7 EmptyTree EmptyTree))
main = putStr (
"preorder: " ++ (show $ preOrderT sampleTree) ++ "\n" ++
"inorder: " ++ (show $ inOrderT sampleTree) ++ "\n" ++
"postorder: " ++ (show $ postOrderT sampleTree) ++ "\n")
The key point of tree traversal lies in visit order. For pre-order traversal, we should visit in the order of v -> left -> right. For in-order, it's left -> v -> right. For post-order, it should be left -> right -> v. The recursive version clearly shows the relationship. That's also why it's intuitive.
For the tail-recursive version, can we make it that clear? We know that stack can be used to implement function call. So we just need to use stack to store the context for visit order. Here, we use list to represent stack.
For preOrderT
's local function go trees xs
:
-
trees
: list of trees to visit. Supposetrees = (t:ts)
, i.e,t
is in the left ofts
. We maintain the order thatt
should always be visited beforets
. -
xs
: values of nodes that has already been visited. Supposexs = (v:vs)
. We maintain thatv
was visited beforevs
, i.e., when the program terminated, we should returnxs
in reverse order. Also, asxs
keeps values of nodes visited, we know nodes of values inxs
are visited before nodes intrees
.
Therfore:
-
go [] xs
: we have no trees to visit, we returnxs
in reverse order. -
go (EmptyTree:ts) xs
: currently on the top of the stack is an empty tree. We have nothing to record for it, so we continue withts
. -
go (Branch v left right:ts) xs
: according to our definition, the visit order for a branch isv -> left -> right
. Therefore, we putv
in the visited nodesxs
and continue the process withleft:right:ts
, which meansleft
is visited beforeright, which is also in front of
ts`. Everything is Clear!
postOrderT looks similar to preOrderT. Note that, the reverse visit order of post-order traversal is v->right->left
. So we only need to adapt the pre-order algorithm so that right is visited before left, then the reverse of the result of pre-order should be the result of post-order. Clear!
For inOrderT, go trees xs_acc xs
:
-
trees
andxs
has the same meaning as preOrderT -
xs_acc
: keeps values of nodes that we will visit.xs_acc
is constructed during in-order traversal, it records the visiting context of node for the recursive definition. Supposexs_acc = (x:acc)
. Thenx
should be visited beforeacc
. Also, fortrees = (t:ts)
, at the entry of function go, we should maintaint -> v -> (ts join acc)
. Here, join means keep the ordert->v->...
forts
andacc
. To simplify, we need to make sure thatt
is visited beforev
whilev
should be visited before all other nodes.
Therefore:
-
go [] [] xs
: nothing to visit, returnxs
in reverse order -
go (EmptyTree:ts) [] xs
: this case happens whents = []
. As we have nothing to record for EmptyTree, we go withts
. -
go (EmptyTree:ts) (v:left_acc) xs
: remember that we need to maintain ordert->v->rest
. Therefore, we first visitEmptyTree
(nothing to do), then visitv
by adding it toxs
. -
go (Branch v left right:ts) left_acc xs
: according to definition of in-order traversal, we need to make sureleft->v->right
. Therefore, we transfer it togo (left:right:ts) (v:left_acc) xs
, which perfectly maintains our ordert -> v -> rest
. Clear!
The quick check property to help verify our implementation:
-- Note the following code can't be run independently.
randomTree :: [a] -> Gen (BinTree a)
randomTree [] = return EmptyTree
randomTree (x:xs) = do
k <- choose (0, (length xs)-1)
let (ls,rs) = splitAt k xs
left <- randomTree ls
right <- randomTree rs
return $ Branch x left right
instance Arbitrary a => Arbitrary (BinTree a) where
arbitrary = do
n <- choose (0, 10000)
lst <- vector n
randomTree lst
prop_postOrder bt = postOrder bt == postOrderT bt
where types = [bt :: BinTree Int]
prop_preOrder bt = preOrder bt == preOrderT bt
where types = [bt :: BinTree Int]
prop_inOrder bt = inOrder bt == inOrderT bt
where types = [bt :: BinTree Int]
Based on the tail-recursive implementation, it's easy to implement an iterative version in C/C++:
#include <iostream>
#include <vector>
#include <deque>
using namespace std;
class BinTree {
public:
BinTree() { v = 0; left = right = NULL; }
BinTree(int hv) { v = hv; left = right = NULL; }
int v;
BinTree *left;
BinTree *right;
};
void preOrder(BinTree *rt, deque<int> &xs) {
vector<BinTree *> stack;
stack.push_back(rt);
while (!stack.empty()) {
BinTree *cur = stack.back();
stack.pop_back();
if (cur != NULL) {
xs.push_back(cur -> v);
stack.push_back(cur->right);
stack.push_back(cur->left);
}
}
}
void inOrder(BinTree *rt, deque<int> &xs) {
vector<BinTree *> stack;
stack.push_back(rt);
vector<int> acc_xs;
while (!stack.empty()) {
BinTree *cur = stack.back();
stack.pop_back();
if (cur == NULL && acc_xs.empty()) {
// Nothing to do
} else if (cur == NULL) {
int x = acc_xs.back();
acc_xs.pop_back();
xs.push_back(x);
} else {
acc_xs.push_back(cur->v);
stack.push_back(cur->right);
stack.push_back(cur->left);
}
}
}
void postOrder(BinTree *rt, deque<int> &xs) {
vector<BinTree *> stack;
stack.push_back(rt);
while (!stack.empty()) {
BinTree *cur = stack.back();
stack.pop_back();
if (cur != NULL) {
xs.push_front(cur->v);
stack.push_back(cur->left);
stack.push_back(cur->right);
}
}
}
int main() {
BinTree n1(1);
BinTree n2(2);
BinTree n3(3);
n2.left = &n1;
n2.right = &n3;
cout << "pre order:" << endl;
{
deque<int> res;
preOrder(&n2, res);
for (int i = 0; i < res.size(); i++) {
cout << res[i] << endl;
}
}
cout << "in order: " << endl;
{
deque<int> res;
inOrder(&n2, res);
for (int i = 0; i < res.size(); i++) {
cout << res[i] << endl;
}
}
cout << "post order: " << endl;
{
deque<int> res;
postOrder(&n2, res);
for (int i = 0; i < res.size(); i++) {
cout << res[i] << endl;
}
}
return 0;
}
Apparently, Haskell version looks much clearer than C version.
Extende case1 : Binary Tree Iterator
Description: Construct a binary search tree iterator such that getting an element from the iterator should run in average O(1) time and O(h) memory.
It's trivial to implement this in Haskell since Haskell has native support for lazy evaluation. Actually, the above algorithm already fulfills the requirement.
What about implementing this in C/C++?
Actually, what we want is in order tree traversal.
#include <cstdio>
#include <vector>
#include <deque>
#include <iostream>
using namespace std;
struct TreeNode {
int val;
TreeNode *left;
TreeNode *right;
TreeNode (int x) : val(x), left(NULL), right(NULL) {}
};
class BSTIterator {
public:
TreeNode *rt;
vector<TreeNode*> trees;
deque<int> xs;
vector<int> xs_acc;
BSTIterator(TreeNode *root) {
rt = root;
trees.push_back(root);
}
/** @return whether we have a next smallest number */
bool hasNext() {
if (trees.size() == 1 && trees[0] == NULL && xs.empty() && xs_acc.empty()) return false;
return true;
}
/** @return the next smallest number */
int next() {
//preOrder();
inOrder();
int v = xs.front();
xs.pop_front();
return v;
}
void inOrder() {
while (!trees.empty() && xs.empty()) {
TreeNode *cur = trees.back();
trees.pop_back();
if (cur == NULL && xs_acc.empty()) {
// pass
} else if (cur == NULL) {
xs.push_back(xs_acc.back());
xs_acc.pop_back();
} else {
xs_acc.push_back(cur->val);
trees.push_back(cur->right);
trees.push_back(cur->left);
}
}
}
};
int main() {
TreeNode n1(1);
TreeNode n2(2);
TreeNode n3(3);
n2.left = &n1;
n2.right = &n3;
BSTIterator i = BSTIterator(&n2);
while (i.hasNext()) {
cout << i.next() << endl;
}
return 0;
}
Here, I changed the in-order tree traversal according to the requirement. xs
can be used to keep the visited elements. A corner case for our algorithm is the terminating condition(when hasNext
return false). We know this happens when go (EmptyTree:ts) [] xs
is evaluated.
Backtracking
TODO...
Finger Tree
TODO...
Lexer and Parser
TODO...