Skip to content

0. Homework 1 HUFFMAN ENCODING

Daniel Obeng edited this page Nov 8, 2017 · 1 revision

Mathematical Induction is a special way of proving functions.

It has 2 steps:

  • Step 1. Show it is true for the first one function (our base case)

  • Step 2. Show that if any one function is true then the next function is true (our hypothesis of outcome) This proves that all functions must be true (verifying our hypothesis)

The Huffman algorithm Huff(A,f) computes an optimal tree that reflects frequencies of f in an alphabet A with siblings w and y. We prove the size of our alphabet (in simpliest terms) via induction.

  • Step 1. Base case (n = 1) reflects a tree with one vertex (a root) with a cost of zero (the smallest possible tree).

  • Step 2. Our hypothesis is that for all A with |A| = n all frequencies f, our algorithm will calculate an optimal Huffington tree (Huff (A,f)).

       If we assume that our hypothesis will hold for n-1 symbols creating a T' that is optimal for A' and f' let's 
       follow the reasoning.
       
       Alph_len(T) = (∑ f(x)depth(x,T)) + f(w)depth(w,T) + f(y)depth(y,T)  //for all x in A\{w,y}
              = (∑ f(x)depth(x,T)) + (f(w) +f(y))(depth(z,T') + 1  //for all x in A\{w,y}
              = (∑ f(x)depth(x,T)) + f'(z)depth(z,T') + f(w) + f(y)  //for all x in A\{w,y}
              = (∑ f'(x)depth(x,T')) + f(w) + f(y)  //for all x in A'
              = Alph_len(T') + f(w) + f(y)       
       Alph_len(T) = ABL(T') + f(w) + f(y)
             
       
       Let's take a look at the contraction to our hypothesis that T is not optimal. If we can show that 
       the contraction to our hypothesis is wrong, then we can say our hypothesis is correct.
       
             Let tree Z be an optimal tree with siblings w and y. If Z' is a tree obtained from Z by 
             removing both w and y, we can then assume Z' to be a tree for alphabet A' with frequencies 
             f'. Repeating our hypothetical calculations we arrive at Alph_len(Z) = Alph_len(Z') + f(w) + f(y). 
             Therefore it follows that Alph_len(T') = Alph_len(T) - f(w) - f(y) must be larger than 
             Alph_lenL(Z') = Alph_len(Z) - f(w) - f(y). And since T' is optimal for A' and f', we have a contraction! 
             This proves our original hypthesis that for all A with |A| = n and all frequencies f in 
             our algorithm Huff (A,f) we have an optimal tree.
             
                                                                                  QED
       
        
        Pseudo Code Proof
    
        Huff(A,f)
        If |A| = 1 then return of single vertex //Step 1. Base Case
        Let w and y be symbols of lowest frequencies  //Step 2. Inductive hypothesis
        Let A' = A\{w,y} + {z}
        Let f'(x) = f(x) for all x in A'\{z} and let f'(z)= = f(w) + f(y)
        T' = Huff(A', f')
        We create T from T' by adding w and y as children of z
        return T
    

Clone this wiki locally