Bayes Nets

Bayes Nets combine the ideas of uncertainty and probability with efficient structures. They let us see which uncertain variables influence which other uncertain variables.

Challenge Question

https://dl.dropbox.com/s/pvo18qlb1gekh1b/Screenshot%202017-02-24%2001.31.30.png
  • This requires creativity to connect O1 and O2.
  • We have to use g somehow.
  • We use capital letters to denote variables.
  • We use a lower-case letter when the variable is true, and a - in front of it when it is not true.
  • I think the step-by-step illustration is not accurate.
https://dl.dropbox.com/s/zs0lzjj1yjppw7u/Screenshot%202017-02-24%2001.37.42.png
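  • Numerator: all the situations where o2 and o1 are both true (this is subtler than it looks, because it involves G as well as o1).
  • Denominator: all the situations where o1 is true. Here we sum over every value of o2 and G.
  • The video does not explain why we do this. It follows from the definition of conditional probability, P(o2 | o1) = P(o1, o2) / P(o1), with the hidden variable G summed out of both terms.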

We define the numerator

https://dl.dropbox.com/s/cz3atf9kxehtpyo/Screenshot%202017-02-24%2001.42.50.png

We define the denominator

https://dl.dropbox.com/s/smv3gpgs25fumh3/Screenshot%202017-02-24%2001.44.10.png
  • We calculated this result by summing over all the relevant situations. We can also get the result by sampling, which can handle more complex networks.

Bayes Network

  • We care about diagnostic reasoning.
https://dl.dropbox.com/s/uxu1x138ciwkph3/Screenshot%202017-02-24%2002.25.44.png

How many parameters?

  • We need one with the evidence positive.
  • We need one with the evidence negative.
  • One probability for the evidence itself.
https://dl.dropbox.com/s/zhexycql503lp27/Screenshot%202017-02-24%2002.27.40.png
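
The count of three can be reproduced with a small sketch. It assumes the network in this section is a two-node net with one binary parent and one binary child (the names A and B are placeholders, not taken from the notes), and that a binary node with k binary parents needs 2^k numbers.

```python
def count_parameters(parents):
    """Count the probabilities needed to specify a net of binary variables.

    `parents` maps each node name to the list of its parent names.
    A binary node with k binary parents needs 2**k numbers:
    one P(node = true) per combination of parent values.
    """
    return sum(2 ** len(p) for p in parents.values())


# Hypothetical two-node network: A has no parents, B depends on A.
two_node = {"A": [], "B": ["A"]}
print(count_parameters(two_node))  # 1 for A plus 2 for B given A = 3
```

The same function applies to larger networks, as long as every variable is binary.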

Computing Bayes Rule

  • We first compute the unnormalized posterior P', dropping the division by P(B).
https://dl.dropbox.com/s/a3y7xt379zumi17/Screenshot%202017-02-24%2002.31.42.png
  • We get the normalizer indirectly, by summing the unnormalized terms themselves.
https://dl.dropbox.com/s/d1t91jrqma5l8op/Screenshot%202017-02-24%2002.33.07.png
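
A minimal sketch of the two steps above: compute the unnormalized terms, then divide by their sum. The function name and argument names are made up for illustration, and the numbers are borrowed from the cancer test example in these notes (prior 0.01, P(+|C) = 0.9, P(+|-C) = 0.2).

```python
def posterior(prior, p_evidence_given_true, p_evidence_given_false):
    """Return P(A | B) without ever computing P(B) directly."""
    # Unnormalized posteriors P'(A | B) and P'(-A | B).
    unnorm_true = p_evidence_given_true * prior
    unnorm_false = p_evidence_given_false * (1 - prior)
    # The normalizer is just the sum of the two unnormalized terms.
    normalizer = unnorm_true + unnorm_false
    return unnorm_true / normalizer


# One positive test with the cancer numbers: roughly 0.043.
print(posterior(0.01, 0.9, 0.2))
```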

Two Test Cancer

https://dl.dropbox.com/s/tmirw03l9x2fppb/Screenshot%202017-02-24%2002.45.44.png
P(C| ++) = ?

Use the P' formula from above.

P'(C|++) = P(++|C) * P(C)
         = P(+|C) * P(+|C) * P(C)
         = 0.9 * 0.9 * 0.01

P'(-C|++) = P(++|-C) * P(-C)
          = P(+|-C) * P(+|-C) * P(-C)
          = 0.2 * 0.2 * 0.99

P(C| ++) = P'(C|++)
           --------------------
           P'(C|++) + P'(-C|++)

Calculating the result.

n1 =  0.9 * 0.9 * 0.01
d1 =  0.2 * 0.2 * 0.99

n1 / (n1 + d1)
0.169811320754717
https://dl.dropbox.com/s/i2e1s2e8v120scs/Screenshot%202017-02-24%2002.56.24.png
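
The same calculation can be generalized to any sequence of test results. This is a sketch under the assumption, used in the derivation above, that the tests are conditionally independent given C; the function name and the '+'/'-' string encoding are illustrative choices.

```python
def posterior_after_tests(results, prior=0.01, p_pos_given_c=0.9, p_pos_given_not_c=0.2):
    """P(C | results), where `results` is a string such as '++' or '+-'."""
    unnorm_c = prior          # builds up P'(C | results)
    unnorm_not_c = 1 - prior  # builds up P'(-C | results)
    for r in results:
        if r == "+":
            unnorm_c *= p_pos_given_c
            unnorm_not_c *= p_pos_given_not_c
        else:
            unnorm_c *= 1 - p_pos_given_c
            unnorm_not_c *= 1 - p_pos_given_not_c
    return unnorm_c / (unnorm_c + unnorm_not_c)


print(posterior_after_tests("++"))  # about 0.1698, matching the result above
```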

Conditional Independence

https://dl.dropbox.com/s/6rxgvmxfphe8298/Screenshot%202017-02-24%2002.59.44.png
  • Conditional independence is a central idea in Bayes networks.
https://dl.dropbox.com/s/16dy6pv5faer4tv/Screenshot%202017-02-24%2003.01.37.png
  • Without knowing A, B and C are not independent: learning B changes our belief about A, and therefore about C.
  • Given A, B and C are independent. They are both conditioned only on A.

Conditional Independence 2

  • Tricky again.
  • Apply Total Probability.
https://dl.dropbox.com/s/332s5ikar2v0zwq/Screenshot%202017-02-24%2003.20.48.png https://dl.dropbox.com/s/7ygv4e7fuf4ak8s/Screenshot%202017-02-24%2003.24.27.png
  • Right here is the magic. How did we bring this in?
  • Why do we not have any denominator?
https://dl.dropbox.com/s/kns1stjd71zjbjw/Screenshot%202017-02-24%2004.09.18.png
  • A lot has happened in here. This is short-circuiting.
https://dl.dropbox.com/s/55g9nnv0fyvcok6/Screenshot%202017-02-24%2004.16.23.png https://dl.dropbox.com/s/asqdlqjzsmxnx2d/Screenshot%202017-02-24%2004.17.38.png
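
A sketch of the total-probability step, assuming the question in this section is P(T2 = + | T1 = +) for the two-test cancer network and that the numbers are the ones from the Two Test Cancer section (the names T1 and T2 are assumptions; the screenshots are not reproduced here). It first updates the belief in C after the first test, then sums over both values of C.

```python
P_C = 0.01               # prior P(C)
P_POS_GIVEN_C = 0.9      # P(+ | C)
P_POS_GIVEN_NOT_C = 0.2  # P(+ | -C)


def p_cancer_given_first_positive():
    """P(C | T1 = +) by Bayes rule with an indirect normalizer."""
    unnorm_c = P_POS_GIVEN_C * P_C
    unnorm_not_c = P_POS_GIVEN_NOT_C * (1 - P_C)
    return unnorm_c / (unnorm_c + unnorm_not_c)


def p_second_positive_given_first_positive():
    """P(T2 = + | T1 = +) via total probability over C.

    Conditional independence lets us write P(T2 | T1, C) as P(T2 | C).
    """
    p_c = p_cancer_given_first_positive()
    return P_POS_GIVEN_C * p_c + P_POS_GIVEN_NOT_C * (1 - p_c)


print(p_second_positive_given_first_positive())  # about 0.2304
```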

Compare

  • The same approach, applied in two different situations.
https://dl.dropbox.com/s/smv3gpgs25fumh3/Screenshot%202017-02-24%2001.44.10.png https://dl.dropbox.com/s/55g9nnv0fyvcok6/Screenshot%202017-02-24%2004.16.23.png

Absolute and Conditional

https://dl.dropbox.com/s/bbrqxphfi6nmomr/Screenshot%202017-02-24%2020.29.05.png

Confounding Cause

https://dl.dropbox.com/s/ejn4qwdu4isw3h1/Screenshot%202017-02-24%2008.50.54.png

Explaining Away

https://dl.dropbox.com/s/g1jiqnre3ia32d3/Screenshot%202017-02-24%2008.52.17.png https://dl.dropbox.com/s/yeutvmix4hyq57f/Screenshot%202017-02-24%2008.53.30.png

Explaining Away 2

https://dl.dropbox.com/s/jxn9a02cutmwpcr/Screenshot%202017-02-24%2021.13.27.png

Explaining Away 3

https://dl.dropbox.com/s/a2k3gjkpfsh6f5g/Screenshot%202017-02-24%2021.19.44.png

Conditional Dependence

https://dl.dropbox.com/s/04ab2uph1r2vkzz/Screenshot%202017-02-24%2021.21.12.png

General Bayes Network

https://dl.dropbox.com/s/nbf2tor4yz0bbp5/Screenshot%202017-02-24%2021.22.38.png https://dl.dropbox.com/s/vt82z3mdkplpufi/Screenshot%202017-02-24%2021.24.20.png

D Separation

https://dl.dropbox.com/s/xb21x38u6qc1lmx/Screenshot%202017-02-24%2021.25.32.png
  • Two variables are not independent if they are linked by an unknown variable.
https://dl.dropbox.com/s/uhzgjhwfc2vxoqi/Screenshot%202017-02-24%2021.26.33.png

D Separation 2

https://dl.dropbox.com/s/1d9cb70w42f99qq/Screenshot%202017-02-24%2021.28.08.png
  • Active triplets render the variables dependent.
  • Inactive triplets render the variables independent.

Conclusion

https://dl.dropbox.com/s/imppwbjtti4pkua/Screenshot%202017-02-24%2021.29.41.png

Probabilistic Inference

  • Probability Theory
  • Bayes Net
  • Independence
  • Inference
https://dl.dropbox.com/s/fmbg4knfrkdz5qs/Screenshot%202017-02-25%2005.52.20.png
  • What kind of questions can we ask?
  • Given some inputs what are the outputs?
  • Evidence variables (known) and query variables (to find out).
  • Hidden variables (neither evidence nor query; we have to compute over them).
  • In probabilistic inference, the output is a probability distribution over the query variables.
https://dl.dropbox.com/s/r09675e4drswgfd/Screenshot%202017-02-25%2005.55.57.png

Enumeration

  • Start by stating the problem
  • Using conditional probability
https://dl.dropbox.com/s/xbhakaxuezhxnep/Screenshot%202017-02-25%2005.59.12.png https://dl.dropbox.com/s/6pyyuk13ymf4c01/Screenshot%202017-02-25%2006.01.44.png https://dl.dropbox.com/s/w9lajc4h2wqvnmz/Screenshot%202017-02-25%2006.02.35.png
  • We denote that product of 5 numbers as a single term, f(e,a).
  • The final answer is then a sum of four terms, where each term is a product of 5 numbers.
https://dl.dropbox.com/s/6rqq7gv64ko5ywq/Screenshot%202017-02-25%2006.04.57.png https://dl.dropbox.com/s/h1do4kipzng82t3/Screenshot%202017-02-25%2006.05.27.png
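
The enumeration steps above can be sketched in Python. The notes do not list the network's numbers, so this sketch assumes the screenshots show the standard burglary/earthquake/alarm network with the query P(+b | +j, +m), and it uses the usual textbook CPT values; both are assumptions. Each call to joint() is the product of 5 numbers, and the sum over (e, a) has four terms.

```python
from itertools import product

# Assumed textbook CPT values (not listed in these notes).
P_B = 0.001  # P(+b)
P_E = 0.002  # P(+e)
P_A = {(True, True): 0.95, (True, False): 0.94,
       (False, True): 0.29, (False, False): 0.001}  # P(+a | b, e)
P_J = {True: 0.90, False: 0.05}  # P(+j | a)
P_M = {True: 0.70, False: 0.01}  # P(+m | a)


def prob(p_true, value):
    """Probability of a binary value given the probability of True."""
    return p_true if value else 1.0 - p_true


def joint(b, e, a, j, m):
    """One row of the enumeration table: a product of 5 numbers."""
    return (prob(P_B, b) * prob(P_E, e) * prob(P_A[(b, e)], a)
            * prob(P_J[a], j) * prob(P_M[a], m))


def query_burglary(j=True, m=True):
    """P(+b | j, m): sum out the hidden variables e and a, then normalize."""
    unnorm = {}
    for b in (True, False):
        unnorm[b] = sum(joint(b, e, a, j, m)
                        for e, a in product((True, False), repeat=2))
    return unnorm[True] / (unnorm[True] + unnorm[False])


print(query_burglary())  # about 0.284 with the assumed values
```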

Speeding up Enumeration

https://dl.dropbox.com/s/h1kqmgznefudqzt/Screenshot%202017-02-25%2006.18.58.png
  • Reduce the cost of each row in the table.
  • Still the same number of rows.
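
A sketch of this speed-up, assuming the query is P(+b | +j, +m) on the standard burglary/earthquake/alarm network with the usual textbook CPT values (the notes do not list the numbers, so they are assumptions). Terms that do not depend on the summation variable are pulled out of the sum: P(b) is multiplied once, and P(e) once per value of e. The number of (e, a) rows stays at four, but each row costs fewer multiplications.

```python
# Assumed textbook CPT values (not listed in these notes).
P_B = 0.001  # P(+b)
P_E = 0.002  # P(+e)
P_A = {(True, True): 0.95, (True, False): 0.94,
       (False, True): 0.29, (False, False): 0.001}  # P(+a | b, e)
P_J = {True: 0.90, False: 0.05}  # P(+j | a)
P_M = {True: 0.70, False: 0.01}  # P(+m | a)


def prob(p_true, value):
    """Probability of a binary value given the probability of True."""
    return p_true if value else 1.0 - p_true


def unnormalized(b, j=True, m=True):
    """P(b) * sum_e P(e) * sum_a P(a | b, e) P(j | a) P(m | a)."""
    outer = 0.0
    for e in (True, False):
        inner = 0.0
        for a in (True, False):
            # Only the terms that depend on a stay inside the inner sum.
            inner += prob(P_A[(b, e)], a) * prob(P_J[a], j) * prob(P_M[a], m)
        outer += prob(P_E, e) * inner
    return prob(P_B, b) * outer


def query_burglary_factored(j=True, m=True):
    t, f = unnormalized(True, j, m), unnormalized(False, j, m)
    return t / (t + f)


print(query_burglary_factored())  # about 0.284 with the assumed values
```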

Using dependence

https://dl.dropbox.com/s/ztn5wq66p08c6pq/Screenshot%202017-02-25%2006.23.33.png

Causal Direction

  • Inference on a Bayes network is easier when the network flows from causes to effects.

Variable Elimination

  • Inference over Bayes Nets is NP-hard in general.
  • It requires algebra to manipulate the arrays that come out of the probabilistic terms.
https://dl.dropbox.com/s/q0ufdgn4h6ci0p4/Screenshot%202017-02-25%2006.35.05.png
  • We marginalize out a variable and are left with a smaller network to deal with.
https://dl.dropbox.com/s/7zms1cwvz9l2ggc/Screenshot%202017-02-25%2006.38.29.png
  • We apply elimination, also called marginalization or summing out, to the table.
https://dl.dropbox.com/s/yij3e5xs0mib8gx/Screenshot%202017-02-25%2006.41.32.png

Variable Elimination - 2

  • We sum out the variables and find the distribution.
https://dl.dropbox.com/s/7tnknw21tihfz0j/Screenshot%202017-02-25%2006.43.37.png

Variable Elimination - 3

https://dl.dropbox.com/s/z706dpnoslrfxl1/Screenshot%202017-02-25%2006.46.06.png
  • Summing out and eliminating.
  • If we make a good choice of the order in which we eliminate variables, then variable elimination is going to be more efficient than enumerating.
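
A minimal sketch of summing out variables one at a time on a three-node chain. The chain R -> T -> L, the variable names, and all the numbers are illustrative assumptions; the values in the screenshots may differ. Each step joins a marginal with a conditional table and sums out the parent, leaving a smaller network.

```python
# Illustrative (assumed) tables for a chain R -> T -> L of binary variables.
P_R = {True: 0.1, False: 0.9}                   # P(R)
P_T_GIVEN_R = {True: {True: 0.8, False: 0.2},   # P(T | R), indexed [r][t]
               False: {True: 0.1, False: 0.9}}
P_L_GIVEN_T = {True: {True: 0.3, False: 0.7},   # P(L | T), indexed [t][l]
               False: {True: 0.1, False: 0.9}}


def eliminate(p_parent, p_child_given_parent):
    """Join P(parent) with P(child | parent), then sum out the parent."""
    p_child = {True: 0.0, False: 0.0}
    for parent_value, p_parent_value in p_parent.items():
        for child_value, p_cond in p_child_given_parent[parent_value].items():
            p_child[child_value] += p_parent_value * p_cond
    return p_child


p_t = eliminate(P_R, P_T_GIVEN_R)  # R is gone; the network is now T -> L
p_l = eliminate(p_t, P_L_GIVEN_T)  # T is gone; only P(L) remains
print(p_t[True], p_l[True])        # about 0.17 and 0.134
```

Eliminating R first and then T keeps every intermediate table at two entries, which is the kind of ordering choice that makes elimination cheaper than enumerating the full joint.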

Approximate Inference

  • Sampling
https://dl.dropbox.com/s/uvfz2og3pbsbp33/Screenshot%202017-02-25%2006.51.24.png
  • With enough samples, the counts let us estimate the joint probability distribution.
  • Sampling has an advantage over exact inference: we know a procedure that comes up with at least an approximate value.
  • Without knowing the conditional probabilities, we can still do sampling,
  • because we can follow (simulate) the process.
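
A sketch of estimating the joint distribution by counting samples. It assumes the network in the screenshot is the standard cloudy/sprinkler/rain/wet-grass network and uses the usual textbook CPT values, which these notes do not list. Each variable is sampled in order, parents before children, using the probability that matches the parent values already drawn.

```python
import random

# Assumed textbook values for P(+w | sprinkler, rain).
P_WET = {(True, True): 0.99, (True, False): 0.90,
         (False, True): 0.90, (False, False): 0.0}


def sample_once(rng):
    """Draw one complete sample (cloudy, sprinkler, rain, wet)."""
    cloudy = rng.random() < 0.5
    sprinkler = rng.random() < (0.1 if cloudy else 0.5)
    rain = rng.random() < (0.8 if cloudy else 0.2)
    wet = rng.random() < P_WET[(sprinkler, rain)]
    return cloudy, sprinkler, rain, wet


def estimate_joint(n, seed=0):
    """Estimate the joint distribution from the relative counts of n samples."""
    rng = random.Random(seed)
    counts = {}
    for _ in range(n):
        s = sample_once(rng)
        counts[s] = counts.get(s, 0) + 1
    return {assignment: c / n for assignment, c in counts.items()}


estimate = estimate_joint(20000)
# Marginal of a single variable: add up the matching joint entries.
p_rain = sum(p for (c, s, r, w), p in estimate.items() if r)
print(p_rain)  # close to the exact value 0.5 under the assumed CPTs
```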

Sampling Exercise

  • Sample that randomly
  • Doubt: is it a weighted sample or a random sample? The video suggests that it is a weighted sample.
https://dl.dropbox.com/s/c34wjhd6p3heqvs/Screenshot%202017-02-25%2007.02.35.png

Approximate Inference 2

  • In the limit, the sampling estimate will approach the true probability.
  • This property is called consistency.
  • Sampling can be used to estimate the complete probability distribution.
  • Sampling can also be used for an individual variable.
  • What if we want to compute a conditional distribution?
https://dl.dropbox.com/s/dlvkzx2r6dudecx/Screenshot%202017-02-25%2007.13.39.png

Rejection Sampling

  • If the evidence is unlikely, you will reject a lot of samples.
https://dl.dropbox.com/s/i3qv2e1svcmecer/Screenshot%202017-02-25%2007.22.37.png
  • We introduce a new method called likelihood weighting so that we can keep every sample.
  • In likelihood weighting, we fix the evidence variables.
https://dl.dropbox.com/s/4osmw87r1l3u4ft/Screenshot%202017-02-25%2007.23.40.png
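
A sketch of rejection sampling, assuming the standard cloudy/sprinkler/rain/wet-grass network with the usual textbook CPT values (not listed in these notes) and a hypothetical query P(Rain | Sprinkler = true). Samples that disagree with the evidence are thrown away, and the function also reports how many were kept, which shows the waste when evidence is unlikely.

```python
import random

# Assumed textbook values for P(+w | sprinkler, rain).
P_WET = {(True, True): 0.99, (True, False): 0.90,
         (False, True): 0.90, (False, False): 0.0}


def sample_once(rng):
    """Draw one complete sample (cloudy, sprinkler, rain, wet)."""
    cloudy = rng.random() < 0.5
    sprinkler = rng.random() < (0.1 if cloudy else 0.5)
    rain = rng.random() < (0.8 if cloudy else 0.2)
    wet = rng.random() < P_WET[(sprinkler, rain)]
    return cloudy, sprinkler, rain, wet


def rejection_sample(n, seed=0):
    """Estimate P(rain | sprinkler = true); return (estimate, samples kept)."""
    rng = random.Random(seed)
    kept = 0
    rain_count = 0
    for _ in range(n):
        cloudy, sprinkler, rain, wet = sample_once(rng)
        if not sprinkler:
            continue  # sample contradicts the evidence: reject it
        kept += 1
        rain_count += rain
    return rain_count / kept, kept


estimate, kept = rejection_sample(20000)
print(estimate, kept)  # estimate near 0.3; only about 30% of samples are kept
```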

Likelihood Weighting

https://dl.dropbox.com/s/xjhlsqbshnp4mik/Screenshot%202017-02-25%2007.26.11.png
  • It is a weighted sample.
https://dl.dropbox.com/s/cc4jr3zd3dwtly5/Screenshot%202017-02-25%2007.28.37.png
  • Weighting each sample by the probability of the evidence makes likelihood weighting consistent.
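
A sketch of likelihood weighting, assuming the standard cloudy/sprinkler/rain/wet-grass network with the usual textbook CPT values (not listed in these notes) and a hypothetical query P(Rain | Sprinkler = true, WetGrass = true). Evidence variables are fixed rather than sampled, and each sample's weight is the product of the probabilities of the evidence values given their parents.

```python
import random

# Assumed textbook values for P(+w | sprinkler, rain).
P_WET = {(True, True): 0.99, (True, False): 0.90,
         (False, True): 0.90, (False, False): 0.0}


def weighted_sample(rng):
    """Return (rain, weight) with sprinkler and wet grass fixed to True."""
    weight = 1.0
    cloudy = rng.random() < 0.5           # non-evidence: sample as usual
    weight *= 0.1 if cloudy else 0.5      # evidence: multiply in P(+s | cloudy)
    rain = rng.random() < (0.8 if cloudy else 0.2)
    weight *= P_WET[(True, rain)]         # evidence: multiply in P(+w | +s, rain)
    return rain, weight


def likelihood_weighting(n, seed=0):
    """Estimate P(rain | +s, +w) as a ratio of summed weights."""
    rng = random.Random(seed)
    total = 0.0
    rain_total = 0.0
    for _ in range(n):
        rain, weight = weighted_sample(rng)
        total += weight
        if rain:
            rain_total += weight
    return rain_total / total


print(likelihood_weighting(50000))  # near 0.32 under the assumed CPTs
```

No sample is rejected here; unlikely samples simply count for less.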

Gibbs Sampling

  • Gibbs sampling, named after Josiah Gibbs, takes all the evidence into account, not just upstream evidence.
  • It is a Markov Chain Monte Carlo (MCMC) method.
  • We have a set of variables, and we re-sample just one variable at a time, conditioned on all the others.
  • Select one non-evidence variable and resample it conditioned on all the other variables.
https://dl.dropbox.com/s/rnr442leqpjpuuu/Screenshot%202017-02-25%2007.34.54.png
  • We end up walking around the space of variable assignments.
  • The samples are dependent.
  • Adjacent samples are very similar.
  • The technique is consistent.
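
A sketch of Gibbs sampling, assuming the standard cloudy/sprinkler/rain/wet-grass network with the usual textbook CPT values (not listed in these notes) and a hypothetical query P(Rain | Sprinkler = true, WetGrass = true). At each step one non-evidence variable is picked and resampled from its distribution given the current values of all the others, which is proportional to the full joint. The burn-in length is an arbitrary illustrative choice.

```python
import random

# Assumed textbook values for P(+w | sprinkler, rain).
P_WET = {(True, True): 0.99, (True, False): 0.90,
         (False, True): 0.90, (False, False): 0.0}


def prob(p_true, value):
    """Probability of a binary value given the probability of True."""
    return p_true if value else 1.0 - p_true


def joint(cloudy, sprinkler, rain, wet):
    """Full joint probability of one complete assignment."""
    return (prob(0.5, cloudy)
            * prob(0.1 if cloudy else 0.5, sprinkler)
            * prob(0.8 if cloudy else 0.2, rain)
            * prob(P_WET[(sprinkler, rain)], wet))


def gibbs_rain_given_evidence(n, burn_in=1000, seed=0):
    """Estimate P(rain | sprinkler = true, wet = true) with a Gibbs chain."""
    rng = random.Random(seed)
    # Evidence stays fixed; the non-evidence variables start arbitrarily.
    state = {"cloudy": True, "sprinkler": True, "rain": True, "wet": True}
    non_evidence = ["cloudy", "rain"]
    rain_count = 0
    for step in range(n + burn_in):
        var = rng.choice(non_evidence)
        # P(var | everything else) is proportional to the joint.
        p_true = joint(**{**state, var: True})
        p_false = joint(**{**state, var: False})
        state[var] = rng.random() < p_true / (p_true + p_false)
        if step >= burn_in:
            rain_count += state["rain"]
    return rain_count / n


print(gibbs_rain_given_evidence(50000))  # near 0.32 under the assumed CPTs
```

Because only one variable changes per step, consecutive states differ in at most one value, which is why the samples are dependent.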