Decisions and Expert Systems
Decision theory
The utility equation
Measuring information
Cost of gathering information
A Bayesian method
Expected Value
The expected value of a uniformly distributed set of numbers is the average of its values
E(S) = sum S / |S|
More generally, the expected value is the centroid
E(S) = sum [e in S] e pr(e)
This reduces to the average when elements are equally probable
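For concreteness, here is a minimal Python sketch of both formulas; the function names are mine, not from the notes

    # E(S) for equally probable elements: just the average
    def expected_value_uniform(s):
        return sum(s) / len(s)

    # E(S) as a probability-weighted centroid
    # pairs: iterable of (element, probability), probabilities summing to 1
    def expected_value(pairs):
        return sum(e * p for e, p in pairs)

    # The two agree when elements are equally probable
    s = [2, 4, 6, 8]
    assert expected_value_uniform(s) == expected_value((e, 1 / len(s)) for e in s)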
The Utility Equation
Sometimes you need to pick a best action a from a set of choices A
When you know what effect e(a) each action will have, and you know the value v(e) of each effect, this is easy: just pick
argmax [a in A] v(e(a))
But the world is uncertain. Sometimes all you know is the probability of various action effects pr(e|a)
You might evaluate the utility U of an action in terms of expected effect value
U(a) = E(v(e(a))) = sum [e in effects(a)] v(e) pr(e|a)
You can now just maximize again
argmax [a in A] U(a)
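A hedged sketch of expected-utility maximization; the two actions and their numbers are made-up illustrations

    # U(a) = sum over effects e of v(e) * pr(e|a)
    # effects: list of (value, probability) pairs for one action
    def utility(effects):
        return sum(v * p for v, p in effects)

    # Made-up (v(e), pr(e|a)) pairs for two hypothetical actions
    actions = {
        "umbrella": [(1.0, 0.3), (0.5, 0.7)],
        "no umbrella": [(-2.0, 0.3), (1.0, 0.7)],
    }

    # argmax [a in A] U(a)
    best = max(actions, key=lambda a: utility(actions[a]))
    print(best, utility(actions[best]))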
The Game of Hangman
"Defendant" picks a word, writes blanks for its letters, draws a gallows
----
|  |
|
|
|
|

_ _ _ _ _ _ _ _
"Prosecution" guesses letters of the word in turn. Guessing a letter not in the word causes a body part to be added to the gallows. Guessing a letter in the word fills in all occurrences
----      z w
|  |
|  O
|
|
|

_ _ _ _ _ _ i _
"Trial" is over when word is completed or def is hanged
----      z w e a o u
|  |
|  O
| /|\
|
| / \

_ _ _ _ _ _ i c
Hangman Prosecution and Utility
What letter should Pros guess first?
Let's assume that Def chose a random 8-letter word from a known dictionary
Let's further assume that a found letter has a uniform value of 1
Then find
argmax [l in 'a'..'z'] sum [word in dict] (l in word) pr(word|dict)
This is an easy calculation: 'e' occurs in 2/3 of the 8-letter words (see bestguess.py)
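A minimal sketch of the calculation (not the actual bestguess.py, which isn't shown here), assuming a hypothetical word list words.txt with one word per line

    from collections import Counter
    from string import ascii_lowercase

    # Return the letter that appears in the largest fraction of words
    def best_first_guess(words):
        counts = Counter()
        for w in words:
            counts.update(set(w))      # count each letter once per word
        return max(ascii_lowercase, key=lambda l: counts[l])

    # "words.txt" is a hypothetical dictionary file, one word per line
    with open("words.txt") as f:
        eight = [w.strip().lower() for w in f if len(w.strip()) == 8]

    print(best_first_guess(eight))     # 'e' for typical dictionaries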
Hangman: My Assumptions Are Weak
Problem: because 'e' is so common, hitting it doesn't necessarily narrow the candidate set down much
The value function should be based on the expected chance of winning after hitting or missing the letter
Problem: Def isn't choosing uniformly — will pick "hard" words
Need to know probability given defense strategy pr(word|ds(dict))
So
argmax [l in 'a'..'z'] sum [word in dict] v(win with word after guessing l) pr(word|ds(dict))
Hangman: Nash Equilibrium
Ugh. ds(dict) will depend on guessing strategy
- "Good old rock. Can always count on rock"
Adversarial games are hard. But a Nash Equilibrium exists and can in principle be calculated
Current research for me
Information Theory
Consider these strings of bits
x = 0111111111111111
y = 0101100010011011
x is boring and easy to describe. You could predict the "next value" pretty reliably
y is complicated
Shannon et al: x has "less information" than y
Information content can be viewed as a utility function. The entropy of a set is given as
u(S) = sum [i in S] pr(i) lg 1/pr(i) = sum [i in S] - pr(i) lg pr(i)
For our sets (strings)
u(x) = - pr(0) lg pr(0) - pr(1) lg pr(1) = - (1/16) lg (1/16) - (15/16) lg (15/16) ~= 0.337
u(y) = - pr(0) lg pr(0) - pr(1) lg pr(1) = - 2 (8/16) lg (8/16) = -2 × 0.5 × -1 = 1
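These numbers are easy to check in Python

    from math import log2
    from collections import Counter

    # u(S) = sum over symbols i of -pr(i) * lg pr(i)
    def entropy(s):
        n = len(s)
        return -sum((c / n) * log2(c / n) for c in Counter(s).values())

    print(entropy("0111111111111111"))   # x: ~0.337
    print(entropy("0101100010011011"))   # y: 1.0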
Hangman: Entropy-Based Prosecution
Pros wants to reach a state where only one candidate word remains
Standard trick: pick the letter that is expected to reduce the entropy the most
argmax [l in 'a'..'z'] (u(dict) - sum [p in part(l, dict)] pr(p) u(p)) pr(l in word)
Does this still produce 'e' as the initial guess? It does
We have taken into account the "cost of gathering information" as part of the utility: we don't want to make a guess that costs us a body part unless we get a lot of information from it
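A hedged sketch of this strategy, assuming a uniform prior over the remaining candidate words; weighting the gain by the hit probability is one reading of the formula above

    from math import log2
    from collections import defaultdict
    from string import ascii_lowercase

    # Positions that guessing `letter` would reveal in `word` (empty = miss)
    def pattern(word, letter):
        return tuple(i for i, ch in enumerate(word) if ch == letter)

    # Score each unguessed letter by expected entropy reduction, weighted
    # by pr(hit); u(p) = lg |p| under the uniform prior
    def best_guess(candidates, guessed=()):
        n = len(candidates)
        total = log2(n)
        def score(l):
            parts = defaultdict(int)
            for w in candidates:
                parts[pattern(w, l)] += 1
            expected = sum((k / n) * log2(k) for k in parts.values())
            hit = 1 - parts[()] / n
            return hit * (total - expected)
        return max((l for l in ascii_lowercase if l not in guessed), key=score)

    print(best_guess(["aardvark", "absolute", "peculiar", "quizzing"]))  # toy dictionary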
Demonic Hangman
Consider "Demonic Hangman": Def cheats as desired by changing the word in a way consistent with the guesses so far
Now we almost have to use entropy to narrow Def down to only one choice quickly