CISC 7700X Final Exam

1. b
2. 0; (25 + 25 - 50)/3
3. 0.7812; exp(log(1+0.25) + log(1+0.25) + log(1-0.50))
4. b
5. b
6. b
7. b; about 10: sqrt(2*7^2)
8. b; we found a random widget, so there is a 50% chance that the serial number 959569 is within the interquartile range of all serial numbers.
9. c
10. d
11. e; invalid; p(x,y) = p(x|y)p(y) = p(y|x)p(x)
12. d
13. b
14. a
15. c
16. 0.15
% P(fraud|amnt) = P(amnt|fraud)P(fraud) / ( P(amnt|fraud)P(fraud) + P(amnt|-fraud)P(-fraud) )
% (0.9 * 0.001) / ( 0.9 * 0.001 + 0.005 * 0.999 ) = 0.1526717557251908
17. 0.038
% P(fraud|st) = P(st|fraud)P(fraud) / ( P(st|fraud)P(fraud) + P(st|-fraud)P(-fraud) )
% (0.8 * 0.001) / ( 0.8 * 0.001 + 0.02 * 0.999 ) = 0.0384985563041386
18. not enough data; we don't know P(amnt,st|fraud)
% answer is: P(fraud|amnt,st) = P(amnt,st|fraud)P(fraud) / ( P(amnt,st|fraud)P(fraud) + P(amnt,st|-fraud)P(-fraud) )
% but we don't know P(amnt,st|fraud)
19. 0.878
% P(fraud|amnt,st) = P(amnt,st|fraud)P(fraud) / ( P(amnt,st|fraud)P(fraud) + P(amnt,st|-fraud)P(-fraud) )
% naive assumption: P(amnt,st|fraud) = P(amnt|fraud)P(st|fraud)
% P(amnt|fraud)P(st|fraud)P(fraud) / ( P(amnt|fraud)P(st|fraud)P(fraud) + P(amnt|-fraud)P(st|-fraud)P(-fraud) )
% (0.9 * 0.8 * 0.001) / ( 0.9 * 0.8 * 0.001 + 0.02 * 0.005 * 0.999 ) = 0.8781558726673985
20. One way to fix these kinds of issues is to keep stats by customer. In other words, if a particular customer travels a lot, then an out-of-state transaction should not raise red flags: adjust P(out-of-state|fraud) for that customer.
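
The arithmetic in answers 3, 16, 17, and 19 can be checked with a short script. This is only a sketch: the helper name `posterior` and the constant names are made up for illustration, and the probabilities are the ones quoted in the worked lines of those answers. The helper multiplies the per-feature likelihoods together, which is the naive independence assumption used in answer 19; with a single feature it reduces to the plain Bayes rule of answers 16 and 17.

```python
import math


def posterior(prior, lik_fraud, lik_legit):
    """P(fraud | evidence) via Bayes' rule.

    lik_fraud / lik_legit are lists of per-feature likelihoods,
    P(feature | fraud) and P(feature | -fraud). Multiplying them
    assumes the features are conditionally independent (naive Bayes).
    """
    num = prior * math.prod(lik_fraud)
    alt = (1.0 - prior) * math.prod(lik_legit)
    return num / (num + alt)


# Values taken from the worked lines of answers 16-19 (names are hypothetical).
P_FRAUD = 0.001
P_AMNT_FRAUD, P_AMNT_LEGIT = 0.9, 0.005
P_ST_FRAUD, P_ST_LEGIT = 0.8, 0.02

q16 = posterior(P_FRAUD, [P_AMNT_FRAUD], [P_AMNT_LEGIT])
q17 = posterior(P_FRAUD, [P_ST_FRAUD], [P_ST_LEGIT])
q19 = posterior(P_FRAUD, [P_AMNT_FRAUD, P_ST_FRAUD], [P_AMNT_LEGIT, P_ST_LEGIT])

# Answer 3: compound the three returns through logs.
q3 = math.exp(math.log(1 + 0.25) + math.log(1 + 0.25) + math.log(1 - 0.50))

print(round(q16, 3), round(q17, 3), round(q19, 3))  # 0.153 0.038 0.878
```

Neither feature alone pushes the posterior much above 15%, but together (under the independence assumption) they push it to roughly 88%, because the chance of a legitimate transaction showing both features is the product of two already small probabilities.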