CISC 7700X Midterm Exam                 NAME: _______________________

Each question is worth 5 points. Pick the best answer that fits the
question. Not all of the answers may be correct. If none of the answers
fit, write your own answer.

---------------------------------------------------------------------

1. A model is:
   a) A fact.
   b) A data point.
   c) A description.
   d) All of the above.

2. When data has a few very large outliers, which is most appropriate
   to use:
   a) Mean
   b) Median
   c) Gradient
   d) Centroid regression

3. When is interquartile range more appropriate to use than standard
   deviation:
   a) When data has long tails.
   b) When data has short tails.
   c) When data has no tails.
   d) When data has fluffy tails.

4. The more supporting evidence we observe, the more confidence we have
   in the model. Suppose our model is: all fish live in water:
   If something is a fish, then it lives in water.
   Supporting evidence may consist of:
   a) Observing a fish in water.
   b) Observing a person on the beach.
   c) Observing a duck flying over water.
   d) All of the above.

5. We make a lot of observations of A happening within 5 minutes
   before B. To show that A causes B:
   a) We need to observe at least 1,000 instances of A happening right
      before B.
   b) We need to observe at least 1,000,000 instances of A happening
      right before B.
   c) Observing B without A proves that A does not cause B.
   d) We need to conduct a controlled experiment.

6. Fair coin flipping game: We start with $1. Heads we win 50%, tails
   we lose 50%. After 2 rounds, with a fair coin, the mean value we
   will have:
   a) $0.25
   b) $0.75
   c) $1.00
   d) $2.25

7. Fair coin flipping game: We start with $1. Heads we win 50%, tails
   we lose 50%. After 2 rounds, with a fair coin, the median value we
   will have:
   a) $0.25
   b) $0.75
   c) $1.00
   d) $2.25

8. For the last 3 years, your investment returned: +35%, +35%, -70%.
   Which measure of central tendency would best describe your annual
   return?
   a) Arithmetic mean
   b) Geometric mean
   c) Median
   d) Standard Variance

9. The process of computing P(x) from P(x,y) is called
   a) Bootstrapping
   b) Generalizing
   c) Specifizing
   d) Marginalizing

10. If P(x|y) != P(x,y)/P(y) then
   a) x is more likely after y.
   b) y causes x.
   c) x and y are independent.
   d) x and y are not independent.
   e) None of the above, answer is:

11. In Bayes rule: P(x|y)=P(y|x)P(x)/P(y), the P(y|x) is:
   a) The likelihood.
   b) The prior probability.
   c) The posterior probability.
   d) The conditional probability of y given x.

12. Conditional probability P(y|x) differs from likelihood P(y|x):
   a) They're both the same.
   b) They both sum to 1.
   c) Probability P(y|x) is a function of y, while likelihood P(y|x)
      is a function of x.
   d) Likelihood tells us the probability of y given x.

13. We have two dice, a 6-sided one and an 8-sided one. We pick one at
   random. What's the probability we picked the 6-sided die?
   a) 1/2
   b) 3/7
   c) 9/25
   d) 4/7
   e) None of the above, the answer is:

14. We have two dice, a 6-sided one and an 8-sided one. We pick one at
   random, roll it, and note the number: 4. What's the probability we
   picked the 8-sided die?
   a) 1/2
   b) 3/7
   c) 9/25
   d) 4/7
   e) None of the above, the answer is:

15. Smallpox: Suppose that out of 1 million people, 99% are vaccinated,
   and 1% are not. A vaccinated person has a 1% chance of developing a
   reaction, which has a 1% chance of being fatal. A vaccinated person
   has no chance of getting smallpox. An unvaccinated person has a 1%
   chance of getting smallpox, which is fatal in 20% of the cases.
   Quick math shows that we can expect 99 fatalities
   (1000000 * 0.99 * 0.01 * 0.01) from vaccine complications and
   20 fatalities (1000000 * 0.01 * 0.01 * 0.20) from smallpox.
   Vaccinations kill more people than smallpox!
   What is wrong with the above analysis?
   e) Answer:

16. The probability of correctly answering question 16 on this exam is
   P(q16). What is 1 - P(q16) ?
   a) The inverse probability.
   b) It should be P(q16) * 3/4, since there are 3 wrong answers and
      only 1 correct answer.
   c) Our belief in answering question 16.
   d) The probability of incorrectly answering question 16.

17. Let P(q16,q17) be the joint probability of correctly answering
   questions 16 and 17 on this exam. If P(q16,q17) != P(q16)P(q17) then:
   a) q16 is more likely than q17.
   b) q16 causes q17.
   c) q16 and q17 are not independent.
   d) q16 and q17 are independent.
   e) None of the above, answer is:

18. (short answer) A large language model such as ChatGPT estimates a
   probability distribution for a word given a sequence of preceding
   words. In some ways, it is not unlike:
   P(nextword|w0, w1, w2, ... wN)
   If we wish to implement our own model using such a joint
   probability, what problems would we encounter?

19. (short answer) Continuing question 18, we realize most words are
   independent, and make the Naive Bayes assumption. What problems
   would we encounter now?

20. (short answer) How might you fix the problems identified in
   question 19, without re-introducing the problems of question 18?
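
Note on question 15: the two products quoted in the question can be
reproduced with the short Python sketch below. It only recomputes the
stated arithmetic and takes no position on what is wrong with the
analysis. The names (POPULATION, expected_count, and so on) are
illustrative assumptions and do not come from the exam; exact fractions
are used so the results are not blurred by floating-point rounding.

```python
from fractions import Fraction

# Figures quoted in question 15 (names are hypothetical, values are from the exam).
POPULATION = 1_000_000
P_VACCINATED = Fraction(99, 100)
P_UNVACCINATED = Fraction(1, 100)
P_REACTION = Fraction(1, 100)         # reaction, given vaccinated
P_REACTION_FATAL = Fraction(1, 100)   # fatal, given a reaction
P_SMALLPOX = Fraction(1, 100)         # smallpox, given unvaccinated
P_SMALLPOX_FATAL = Fraction(20, 100)  # fatal, given smallpox


def expected_count(population, *probabilities):
    """Multiply a population by a chain of probabilities."""
    result = Fraction(population)
    for p in probabilities:
        result *= p
    return result


vaccine_fatalities = expected_count(
    POPULATION, P_VACCINATED, P_REACTION, P_REACTION_FATAL
)
smallpox_fatalities = expected_count(
    POPULATION, P_UNVACCINATED, P_SMALLPOX, P_SMALLPOX_FATAL
)

print("vaccine complications:", vaccine_fatalities)
print("smallpox:", smallpox_fatalities)
```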