September 29th, 2022

## CISC 7700X Homeworks

### You should EMAIL me homeworks: alex at theparticle dot com. Start the email subject with "CISC 7700X HW#". Homeworks without that subject line risk being deleted and not counted.

CISC 7700X HW# 1 (due by 2nd class): Email me your name, preferred email address, IM account (if any), major, and year.

CISC 7700X HW# 2 (due by 3rd class):

1. A furniture manufacturer makes two kinds of furniture: chairs and sofas. The production process has three operations: carpentry, finishing, and upholstery. The labor required for each operation varies. Manufacturing a chair requires 6 hours of carpentry, 1 hour of finishing, and 2 hours of upholstery. Manufacturing a sofa requires 3 hours of carpentry, 1 hour of finishing, and 6 hours of upholstery. Due to the limited availability of skilled labor, each day we have 96 hours of carpentry labor, 18 hours of finishing labor, and 72 hours of upholstery labor. We make \$80 profit per chair and \$70 profit per sofa. How many chairs and sofas should we manufacture per day to maximize profit? Show work (don't just email two numbers).
2. A soup manufacturer sells 16oz cans of soup. They would like to minimize the amount of metal used in the construction of the can. What are the dimensions of a 16oz can that uses the least amount of metal? Show work. [hint]
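One way to check work on the furniture problem is to treat it as a small linear program and enumerate the corners of the feasible region, since a linear objective over such a region attains its maximum at a corner. The sketch below is only an illustration: the names `CONSTRAINTS`, `PROFIT`, `intersect`, `feasible`, and `best_vertex` are made up for this example, and it allows fractional quantities, so a non-integer result would still need rounding and re-checking.

```python
from itertools import combinations

# Each constraint is (a, b, limit), meaning a*chairs + b*sofas <= limit.
CONSTRAINTS = [
    (6, 3, 96),   # carpentry hours per day
    (1, 1, 18),   # finishing hours per day
    (2, 6, 72),   # upholstery hours per day
    (-1, 0, 0),   # chairs >= 0
    (0, -1, 0),   # sofas >= 0
]
PROFIT = (80, 70)  # dollars per chair, dollars per sofa


def intersect(c1, c2):
    """Point where two constraint boundary lines cross, or None if parallel."""
    a1, b1, r1 = c1
    a2, b2, r2 = c2
    det = a1 * b2 - a2 * b1
    if det == 0:
        return None
    # Cramer's rule for the 2x2 system.
    x = (r1 * b2 - r2 * b1) / det
    y = (a1 * r2 - a2 * r1) / det
    return x, y


def feasible(point, eps=1e-9):
    """True if the point satisfies every constraint (with a small tolerance)."""
    x, y = point
    return all(a * x + b * y <= limit + eps for a, b, limit in CONSTRAINTS)


def profit(point):
    return PROFIT[0] * point[0] + PROFIT[1] * point[1]


def best_vertex():
    """Return the feasible corner point with the highest profit."""
    corners = []
    for c1, c2 in combinations(CONSTRAINTS, 2):
        point = intersect(c1, c2)
        if point is not None and feasible(point):
            corners.append(point)
    return max(corners, key=profit)


if __name__ == "__main__":
    chairs, sofas = best_vertex()
    print(chairs, sofas, profit((chairs, sofas)))
```

Plotting the three labor constraints by hand and evaluating profit at each corner is the same procedure, and is the kind of work the problem asks you to show.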

CISC 7700X HW# 3 (due by 4th class): We have a labeled training data set: hw3.data1.csv.gz.

Thinking of a linear model, we come up with:

y = 24*column1 + -15*column2 + -38*column3 + -7*column4 + -41*column5 + 35*column6 + 0*column7 + -2*column8 + 19*column9 + 33*column10 + -3*column11 + 7*column12 + 3*column13 + -47*column14 + 26*column15 + 10*column16 + 40*column17 + -1*column18 + 3*column19 + 0*column20 + -6

If y > 0, predict 1; otherwise predict -1.

What is the accuracy? Calculate the confusion matrix for this model. If cost of a false negative is \$1000, and cost of a false positive is \$100, (and \$0 for an accurate answer), what is the expected economic gain?
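The calculation can be sketched in a few lines of code as an alternative to a spreadsheet. This is a minimal illustration, not a prescribed method: the layout of hw3.data1.csv.gz is not described here, so the functions take already-parsed feature rows and labels, and the helper names (`predict`, `confusion_matrix`, `accuracy`, `expected_gain`) are invented for this example. It also assumes one reading of "expected economic gain": the negative of the average cost per record.

```python
# Coefficients for column1..column20 and the constant term, copied from the model above.
WEIGHTS = [24, -15, -38, -7, -41, 35, 0, -2, 19, 33,
           -3, 7, 3, -47, 26, 10, 40, -1, 3, 0]
INTERCEPT = -6


def predict(row, threshold=0.0):
    """Return 1 if the linear score exceeds the threshold, otherwise -1."""
    y = sum(w * x for w, x in zip(WEIGHTS, row)) + INTERCEPT
    return 1 if y > threshold else -1


def confusion_matrix(actual, predicted):
    """Count true/false positives and negatives for labels in {1, -1}."""
    counts = {"tp": 0, "fp": 0, "tn": 0, "fn": 0}
    for a, p in zip(actual, predicted):
        if p == 1:
            counts["tp" if a == 1 else "fp"] += 1
        else:
            counts["tn" if a == -1 else "fn"] += 1
    return counts


def accuracy(cm):
    """Fraction of records classified correctly."""
    return (cm["tp"] + cm["tn"]) / sum(cm.values())


def expected_gain(cm, fn_cost=1000, fp_cost=100):
    """Assumed definition: minus the average cost per record."""
    total_cost = cm["fn"] * fn_cost + cm["fp"] * fp_cost
    return -total_cost / sum(cm.values())


if __name__ == "__main__":
    # Toy labels only, not the homework data.
    actual = [1, 1, -1, -1, 1]
    predicted = [1, -1, -1, 1, 1]
    cm = confusion_matrix(actual, predicted)
    print(cm, accuracy(cm), expected_gain(cm))
```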

How can we tweak the model to increase economic gain? Come up with a model that maximizes economic gain (approximations are OK; try guesstimating a few possibilities in a spreadsheet, etc.).
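Because a false negative costs ten times as much as a false positive, one simple tweak to try is moving the decision threshold (equivalently, the constant term) so that the model predicts 1 more readily. The sketch below sweeps candidate thresholds over precomputed scores; it is only one possible approach, the function names `cost_at` and `best_threshold` are hypothetical, and the data in the usage section is a toy example.

```python
def cost_at(threshold, scores, labels, fn_cost=1000, fp_cost=100):
    """Total cost when predicting 1 for score > threshold, else -1."""
    cost = 0
    for score, actual in zip(scores, labels):
        predicted = 1 if score > threshold else -1
        if predicted == -1 and actual == 1:
            cost += fn_cost   # missed a positive
        elif predicted == 1 and actual == -1:
            cost += fp_cost   # flagged a negative
    return cost


def best_threshold(scores, labels, fn_cost=1000, fp_cost=100):
    """Try each observed score (plus one value below all of them) as a threshold."""
    candidates = sorted(set(scores))
    candidates = [candidates[0] - 1] + candidates
    return min(candidates,
               key=lambda t: cost_at(t, scores, labels, fn_cost, fp_cost))


if __name__ == "__main__":
    # Toy scores and labels only, not the homework data.
    scores = [-5, -1, 2, 4, 9]
    labels = [-1, 1, -1, 1, 1]
    t = best_threshold(scores, labels)
    print(t, cost_at(t, scores, labels), cost_at(0, scores, labels))
```

A threshold chosen this way is tuned to the training data, so the gain it reports is optimistic for new data.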

Email the numbers and the steps you used to calculate things (you can do most of this homework in a spreadsheet [Excel?], but feel free to write code).

CISC 7700X HW# 4 (due by Nth class):

Using data from stockrow for the previous 2 years (excluding the latest quarter!), build linear [y = a+bx], logarithmic [y = a+b*log(x)], exponential [y = b*exp(a*x)], and power curve [y = b*x^a] models on revenue, earnings, and dividends for the symbols IBM, MSFT, AAPL, GOOG, FB, PG, GE.
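All four model shapes can be fitted with ordinary least squares after a change of variables: the logarithmic model is linear in log(x), the exponential model is linear in x after taking log(y), and the power model is linear in log(x) after taking log(y). The sketch below assumes x is simply the quarter index (1, 2, 3, ...), which is an interpretation rather than something stated in the assignment, and the function names are made up for this example. Fitting in log space minimizes error in log(y) rather than in y, and it requires positive values (so it breaks on negative earnings).

```python
import math


def least_squares(xs, ys):
    """Return (intercept, slope) of the least-squares line through the points."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    return mean_y - slope * mean_x, slope


def fit_linear(xs, ys):
    """y = a + b*x"""
    a, b = least_squares(xs, ys)
    return lambda x: a + b * x


def fit_logarithmic(xs, ys):
    """y = a + b*log(x); requires x > 0."""
    a, b = least_squares([math.log(x) for x in xs], ys)
    return lambda x: a + b * math.log(x)


def fit_exponential(xs, ys):
    """y = b*exp(a*x), fitted as log(y) = log(b) + a*x; requires y > 0."""
    log_b, a = least_squares(xs, [math.log(y) for y in ys])
    return lambda x: math.exp(log_b) * math.exp(a * x)


def fit_power(xs, ys):
    """y = b*x^a, fitted as log(y) = log(b) + a*log(x); requires x, y > 0."""
    log_b, a = least_squares([math.log(x) for x in xs],
                             [math.log(y) for y in ys])
    return lambda x: math.exp(log_b) * x ** a


if __name__ == "__main__":
    # Made-up numbers for seven quarters, not real company data.
    quarters = [1, 2, 3, 4, 5, 6, 7]
    revenue = [10.0, 10.8, 11.9, 12.7, 14.1, 15.0, 16.4]
    for fit in (fit_linear, fit_logarithmic, fit_exponential, fit_power):
        model = fit(quarters, revenue)
        print(fit.__name__, model(8))
```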

Which model works best for which metric/symbol? Show with numbers (e.g., r-squared score). Read through: Coefficient of determination.
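For reference, the usual definition of the r-squared score is 1 minus the ratio of the residual sum of squares to the total sum of squares. A minimal sketch follows; the function name `r_squared` is just for illustration, and the values are computed in the original units of y even for models fitted in log space.

```python
def r_squared(actual, predicted):
    """Coefficient of determination: 1 - SS_res / SS_tot."""
    mean = sum(actual) / len(actual)
    ss_res = sum((a - p) ** 2 for a, p in zip(actual, predicted))
    ss_tot = sum((a - mean) ** 2 for a in actual)
    return 1 - ss_res / ss_tot


if __name__ == "__main__":
    # Toy numbers only.
    actual = [1.0, 2.0, 3.0, 4.0]
    predicted = [1.1, 1.9, 3.2, 3.8]
    print(r_squared(actual, predicted))
```

A score of 1 means the model reproduces the data exactly, and a score of 0 means it does no better than always predicting the mean. Computing it for each model on the same metric and symbol gives a like-for-like comparison.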

Using the best model for each metric, make a prediction for the "next quarter" revenue, earnings, and dividends. Remember, you didn't use the last number to build your models. Compare your model's prediction to the last quarter's number. What's the error? [hint]