CISC 7412X - Artificial Intelligence II

HW1: Download a large collection of text. I highly recommend Project Gutenberg (search for it; you can download the entire DVD). Part 1: Write a parser/reader that collects the probability of every word following any other word. Convert everything to lowercase. Punctuation breaks the sequence (e.g. end of sentence, etc.). Part 2: Using the probabilities from Part 1, generate text: start with a random word, then randomly (weighted by probabilities) pick the following word, and continue. Email the code for the project and a sample of the generated output. Other things to experiment with: gather probabilities of any word following a pair or triplet of words, etc. (originally assigned 20140910)
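
As a rough illustration of HW1, here is a minimal sketch of the two parts in Python. The function names (bigram_counts, to_probabilities, generate) and the tiny sample string are made up for this example, and the tokenization rule (anything other than a-z, apostrophe, or whitespace counts as punctuation) is an assumption, not a requirement of the assignment.

```python
import random
import re
from collections import Counter, defaultdict

def bigram_counts(text):
    """Count how often each word follows each other word.

    Text is lowercased. Any run of characters other than a-z, apostrophe,
    or whitespace is treated as punctuation and breaks the sequence
    (an assumption; digits and non-ASCII letters also break it here).
    """
    counts = defaultdict(Counter)
    for segment in re.split(r"[^a-z'\s]+", text.lower()):
        words = segment.split()
        for prev, nxt in zip(words, words[1:]):
            counts[prev][nxt] += 1
    return counts

def to_probabilities(counts):
    """Normalize the counts into P(next word | previous word)."""
    return {
        prev: {w: c / sum(followers.values()) for w, c in followers.items()}
        for prev, followers in counts.items()
    }

def generate(probs, length=20, rng=random):
    """Start with a random word, then pick each following word by weight."""
    word = rng.choice(sorted(probs))
    out = [word]
    # Stop early if the current word was never seen with a follower.
    while len(out) < length and word in probs:
        followers = probs[word]
        word = rng.choices(list(followers), weights=list(followers.values()))[0]
        out.append(word)
    return " ".join(out)

# Hypothetical stand-in for a large corpus such as a Project Gutenberg text.
sample = "The cat sat. The cat ran, the dog sat. The dog ran!"
probs = to_probabilities(bigram_counts(sample))
print(probs["the"])
print(generate(probs, length=8, rng=random.Random(0)))
```

For the pair or triplet experiments, one option is to key the table on a tuple of the previous two or three words instead of a single word.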

HW2: Create a neural network with 8 binary inputs, 3 hidden units, and 8 outputs. Your inputs are the eight 8-bit patterns 00000001, 00000010, 00000100, 00001000, 00010000, 00100000, 01000000, 10000000. Your expected outputs match the inputs. In other words, you're giving the network 00000010 and expecting it to produce 00000010, etc. You should use backpropagation for training, but feel free to use any other method that works. After successful training, your network should NOT make any mistakes (you give it a 00000010 and it always outputs 00000010, etc.). Now you can cut away the last layer, and you end up with a network that turns your input string into a 3-bit binary representation. Note that 00000010 will not necessarily correspond to 010 ('2' in binary). Submit your code along with a log of output...
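
One possible shape for the HW2 network is sketched below in plain Python. Everything beyond the 8-3-8 layout is an assumption made for the example: sigmoid units, a cross-entropy loss (which makes the output delta simply y - t), per-pattern updates, and the learning rate, epoch count, and seed. Whether a given run reaches error-free reconstruction depends on those choices, so treat the numbers as a starting point rather than a recipe.

```python
import math
import random

def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))

def forward(x, W1, b1, W2, b2):
    """Return (hidden activations, output activations) for one input vector."""
    h = [sigmoid(sum(w * xi for w, xi in zip(row, x)) + b) for row, b in zip(W1, b1)]
    y = [sigmoid(sum(w * hj for w, hj in zip(row, h)) + b) for row, b in zip(W2, b2)]
    return h, y

def train(patterns, n_hidden=3, epochs=3000, lr=0.5, seed=0):
    """Backpropagation with per-pattern updates; the target equals the input."""
    rng = random.Random(seed)
    n = len(patterns[0])
    W1 = [[rng.uniform(-0.5, 0.5) for _ in range(n)] for _ in range(n_hidden)]
    b1 = [0.0] * n_hidden
    W2 = [[rng.uniform(-0.5, 0.5) for _ in range(n_hidden)] for _ in range(n)]
    b2 = [0.0] * n
    for _ in range(epochs):
        for x in patterns:
            h, y = forward(x, W1, b1, W2, b2)
            # Output deltas: with sigmoid outputs and cross-entropy loss
            # (an assumption of this sketch) the delta is just y - t.
            d_out = [yk - tk for yk, tk in zip(y, x)]
            # Hidden deltas: push the output deltas back through W2,
            # then multiply by the sigmoid derivative h * (1 - h).
            d_hid = [hj * (1 - hj) * sum(d_out[k] * W2[k][j] for k in range(n))
                     for j, hj in enumerate(h)]
            for k in range(n):
                for j in range(n_hidden):
                    W2[k][j] -= lr * d_out[k] * h[j]
                b2[k] -= lr * d_out[k]
            for j in range(n_hidden):
                for i in range(n):
                    W1[j][i] -= lr * d_hid[j] * x[i]
                b1[j] -= lr * d_hid[j]
    return W1, b1, W2, b2

def total_error(patterns, params):
    """Sum of squared output errors over all patterns."""
    return sum((yk - tk) ** 2
               for x in patterns
               for yk, tk in zip(forward(x, *params)[1], x))

# The eight one-hot patterns from the assignment.
patterns = [[1.0 if i == j else 0.0 for i in range(8)] for j in range(8)]
params = train(patterns)
for x in patterns:
    h, y = forward(x, *params)
    bits = "".join(str(int(v)) for v in x)
    rounded = "".join(str(round(v)) for v in y)
    code = " ".join(f"{v:.2f}" for v in h)
    print(bits, "->", rounded, "hidden:", code)
print("total squared error:", total_error(patterns, params))
```

The "hidden" column is what remains after cutting away the last layer. The hidden values are not forced to be exactly 0 or 1, so round them (or train longer) to read off a 3-bit code.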

HW3: Given the points (67.53,241.95), (11.75,68.29), (20.87,92.94), (3.51,37.16), (74.29,258.33), (68.49,242.46), (4.25,37.04), (34.97,137.55), (17.29,86.79), (76.56,270.25), (39.28,154.74), (97.28,336.08), (35.35,143.59), (68.79,242.51), (17.84,85.56), (76.38,268.30), (12.90,64.93), (80.57,280.01), (82.70,289.04), (28.38,116.09), fit a line, a 3rd-degree polynomial, a 5th-degree polynomial, an exponential, and a power function. I highly recommend you write your own matrix inverter (Numerical Recipes in C is a great reference for that). You can also use Octave, R, etc. Submit your code along with the equations for the line, the polynomials, the exponential, and the power function. The points above were generated via (but please ignore this when doing this homework):

perl -e'for(1..20){ $x=rand(100); $y=3.14*$x+23+rand(10); printf("(%.2f,%.2f), ",$x,$y) }; print "\n"'
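
For HW3, a minimal sketch of one common approach is a least-squares fit via the normal equations, solved with a hand-written Gaussian elimination in place of a library call. The function names (solve, polyfit, expfit, powerfit) are invented for this example. Two choices here are assumptions rather than requirements: x is rescaled before building the normal equations, because raw powers up to x^10 make the 5th-degree system badly conditioned, and the exponential and power fits are done as line fits on log-transformed data, which minimizes error in log space rather than in the original units.

```python
import math

def solve(A, b):
    """Solve A x = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    M = [row[:] + [rhs] for row, rhs in zip(A, b)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(M[r][col]))
        M[col], M[pivot] = M[pivot], M[col]
        for r in range(col + 1, n):
            f = M[r][col] / M[col][col]
            for c in range(col, n + 1):
                M[r][c] -= f * M[col][c]
    x = [0.0] * n
    for r in range(n - 1, -1, -1):
        s = sum(M[r][c] * x[c] for c in range(r + 1, n))
        x[r] = (M[r][n] - s) / M[r][r]
    return x

def polyfit(xs, ys, degree):
    """Least-squares coefficients [c0, c1, ...] of c0 + c1*x + c2*x^2 + ..."""
    m = degree + 1
    s = max(abs(x) for x in xs) or 1.0   # scale x into [-1, 1] for conditioning
    us = [x / s for x in xs]
    A = [[sum(u ** (i + j) for u in us) for j in range(m)] for i in range(m)]
    b = [sum(y * u ** i for u, y in zip(us, ys)) for i in range(m)]
    scaled = solve(A, b)
    # Undo the scaling: a coefficient of (x/s)^k becomes c / s^k for x^k.
    return [c / s ** k for k, c in enumerate(scaled)]

def expfit(xs, ys):
    """Fit y = a * exp(b*x) as a line through (x, ln y); needs y > 0."""
    ln_a, b = polyfit(xs, [math.log(y) for y in ys], 1)
    return math.exp(ln_a), b

def powerfit(xs, ys):
    """Fit y = a * x**b as a line through (ln x, ln y); needs x, y > 0."""
    ln_a, b = polyfit([math.log(x) for x in xs], [math.log(y) for y in ys], 1)
    return math.exp(ln_a), b

points = [
    (67.53, 241.95), (11.75, 68.29), (20.87, 92.94), (3.51, 37.16),
    (74.29, 258.33), (68.49, 242.46), (4.25, 37.04), (34.97, 137.55),
    (17.29, 86.79), (76.56, 270.25), (39.28, 154.74), (97.28, 336.08),
    (35.35, 143.59), (68.79, 242.51), (17.84, 85.56), (76.38, 268.30),
    (12.90, 64.93), (80.57, 280.01), (82.70, 289.04), (28.38, 116.09),
]
xs, ys = zip(*points)
for degree in (1, 3, 5):
    coeffs = polyfit(xs, ys, degree)
    print(f"degree {degree}: y =", " + ".join(f"{c:.6g}*x^{k}" for k, c in enumerate(coeffs)))
a, b = expfit(xs, ys)
print(f"exponential: y = {a:.6g} * exp({b:.6g} * x)")
a, b = powerfit(xs, ys)
print(f"power: y = {a:.6g} * x^{b:.6g}")
```

Comparing the residuals of the five fits is a reasonable sanity check on which model actually suits the data.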

HW4: Write a program to do naive Bayes classification of email. Create 2 folders, "spam" and "notspam", and fill them with text of the respective categories. You can use your own spam/notspam emails, or just make up some text that seems like it fits these categories. Write a program to train your naive Bayes classifier on the text in the folders. Note that you're estimating the P(D|C) probabilities... you already know the class (which folder the text is in). Now given a previously unseen document, use Bayes' rule to find P(C|D)... and assign either a spam or notspam tag to the new document. Submit your code, training data, etc., along with a log of output...
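
A minimal sketch of the HW4 classifier follows. It makes several assumptions that the assignment does not dictate: a bag-of-words model with add-one (Laplace) smoothing, log probabilities to avoid underflow, words unseen in training being ignored, and priors P(C) taken from the number of documents per folder. The helper names (tokenize, load_folder, train, log_posteriors, classify) and the four training strings are invented for the example. load_folder shows one way to read the "spam" and "notspam" folders but is not called here, so the demo runs without any files on disk.

```python
import math
import re
from collections import Counter
from pathlib import Path

def tokenize(text):
    return re.findall(r"[a-z']+", text.lower())

def load_folder(folder):
    """Read every file in a folder (e.g. "spam" or "notspam") as one document."""
    return [p.read_text(errors="ignore") for p in Path(folder).iterdir() if p.is_file()]

def train(docs_by_class):
    """docs_by_class maps a class name to a non-empty list of document strings."""
    total_docs = sum(len(docs) for docs in docs_by_class.values())
    vocab = set()
    model = {}
    for label, docs in docs_by_class.items():
        counts = Counter(w for d in docs for w in tokenize(d))
        vocab.update(counts)
        model[label] = {
            "prior": len(docs) / total_docs,   # P(C)
            "counts": counts,                  # word counts used for P(w|C)
            "total": sum(counts.values()),
        }
    return model, vocab

def log_posteriors(model, vocab, text):
    """Unnormalized log P(C|D) = log P(C) + sum over words of log P(w|C)."""
    scores = {}
    for label, m in model.items():
        score = math.log(m["prior"])
        for w in tokenize(text):
            if w in vocab:   # words never seen in training are skipped
                # Add-one smoothing keeps unseen (word, class) pairs above zero.
                score += math.log((m["counts"][w] + 1) / (m["total"] + len(vocab)))
        scores[label] = score
    return scores

def classify(model, vocab, text):
    scores = log_posteriors(model, vocab, text)
    return max(scores, key=scores.get)

# Made-up training text standing in for the contents of the two folders.
training = {
    "spam": ["win cash now", "cheap pills win prize"],
    "notspam": ["meeting notes attached", "lunch at noon tomorrow"],
}
model, vocab = train(training)
for doc in ["win a cash prize", "notes from the meeting"]:
    print(repr(doc), "->", classify(model, vocab, doc))
```

With real folders, the training dictionary could be built as {"spam": load_folder("spam"), "notspam": load_folder("notspam")}.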
