You should EMAIL me homeworks, alex at theparticle dot com. Start your email subject with "CIS25".
HW# 1 (due by 2nd class): Email me your name, preferred email address, IM account (if any), major, and year.
HW# 2 (due by 3rd class): There is a Project at the end of chapter 2 (or chapter 3 in the new edition) of your book, ``Adding a System Call to the Linux Kernel''. This homework is about following that guide and... actually doing it. It involves building your own Linux kernel, adding the system call, and then actually using that system call from your program. All of that is explained in the book project; if you have any questions, feel free to email me. Email all the source files you've created (the system call, and your program to test it). You can use VirtualBox to host Ubuntu for this homework. Note that some small details have changed in later versions of Linux, so use Google to figure out how to compile the module with the latest kernel.
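Once your rebuilt kernel is running, you invoke the new call by number through libc's syscall(2). A minimal sketch of that invocation, using Python's ctypes (my choice of language here, not a requirement); since your custom syscall number depends on your kernel, this calls an existing syscall (getpid, number 39 on x86-64 Linux; numbers differ per architecture) purely for illustration:

```python
import ctypes, os

# libc's syscall(2) invokes any system call by number. A custom
# syscall added to your kernel is called the same way, just with
# the number you assigned it in the syscall table.
libc = ctypes.CDLL(None, use_errno=True)

SYS_getpid = 39  # x86-64 Linux only; check your arch's unistd headers

print(libc.syscall(SYS_getpid), os.getpid())  # same pid twice
```

Your test program for the real homework would look the same, with your syscall's number and arguments in place of getpid.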
HW# 4 (due by 5th class): Data structures: Implement an auto-growing (same way vectors grow) array-based list with constant-time insert and remove at either the front or the back of the list. Note that this is different from how Vectors work: they only support constant-time insert/remove at the end of the list, not the beginning. Your implementation needs to support both. Implement a heap-based priority queue on top of your array-based list (your code should work similarly to PriorityQueue in Java, except use your own array-based list). Use whatever language you are comfortable with.
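A minimal sketch of the idea in Python (one possible language; class and method names are my own): the trick for O(1) at both ends is a circular buffer with a moving head index, doubling on overflow, and the heap then indexes into that list through get/set:

```python
class ArrayDeque:
    """Auto-growing circular array: amortized O(1) insert/remove
    at both the front and the back."""
    def __init__(self):
        self._buf = [None] * 8
        self._head = 0           # index of the first element
        self._size = 0

    def __len__(self):
        return self._size

    def _grow(self):
        old, n = self._buf, len(self._buf)
        self._buf = [None] * (2 * n)
        for i in range(self._size):            # unwrap into new buffer
            self._buf[i] = old[(self._head + i) % n]
        self._head = 0

    def push_back(self, x):
        if self._size == len(self._buf):
            self._grow()
        self._buf[(self._head + self._size) % len(self._buf)] = x
        self._size += 1

    def push_front(self, x):
        if self._size == len(self._buf):
            self._grow()
        self._head = (self._head - 1) % len(self._buf)  # wrap backwards
        self._buf[self._head] = x
        self._size += 1

    def pop_front(self):
        x = self._buf[self._head]
        self._head = (self._head + 1) % len(self._buf)
        self._size -= 1
        return x

    def pop_back(self):
        self._size -= 1
        return self._buf[(self._head + self._size) % len(self._buf)]

    def get(self, i):
        return self._buf[(self._head + i) % len(self._buf)]

    def set(self, i, x):
        self._buf[(self._head + i) % len(self._buf)] = x


class Heap:
    """Min-heap priority queue backed by the ArrayDeque above
    (same spirit as Java's PriorityQueue, but on our own list)."""
    def __init__(self):
        self._a = ArrayDeque()

    def _swap(self, i, j):
        a = self._a
        t = a.get(i); a.set(i, a.get(j)); a.set(j, t)

    def add(self, x):                          # sift up
        a = self._a
        a.push_back(x)
        i = len(a) - 1
        while i > 0 and a.get(i) < a.get((i - 1) // 2):
            self._swap(i, (i - 1) // 2)
            i = (i - 1) // 2

    def poll(self):                            # remove min, sift down
        a = self._a
        top = a.get(0)
        last = a.pop_back()
        if len(a):
            a.set(0, last)
            i = 0
            while True:
                l, r, s = 2 * i + 1, 2 * i + 2, i
                if l < len(a) and a.get(l) < a.get(s): s = l
                if r < len(a) and a.get(r) < a.get(s): s = r
                if s == i:
                    break
                self._swap(i, s)
                i = s
        return top
```

Note the heap never touches the front of the list; the double-ended behavior is the first half of the homework, the heap exercises the second half.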
HW# 4 (due by 5th class): This is mostly a status update homework. Start working on project1. Submit the program to generate input data (along with instructions on how to compile/run it).
HW# 5 (due by 6th class): Write a program to estimate PI using the Monte Carlo algorithm, using N threads (have N=100). Individual threads will update a shared estimated value of PI, using the Bakery algorithm to synchronize. This homework is about implementing the Bakery algorithm---do not use other synchronization methods.
You will have 2 variables: CNT and INARC. These variables MUST be shared between threads; together they form your PI estimate. Each thread will loop forever; at each iteration it gets two random values x,y in (0..1). Each iteration increments CNT. If x*x + y*y <= 1, the thread also increments INARC. You will have N threads doing this at the same time. Access to CNT++ and INARC++ will be controlled via the Bakery algorithm.
Every 100th iteration, each thread will display the current PI estimate, which is just: INARC/CNT*4. Note that the simulation will be very very very slow (e.g. each of the 100 threads spends most of its time waiting on the other 99, and a value is only displayed every 100th iteration).
Submit the program.
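A minimal runnable sketch of the scheme above, using Lamport's bakery lock with Python threads (my choice of language; N and the iteration count are shrunk so the demo finishes quickly, where the homework asks for N=100 and an infinite loop). Under CPython's GIL, plain loads and stores are effectively sequentially consistent, which is what the bakery algorithm assumes; in C you would need volatile/fences:

```python
import random
import sys
import threading

N = 3          # threads (homework: 100)
ITERS = 200    # per-thread iterations (homework: loop forever)

sys.setswitchinterval(0.0005)  # spin locks ping-pong on the GIL; switch often

choosing = [False] * N
number = [0] * N               # bakery "ticket" numbers, 0 = not waiting
CNT = 0
INARC = 0

def lock(i):
    # Take a ticket one higher than any currently held ticket.
    choosing[i] = True
    number[i] = 1 + max(number)
    choosing[i] = False
    for j in range(N):
        if j == i:
            continue
        while choosing[j]:             # wait for j to finish picking a ticket
            pass
        while number[j] != 0 and (number[j], j) < (number[i], i):
            pass                       # lower ticket (ties: lower id) goes first

def unlock(i):
    number[i] = 0

def worker(i):
    global CNT, INARC
    for it in range(1, ITERS + 1):
        x, y = random.random(), random.random()
        hit = x * x + y * y <= 1.0
        lock(i)                        # bakery-protected critical section
        CNT += 1
        if hit:
            INARC += 1
        unlock(i)
        if it % 100 == 0:
            print("pi ~", INARC / CNT * 4)

threads = [threading.Thread(target=worker, args=(i,)) for i in range(N)]
for t in threads:
    t.start()
for t in threads:
    t.join()
```

The (number, id) pair comparison is the heart of the algorithm: it gives a total order on waiting threads without any atomic read-modify-write instruction.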
HW# 6 (due by 7th class): From your project1, submit the code that reads your data files. For HW5 you submitted the program that generates input data. For HW6, submit the code to read these files and load them into the appropriate data structures that your project will use. (Just like HW5, this is a status update homework---just to make sure everyone's working on the project.)
HW# 7: In the not-so-distant future, flying cars are commonplace---everyone on the planet has one. Yes, there are ~10 billion flying cars all over the globe. Each one logs its coordinates every 10 milliseconds (assume x,y,z), with z being altitude, and x,y some Cartesian equivalent of GPS. To avoid accidents, regulation states that no car may come within 10 feet of any other car while in the air (z > 0). Cars can go really fast, ~500mph. YOUR TASK: write an algorithm & program to find all violators. Assume input is a HUGE file (10 billion cars logging VIN,time,x,y,z every 10 milliseconds, all the time).
What is the running time of your algorithm? If it's O(N^2), can you make it run in O(N log N) time? (note that with this much data, N^2 is not practical, even N log N is a bit long). Use Apache Hadoop (hadoop.apache.org) MapReduce to distribute the algorithm. Is the running time still O(N log N)?
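One way to get below O(N^2) (a sketch of the bucketing idea, not the required solution; names and the in-memory input format are my own): hash every airborne record into a 10-foot grid cell keyed by (time, cell). Only cars in the same or an adjacent cell can be within 10 feet, so instead of comparing all pairs you only compare within 27 small buckets per car:

```python
import math
from collections import defaultdict

CELL = 10.0  # feet; equal to the minimum allowed separation

def find_violators(records):
    """records: iterable of (vin, t, x, y, z) tuples.
    Returns the set of VIN pairs closer than 10 feet at the
    same timestamp while airborne (z > 0)."""
    grid = defaultdict(list)  # (t, cx, cy, cz) -> [(vin, x, y, z)]
    for vin, t, x, y, z in records:
        if z > 0:
            key = (t, int(x // CELL), int(y // CELL), int(z // CELL))
            grid[key].append((vin, x, y, z))

    violators = set()
    for (t, cx, cy, cz), cars in grid.items():
        # check this cell against itself and its 26 neighbors
        for dx in (-1, 0, 1):
            for dy in (-1, 0, 1):
                for dz in (-1, 0, 1):
                    near = grid.get((t, cx + dx, cy + dy, cz + dz), ())
                    for vin2, x2, y2, z2 in near:
                        for vin1, x1, y1, z1 in cars:
                            # vin1 < vin2 dedupes (A,B) vs (B,A)
                            if vin1 < vin2 and math.dist(
                                    (x1, y1, z1), (x2, y2, z2)) < CELL:
                                violators.add((vin1, vin2))
    return violators
```

With bounded cell occupancy this is near-linear in the number of records; the grouping by (time, cell) is also exactly the shape of a MapReduce key, which is what HW9 exploits.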
HW# 8: Write a utility to determine the speed of disk operations, in MB/second. Many hard-drive manufacturers advertise "X" megabytes per second read/write, etc.; write a utility to verify that (i.e. how quickly can YOUR utility read/write to the disk). Your program should output 4 numbers: sequential read, sequential write, random read, random write, in MB/second. (Assume you're writing at least 1MB in any given disk hit.) Note you will need to read/write a LOT to avoid memory/disk cache effects (cache is fast, we all know that... but what's the actual disk read/write speed?).
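A sketch of the sequential half of the timing structure (my own function name and sizes; a real run needs a file much larger than RAM to defeat the page cache, 64MB here only demonstrates the shape). Random read/write is the same loop with an os.lseek to a random 1MB-aligned offset before each hit:

```python
import os
import time

def disk_speed(path="bench.tmp", mb=64, chunk_mb=1):
    """Rough sequential write/read benchmark in MB/s."""
    chunk = os.urandom(chunk_mb * 1024 * 1024)
    n = mb // chunk_mb

    # --- sequential write ---
    t0 = time.perf_counter()
    fd = os.open(path, os.O_WRONLY | os.O_CREAT | os.O_TRUNC)
    for _ in range(n):
        os.write(fd, chunk)
    os.fsync(fd)                  # force data out of OS buffers
    os.close(fd)
    write_mbs = mb / (time.perf_counter() - t0)

    # --- sequential read ---
    t0 = time.perf_counter()
    fd = os.open(path, os.O_RDONLY)
    while os.read(fd, len(chunk)):
        pass
    os.close(fd)
    read_mbs = mb / (time.perf_counter() - t0)

    os.remove(path)
    return write_mbs, read_mbs
```

The fsync is the important detail: without it you are timing the page cache, not the disk. (On Linux you can also drop caches or open with O_DIRECT before the read pass, for the same reason.)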
HW# 9: Download and install Apache Hadoop (hadoop.apache.org). Assume HW7 data file is stored in a file called "cars.dat" on HDFS (the file is many terabytes in size). Write a map-reduce job to solve HW7. Submit the map-reduce program for Hadoop.
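If you'd rather not write Java, Hadoop also runs plain stdin/stdout scripts via Hadoop Streaming. A hedged sketch of that shape (assuming "VIN,time,x,y,z" lines and the 10-foot grid-cell idea from HW7; function names are mine): the mapper keys each airborne record by (time, cell), Hadoop sorts by key, and the reducer does the pairwise check within each cell. Pairs that straddle a cell boundary need the mapper to also emit records into adjacent cells (or a second pass); only the same-cell case is shown:

```python
import math

CELL = 10.0  # feet

def mapper(lines):
    # emit: "time:cx:cy:cz <TAB> vin,x,y,z" for airborne records
    for line in lines:
        vin, t, x, y, z = line.strip().split(",")
        x, y, z = float(x), float(y), float(z)
        if z > 0:
            key = "%s:%d:%d:%d" % (t, x // CELL, y // CELL, z // CELL)
            yield "%s\t%s,%f,%f,%f" % (key, vin, x, y, z)

def reducer(lines):
    # lines arrive sorted by key; gather each cell, check all pairs
    cell, cars = None, []
    for line in lines:
        key, val = line.strip().split("\t")
        if key != cell:
            yield from _pairs(cars)
            cell, cars = key, []
        vin, x, y, z = val.split(",")
        cars.append((vin, float(x), float(y), float(z)))
    yield from _pairs(cars)

def _pairs(cars):
    for i in range(len(cars)):
        for j in range(i + 1, len(cars)):
            v1, x1, y1, z1 = cars[i]
            v2, x2, y2, z2 = cars[j]
            if math.dist((x1, y1, z1), (x2, y2, z2)) < CELL:
                yield "%s,%s" % (v1, v2)
```

Deployed as streaming scripts, the mapper reads sys.stdin and prints each yielded line; the per-key sort that the reducer relies on is exactly what Hadoop's shuffle phase provides.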
HW# 10: In any language of your choice, write an NTP client. NTP is Network Time Protocol. Your program sends a UDP packet to the server. The server responds with a similarly formatted UDP packet containing the current time. A public NTP server is: pool.ntp.org. This homework is part research, part programming. Research part: Find the NTP specification, and determine what the UDP packet format is. Programming part: Write a program to send the packet AND receive a response. When your program is executed, it will output the current time (received from the NTP server) to standard output, in ``Wed Sep 24 07:29:23 EDT 2008'' format. (Note that since you're using UDP, not all packets will get through---so you might need to run the program a few times in order to get the correct time.) You can look up the format for the to/from message in the NTP RFC (google for it)---the format is pretty simple. Your program should work similarly to the unix utility ``ntptime''; [hw10 hint]
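The research part boils down to two facts from the NTP spec: the client request is a 48-byte packet whose first byte encodes LI=0, VN=3, Mode=3 (0x1B, rest zero), and the server's Transmit Timestamp sits at byte offset 40 as 32-bit seconds plus 32-bit fraction, counted from 1900-01-01 (so subtract 2208988800 to get Unix time). A minimal sketch in Python (function names are mine):

```python
import socket
import struct
import time

NTP_EPOCH_OFFSET = 2208988800  # seconds between 1900-01-01 and 1970-01-01

def build_request():
    # 48-byte packet: first byte 0x1B = LI 0, Version 3, Mode 3 (client)
    return b"\x1b" + 47 * b"\0"

def parse_transmit_time(packet):
    # Transmit Timestamp at offset 40: 32-bit seconds + 32-bit fraction
    secs, frac = struct.unpack("!II", packet[40:48])
    return secs - NTP_EPOCH_OFFSET + frac / 2**32

def ntp_time(server="pool.ntp.org", timeout=2.0):
    s = socket.socket(socket.AF_INET, socket.SOCK_DGRAM)
    s.settimeout(timeout)          # UDP: the reply may simply never arrive
    s.sendto(build_request(), (server, 123))
    packet, _ = s.recvfrom(48)
    return parse_transmit_time(packet)

if __name__ == "__main__":
    # prints e.g. Wed Sep 24 07:29:23 2008 (local time; add tz name as needed)
    print(time.strftime("%a %b %d %H:%M:%S %Y", time.localtime(ntp_time())))
```

The timeout handles the UDP caveat above: on a dropped packet recvfrom raises socket.timeout, and you just run it again.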