September 29th, 2022    


Sample Data
ctsdata.20140211.tar
Stock Ordrs



CISC 7510X (DB1) Homeworks

You should EMAIL me homeworks: alex at theparticle dot com. Start the email subject with "CISC 7510X HW#". Homeworks without that subject line risk being deleted and not counted.

CISC 7510X HW# 1 (due by 2nd class): Email me your name, preferred email address, IM account (if any), major, and year. (Oh, and install PostgreSQL.)


CISC 7510X HW# 2 (due by 3rd class): For the below `store' schema:
product(productid,description,)
customer(customerid,username,fname,lname,street1,street2,city,state,zip)
purchase(purchaseid,purchasetimestamp,customerid,productid,quantity,price)

Using SQL, answer these questions (write a SQL query that answers these questions):

  1. What is the description of productid=42?
  2. What's the name and address of customerid=42?
  3. What products did customerid=42 purchase?
  4. List customers who bought productid=24.
  5. List names of customers who have never purchased anything.
  6. List descriptions of products that have never been purchased by anyone.
  7. What products were purchased by customers with zip code 10001?
  8. What percentage of customers have ever purchased productid=42?
  9. Of customers who purchased productid=42, what percentage also purchased productid=24?
  10. What is the most popular (purchased most often) product in NY state?
  11. What is the most popular (purchased most often) product in the Tri-state Area (NJ, NY, CT)?
  12. Who purchased productid=24 prior to July 4th, 2020?
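
For trying out queries before PostgreSQL is installed, the schema above can be mocked up in a scratch database. The following is a minimal sketch, not a reference solution: it assumes Python's built-in sqlite3 module instead of PostgreSQL, the column types are guesses (the schema only lists column names), product is taken to have only the two columns shown, and every row of sample data is made up. The helper names make_store_db and products_bought_by are hypothetical. The query shown is one possible way to approach a question like #3.

```python
import sqlite3


def make_store_db():
    """Build an in-memory copy of the store schema with made-up rows."""
    conn = sqlite3.connect(":memory:")
    # Column types are assumptions; the assignment only names the columns.
    conn.executescript("""
        CREATE TABLE product(productid INTEGER PRIMARY KEY, description TEXT);
        CREATE TABLE customer(
            customerid INTEGER PRIMARY KEY, username TEXT, fname TEXT,
            lname TEXT, street1 TEXT, street2 TEXT, city TEXT, state TEXT,
            zip TEXT);
        CREATE TABLE purchase(
            purchaseid INTEGER PRIMARY KEY, purchasetimestamp TEXT,
            customerid INTEGER, productid INTEGER, quantity INTEGER,
            price REAL);
    """)
    # Sample data below is invented purely for experimentation.
    conn.executemany("INSERT INTO product VALUES (?, ?)",
                     [(24, "widget"), (42, "gadget")])
    conn.executemany("INSERT INTO customer VALUES (?,?,?,?,?,?,?,?,?)", [
        (42, "jdoe", "Jane", "Doe", "1 Main St", None,
         "New York", "NY", "10001"),
        (43, "rroe", "Rich", "Roe", "2 Elm St", None,
         "Newark", "NJ", "07102"),
    ])
    conn.executemany("INSERT INTO purchase VALUES (?,?,?,?,?,?)", [
        (1, "2020-07-01 09:30:00", 42, 24, 2, 9.99),
        (2, "2020-07-05 14:00:00", 42, 42, 1, 19.99),
    ])
    return conn


def products_bought_by(conn, customerid):
    """Return distinct descriptions of products a customer purchased."""
    rows = conn.execute("""
        SELECT DISTINCT p.description
        FROM purchase pu
        JOIN product p ON p.productid = pu.productid
        WHERE pu.customerid = ?
        ORDER BY p.description
    """, (customerid,))
    return [r[0] for r in rows]
```

The SQL text itself is plain enough that it should carry over to PostgreSQL with little or no change, though PostgreSQL drivers use a different placeholder style than sqlite3's "?".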

Also, install PostgreSQL.


CISC 7510X HW# 3 (due by 4th class): Install PostgreSQL.
For the below schema for a company door:
doorlog(eventid,doorid,tim,username,event)

Here doorid identifies the door for this event, e.g. the front door may be doorid=1, the bathroom may be doorid=2, etc. tim is the timestamp, username is the user who is opening or closing the door, and event is "E" for entry and "X" for exit.

Using SQL, answer these questions (write a SQL query that answers these questions):

  1. How many users entered through doorid=1?
  2. If doorid=2 is bathroom, how many people are currently in the bathroom?
  3. If doorid=1 is front entrance door, and doorid=3 is back entrance door, and these are the only doors in the building, how many people are currently in the building?
  4. How many people were in the building on July 4th, at 10PM? (watching fireworks)
  5. If doorid=7 is for floor 42, what's the daily occupancy of floor 42 for all of 2021? (Give a number for every day in 2021, not just days that had activity; if nobody entered or left the floor, return 0 for that day.)
  6. What is the daily average (and standard deviation) occupancy of floor 42 for 2021? (Single number; use the results of the question above.)
  7. What percentage of the people work on floor 42? (Assume that if they entered the floor, they work there.)
  8. What's the average number of times per day that people use the bathroom? (bathroom is doorid=2).
  9. What percentage of employees stayed after 5:15PM on July 3rd, 2022?
  10. List all employees who left work before 1PM on July 3rd, 2022 (assume they arrived at work on July 3rd, before 1PM).
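
Several of the questions above come down to the same idea: the number of people behind a set of doors at some moment is the number of "E" events minus the number of "X" events logged up to that moment. A minimal sketch of that idea follows, under loudly stated assumptions: it uses Python's built-in sqlite3 module rather than PostgreSQL, stores tim as ISO-formatted text (so string comparison matches time order), assumes every entry and exit is actually logged, and uses made-up helper names (make_doorlog_db, occupancy). It is one possible approach, not the expected answer.

```python
import sqlite3


def make_doorlog_db(events):
    """Create an in-memory doorlog table from (doorid, tim, username, event)."""
    conn = sqlite3.connect(":memory:")
    conn.execute("""
        CREATE TABLE doorlog(
            eventid INTEGER PRIMARY KEY, doorid INTEGER, tim TEXT,
            username TEXT, event TEXT)
    """)
    conn.executemany(
        "INSERT INTO doorlog(doorid, tim, username, event) VALUES (?,?,?,?)",
        events)
    return conn


def occupancy(conn, doorids, as_of):
    """Entries minus exits through the given doors, up to time as_of."""
    placeholders = ",".join("?" for _ in doorids)
    sql = f"""
        SELECT COALESCE(SUM(CASE event WHEN 'E' THEN 1
                                       WHEN 'X' THEN -1
                                       ELSE 0 END), 0)
        FROM doorlog
        WHERE doorid IN ({placeholders}) AND tim <= ?
    """
    # COALESCE covers the case where no events match (SUM would be NULL).
    return conn.execute(sql, (*doorids, as_of)).fetchone()[0]
```

For example, with entries by two users through doors 1 and 3 and one exit later in the day, occupancy(conn, [1, 3], some_time) would count both users before the exit and one afterwards. If the log can miss events (someone tailgating through an open door), the count drifts and would need extra handling.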

CISC 7510X HW# 4 (due by Nth class): Write a command line program to "join" .csv files. Use any programming language you're comfortable with (Python suggested). Your program should work similarly to the unix "join" utility (google for it). Unlike the unix join, your program will not require files to be sorted on the key. Your program must also accept the "type" of join to use---merge join, nested loop join, hash join, etc. Assume that the first column is the join key---or you can accept the column number as a parameter (like the unix join command). Test your program on "large" files (e.g. make sure it doesn't blow up on one million records, etc.).
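
To illustrate one of the join types named above, here is a minimal sketch of a hash join over rows that have already been parsed (for instance by Python's csv.reader). It is an assumption-laden illustration, not the required program: the function name hash_join and its parameters are made up, it performs an inner join only, it builds the hash table from the left input (so the left input must fit in memory), and it mimics the unix join output layout of key first, then the remaining left fields, then the remaining right fields.

```python
import csv
import io
from collections import defaultdict


def hash_join(left_rows, right_rows, left_key=0, right_key=0):
    """Inner hash join of two iterables of rows (lists of strings)."""
    # Build phase: index every left row by its join key.
    table = defaultdict(list)
    for row in left_rows:
        table[row[left_key]].append(row)

    # Probe phase: stream the right rows and look up matches by key.
    for r in right_rows:
        for l in table.get(r[right_key], ()):
            yield ([l[left_key]]
                   + l[:left_key] + l[left_key + 1:]
                   + r[:right_key] + r[right_key + 1:])


if __name__ == "__main__":
    # Tiny made-up inputs; real use would pass csv.reader(open(path)).
    left = csv.reader(io.StringIO("1,a\n2,b\n2,c\n"))
    right = csv.reader(io.StringIO("2,x\n3,y\n1,z\n"))
    csv.writer(io.TextIOWrapper(io.BytesIO())).writerow([])  # no-op warmup
    for joined in hash_join(left, right):
        print(",".join(joined))
```

Neither input needs to be sorted, which is the property the assignment asks for. Output order follows the right (probe) input. A merge join would instead sort both inputs on the key first, and a nested loop join would rescan one input for every row of the other.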

Submit source code for the program.

Also... load all files in ctsdata.20140211.tar (link on the left) into Oracle or Postgres (or whichever works for you). The format of these files is: cts(tdate,symbol,open,high,low,close,volume), splits(tdate,symbol,post,pre), dividend(tdate,symbol,dividend). Submit (email) whatever commands/files you used to load the data into whatever database you're using, as well as the raw space usage of the tables in your database.

© 2006, Particle