
CISC 7510X/7512X
CISC 7510X (DB1) Homeworks
Email homeworks to me at alex at theparticle dot com. Start the email subject with "CISC 7510X HW#". Homeworks without that subject line risk being deleted and not counted.
CISC 7510 HW# 1 (due by 3rd class): For the `store' schema below:
product(productid,description,listprice)
customer(customerid,username,fname,lname,street1,street2,city,state,zip)
purchase(purchaseid,purchasetimestamp,customerid)
purchase_items(itemid,purchaseid,productid,quantity,price)
Write a SQL query that answers each of these questions:
- What is the description of productid=42?
- What's the name and address of customerid=42?
- What products did customerid=42 purchase?
- List customers who bought productid=24.
- List names of customers who have never purchased anything.
- List descriptions of products that have never been purchased by anyone.
- What products were purchased by customers with zip code 10001?
- What percentage of customers have ever purchased productid=42?
- Of customers who purchased productid=42, what percentage also purchased productid=24?
- What is the most popular (purchased most often) product in NY state?
- What is the most popular (purchased most often) product in the Tri-state Area (NJ, NY, CT)?
- Who purchased productid=24 prior to July 4th, 2020?
- For each customer, find all products from their last purchase.
- For each customer, find all products from their last 10 purchases.
- List names of customers who have purchased productid=42 in the last 3 months.
Also, install PostgreSQL.
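Before PostgreSQL is installed, the `store' schema above can be tried out in a throwaway database. The following is only a sketch: it uses Python's built-in sqlite3 module instead of PostgreSQL, the column types and key constraints are assumptions (the assignment lists only column names), and the sample rows are made up. It answers none of the questions; it just gives you tables to run your own queries against.

```python
import sqlite3

# Column types and keys are assumptions; the assignment gives only column names.
DDL = """
CREATE TABLE product (
    productid   INTEGER PRIMARY KEY,
    description TEXT,
    listprice   NUMERIC
);
CREATE TABLE customer (
    customerid INTEGER PRIMARY KEY,
    username TEXT, fname TEXT, lname TEXT,
    street1 TEXT, street2 TEXT, city TEXT, state TEXT,
    zip TEXT  -- kept as text so leading zeros survive
);
CREATE TABLE purchase (
    purchaseid        INTEGER PRIMARY KEY,
    purchasetimestamp TEXT,
    customerid        INTEGER REFERENCES customer(customerid)
);
CREATE TABLE purchase_items (
    itemid     INTEGER PRIMARY KEY,
    purchaseid INTEGER REFERENCES purchase(purchaseid),
    productid  INTEGER REFERENCES product(productid),
    quantity   INTEGER,
    price      NUMERIC
);
"""

def make_store_db():
    """Create an in-memory copy of the `store' schema with a few made-up rows."""
    conn = sqlite3.connect(":memory:")
    conn.executescript(DDL)
    conn.execute("INSERT INTO product VALUES (42, 'widget', 9.99)")
    conn.execute("INSERT INTO product VALUES (24, 'gadget', 19.99)")
    conn.execute(
        "INSERT INTO customer VALUES "
        "(42, 'jdoe', 'Jane', 'Doe', '1 Main St', NULL, 'New York', 'NY', '10001')"
    )
    conn.execute("INSERT INTO purchase VALUES (1, '2020-07-01 12:00:00', 42)")
    conn.execute("INSERT INTO purchase_items VALUES (1, 1, 24, 2, 19.99)")
    conn.commit()
    return conn

conn = make_store_db()
# Sanity check: count the rows in each table.
row_counts = {
    table: conn.execute(f"SELECT COUNT(*) FROM {table}").fetchone()[0]
    for table in ("product", "customer", "purchase", "purchase_items")
}
print(row_counts)
```

SQLite and PostgreSQL differ in places (date arithmetic and window-function details, for example), so treat this as a scratchpad and check final queries against PostgreSQL.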
CISC 7510 HW# 2 (due by 4th class): Install PostgreSQL.
As a `simple' review of SQL, do the `Sample Questions' at the end of sql2.pdf. For the same database, also answer the following questions:
- Find the company with most employees.
- Find employees who make more than the average salary within their company.
- Find employees who make more than the median salary within their company.
- Find employees whose salary is an outlier (above 2 standard deviations) within their company.
- Find employees whose salary is an outlier (above 95th percentile) within their company.
- Find the company with the highest number of outlying salaries (your choice which outlier to use).
- Find the company with most non-managing employees.
- Find the company with highest average difference between manager salary and non-manager employee salary.
- Assume that each non-managing employee generates around 2x their salary in revenue. Managing employees don't directly contribute to revenue. Estimate revenue and ``profit'' for each company (assume profit = revenue - all_salaries).
- Calculate salary skew for profitable (profit > 0) companies from question 9.
Email the query text.
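To make the revenue and ``profit'' assumption above concrete, here is the arithmetic for a single company in plain Python. It is a sketch of the formula only, not a SQL answer: the function name and the salary figures are invented for illustration, and which employees count as managers depends on the sql2.pdf database.

```python
def estimate_profit(non_manager_salaries, manager_salaries):
    """Return (revenue, profit) for one company under the homework's assumption."""
    # Each non-managing employee generates about 2x their salary in revenue.
    revenue = 2 * sum(non_manager_salaries)
    # Managers add no revenue, but their salaries still count as cost.
    all_salaries = sum(non_manager_salaries) + sum(manager_salaries)
    return revenue, revenue - all_salaries

# Hypothetical company: three non-managers and one manager.
print(estimate_profit([50_000, 60_000, 70_000], [120_000]))
# Hypothetical top-heavy company: one non-manager, one manager.
print(estimate_profit([40_000], [90_000]))
```

Under this model a company is profitable exactly when its non-manager payroll exceeds its manager payroll, since profit = 2 * non_manager_total - (non_manager_total + manager_total).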
CISC 7512X (DB2) Homeworks
Email homeworks to me at alex at theparticle dot com. Start the email subject with "CISC 7512X HW#". Homeworks without that subject line risk being deleted and not counted.
CISC 7512X HW# 1 (due by 3rd class): For the `bank' schema below:
customer(customerid,username,fname,lname,street1,street2,city,state,zip)
account(accountid,customerid,description,)
transaction(transactionid,trantimestamp,accountid,amount)
A customer may have several accounts, and each account may participate in many transactions. Each transaction will have at least two records, one deducting amount from an account, and one adding amount to an account (for a single transactionid, the sum of amounts will equal zero).
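As an illustration of the record layout just described, the sketch below stores one transfer as two rows that share a transactionid. It uses Python's built-in sqlite3 module rather than PostgreSQL. The column types, the use of integer cents for amount, and the record_transfer helper are all assumptions made for this example.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
# The table name is quoted defensively because TRANSACTION is an SQL keyword.
# Types are assumptions; amount is stored as integer cents in this sketch.
conn.execute("""
CREATE TABLE "transaction" (
    transactionid INTEGER,
    trantimestamp TEXT,
    accountid     INTEGER,
    amount        INTEGER
)""")

def record_transfer(conn, transactionid, trantimestamp, from_account, to_account, amount):
    """Hypothetical helper: write both legs of one transfer."""
    legs = [
        (transactionid, trantimestamp, from_account, -amount),  # money leaves
        (transactionid, trantimestamp, to_account, amount),     # money arrives
    ]
    conn.executemany('INSERT INTO "transaction" VALUES (?, ?, ?, ?)', legs)

# Move 25.00 (2500 cents) from account 42 to account 43.
record_transfer(conn, 1, "2023-12-01 09:00:00", 42, 43, 2500)

legs = conn.execute(
    'SELECT accountid, amount FROM "transaction" '
    "WHERE transactionid = 1 ORDER BY amount"
).fetchall()
print(legs)                       # one negative leg and one positive leg
print(sum(a for _, a in legs))    # the legs of a valid transaction cancel out
```

Integer cents avoid floating-point rounding when checking that the legs cancel; in PostgreSQL a NUMERIC column would serve the same purpose.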
Write a SQL query that answers each of these questions:
- What is the balance of accountid=42?
- What was the transaction amount of transactionid=42?
- Which transactionids do not sum up to zero (are invalid)?
- List customers without accounts.
- What is the balance (total across all accounts) for customerid=42?
- What is the total balance of all customers living in zip code 10001?
- Which zip code has the highest balance?
- List the top 1% of customers (ordered by total balance).
- Using balances for the previous two months, predict what the balances will be next month. (tip: find the slope of a line; x-axis is days, y-axis is balance. Two previous months means you have 2 points, so finding the slope is easy. Use the slope to predict where next month's balance will be.)
- List top 10 fastest growing accounts (using previous 2 months). (tip: same as above, fastest growing means steepest slope).
- For each account, what was the closing balance on December 31, 2023?
- What percentage of bank's money is held by people in the tri-state area today? (NY, NJ, CT)
- Write a query to add 0.01% to each savings account (note that the money has to be accounted for).
- Find all accounts with a 30-day moving average balance less than $1500.
- Find all accounts with fewer than 2 transactions in the previous 30 days.
- Find all accounts with negative balances for the entire previous 30 days.
- Find all accounts that meet all conditions from previous 3 bullet points.
- Find customers who move money between their accounts too frequently (e.g. transaction count among accounts that they own is above the 99th percentile of all customers in the bank).
- We (the bank) define a round-trip as a transaction from account A to account B that is followed by a transaction from account B back to account A within 30 days, for an amount that is within 10% of the original amount. Find all round-trip transactions in the last year. Make sure not to count transactions twice. [e.g. A-to-B, then B-to-A, followed by another B-to-A, shouldn't count as two round-trips, since the 2nd B-to-A isn't paired up with its own A-to-B].
- Generate a list of customers who engaged in at least 10 round-trips in the last year.
- We (the bank) define a loop as any set of fewer than 4 transactions that forms a loop, e.g.: A-to-B, B-to-C, and C-to-A, within 30 days and within 10% of the amount. This is just a bigger version of a round-trip, involving more than 2 participants. Find all loops in the last year. Again, do not overcount.
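The slope tip in the balance-prediction question above comes down to two-point linear extrapolation. Here is the arithmetic in plain Python, as a sketch rather than a SQL answer: the predict_balance name, the choice of month-end balances as the two points, and the sample figures are all assumptions made for illustration.

```python
from datetime import date

def predict_balance(d1, b1, d2, b2, target):
    """Extrapolate the line through (d1, b1) and (d2, b2) out to `target`."""
    slope = (b2 - b1) / (d2 - d1).days        # balance change per day
    return b2 + slope * (target - d2).days    # continue the line from the 2nd point

# Made-up month-end balances: 1000 on Oct 31, 1300 on Nov 30.
predicted = predict_balance(
    date(2023, 10, 31), 1000.0,
    date(2023, 11, 30), 1300.0,
    date(2023, 12, 31),
)
print(predicted)  # slope is 10 per day over 31 more days: 1610.0
```

The same slope value is what ranks accounts in the fastest-growing question: a steeper positive slope means faster growth.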