October 19th, 2021    


CISC 7700X - Introduction to Data Science

CISC 7700X : W 06:05-08:10PM ONLIN OC

For the Fall 2021 semester, class will meet on Google Meet (details TBD) during the regularly scheduled times.

Meeting ID meet.google.com/kfz-cesb-wps
Phone Numbers: 1 440-462-3021; PIN: 830 813 039#

Primary E-Mail: alex at theparticle dot com
GoogleTalk: alex at theparticle dot com

Office Hours:
I'm reachable online via email, Google Meet, or Zoom, before or after class, or during scheduled office hours: 6:05-8:10PM Tuesdays, and 10:20-11:20PM Wednesdays. Please text me or send me an email to arrange.

Books:

[recommended] Doing Data Science: Straight Talk from the Frontline
by Cathy O'Neil, Rachel Schutt, Publisher: O'Reilly Media

[recommended] Data Science from Scratch: First Principles with Python
by Joel Grus

[recommended] Pattern Recognition and Machine Learning
by Christopher Bishop, Publisher: Springer

[recommended] Big Data Analytics with Spark: A Practitioner's Guide to Using Spark for Large Scale Data Analysis
by Mohammed Guller, Publisher: Apress

[recommended] Data Smart: Using Data Science to Transform Information into Insight
by John W. Foreman

[recommended] The Signal and the Noise: Why So Many Predictions Fail-But Some Don't
by Nate Silver

[recommended] Thoughtful Machine Learning: A Test-Driven Approach
by Matthew Kirk

[recommended] How to Lie with Statistics
by Darrell Huff

Description:

CISC 7700X - Introduction to Data Science

Data Science is an interdisciplinary field concerned primarily with extracting information from data. It incorporates aspects of computer science, statistics, analytics, and mathematics. This introductory course focuses on providing a broad overview of key concepts, such as data management, data preparation, analysis, machine learning, performance measures, and working with large data sets.

Outline

  1. Introduction: What is Data Science?
  2. Data Analysis, and the Data Science Process.
  3. Inference, Performance Measures, Confusion Matrix
  4. Basic Algorithms & Models
  5. Data Engineering & Feature Selection
  6. Logistic Regression
  7. Naive Bayes
  8. Midterm
  9. Mining Graphs, Recommendation Engines
  10. Clustering & Dimension Reduction Techniques
  11. Working with Big Data
  12. Deep Learning
  13. Data Visualization
  14. Ethical Issues & Review

Projects:

There will be about 10 projects/homeworks.

Tests:

You will have at least a midterm and a final exam. There might also be a surprise quiz every few weeks.

In This Class:

Peer cooperation is encouraged; however, everyone must submit their own work. You will be expected to answer detailed questions about your assignments/projects (i.e., if you didn't write them, I'll know).

Required:

Academic Integrity: The faculty and administration of Brooklyn College support an environment free from cheating and plagiarism. Each student is responsible for being aware of what constitutes cheating and plagiarism and for avoiding both. The complete text of the CUNY Academic Integrity Policy and the Brooklyn College procedure for implementing that policy can be found at this site: http://www.brooklyn.cuny.edu/bc/policies. If a faculty member suspects a violation of academic integrity and, upon investigation, confirms that violation, or if the student admits the violation, the faculty member MUST report the violation.

CLASSROOM BEHAVIOR: Disruptive classroom behavior negatively affects the classroom environment as well as the educational experience for students enrolled in the course. Any serious or continued disruption of class will result in a report to the Office of Judicial Affairs. Public Safety will be summoned immediately if a serious disruption prevents the continued teaching of the class and you may be subject to disciplinary action. For disruptive behavior that does not prevent the continued teaching of the class, you will receive a warning after one such disruption. If the disruptive behavior is repeated in the same or subsequent classes, you may be asked to leave the classroom for the remainder of class and you may be subject to disciplinary action.

This means that if you cheat on a test or an assignment, I must file a report which will initiate academic penalties.

Attendance is not mandatory (I don't need a doctor's note!), but it is highly recommended. [You must attend at least a few times in the first six weeks, or you will be dropped from the class with a WU grade.] Also, it would be VERY difficult to pass the class without regular attendance; you are responsible for catching up if you miss class (for any reason). That being said, if you hardly ever show up (miss >= 4 classes), don't expect to get anything but a WU grade.

All projects, assignments, homeworks, etc., will be submitted via email (subject line: "CISC 7700X HW#"). Do not print out the assignments - they will promptly be trashed.

Grading:
Tentative grade breakdown: ~25% for Midterm, ~35% for Projects, ~40% for Final. These may change slightly depending on how well the class does in any of the above.
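
As a rough illustration of how the tentative weights combine, here is a minimal Python sketch. It assumes each component is scored on a 0-100 scale and that the weights stay exactly as listed, which the syllabus does not guarantee. The names WEIGHTS and weighted_grade are made up for this example and are not part of any course tooling.

```python
# Tentative weights from the syllabus; these are approximate and may change.
WEIGHTS = {"midterm": 0.25, "projects": 0.35, "final": 0.40}


def weighted_grade(scores, weights=WEIGHTS):
    """Return the weighted average of component scores (assumed 0-100 scale)."""
    missing = set(weights) - set(scores)
    if missing:
        raise ValueError(f"missing scores for: {sorted(missing)}")
    # Normalize by the total weight so the result stays on the 0-100 scale
    # even if the weights are later adjusted and no longer sum to exactly 1.
    total_weight = sum(weights.values())
    return sum(scores[name] * w for name, w in weights.items()) / total_weight


# Hypothetical student: 80 on the midterm, 90 on projects, 85 on the final.
example = weighted_grade({"midterm": 80, "projects": 90, "final": 85})
print(round(example, 2))
```

With these made-up scores the result is about 85.5 (20 + 31.5 + 34). The actual final grade may differ, since the breakdown is tentative.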

© 2006, Particle