Note: This GitHub repository supports the Dept. of Computer Science, Columbia University course COMS W4111 - Introduction to Databases. The current content is for the Fall 2019 semester, section 02, taught by Donald F. Ferguson.

W4111 - Introduction to Databases

Overview

From the CU course description,

"The fundamentals of database design and application development using databases: entity-relationship modeling, logical design of relational databases, relational data definition and manipulation languages, SQL, XML, query processing, physical database tuning, transaction processing, security. Programming projects are required."

This section of W4111 - Introduction to Databases focuses on understanding and applying database technology, and de-emphasizes theory and algorithms. The course will cover the underlying theory and algorithms, but in less detail than other database sections at CU. The emphasis is on implementing small software projects using various database technologies.

The course will have four modules, each of which has sub-modules. Specific topics will often appear in more than one sub-module of a module and across modules. For example, data modeling best practices and schema/query design are intertwined. The modules and sub-modules are listed below.

Prerequisites

The CU course description lists the following prerequisites:

  • COMS W3137 -- Honors Data Structures and Algorithms or W3134 -- Data Structures in Java
  • Fluency in Java
  • Instructor's permission.

A course in data structures is helpful for sections 02, H02 and V02 but not essential.

Java is an excellent language for learning algorithms, data structures and programming fundamentals. Python, however, is becoming the dominant language and toolset for database-centric applications. Sections 02, H02 and V02 will use Python for examples. Python is recommended for homework assignments and exams.

Sections and Registration

There are three sections of COMS W4111 - Introduction to Databases for Fall 2019: 02, H02 and V02. The three sections cover the same material and share the same homework assignments, exams, office hours and teaching assistants/course assistants.

Section 02 is the "classic," in-person lecture format. H02 is a "hybrid section": students do not need to attend the lecture and can watch recordings of the lectures instead. V02 is offered via the Columbia Video Network. Schools and departments have policies about eligibility for the various sections, enrollment authorization and waitlist management. Jessica Rosa in the CS department manages the enrollment and waitlist. Please contact Jessica if you have questions.

Students in all three sections can access the lecture videos.

Module I: Foundational Concepts

  1. Introduction to databases, their role in applications, types of DB applications and overall system software architecture.

  2. Information and data modeling and best practices, focusing on supporting application scenarios.

  3. Relational data model (theory), Relational Database Management Systems, Structured Query Language, data query and update scenarios.

  4. Extended topics in SQL and RDBMS (performance, security, constraints, triggers, connection management, etc.).
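
As a rough illustration of the relational topics above (data definition, constraints, query and update), here is a minimal sketch using Python's built-in `sqlite3` module. The `student` and `enrollment` tables, their columns and the sample rows are hypothetical and are not taken from the course material; the course itself may use a different RDBMS.

```python
import sqlite3


def demo_relational_basics():
    """Define two related tables, then insert, update, query and test a constraint."""
    conn = sqlite3.connect(":memory:")
    # SQLite only enforces foreign keys when this pragma is switched on.
    conn.execute("PRAGMA foreign_keys = ON")

    # Data definition: hypothetical tables with key and CHECK constraints.
    conn.execute(
        "CREATE TABLE student (uni TEXT PRIMARY KEY, name TEXT NOT NULL)"
    )
    conn.execute(
        """CREATE TABLE enrollment (
               uni     TEXT NOT NULL REFERENCES student(uni),
               section TEXT NOT NULL CHECK (section IN ('02', 'H02', 'V02')),
               PRIMARY KEY (uni, section)
           )"""
    )

    # Data manipulation: parameterized insert and update statements.
    conn.execute("INSERT INTO student VALUES (?, ?)", ("ab1234", "Ada"))
    conn.execute("INSERT INTO enrollment VALUES (?, ?)", ("ab1234", "02"))
    conn.execute(
        "UPDATE enrollment SET section = ? WHERE uni = ?", ("H02", "ab1234")
    )

    # Query: join the two tables on the shared key.
    rows = conn.execute(
        "SELECT s.name, e.section "
        "FROM student s JOIN enrollment e ON e.uni = s.uni"
    ).fetchall()

    # Constraint check: enrolling an unknown student violates the foreign key.
    try:
        conn.execute("INSERT INTO enrollment VALUES (?, ?)", ("zz9999", "02"))
        rejected = False
    except sqlite3.IntegrityError:
        rejected = True

    conn.close()
    return rows, rejected


if __name__ == "__main__":
    print(demo_relational_basics())
```

The parameterized `?` placeholders keep values out of the SQL text, which touches on the security item in topic 4.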

Module II: Database Management System Implementation/Architecture

  1. Storage management, disk management, buffer management, indexes.

  2. Query processing and optimization: Query evaluation, query parsing and parse trees, operator implementation algorithms, query rewrite, query optimization techniques.

  3. Concurrency control and transaction management.
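
To make two of the Module II topics a little more concrete, the sketch below uses Python's built-in `sqlite3` module to show a transaction that rolls back when a constraint fails, and a query plan that uses an index. The `account` table, the index name `idx_account_balance` and the `transfer` helper are hypothetical examples, not part of the course material, and the exact wording of the plan text varies between SQLite versions.

```python
import sqlite3


def transfer(conn, src, dst, amount):
    """Move `amount` from src to dst atomically; return True on success."""
    try:
        # The connection context manager commits on success and rolls back
        # the whole transaction if any statement raises.
        with conn:
            conn.execute(
                "UPDATE account SET balance = balance + ? WHERE owner = ?",
                (amount, dst),
            )
            conn.execute(
                "UPDATE account SET balance = balance - ? WHERE owner = ?",
                (amount, src),
            )
        return True
    except sqlite3.IntegrityError:
        # The CHECK constraint rejected a negative balance; the credit
        # applied just before it has been rolled back as well.
        return False


def demo_transactions_and_indexes():
    conn = sqlite3.connect(":memory:")
    conn.execute(
        "CREATE TABLE account ("
        "owner TEXT PRIMARY KEY, "
        "balance INTEGER NOT NULL CHECK (balance >= 0))"
    )
    conn.execute("CREATE INDEX idx_account_balance ON account(balance)")
    conn.executemany(
        "INSERT INTO account VALUES (?, ?)", [("alice", 100), ("bob", 50)]
    )
    conn.commit()

    ok = transfer(conn, "alice", "bob", 30)       # enough funds
    failed = transfer(conn, "alice", "bob", 500)  # would overdraw alice

    balances = dict(conn.execute("SELECT owner, balance FROM account"))

    # Ask the optimizer how it would evaluate an equality lookup on balance.
    # The last column of each plan row is a human-readable description.
    plan = [
        row[-1]
        for row in conn.execute(
            "EXPLAIN QUERY PLAN SELECT owner FROM account WHERE balance = ?",
            (0,),
        )
    ]
    conn.close()
    return ok, failed, balances, plan


if __name__ == "__main__":
    print(demo_transactions_and_indexes())
```

Crediting before debiting is deliberate here: when the debit fails, the earlier credit must also disappear, which is the atomicity property a transaction provides.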

Module III: NoSQL Database Overview

  1. Overview, graph databases, Redis.

  2. Amazon S3, Amazon DynamoDB, Google Firebase/Cloud Firestore.

Module IV: Decision Support, Data Analysis

  1. Overview of schema denormalization, OLAP cubes, data analytics, machine learning.

Office Hours Calendar

This calendar shows the office hours for the professor and the teaching assistants.

<iframe src="https://calendar.google.com/calendar/embed?src=8a3li5aeqbu36m0q928rrog2f8%40group.calendar.google.com&ctz=America%2FNew_York" style="border: 0" width="800" height="600" frameborder="0" scrolling="no"></iframe>

Lecture Material

Lecture material is below. It will typically be some combination of an IPython/Jupyter Notebook, an HTML version of the notebook and/or a PDF of a PowerPoint presentation.