Engineering Cloud Computing
ENGR-E 516/CSCI-B 649 (Fall 2021)

Announcements

Course Description

This course will teach the fundamental concepts, engineering principles, and practical skills pertaining to the effective use of cloud computing. This course will focus on both cloud applications and the design of cloud platforms. We will cover the relevant concepts from operating systems, computer networks, and distributed systems.

This course should be useful to anyone who wants a deeper understanding of how the cloud works, as well as those who want to learn how to easily and effectively use the cloud for running their applications at low cost. We will look at a wide spectrum of cloud-based applications such as parallel data processing (e.g., MapReduce), data storage and caching (e.g., key-value stores), scientific computing, interactive notebooks (e.g., Jupyter), etc.

We will also look at the challenges involved in the efficient operation of large-scale cloud platforms with hundreds of thousands of servers. The course will cover a wide gamut of data center optimization techniques such as hardware virtualization, distributed resource management, and software-defined datacenters.

This course will expose students to popular cloud platforms such as Amazon EC2, Google Cloud Platform, Microsoft Azure, etc., and introduce students to new developments such as serverless computing and edge-clouds.

More details about the course syllabus can be found on the course website.

Prerequisites: The course has no official prerequisites, but requires a high level of comfort with systems programming and debugging. The assignments in this course will include nontrivial programming in the language of your choice.

Logistics: Classes will meet Mondays and Wednesdays from 9:30–10:45 in Luddy 4063.

This course is also cross-listed as CSCI-B 490. This year, the class is online: lecture videos will be posted on YouTube (see Canvas for links).

References

Text-books

  1. DS. Distributed Systems: Principles and Paradigms, 3rd Edition (Maarten Van Steen and Andrew Tanenbaum) Online version

Papers

Other references

  1. CCTP. Cloud Computing Theory and Practice. Dan C. Marinescu. (2nd edition)
  2. OS3EP. Operating Systems: Three Easy Pieces. http://pages.cs.wisc.edu/~remzi/OSTEP/

Schedule, notes, and readings

| Lecture | Topic | Slides | Reading |
|---|---|---|---|
| 0 | Course Intro | cloud/0-admin.pdf | |
| 1 | Intro to cloud computing | cloud/1-intro-annot.pdf | CCTP Chapter 1 and 2 |
| 2 | ..continued | (same as above) | |
| 3 | OS: system calls | cloud/2-OS-1.pdf | OS3EP Chapter 4 |
| 4 | OS: concurrency | cloud/2-OS-annot.pdf | OS3EP |
| 5 | Networks | cloud/3-net-2-annot.pdf | |
| 6 | Networks: Socket programming | cloud/3-net-2-annot.pdf | |
| 7 | Map-Reduce | cloud/7-MapRed-annot-1.pdf cloud/7-MapRed-annot-2pdf.pdf | 1. MapReduce |
| 8 | Spark | cloud/spark-annot.pdf | |
| 10 | Cloud infrastructure | cloud/10-iaas.pdf | |
| 11 | Functions as a Service | cloud/16-serverless-annot.pdf | 9, 10 |
| 12 | Cloud Storage | cloud/15-storage-annot.pdf | 11 |
| 13 | Hardware Virtualization | cloud/11-virt-1.pdf | |
| 14 | CPU Virt | cloud/11-virt-2.pdf | 4. VMWare, 5. KVM |
| 15 | Paravirtualization | cloud/11-virt-3.pdf | 3. Xen |
| 16 | Memory Virtualization | cloud/11-virt-4.pdf | |
| 17 | Live Migration | cloud/11-virt-5.pdf | 6. Xen-migration |
| 18 | Cluster management | cloud/12-clustmgmt-annot.pdf | 7. ESX, 8. Remus |
| 19 | OS Virtualization | cloud/13-osvirt.pdf | |
| 20 | Client-server modeling | cloud/4-servers-annot-1.pdf | Markov Chains |
| 21 | -More queueing theory- | cloud/4-servers-annot2.pdf | M/M/1 Queues |
| 22 | Parallel scaling | cloud/6-scaling-annot.pdf | Amdahl's Law |
| 23 | Elastic scaling | cloud/6-scaling-annot.pdf | |
| 24 | Transient Computing | cloud/transient.pdf | 12. SpotCheck |
| 25 | Course Wrapup | | |

Syllabus

Approximate Weekly Schedule

| Week | Topic | Readings | Assignments |
|---|---|---|---|
| 1 | Introduction to cloud computing | CCTP Chap 1 | Homework 1 |
| 2 | Building blocks: OS processes | | |
| 3 | Computer networks | | Socket programming |
| 4 | Architecture of distributed applications | TS Chap 1,2 | |
| 5 | Cloud applications (data processing, etc) | CCTP Chap 4 | Key-val store |
| 6 | Cloud infrastructure | CCTP Chap 3 | GCP setup |
| 7 | Virtualization | CCTP Chap 5 | MapReduce |
| 8 | Resource Management (Migration, scheduling,..) | Chap 5 mostly | |
| 9 | Cloud services (PaaS, FaaS) | | Build serverless app |
| 10 | Transient Cloud Computing | Slides | |
| 11 | Distributed computing basics | TS Chap 2 | |
| 12 | Data storage: caching | | |
| 13 | Data storage: consistency | | Distributed KV store |
| 14 | Edge computing | Slides | |

Evaluation Criteria

The rough breakdown is as follows, but is subject to change:

| Component | Weight |
|---|---|
| Homework and Readings | 15% |
| Programming assignments (5) | 50% |
| Final exam | 25% |
| Class participation/Canvas Quizzes | 10% |
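
As a rough illustration of how these weights combine, the sketch below computes an overall percentage from per-component percentages. The weights are copied from the breakdown above, which is subject to change. The function name, the dictionary keys, and the example scores are hypothetical, and the result is only a weighted numeric score; it does not account for the Michelin Star requirement for A/A+ grades described later.

```python
# Weights copied from the breakdown above (subject to change).
# The dictionary keys are illustrative names, not official ones.
WEIGHTS = {
    "homework_and_readings": 0.15,
    "programming_assignments": 0.50,
    "final_exam": 0.25,
    "participation_and_quizzes": 0.10,
}


def weighted_score(scores):
    """Combine per-component percentages (0-100) into one overall percentage."""
    missing = set(WEIGHTS) - set(scores)
    if missing:
        raise ValueError(f"missing components: {sorted(missing)}")
    return sum(WEIGHTS[name] * scores[name] for name in WEIGHTS)


# Hypothetical scores for one student, as percentages.
example = {
    "homework_and_readings": 90,
    "programming_assignments": 80,
    "final_exam": 70,
    "participation_and_quizzes": 100,
}

overall = weighted_score(example)
print(f"{overall:.1f}")  # prints 81.0
```

Because the programming assignments carry half the weight, a 10-point change there moves the overall score by 5 points, twice the effect of the same change on the final exam.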

Exams

The exams will test how well students have understood various virtualization techniques, cloud performance and cost tradeoffs, and how techniques learnt in class can be applied to emerging cloud offerings and applications.

Assignments

Students will implement classic distributed algorithms (such as MapReduce and distributed key-value stores) on public clouds, and learn to use cloud services such as Functions as a Service and storage services, as well as how to use cloud VMs to develop and deploy applications.

The design-oriented assignments will involve a large degree of programming and debugging. In most cases, the programming assignments are language-agnostic (you can pick any reasonable programming language).

A key learning objective of this course is to design, architect, and implement a distributed system from scratch, and to design useful test cases for evaluating the implementation. Therefore, no starter code or templates will be provided, to give students the maximum flexibility and freedom to explore the unconstrained design space. Points will be awarded for correct and faithful designs, complete implementation, adequate testing, and reports and documentation.

Most programming assignments will take significantly longer than you anticipate. Start early. Please see the assignment descriptions below (from last year) to get a sense of what they will look like. In general, the programming assignments in this course specify only the "end goal", and you must figure out how to get there: what to implement and how, which libraries to use, etc. There will be no starter code, no templates, no training wheels. You are on your own.

Likely assignments and schedule:

| # | Topic | Approx Due Date |
|---|---|---|
| 1 | Simple Key-Value Store. Spawn processes and sockets | Lec |
| 2 | Deploy Assign 1 on GCP VMs using APIs | |
| 3 | Simple FaaS Hello World | |
| 4 | FaaS KeepAlive Competition | End |
| 5 | FaaS MapReduce | End |

Late submission policy

Late submissions will not be accepted. It is strongly recommended to start early—completing the assignments always takes more time than you think.

Michelin Star Grading

The grading in this course will favor students who turn in exceptional programs, reviews, and exam answers. To this end, we will use a "Michelin Star" system where points are awarded for high-quality course products. Going above and beyond the standard evaluation criteria can earn multiple stars. Students are eligible for an A (or A+) grade only if they have at least one "star" across the course. Thus, it is not enough to turn in work that is merely correct. Students with a few stars are automatically eligible for A/A+ grades irrespective of their performance in the rest of the course.

Examples of high-quality work

  • Programs that are well documented, have a clean design, and implement something non-trivial in a clever way.
  • Proofs that are correct and concise.
  • Insightful and thoughtful paper reviews.
  • Exam answers that are crisp, insightful, and show a deep or unique understanding of the subject matter.
  • A great question or answer during class discussions/office hours.

Frequently Asked Michelin Star Questions/Clarifications

  1. Can I get a Michelin Star for my homework?
    • If you have to ask, probably not. Not all assignments are eligible. For example, the preparatory assignments (1, 2, 3) are fairly basic and leave little scope for creativity and excellence.
  2. If you are struggling in the course, please do not rely on the hope that later Michelin Stars will magically rescue your grade.
  3. In some cases, assignments will list extra components that make a submission eligible for a star.
  4. Your first priority should be to submit correct and complete assignments. Please do not over-design or over-engineer your assignments, lest you are unable to submit on time. A non-functional submission with a "great" design will likely get 0. Thus, be careful when shooting for the stars.

Administrative Information

Class Information

| Where | When |
|---|---|
| Luddy Hall Room 4063 | Mondays and Wednesdays 9:25–10:40 |

Office Hours

| Who | Email | Office Location | Office Hours |
|---|---|---|---|
| Prateek Sharma | prateeks | Luddy 4126 | Wed 4–5 |
| Sahil Tyagi | styagi | | |
| Alexander Fuerst | alfuerst | | |

Author: Prateek Sharma

Created: 2021-08-18 Wed 13:15
