The following schedule is final; no swapping or substitution is permitted. Please read the instructions at Student Talks. On the day of your presentation, you must copy your talk slides (PPT or PDF) to the classroom PC using a USB memory key. You should also upload your slides to the class web site before your talk.
Attendance at student presentations is required. There will be an attendance list during student presentations. Missing one presentation day will result in a 2% deduction from your large-project/paper-presentation grade. Do not talk during the student presentations. Turn off your cell phone. If you arrive late, wait outside the classroom until the current presentation is over.
Tariq Alsahfi - Google File System
Bhanu Jain - Hadoop File System
Daniel Aguilera - ZooKeeper: Wait-free coordination for Internet-scale systems

Tejas Shetti - Fast Crash Recovery in RAMCloud
Vanisha Crasta - A comparison of approaches to large scale data analysis
Mary Kingsbury - Design Patterns for Efficient Graph Algorithms in MapReduce
Nitin Pawar - HaLoop: Efficient Iterative Data Processing on Large Clusters

Syed Saqib Ali - Spark SQL: Relational Data Processing in Spark
Padmajaa Rajaji - Shark: Fast Data Analysis Using Coarse-grained Distributed Memory
Aishwarya Ashok - Distributed GraphLab: A Framework for Machine Learning and Data Mining in the Cloud
Ashiq Imran - GraphX: Unifying Data-Parallel and Graph-Parallel Analytics

Brijesh Jivrajbhai Dankhara - MLbase: A Distributed Machine Learning System
Sanika Gupta - Kafka: a Distributed Messaging System for Log Processing
Md Hasanuzzaman Noor - Large scale machine learning at Twitter
Karan Dhiren Shah - The Unified Logging Infrastructure for Data Analytics at Twitter

Meghan Hemen Kothari - Druid: a real-time analytical data store
Anuj Thomas Mathew - Megastore: Providing Scalable, Highly Available Storage for Interactive Services
Sidharth Mehra - Cassandra: a decentralized structured storage system

Harish Ram Nambiappan - Adapting Microsoft SQL Server for Cloud Computing
Pooja Venkatesh - HadoopDB: An Architectural Hybrid of MapReduce and DBMS Technologies for Analytical Workloads
Mohammad Taha - Mesa: Geo-Replicated, Near Real-Time, Scalable Data Warehousing
Jaimin Pravinkumar Patel - Low Latency Analytics of Geo-distributed Data in the Wide Area

Fahad Mohamed Sajid Furniturew - Gorilla: A Fast, Scalable, In-Memory Time Series Database
Vidhi Shailesh Shah - Spanner: Google's Globally-Distributed Database
Yash Nilesh Shah - Dremel: Interactive Analysis of Web-Scale Datasets
Aashara Shrestha - Beaver, Facebook needle in a haystack photo storage system
Last modified: 10/18/2016 by Leonidas Fegaras