
new GT courses this fall

This entry was posted on Thursday, 18 August 2011.

I would like to call your attention to two new courses offered this fall that may be of interest to your students:

CS 8803-EA: Towards Exascale Analytics (Instructor: Joel Saltz): covers topics in HPC and data analysis (see the list of topics below).

CSE 8803-CPS: Computational Problem Solving (Instructors: Edmond Chow and Richard Fujimoto): an introductory course intended for math, science and engineering students to develop computing knowledge and skills; includes an introduction to parallel programming.

Details available at: http://www.cc.gatech.edu/~echow/cs4803.html

Richard Fujimoto

List of topics to be covered in the Fall semester in CS 8803-EA: Towards Exascale Analytics

CLUSTERING, DATA AND GRAPH MINING: Large-scale clustering, data mining and graph algorithms. Scalable parallel graph algorithms, high-end techniques to support dimensionality reduction and summarization of high-dimensional data, massive-scale clustering and data mining, collaborative clustering methods, performance modeling for distributed data mining applications, system software to support graph mining, data mining and clustering, scalable distributed reasoning.

DATA SYSTEMS SOFTWARE: Active semantic caching, filter stream middleware, in-transit data processing, data staging services, adaptable I/O systems, collaborative threads, active storage, storage management for complex array processing, SciDB (an array-oriented database management system for science).

MAPREDUCE, DATABASES AND FINE GRAINED PARALLELISM: MapReduce, Hadoop, the Google File System, HDFS, Bigtable, Hive, Llama, Sawzall, Pig, Twister, MapReduce for multi-core and multiprocessor systems, MapReduce and parallel database management systems, HadoopDB, Hadoop-GIS.

OPTIMIZATION OF HIGH END FILE SYSTEM PERFORMANCE: Object-based storage, overview of the Panasas parallel file system, checkpointing, scalable directories for shared file systems, collective I/O and parallel file systems, active storage strategies for parallel file systems, relationship between data-intensive scalable computing systems (e.g. the Google File System and HDFS) and cluster file systems (e.g. Lustre, Panasas, GPFS).

STREAMS, ONLINE AGGREGATION AND CONTINUOUS QUERY SUPPORT: High-performance stream processing, System S, DataCutter, applications of stream processing to sensor data analysis, workflow and quality of service, performance insightful query languages, MapReduce and stream processing, streaming query languages, stateful key-value storage with performance service level objectives.

SPATIAL ANALYTICS – SYSTEMS SOFTWARE AND MACHINE LEARNING: Spatial object association algorithms, crossmatch, parallel databases for multi-dimensional data, spatial data mining, content-based image retrieval, high-level image representation for scene classification.

TEMPORAL ANALYSES AND TIME SERIES – SYSTEMS SOFTWARE AND MACHINE LEARNING: Time series mining, finding semantics in time series, multiple-resolution time series, specifying and identifying temporal sequences, temporal RFID processing.

DRIVING APPLICATIONS – IMAGE ANALYSIS, GENE SEQUENCING, CLINICAL DATA ANALYTICS AND COMPUTATIONAL ASTRONOMY: Examples of challenging problems (primarily drawn from the biomedical domain).
