Dr. Daqing Yun, School of Computing Assistant Professor Candidate, seminar
- Thursday, April 27, 2017 from 4:10pm to 5:00pm
- Barnard Hall, Room 108
Transport Profiling for Big Data Transfer Over Dedicated Channels
Extreme-scale scientific applications in domains such as earth science and high energy physics, running across multiple national laboratories in the U.S., are generating colossal amounts of data, now frequently termed “big data”, which must be stored, managed, and moved to different geographical locations for distributed processing and analysis. High-performance networks featuring high bandwidth and advance reservation are being developed and deployed to support such applications. However, even when a dedicated channel is provisioned, end-to-end data transfer performance still depends largely on the transport protocols used on the end hosts, and maximizing their throughput remains very challenging, mainly because (i) their optimal operational zone is affected by the configurations and dynamics of the network, the end hosts, and the protocol itself; (ii) their default parameter settings do not always yield the best performance; and (iii) application users, who are domain experts, typically do not have the knowledge needed to choose which transport protocol to use and which parameter values to set.
We design and develop a network connection profiler named “Transport Profile Generator” (TPG) to characterize and enhance the end-to-end throughput performance of a selected data transfer protocol for big data movement over high-speed dedicated network connections. TPG employs an exhaustive search-based profiling approach that sweeps through the combinations of parameter settings and enables users to determine the “best” set of parameter values for optimal data transfer performance. To improve the efficiency of transport profiling, we propose a stochastic approximation-based profiling method, referred to as FastProf, which employs the Simultaneous Perturbation Stochastic Approximation (SPSA) algorithm to accelerate the exploration of the parameter space. Furthermore, we extend the “fast” profiling approach to other transport protocols and propose a profiling optimization-based data transfer advisor that helps end users determine the most effective data transfer method and the most appropriate control parameter values to achieve the best data transfer performance.
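To illustrate why SPSA can explore a parameter space faster than an exhaustive sweep, the following is a minimal sketch of a generic SPSA search in Python. It is not the FastProf implementation. The function names, the gain constants, and the toy throughput model (which stands in for a real timed transfer) are all assumptions made for illustration. The point it shows is that each iteration needs only two measurements, no matter how many parameters are being tuned.

```python
import random


def toy_throughput(params):
    """Hypothetical stand-in for a measured transfer rate (Gb/s).

    A real profiler would run a timed transfer with these settings.
    This toy surface peaks at params == [6.0, 4.0].
    """
    x, y = params
    return 10.0 - (x - 6.0) ** 2 / 10.0 - (y - 4.0) ** 2 / 10.0


def clip(values, lo, hi):
    """Keep every parameter inside its allowed range."""
    return [min(max(v, l), h) for v, l, h in zip(values, lo, hi)]


def spsa_maximize(measure, start, lo, hi, iterations=50,
                  a=2.0, A=5.0, c=1.0, alpha=0.602, gamma=0.101, seed=0):
    """Generic SPSA ascent; returns the best measured point and its value."""
    rng = random.Random(seed)
    theta = clip(list(start), lo, hi)
    best_theta, best_val = list(theta), measure(theta)
    for k in range(1, iterations + 1):
        a_k = a / (A + k) ** alpha   # decaying step size
        c_k = c / k ** gamma         # decaying perturbation size
        # Perturb all parameters at once with random +/-1 directions.
        delta = [rng.choice((-1.0, 1.0)) for _ in theta]
        plus = clip([t + c_k * d for t, d in zip(theta, delta)], lo, hi)
        minus = clip([t - c_k * d for t, d in zip(theta, delta)], lo, hi)
        y_plus, y_minus = measure(plus), measure(minus)
        # Remember the best setting actually measured so far.
        for point, value in ((plus, y_plus), (minus, y_minus)):
            if value > best_val:
                best_theta, best_val = list(point), value
        # Two measurements give a gradient estimate for every parameter
        # (only approximate when a perturbed point was clipped).
        grad = [(y_plus - y_minus) / (2.0 * c_k * d) for d in delta]
        theta = clip([t + a_k * g for t, g in zip(theta, grad)], lo, hi)
    return best_theta, best_val


if __name__ == "__main__":
    best, value = spsa_maximize(toy_throughput, [1.0, 1.0],
                                lo=[0.0, 0.0], hi=[10.0, 10.0])
    print("best parameters:", best, "throughput:", value)
```

With the default of 50 iterations, this sketch takes 101 measurements in total (one initial measurement plus two per iteration). An exhaustive sweep instead needs one measurement per grid point, and the grid grows multiplicatively with each added parameter.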
In this talk, I will introduce our profiling approach for exploring the optimal operational zone of a data transfer protocol in a given network environment. I will then present extensive experimental results for both TPG and FastProf, collected in various network environments: a 10 Gb/s back-to-back connection in our local testbed, 10 Gb/s emulated long-haul connections with various RTT delays at Oak Ridge National Laboratory, and 10 Gb/s physical connections with both short and long delays from Argonne National Laboratory to the University of Chicago.
Daqing Yun received his Ph.D. degree in computer science from New Jersey Institute of Technology in August 2016. He is currently an assistant professor at Harrisburg University of Science and Technology. His research interests include high-performance networking, parallel and distributed computing, green networking, and big data.