Publications Details
IO-Cop: Managing Concurrent Accesses to Shared Parallel File System
Thapaliya, Sagar; Bangalore, Purushotham; Lofstead, Jay; Mohror, Kathryn; Moody, Adam
A parallel file system (PFS) is often used to store intermediate results and checkpoint/restart files in a high-performance computing (HPC) system. Multiple applications running on an HPC system often access the PFS concurrently, resulting in degraded and variable I/O performance. By managing PFS accesses, these sharing-induced inefficiencies can be controlled and reduced. To this end, we are exploring access control mechanisms to manage the shared PFS so that it can change its runtime behavior when serving I/O requests from applications, e.g., by providing exclusive access to a single application at a time for a given time window. In this paper, we discuss our design space exploration and present some initial experimental results collected during the exploration. This work enables deeper exploration of our ongoing research in managing inter-application interference in a PFS.