A Novel Middleware for Transparent Acceleration of Large-scale I/O
Allocation: SNIC Small Compute
PI: Wei Der Chien <email@example.com>
Affiliation: Kungliga Tekniska högskolan
Duration: 2022-02-03 – 2023-03-01
With the large increase in computational power and parallelism, the I/O phase of HPC applications can become a bottleneck due to the use of traditional parallel file systems. Because these file systems adhere to POSIX semantics while millions of processes may read and write the same file concurrently, techniques such as distributed locking must be used, which can hurt performance. To improve I/O performance, modern supercomputers are equipped with increasingly hierarchical storage architectures. For example, fast SSDs in the form of node-local burst buffers are provided as temporary storage areas. Yet their disaggregated namespace limits their usability beyond a scratch area: only applications that already use a file-per-process paradigm (N-N, where N processes write N files) can benefit from them directly.

Existing solutions either aim to construct an ad hoc file system across compute nodes or rely on applications to handle data movement themselves. All of these require complex setup and configuration, and they mostly target applications with an N-N write pattern. Yet the N-1 pattern, where N processes write one shared file, is essential to many applications, which currently cannot benefit from node-local burst buffers.

Our approach instead aims to provide a middleware that transparently accelerates the writing of shared files through node-local resources, without any intervention from users or developers. The solution aims to be portable and to support all applications that already use MPI-IO and collective I/O. Our work can directly benefit applications running on systems such as Fugaku (with its node-local temporary storage areas), and Tetralith and Kebnekaise (with their node-local SSD scratch disks).
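To make the two access patterns concrete, the sketch below contrasts them using plain POSIX-style file operations in Python rather than actual MPI-IO; the loop over `rank` stands in for N concurrent MPI processes, and the file names and sizes are illustrative assumptions, not part of the proposed middleware.

```python
import os
import tempfile

NPROCS = 4          # stand-in for the number of MPI ranks (assumption)
RECORD = b"x" * 8   # each rank's fixed-size contribution (assumption)

tmpdir = tempfile.mkdtemp()

# N-N pattern: one file per process. No coordination between writers is
# needed, which is why node-local burst buffers can serve it directly.
for rank in range(NPROCS):
    with open(os.path.join(tmpdir, f"out.{rank}"), "wb") as f:
        f.write(RECORD)

# N-1 pattern: all processes write disjoint offsets of one shared file.
# On a real system this is what MPI-IO collective writes coordinate, and
# what POSIX parallel file systems protect with distributed locking.
shared = os.path.join(tmpdir, "shared.out")
fd = os.open(shared, os.O_CREAT | os.O_WRONLY)
for rank in range(NPROCS):
    os.pwrite(fd, RECORD, rank * len(RECORD))
os.close(fd)

print(os.path.getsize(shared))  # 32 bytes = 4 ranks * 8 bytes each
```

In the N-N case each rank owns a private file in a private namespace, so a disaggregated node-local SSD suffices; in the N-1 case the ranks share a single file and thus a single namespace, which is exactly the case the proposed middleware targets.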