A large-scale study of MPI usage in open-source HPC applications I Laguna, R Marshall, K Mohror, M Ruefenacht, A Skjellum, N Sultana Proceedings of the International Conference for High Performance Computing …, 2019 | 98 | 2019 |
ARCHER: effectively spotting data races in large OpenMP applications S Atzeni, G Gopalakrishnan, Z Rakamaric, DH Ahn, I Laguna, M Schulz, ... 2016 IEEE international parallel and distributed processing symposium (IPDPS …, 2016 | 95 | 2016 |
IPAS: Intelligent protection against silent output corruption in scientific applications I Laguna, M Schulz, DF Richards, J Calhoun, L Olson Proceedings of the 2016 International Symposium on Code Generation and …, 2016 | 84 | 2016 |
Scalable temporal order analysis for large scale debugging DH Ahn, BR de Supinski, I Laguna, GL Lee, B Liblit, BP Miller, M Schulz Proceedings of the Conference on High Performance Computing Networking …, 2009 | 75 | 2009 |
AutomaDeD: Automata-based debugging for dissimilar parallel tasks G Bronevetsky, I Laguna, S Bagchi, BR de Supinski, DH Ahn, M Schulz 2010 IEEE/IFIP International Conference on Dependable Systems & Networks …, 2010 | 66 | 2010 |
Automatic fault characterization via abnormality-enhanced classification G Bronevetsky, I Laguna, BR de Supinski, S Bagchi IEEE/IFIP International Conference on Dependable Systems and Networks (DSN …, 2012 | 64 | 2012 |
Evaluating and extending user-level fault tolerance in MPI applications I Laguna, DF Richards, T Gamblin, M Schulz, BR de Supinski, K Mohror, ... The International Journal of High Performance Computing Applications 30 (3 …, 2016 | 60 | 2016 |
Evaluating user-level fault tolerance for MPI applications I Laguna, DF Richards, T Gamblin, M Schulz, BR de Supinski Proceedings of the 21st European MPI Users' Group Meeting, 57-62, 2014 | 57 | 2014 |
Accurate application progress analysis for large-scale parallel debugging S Mitra, I Laguna, DH Ahn, S Bagchi, M Schulz, T Gamblin ACM SIGPLAN Notices 49 (6), 193-203, 2014 | 48 | 2014 |
Large scale debugging of parallel tasks with AutomaDeD I Laguna, T Gamblin, BR de Supinski, S Bagchi, G Bronevetsky, DH Ahn, ... Proceedings of 2011 International Conference for High Performance Computing …, 2011 | 48 | 2011 |
GPUMixer: Performance-driven floating-point tuning for GPU scientific applications I Laguna, PC Wood, R Singh, S Bagchi High Performance Computing: 34th International Conference, ISC High …, 2019 | 46 | 2019 |
REFINE: Realistic fault injection via compiler-based instrumentation for accuracy, portability and speed G Georgakoudis, I Laguna, DS Nikolopoulos, M Schulz Proceedings of the International Conference for High Performance Computing …, 2017 | 46 | 2017 |
Debugging high-performance computing applications at massive scales I Laguna, DH Ahn, BR de Supinski, T Gamblin, GL Lee, M Schulz, ... Communications of the ACM 58 (9), 72-81, 2015 | 44 | 2015 |
Versioned distributed arrays for resilience in scientific applications: Global view resilience A Chien, P Balaji, P Beckman, N Dun, A Fang, H Fujita, K Iskra, ... Procedia Computer Science 51, 29-38, 2015 | 42 | 2015 |
AMPT-GA: automatic mixed precision floating point tuning for GPU applications PV Kotipalli, R Singh, P Wood, I Laguna, S Bagchi Proceedings of the ACM International Conference on Supercomputing, 160-170, 2019 | 41 | 2019 |
Apollo: Reusable models for fast, dynamic tuning of input-dependent code D Beckingsale, O Pearce, I Laguna, T Gamblin 2017 IEEE International Parallel and Distributed Processing Symposium (IPDPS …, 2017 | 40 | 2017 |
Report of the HPC Correctness Summit, Jan 25-26, 2017, Washington, DC G Gopalakrishnan, PD Hovland, C Iancu, S Krishnamoorthy, I Laguna, ... arXiv preprint arXiv:1705.07478, 2017 | 36 | 2017 |
EReinit: Scalable and efficient fault-tolerance for bulk-synchronous MPI applications S Chakraborty, I Laguna, M Emani, K Mohror, DK Panda, M Schulz, ... Concurrency and Computation: Practice and Experience 32 (3), e4863, 2020 | 34 | 2020 |
Distributed diagnosis of failures in a three tier e-commerce system G Khanna, I Laguna, FA Arshad, S Bagchi 2007 26th IEEE International Symposium on Reliable Distributed Systems (SRDS …, 2007 | 34 | 2007 |
FlipTracker: Understanding natural error resilience in HPC applications L Guo, D Li, I Laguna, M Schulz SC18: International Conference for High Performance Computing, Networking …, 2018 | 33 | 2018 |