Naoya Maruyama
NVIDIA
Title
Cited by
Year
FTI: High performance fault tolerance interface for hybrid systems
L Bautista-Gomez, S Tsuboi, D Komatitsch, F Cappello, N Maruyama, ...
Proceedings of 2011 international conference for high performance computing …, 2011
Cited by: 453; Year: 2011
Statistical power modeling of GPU kernels using performance counters
H Nagasaka, N Maruyama, A Nukada, T Endo, S Matsuoka
International conference on green computing, 115-122, 2010
Cited by: 268; Year: 2010
Peta-scale phase-field simulation for dendritic solidification on the TSUBAME 2.0 supercomputer
T Shimokawabe, T Aoki, T Takaki, T Endo, A Yamanaka, N Maruyama, ...
Proceedings of 2011 International Conference for High Performance Computing …, 2011
Cited by: 266; Year: 2011
Physis: an implicitly parallel programming model for stencil computations on large-scale GPU-accelerated supercomputers
N Maruyama, T Nomura, K Sato, S Matsuoka
Proceedings of 2011 International Conference for High Performance Computing …, 2011
Cited by: 261; Year: 2011
Evaluating and optimizing OpenCL kernels for high performance computing with FPGAs
HR Zohouri, N Maruyama, A Smith, M Matsuda, S Matsuoka
SC'16: Proceedings of the International Conference for High Performance …, 2016
Cited by: 225; Year: 2016
An 80-fold speedup, 15.0 TFlops full GPU acceleration of non-hydrostatic weather model ASUCA production code
T Shimokawabe, T Aoki, C Muroi, J Ishida, K Kawano, T Endo, A Nukada, ...
SC'10: Proceedings of the 2010 ACM/IEEE International Conference for High …, 2010
Cited by: 179; Year: 2010
Design and modeling of a non-blocking checkpointing system
K Sato, N Maruyama, K Mohror, A Moody, T Gamblin, BR de Supinski, ...
SC'12: Proceedings of the International Conference on High Performance …, 2012
Cited by: 150; Year: 2012
CUDA vs OpenACC: Performance case studies with kernel benchmarks and a memory-bound CFD application
T Hoshino, N Maruyama, S Matsuoka, R Takaki
2013 13th IEEE/ACM International Symposium on Cluster, Cloud, and Grid …, 2013
Cited by: 137; Year: 2013
Trends in data locality abstractions for HPC systems
D Unat, A Dubey, T Hoefler, J Shalf, M Abraham, M Bianco, ...
IEEE Transactions on Parallel and Distributed Systems 28 (10), 3007-3020, 2017
Cited by: 125; Year: 2017
Scalable kernel fusion for memory-bound GPU applications
M Wahib, N Maruyama
SC'14: Proceedings of the International Conference for High Performance …, 2014
Cited by: 122; Year: 2014
An efficient, model-based CPU-GPU heterogeneous FFT library
Y Ogata, T Endo, N Maruyama, S Matsuoka
2008 IEEE international symposium on parallel and distributed processing, 1-10, 2008
Cited by: 122; Year: 2008
Problem diagnosis in large-scale computing environments
AV Mirgorodskiy, N Maruyama, BP Miller
Proceedings of the 2006 ACM/IEEE conference on Supercomputing, 88-es, 2006
Cited by: 119; Year: 2006
Virtual clusters on the fly - fast, scalable, and flexible installation
H Nishimura, N Maruyama, S Matsuoka
Seventh IEEE International Symposium on Cluster Computing and the Grid …, 2007
Cited by: 114; Year: 2007
Optimizing stencil computations for NVIDIA Kepler GPUs
N Maruyama, T Aoki
Proceedings of the 1st international workshop on high-performance stencil …, 2014
Cited by: 103; Year: 2014
A user-level InfiniBand-based file system and checkpoint strategy for burst buffers
K Sato, K Mohror, A Moody, T Gamblin, BR De Supinski, N Maruyama, ...
2014 14th IEEE/ACM International Symposium on Cluster, Cloud and Grid …, 2014
Cited by: 83; Year: 2014
Improving the computing efficiency of HPC systems using a combination of proactive and preventive checkpointing
MS Bouguerra, A Gainaru, LB Gomez, F Cappello, S Matsuoka, ...
2013 IEEE 27th International Symposium on Parallel and Distributed …, 2013
Cited by: 78; Year: 2013
Linpack evaluation on a supercomputer with heterogeneous accelerators
T Endo, S Matsuoka, A Nukada, N Maruyama
2010 IEEE International Symposium on Parallel & Distributed Processing …, 2010
Cited by: 78; Year: 2010
Distributed diskless checkpoint for large scale systems
LAB Gomez, N Maruyama, F Cappello, S Matsuoka
2010 10th IEEE/ACM International Conference on Cluster, Cloud and Grid …, 2010
Cited by: 70; Year: 2010
Improving strong-scaling of CNN training by exploiting finer-grained parallelism
N Dryden, N Maruyama, T Benson, T Moon, M Snir, B Van Essen
2019 IEEE International Parallel and Distributed Processing Symposium (IPDPS …, 2019
Cited by: 63; Year: 2019
A high-performance fault-tolerant software framework for memory on commodity GPUs
N Maruyama, A Nukada, S Matsuoka
2010 IEEE International Symposium on Parallel & Distributed Processing …, 2010
Cited by: 61; Year: 2010