Teng Li
Engineering Manager, Meta, Inc.
Verified email at fb.com
Title
Cited by
Year
PyTorch distributed: Experiences on accelerating data parallel training
S Li, Y Zhao, R Varma, O Salpekar, P Noordhuis, T Li, A Paszke, J Smith, ...
2020 International Conference on Very Large Databases (VLDB 2020 …, 2020
Cited by 447, 2020
Sample-efficient neural architecture search by learning actions for Monte Carlo tree search
L Wang, S Xie, T Li, R Fonseca, Y Tian
IEEE Transactions on Pattern Analysis and Machine Intelligence 44 (9), 5503-5515, 2021
Cited by 94*, 2021
GPU resource sharing and virtualization on high performance computing systems
T Li, VK Narayana, E El-Araby, T El-Ghazawi
2011 International Conference on Parallel Processing, 733-742, 2011
Cited by 65, 2011
Productivity of GPUs under different programming paradigms
M Malik, T Li, U Sharif, R Shahid, T El‐Ghazawi, G Newby
Concurrency and Computation: Practice and Experience 24 (2), 179-191, 2012
Cited by 29, 2012
Symbiotic scheduling of concurrent GPU kernels for performance and energy optimizations
T Li, VK Narayana, T El-Ghazawi
Proceedings of the 11th ACM Conference on Computing Frontiers, 1-10, 2014
Cited by 20, 2014
A static task scheduling framework for independent tasks accelerated using a shared graphics processing unit
T Li, VK Narayana, T El-Ghazawi
2011 IEEE 17th International Conference on Parallel and Distributed Systems …, 2011
Cited by 20, 2011
A power-aware symbiotic scheduling algorithm for concurrent GPU kernels
T Li, VK Narayana, T El-Ghazawi
2015 IEEE 21st International Conference on Parallel and Distributed Systems …, 2015
Cited by 19, 2015
Exploring graphics processing unit (GPU) resource sharing efficiency for high performance computing
T Li, VK Narayana, T El-Ghazawi
Computers 2 (4), 176-214, 2013
Cited by 13, 2013
Reconfigurable active drive: An FPGA accelerated storage architecture for data-intensive applications
T Li, M Huang, T El-Ghazawi, H Huang
2009 Symposium on Application Accelerators in High-Performance Computing, 1-3, 2009
Cited by 10, 2009
Accelerated high-performance computing through efficient multi-process GPU resource sharing
T Li, VK Narayana, T El-Ghazawi
Proceedings of the 9th Conference on Computing Frontiers, 269-272, 2012
Cited by 8, 2012
PyTorch distributed: Experiences on accelerating data parallel training
S Li, Y Zhao, R Varma, O Salpekar, P Noordhuis, T Li, ..., S Chintala
arXiv preprint arXiv:2006.15704, 2020
Cited by 6, 2020
Reordering GPU kernel launches to enable efficient concurrent execution
T Li, VK Narayana, T El-Ghazawi
arXiv preprint arXiv:1511.07983, 2015
Cited by 5, 2015