Neal Crago, Ph.D. Senior Research Scientist, NVIDIA ncrago{at}nvidia.com google scholar
Neal Crago is a Senior Research Scientist at NVIDIA Research. His research specializes in hardware/software co-design of data-parallel, spatial, and domain-specific computer architectures, with a focus on managing parallelism and data movement in the memory subsystem. Dr. Crago received his B.S., M.S., and Ph.D. in Electrical and Computer Engineering from the University of Illinois at Urbana-Champaign (UIUC).
Awards:
Best Paper Nominee, HPCA 2021
Top Picks Honorable Mention, IEEE MICRO 2019
Best Paper Nominee, ISPASS 2014
Top Picks Awardee, IEEE MICRO 2013
Best Paper Nominee, HPCA 2013
Publications:
“WASP: Exploiting GPU Pipeline Parallelism with Hardware-Accelerated Automatic Warp Specialization” [preprint pdf] Neal Crago, Sana Damani, Karu Sankaralingam, Stephen W. Keckler International Symposium on High-Performance Computer Architecture (HPCA), March 2024
“Symphony: Orchestrating Sparse and Dense Tensors with Hierarchical Heterogeneous Processing” [pdf] Michael Pellauer, Jason Clemons, Vignesh Balaji, Neal Crago, Aamer Jaleel, Donghyuk Lee, Mike O’Connor, Angshuman Parashar, Sean Treichler, Po-An Tsai, Stephen W. Keckler, Joel S. Emer ACM Transactions on Computer Systems (TOCS), October 2023.
“Community-based Matrix Reordering for Sparse Linear Algebra Optimization” [pdf] Vignesh Balaji, Neal Crago, Aamer Jaleel, Stephen W Keckler International Symposium on Performance Analysis of Systems and Software (ISPASS), April 2023.
“Accelerating Sparse Data Orchestration via Dynamic Reflexive Tiling” [pdf] Toluwanimi O. Odemuyiwa, Hadi Asghari-Moghaddam, Michael Pellauer, Kartik Hegde, Po-An Tsai, Neal Crago, Aamer Jaleel, John D Owens, Edgar Solomonik, Joel S Emer, Christopher W Fletcher International Conference on Architectural Support for Programming Languages and Operating Systems (ASPLOS), March 2023.
“P-OPT: Practical Optimal Cache Replacement for Graph Analytics” [pdf] *Best Paper Nominee* Vignesh Balaji, Neal Crago, Aamer Jaleel, Brandon Lucia International Symposium on High-Performance Computer Architecture (HPCA), February 2021.
“ExTensor: An Accelerator for Sparse Tensor Algebra” [pdf] *IEEE MICRO Top Picks Honorable Mention* Kartik Hegde, Hadi Asghari-Moghaddam, Michael Pellauer, Neal Crago, Aamer Jaleel, Edgar Solomonik, Joel Emer, Christopher W Fletcher International Symposium on Microarchitecture (MICRO), October 2019.
“Buffets: An Efficient and Composable Storage Idiom for Explicit Decoupled Data Orchestration” [pdf] *IEEE MICRO Top Picks Honorable Mention* Michael Pellauer, Yakun Sophia Shao, Jason Clemons, Neal Crago, Kartik Hegde, Rangharajan Venkatesan, Stephen W Keckler, Christopher W. Fletcher, Joel Emer International Conference on Architectural Support for Programming Languages and Operating Systems (ASPLOS), April 2019.
“Exposing Memory Access Patterns to Improve Instruction and Memory Efficiency in GPUs” [pdf] Neal Crago, Mark Stephenson, Stephen W Keckler ACM Transactions on Architecture and Code Optimization (TACO), October 2018
“Efficient Control and Communication Paradigms for Coarse-grained Spatial Architectures” [pdf] Michael Pellauer, Angshuman Parashar, Michael Adler, Bushra Ahsan, Randy Allmon, Neal Crago, Kermin Fleming, Mohit Gambhir, Aamer Jaleel, Tushar Krishna, Daniel Lustig, Stephen Maresh, Vladimir Pavlov, Rachid Rayess, Antonia Zhai, Joel Emer. ACM Transactions on Computer Systems (TOCS), September 2015
“Exploiting Spatial Architectures for Edit Distance Algorithms” [pdf] *Best Paper Nominee* Jesmin Jahan Tithi, Neal Crago, Joel Emer. International Symposium on Performance Analysis of Systems and Software (ISPASS), March 2014
“Triggered Instructions: A Control Paradigm for Spatially-programmed Architectures” [pdf, pdf] *IEEE MICRO Top Picks Awardee* Angshuman Parashar, Michael Pellauer, Michael Adler, Bushra Ahsan, Neal Crago, Daniel Lustig, Vladimir Pavlov, Antonia Zhai, Mohit Gambhir, Aamer Jaleel, Randy Allmon, Rachid Rayess, Stephen Maresh, Joel Emer International Symposium on Computer Architecture (ISCA), June 2013
“Hybrid Latency Tolerance for Robust Energy-efficiency on 1000-core Data Parallel Processors” [pdf] *Best Paper Nominee* Neal Crago, Omid Azizi, Steven S Lumetta, Sanjay J Patel International Symposium on High Performance Computer Architecture (HPCA), February 2013.
“Decoupled Architectures as a Low-Complexity Alternative to Out-of-order Execution” [pdf] Neal Crago, Sanjay J Patel International Conference on Parallel Architectures and Compilation Techniques (PACT), October 2011
“OUTRIDER: Efficient Memory Latency Tolerance with Decoupled Strands” [pdf] Neal Crago, Sanjay J Patel International Symposium on Computer Architecture (ISCA), June 2011.
“Rigel: An Architecture and Scalable Programming Interface for a 1000-core Accelerator” [pdf] John H Kelm, Daniel R Johnson, Matthew R Johnson, Neal Crago, William Tuohy, Aqeel Mahesri, Steven S Lumetta, Matthew I Frank, Sanjay J Patel International Symposium on Computer Architecture (ISCA), June 2009.
“Tradeoffs in Designing Accelerator Architectures for Visual Computing” [pdf] Aqeel Mahesri, Daniel Johnson, Neal Crago, Sanjay J Patel International Symposium on Microarchitecture (MICRO), November 2008