R. Govindarajan
ORCID: 0000-0003-2517-9994
Affiliations:
- ERNET, India
According to our database, R. Govindarajan authored at least 139 papers between 1986 and 2024.
Proceedings of the ACM SIGOPS 30th Symposium on Operating Systems Principles, 2024
Tile Size and Loop Order Selection using Machine Learning for Multi-/Many-Core Architectures.
Proceedings of the 38th ACM International Conference on Supercomputing, 2024
Proceedings of the 30th IEEE International Conference on High Performance Computing, 2023
Proceedings of the 55th IEEE/ACM International Symposium on Microarchitecture, 2022
On odd harmonious labelling of even cycles with parallel chords and dragons with parallel chords.
Int. J. Comput. Aided Eng. Technol., 2020
HAShCache: Heterogeneity-Aware Shared DRAMCache for Integrated Heterogeneous Systems.
ACM Trans. Archit. Code Optim., 2017
Proceedings of the 2017 International Symposium on Code Generation and Optimization, 2017
Proceedings of the Second International Symposium on Memory Systems, 2016
Author Rebuttal to Rocha et al. "Comments on Minimizing Buffer Requirements under Rate-Optimal Schedule in Regular Dataflow Networks".
J. Signal Process. Syst., 2015
Proceedings of the 6th ACM/SPEC International Conference on Performance Engineering, Austin, TX, USA, January 31, 2015
Proceedings of the 2015 IEEE International Parallel and Distributed Processing Symposium, 2015
Proceedings of the 13th Annual IEEE/ACM International Symposium on Code Generation and Optimization, 2015
ACM Trans. Archit. Code Optim., 2014
Proceedings of the ACM SIGMETRICS / International Conference on Measurement and Modeling of Computer Systems, 2014
Proceedings of the 47th Annual IEEE/ACM International Symposium on Microarchitecture, 2014
Fluidic Kernels: Cooperative Execution of OpenCL Programs on Multiple Heterogeneous Devices.
Proceedings of the 12th Annual IEEE/ACM International Symposium on Code Generation and Optimization, 2014
Proceedings of the Compiler Construction - 23rd International Conference, 2014
Preemptive thread block scheduling with online structural runtime prediction for concurrent GPGPU kernels.
Proceedings of the International Conference on Parallel Architectures and Compilation, 2014
J. Signal Process. Syst., 2013
Proceedings of the 2013 IEEE/ACM International Symposium on Code Generation and Optimization, 2013
Proceedings of the Architectural Support for Programming Languages and Operating Systems, 2013
Proceedings of the 22nd International Conference on Parallel Architectures and Compilation Techniques, 2013
On-chip memory architecture exploration framework for DSP processor-based embedded system on chip.
ACM Trans. Embed. Comput. Syst., 2012
Proceedings of the 39th International Symposium on Computer Architecture (ISCA 2012), 2012
Multiple sub-row buffers in DRAM: unlocking performance and energy improvement opportunities.
Proceedings of the International Conference on Supercomputing, 2012
CUDA-For-Clusters: A System for Efficient Execution of CUDA Kernels on Multi-core Clusters.
Proceedings of the Euro-Par 2012 Parallel Processing - 18th International Conference, 2012
Proceedings of the 10th Annual IEEE/ACM International Symposium on Code Generation and Optimization, 2012
Fast and efficient automatic memory management for GPUs using compiler-assisted runtime coherence scheme.
Proceedings of the International Conference on Parallel Architectures and Compilation Techniques, 2012
J. Instr. Level Parallelism, 2011
Proceedings of the IEEE Workshop on Signal Processing Systems, 2011
Automatic compilation of MATLAB programs for synergistic execution on heterogeneous processors.
Proceedings of the 32nd ACM SIGPLAN Conference on Programming Language Design and Implementation, 2011
Variable Granularity Access Tracking Scheme for Improving the Performance of Software Transactional Memory.
Proceedings of the 25th IEEE International Symposium on Parallel and Distributed Processing, 2011
Proceedings of the 17th International Conference on High-Performance Computer Architecture (HPCA-17 2011), 2011
Proceedings of the High Performance Embedded Architectures and Compilers, 2011
Proceedings of the CGO 2011, 2011
Proceedings of the 2011 International Conference on Parallel Architectures and Compilation Techniques, 2011
Row-Buffer Reorganization: Simultaneously Improving Performance and Reducing Energy in DRAMs.
Proceedings of the 2011 International Conference on Parallel Architectures and Compilation Techniques, 2011
Proceedings of the Static Analysis - 17th International Symposium, 2010
Proceedings of the 39th International Conference on Parallel Processing, 2010
Analyzing cache performance bottlenecks of STM applications and addressing them with compiler's help.
Proceedings of the 19th International Conference on Parallel Architectures and Compilation Techniques, 2010
Proceedings of the 19th International Conference on Parallel Architectures and Compilation Techniques, 2010
IEEE Trans. Computers, 2009
Proceedings of the 2009 ACM SIGPLAN/SIGBED conference on Languages, 2009
Proceedings of the 18th International Conference on Computer Communications and Networks, 2009
Proceedings of the CGO 2009, 2009
Proceedings of the Programming Languages and Systems, 7th Asian Symposium, 2009
Proceedings of the PACT 2009, 2009
Impact of message compression on the scalability of an atmospheric modeling application on clusters.
Parallel Comput., 2008
Proceedings of the 21st International Conference on VLSI Design (VLSI Design 2008), 2008
A systematic approach to synthesis of verification test-suites for modular SoC designs.
Proceedings of the 21st Annual IEEE International SoC Conference, SoCC 2008, 2008
Proceedings of the 9th Annual Conference of the International Speech Communication Association, 2008
Proceedings of the 22nd Annual International Conference on Supercomputing, 2008
Proceedings of the High Performance Computing, 2008
Proceedings of the Sixth International Symposium on Code Generation and Optimization (CGO 2008), 2008
ACM Trans. Archit. Code Optim., 2007
Int. J. Parallel Program., 2007
MAX: A Multi Objective Memory Architecture eXploration Framework for Embedded Systems-on-Chip.
Proceedings of the 20th International Conference on VLSI Design (VLSI Design 2007), 2007
Proceedings of the Fourth International Conference on the Quantitative Evaluation of Systems (QEST 2007), 2007
Proceedings of the 40th Annual IEEE/ACM International Symposium on Microarchitecture (MICRO-40 2007), 2007
Proceedings of the 21st International Parallel and Distributed Processing Symposium (IPDPS 2007), 2007
Proceedings of the High Performance Computing, 2007
Proceedings of the Compiler Construction, 16th International Conference, 2007
Register Allocation and Optimal Spill Code Scheduling in Software Pipelined Loops Using 0-1 Integer Linear Programming Formulation.
Proceedings of the Compiler Construction, 16th International Conference, 2007
MODLEX: A Multi Objective Data Layout EXploration Framework for Embedded Systems-on-Chip.
Proceedings of the 12th Conference on Asia South Pacific Design Automation, 2007
Proceedings of the 16th International Conference on Parallel Architectures and Compilation Techniques (PACT 2007), 2007
Proceedings of the 16th International Conference on Parallel Architectures and Compilation Techniques (PACT 2007), 2007
Advances in Software Pipelining.
Proceedings of the Compiler Design Handbook: Optimizations and Machine Code Generation, 2007
Instruction Scheduling.
Proceedings of the Compiler Design Handbook: Optimizations and Machine Code Generation, 2007
Area and Power Reduction of Embedded DSP Systems using Instruction Compression and Re-configurable Encoding.
J. VLSI Signal Process., 2006
Exploiting programmable network interfaces for parallel query execution in workstation clusters.
Proceedings of the 20th International Parallel and Distributed Processing Symposium (IPDPS 2006), 2006
Proceedings of the 20th Annual International Conference on Supercomputing, 2006
Proceedings of the 15th International Conference on Parallel Architectures and Compilation Techniques (PACT 2006), 2006
J. Embed. Comput., 2005
Proceedings of the Second International Conference on the Quantitative Evaluation of Systems (QEST 2005), 2005
Proceedings of the 19th Annual International Conference on Supercomputing, 2005
Offloading Bloom Filter Operations to Network Processor for Parallel Query Processing in Cluster of Workstations.
Proceedings of the High Performance Computing, 2005
Performance analysis of methods that overcome false sharing effects in software DSMs.
J. Parallel Distributed Comput., 2004
Int. J. Parallel Program., 2004
Proceedings of the 2nd IEEE / ACM International Symposium on Code Generation and Optimization (CGO 2004), 2004
Proceedings of the 2nd IEEE / ACM International Symposium on Code Generation and Optimization (CGO 2004), 2004
Minimum Register Instruction Sequencing to Reduce Register Spills in Out-of-Order Issue Superscalar Architectures.
IEEE Trans. Computers, 2003
Proceedings of the 16th International Conference on VLSI Design (VLSI Design 2003), 2003
Unified Instruction Reordering and Algebraic Transformations for Minimum Cost Offset Assignment.
Proceedings of the Software and Compilers for Embedded Systems, 7th International Workshop, 2003
Proceedings of the Languages and Compilers for Parallel Computing, 2003
An Executable Analytical Performance Evaluation Approach for Early Performance Prediction.
Proceedings of the 17th International Parallel and Distributed Processing Symposium (IPDPS 2003), 2003
Programming Models and System Software for Future High-End Computing Systems: Work-in-Progress.
Proceedings of the 17th International Parallel and Distributed Processing Symposium (IPDPS 2003), 2003
Exploiting Java-ILP on a Simultaneous Multi-Trace Instruction Issue (SMTI) Processor.
Proceedings of the 17th International Parallel and Distributed Processing Symposium (IPDPS 2003), 2003
Proceedings of the High Performance Computing - HiPC 2003, 10th International Conference, 2003
Minimizing Buffer Requirements under Rate-Optimal Schedule in Regular Dataflow Networks.
J. VLSI Signal Process., 2002
A Theory for Co-Scheduling Hardware and Software Pipelines in ASIPs and Embedded Processors.
Des. Autom. Embed. Syst., 2002
Power-Performance Trade-Offs for Energy-Efficient Architectures: A Quantitative Study.
Proceedings of the 20th International Conference on Computer Design (ICCD 2002), 2002
Proceedings of the High Performance Computing, 2002
Proceedings of the Compiler Design Handbook: Optimizations and Machine Code Generation, 2002
J. Parallel Distributed Comput., 2001
Minimum Register Instruction Sequence Problem: Revisiting Optimal Code Generation for DAGs.
Proceedings of the 15th International Parallel & Distributed Processing Symposium (IPDPS-01), 2001
Proceedings of the High Performance Computing - HiPC 2001, 8th International Conference, 2001
Enhanced Co-Scheduling: A Software Pipelining Method Using Modulo-Scheduled Pipeline Theory.
Int. J. Parallel Program., 2000
Proceedings of the 12th IEEE International Conference on Application-Specific Systems, 2000
Minimum Register Instruction Scheduling: A New Approach for Dynamic Instruction Issue Processors.
Proceedings of the Languages and Compilers for Parallel Computing, 1999
Resource usage models for instruction scheduling: two new models and a classification.
Proceedings of the 13th International Conference on Supercomputing, 1999
Proceedings of the High Performance Computing, 1999
Proceedings of the Compiler Construction, 8th International Conference, 1999
Evaluating Register Allocation and Instruction Scheduling Techniques in Out-Of-Order Issue Processors.
Proceedings of the 1999 International Conference on Parallel Architectures and Compilation Techniques, 1999
A Unified Framework for Instruction Scheduling and Mapping for Function Units with Structural Hazards.
J. Parallel Distributed Comput., 1998
Proceedings of the IEEE International Conference on Systems, Man and Cybernetics, 1998
Proceedings of the 12th International Parallel Processing Symposium / 9th Symposium on Parallel and Distributed Processing (IPPS/SPDP '98), March 30, 1998
Proceedings of the 12th International Parallel Processing Symposium / 9th Symposium on Parallel and Distributed Processing (IPPS/SPDP '98), March 30, 1998
Proceedings of the 5th International Conference On High Performance Computing, 1998
Proceedings of the Seventh International Workshop on Petri Nets and Performance Models, 1997
Proceedings of the 1997 International Conference on Parallel and Distributed Systems (ICPADS '97), 1997
Classification and performance evaluation of simultaneous multithreaded architectures.
Proceedings of the Fourth International Conference on High-Performance Computing, 1997
Proceedings of the 1997 Conference on Parallel Architectures and Compilation Techniques (PACT '97), 1997
IEEE Trans. Parallel Distributed Syst., 1996
Proceedings of the Second International Symposium on High-Performance Computer Architecture, 1996
Buffer allocation in regular dataflow networks: an approach based on coloring circular-arc graphs.
Proceedings of the 3rd International Conference on High Performance Computing, 1996
Proceedings of the ACM SIGPLAN'95 Conference on Programming Language Design and Implementation (PLDI), 1995
Proceedings of the Languages and Compilers for Parallel Computing, 1995
Proceedings of the 1st IEEE Symposium on High-Performance Computer Architecture (HPCA 1995), 1995
Proceedings of the PARLE '94: Parallel Architectures and Languages Europe, 1994
Minimizing register requirements under resource-constrained rate-optimal software pipelining.
Proceedings of the 27th Annual International Symposium on Microarchitecture, San Jose, California, USA, November 30, 1994
Proceedings of the International Conference on Application Specific Array Processors, 1994
IEEE Trans. Software Eng., 1993
Proceedings of the Fifth IEEE Symposium on Parallel and Distributed Processing, 1993
Proceedings of the International Conference on Application-Specific Array Processors, 1993
Attempting guards in parallel: A data flow approach to execute generalized guarded commands.
Int. J. Parallel Program., 1992
Proceedings of the Fourth IEEE Symposium on Parallel and Distributed Processing, 1992
Proceedings of the 25th Annual International Symposium on Microarchitecture, 1992
Performance Evaluation of Latency Tolerant Architectures.
Proceedings of the Computing and Information, 1992
Proceedings of the 1992 IEEE International Conference on Acoustics, 1992
Proceedings of the Parallel Processing: CONPAR 92, 1992
Proceedings of the Sixteenth Annual International Computer Software and Applications Conference, 1992
Proceedings of the PARLE '91: Parallel Architectures and Languages Europe, 1991
Proceedings of the Fifteenth Annual International Computer Software and Applications Conference, 1991
Lenient Execution and Concurrent Execution of Re-Entrant Routines: Efficient Implementation in Data Flow Systems.
Comput. J., 1990
PROMIDS: A PROtotype multi-rIng data flow system for functional programming languages.
Microprocessing and Microprogramming, 1989
Design and Performance Evaluation of EXMAN: An EXtended MANchester Data Flow Computer.
IEEE Trans. Computers, 1986