T. N. Vijaykumar

ORCID: 0000-0001-6624-4372

Affiliations:
  • Purdue University, West Lafayette, USA


According to our database, T. N. Vijaykumar authored at least 98 papers between 1994 and 2024.

Bibliography

2024
Efficient Sparse Processing-in-Memory Architecture (ESPIM) for Machine Learning Inference.
CoRR, 2024

QED: Scalable Verification of Hardware Memory Consistency.
CoRR, 2024

2023
Occam: Optimal Data Reuse for Convolutional Neural Networks.
ACM Trans. Archit. Code Optim., March, 2023

SafeBet: Secure, Simple, and Fast Speculative Execution.
CoRR, 2023

Eureka: Efficient Tensor Cores for One-sided Unstructured Sparsity in DNN Inference.
Proceedings of the 56th Annual IEEE/ACM International Symposium on Microarchitecture, 2023

2022
Booster: An Accelerator for Gradient Boosting Decision Trees Training and Inference.
Proceedings of the 2022 IEEE International Parallel and Distributed Processing Symposium, 2022

2021
Karma: Cost-Effective Geo-Replicated Cloud Storage with Dynamic Enforcement of Causal Consistency.
IEEE Trans. Cloud Comput., 2021

Barrier-Free Large-Scale Sparse Tensor Accelerator (BARISTA) For Convolutional Neural Networks.
CoRR, 2021

FastZ: accelerating gapped whole genome alignment on GPUs.
Proceedings of the International Conference for High Performance Computing, 2021

2020
Dart: Divide and Specialize for Fast Response to Congestion in RDMA-Based Datacenter Networks.
IEEE/ACM Trans. Netw., 2020

Network Interface Architecture for Remote Indirect Memory Access (RIMA) in Datacenters.
ACM Trans. Archit. Code Optim., 2020

Booster: An Accelerator for Gradient Boosting Decision Trees.
CoRR, 2020

Newton: A DRAM-maker's Accelerator-in-Memory (AiM) Architecture for Machine Learning.
Proceedings of the 53rd Annual IEEE/ACM International Symposium on Microarchitecture, 2020

Secure automatic bounds checking: prevention is simpler than cure.
Proceedings of the CGO '20: 18th ACM/IEEE International Symposium on Code Generation and Optimization, 2020

2019
SparTen: A Sparse Tensor Accelerator for Convolutional Neural Networks.
Proceedings of the 52nd Annual IEEE/ACM International Symposium on Microarchitecture, 2019

2018
Dart: Divide and Specialize for Fast Response to Congestion in RDMA-based Datacenter Networks.
CoRR, 2018

Fast Congestion Control in RDMA-based Datacenter Networks.
Proceedings of the ACM SIGCOMM 2018 Conference on Posters and Demos, 2018

Millipede: Die-Stacked Memory Optimizations for Big Data Machine Learning Analytics.
Proceedings of the 2018 IEEE International Parallel and Distributed Processing Symposium, 2018

2017
NutShell: Scalable Whittled Proxy Execution for Low-Latency Web over Cellular Networks.
Proceedings of the 23rd Annual International Conference on Mobile Computing and Networking, 2017

Efficient Collaborative Approximation in MapReduce without Missing Rare Keys.
Proceedings of the 2017 International Conference on Cloud and Autonomic Computing, 2017

Exploring Functional Slicing in the Design of Distributed SDN Controllers.
Proceedings of the Communication Systems and Networks - 9th International Conference, 2017

Hydra: Leveraging functional slicing for efficient distributed SDN controllers.
Proceedings of the 9th International Conference on Communication Systems and Networks, 2017

2015
TimeTrader: Exploiting Latency Tail to Save Datacenter Energy for On-line Data-Intensive Applications.
CoRR, 2015

MigrantStore: Leveraging Virtual Memory in DRAM-PCM Memory Architecture.
CoRR, 2015

TimeTrader: exploiting latency tail to save datacenter energy for online search.
Proceedings of the 48th International Symposium on Microarchitecture, 2015

FaultHound: value-locality-based soft-fault tolerance.
Proceedings of the 42nd Annual International Symposium on Computer Architecture, 2015

2014
ShuffleWatcher: Shuffle-aware Scheduling in Multi-tenant MapReduce Clusters.
Proceedings of the 2014 USENIX Annual Technical Conference, 2014

Fractal++: Closing the performance gap between fractal and conventional coherence.
Proceedings of the ACM/IEEE 41st International Symposium on Computer Architecture, 2014

High-performance fractal coherence.
Proceedings of the Architectural Support for Programming Languages and Operating Systems, 2014

2013
MapReduce with communication overlap (MaRCO).
J. Parallel Distributed Comput., 2013

Wait-n-GoTM: improving HTM performance by serializing cyclic dependencies.
Proceedings of the Architectural Support for Programming Languages and Operating Systems, 2013

2012
Top Picks from the 2011 Computer Architecture Conferences.
IEEE Micro, 2012

Deadline-aware datacenter TCP (D2TCP).
Proceedings of the ACM SIGCOMM 2012 Conference, 2012

Tarazu: optimizing MapReduce on heterogeneous clusters.
Proceedings of the 17th International Conference on Architectural Support for Programming Languages and Operating Systems, 2012

2011
TreeCAM: decoupling updates and lookups in packet classification.
Proceedings of the 2011 Conference on Emerging Networking Experiments and Technologies, 2011

2010
EffiCuts: optimizing packet classification for memory and throughput.
Proceedings of the ACM SIGCOMM 2010 Conference on Applications, 2010

Adaptive Flow Control for Robust Performance and Energy.
Proceedings of the 43rd Annual IEEE/ACM International Symposium on Microarchitecture, 2010

Timetraveler: exploiting acyclic races for optimizing memory race recording.
Proceedings of the 37th International Symposium on Computer Architecture (ISCA 2010), 2010

LiteTM: Reducing transactional state overhead.
Proceedings of the 16th International Conference on High-Performance Computer Architecture (HPCA-16 2010), 2010

Joint optimization of idle and cooling power in data centers while maintaining response time.
Proceedings of the 15th International Conference on Architectural Support for Programming Languages and Operating Systems, 2010

2009
Speculatively Multithreaded Architectures.
Proceedings of the Multicore Processors and Systems, 2009

2008
Optimal Power/Performance Pipeline Depth for SMT in Scaled Technologies.
IEEE Trans. Computers, 2008

Automatic volume management for programmable microfluidics.
Proceedings of the ACM SIGPLAN 2008 Conference on Programming Language Design and Implementation, 2008

Shapeshifter: Dynamically changing pipeline width and speed to address process variations.
Proceedings of the 41st Annual IEEE/ACM International Symposium on Microarchitecture (MICRO-41 2008), 2008

2007
Speculative thread decomposition through empirical optimization.
Proceedings of the 12th ACM SIGPLAN Symposium on Principles and Practice of Parallel Programming, 2007

Resource area dilation to reduce power density in throughput servers.
Proceedings of the 2007 International Symposium on Low Power Electronics and Design, 2007

Aquacore: a programmable architecture for microfluidics.
Proceedings of the 34th International Symposium on Computer Architecture (ISCA 2007), 2007

BlackJack: Hard Error Detection with Redundant Threads on SMT.
Proceedings of the 37th Annual IEEE/IFIP International Conference on Dependable Systems and Networks, 2007

2006
Exploiting reference idempotency to reduce speculative storage overflow.
ACM Trans. Program. Lang. Syst., 2006

SmashGuard: A Hardware Solution to Prevent Security Attacks on the Function Return Address.
IEEE Trans. Computers, 2006

Opportunistic Transient-Fault Detection.
IEEE Micro, 2006

Dynamic feature selection for hardware prediction.
J. Syst. Archit., 2006

Pesticide: Using SMT Processors to Improve Performance of Pointer Bug Detection.
Proceedings of the 24th International Conference on Computer Design (ICCD 2006), 2006

A program transformation and architecture support for quantum uncomputation.
Proceedings of the 12th International Conference on Architectural Support for Programming Languages and Operating Systems, 2006

Do Trace Cache, Value Prediction and Prefetching Improve SMT Throughput?
Proceedings of the Architecture of Computing Systems, 2006

2005
Combined circuit and architectural level variable supply-voltage scaling for low power.
IEEE Trans. Very Large Scale Integr. Syst., 2005

Detection and prevention of stack buffer overflow attacks.
Commun. ACM, 2005

Dynamic pipelining: making IP-lookup truly scalable.
Proceedings of the ACM SIGCOMM 2005 Conference on Applications, 2005

Balancing Resource Utilization to Mitigate Power Density in Processor Pipelines.
Proceedings of the 38th Annual IEEE/ACM International Symposium on Microarchitecture (MICRO-38 2005), 2005

Rescue: A Microarchitecture for Testability and Defect Tolerance.
Proceedings of the 32nd International Symposium on Computer Architecture (ISCA 2005), 2005

Optimizing Replication, Communication, and Capacity Allocation in CMPs.
Proceedings of the 32nd International Symposium on Computer Architecture (ISCA 2005), 2005

Heat Stroke: Power-Density-Based Denial of Service in SMT.
Proceedings of the 11th International Conference on High-Performance Computer Architecture (HPCA-11 2005), 2005

2004
DCG: deterministic clock-gating for low-power microprocessor design.
IEEE Trans. Very Large Scale Integr. Syst., 2004

Min-cut program decomposition for thread-level speculation.
Proceedings of the ACM SIGPLAN 2004 Conference on Programming Language Design and Implementation 2004, 2004

Wire Delay is Not a Problem for SMT (In the Near Future).
Proceedings of the 31st International Symposium on Computer Architecture (ISCA 2004), 2004

Exploiting Resonant Behavior to Reduce Inductive Noise.
Proceedings of the 31st International Symposium on Computer Architecture (ISCA 2004), 2004

Heat-and-run: leveraging SMT and CMP to manage power density through the operating system.
Proceedings of the 11th International Conference on Architectural Support for Programming Languages and Operating Systems, 2004

Software prefetching for mark-sweep garbage collection: hardware analysis and software redesign.
Proceedings of the 11th International Conference on Architectural Support for Programming Languages and Operating Systems, 2004

2003
Transient-Fault Recovery for Chip Multiprocessors.
IEEE Micro, 2003

Reducing Design Complexity of the Load/Store Queue.
Proceedings of the 36th Annual International Symposium on Microarchitecture, 2003

VSV: L2-Miss-Driven Variable Supply-Voltage Scaling for Low Power.
Proceedings of the 36th Annual International Symposium on Microarchitecture, 2003

Distance Associativity for High-Performance Energy-Efficient Non-Uniform Cache Architectures.
Proceedings of the 36th Annual International Symposium on Microarchitecture, 2003

Accelerating private-key cryptography via multithreading on symmetric multiprocessors.
Proceedings of the 2003 IEEE International Symposium on Performance Analysis of Systems and Software, 2003

Pipeline muffling and a priori current ramping: architectural techniques to reduce high-frequency inductive noise.
Proceedings of the 2003 International Symposium on Low Power Electronics and Design, 2003

Pipeline Damping: A Microarchitectural Technique to Reduce Inductive Noise in Supply Voltage.
Proceedings of the 30th International Symposium on Computer Architecture (ISCA 2003), 2003

Implicitly-Multithreaded Processors.
Proceedings of the 30th International Symposium on Computer Architecture (ISCA 2003), 2003

Efficient Use of Memory Bandwidth to Improve Network Processor Throughput.
Proceedings of the 30th International Symposium on Computer Architecture (ISCA 2003), 2003

Deterministic Clock Gating for Microprocessor Power Reduction.
Proceedings of the Ninth International Symposium on High-Performance Computer Architecture (HPCA'03), 2003

Exploring High Bandwidth Pipelined Cache Architecture for Scaled Technology.
Proceedings of the 2003 Design, 2003

Exploring High Bandwidth Pipelined Cache Architecture for Scaled Technology.
Proceedings of the Embedded Software for SoC, 2003

2002
Reducing register ports for higher speed and lower energy.
Proceedings of the 35th Annual International Symposium on Microarchitecture, 2002

Transient-Fault Recovery Using Simultaneous Multithreading.
Proceedings of the 29th International Symposium on Computer Architecture (ISCA 2002), 2002

Exploiting Choice in Resizable Cache Design to Optimize Deep-Submicron Processor Energy-Delay.
Proceedings of the Eighth International Symposium on High-Performance Computer Architecture (HPCA'02), 2002

2001
Reducing leakage in a high-performance deep-submicron instruction cache.
IEEE Trans. Very Large Scale Integr. Syst., 2001

Speculative Versioning Cache.
IEEE Trans. Parallel Distributed Syst., 2001

Reference idempotency analysis: a framework for optimizing speculative execution.
Proceedings of the 2001 ACM SIGPLAN Symposium on Principles and Practice of Parallel Programming (PPOPP'01), 2001

Reducing set-associative cache energy via way-prediction and selective direct-mapping.
Proceedings of the 34th Annual International Symposium on Microarchitecture, 2001

Skipper: a microarchitecture for exploiting control-flow independence.
Proceedings of the 34th Annual International Symposium on Microarchitecture, 2001

Multiplex: unifying conventional and speculative thread-level parallelism on a chip multiprocessor.
Proceedings of the 15th international conference on Supercomputing, 2001

An Integrated Circuit/Architecture Approach to Reducing Leakage in Deep-Submicron High-Performance I-Caches.
Proceedings of the Seventh International Symposium on High-Performance Computer Architecture (HPCA'01), 2001

Reactive-Associative Caches.
Proceedings of the 2001 International Conference on Parallel Architectures and Compilation Techniques (PACT 2001), 2001

2000
Gated-Vdd: a circuit technique to reduce leakage in deep-submicron cache memories.
Proceedings of the 2000 International Symposium on Low Power Electronics and Design, 2000

1999
Task Selection for the Multiscalar Architecture.
J. Parallel Distributed Comput., 1999

Is SC + ILP=RC?
Proceedings of the 26th Annual International Symposium on Computer Architecture, 1999

1998
Task Selection for a Multiscalar Processor.
Proceedings of the 31st Annual IEEE/ACM International Symposium on Microarchitecture, 1998

1997
Dynamic Speculation and Synchronization of Data Dependences.
Proceedings of the 24th International Symposium on Computer Architecture, 1997

1995
Multiscalar Processors.
Proceedings of the 22nd Annual International Symposium on Computer Architecture, 1995

1994
The anatomy of the register file in a multiscalar processor.
Proceedings of the 27th Annual International Symposium on Microarchitecture, San Jose, California, USA, November 30, 1994

