S. K. Nandy
Affiliations:
- Indian Institute of Science (IISc), Department of Computational and Data Sciences, CAD Laboratory, Bangalore, India
- ERNET, India
According to our database, S. K. Nandy authored at least 155 papers between 1986 and 2022.
Bibliography
2022
CoRR, 2022
A Survey on High-Throughput Non-Binary LDPC Decoders: ASIC, FPGA, and GPU Architectures.
IEEE Commun. Surv. Tutorials, 2022
2021
Factorization of Boolean Polynomials: Parallel Algorithms and Experimental Evaluation.
Program. Comput. Softw., 2021
2020
Towards Accelerated Genome Informatics on Parallel HPC Platforms: The ReneGENE-GI Perspective.
J. Signal Process. Syst., 2020
2019
J. Low Power Electron., 2019
Proceedings of the 32nd International Conference on VLSI Design and 18th International Conference on Embedded Systems, 2019
A Systematic Approach for Acceleration of Matrix-Vector Operations in CGRA through Algorithm-Architecture Co-Design.
Proceedings of the 32nd International Conference on VLSI Design and 18th International Conference on Embedded Systems, 2019
Proceedings of the Perspectives of System Informatics, 2019
2018
IEEE Trans. Parallel Distributed Syst., 2018
Efficient Realization of Householder Transform Through Algorithm-Architecture Co-Design for Acceleration of QR Factorization.
IEEE Trans. Parallel Distributed Syst., 2018
Efficient Realization of Givens Rotation through Algorithm-Architecture Co-design for Acceleration of QR Factorization.
CoRR, 2018
Proceedings of the 31st International Conference on VLSI Design and 17th International Conference on Embedded Systems, 2018
An Algorithm - Architecture Co-Designed System for Dynamic Execution-Driven Pre-Silicon Verification.
Proceedings of the 8th International Symposium on Embedded Computing and System Design, 2018
Proceedings of the Applied Reconfigurable Computing. Architectures, Tools, and Applications, 2018
ReneGENE-Novo: Co-designed Algorithm-Architecture for Accelerated Preprocessing and Assembly of Genomic Short Reads.
Proceedings of the Applied Reconfigurable Computing. Architectures, Tools, and Applications, 2018
Achieving Efficient Realization of Kalman Filter on CGRA Through Algorithm-Architecture Co-design.
Proceedings of the Applied Reconfigurable Computing. Architectures, Tools, and Applications, 2018
2017
Parallel Process. Lett., 2017
Energy aware synthesis of application kernels through composition of data-paths on a CGRA.
Integr., 2017
REDEFINE®™: a case for WCET-friendly hardware accelerators for real time applications (work-in-progress).
Proceedings of the 2017 International Conference on Compilers, 2017
2016
IEEE Trans. Parallel Distributed Syst., 2016
Nano Commun. Networks, 2016
CoRR, 2016
An Energy Efficient Dynamically Reconfigurable QR Decomposition for Wireless MIMO Communication.
Proceedings of the 29th International Conference on VLSI Design and 15th International Conference on Embedded Systems, 2016
Achieving Efficient QR Factorization by Algorithm-Architecture Co-design of Householder Transformation.
Proceedings of the 29th International Conference on VLSI Design and 15th International Conference on Embedded Systems, 2016
Efficient Realization of Table Look-Up Based Double Precision Floating Point Arithmetic.
Proceedings of the 29th International Conference on VLSI Design and 15th International Conference on Embedded Systems, 2016
VOP: Architecture of a Processor for Vector Operations in On-Line Learning of Neural Networks.
Proceedings of the 29th International Conference on VLSI Design and 15th International Conference on Embedded Systems, 2016
Proceedings of the 29th International Conference on VLSI Design and 15th International Conference on Embedded Systems, 2016
Proceedings of the International Conference on Embedded Computer Systems: Architectures, 2016
Proceedings of the 7th Workshop on Parallel Programming and Run-Time Management Techniques for Many-core Architectures and the 5th Workshop on Design Tools and Architectures For Multicore Embedded Computing Platforms, 2016
Performance Evaluation of Feed-Forward Backpropagation Neural Network for Classification on a Reconfigurable Hardware Architecture.
Proceedings of the Applied Reconfigurable Computing - 12th International Symposium, 2016
2015
Scalable and Energy Efficient, Dynamically Reconfigurable Fast Fourier Transform Architecture.
J. Low Power Electron., 2015
Router Attack toward NoC-enabled MPSoC and Monitoring Countermeasures against such Threat.
Circuits Syst. Signal Process., 2015
Proceedings of the 28th International Conference on VLSI Design, 2015
Micro-architectural Enhancements in Distributed Memory CGRAs for LU and QR Factorizations.
Proceedings of the 28th International Conference on VLSI Design, 2015
Proceedings of the 28th International Conference on VLSI Design, 2015
Proceedings of the 28th IEEE International System-on-Chip Conference, 2015
A deterministic, minimal routing algorithm for a toroidal, rectangular honeycomb topology using a 2-tupled relative address.
Proceedings of the 28th IEEE International System-on-Chip Conference, 2015
Proceedings of the 2015 IEEE World Congress on Services, 2015
Energy Aware Synthesis of Application Kernels Expressed in Functional Languages on a Coarse Grained Composable Reconfigurable Array.
Proceedings of the IEEE International Symposium on Nanoelectronic and Information Systems, 2015
Proceedings of the 17th IEEE International Conference on High Performance Computing and Communications, 2015
Proceedings of the International Conference on Cloud Computing and Big Data, 2015
2014
A framework for post-silicon realization of arbitrary instruction extensions on reconfigurable data-paths.
J. Syst. Archit., 2014
Proceedings of the 2014 27th International Conference on VLSI Design, 2014
Scalable and energy-efficient reconfigurable accelerator for column-wise givens rotation.
Proceedings of the 22nd International Conference on Very Large Scale Integration, 2014
Co-exploration of NLA kernels and specification of Compute Elements in distributed memory CGRAs.
Proceedings of the XIVth International Conference on Embedded Computer Systems: Architectures, 2014
Proceedings of the XIVth International Conference on Embedded Computer Systems: Architectures, 2014
Energy Efficient, Scalable, and Dynamically Reconfigurable FFT Architecture for OFDM Systems.
Proceedings of the 2014 Fifth International Symposium on Electronic System Design, 2014
Hardware architecture of bi-cubic convolution interpolation for real-time image scaling.
Proceedings of the 2014 International Conference on Field-Programmable Technology, 2014
Proceedings of the 2014 IEEE Fourth International Conference on Big Data and Cloud Computing, 2014
Proceedings of the 2014 IEEE Fourth International Conference on Big Data and Cloud Computing, 2014
Proceedings of the IEEE 25th International Conference on Application-Specific Systems, 2014
2013
Proceedings of the 26th International Conference on VLSI Design and 12th International Conference on Embedded Systems, 2013
High throughput, low latency, memory optimized 64K point FFT architecture using novel radix-4 butterfly unit.
Proceedings of the 2013 IEEE International Symposium on Circuits and Systems (ISCAS2013), 2013
Proceedings of the IEEE 5th International Conference on Cloud Computing Technology and Science, 2013
Proceedings of the 2013 IEEE Sixth International Conference on Cloud Computing, Santa Clara, CA, USA, June 28, 2013
2012
Proceedings of the 13th ACM/IEEE International Conference on Grid Computing, 2012
2011
Data Flow Graph Partitioning Algorithms and Their Evaluations for Optimal Spatio-temporal Computation on a Coarse Grain Reconfigurable Architecture.
IPSJ Trans. Syst. LSI Des. Methodol., 2011
A Method for Flexible Reduction over Binary Fields using a Field Multiplier.
Proceedings of the SECRYPT 2011 - Proceedings of the International Conference on Security and Cryptography, Seville, Spain, 18, 2011
Proceedings of the IEEE 9th Symposium on Application Specific Processors, 2011
A Fully Pipelined Modular Multiple Precision Floating Point Multiplier with Vector Support.
Proceedings of the International Symposium on Electronic System Design, 2011
Proceedings of the E-Business and Telecommunications - International Joint Conference, 2011
Interconnect-topology independent mapping algorithm for a Coarse Grained Reconfigurable Architecture.
Proceedings of the 2011 International Conference on Field-Programmable Technology, 2011
Dataflow Graph Partitioning for Optimal Spatio-Temporal Computation on a Coarse Grain Reconfigurable Architecture.
Proceedings of the Reconfigurable Computing: Architectures, Tools and Applications, 2011
2010
Enhancements for variable N-point streaming FFT/IFFT on REDEFINE, a runtime reconfigurable architecture.
Proceedings of the 2010 International Conference on Embedded Computer Systems: Architectures, 2010
Design space exploration of systolic realization of QR factorization on a runtime reconfigurable platform.
Proceedings of the 2010 International Conference on Embedded Computer Systems: Architectures, 2010
Accelerating Numerical Linear Algebra Kernels on a Scalable Run Time Reconfigurable Platform.
Proceedings of the IEEE Computer Society Annual Symposium on VLSI, 2010
Towards minimizing execution delays on dynamically reconfigurable processors: a case study on REDEFINE.
Proceedings of the 2010 International Conference on Compilers, 2010
Proceedings of the 10th IEEE International Conference on Computer and Information Technology, 2010
2009
ACM Trans. Embed. Comput. Syst., 2009
Generic routing rules and a scalable access enhancement for the Network-on-Chip RECONNECT.
Proceedings of the Annual IEEE International SoC Conference, SoCC 2009, 2009
Proceedings of the 2009 International Conference on Embedded Computer Systems: Architectures, 2009
High-throughput flexible constraint length Viterbi decoders on de Bruijn, shuffle-exchange and butterfly connected architectures.
Proceedings of the 2009 International Conference on Embedded Computer Systems: Architectures, 2009
Proceedings of IEEE International Conference on Communications, 2009
Proceedings of the Workshops at the Grid and Pervasive Computing Conference, 2009
Proceedings of the 2009 International Conference on Compilers, 2009
Proceedings of the 20th IEEE International Conference on Application-Specific Systems, 2009
Proceedings of the Reconfigurable Computing: Architectures, 2009
2008
On the effectiveness of phase based regression models to trade power and performance using dynamic processor adaptation.
J. Syst. Archit., 2008
Realizing a flexible constraint length Viterbi decoder for software radio on a de Bruijn interconnection network.
Proceedings of the 2008 IEEE International Symposium on System-on-Chip, 2008
Proceedings of the 19th IEEE International Conference on Application-Specific Systems, 2008
Proceedings of the 19th IEEE International Conference on Application-Specific Systems, 2008
Proceedings of the 19th IEEE International Conference on Application-Specific Systems, 2008
Architecture of a polymorphic ASIC for interoperability across multi-mode H.264 decoders.
Proceedings of the 19th IEEE International Conference on Application-Specific Systems, 2008
2007
J. Low Power Electron., 2007
REDEFINE: Architecture of a SoC Fabric for Runtime Composition of Computation Structures.
Proceedings of the FPL 2007, 2007
Proceedings of the 12th Conference on Asia South Pacific Design Automation, 2007
2006
Instruction Reuse in SPEC, media and packet processing benchmarks: A comparative study of power, performance and related microarchitectural optimizations.
J. Embed. Comput., 2006
Molecular Caches: A caching structure for dynamic creation of application-specific Heterogeneous cache regions.
Proceedings of the 39th Annual IEEE/ACM International Symposium on Microarchitecture (MICRO-39 2006), 2006
Framework for Enabling Highly Available Distributed Applications for Utility Computing.
Proceedings of the Parallel and Distributed Processing and Applications, 2006
On the Implementation of a Streaming Video over Peer to Peer network using Middleware Components.
Proceedings of the Fifth International Conference on Networking and the International Conference on Systems (ICN / ICONS / MCL 2006), 2006
Proceedings of IEEE International Conference on Communications, 2006
Proceedings of the 2006 IEEE International Conference on Application-Specific Systems, 2006
Proceedings of the 2006 IEEE International Conference on Application-Specific Systems, 2006
A Framework for Measurement of End-To-End QoS Requirements in Loosely Coupled Systems.
Proceedings of the 20th International Conference on Advanced Information Networking and Applications (AINA 2006), 2006
Proceedings of the 2006 IEEE International Conference on Services Computing (SCC 2006), 2006
2005
Proceedings of the 3rd International Workshop on Middleware for Pervasive and Ad-hoc Computing (MPAC 2005), held at the ACM/IFIP/USENIX 6th International Middleware Conference, November 28, 2005
A low power and low cost scan test architecture for multi-clock domain SoCs using virtual divide and conquer.
Proceedings of the 2005 IEEE International Test Conference, 2005
Proceedings of the 16th International Workshop on Database and Expert Systems Applications (DEXA 2005), 2005
Throughput Driven, Highly Available Streaming Stored Playback Video Service over a Peer-to-Peer Network.
Proceedings of the 19th International Conference on Advanced Information Networking and Applications (AINA 2005), 2005
2004
On the Correctness of Program Execution When Cache Coherence Is Maintained Locally at Data-Sharing Boundaries in Distributed Shared Memory Multiprocessors.
Int. J. Parallel Program., 2004
On the effectiveness of prefetching and reuse in reducing L1 data cache traffic: a case study of Snort.
Proceedings of the 3rd Workshop on Memory Performance Issues, 2004
Proceedings of IEEE International Conference on Communications, 2004
An Architectural View of the Entities Required for Execution of Task in Pervasive Space.
Proceedings of the 10th IEEE International Workshop on Future Trends of Distributed Computing Systems (FTDCS 2004), 2004
A framework for resource discovery in pervasive computing for mobile aware task execution.
Proceedings of the First Conference on Computing Frontiers, 2004
Proceedings of the 2004 Conference on Asia South Pacific Design Automation: Electronic Design and Solution Fair 2004, 2004
Exploiting program execution phases to trade power and performance for media workload.
Proceedings of the 2004 Conference on Asia South Pacific Design Automation: Electronic Design and Solution Fair 2004, 2004
Proceedings of the 18th International Conference on Advanced Information Networking and Applications (AINA 2004), 2004
2003
On the Effectiveness of Flow Aggregation in Improving Instruction Reuse in Network Processing Applications.
Int. J. Parallel Program., 2003
Proceedings of the 2003 International Symposium on Low Power Electronics and Design, 2003
Traffic Profiling for Efficient Network Resource Utilization.
Proceedings of the International Conference on Internet Computing, 2003
Enhancing Speedup in Network Processing Applications by Exploiting Instruction Reuse with Flow Aggregation.
Proceedings of the 2003 Design, 2003
A complexity effective communication model for behavioral modeling of signal processing applications.
Proceedings of the 40th Design Automation Conference, 2003
Proceedings of the Advances in Computer Systems Architecture, 2003
Enhancing Speedup in Network Processing Applications by Exploiting Instruction Reuse with Flow Aggregation.
Proceedings of the Embedded Software for SoC, 2003
2002
Multithreaded Architectural Support for Speculative Trace Scheduling in VLIW Processors.
Proceedings of the 15th Annual Symposium on Integrated Circuits and Systems Design, 2002
On the Benefits of Speculative Trace Scheduling in VLIW Processors.
Proceedings of the International Conference on Parallel and Distributed Processing Techniques and Applications, 2002
Proceedings of the 20th International Conference on Computer Design (ICCD 2002), 2002
Enforcing Cache Coherence at Data Sharing Boundaries without Global Control: A Hardware-Software Approach (Research Note).
Proceedings of the Euro-Par 2002, 2002
2001
Proceedings of the 14th International Conference on VLSI Design (VLSI Design 2001), 2001
Proceedings of the 14th International Conference on VLSI Design (VLSI Design 2001), 2001
Proceedings of the 2001 International Symposium on Circuits and Systems, 2001
2000
J. VLSI Signal Process., 2000
Harmony - An Architecture for Providing Quality of Service in Mobile Computing Environments.
J. Interconnect. Networks, 2000
Proceedings of the IEEE International Symposium on Circuits and Systems, 2000
Performance evaluation of multithreaded architectures for media processing applications.
Proceedings of the IEEE International Symposium on Circuits and Systems, 2000
Proceedings of the 2000 IEEE International Conference on Multimedia and Expo, 2000
Proceedings of the 37th Conference on Design Automation, 2000
1999
J. VLSI Signal Process., 1999
Proceedings of the 12th International Conference on VLSI Design (VLSI Design 1999), 1999
Proceedings of the IEEE International Conference On Computer Design, 1999
Harmony - A Framework for Providing Quality of Service in Wireless Mobile Computing Environment.
Proceedings of the High Performance Computing, 1999
1998
Proceedings of the 11th International Conference on VLSI Design (VLSI Design 1998), 1998
1997
Modeling multi-threaded architectures in PAMELA for real-time high performance applications.
Proceedings of the Fourth International Conference on High-Performance Computing, 1997
Proceedings of the European Design and Test Conference, 1997
1996
Proceedings of the 3rd International Conference on High Performance Computing, 1996
1995
Design and realization of high-performance wave-pipelined 8×8 b multiplier in CMOS technology.
IEEE Trans. Very Large Scale Integr. Syst., 1995
Proceedings of the 8th International Conference on VLSI Design (VLSI Design 1995), 1995
1994
VLSI Design, 1994
VLSI Design, 1994
Proceedings of the Seventh International Conference on VLSI Design, 1994
Proceedings of the Seventh International Conference on VLSI Design, 1994
Proceedings of the Seventh International Conference on VLSI Design, 1994
TWTXBB: A Low Latency, High Throughput Multiplier Architecture Using a New 4 --> 2 Compressor.
Proceedings of the Seventh International Conference on VLSI Design, 1994
1993
NPCPL: Normal Process Complementary Pass Transistor Logic for Low Latency, High Throughput Designs.
Proceedings of the Sixth International Conference on VLSI Design, 1993
A Parallel Progressive Refinement Image Rendering Algorithm on a Scalable Multithreaded VLSI Processor Array.
Proceedings of the 1993 International Conference on Parallel Processing, 1993
Proceedings of the 1993 International Conference on Computer Design: VLSI in Computers & Processors, 1993
Architectural Synthesis of Performance-Driven Multipliers with Accumulator Interleaving.
Proceedings of the 30th Design Automation Conference. Dallas, 1993
1990
Microprocessing and Microprogramming, 1990
Microprocessing and Microprogramming, 1990
1989
Proceedings of the 26th ACM/IEEE Design Automation Conference, 1989
1986
Proceedings of the 23rd ACM/IEEE Design Automation Conference. Las Vegas, 1986