David Gregg
ORCID: 0000-0003-3782-4612
Affiliations:
- Trinity College Dublin, Ireland
According to our database, David Gregg authored at least 126 papers between 2000 and 2025.
Bibliography
2025
IEEE Trans. Circuits Syst. II Express Briefs, January, 2025
2024
IEEE Embed. Syst. Lett., December, 2024
E²CSM: efficient FPGA implementation of elliptic curve scalar multiplication over generic prime field GF(p).
J. Supercomput., January, 2024
Nearest-neighbor, BERT-based, scalable clone detection: A practical approach for large-scale industrial code bases.
Softw. Pract. Exp., 2024
GMC-crypto: Low latency implementation of ECC point multiplication for generic Montgomery curves over GF(p).
J. Parallel Distributed Comput., 2024
2023
ACM Trans. Embed. Comput. Syst., November, 2023
CoRR, 2023
EC-Crypto: Highly Efficient Area-Delay Optimized Elliptic Curve Cryptography Processor.
IEEE Access, 2023
Proceedings of the 31st Euromicro International Conference on Parallel, 2023
Proceedings of the 17th IEEE International Workshop on Software Clones, 2023
2022
ACM Trans. Embed. Comput. Syst., November, 2022
Guest Editorial: Introduction to the Special Section on Communication-Efficient Distributed Machine Learning.
IEEE Trans. Netw. Sci. Eng., 2022
High-speed parallel reconfigurable Fp multipliers for elliptic curve cryptography applications.
Int. J. Circuit Theory Appl., 2022
Proceedings of the IEEE International Conference on Software Maintenance and Evolution, 2022
Proceedings of the SSA-based Compiler Design, 2022
2021
ACM Trans. Archit. Code Optim., 2021
Proceedings of the IEEE Nordic Circuits and Systems Conference, NorCAS 2021, Oslo, 2021
Domino Saliency Metrics: Improving Existing Channel Saliency Metrics with Structural Information.
Proceedings of the AIxIA 2021 - Advances in Artificial Intelligence, 2021
2020
Error Analysis and Improving the Accuracy of Winograd Convolution for Deep Neural Networks.
ACM Trans. Math. Softw., 2020
Bonseyes AI Pipeline - Bringing AI to You: End-to-end integration of data, algorithms, and deployment tools.
ACM Trans. Internet Things, 2020
HOBFLOPS CNNs: Hardware Optimized Bitsliced Floating-Point Operations Convolutional Neural Networks.
CoRR, 2020
Proceedings of the 2020 IEEE Symposium Series on Computational Intelligence, 2020
Proceedings of the 32nd IEEE International Symposium on Computer Architecture and High Performance Computing, 2020
Proceedings of the 32nd IEEE International Symposium on Computer Architecture and High Performance Computing, 2020
Proceedings of the 21st ACM SIGPLAN/SIGBED International Conference on Languages, 2020
2019
Proceedings of the 17th International Conference on High Performance Computing & Simulation, 2019
Proceedings of the 26th IEEE Symposium on Computer Arithmetic, 2019
Proceedings of the AI*IA 2019 - Advances in Artificial Intelligence, 2019
POSTER: Space and Time Optimal DNN Primitive Selection with Integer Linear Programming.
Proceedings of the 28th International Conference on Parallel Architectures and Compilation Techniques, 2019
2018
Low Complexity Multiply-Accumulate Units for Convolutional Neural Networks with Weight-Sharing.
ACM Trans. Archit. Code Optim., 2018
CoRR, 2018
Proceedings of the 2018 International Symposium on Code Generation and Optimization, 2018
2017
IEEE Trans. Computers, 2017
Mutual Inclusivity of the Critical Path and its Partial Schedule on Heterogeneous Systems.
CoRR, 2017
Low Complexity Multiply Accumulate Unit for Weight-Sharing Convolutional Neural Networks.
IEEE Comput. Archit. Lett., 2017
Bitslice Vectors: A Software Approach to Customizable Data Precision on Processors with SIMD Extensions.
Proceedings of the 46th International Conference on Parallel Processing, 2017
Proceedings of the 28th IEEE International Conference on Application-specific Systems, 2017
2016
Parallel Performance Problems on Shared-Memory Multicore Systems: Taxonomy and Observation.
IEEE Trans. Software Eng., 2016
ACM Trans. Archit. Code Optim., 2016
CoRR, 2016
Proceedings of the 2016 International Conference on Parallel Architectures and Compilation, 2016
2015
Heuristics on Reachability Trees for Bicriteria Scheduling of Stream Graphs on Heterogeneous Multiprocessor Architectures.
ACM Trans. Embed. Comput. Syst., 2015
IEEE Micro, 2015
Exploiting Hyper-Loop Parallelism in Vectorization to Improve Memory Performance on CUDA GPGPU.
Proceedings of the 2015 IEEE TrustCom/BigDataSE/ISPA, 2015
An Efficient Vectorization Approach to Nested Thread-level Parallelism for CUDA GPUs.
Proceedings of the 2015 International Conference on Parallel Architectures and Compilation, 2015
2014
Proceedings of the Network and Parallel Computing, 2014
Proceedings of the Languages and Compilers for Parallel Computing, 2014
An improved simulated annealing heuristic for static partitioning of task graphs onto heterogeneous architectures.
Proceedings of the 20th IEEE International Conference on Parallel and Distributed Systems, 2014
Proceedings of the CHI Conference on Human Factors in Computing Systems, 2014
2013
ACM Trans. Archit. Code Optim., 2013
ACM Trans. Archit. Code Optim., 2013
Int. J. Parallel Program., 2013
Proceedings of the 12th IEEE International Conference on Trust, 2013
Proceedings of the 12th IEEE International Conference on Trust, 2013
2012
Compiler techniques to improve dynamic branch prediction for indirect jump and call instructions.
ACM Trans. Archit. Code Optim., 2012
A practical solution for achieving language compatibility in scripting language compilers.
Sci. Comput. Program., 2012
Proceedings of the 16th IEEE/ACM International Symposium on Distributed Simulation and Real Time Applications, 2012
2011
Optimizing interpreters by tuning opcode orderings on virtual machines for modern architectures: or: how I learned to stop worrying and love hill climbing.
Proceedings of the 9th International Conference on Principles and Practice of Programming in Java, 2011
2010
GSFAP adaptive filtering using log arithmetic for resource-constrained embedded systems.
ACM Trans. Embed. Comput. Syst., 2010
ACM J. Exp. Algorithmics, 2010
An output sensitive algorithm for computing a maximum independent set of a circle graph.
Inf. Process. Lett., 2010
Proceedings of the Progress in Cryptology - INDOCRYPT 2010, 2010
Proceedings of the CGO 2010, 2010
Proceedings of the 21st IEEE International Conference on Application-specific Systems Architectures and Processors, 2010
2009
Proceedings of the 2009 ACM Symposium on Applied Computing (SAC), 2009
Proceedings of the Languages and Compilers for Parallel Computing, 2009
Proceedings of the Languages and Compilers for Parallel Computing, 2009
Using the Meeting Graph Framework to Minimise Kernel Loop Unrolling for Scheduled Loops.
Proceedings of the Languages and Compilers for Parallel Computing, 2009
Proceedings of the Computational Science, 2009
2008
A stochastic bitwidth estimation technique for compact and low-power custom processors.
ACM Trans. Embed. Comput. Syst., 2008
ACM Trans. Archit. Code Optim., 2008
ACM J. Exp. Algorithmics, 2008
ACM J. Exp. Algorithmics, 2008
Optimization strategies for a java virtual machine interpreter on the cell broadband engine.
Proceedings of the 5th Conference on Computing Frontiers, 2008
2007
ACM Trans. Program. Lang. Syst., 2007
Proceedings of the FPL 2007, 2007
2006
Analyzing Effects of Trace Cache Configurations on the Prediction of Indirect Branches.
J. Instr. Level Parallelism, 2006
Concurr. Comput. Pract. Exp., 2006
Proceedings of the IEEE Workshop on Signal Processing Systems, 2006
Proceedings of the ACM SIGPLAN 2006 Conference on Programming Language Design and Implementation, 2006
Low-Cost Microarchitectural Techniques for Enhancing the Prediction of Return Addresses on High-Performance Trace Cache Processors.
Proceedings of the Computer and Information Sciences, 2006
High Performance Scientific Computing Using FPGAs with IEEE Floating Point and Logarithmic Arithmetic for Lattice QCD.
Proceedings of the 2006 International Conference on Field Programmable Logic and Applications (FPL), 2006
GSFAP adaptive filtering using log arithmetic for resource-constrained embedded systems.
Proceedings of the ACM/SIGDA 14th International Symposium on Field Programmable Gate Arrays, 2006
Efficient Floating-Point Implementation of High-Order (N)LMS Adaptive Filters in FPGA.
Proceedings of the Reconfigurable Computing: Architectures and Applications, 2006
2005
Des. Autom. Embed. Syst., 2005
Concurr. Pract. Exp., 2005
Proceedings of the 1st International Conference on Virtual Execution Environments, 2005
Proceedings of the 35th IEEE International Symposium on Multiple-Valued Logic (ISMVL 2005), 2005
FPGA Implementation of a Lattice Quantum Chromodynamics Algorithm Using Logarithmic Arithmetic.
Proceedings of the 19th International Parallel and Distributed Processing Symposium (IPDPS 2005), 2005
Proceedings of the Digital Games Research Conference 2005, 2005
Proceedings of the Compiler Construction, 14th International Conference, 2005
2004
Proceedings of the 2004 Workshop on Interpreters, Virtual Machines and Emulators, 2004
Fine-Tuning Loop-Level Parallelism for Increasing Performance of DSP Applications on FPGAs.
Proceedings of the 12th IEEE Symposium on Field-Programmable Custom Computing Machines (FCCM 2004), 2004
Automatic Customization of Embedded Applications for Enhanced Performance and Reduced Power Using Optimizing Compiler Techniques.
Proceedings of the Euro-Par 2004 Parallel Processing, 2004
Stochastic Bit-Width Approximation Using Extreme Value Theory for Customizable Processors.
Proceedings of the Compiler Construction, 13th International Conference, 2004
Proceedings of the 13th International Conference on Parallel Architectures and Compilation Techniques (PACT 2004), 29 September, 2004
2003
J. Instr. Level Parallelism, 2003
Platform independent dynamic Java virtual machine analysis: the Java Grande Forum benchmark suite.
Concurr. Comput. Pract. Exp., 2003
Proceedings of the Software and Compilers for Embedded Systems, 7th International Workshop, 2003
Proceedings of the 2003 ACM Symposium on Applied Computing (SAC), 2003
Proceedings of the ACM SIGPLAN 2003 Conference on Programming Language Design and Implementation 2003, 2003
Proceedings of the 2003 Workshop on Interpreters, Virtual Machines and Emulators, 2003
Proceedings of the Domain-Specific Program Generation, International Seminar, 2003
2002
Softw. Pract. Exp., 2002
Measuring the impact of object-oriented techniques in grande applications: a method-level analysis.
Proceedings of the 2002 Joint ACM-ISCOPE Conference on Java Grande 2002, 2002
Proceedings of the Compiler Construction, 11th International Conference, 2002
2001
Proceedings of the High-Performance Computing and Networking, 9th International Conference, 2001
Proceedings of the High-Performance Computing and Networking, 9th International Conference, 2001
Proceedings of the Euro-Par 2001: Parallel Processing, 2001
Comparing Tail Duplication with Compensation Code in Single Path Global Instruction Scheduling.
Proceedings of the Compiler Construction, 10th International Conference, 2001
2000
Proceedings of the Compiler Construction, 9th International Conference, 2000