Jilong Xue

Orcid: 0000-0002-4495-1997

According to our database, Jilong Xue authored at least 36 papers between 2011 and 2024.

Bibliography

2024
FPRev: Revealing the Order of Floating-Point Summation by Numerical Testing.
CoRR, 2024

GRIN: GRadient-INformed MoE.
CoRR, 2024

LUT Tensor Core: Lookup Table Enables Efficient Low-Bit LLM Inference Acceleration.
CoRR, 2024

Scaling Deep Learning Computation over the Inter-Core Connected Intelligence Processor.
CoRR, 2024

The Era of 1-bit LLMs: All Large Language Models are in 1.58 Bits.
CoRR, 2024

Scaling Deep Learning Computation over the Inter-Core Connected Intelligence Processor with T10.
Proceedings of the ACM SIGOPS 30th Symposium on Operating Systems Principles, 2024

Ladder: Enabling Efficient Low-Precision Deep Learning Computing through Hardware-aware Tensor Transformation.
Proceedings of the 18th USENIX Symposium on Operating Systems Design and Implementation, 2024

2023
FlexMoE: Scaling Large-scale Sparse Pre-trained Model Training via Dynamic Device Placement.
Proc. ACM Manag. Data, 2023

Retentive Network: A Successor to Transformer for Large Language Models.
CoRR, 2023

Cocktailer: Analyzing and Optimizing Dynamic Control Flow in Deep Learning.
Proceedings of the 17th USENIX Symposium on Operating Systems Design and Implementation, 2023

Optimizing Dynamic Neural Networks with Brainstorm.
Proceedings of the 17th USENIX Symposium on Operating Systems Design and Implementation, 2023

Welder: Scheduling Deep Learning Memory Access via Tile-graph.
Proceedings of the 17th USENIX Symposium on Operating Systems Design and Implementation, 2023

Efficient GPU Kernels for N:M-Sparse Weights in Deep Learning.
Proceedings of the Sixth Conference on Machine Learning and Systems, 2023

2022
ROLLER: Fast and Efficient Tensor Compilation for Deep Learning.
Proceedings of the 16th USENIX Symposium on Operating Systems Design and Implementation, 2022

2021
Dense-to-Sparse Gate for Mixture-of-Experts.
CoRR, 2021

2020
Distributed Graph Computation Meets Machine Learning.
IEEE Trans. Parallel Distributed Syst., 2020

Rammer: Enabling Holistic Deep Learning Compiler Optimizations with rTasks.
Proceedings of the 14th USENIX Symposium on Operating Systems Design and Implementation, 2020

2019
NeuGraph: Parallel Deep Neural Network Computation on Large Graphs.
Proceedings of the 2019 USENIX Annual Technical Conference, 2019

Fast Distributed Deep Learning over RDMA.
Proceedings of the Fourteenth EuroSys Conference 2019, Dresden, Germany, March 25-28, 2019

2018
Towards Efficient Large-Scale Graph Neural Network Computing.
CoRR, 2018

RPC Considered Harmful: Fast Distributed Deep Learning on RDMA.
CoRR, 2018

2017
Processing Concurrent Graph Analytics with Decoupled Computation Model.
IEEE Trans. Computers, 2017

Garaph: Efficient GPU-accelerated Graph Processing on a Single Machine with Balanced Replication.
Proceedings of the 2017 USENIX Annual Technical Conference, 2017

Tux²: Distributed Graph Computation for Machine Learning.
Proceedings of the 14th USENIX Symposium on Networked Systems Design and Implementation, 2017

2016
VoteTrust: Leveraging Friend Invitation Graph to Defend against Social Network Sybils.
IEEE Trans. Dependable Secur. Comput., 2016

Efficient Distributed Machine Learning with Trigger Driven Parallel Training.
Proceedings of the 2016 IEEE Global Communications Conference, 2016

2015
Understanding the performance of offline download in real p2p networks.
Peer-to-Peer Netw. Appl., 2015

Uncovering User Interaction Dynamics in Online Social Networks.
Proceedings of the Ninth International Conference on Web and Social Media, 2015

Process-driven Analysis of Dynamics in Online Social Interactions.
Proceedings of the 2015 ACM on Conference on Online Social Networks, 2015

GraM: scaling graph computation to the trillions.
Proceedings of the Sixth ACM Symposium on Cloud Computing, 2015

When computing meets heterogeneous cluster: Workload assignment in graph computation.
Proceedings of the 2015 IEEE International Conference on Big Data (IEEE BigData 2015), Santa Clara, CA, USA, October 29, 2015

2014
A Topology Construct and Control Model with Small-World and Scale-Free Concepts for Heterogeneous Sensor Networks.
Int. J. Distributed Sens. Networks, 2014

Seraph: an efficient, low-cost system for concurrent graph processing.
Proceedings of the 23rd International Symposium on High-Performance Parallel and Distributed Computing, 2014

2013
Unfolding dynamics in a social network: co-evolution of link formation and user interaction.
Proceedings of the 22nd International World Wide Web Conference, 2013

VoteTrust: Leveraging friend invitation graph to defend against social network Sybils.
Proceedings of the IEEE INFOCOM 2013, Turin, Italy, April 14-19, 2013

2011
On the QoS of Offline Download in Retrieving Peer-Side File Resource.
Proceedings of the International Conference on Parallel Processing, 2011

