Shaoxu Song

Orcid: 0000-0002-9503-2755

Affiliations:
  • Tsinghua University, Beijing, China


According to our database, Shaoxu Song authored at least 103 papers between 2005 and 2024.

Bibliography

2024
Apache TsFile: An IoT-native Time Series File Format.
Proc. VLDB Endow., August, 2024

Win-Win: On Simultaneous Clustering and Imputing over Incomplete Data.
Proc. VLDB Endow., July, 2024

Distance-based Outlier Query Optimization in Apache IoTDB.
Proc. VLDB Endow., July, 2024

On Reducing Space Amplification with Multi-Column Compaction in Apache IoTDB.
Proc. VLDB Endow., July, 2024

Time series data encoding in Apache IoTDB: comparative analysis and recommendation.
VLDB J., May, 2024

From Minimum Change to Maximum Density: On Determining Near-Optimal S-Repair.
IEEE Trans. Knowl. Data Eng., February, 2024

Time Series Representation for Visualization in Apache IoTDB.
Proc. ACM Manag. Data, February, 2024

Determining Exact Quantiles with Randomized Summaries.
Proc. ACM Manag. Data, February, 2024

Streaming data cleaning based on speed change.
VLDB J., January, 2024

High Precision ≠ High Cost: Temporal Data Fusion for Multiple Low-Precision Sensors.
Proc. ACM Manag. Data, 2024

Optimizing Time Series Queries with Versions.
Proc. ACM Manag. Data, 2024

Multimodal Data Encoding and Compression in Apache IoTDB.
Int. J. Softw. Informatics, 2024

Time-tired compaction: An elastic compaction scheme for LSM-tree based time-series database.
Adv. Eng. Informatics, 2024

ACER: Accelerating Complex Event Recognition via Two-Phase Filtering under Range Bitmap-Based Indexes.
Proceedings of the 30th ACM SIGKDD Conference on Knowledge Discovery and Data Mining, 2024

REGER: Reordering Time Series Data for Regression Encoding.
Proceedings of the 40th IEEE International Conference on Data Engineering, 2024

On Tuning Raft for IoT Workload in Apache IoTDB.
Proceedings of the 40th IEEE International Conference on Data Engineering, 2024

2023
Efficiently Cleaning Structured Event Logs: A Graph Repair Approach.
ACM Trans. Database Syst., March, 2023

TsQuality: Measuring Time Series Data Quality in Apache IoTDB.
Proc. VLDB Endow., 2023

CORE-Sketch: On Exact Computation of Median Absolute Deviation with Limited Space.
Proc. VLDB Endow., 2023

Time Series Data Validity.
Proc. ACM Manag. Data, 2023

Grouping Time Series for Efficient Columnar Storage.
Proc. ACM Manag. Data, 2023

Apache IoTDB: A Time Series Database for IoT Applications.
Proc. ACM Manag. Data, 2023

Learning Autoregressive Model in LSM-Tree based Store.
Proceedings of the 29th ACM SIGKDD Conference on Knowledge Discovery and Data Mining, 2023

Backward-Sort for Time Series in Apache IoTDB.
Proceedings of the 39th IEEE International Conference on Data Engineering, 2023

Data Dependencies Extended for Variety and Veracity: A Family Tree (Extended abstract).
Proceedings of the 39th IEEE International Conference on Data Engineering, 2023

Discovering Editing Rules by Deep Reinforcement Learning.
Proceedings of the 39th IEEE International Conference on Data Engineering, 2023

Non-Blocking Raft for High Throughput IoT Data.
Proceedings of the 39th IEEE International Conference on Data Engineering, 2023

Matrix Factorization with Landmarks for Spatial Data.
Proceedings of the 39th IEEE International Conference on Data Engineering, 2023

2022
Data Dependencies Extended for Variety and Veracity: A Family Tree.
IEEE Trans. Knowl. Data Eng., 2022

Time Series Data Encoding for Efficient Storage: A Comparative Analysis in Apache IoTDB.
Proc. VLDB Endow., 2022

Frequency Domain Data Encoding in Apache IoTDB.
Proc. VLDB Endow., 2022

On Repairing Timestamps for Regular Interval Time Series.
Proc. VLDB Endow., 2022

Confidence Bounded Replica Currency Estimation.
Proceedings of the SIGMOD '22: International Conference on Management of Data, Philadelphia, PA, USA, June 12, 2022

On Aligning Tuples for Regression.
Proceedings of the KDD '22: The 28th ACM SIGKDD Conference on Knowledge Discovery and Data Mining, Washington, DC, USA, August 14, 2022

Conditional Regression Rules.
Proceedings of the 38th IEEE International Conference on Data Engineering, 2022

Separation or Not: On Handing Out-of-Order Time-Series Data in Leveled LSM-Tree.
Proceedings of the 38th IEEE International Conference on Data Engineering, 2022

2021
Cleaning timestamps with temporal constraints.
VLDB J., 2021

Stream Data Cleaning under Speed and Acceleration Constraints.
ACM Trans. Database Syst., 2021

Approximating Median Absolute Deviation with Bounded Error.
Proc. VLDB Endow., 2021

EXPERIENCE: Algorithms and Case Study for Explaining Repairs with Uniform Profiles over IoT Data.
ACM J. Data Inf. Qual., 2021

Time Series Data Cleaning under Multi-Speed Constraints.
Int. J. Softw. Informatics, 2021

Why Not Match: On Explanations of Event Pattern Queries.
Proceedings of the SIGMOD '21: International Conference on Management of Data, 2021

On Saving Outliers for Better Clustering over Noisy Data.
Proceedings of the SIGMOD '21: International Conference on Management of Data, 2021

FastSGG: Efficient Social Graph Generation Using a Degree Distribution Generation Model.
Proceedings of the 37th IEEE International Conference on Data Engineering, 2021

From Minimum Change to Maximum Density: On S-Repair under Integrity Constraints.
Proceedings of the 37th IEEE International Conference on Data Engineering, 2021

Capturing Semantics for Imputation with Pre-trained Language Models.
Proceedings of the 37th IEEE International Conference on Data Engineering, 2021

2020
Enriching Data Imputation under Similarity Rule Constraints.
IEEE Trans. Knowl. Data Eng., 2020

Effective and Efficient Retrieval of Structured Entities.
Proc. VLDB Endow., 2020

Editorial: Special Issue on Metadata Discovery for Assessing Data Quality.
ACM J. Data Inf. Qual., 2020

Learning Individual Models for Imputation (Technical Report).
CoRR, 2020

Time Series Data Cleaning: From Anomaly Detection to Anomaly Repairing (Technical Report).
CoRR, 2020

Imputing Various Incomplete Attributes via Distance Likelihood Maximization.
Proceedings of the KDD '20: The 26th ACM SIGKDD Conference on Knowledge Discovery and Data Mining, 2020

Representing Temporal Attributes for Schema Matching.
Proceedings of the KDD '20: The 26th ACM SIGKDD Conference on Knowledge Discovery and Data Mining, 2020

Swapping Repair for Misplaced Attribute Values.
Proceedings of the 36th IEEE International Conference on Data Engineering, 2020

IoT Data Quality.
Proceedings of the CIKM '20: The 29th ACM International Conference on Information and Knowledge Management, 2020

2019
A Public Domain Dataset for Human Activity Recognition in Free-Living Conditions.
Proceedings of the 2019 IEEE SmartWorld, 2019

Learning Individual Models for Imputation.
Proceedings of the 35th IEEE International Conference on Data Engineering, 2019

Fine-Grained Fuel Consumption Prediction.
Proceedings of the 28th ACM International Conference on Information and Knowledge Management, 2019

TsOutlier: Explaining Outliers with Uniform Profiles over IoT Data.
Proceedings of the 2019 IEEE International Conference on Big Data (IEEE BigData), 2019

2018
Matching Heterogeneous Event Data.
IEEE Trans. Knowl. Data Eng., 2018

2017
Graph repairing under neighborhood constraints.
VLDB J., 2017

Response to "Differential Dependencies Revisited".
ACM Trans. Database Syst., 2017

Matching Heterogeneous Events with Patterns.
IEEE Trans. Knowl. Data Eng., 2017

Discovering Conditional Matching Rules.
ACM Trans. Knowl. Discov. Data, 2017

Time Series Data Cleaning: From Anomaly Detection to Anomaly Repairing.
Proc. VLDB Endow., 2017

2016
Efficient Recovery of Missing Events.
IEEE Trans. Knowl. Data Eng., 2016

A Survey on Accessing Dataspaces.
SIGMOD Rec., 2016

Semantic SPARQL Similarity Search Over RDF Knowledge Graphs.
Proc. VLDB Endow., 2016

Cleaning Timestamps with Temporal Constraints.
Proc. VLDB Endow., 2016

Efficient Set-Correlation Operator Inside Databases.
J. Comput. Sci. Technol., 2016

Sequential Data Cleaning: A Statistical Approach.
Proceedings of the 2016 International Conference on Management of Data, 2016

Constraint-Variance Tolerant Data Repairing.
Proceedings of the 2016 International Conference on Management of Data, 2016

2015
Enriching Data Imputation with Extensive Similarity Neighbors.
Proc. VLDB Endow., 2015

Optimizing data partition for scaling out NoSQL cluster.
Concurr. Comput. Pract. Exp., 2015

How to Build Templates for RDF Question/Answering: An Uncertain Graph Similarity Join Approach.
Proceedings of the 2015 ACM SIGMOD International Conference on Management of Data, Melbourne, Victoria, Australia, May 31, 2015

SCREEN: Stream Data Cleaning under Speed Constraints.
Proceedings of the 2015 ACM SIGMOD International Conference on Management of Data, Melbourne, Victoria, Australia, May 31, 2015

Turn Waste into Wealth: On Simultaneous Clustering and Cleaning over Dirty Data.
Proceedings of the 21th ACM SIGKDD International Conference on Knowledge Discovery and Data Mining, 2015

Cleaning structured event logs: A graph repair approach.
Proceedings of the 31st IEEE International Conference on Data Engineering, 2015

2014
Efficient Determination of Distance Thresholds for Differential Dependencies.
IEEE Trans. Knowl. Data Eng., 2014

Repairing Vertex Labels under Neighborhood Constraints.
Proc. VLDB Endow., 2014

On Concise Set of Relative Candidate Keys.
Proc. VLDB Endow., 2014

Probabilistic correlation-based similarity measure on text records.
Inf. Sci., 2014

Matching heterogeneous event data.
Proceedings of the International Conference on Management of Data, 2014

Matching heterogeneous events with patterns.
Proceedings of the IEEE 30th International Conference on Data Engineering, Chicago, 2014

2013
Indexing dataspaces with partitions.
World Wide Web, 2013

Comparable dependencies over heterogeneous data.
VLDB J., 2013

Efficient Recovery of Missing Events.
Proc. VLDB Endow., 2013

Efficient discovery of similarity constraints for matching dependencies.
Data Knowl. Eng., 2013

Context-aware reasoning middle ware applied in the mobile environment.
Proceedings of the International Conference on Machine Learning and Cybernetics, 2013

2012
Parameter-Free Determination of Distance Thresholds for Metric Distance Constraints.
Proceedings of the IEEE 28th International Conference on Data Engineering (ICDE 2012), 2012

2011
Differential dependencies: Reasoning and discovery.
ACM Trans. Database Syst., 2011

Materialization and Decomposition of Dataspaces for Efficient Search.
IEEE Trans. Knowl. Data Eng., 2011

Answering Frequent Probabilistic Inference Queries in Databases.
IEEE Trans. Knowl. Data Eng., 2011

On data dependencies in dataspaces.
Proceedings of the 27th International Conference on Data Engineering, 2011

2010
Consistent query answers in inconsistent probabilistic databases.
Proceedings of the ACM SIGMOD International Conference on Management of Data, 2010

Efficient set-correlation operator inside databases.
Proceedings of the 19th ACM Conference on Information and Knowledge Management, 2010

2009
Discovering matching dependencies.
Proceedings of the 18th ACM Conference on Information and Knowledge Management, 2009

2007
Similarity Joins of Text with Incomplete Information Formats.
Proceedings of the Advances in Databases: Concepts, 2007

Probabilistic correlation-based similarity measure of unstructured records.
Proceedings of the Sixteenth ACM Conference on Information and Knowledge Management, 2007

2006
Improved ROCK for Text Clustering Using Asymmetric Proximity.
Proceedings of the SOFSEM 2006: Theory and Practice of Computer Science, 2006

2005
TCUAP: A Novel Approach of Text Clustering Using Asymmetric Proximity.
Proceedings of the 2nd Indian International Conference on Artificial Intelligence, 2005

Concept Chain Based Text Clustering.
Proceedings of the Computational Intelligence and Security, International Conference, 2005

Semantic Correlation Network Based Text Clustering.
Proceedings of the AI 2005: Advances in Artificial Intelligence, 2005

