BDQM 2022

The 7th International Workshop on 

Big Data Quality Management (BDQM 2022)

9:00-13:00, April 11, 2022

Hyderabad, India (Online Conference)

Scope

Big data quality management is in demand to reduce the harm caused by data quality problems and to derive high-quality results from big data. It has become one of the hottest issues not only in the database community but also in artificial intelligence, data mining, and other related areas. The goal of this workshop is to raise awareness of quality issues in big data and to promote approaches for evaluating and improving big data quality.

BDQM 2022 will be one of the workshops of the 27th International Conference on Database Systems for Advanced Applications (DASFAA). DASFAA provides a leading international forum for discussing the latest research on database systems and advanced applications, and it will be held from 11 to 14 April 2022 in Hyderabad, India, as an online conference. The conference website is: https://www.dasfaa2022.org/.

Invited Talks

Invited Talk One

Big Data Management and Analytics via Direct Computing on Compression

Feng Zhang

Renmin University of China, fengzhang@ruc.edu.cn


9:00-10:10 am. *All timings are as per Indian Standard Time (IST) (UTC + 05:30)


Abstract:

Today’s rapidly growing volumes of data pose pressing challenges to modern data management and analytics, in both space usage and processing time. This research proposes a method to manage and analyze data directly in its compressed state. The main idea is to enable direct processing on compressed data. In detail, we use interpretable grammar rules to describe data, and then convert data management and analysis operations into grammar interpretation and modification. This talk discusses the challenges, insights, methods, and solutions of data management and analytics directly on compressed data.


Speaker Bio:

Feng Zhang is an Associate Professor at Renmin University of China. In 2017, he received his Ph.D. from the Department of Computer Science at Tsinghua University. In the same year, he joined the Key Laboratory of Data Engineering and Knowledge Engineering (MOE), Renmin University of China. His research interests include databases and high-performance computing, with a focus on high-performance direct processing of compressed data in big data environments. He has published more than 20 CCF-A papers in SIGMOD, USENIX ATC, VLDB, TPDS, and other conferences and journals. He received the ACM SIGHPC China Rising Star Award.

Invited Talk Two

Data Quality Management: Turn Waste into Wealth

Shaoxu Song

Tsinghua University, sxsong@tsinghua.edu.cn


10:10-11:20 am. *All timings are as per Indian Standard Time (IST) (UTC + 05:30)


Abstract:

Data quality is measured by accuracy, completeness, consistency, etc. While correct data often follow certain patterns, errors can occur in multifarious ways. Simply discarding the suspected errors would make the data even more incomplete. To turn waste into wealth, data integration and cleaning are conducted. Typical steps for improving data quality include (1) data matching for identifying hidden connections, (2) data profiling for discovering embedded patterns, (3) data validation for detecting suspected errors, and (4) data repairing for remedying detected errors. In this talk, I will introduce our recent studies on improving the quality of various kinds of data, such as relational data, graph data, and time series data.


Speaker Bio:

Shaoxu Song is an Associate Professor in the School of Software at Tsinghua University. He received his PhD in Computer Science from the Hong Kong University of Science and Technology. His research interests include data quality and data integration. He has published more than 40 research papers in top journals and conferences in the area, such as ACM TODS, VLDBJ, IEEE TKDE, ACM SIGMOD, KDD, VLDB, and IEEE ICDE. Dr. Song has served on the program committees of international conferences and workshops, including VLDB, ICDE, KDD, IJCAI, and CIKM.

Program Schedule

4 hours (from 9:00-13:00, half a day) *All timings are as per Indian Standard Time (IST) (UTC + 05:30)


9:00-10:10 Invited Talk 1

Title: Big Data Management and Analytics via Direct Computing on Compression

Speaker: Feng Zhang, Renmin University of China.


10:10-11:20 Invited Talk 2

Title: Data Quality Management: Turn Waste into Wealth

Speaker: Shaoxu Song, Tsinghua University, China.


11:20-12:50 Paper presentation

01 Evaluating Presto and SparkSQL with TPC-DS

Yinhao Hong, Sheng Du, Jianquan Leng

(Renmin University of China, Beijing Kingbase Information Technology Co., Ltd, Beijing, China)


02 Optimizing the Age of Sensed Information in Cyber-Physical Systems

Yinlong Li, Siyao Cheng, Feng Li, Jie Liu, Hanling Wu

(Harbin Institute of Technology, Harbin Institute of Technology (Shenzhen), Beijing Institute of Astronautical Systems Engineering, China)


03 Aggregate Query Result Correctness using Pattern Tables

Nitish Yadav, Ayushi Malhotra, Sakshee Patel, Minal Bhise

(Distributed Databases Group, DAIICT, Gandhinagar, India)


04 Time Series Data Quality Enhancing based on Pattern Alignment

Jianping Huang, Hao Chen, Hongkai Wang, Jun Feng, Liangying Peng, Zheng Liang, Hongzhi Wang, Tianlan Fan, Tianren Yu

(State Grid Zhejiang Electric Power Co., Ltd, State Grid Zhejiang Information and Telecommunication Branch, China, Harbin Institute of Technology)


05 Research on feature extraction method of data quality intelligent detection

Weiwei Liu, Shuya Lei, Xiaokun Zheng, Xiao Liang

(Artificial Intelligence on Electric Power System State Grid Corporation Joint Laboratory (GEIRI), China)

Organizers

Xiaoou Ding


Harbin Institute of Technology, China, dingxiaoou@hit.edu.cn

PhD, Associate Professor


Xueli Liu


Tianjin University, China, xueli@tju.edu.cn

PhD, Associate Professor