| Computer Science and Information Systems | |
| Research on improved privacy publishing algorithm based on set cover | |
| article | |
| Lv Haoze1  Liu Zhaobin1  Hu Zhonglian1  Nie Lihai2  Liu Weijiang1  Ye Xinfeng3  | |
| [1] School of Information Science and Technology, Dalian Maritime University, China; [2] Division of Intelligence and Computing, Tianjin University, China; [3] Department of Computer Science, University of Auckland, New Zealand | |
| Keywords: Differential Privacy; Set Cover; Frequent Itemsets; Marginal Table | |
| DOI : 10.2298/CSIS180915023L | |
| Subject classification: Civil and Structural Engineering | |
| Source: Computer Science and Information Systems | |
【 Abstract 】
With the advent of the big data era, data release has become a hot topic in the database community. Meanwhile, data privacy is drawing increasing attention from users. Among the privacy protection models proposed so far, the differential privacy model is widely used because of its many advantages over other models. However, for the private release of multi-dimensional data sets, existing algorithms usually publish data with low availability. The reason is that the noise in the released data grows rapidly as the number of dimensions increases. To address this issue, we propose algorithms based on regular and irregular marginal tables of frequent itemsets to protect privacy and improve availability. The main idea is to reduce the dimension of the data set and to achieve differential privacy protection with Laplace noise. First, we propose a marginal table cover algorithm based on frequent items that considers the effectiveness of query cover combinations, and obtain a regular marginal table cover set of smaller size but higher data availability. Then, a differential privacy model with irregular marginal tables is proposed for the application scenario with low data availability and high cover rate. Next, through our analysis, we obtain an approximately optimal marginal table cover algorithm that yields a query cover set satisfying the multi-level query policy constraint. Thus, a balance between privacy protection and data availability is achieved. Finally, extensive experiments on synthetic and real databases demonstrate that the proposed method performs better than state-of-the-art methods in most cases.
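The core idea in the abstract (answer queries from a small set of low-dimensional marginal tables, each perturbed with Laplace noise) can be illustrated with a minimal sketch. This is not the authors' algorithm: it assumes a plain greedy set cover over attribute subsets, ignores frequent-itemset mining and the multi-level query policy, and splits the privacy budget evenly across tables. The function names, the toy records, and the `domains` mapping are all hypothetical.

```python
import itertools
import random
from collections import Counter


def greedy_marginal_cover(queries, attributes, max_size):
    """Greedy set cover (assumed stand-in for the paper's cover algorithm).

    A query, given as a set of attributes, counts as covered by a marginal
    table when its attributes are a subset of the table's attributes.
    """
    candidates = [frozenset(c)
                  for k in range(1, max_size + 1)
                  for c in itertools.combinations(attributes, k)]
    uncovered = {frozenset(q) for q in queries}
    chosen = []
    while uncovered:
        # Pick the candidate table that covers the most uncovered queries.
        best = max(candidates, key=lambda m: sum(q <= m for q in uncovered))
        covered = {q for q in uncovered if q <= best}
        if not covered:
            raise ValueError("a query uses more than max_size attributes")
        chosen.append(best)
        uncovered -= covered
    return chosen


def noisy_marginal(records, marginal, domains, epsilon):
    """Count records per cell of the marginal, then add Laplace noise.

    Adding or removing one record changes exactly one cell by 1, so the
    L1 sensitivity is 1 and the noise scale is 1 / epsilon. Every cell of
    the domain is perturbed, including empty ones.
    """
    attrs = sorted(marginal)
    counts = Counter(tuple(r[a] for a in attrs) for r in records)
    rate = epsilon  # rate of the exponential = 1 / scale
    table = {}
    for cell in itertools.product(*(domains[a] for a in attrs)):
        # The difference of two i.i.d. exponentials is Laplace(0, 1/rate).
        noise = random.expovariate(rate) - random.expovariate(rate)
        table[cell] = counts.get(cell, 0) + noise
    return table


# Hypothetical toy data set and query workload.
domains = {"age": ["young", "old"], "city": ["A", "B"], "sex": ["F", "M"]}
records = [
    {"age": "young", "city": "A", "sex": "F"},
    {"age": "old", "city": "A", "sex": "M"},
    {"age": "old", "city": "B", "sex": "F"},
]
queries = [{"age"}, {"age", "sex"}, {"city"}]

cover = greedy_marginal_cover(queries, sorted(domains), max_size=2)
# Sequential composition: split a total budget of 1.0 evenly over the tables.
epsilon_each = 1.0 / len(cover)
tables = {m: noisy_marginal(records, m, domains, epsilon_each) for m in cover}
```

Because each released table has far fewer cells than the full contingency table, the noise added per query stays small. The price is that the total privacy budget has to be shared among the tables in the cover, which is why a smaller cover set helps availability.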
【 License 】
CC BY-NC-ND
【 Preview 】
| Files | Size | Format | View |
|---|---|---|---|
| RO202303290006848ZK.pdf | 737KB | PDF | |