Collective Data-Sanitization for Preventing Sensitive Information Inference Attacks in Social Networks

Abstract

Releasing social network data can seriously breach user privacy: user profiles and friendship relations are inherently private, yet sensitive information may still be inferred from released data through data-mining techniques. Sanitizing network data prior to release is therefore necessary. In this paper, we explore how an inference attack can be launched against a released social network containing a mixture of non-sensitive attributes and social relationships. We map this issue to a collective classification problem and propose a collective inference model in which an attacker exploits user profiles and social relationships in a collective manner to predict the sensitive information of related victims in the released dataset. To protect against such attacks, we propose a data-sanitization method that collectively manipulates user profiles and friendship relations. Beyond sanitizing friendship relations, the proposed method can take advantage of various data-manipulation methods.
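To make the attack model concrete, the sketch below shows one common form of collective classification, iterative classification: a local classifier is trained on each user's public attributes together with an aggregate of the current label estimates of that user's neighbors, and predictions for unlabeled users are refined over several rounds. This is a minimal illustration under assumed inputs, not the paper's implementation; names such as `collective_inference`, `attrs`, and `known` are introduced here for illustration only.

```python
# Minimal sketch of a collective (iterative) classification attack.
# Assumptions: `graph` is a networkx.Graph of users, `attrs[v]` is the public
# attribute vector of user v, `known` maps some users to their leaked sensitive
# label (integers 0..n_classes-1).
import numpy as np
import networkx as nx
from sklearn.linear_model import LogisticRegression

def neighbor_label_counts(graph, node, labels, n_classes):
    """Histogram of the current label estimates of a node's neighbors."""
    counts = np.zeros(n_classes)
    for nbr in graph.neighbors(node):
        if labels.get(nbr) is not None:
            counts[labels[nbr]] += 1
    return counts

def collective_inference(graph, attrs, known, n_classes, n_iters=10):
    """Iteratively predict sensitive labels from attributes + relational features."""
    labels = dict(known)                 # seed with the leaked labels
    clf = LogisticRegression(max_iter=1000)

    for _ in range(n_iters):
        # Features = [public attributes | neighbor label histogram]
        X_train = [np.concatenate([attrs[v],
                                   neighbor_label_counts(graph, v, labels, n_classes)])
                   for v in known]
        y_train = [known[v] for v in known]
        clf.fit(X_train, y_train)

        # Re-estimate the sensitive label of every user whose label is unknown
        for v in graph.nodes():
            if v not in known:
                x = np.concatenate([attrs[v],
                                    neighbor_label_counts(graph, v, labels, n_classes)])
                labels[v] = int(clf.predict([x])[0])
    return labels
```

In such an attack, the adversary would seed `known` with labels scraped from public profiles and iterate until the predictions stabilize; coupling attribute features with neighbor-label features is what makes the inference "collective".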

Conclusion

We address two issues in this paper: (a) how exactly a third party can launch an inference attack to predict the sensitive information of users, and (b) whether there are effective strategies to protect against such an attack while achieving a desired privacy-utility tradeoff. For the first issue, we show that collectively utilizing both attribute and link information significantly increases prediction accuracy for sensitive information. For the second issue, we explore the dependence relationships between utility and public attributes, and between privacy and public attributes.
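As a hedged illustration of the sanitization side (not the paper's exact algorithm), the sketch below suppresses the public attributes most predictive of the sensitive label, scored here by mutual information, and removes links between users who share the same sensitive label, subject to simple manipulation budgets. Names such as `sanitize`, `attr_budget`, and `link_budget` are assumptions introduced for illustration.

```python
# Illustrative sanitization sketch: manipulate both attributes and links.
# Assumptions: `attrs` is an (n_users, n_attrs) 0/1 matrix, `sensitive` is the
# length-n vector of sensitive labels, and `graph` is a networkx.Graph whose
# nodes are integer user indices into those arrays.
import numpy as np
from sklearn.feature_selection import mutual_info_classif

def sanitize(attrs, sensitive, graph, attr_budget, link_budget):
    X = np.asarray(attrs, dtype=float)
    y = np.asarray(sensitive)

    # 1) Suppress the public attributes most correlated with the sensitive label.
    mi = mutual_info_classif(X, y, discrete_features=True, random_state=0)
    drop = np.argsort(mi)[::-1][:attr_budget]
    X[:, drop] = 0

    # 2) Remove links connecting users with the same sensitive label, since such
    #    links leak the most through relational inference.
    removed = 0
    for u, v in list(graph.edges()):
        if removed >= link_budget:
            break
        if y[u] == y[v]:
            graph.remove_edge(u, v)
            removed += 1
    return X, graph
```

The budgets stand in for the utility constraint: the fewer attributes and links that are manipulated, the more utility the released data retains, at the cost of weaker protection against collective inference.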