Graph Anonymization using Machine Learning
13 May 2014
(Draft version) This paper addresses the problem of privacy for graph data that must be externalized for analysis or other purposes. Data privacy is ensured by anonymizing the data before release, as required by legislation. A substantial body of research already exists in the graph anonymization field. Existing anonymization algorithms and tools are strictly dependent on the privacy attacks considered, on the utilities to be preserved in the data, and on the configuration parameters of the proposed methods. Current techniques do not consider the case in which the analysis to be performed on the data is externalized, very complex, and not accessible to the data owner or the anonymizer. The paper proposes a novel approach to graph anonymization based on machine learning and optimisation techniques. Graph data privacy protection is modelled as an optimisation problem, and estimation density algorithms are used to find the best compromise between privacy protection and utility loss. The best-suited methods and parameters are learned on one part of the data and tested on the remaining data. Results are encouraging: compared with state-of-the-art methods, the approach gives better results when dealing with a larger panel of privacy attacks.