The challenge of anonymizing the large amounts of information collected in big data scenarios is much debated. The task is even harder when inferences drawn from the data can serve as additional adversary knowledge. This is the case for geo-located data, where Points of Interest (POIs) may carry additional information that can be used to link them to a user's real identity. In most cases, however, publishing a model of the raw data protects the privacy of the data subjects to some extent, because it minimizes the published information. In this paper, we measure the privacy obtained by minimizing the published POIs when applying the Mobility Markov Chain (MMC) model, which extracts the most important POIs of an individual. We consider the gender inferences that an adversary may draw from the published MMC model combined with additional information, such as the gender or age distribution of each POI or the aggregated gender distribution of all the POIs visited by a data subject. We measure the unicity obtained after applying the MMC model, and the probability that an adversary who knows some POIs in the data before processing can link them to the POIs published after the MMC model is applied. Finally, we measure the anonymity lost when the gender attribute is added to the side knowledge of an adversary who has access to the MMC model. We test our algorithms on a real transaction database.
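As a rough illustration of the unicity measure mentioned in the abstract, the sketch below computes, for a toy dataset, how often a set of p known POIs matches exactly one data subject. It is a minimal sketch under stated assumptions, not the paper's algorithm: the function name `unicity`, the `traces` dictionary, and the POI labels are all hypothetical, and unicity is taken here as the average, over users, of the fraction of their p-POI subsets that no other user's POI set contains.

```python
from itertools import combinations


def unicity(traces, p):
    """Average fraction of p-POI subsets that match exactly one user.

    traces: dict mapping a user id to the set of POIs published for that user
            (hypothetical structure, e.g. the POIs kept by a mobility model).
    p:      number of POIs the adversary is assumed to know.
    """
    scores = []
    for user, pois in traces.items():
        subsets = list(combinations(sorted(pois), p))
        if not subsets:
            continue  # this user has fewer than p POIs, so skip them
        # A subset is identifying if no other user's POI set contains it.
        unique = sum(
            1
            for subset in subsets
            if not any(
                set(subset) <= other
                for other_user, other in traces.items()
                if other_user != user
            )
        )
        scores.append(unique / len(subsets))
    return sum(scores) / len(scores) if scores else 0.0


# Toy data, invented for illustration only.
traces = {
    "u1": {"home_A", "work_X", "gym_G"},
    "u2": {"home_B", "work_X", "gym_G"},
    "u3": {"home_C", "work_Y"},
}

for p in (1, 2, 3):
    print(p, unicity(traces, p))
```

On this toy data the score grows as the adversary knows more POIs, which is the intuition behind publishing fewer POIs per data subject.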
|Title of host publication||Information Management and Big Data - 6th International Conference, SIMBig 2019, Proceedings|
|Editors||Juan Antonio Lossio-Ventura, Nelly Condori-Fernandez, Jorge Carlos Valverde-Rebaza|
|Number of pages||14|
|State||Published - 1 Jan 2020|
|Event||Communications in Computer and Information Science, Duration: 1 Jan 2020 → …|
|Name||Communications in Computer and Information Science|
|Conference||Communications in Computer and Information Science|
|Period||1/01/20 → …|
Bibliographical note
Publisher Copyright: © Springer Nature Switzerland AG 2020.
- Data protection regulation
- Gender inference
- Geo-located data privacy
- Mobility Markov Chain