Most existing statistical disclosure control (SDC) standards, such as k-anonymity or l-diversity, were originally designed for static data. They therefore cannot be applied directly to stream data, which is continuous, transient, and usually unbounded. Moreover, streaming applications require strong guarantees on the maximum allowed delay between incoming data and the release of its corresponding anonymized output. To fulfill these requirements, in this paper we present a set of modifications to the most standard SDC methods, efficiently implemented within the Massive Online Analysis (MOA) stream mining framework. We have also developed a set of performance metrics to evaluate information loss and disclosure risk continuously. Finally, we demonstrate the efficiency of our new methods through a large set of experiments.
- MOA Framework
- Statistical disclosure control
- Stream mining
- Stream processing