Online Discourse and Information Security

Online Discourse, Censorship and Information Security

Information leakage is a complex problem. It occurs when a member of a social network posts seemingly inconspicuous information which, when correlated or cross-referenced with other such information, exposes information that was intended to remain secret. This presents a two-fold challenge. First, since the specific content released does not itself contain confidential information, it is difficult for content-based detection techniques to identify. Second, since the individual releasing the information would not inherently be viewed as a threat, the topology of their social network sub-map may not behave in ways that mimic the classic information diffusion patterns of a leak.

UIL Behavior: We study the interplay between online news articles, reader comments, and social networks to detect and characterize a new form of unintentional information leakage (UIL): the accidental disclosure of confidential information not intended for public release. Organizations including the military and the courts use censorship to enhance security, with non-identification of individuals seen as necessary protection for military personnel, witnesses, minors, victims, or suspects requiring anonymity. We are investigating both qualitative methods for recognizing and characterizing UIL and quantitative methods that automatically detect UIL comments. Our work extends across multiple social media and networking platforms, including Facebook and X/Twitter.
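
As a rough illustration of what automatic screening of comments could look like, the sketch below scores each comment by the TF-IDF weight of the terms it contains that do not appear in the article itself. This heuristic is an assumption made for illustration only, not the detection method used in this research, and the names `tokenize` and `novelty_scores` are hypothetical.

```python
import math
import re
from collections import Counter


def tokenize(text):
    """Lowercase the text and split it into simple word tokens."""
    return re.findall(r"[a-z0-9']+", text.lower())


def novelty_scores(article, comments):
    """Score each comment by the TF-IDF weight of terms absent from the article.

    A high score means the comment introduces terms that are both new
    relative to the article and rare among the other comments.
    """
    article_terms = set(tokenize(article))
    docs = [Counter(tokenize(comment)) for comment in comments]
    n_docs = len(docs)
    # Document frequency: number of comments in which each term appears.
    doc_freq = Counter(term for doc in docs for term in doc)

    scores = []
    for doc in docs:
        total = sum(doc.values()) or 1
        score = 0.0
        for term, count in doc.items():
            if term in article_terms:
                continue  # Terms already published in the article carry no new information.
            idf = math.log((1 + n_docs) / (1 + doc_freq[term])) + 1.0
            score += (count / total) * idf
        scores.append(score)
    return scores
```

A comment that only repeats words from the article scores zero, while a comment adding new terms scores higher. A high score marks a comment for human review and says nothing on its own about whether it leaks anything.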

Network Topology Anomalies: We examine the topological impact of leaked information on social networks. That is, we examine whether such information is likely to increase the number of new edges in the network (followers) and/or increase the use of already existing edges (data sharing). The current state of the art addresses this research problem through content analysis. We propose a two-phase approach instead. In the first phase, we model the baseline topology dynamics in terms of followers (increase, decrease) and information diffusion. We further develop a data-driven simulation that generates multiple data scenarios, which we use as alternative realizations of the data that follow the expected baseline network dynamics. In the second phase, we analyze anomalous behavior in the data. In other words, we ask (1) what constitutes anomalous topology (a theoretical definition), and (2) whether leaked information causes anomalous behavior in the network (a data-driven question).
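
The two-phase idea can be sketched in a few lines. This is a minimal illustration, not the project's model. It assumes the baseline is a series of daily follower-count changes, uses bootstrap resampling as a stand-in for the data-driven simulation, and treats any day whose change exceeds a high quantile of the simulated peaks as anomalous. All function names and parameter values are hypothetical.

```python
import random


def simulate_baseline(history, n_scenarios=500, seed=0):
    """Phase 1: generate alternative realizations of the baseline.

    Each scenario resamples the observed daily follower changes with
    replacement (a simple bootstrap, assumed here for illustration).
    """
    rng = random.Random(seed)
    return [[rng.choice(history) for _ in history] for _ in range(n_scenarios)]


def anomaly_threshold(scenarios, quantile=0.99):
    """Largest daily change expected under the baseline, at the given quantile."""
    peaks = sorted(max(scenario) for scenario in scenarios)
    index = min(len(peaks) - 1, int(quantile * len(peaks)))
    return peaks[index]


def flag_anomalies(observed, threshold):
    """Phase 2: return the indices of days whose change exceeds the threshold."""
    return [day for day, change in enumerate(observed) if change > threshold]


# Hypothetical data: daily follower changes before and after a suspected leak.
history = [1, 2, 0, 3, 1, 2, -1, 2, 1, 0]
observed = [1, 2, 40, 1]

threshold = anomaly_threshold(simulate_baseline(history))
flagged = flag_anomalies(observed, threshold)
```

Here the day at index 2 is flagged, because no resampled scenario can exceed the largest change seen in the history. A realistic baseline would also need to model edge usage (sharing activity) and temporal dependence, both of which this sketch ignores.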

This research is funded by Israel Ministry of Science and Technology research grant 3-9770 “Data Leakage in Social Networks: Detection and Prevention”.

Our more recent work in this area deals with civil social media discourse and its impact on society.

Related publications include:

Yahav, I., Shehory, O., & Schwartz, D.G., Comments Mining With TF-IDF: The Inherent Bias and Its Removal. IEEE Transactions on Knowledge and Data Engineering, 31(3), 437-450, 2018. https://ieeexplore.ieee.org/document/8364601/
Cascavilla, G., Conti, M., Schwartz, D. G., & Yahav, I., The insider on the outside: a novel system for the detection of information leakers in social networks. European Journal of Information Systems, 27(4), 470-485, 2018.
Yahav, I., & Schwartz, D.G., Citizen engagement and the illusion of secrecy: Exploring commenter characteristics in censored online news articles. Information, Communication & Society, 2017. http://dx.doi.org/10.1080/1369118X.2017.1346135
Schwartz, D.G., Yahav, I., & Silverman, G., News Censorship in Online Social Networks: A Study of Circumvention in the Commentsphere. Journal of the Association for Information Science and Technology (JASIST), 68(3), 569-582, 2017.