Twitter Trends Manipulation: A First Look Inside the Security of Twitter Trending

Abstract

Twitter trends, a timely updated set of top terms in Twitter, have the ability to affect the public agenda of the community and have attracted much attention. Unfortunately, in the wrong hands, Twitter trends can also be abused to mislead people. In this paper, we investigate whether Twitter trends are secure from manipulation by malicious users. We collect more than 69 million tweets from 5 million accounts. Using the collected tweets, we first conduct a data analysis and discover evidence of Twitter trend manipulation. Then, we study trending at the topic level and infer which key factors determine whether a topic starts trending: its popularity, coverage, transmission, potential coverage, or reputation. We find that, except for transmission, all of the factors above are closely related to trending. Finally, we further investigate trending manipulation from the perspective of compromised and fake accounts and discuss countermeasures.

Introduction

The Internet has subverted the autocratic way in which traditional media like newspapers disseminate news. Online trends differ from traditional media as a method of information propagation. For instance, Google Hot Trends ranks the hottest searches that have recently experienced a sudden surge in popularity [2]. Meanwhile, these searches may attract much more attention than before because they appear on Google Hot Trends. More recently, Online Social Networking (OSN) services like Twitter have inaugurated a new era of “We Media.” Twitter is a real-time microblogging service. Users broadcast short messages of no more than 140 characters (called tweets) to their followers. Users can also discuss a variety of topics with others at will. The topics that gain sudden popularity are ranked by Twitter as a list of trends (also known as trending topics) [3]. Twitter and Google trends have become important tools for journalists. Twitter in particular is used to develop stories, track breaking news, and assess how public opinion is evolving on a breaking story. Taking election campaigns as an example [5], journalists, campaigns, and pundits have tracked trends in Twitter traffic to determine candidates’ popularity and predict likely election outcomes [4].
Previous research has studied trend taxonomy [7], [9], [10], trend detection [14], [17], [19], [20], and real event extraction from Twitter trends [6], [38]. However, researchers have paid little attention to Twitter trend manipulation. It has been reported that attackers manipulate Google trends by simply employing a large group of people to visit Google and search for a specific keyword phrase [23]. Also, Just et al. [4] inspected Twitter manipulation in an election campaign. As reported in The Wall Street Journal, robots have been used to undermine the “trending topics” on Twitter [1]. Thus, the focus of this work is on Twitter trend manipulation.
In this paper, the primary questions we attempt to answer are whether malicious users can manipulate Twitter trends and, if so, how they might do it. Being exposed to real-time trending topics, users are entitled to have insight into how those topics actually come to trend. Moreover, this research also casts light on how to enhance a commercial promotion campaign by reasonably using Twitter trends. To investigate the possibility of manipulating Twitter trends, we need to deeply understand how Twitter trending works. Twitter states that trends are determined by an algorithm and are always topics that are immediately popular. However, the detailed trending algorithm of Twitter is unknown to the public, and we have no way to find out what it specifically is. Instead, we study Twitter trending at the topic level and infer the key factors that can determine whether a topic trends from its popularity, coverage, transmission, potential coverage, and reputation. After identifying the key factors that are associated with the trends, we then investigate manipulation and countermeasures from the perspective of these key factors. The major contributions of this work are as follows:

• We present evidence of existing manipulation of Twitter trends. In particular, employing an influence model, we analyze the dynamics of an endogenous hashtag and identify the manipulation from its endogenous diffusion. After further investigating the manipulation in the dynamics, we disclose the existence of a suspect spamming infrastructure.

• We study Twitter trending at the topic level, considering topics’ popularity, coverage, transmission, potential coverage, and reputation. The corresponding dynamics for each factor above are extracted, and then a Support Vector Machine (SVM) classifier is used to check how accurately a factor could predict trending. We find that, except for transmission, each studied factor is associated with trending. We further illustrate the interaction pattern between malicious accounts and authenticated accounts, with respect to trending.

• We present the threat of malicious manipulation of Twitter trending, given compromised and fake accounts in the suspect spamming infrastructure we observed. Then we demonstrate how compromised and fake accounts could threaten Twitter trending by simulating the manipulation of dynamics as compromised and fake accounts would do. Corresponding countermeasures are then discussed.
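
The per-factor dynamics mentioned in the contributions above can be pictured as time series computed over fixed windows. The sketch below is only an illustration under stated assumptions: the summary does not define the factors precisely, so popularity is assumed here to be the number of tweets per window and coverage the number of distinct users per window. The function name `factor_dynamics`, the one-hour window, and the `(timestamp, user_id)` input format are hypothetical.

```python
from collections import defaultdict


def factor_dynamics(tweets, window_seconds=3600):
    """Return (popularity, coverage) time series for one topic.

    `tweets` is an iterable of (timestamp_seconds, user_id) pairs.
    Assumed definitions (not taken from the source):
      popularity = tweets per window, coverage = distinct users per window.
    """
    counts = defaultdict(int)
    users = defaultdict(set)
    for timestamp, user_id in tweets:
        window = int(timestamp // window_seconds)
        counts[window] += 1
        users[window].add(user_id)
    if not counts:
        return [], []
    first, last = min(counts), max(counts)
    # Fill empty windows with zeros so both series are evenly spaced.
    popularity = [counts.get(w, 0) for w in range(first, last + 1)]
    coverage = [len(users.get(w, ())) for w in range(first, last + 1)]
    return popularity, coverage


# Made-up tweets for a single hashtag: (seconds since start, user id).
sample = [(0, "a"), (10, "a"), (20, "b"), (3700, "c"), (7300, "a"), (7400, "a")]
popularity, coverage = factor_dynamics(sample)
print(popularity)  # [3, 1, 2]
print(coverage)    # [2, 1, 1]
```

A gap between the two series, such as many tweets from very few users in the last window, is the kind of signal such dynamics make visible. Whether the paper uses it in this way is not stated in the summary.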

Existing System

  • Previous research has studied trend taxonomy, trend detection, and real event extraction from Twitter trends.
  • Just et al. inspected Twitter manipulation in an election campaign. As reported in The Wall Street Journal, robots have been used to undermine the “trending topics” on Twitter.
  • Becker et al. analyzed the stream of Twitter messages and distinguished the messages about real events from non-event messages based on a clustering method.
  • Zubiaga et al. categorized different triggers that leverage the trending topics by using social features rather than content-based approaches.

Disadvantages

  • Researchers have paid little attention to Twitter trend manipulation.
  • It has been reported that attackers manipulate Google trends by simply employing a large group of people to visit Google and search for a specific keyword phrase.

Proposed System

  • In this paper, the primary questions we attempt to answer are whether malicious users can manipulate Twitter trends and, if so, how they might do it. Being exposed to real-time trending topics, users are entitled to have insight into how those topics actually come to trend.
  • Moreover, this research also casts light on how to enhance a commercial promotion campaign by reasonably using Twitter trends. To investigate the possibility of manipulating Twitter trends, we need to deeply understand how Twitter trending works.
  • Twitter states that trends are determined by an algorithm and are always topics that are immediately popular. However, the detailed trending algorithm of Twitter is unknown to the public, and we have no way to find out what it specifically is. Instead, we study Twitter trending at the topic level and infer the key factors that can determine whether a topic trends from its popularity, coverage, transmission, potential coverage, and reputation.

Advantages

  • We present evidence of existing manipulation of Twitter trends. In particular, employing an influence model, we analyze the dynamics of an endogenous hashtag and identify the manipulation from its endogenous diffusion.
  • After further investigating the manipulation in the dynamics, we disclose the existence of a suspect spamming infrastructure.
  • We study Twitter trending at the topic level, considering topics’ popularity, coverage, transmission, potential coverage, and reputation. The corresponding dynamics for each factor above are extracted, and then a Support Vector Machine (SVM) classifier is used to check how accurately a factor could predict trending.
  • We find that, except for transmission, each studied factor is associated with trending.
  • We further illustrate the interaction pattern between malicious accounts and authenticated accounts, with respect to trending.
  • We present the threat of malicious manipulation of Twitter trending, given compromised and fake accounts in the suspect spamming infrastructure we observed. Then we demonstrate how compromised and fake accounts could threaten Twitter trending by simulating the manipulation of dynamics as compromised and fake accounts would do. Corresponding countermeasures are then discussed.
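
To make the "one SVM per factor" idea in the list above concrete, here is a minimal sketch of a linear soft-margin SVM trained by sub-gradient descent on the hinge loss, written from scratch so it runs without extra libraries. The source does not say which SVM variant, kernel, features, or toolkit the authors used, so everything here is an assumption: the function names, the hyperparameters, and the made-up two-number feature vectors (mean and peak of one factor's normalised dynamics) are purely illustrative.

```python
def score(w, b, x):
    """Signed distance proxy: w . x + b."""
    return sum(wi * xi for wi, xi in zip(w, x)) + b


def train_linear_svm(xs, ys, lam=0.01, lr=0.1, epochs=1000):
    """Linear SVM via sub-gradient descent on the regularised hinge loss.

    xs: list of feature vectors; ys: labels in {+1, -1}.
    lam, lr and epochs are illustrative defaults, not values from the source.
    """
    w = [0.0] * len(xs[0])
    b = 0.0
    for _ in range(epochs):
        for x, y in zip(xs, ys):
            violated = y * score(w, b, x) < 1.0
            # The L2 penalty shrinks the weights on every step.
            w = [wi * (1.0 - lr * lam) for wi in w]
            if violated:
                # The hinge loss is active: move the boundary towards (x, y).
                w = [wi + lr * y * xi for wi, xi in zip(w, x)]
                b += lr * y
    return w, b


def predict(w, b, x):
    return 1 if score(w, b, x) >= 0.0 else -1


def accuracy(w, b, xs, ys):
    hits = sum(1 for x, y in zip(xs, ys) if predict(w, b, x) == y)
    return hits / len(xs)


# Made-up data: +1 = topic trended, -1 = topic did not trend.
toy_x = [[0.9, 0.8], [0.8, 0.9], [0.7, 0.6], [0.1, 0.2], [0.2, 0.1], [0.3, 0.3]]
toy_y = [1, 1, 1, -1, -1, -1]
w, b = train_linear_svm(toy_x, toy_y)
print("training accuracy:", accuracy(w, b, toy_x, toy_y))
```

Training one such classifier per factor and comparing held-out accuracy is one way to read "how accurately a factor could predict trending". A factor whose classifier does no better than chance would, like transmission in the findings above, be judged unrelated to trending.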

Related Work

To the best of our knowledge, this is the first effort to investigate whether Twitter trends could be manipulated. Research on trending topics in Twitter includes real event recognition [6], [7], real-time trending topic detection [14], [15], [16], [38], the characterization of trending topic evolution [17], [18], and the taxonomy of trending topics [9], [21], [22]. Becker et al. [6] analyzed the stream of Twitter messages and distinguished the messages about real events from non-event messages based on a clustering method. Zubiaga et al. [7] categorized different triggers that leverage the trending topics by using social features rather than content-based approaches.
In the detection of real-time trending topics, Agarwal et al. [38] identified emerging events before they became trending topics by modeling the detection problem as discovering dense clusters in highly dynamic graphs. Kasiviswanathan et al. [14] presented a dictionary-learning-based framework for detecting emerging topics in social media via the user-generated stream. Lu et al. [15] used an energy function to model the life activity of news events on Twitter and proposed a news event detection method based on an online energy function. Cataldi et al. [16] identified emerging terms from user content by measuring user authority and proposing a keyword life cycle model, and then detected the emerging topics by formalizing a keyword-based topic graph.
To address the evolution and taxonomy of trending topics, Altshuler and Pan [17] presented lower bounds on the probability that emerging trends successfully spread through scale-free networks. Asur et al. [18] studied trending topics on Twitter and theoretically analyzed the formation, persistence, and decay of trends. Naaman et al. [9] characterized the trends in multiple dimensions and presented a taxonomy of trends. They also proposed a collection of hypotheses on different kinds of trends and evaluated them. Lehmann et al. [21] classified popular hashtags by their temporal dynamics. Irani et al. [22] focused on the trend-stuffing issue and developed a classifier to automatically identify trend-stuffing in tweets.
Whether a topic begins trending is closely related to (1) the influence of users who are involved with the topic and (2) topic adoption by users who are exposed to the topic. Cha et al. [24] performed a comparison of three different measures of influence: indegree, retweet, and mention. Weng et al. [25] proposed a topic-sensitive PageRank measure for user influence. Romero et al. [26] proposed an algorithm to measure the relative influence and passivity of each user from the viewpoint of a whole network. Bakshy et al. [27] measured the influence from the diffusion tree. The studies of topic adoption in Twitter mainly concentrate on hashtag adoption. Lin et al. [28] classified the adoption of hashtags into two classes and proposed a framework to capture the dynamics of hashtags based on their topicality, interactivity, diversity, and prominence. Yang et al. [29] studied the effect of the dual role of a hashtag on hashtag adoption.

Limitation

Our work has several limitations, some of which we will address in future work. First, we use a linear influence model to capture the network impact on the diffusion of a topic in Twitter, which enables us to find evidence of manipulation. The application of the model is limited to linear scenarios; we will develop a non-linear model in future work. Second, we randomly choose 11 topics and more than 10,000 related tweets to infer the relevance of five key factors to Twitter trending. Although we have tried our best to guarantee randomness, a sample of 11 topics may not be large enough to represent the overall scenario in practice. Besides, we study five comparatively straightforward factors that may affect trending. In future work, we will consider more complicated factors and sample more topics to study their effect on trending. Finally, we propose countermeasures against Twitter trend manipulation, but most of them remain at the discussion stage. We leave the implementation and evaluation of those countermeasures for future work. Specifically, we plan to develop a manipulation detection mechanism using an SVM classifier: we will train the classifier on previously manipulated topics and then classify future trends as manipulated or not.
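
The linearity limitation mentioned above is easier to see with a small sketch. The code below assumes one common form of a linear influence model, in which every account that has mentioned a topic adds its own influence curve to the expected volume in later time steps, so the prediction is a plain sum. The summary does not give the authors' exact formulation or how they derive evidence of manipulation from it. The function names, the per-user influence curves, and the use of residuals are therefore hypothetical illustrations only.

```python
def predict_volume(adoptions, influence, horizon):
    """Predict topic volume per time step under an additive (linear) model.

    adoptions: list of (user_id, adoption_step) with integer steps.
    influence: dict mapping user_id to a list; influence[u][lag] is the
        assumed number of mentions user u triggers (lag + 1) steps after
        adopting the topic.
    """
    volume = [0.0] * horizon
    for user_id, adopted_at in adoptions:
        for lag, value in enumerate(influence.get(user_id, [])):
            step = adopted_at + 1 + lag
            if 0 <= step < horizon:
                # Linearity: contributions from different users simply add up.
                volume[step] += value
    return volume


def residuals(observed, predicted):
    """Observed volume that the influence model leaves unexplained."""
    return [o - p for o, p in zip(observed, predicted)]


# Made-up example: user "a" adopts at step 0, user "b" at step 1.
adoptions = [("a", 0), ("b", 1)]
influence = {"a": [2.0, 1.0], "b": [3.0]}
predicted = predict_volume(adoptions, influence, horizon=4)
print(predicted)                           # [0.0, 2.0, 4.0, 0.0]
print(residuals([0, 2, 9, 0], predicted))  # [0.0, 0.0, 5.0, 0.0]
```

Because the prediction is a sum of independent per-user curves, the model cannot express effects such as influence that saturates or accounts that reinforce one another, which is presumably what a non-linear model would add.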

Conclusion

With the datasets we collected via the Twitter API, we first present evidence of the manipulation of Twitter trending and observe a suspect spamming infrastructure. Then, we employ an SVM classifier to explore how accurately five different factors at the topic level (popularity, coverage, transmission, potential coverage, and reputation) could predict trending. We observe that, except for transmission, the factors are all closely related to Twitter trending. We further investigate the interaction patterns between authenticated accounts and malicious accounts. Finally, we present the threat posed by compromised and fake accounts to Twitter trending and discuss the corresponding countermeasures against trending manipulation.