
Profiling Online Social Behaviors for Compromised Account Detection
Abstract
Introduction
Compromised accounts in Online Social Networks (OSNs) are more favorable than sybil accounts to spammers and other malicious OSN attackers. Malicious parties exploit the well-established connections and trust relationships between the legitimate account owners and their friends, and efficiently distribute spam ads, phishing links, or malware, while avoiding being blocked by the service providers. Offline analyses of tweets and Facebook posts [10], [12] reveal that most spam is distributed via compromised accounts, instead of dedicated spam accounts.
Recent large-scale account hacking incidents [1], [2] in popular OSNs provide further evidence of this trend. Unlike dedicated spam or sybil accounts, which are created solely to serve malicious purposes, compromised accounts are originally possessed by benign users. While dedicated malicious accounts can simply be banned or removed upon detection, compromised accounts cannot be handled likewise because of the potential negative impact on normal user experience (e.g., those accounts may still be actively used by their legitimate benign owners). Major OSNs today employ IP geolocation logging to battle against account compromise.
However, this approach is known to suffer from low detection granularity and a high false positive rate. Previous research on spamming account detection mostly cannot distinguish compromised accounts from sybil accounts; only one recent study, by Egele et al., addresses compromised account detection. Existing approaches involve account profile analysis and message content analysis (e.g., embedded URL analysis and message clustering). However, account profile analysis is hardly applicable to detecting compromised accounts, because their profiles contain the original users’ information, which spammers are likely to leave intact. URL blacklisting faces the challenge of timely maintenance and update, and message clustering introduces significant overhead when subjected to a large number of real-time messages. Instead of analyzing user profile contents or message contents, we seek to uncover the behavioral anomaly of compromised accounts by using their legitimate owners’ historical social activity patterns, which can be observed in a lightweight manner.
To better serve users’ various social communication needs, OSNs provide a great variety of online features for their users to engage in, such as building connections, sending messages, uploading photos, browsing friends’ latest updates, etc. However, how a user engages in each activity is completely driven by personal interests and social habits. As a result, the interaction patterns with a number of OSN activities tend to diverge across a large set of users. While a user tends to conform to its social patterns, a hacker of the user account who knows little about the user’s behavioral habits is likely to diverge from the patterns. Therefore, as long as an authentic user’s social patterns are recorded, checking the compliance of the account’s subsequent behaviors with the authentic patterns can detect account compromise. Even if a user’s credentials are hacked, a malicious party cannot easily obtain the user’s social behavior patterns without control of the physical machines or the clickstreams. Moreover, a spammer has very different social interests from those of regular users (e.g., mass spam distribution vs. entertaining with friends), so mimicking each individual user’s social interaction patterns is very costly for the spammer, as it would significantly reduce spamming efficiency.
In light of the above intuition and reasoning, we first conduct a study on online user social behaviors by collecting and analyzing user clickstreams of a well-known OSN website. Based on our observation of user interaction with different OSN services, we propose several new behavioral features that can effectively quantify user differences in online social activities. For each behavioral feature, we deduce a behavioral metric by obtaining a statistical distribution of the value ranges observed from each user’s clickstreams. Moreover, we combine the respective behavioral metrics of each user into a social behavioral profile, which represents a user’s social behavior patterns. To validate the effectiveness of the social behavioral profile in detecting account activity anomaly, we apply the social behavioral profile of each user to differentiate the clickstreams of that user from those of all other users. We conduct multiple cross-validation experiments, each with a varying amount of input data for building social behavioral profiles. Our evaluation results show that the social behavioral profile can effectively differentiate individual OSN users with accuracy up to 98.6%, and the more active a user, the more accurate the detection.
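As a rough illustration of the profiling idea described above, the following Python sketch turns observed feature values into per-feature distributions over value ranges, combines them into a profile, and compares two profiles. It is a minimal sketch under stated assumptions, not the method evaluated in this paper: the feature names, the bin boundaries, the sample values, and the use of a mean Euclidean distance are all hypothetical choices made for illustration.

```python
from bisect import bisect_right
from math import sqrt


def distribution(values, bin_edges):
    """Fraction of values in each range defined by ascending bin_edges.

    Bin 0 holds values below bin_edges[0]; the last bin holds values at or
    above bin_edges[-1]. The binning scheme is an assumption.
    """
    counts = [0] * (len(bin_edges) + 1)
    for v in values:
        counts[bisect_right(bin_edges, v)] += 1
    total = len(values)
    return [c / total for c in counts] if total else [0.0] * len(counts)


def build_profile(observations, bins):
    """Map each feature name to the distribution of its observed values."""
    return {name: distribution(vals, bins[name])
            for name, vals in observations.items()}


def profile_distance(p, q):
    """Mean Euclidean distance over the features shared by two profiles
    (an assumed comparison measure)."""
    shared = sorted(p.keys() & q.keys())
    dists = [sqrt(sum((a - b) ** 2 for a, b in zip(p[f], q[f])))
             for f in shared]
    return sum(dists) / len(dists)


# Hypothetical feature values, in seconds.
bins = {"visit_duration": [5, 30, 120], "action_latency": [1, 3, 10]}
owner = build_profile(
    {"visit_duration": [2, 4, 12, 45, 50, 200],
     "action_latency": [0.5, 2, 2.5, 4]}, bins)
later_sample = build_profile(
    {"visit_duration": [3, 10, 40, 60],
     "action_latency": [0.8, 2.2, 5]}, bins)
suspicious_sample = build_profile(
    {"visit_duration": [1, 1, 2, 1, 3],
     "action_latency": [0.2, 0.3, 0.2, 0.4]}, bins)

print(profile_distance(owner, later_sample))
print(profile_distance(owner, suspicious_sample))
```

In this toy data the second sample lies farther from the owner’s profile than the first; in practice a decision threshold would have to be chosen from real clickstreams.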

Related Work
Schneider et al. and Benevenuto et al. measured OSN users’ behaviors based on network traffic collected from ISPs. Both works analyze the popularity of OSN services, session length distributions, and user click sequences among OSN services, and discover that browsing accounts for a majority of users’ activities. Benevenuto et al. [6] further explored user interactions with friends and other users multiple hops away. While these works primarily emphasize overall OSN service usage and aim to uncover general knowledge on how OSNs are used, this paper studies users’ social behavior characteristics for a very different purpose. We investigate the characterization of individual users’ social behaviors to detect account usage anomaly. Moreover, we propose several new user behavioral features and perform a measurement study at a fine granularity. Viswanath et al. [20] also aim to detect abnormal user behaviors in Facebook, but they focus solely on “like” behaviors to detect spammers.
While most previous research on malicious account detection cannot differentiate compromised accounts from spam accounts, Egele et al. specifically studied the detection of compromised accounts. By recording a user’s message posting features, such as timing, topics and correlation with friends, they detected irregular posting behaviors; in addition, all messages in a certain duration are clustered based on content, and the clusters in which most messages are posted with irregular behaviors are classified as coming from compromised accounts. While they also leveraged certain user behavior features to discern abnormality, we use a different and more complete set of metrics to characterize users’ general online social behaviors, instead of solely focusing on message posting behaviors. Additionally, our technique does not rely on deep inspection and classification of message contents and thus avoids heavyweight processing. Wang et al. proposed an approach for sybil account detection by analyzing clickstreams. They differentiated sybil and common users’ clicks based on inter-arrival time and click sequence, and found that considering both factors leads to better detection results. Since sybils are specialized fake identities owned by attackers, their clickstream patterns significantly differ from those of normal users. However, the clickstreams of compromised accounts can be a mix from normal users and spammers. As a result, the methods of Wang et al. cannot handle compromised accounts well.
In contrast, this paper aims to uncover users’ social behavior patterns and habits from the clickstreams, with which we can perform accurate and fine-grained detection of behavioral deviation. Regarding spammer detection, Lee et al. and Stringhini et al. set up honeypot accounts to harvest spam and identify common features among spammers, such as the URL ratio in their messages and their choice of friends; using those features, both employ classification algorithms to detect spammers. Yang et al. introduced new spammer features based on connection characteristics to achieve better accuracy. Thomas et al. analyzed the features of fraudulent accounts bought from the underground market and developed a classifier using the features to retrospectively detect fraudulent accounts. Instead of focusing on malicious accounts, Xie et al. proposed to vouch for normal users based on the connections and interactions among legitimate users.
As for spam detection, Gao et al. proposed a real-time spam detection system, which consists of a cluster recognition system to cluster messages and a spam classifier using six spam message features. Thomas et al. strove to detect spam by identifying malicious URLs in message content. Other studies conducted offline analyses to characterize social spam in Facebook and Twitter. They found that a significant portion of spam was from compromised accounts, instead of spam accounts. Meanwhile, Yang et al. investigated connections among identified spammers. Other malicious account detection methods exploit the differences in static profile or connectivity information between normal and malicious accounts. Users’ social behavior analysis has also been employed to serve other purposes. Wilson et al. analyzed user interactions with friends from the trace of Facebook profiles to improve performance for sybil detection while reducing its complexity. Another study correlated users’ personalities with their OSN service usage.
User Social Behavior
In this section, we first propose several social behavior features on OSNs, and describe in detail how they can reflect differences in user social interaction. Then, we present a measurement study on user behavior diversity by analyzing real user clickstreams of a well-known OSN, Facebook, with respect to our proposed features.
Social Behavior Features
We categorize user social behaviors on an OSN into two classes: extroversive behaviors and introversive behaviors. Extroversive behaviors, such as uploading photos and sending messages, result in visible imprints to one or more peer users; introversive behaviors, such as browsing other users’ profiles and searching in the message inbox, do not produce observable effects to other users. While most previous research focuses only on extroversive behaviors, such as public posting [8], we study both classes of behaviors for a more complete understanding and characterization of user social behaviors.
Extroversive Behavior Features:
Extroversive behaviors directly reflect how a user interacts with its friends online, and thus they are important for characterizing a user’s social behaviors. We characterize extroversive behaviors through the following four features.
First Activity: The first extroversive activity a user engages in after logging in to an OSN session can be habitual. Some users often start by commenting on friends’ new updates, while others are more inclined to update their own status first. The first activity feature aims to capture a user’s habitual action at the beginning of each OSN session.
Activity Preference: How often a user engages in each type of extroversive activity relates to the user’s personality [5]. Some users like to post photos, while others spend more time responding to friends’ posts; some mostly chat with friends via private messages, while others always communicate by posting on each other’s public message boards. Typical OSNs provide a great variety of social activities to satisfy their users’ communication needs, for example, commenting, updating status, posting notes, sending messages, sharing posts, inviting others to an event, etc. As a result, this feature can provide a detailed portrayal of a user’s social communication preferences.
Activity Sequence: The relative order in which a user completes multiple extroversive activities can also be habitual. While users have their preferences for different social activities, they may also have habitual patterns when switching from one activity to another. For instance, after commenting on friends’ updates, some users often update their own status, while others prefer to send messages to or chat with friends instead. Therefore, the activity sequence feature reflects a different social behavioral pattern from the activity preference.
Action Latency: The speed of actions when a user engages in certain extroversive activities reflects the user’s social interaction style. Many activities on OSNs require multiple steps to complete. For example, posting photos involves loading the upload page, selecting one or more photos, uploading, editing (e.g., clipping, decorating, tagging, etc.), previewing and confirmation. The time a user takes to complete each action of a given activity is heavily influenced by the user’s social characteristics (e.g., serious vs. casual) and familiarity with the respective activity; however, it does not directly reflect how fast a user acts, because content complexity differs. The action latency feature is proposed to provide a more fine-grained and accurate metric.
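The four extroversive features above can be sketched in Python as follows. This is only an illustrative sketch: the record layout `(timestamp, activity, action)`, the activity and action names, and the rule that consecutive clicks of the same activity belong to one activity instance are assumptions, not details given in the text.

```python
from collections import Counter

# Hypothetical record: (timestamp_seconds, activity, action) for one
# extroversive click. Each session is a time-ordered list of such records.


def activity_runs(session):
    """Group consecutive clicks of the same activity into (activity, timestamps).

    Simplification: two back-to-back instances of the same activity are merged.
    """
    runs = []
    for ts, activity, _action in session:
        if runs and runs[-1][0] == activity:
            runs[-1][1].append(ts)
        else:
            runs.append((activity, [ts]))
    return runs


def extroversive_features(sessions):
    """Return raw observations for the four extroversive features."""
    first, preference, transitions = Counter(), Counter(), Counter()
    latencies = {}
    for session in sessions:
        runs = activity_runs(session)
        if not runs:
            continue
        first[runs[0][0]] += 1                        # first activity
        for activity, stamps in runs:
            preference[activity] += 1                 # activity preference
            gaps = [b - a for a, b in zip(stamps, stamps[1:])]
            latencies.setdefault(activity, []).extend(gaps)  # action latency
        for (a, _), (b, _) in zip(runs, runs[1:]):
            transitions[(a, b)] += 1                  # activity sequence
    return first, preference, transitions, latencies


# Hypothetical sessions for one user.
sessions = [
    [(0, "comment", "open"), (4, "comment", "submit"),
     (20, "status", "open"), (31, "status", "submit")],
    [(0, "comment", "open"), (6, "comment", "submit"),
     (15, "photo", "load"), (25, "photo", "select"), (40, "photo", "upload")],
]
first, preference, transitions, latencies = extroversive_features(sessions)
print(first, preference, transitions, latencies)
```

The counters and latency lists returned here are raw observations; they would still need to be normalized into per-user distributions before serving as behavioral metrics.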
Introversive Behavior Features:
Although invisible to peer users, introversive behaviors make up the majority of a user’s OSN activity; as studied in previous work [6], [15], the dominant (i.e., over 90%) user behavior on an OSN is browsing. Through introversive activities, users gather and consume social information, which helps them to form ideas and opinions, and eventually to establish social connections and initiate future social communications. Hence, introversive behavior patterns make up an essential part of a user’s online social behavioral characteristics. We propose the following four features to portray a user’s introversive behavior.
Browsing Preference: The frequency with which a user visits various OSN page types depicts its social information preferences. Typical OSNs classify social information into different page types. For instance, profile pages contain personal information of the account owners, e.g., names, photos, interests, etc.; the homepage is composed of the latest updates from the account owner’s friends, while a group page consists of posts or photos shared by group members. Users’ preferences for various types of social information naturally differ according to their own interests, and the browsing preference feature intends to reflect this difference by observing users’ subjective behaviors.
Visit Duration: The time a user spends on visiting each webpage depicts another aspect of its social information consumption. Intuitively, users tend to spend less time on information that is “good-to-know”, while allocating more time to consuming information that is “important”, and their judgments are based on their own personal interests. For example, some users prefer to stay on their own homepage reading friends’ comments and updates, while others tend to spend more time reading others’ profile pages. The visit duration feature aims at capturing the social information consumption patterns of different users.
Request Latency: During a single visit to a webpage, a user may request multiple pieces of information. For example, browsing through a photo album requires loading each photo inside the album; reading comments from friends may also require “flipping” through many “pages” because only a limited number of entries can be displayed at a time. Similar to the action latency feature for extroversive activities, the request latency feature provides fine-grained characterization of users’ social information consumption patterns.
Browsing Sequence: The order in which a user switches between different webpages reflects the user’s navigation patterns among different types of social information. OSNs usually provide easy navigation for users to move around various pages; a user on a friend’s profile page can directly navigate to another’s profile page, or go back to the homepage and then go to another friend’s profile page. How each user navigates during browsing can be habitual, and this feature intends to capture this characteristic.
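A similar sketch can illustrate the four introversive features. The record layout `(timestamp, page_type, is_new_page)`, the page type names, and the approximation of visit duration as the time until the next page load (or the session end) are assumptions made for this example; the text does not specify how these values are measured.

```python
from collections import Counter, defaultdict

# Hypothetical record: (timestamp_seconds, page_type, is_new_page).
# is_new_page is False for follow-up requests within the same page visit,
# such as loading the next photo of an album.


def introversive_features(session, session_end):
    """Return raw observations for the four introversive features."""
    visits = []  # each entry: (page_type, [request timestamps in that visit])
    for ts, page, is_new_page in session:
        if is_new_page or not visits:
            visits.append((page, [ts]))
        else:
            visits[-1][1].append(ts)  # follow-up request on the current page

    preference, sequence = Counter(), Counter()
    durations, request_latency = defaultdict(list), defaultdict(list)
    for i, (page, stamps) in enumerate(visits):
        preference[page] += 1                                 # browsing preference
        has_next = i + 1 < len(visits)
        end = visits[i + 1][1][0] if has_next else session_end
        durations[page].append(end - stamps[0])               # visit duration
        request_latency[page].extend(                         # request latency
            b - a for a, b in zip(stamps, stamps[1:]))
        if has_next:
            sequence[(page, visits[i + 1][0])] += 1           # browsing sequence
    return preference, durations, request_latency, sequence


# Hypothetical browsing session for one user.
session = [
    (0, "homepage", True), (30, "profile", True),
    (50, "photo", True), (55, "photo", False), (62, "photo", False),
    (80, "homepage", True),
]
preference, durations, request_latency, sequence = introversive_features(
    session, session_end=100)
print(preference, dict(durations), dict(request_latency), sequence)
```

As with the extroversive sketch, these are raw per-session observations that would be aggregated into distributions over many sessions.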
Conclusion
In this paper, we propose to build a social behavior profile for individual OSN users to characterize their behavioral patterns. Our approach takes into account both extroversive and introversive behaviors. Based on the characterized social behavioral profiles, we are able to distinguish a user from others, which can be easily employed for compromised account detection. Specifically, we introduce eight behavioral features to portray a user’s social behaviors, which include both its extroversive posting and introversive browsing activities. A user’s statistical distributions of those feature values comprise its behavioral profile. While users’ behavior profiles diverge, an individual user’s activities are highly likely to conform to its behavioral profile. This fact is thus employed to detect a compromised account, since impostors’ social behaviors can hardly conform to the authentic user’s behavioral profile. Our evaluation on sample Facebook users indicates that we can achieve high detection accuracy when behavioral profiles are built in a complete and accurate fashion.