Efficiently Promoting Product Online Outcome: An Iterative Rating Attack Utilizing Product and Market Property
With the rapid growth of e-commerce and social media, online rating systems that let users post ratings/reviews of products and services are playing an increasingly important role in influencing users’ online purchasing/downloading decisions. On the one hand, users may directly rank products/services according to their rating scores. On the other hand, online recommender systems that help users identify their preferred items from a vast number of products/services also take such ratings/reviews as a critical input. According to a 2013 survey conducted by Dimensional Research, 88% of online users have been influenced by an online user review when making a buying decision. A survey conducted by comScore Inc. and The Kelsey Group reveals that consumers are willing to pay at least 20% more for services receiving an “Excellent”, or 5-star, rating than for the same services receiving a “Good”, or 4-star, rating. eBay sellers with established reputations can expect about 8% more revenue than new sellers marketing the same goods.
The huge profits provide a great incentive for companies to manipulate online user ratings/reviews in practice. Book authors and eBay users have been shown to write or buy favorable ratings for their own products. A recent study has identified that 10% of online products have had their user ratings manipulated. Yelp has identified roughly 16% of its restaurant ratings as dishonest. The boom of rating companies, which provide sophisticated rating manipulation packages at affordable prices, reinforces the prevalence of such manipulations. For just $9.99, a company named “IncreaseYouTubeViews.com” can provide 30 “I like” ratings or 30 real user comments to boost video clips on YouTube. Taobao, the largest Internet retail platform in China, has identified these rating boosting services as a severe threat.
The protection of online rating systems is rooted in a thorough understanding of how attack strategies work. Hence, a number of studies have been conducted to investigate rating attack strategies. Generally speaking, rating attacks can be classified into two categories: self-boosting attacks, where malicious users aim to boost rating scores of their own products, and bad-mouthing attacks, where malicious users aim to downgrade rating scores of competitors’ products. Specifically, a number of diverse rating manipulation strategies have been proposed, such as the Sybil attack, the Oscillation attack, and the RepTrap attack. Nevertheless, current studies on rating attacks are still immature for several reasons. First, most of the current studies evaluate attack impact by measuring the distortion of a target product’s rating score, or the number of unfair ratings bypassing the detection scheme, while seldom considering the economic impact on product market outcome. The product market outcome can be measured by firm equity values, online sales, or downloads if the product is in digital format. The lack of economic analysis often leads to impractical attack designs that are effective in changing products’ rating scores but do not necessarily attract more real sales/downloads. Second, the current design of attack strategies mainly focuses on malicious user behavior while ignoring product properties. Although malicious user behavior has been extensively investigated, it is not the only factor determining the attack consequences. The same malicious user behavior may lead to completely different impacts on products with different properties, such as existing rating value and volume, existing sales/downloads, and market ranks. Third, current attacks promote/downgrade products by considering only the “external energy” provided by unfair ratings while ignoring the “internal energy” generated by the market itself. For example, if a rating manipulation launched at time t-1 is able to increase the target product’s sales at time t, we find that the greater sales and higher popularity can further bring in more sales at the next time point t+1, even though the rating manipulation has already stopped.
To fill the gap, we consider these three aspects in the design of the proposed attack and summarize our contributions as follows. First, we introduce economic analysis into the design of rating manipulations by modeling how manipulation related factors influence products’ online sales/downloads. Second, we further differentiate the manipulation impact on products with different popularity by adopting a quantile regression model. Third, for the first time, we discover a “self-exciting” property in the online rating market, which may provide extra energy beyond the manipulation power to push target products to a higher rank than expected. Inspired by these findings, a novel iterative rating attack strategy is proposed, and its effectiveness has been validated through experimental results. Note that we mainly focus on self-boosting attacks in this study. The same logic, however, may also help the design of bad-mouthing attacks.
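To make the self-exciting intuition concrete, the following is a minimal toy sketch, not the model estimated in this paper. It assumes that each period’s downloads equal a fixed baseline, plus a carryover fraction of the previous period’s downloads, plus any extra downloads induced by unfair ratings in that period. The function name `simulate_downloads` and the parameters `baseline`, `carryover`, and `boosts` are hypothetical and chosen only for illustration.

```python
def simulate_downloads(baseline, carryover, boosts):
    """Toy recurrence: d[t] = baseline + carryover * d[t-1] + boosts[t].

    Assumption (not from the paper): popularity carries over linearly,
    with 0 <= carryover < 1 so that the market has a stable steady state.
    """
    if not 0 <= carryover < 1:
        raise ValueError("carryover must lie in [0, 1)")

    # Downloads per period when no manipulation ever happens.
    steady_state = baseline / (1 - carryover)

    downloads = []
    previous = steady_state
    for boost in boosts:
        # External energy (boost) plus internal energy (carryover term).
        current = baseline + carryover * previous + boost
        downloads.append(current)
        previous = current
    return downloads
```

Under these assumptions, a single boost in the first period (for example `boosts=[50.0, 0.0, 0.0, 0.0]` with `baseline=100.0` and `carryover=0.5`) keeps the simulated downloads above the unmanipulated steady state in the later periods, decaying back only gradually. This is the kind of extra “internal energy” described above, in a deliberately simplified form.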
A. Influential Factors for Online User Choices
A thorough understanding of how different factors may affect online users’ decision making serves as the foundation for the design of efficient rating manipulation strategies. Therefore, we first conduct a comprehensive literature review on influential factors for online user choices.
Rating value and volume are generally recognized as critical influential factors on a product’s online market sales/downloads. Rating values, which reflect prior users’ preferences and perceptions of product quality, play an important role in influencing later users’ choices. Specifically, the impact of rating value is found to be nonlinear, meaning that a fixed increase in rating value can lead to disparate changes in market sales/downloads for products with different existing ratings. On the other hand, rating volume, which indicates a product’s visibility on the market, helps bring the product to users’ attention. Therefore, an increase in rating volume may help the product stand out from abundant competitors and lead to a larger chance of receiving greater market sales/downloads.
More interestingly, various studies in recent years have found that the impact of rating value and volume differs across products with different popularity, which is often represented by market ranks. To accurately capture such impact, quantile regression models have been proposed by prior studies in marketing and information systems and have achieved good results. This regression methodology has also long been established in econometrics and is shown to be robust and appropriate for estimating the differential impact of influential factors across the whole distribution of the outcome variable.
A product’s sales/downloads may also be affected by the network effect and the herding effect. First, product diffusion theory indicates a network effect where a greater user base of a product, generally measured by past sales/downloads, will help expand its market share. Second, the herding effect refers to the empirical finding that online consumers follow others’ adoption decisions. In other words, if the products have become more popular (i.e., ranked more highly in the market), consumers may follow their predecessors’ steps and also choose those more popular products. We follow the above literature to adopt these influential factors in our quantile regression model.
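As background on why quantile regression can separate the effects on more and less popular products, the sketch below illustrates the check (pinball) loss that quantile regression minimizes. It is a simplified, intercept-only illustration: it fits a single constant rather than coefficients on rating value, rating volume, past downloads, and rank, and it is not the specification used in this paper. The names `pinball_loss`, `total_loss`, and `best_constant` are hypothetical.

```python
def pinball_loss(residual, tau):
    """Check loss for quantile level tau in (0, 1).

    Positive residuals (under-prediction) are weighted by tau,
    negative residuals (over-prediction) by (1 - tau).
    """
    if residual >= 0:
        return tau * residual
    return (tau - 1) * residual


def total_loss(values, candidate, tau):
    """Sum of check losses when every value is predicted by one constant."""
    return sum(pinball_loss(v - candidate, tau) for v in values)


def best_constant(values, tau):
    """Constant minimizing the total check loss.

    The loss is piecewise linear and convex in the constant, so a
    minimizer can always be found among the observed values.
    """
    candidates = sorted(set(values))
    return min(candidates, key=lambda c: total_loss(values, c, tau))
```

For a skewed sample such as `[1, 2, 3, 4, 100]`, `best_constant` with `tau=0.5` recovers the median, while a high `tau` moves the fit toward the top of the distribution. A full quantile regression applies the same loss to a linear function of the explanatory factors, which is what allows the estimated impact of a factor to differ between quantiles of the outcome.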
B. Rating Manipulation Studies
To compare the proposed attack to existing ones, we further review state-of-the-art rating manipulation studies. The design of rating attack strategies has been investigated by many security studies and is dynamically evolving. In simple attacks, unfair ratings are provided independently. For example, eBay users often boost their own reputation by buying and selling ratings from independent sources. Such simple attacks, however, cannot cause severe damage to the system due to the limited power of individual user accounts.
Collusion attacks, where an excessive number of online IDs coordinate to insert unfair ratings, are adopted by many rating manipulation strategies as a more powerful attack. The Sybil attack is a typical example of collusion attacks. The colluding malicious users can (1) provide high ratings for self-promoting; (2) provide low ratings for bad-mouthing; (3) restore their reputation by providing honest ratings to products that they do not care about; or (4) whitewash their reputation by registering new user IDs.
Advanced collusion attacks, where malicious IDs perform more diverse yet coordinated tasks, are proposed to further strengthen manipulation impact and to avoid being detected. For example, in Oscillation attacks, multiple malicious user groups may exhibit different rating behaviors to protect one another from being detected. The roles of these groups switch dynamically. Another example is the RepTrap attack, where malicious user IDs coordinate to overturn the reputation of some products and turn them into traps one after another. This way, users who provide honest ratings on these trap products will be mistakenly identified as malicious by the rating defense scheme. As a consequence, malicious users are trusted by the system and become more powerful in launching further attacks, while honest users are marked as untrustworthy.
To defend against collusion attacks, many online rating systems increase the cost of acquiring multiple user IDs by binding identity with IP address or requiring entry fees, use network coordinates to detect Sybil attacks, and analyze trust relationships in social networks to identify collusion groups. In addition, a variety of advanced defenses are proposed to statistically analyze products’ rating distributions, to evaluate raters’ feedback trust, and to adopt temporal and user similarity information in unfair rating detection. Note that, in this study, we mainly focus on how to enhance manipulation impact when the same set of unfair ratings is applied to different target products. How these ratings can be inserted without being detected is beyond the scope of this paper and will be studied in future work. Therefore, defense solutions are not considered in this study.
The prosperity of online rating systems has significantly influenced the way people make their online purchasing/downloading decisions. Meanwhile, the simplicity of generating online ratings/reviews makes such systems vulnerable to diverse manipulations from malicious vendors in practice. The study of rating attack strategies, although closely related to the root of this issue, is still immature. In this study, we first examine the economic impact of different influential factors on product sales/downloads by applying a quantile regression model to a real market data set that contains product download information. Based on this understanding, we then classify products into three types according to their properties and propose distinct feasible and optimal manipulation strategies for each product type. More importantly, we further disclose and validate the existence of self-excited rank promotion. By integrating manipulation power with the market’s self-exciting power, we then propose a novel iterative rating attack and validate its advantage through experiments.
Beyond the specific CNETD context in this study, we would like to further discuss the feasibility of the proposed attack on other online rating platforms. First, quantile regression models have been adopted by a number of prior works to evaluate the impact of user ratings on products’ market outcome on a variety of online rating platforms and have demonstrated convincing results. Therefore, we believe that the market self-exciting power observed based on such a model also exists across different online rating platforms. Second, we acknowledge that the proposed attack may be detected by existing defense schemes. One example is unfair rating detection based on user behavior patterns, which identifies malicious reviewers by checking their review history and/or social networking connections. Furthermore, since the proposed attack aims to promote products’ rank, another potential defense could be tracking products’ rankings and flagging products with rapid rank changes as suspicious. In addition, many online rating platforms verify user ratings/reviews through manual checking, customer reporting, or transaction verification, which makes the success of the proposed attack more difficult. Nevertheless, the proposed attack is still feasible on different online rating platforms for two reasons. (1) It can be adjusted to avoid detection by these defense schemes, at least partially if not completely, by slowly changing the target product’s rating value/volume and spending more time/money to mimic normal behavior patterns. The tradeoff is that the manipulation cost increases and the process becomes longer, which may lead to a slower increase in manipulation profits. (2) More importantly, in the proposed attack, malicious attackers can customize attack strategies based on their manipulation power, the specific properties of the target product, and the market self-exciting power. Most current defense schemes, however, use the same detection settings for all products and therefore cannot effectively detect each individual target product promoted by the proposed attack. Therefore, we believe that the proposed attack is feasible across different online rating platforms.
Through this study, we also identify several interesting directions for further investigation. First, we emphasize that the promotional effect on products’ market outcome is determined not only by the attacker’s manipulation power but also by the specific properties of the target product and the market self-exciting power. To the best of our knowledge, this is the first work to make this statement. It also guides future attack/defense studies in online rating systems towards customization, which is rarely investigated in the current literature. Second, the time factor may play an important role in influencing attack results. There have been many online articles/reports discussing the best time of day to generate influential posts on different social networking sites, such as Facebook, Twitter, and Instagram, whereas the reported best timing seems to vary across platforms, regions, target audiences, and contents. Nevertheless, there is limited scientific work studying how the timing of fake rating injection affects the manipulation impact. Our current data set only allows analysis at a weekly granularity. We plan to collect data sets with finer-grained time stamps. Last but not least, we would like to study the tradeoff between bypassing defense schemes and making manipulation profits faster. An optimized solution to balance attack costs and profits will make the proposed attack even stronger.
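The rank-tracking defense mentioned in the discussion above can be sketched as follows. This is a minimal illustration under stated assumptions, not a defense evaluated in this paper: ranks are assumed to be observed once per week (matching the weekly granularity of the data set), a smaller rank number is assumed to mean a better position, and a single global threshold is applied to every product. The function name `flag_rapid_rank_gains`, the parameter `max_weekly_gain`, and the product names are hypothetical.

```python
def flag_rapid_rank_gains(weekly_ranks, max_weekly_gain):
    """Return products whose rank improved too quickly in any single week.

    weekly_ranks maps a product name to its list of weekly ranks,
    ordered from oldest to newest (rank 1 is assumed to be the best).
    """
    suspicious = []
    for product, ranks in weekly_ranks.items():
        # Positive gain means the product moved up the ranking that week.
        gains = [earlier - later for earlier, later in zip(ranks, ranks[1:])]
        if any(gain > max_weekly_gain for gain in gains):
            suspicious.append(product)
    return suspicious


# Hypothetical example data: app_b jumps 300 places in one week.
example_ranks = {
    "app_a": [500, 480, 470],
    "app_b": [500, 200, 150],
}
flagged = flag_rapid_rank_gains(example_ranks, max_weekly_gain=100)
```

Because the threshold here is the same for all products, a promotion that is spread over more weeks would stay below it. This illustrates, in a simplified way, the tradeoff between bypassing defense schemes and making manipulation profits faster that is raised above.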