
Cold-Start Recommendation with Provable Guarantees: A Decoupled Approach
ABSTRACT
A major challenge in collaborative filtering based recommender systems is how to provide recommendations when rating data is sparse or entirely missing for a subset of users or items, commonly known as the cold-start problem. In recent years, there has been considerable interest in developing new solutions that address the cold-start problem. These solutions are mainly based on the idea of exploiting other sources of information to compensate for the lack of rating data. In this paper, we propose a novel algorithmic framework based on matrix factorization that simultaneously exploits the similarity information among users and items to alleviate the cold-start problem.
In contrast to existing methods, the proposed algorithm decouples the following two aspects of the cold-start problem:
- the completion of a rating sub-matrix, which is generated by excluding cold-start users and items from the original rating matrix; and
- the transduction of knowledge from existing ratings to cold-start items/users using side information.
This crucial difference significantly boosts the performance when appropriate side information is incorporated. We provide theoretical guarantees on the estimation error of the proposed two-stage algorithm based on the richness of similarity information in capturing the rating data. To the best of our knowledge, this is the first algorithm that addresses the cold-start problem with provable guarantees. We also conduct thorough experiments on synthetic and real datasets that demonstrate the effectiveness of the proposed algorithm and highlight the usefulness of auxiliary information in dealing with both cold-start users and items.
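The decoupling of the two stages can be illustrated with a deliberately simplified sketch. It is not the algorithm analyzed in the paper: stage one is replaced by item-mean imputation as a stand-in for low-rank matrix completion, and stage two by a similarity-weighted average as a stand-in for the transduction step. The function names `complete_submatrix` and `transduce_user` and all data values are hypothetical.

```python
from typing import List, Optional

Matrix = List[List[Optional[float]]]


def complete_submatrix(ratings: Matrix) -> List[List[float]]:
    """Stage 1 (stand-in): fill missing entries of the warm sub-matrix.

    Assumption: item-mean imputation replaces real matrix completion here.
    """
    n_items = len(ratings[0])
    col_means = []
    for j in range(n_items):
        observed = [row[j] for row in ratings if row[j] is not None]
        col_means.append(sum(observed) / len(observed) if observed else 0.0)
    return [
        [row[j] if row[j] is not None else col_means[j] for j in range(n_items)]
        for row in ratings
    ]


def transduce_user(completed: List[List[float]], similarity: List[float]) -> List[float]:
    """Stage 2 (stand-in): predict a cold-start user's ratings.

    `similarity[i]` is the side-information similarity between the cold-start
    user and warm user i; the prediction is a similarity-weighted average of
    the completed warm rows.
    """
    total = sum(similarity)
    if total == 0:
        raise ValueError("cold-start user has no similar warm users")
    n_items = len(completed[0])
    return [
        sum(s * row[j] for s, row in zip(similarity, completed)) / total
        for j in range(n_items)
    ]


# Warm sub-matrix: cold-start users and items have already been excluded.
warm = [[5.0, None], [3.0, 1.0]]
completed = complete_submatrix(warm)
prediction = transduce_user(completed, similarity=[3.0, 1.0])
```

Because the second stage only reads the output of the first, either stand-in can be swapped for a stronger method without touching the other.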
INTRODUCTION
A recommendation system is a subclass of information filtering system that predicts the “rating” or “preference” that a user would give to an item. Recommendation systems are currently in use in many application domains. A recommendation system can suggest items of interest to users based on their preferences, which may be gathered either explicitly or implicitly. In general, recommendations are based on models built from item characteristics or from the users' social environment. Recommendations can also be based on the preferences of other users with similar demographic characteristics (e.g. gender, age, occupation). The recommendation result is the outcome of a complex process that combines the attributes of items with information about users. Recommendation algorithms try to identify possible connections between items and users in order to maximize the quality of recommendation, where quality can be defined as the value of the match between a specific item and a specific user (Mili Mohan and Robert 2015). Put simply, a recommendation system predicts the likelihood that a user would prefer an item. For example, Amazon recommends books that it thinks you might like, and may well be using a recommendation system behind the scenes to do so. This simple definition covers a diverse set of applications where recommendation systems can be useful; recommending movies, music, or who to follow on Twitter are pervasive and widely known examples in the world of Information Retrieval.
Importance of Recommendation System
Recommendation systems play a major role on many sites. On Netflix, nearly two thirds of the movies watched are recommended; on Amazon, thirty-five percent of sales come from recommendations.
Need for Recommendation System
For the service provider, a recommendation system helps in deciding what kind of offerings should be made to the user; for the user, it helps in choosing from a large number of items.
Demographic Based Recommendation System
This system aims to categorize users based on their attributes and to make recommendations based on demographic details. Many industries have adopted this method because it is not complex and is easy to implement. Demographic techniques form “people-to-people” correlations, as collaborative techniques do, but use different data. The benefit of a demographic approach is that it does not require a history of user ratings.
Recommendation System Problems
Cold start is an important issue in computer-based information systems. It concerns the situation in which the system cannot draw any inferences about users or items for which it has not yet gathered sufficient information, so the recommender cannot make relevant predictions for them. It is one of the most important problems in recommendation systems. The profile of a new user or item is empty: a new user has not rated any item, so the user's taste is unknown to the system. There are two types of cold-start issues in recommendation systems:
Item Cold Start Problem
The item cold-start problem occurs when a new item enters the system. Because it is a new product, it has no user ratings (or the number of ratings is less than a threshold) and is therefore ranked at the bottom of the recommended items list. By contrast, the new-user cold-start problem occurs when a new user enters the system and the system has no prior information (ratings) from that user. Addressing this problem has been the major focus of various studies in recent years.
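The threshold criterion above can be sketched as follows. The function name `cold_start_items`, the `(user, item, rating)` tuple layout, and the default threshold of 5 are assumptions chosen for illustration; the text does not specify a threshold value.

```python
from collections import Counter
from typing import Iterable, Set, Tuple

Rating = Tuple[str, str, float]  # (user, item, rating); hypothetical layout


def cold_start_items(
    ratings: Iterable[Rating],
    catalog: Iterable[str],
    min_ratings: int = 5,  # assumed threshold, not taken from the source
) -> Set[str]:
    """Return catalog items that have fewer than `min_ratings` ratings."""
    counts = Counter(item for _, item, _ in ratings)
    # Counter returns 0 for items that were never rated.
    return {item for item in catalog if counts[item] < min_ratings}


ratings = [("u1", "book", 4.0), ("u2", "book", 5.0), ("u1", "lamp", 3.0)]
cold = cold_start_items(ratings, catalog=["book", "lamp", "pen"], min_ratings=2)
```

Here `lamp` has a single rating and `pen` has none, so both fall below the threshold of 2.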
User Cold Start Issues
When a new user enters the system, the system has no prior data about that user's preferences with which to make recommendations. Such data consist of ratings for items, which are important because they show the preferences of a specific user. Addressing this problem has been the primary focus of various studies in recent years. The new-user cold-start problem is typically tackled with clustering algorithms, similarity metrics and collaborative prediction methods. The main objective of this work is to address the new-user cold-start problem by predicting the unknown ratings of a new user based on the ratings of similar existing users.
System Configuration:
H/W System Configuration:-
Processor : Pentium IV
Speed : 1 GHz
RAM : 512 MB (min)
Hard Disk : 20GB
Keyboard : Standard Keyboard
Mouse : Two or Three Button Mouse
Monitor : LCD/LED Monitor
S/W System Configuration:-
Operating System : Windows XP/7
Programming Language : Java/J2EE
Software Version : JDK 1.7 or above
Database : MySQL
Types of Recommender Systems
Recommender systems can be categorized on several bases. In the literature, they are usually categorized on the following bases:
- Approaches used
- Area of application for which recommendation is made
- Data mining techniques applied, etc.
Based on the approaches used, RS are categorized into three types:
- Content-based recommendations,
- Collaborative recommendations and
- Hybrid recommendations.
Bobadilla et al. have suggested four categories on the basis of filtering algorithms: content-based filtering, collaborative filtering, hybrid filtering and demographic filtering. Burke [7] has identified five types of recommender systems based on the approaches used: collaborative-based, content-based, demographic-based, utility-based and knowledge-based recommendations. We have categorized eight types of recommender systems (RS). These categories broadly cover the techniques that are in widespread use or that current researchers apply frequently.
- Collaborative Filtering based recommender systems (C.F)
- Reclusive methods based recommender systems (R.M)
- Demographic Filtering based recommender systems (D.F)
- Knowledge based recommender systems (K.B)
- Hybrid Recommender systems (H.R)
- Context Aware Recommendation System (CARS)
- Social network based recommender systems
- Soft Computing techniques based Recommender Systems
Collaborative Filtering based Recommender Systems
It is the most successful and frequently used recommendation technique discussed in the literature since the appearance of the first recommender systems in the mid-1990s. The collaborative approach makes use of recommendations from other customers whose choices are similar to those of the target customer (i.e. the customer for whom the recommendation is made). Customers with similar choices are termed neighbors. Thus, two major tasks are performed in collaborative filtering:
- finding the neighbor of a customer and
- exploring the preferences of the neighbors of a target customer or user.
The neighborhood of a user can be formed by analysing the past purchasing behavior of users and calculating similarity scores between their choices. The recommendations of neighboring customers can be obtained either explicitly, as ratings that take numerical values within a specified range, or implicitly through some defined measures. Implicit recommendations also involve customer feedback, which can be behavior observed in the users' log information or sentiments expressed in their reviews.
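The two tasks above, finding neighbors and exploring their preferences, can be sketched with explicit ratings. This is a minimal illustration that assumes cosine similarity (with unrated items treated as zero) and a similarity-weighted average for prediction; the function names and the sample data are hypothetical.

```python
import math
from typing import Dict, List, Optional, Tuple

Ratings = Dict[str, Dict[str, float]]  # user -> {item: rating}


def cosine(u: Dict[str, float], v: Dict[str, float]) -> float:
    """Cosine similarity of two rating vectors; unrated items count as 0."""
    dot = sum(u[i] * v[i] for i in u.keys() & v.keys())
    norm_u = math.sqrt(sum(x * x for x in u.values()))
    norm_v = math.sqrt(sum(x * x for x in v.values()))
    if norm_u == 0.0 or norm_v == 0.0:
        return 0.0
    return dot / (norm_u * norm_v)


def nearest_neighbors(target: str, ratings: Ratings, k: int = 2) -> List[Tuple[str, float]]:
    """Task 1: the k users most similar to `target` (positive similarity only)."""
    sims = [
        (other, cosine(ratings[target], vec))
        for other, vec in ratings.items()
        if other != target
    ]
    sims = [(other, s) for other, s in sims if s > 0.0]
    sims.sort(key=lambda pair: (-pair[1], pair[0]))
    return sims[:k]


def predict(target: str, item: str, ratings: Ratings, k: int = 2) -> Optional[float]:
    """Task 2: similarity-weighted average of the neighbors' ratings for `item`."""
    voters = [
        (s, ratings[other][item])
        for other, s in nearest_neighbors(target, ratings, k)
        if item in ratings[other]
    ]
    total = sum(s for s, _ in voters)
    if total == 0.0:
        return None  # no neighbor has rated the item
    return sum(s * r for s, r in voters) / total


ratings = {
    "alice": {"a": 5.0, "b": 3.0},
    "bob": {"a": 5.0, "b": 3.0, "c": 4.0},
    "carol": {"a": 1.0, "b": 5.0, "c": 2.0},
}
neighbors = nearest_neighbors("alice", ratings, k=1)
estimate = predict("alice", "c", ratings, k=2)
```

Note that a user with no ratings has zero similarity to everyone under this scheme, which is exactly the cold-start situation described earlier.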
Reclusive Methods based Recommender Systems
As the above discussion makes clear, collaborative filtering is based on finding similarities between users and does not need any representation of the objects to be recommended. By contrast, the reclusive approach exploits the features of the objects and requires a representation of them. Reclusive methods are considered complementary to collaborative techniques: they emphasize finding similarities between objects (i.e. items) rather than between users.
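A minimal sketch of item-to-item similarity over item features follows. It assumes each item is represented by a set of descriptive tags and uses Jaccard similarity; both choices, as well as the names `jaccard` and `similar_items`, are illustrative assumptions rather than anything prescribed by the text.

```python
from typing import Dict, List, Set, Tuple


def jaccard(a: Set[str], b: Set[str]) -> float:
    """Jaccard similarity of two feature sets: |a & b| / |a | b|."""
    union = a | b
    return len(a & b) / len(union) if union else 0.0


def similar_items(
    liked: str, features: Dict[str, Set[str]], top_n: int = 3
) -> List[Tuple[str, float]]:
    """Rank the other items by feature similarity to an item the user liked."""
    scores = [
        (other, jaccard(features[liked], feats))
        for other, feats in features.items()
        if other != liked
    ]
    scores.sort(key=lambda pair: (-pair[1], pair[0]))
    return scores[:top_n]


# Hypothetical item representations.
features = {
    "m1": {"action", "scifi"},
    "m2": {"action", "scifi", "thriller"},
    "m3": {"romance"},
}
ranked = similar_items("m1", features)
```

No ratings from other users are needed here, so a newly added item can be recommended as soon as its features are known.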
Demographic Filtering based Recommender Systems
Recommender systems based on demographic filtering also use similarity measures as a metric. Instead of finding items rated similarly by neighboring users, however, they look for similarity between users' demographic information, such as age, sex and occupation. In this approach, the system stores the demographic information of its customers, and whenever a new user comes to the merchant's site to purchase a product, the system computes the similarity between the new user's demographic information and that of existing customers. The system then recommends to the new user the items preferred by existing customers of similar age, sex, occupation, etc.
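The procedure described above might look like the following sketch. The `Profile` fields mirror the attributes named in the text (age, sex, occupation), but the similarity rule (fraction of matching attributes, with ages matching when they are within five years) and all names and data are assumptions made for illustration.

```python
from dataclasses import dataclass
from typing import Dict, List, Tuple


@dataclass(frozen=True)
class Profile:
    age: int
    sex: str
    occupation: str


def demographic_similarity(a: Profile, b: Profile, age_window: int = 5) -> float:
    """Fraction of the three attributes that match (assumed rule)."""
    matches = [
        abs(a.age - b.age) <= age_window,
        a.sex == b.sex,
        a.occupation == b.occupation,
    ]
    return sum(matches) / len(matches)


def recommend_for_new_user(
    new_user: Profile,
    profiles: Dict[str, Profile],
    ratings: Dict[str, Dict[str, float]],
    top_n: int = 2,
) -> List[Tuple[str, float]]:
    """Score items by the similarity-weighted ratings of existing users."""
    weighted: Dict[str, float] = {}
    weights: Dict[str, float] = {}
    for user, profile in profiles.items():
        sim = demographic_similarity(new_user, profile)
        if sim == 0.0:
            continue  # demographically unrelated users do not contribute
        for item, value in ratings.get(user, {}).items():
            weighted[item] = weighted.get(item, 0.0) + sim * value
            weights[item] = weights.get(item, 0.0) + sim
    scored = [(item, weighted[item] / weights[item]) for item in weighted]
    scored.sort(key=lambda pair: (-pair[1], pair[0]))
    return scored[:top_n]


profiles = {"u1": Profile(25, "F", "engineer"), "u2": Profile(60, "M", "teacher")}
ratings = {"u1": {"x": 5.0, "y": 2.0}, "u2": {"x": 1.0, "y": 5.0}}
suggestions = recommend_for_new_user(Profile(27, "F", "engineer"), profiles, ratings)
```

The new user needs no rating history, which is why this family of methods is often used for cold-start users.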
Knowledge based Recommender Systems
Recommender systems owe much of their emergence to the early use of collaborative filtering methods, although a good amount of later work has used reclusive methods too. The early application of collaborative and reclusive approaches to recommendation technology has given these two techniques a distinct identity in the categorization of recommender systems. Since a recommender system is itself a knowledge-based approach, all the different categories rely on knowledge filtering techniques. Reclusive and collaborative methods are kept as separate categories because of their familiarity and dominance since the early days of recommendation technology.
Hybrid Recommender Systems
Although Collaborative Filtering (C.F) and Reclusive Methods (R.M) are the most frequently used techniques in designing Recommender Systems (RS), they do not adequately explain why specific recommendations have been made to a particular user, and so they fall short in scenarios where an explanation is required. These shortcomings of the two leading techniques can be overcome by combining them. Various combinations have been presented in the literature and are termed ‘hybrid techniques’. We have categorized seven types of hybrid recommender systems based on the different combinations:
- Hybrid Recommender Systems based on Collaborative Filtering (CF) dominated Reclusive Method (RM)
- Hybrid Recommender Systems based on RM dominated CF techniques
- Hybrid Recommender Systems based on unified RM and CF techniques
- Hybrid Recommender Systems based on Subsequent Integration of separately applied CF techniques and RM
- Hybrid Recommender Systems based on Integration of CF and RM with knowledge based system (KBS)
- Other Hybrid Recommender Systems using CF techniques
- Other Hybrid Recommender Systems using RM
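One simple way to combine the two families, shown purely as an illustration, is a weighted blend of separately computed CF and RM scores. The function name `hybrid_scores`, the weight `alpha`, and the assumption that both score sets are already normalized to the same range are hypothetical and are not taken from the literature surveyed here.

```python
from typing import Dict


def hybrid_scores(
    cf: Dict[str, float],
    rm: Dict[str, float],
    alpha: float = 0.5,
) -> Dict[str, float]:
    """Blend CF and RM scores: alpha * cf + (1 - alpha) * rm.

    Items missing from one score set contribute 0 from that side, so an item
    with no CF score (e.g. a cold-start item) can still be ranked by RM alone.
    """
    if not 0.0 <= alpha <= 1.0:
        raise ValueError("alpha must lie in [0, 1]")
    items = set(cf) | set(rm)
    return {
        item: alpha * cf.get(item, 0.0) + (1.0 - alpha) * rm.get(item, 0.0)
        for item in items
    }


cf_scores = {"a": 1.0, "b": 0.0}
rm_scores = {"a": 0.0, "b": 1.0, "c": 0.5}
blended = hybrid_scores(cf_scores, rm_scores, alpha=0.75)
```

Setting `alpha` close to 1 gives a CF-dominated hybrid and setting it close to 0 gives an RM-dominated one.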
Context Aware Recommender Systems
Although a context-aware recommender system can be perceived as a special kind of knowledge-based system, in which context is the knowledge required for recommendation, the strong inclination of the recommender system research community towards recommender systems for learning compels us to keep CARS as a separate category rather than a type of KBS. The ultimate goal of a recommender system is user satisfaction, and users can only be satisfied if they are given recommendations that meet their needs. A user's requirements are not static and may vary from time to time depending on social and other factors that affect their purchasing trends.
Social Network based Recommender Systems
RS applied in social networking environments have been extensively studied by Zhou et al., who explore the pros and cons and the opportunities of social network based RS. An overview of the Foafing the Music system is also presented. The system uses text from RDF Site Summary (RSS) and Friend of a Friend (FOAF) data. It recommends music to a user that matches his or her listening taste. Music information is collected from RSS feeds, music-related blogs, upcoming albums and ‘mp3’ audio files on different music sites. The system discovers music with the help of user profiling and of context-based information and descriptions supported by ontological details of the music domain.
Soft Computing Techniques based Recommender Systems
Soft computing techniques are increasingly used in recommender systems to support collaborative, reclusive and hybrid recommendations. To deal with the uncertainty in various business marketing affairs, Cornelis et al. used fuzzy relations to model the degree of similarity between items and users. They also proposed a novel hybrid CF–CB approach whose rationale is concisely summed up as “recommending future items if they are similar to past items that similar users have liked”. A hybrid fuzzy logic-based recommendation framework was then developed to improve the trade exhibition recommender system for e-government. Zhang et al. have developed a telecom recommender system using fuzzy techniques. The authors applied fuzzy set techniques to item-based similarity approaches for mobile product and service recommendation, and designed a system referred to as the Fuzzy-based Telecom Product Recommender System (FTCP-RS).
EXPERIMENTS
In this section, we conduct extensive experiments to demonstrate the merits and advantages of the proposed algorithm. We conduct the experiments on synthetic data and on two well-known real datasets, NIPS and MovieLens, aiming to answer the following fundamental questions:
- Prediction accuracy: How does the proposed algorithm perform in comparison to state-of-the-art algorithms that incorporate side information about users/items? And to what degree can the available side information help in making more accurate recommendations?
- Dealing with cold-start users: How does exploiting similarity relationships between users affect the performance of recommending existing items to cold-start users?
- Dealing with cold-start items: How does exploiting similarity information between items affect the performance of recommending cold-start items to existing users?
- Dealing with cold-start users and items simultaneously: How does exploiting similarity information between users and items affect the performance of recommending cold-start items to cold-start users?
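The cold-start questions above correspond to different blocks of the rating matrix. As a hedged sketch of how held-out ratings could be grouped so that each question is evaluated separately, the helper below labels every test rating by whether its user and item are cold-start. The function names, the scenario labels and the sample data are hypothetical; the actual experimental protocol is not described in this section.

```python
from typing import Dict, Iterable, List, Set, Tuple

Rating = Tuple[str, str, float]  # (user, item, rating); hypothetical layout


def scenario(user: str, item: str, cold_users: Set[str], cold_items: Set[str]) -> str:
    """Name the evaluation scenario a (user, item) pair belongs to."""
    user_cold = user in cold_users
    item_cold = item in cold_items
    if user_cold and item_cold:
        return "cold-start users, cold-start items"
    if user_cold:
        return "cold-start users, existing items"
    if item_cold:
        return "existing users, cold-start items"
    return "existing users, existing items"


def split_by_scenario(
    test_ratings: Iterable[Rating],
    cold_users: Set[str],
    cold_items: Set[str],
) -> Dict[str, List[Rating]]:
    """Group held-out ratings by scenario so each can be scored on its own."""
    groups: Dict[str, List[Rating]] = {}
    for user, item, value in test_ratings:
        key = scenario(user, item, cold_users, cold_items)
        groups.setdefault(key, []).append((user, item, value))
    return groups


test_ratings = [("u1", "i1", 4.0), ("u9", "i1", 2.0), ("u1", "i9", 5.0), ("u9", "i9", 3.0)]
groups = split_by_scenario(test_ratings, cold_users={"u9"}, cold_items={"i9"})
```

Any accuracy measure can then be computed per group, which keeps the cold-start results from being averaged away by the much larger block of existing users and items.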
CONCLUSIONS
In this paper, we proposed a novel factorization model that explicitly exploits similarity information about users and items to alleviate the corresponding cold-start problems. In contrast to existing methods such as subspace sharing and kernelized factorization methods, the proposed method decouples the completion of unobserved ratings from the transduction of knowledge to cold-start items/users. In particular, we first perform a full recovery of the sub-matrix obtained by excluding cold-start items and users, and then exploit the similarity matrices to transduce the recovered ratings to cold-start users/items. The performance of the proposed algorithm is theoretically analyzed and empirically verified on synthetic and real datasets. Our results demonstrate that the proposed decoupling idea significantly improves the quality of the recommendations and alleviates the cold-start problem when rich side information about users and items is provided.







