Network security and secure computing

A Stable Approach for Routing Queries in Unstructured P2P Networks

January 21, 2019

2591

A Stable Approach for Routing Queries in Unstructured P2P Networks

Abstract

A Stable Approach for Routing Queries in Unstructured P2P Networks,Finding a document or resource in an unstructured peer-to-peer network can be an exceedingly difficult problem. In this paper we propose a query routing approach that accounts for arbitrary overlay topologies, nodes with heterogeneous processing capacity, e.g., reflecting their degree of altruism, and heterogenous class-based likelihoods of query resolution at nodes which may reflect query loads and the manner in which files/resources are distributed across the network. The approach is shown to be stabilize the query load subject to a grade of service constraint, i.e., a guarantee that queries’ routes meet pre-specified class-based bounds on their associated a priori probability of query resolution. An explicit characterization of the capacity region for such systems is given and numerically compared to that associated with random walk based searches. Simulation results further show the performance benefits, in terms of mean delay, of the proposed approach. Additional aspects associated with reducing complexity, estimating parameters, and adaptation to class-based query resolution probabilities and traffic loads are studied.

Introduction

Peer-to-peer (P2P) systems continue to ﬁnd increasing and diverse uses as a distributed, scalable and robust framework to deliver services, e.g., ﬁle sharing, video streaming, expert/advicesharing,sensornetworks,databases,etc.Oneofthe basic functions of such systems is that of efﬁciently resolving queries or discovering ﬁles/resources. This is the problem addressed in this paper. There is a considerable body of work exploring the design of efﬁcient search/routing mechanisms in structured and unstructured P2P networks, see e.g., [1]–[10]. In structured networks, peers/ﬁles/resources are organized to form overlays with speciﬁc topologies and properties. Search mechanisms that perform name resolution based on distributed hash table (DHT) coordinate systems can be devised to achieve good forwarding-delay properties, see e.g., [2]. In such systems, the query trafﬁc may depend on how keys are assigned. So, load balancing requires proactive/reactive assignments of keys to peers and data/service objects, e.g., [11], and possibly exploiting network hierarchies [10]. Fundamentally, in such networks the difﬁculty of search/discovery is shifted to that of maintaining the structural invariants required to achieve efﬁcient query resolution particularly in dynamic settings with peer/contentchurnorwhenreactiveloadbalancingisrequired.

Unstructured networks, by contrast, are easier to setup and maintain, but their mostly ad hoc overlay topologies make realizing efﬁcient searches challenging. In a purely unstructured P2P network, a node only knows its overlay neighbors. With such limited information, search techniques for unstructured networks have mostly been based on limited-scope ﬂooding, simulated random walks, and their variants [3]–[5]. Much research in this area has focused on evaluating these search techniques based on the contact time, i.e., number of hops required to ﬁnd the target, using the spectral theory of Markov chains on (random) graphs, see e.g., [4]–[6]. Unfortunately in heterogenous settings where service capacity or resolution likelihoods vary across peers, such search techniques perform poorly under high query loads. The inefﬁciencies of purely unstructured networks can be partially addressed by hybrid P2P systems, e.g., FastTrack and Gnutella2. Such systems use a simple two-level hierarchy wheresomepeersserveas‘super-peers.’Thesearehighdegree nodes which are well connected to other super-peers and to a set of subordinate nodes in a hub-and-spoke manner [12].

Though such systems have advantages in terms of scalability, proposed search techniques are still based on variants of ﬂooding and random walks. The work of [7] proposes an approach where peers cache the outcomes of past queries as informed by reverse-path forwarding. The idea is to learn, from past experience, the best way to forward certain classes of queries, i.e., to intelligently “bias” their forwarding decisions by correlating classes of queries with neighbors who can best resolve them. This approach involves considerable overhead, is not load sensitive, and has not yet given guarantees on performance. Although, as will be clear in the sequel, our results are not exclusive to hybrid P2P networks, these will serve as the focus of the paper. We assume that each super-peer contributes a possibly heterogenous amount of processing resource for resolving queries for the network – incentives for doing so are outsideofthescopeofthispaper,seee.g.,[8],[9].Super-peers serve their subordinates by resolving queries, or forwarding them to other super-peers. Super-peers can resolve queries by checking the ﬁles/resources they have, as well as those of their subordinate community. In our approach we also introduce a notion of query classes. These might, for example, represent types of content, such as music, ﬁlms, animations, documents, or some other classiﬁcation of ﬁles/resources relevant to the application at hand.

The idea is that such a grouping of queries into classes can be used as a low overhead approach to make useful inferences on how to relay queries. Given a hybrid P2P topology and query classiﬁcation, we propose a novel query resolution mechanism which stabilizes the system for all query loads within a ‘capacity region’, i.e., the set of loads for which stability is feasible. Essentially, our policy is a biased random walk where forwarding decision for each query is based on instantaneous query loads at super-peers. To balance the load across heterogeneous superpeers, the policy aims at reducing the differential backlog at neighboring super-peers, while taking into account the class and history information to improve the query’s resolvability. Our policy draws upon standard backpressure routing algorithm, which is used to achieve stability in packet switching networks, e.g., see [13], [14]. In previously studied backpressure based systems, the goal is to deliver packets to the correspondingdestinations.Bycontrast,our aimis to provide a grade of service in resolving querieswithnoﬁxeddestinations.

The random nature of the location of query resolution in the network leads us to deal with expected queue backlog instead of current queue backlog. Further, in P2P systems, the probability of resolution of a query at a given node depends on the query’s history, i.e., the path that led it to the current node. These characteristics of P2P systems are not captured in previous works on backpressure by Tassiulas and Ephremides [13] and the subsequent enhancements, see e.g., [14]–[21]. To summarize, our approach differs from standard work on backpressure in that we incorporate the following different issues that arise in P2P search: (a) we model the uncertainty in the locations where a query may be resolved depending upon where the ﬁle/object of interest are placed, (b) we guarantee a grade of service to each query under such uncertainties, (c) we incorporate the information about a query’s resolvability available through the knowledge of its history. We also propose several natural enhancements to our backpressure based query routing policy. By contrast to previous works on backpressure such as [15]–[19] and citations therein, these enhancements are also driven by P2P query routing setting. For example, in order to reduce delays previous works developalgorithmswhichprefershorterpathsoverlongerones by explicitly accounting for the hop length of various paths [15], or by ﬁnding good routes towards destinations using artiﬁcial ‘shadow’ queues which operate at larger loads to build gradients [17]. In our P2P query routing setting the destination of a query is not known a priori. We reduce delays via a simple ‘work conserving’ policy which efﬁciently uses available resources in routing queries at each node. We further propose a state aggregation policy aimed at reducing the complexity arising from the need to track the history of currently unresolved search queries.

Our Contributions. The main contributions of this paper are as follows. We propose a query forwarding mechanism for unstructured (hybrid) P2P networks with the following properties.

1. It dynamically accounts for heterogeneity in super-peer’s ‘service rate,’ reﬂecting their altruism, and query loads across the network. To the best of our knowledge, this is the ﬁrst work to rigorously account for such heterogeneity in devising a search mechanism for P2P networks.

2. It is based on classifying queries into classes. This classiﬁcation serves as a type of name aggregation, which enables nodes to inferthelikelihoods of resolving class queries,which, in turn, are used in learning how to forward queries.

3. Our approach is fully distributed in that it involves information sharing only amongst neighbors, and achieves stability subject to a Grade of Service (GoS) constraint on query resolution. The GoS constraint corresponds to guaranteeing that each query class follows a route over which it has a reasonable ‘chance’ of being resolved.

4. We provide and evaluate several interesting variations on our stable mechanism that help signiﬁcantly improve the delay performance, and further reduce the complexity making it amenable to implementation.

Speciﬁcally, we formally show that backpressure with aggregated queues, where aggregation is based on queries’ histories, is stable for fully connected super-peer networks. This provides a basis for substantially reducing complexity by approximations, e.g., in the case where content is randomly placed.

Proposed System

Given a hybrid P2P topology and query classification, we propose a novel query resolution mechanism which stabilizes the system for all query loads within a ‘capacity region’, i.e., the set of loads for which stability is feasible.
Essentially, our policy is a biased random walk where forwarding decision for each query is based on instantaneous query loads at super-peers.
To balance the load across heterogeneous super-peers, the policy aims at reducing the differential backlog at neighboring super-peers, while taking into account the class and history information to improve the query’s resolvability.
Our policy draws upon standard backpressure routing algorithm, which is used to achieve stability in packet switching networks,
We propose a query forwarding mechanism for unstructured (hybrid) P2P networks with the following properties.
It dynamically accounts for heterogeneity in super-peer’s ‘service rate,’ reflecting their altruism, and query loads across the network. To the best of our knowledge, this is the first work to rigorously account for such heterogeneity in devising a search mechanism for P2P networks.
It is based on classifying queries into classes. This classification serves as a type of name aggregation, which enables nodes to infer the likelihoods of resolving class queries, which, in turn, are used in learning how to forward queries.
Our approach is fully distributed in that it involves information sharing only amongst neighbors, and achieves stability subject to a Grade of Service (GoS) constraint on query resolution. The GoS constraint corresponds to guaranteeing that each query class follows a route over which it has a reasonable ‘chance’ of being resolved.
We provide and evaluate several interesting variations on our stable mechanism that help significantly improve the delay performance, and further reduce the complexity making it amenable to implementation.

Advantages

Reducing complexity
Estimating parameters, and adaptation to class-based query resolution probabilities and traffic loads are studied.
Stable Policies
Estimating Query Resolution Probabilities
Alternate Grades of Service Strategies
It is based on classifying queries into classes
The GoS constraint corresponds to guaranteeing that each query class follows a route over which
It has a reasonable ‘chance’ of being resolved
This provides a basis for substantially reducing complexity by approximations

Implementation

Estimating query resolution probabilities So far we have assumed that resolution probabilities for queries of different types are known. In practice they can be easily estimated. In order to ensure unbiased estimates can be obtained at each node, suppose a small fraction of all queries is marked ‘RW’, forwarded via the random walk policy with a large TTL, and given scheduling priority over other queries. With a sufﬁciently large TTLthisensuresthateachnodewillseearandomsample of all query and types it could see and thus allow for unbiased estimates. All queries which are not marked ‘RW’ are treated according to our backpressure policy based on the estimated query resolution probabilities. A node I receives O(tE) ‘RW’ marked samples in time t. ReducingcomplexityNotunlikestandardbackpressurebased routing our policies suffer from a major drawback: each node needs to share the state of its potentially large number of non-empty queues with its neighbors. For backpressure-based routing the number of queues per node corresponds to the number of ﬂows (commodities) in the network. In our context, the number of queues per node corresponds to number of query types it could see, i.e., worst case . In this section we propose simple modiﬁcation and approximations that considerably reduce the overheads, albeit with some penalty in the performance. The key idea is to deﬁne equivalence classes of query types that share a ‘similar’ history, in the sense that they have similar conditional probabilities of resolution, and have them share a queue. For example, all query types of class c which have visited the same number of nodes k mightbegroupedtogether,reducingthenumberofqueues to (|C||N|) or better. Alternatively we will show one can further reduce overheads by approximately grouping similar query types based on their classes cand the cumulative number of class c ﬁles/resources they have seen in nodes in H(T ), reducing the number of queues to (|C|L) where L is a set of quantization levels. Intuitively such queries have seen similar opportunities if ﬁles/resources are randomely made available in the network. Alternate grades of service strategies Till now we provided grade of service based on a priori probability( ). We provide below, in brief, some alternate strategies for providing GoS which may be more suited to some applications. Multiple responses for each query: For certain applications, it may be beneﬁcial to provide multiple responses to the source generating the query. For example, for applications requiring downloading a huge ﬁle, among the available options, the client may want to choose the server which is least loaded or is physically closest.

Conclusion

A Stable Approach for Routing Queries in Unstructured P2P Networks,To summarize, we provided a novel, distributed, and reliable search policy for unstructured peer-to-peer networks with super-peers. Our backpressure based policy can provide capacity gains of as large as 68 over traditional random walk techniques. We also provided modiﬁcations to the algorithm that make it amenable to implementation.

A Stable Approach for Routing Queries in Unstructured P2P Networks

A Stable Approach for Routing Queries in Unstructured P2P Networks

Abstract

Introduction

Implementation

Conclusion

MOST POPULAR

Collapse of World Trade Center

Optimizing Cost for Online Social Networks on Geo – Distributed Clouds

Sentiment Analysis for Product Rating System

Aadhaar Secure Travel Identity Java Project

HOT NEWS

Complaint Management

Recruitment of Faculty on Account of Super Annuation

Client Management

Projects Topics & Ideas On Software Projects-2019