Department of Electrical and Computer Engineering, UCSD. 6 p.
This paper addresses a simple, yet fundamental question in the design of peer-to-peer systems: What does it mean when we say “availability” and how does this understanding impact the engineering of practical systems? We argue that existing measurements and models do not capture the complex time-varying nature of availability in today’s peer-to-peer environments. Further, we show that unforeseen methodological shortcomings have dramatically biased previous analyses of this phenomenon. As the basis of our study, we empirically characterize the availability of a large peer-to-peer system over a period of 7 days, analyze the dependence of the underlying availability distributions, measure host turnover in the system, and discuss how these results may affect the design of high-availability peer-to-peer services.
Understanding Availability
Ranjita Bhagwan, Stefan Savage and Geoffrey M. Voelker
Department of Computer Science and Engineering
University of California, San Diego

Abstract

who join and leave the system independently of their own volition. Moreover, these components can be time-varying. For example, a peer-to-peer system may replicate some file on a set of machines at some time. However, by a later time, some of those machines may be turned off as their owners go to work, returning at some later time. The availability of the hosts is therefore dependent on time of day, and hence, the availability of the file is a function of time.

Another issue is whether the availability of a host depends on the availability of another host, that is, whether host availabilities are interdependent. This issue is important since many peer-to-peer systems [4, 12] are designed on the assumption that the hosts in a random selection from a P2P network do not all fail at the same time.

Consequently, host availability is not well modeled as a single stationary distribution, but instead is a combination of a number of time-varying functions, ranging from the most transient (e.g., packet loss) to the most permanent (e.g., disk crash). Traditionally, distributed systems have assumed that transient failures are short enough to be transparently masked and only the long-term components of availability require explicit system engineering. In peer-to-peer systems, though, this abstraction is grossly insufficient. A new “intermittent” component of availability is introduced by users periodically leaving and joining the system again at a later time. Moreover, the set of hosts that comprise the system is continuously changing, as new hosts arrive in the system and existing hosts depart it permanently on a daily basis.
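To make the replication example concrete, the following is a minimal sketch of how the availability of a replicated file varies with time of day. It assumes that hosts fail independently, which is exactly the assumption questioned above, and it uses a hypothetical two-level diurnal profile. The function names and the values 0.3 and 0.7 are illustrative assumptions, not measurements from this study.

```python
def file_availability(host_up_prob, replicas):
    """Probability that at least one replica host is up.

    Assumes every host is up independently with the same probability.
    """
    return 1.0 - (1.0 - host_up_prob) ** replicas


def host_availability_at(hour):
    """Hypothetical fraction of hosts up at a given hour (0-23).

    Illustrative assumption: fewer hosts are up while owners are at work.
    """
    return 0.3 if 9 <= hour < 17 else 0.7


def file_availability_by_hour(replicas):
    """File availability for each hour of the day under the profile above."""
    return [file_availability(host_availability_at(h), replicas)
            for h in range(24)]


if __name__ == "__main__":
    for hour, avail in enumerate(file_availability_by_hour(replicas=3)):
        print(f"{hour:02d}:00  {avail:.3f}")
```

If host availabilities are correlated, for example because many hosts shut down at the same hour, the independence formula overestimates file availability, so the sketch should be read as a best case for a given replication level.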
A peer-to-peer system designed on this substrate will need to incorporate arriving hosts without much overhead, while still providing all of its promised functionality in the face of regular departures.

We were motivated to study peer-to-peer host availability in part to shape the design and evaluation of a highly available, wide-area peer-to-peer storage system [15]. A primary goal of the system is to provide efficient, highly available file storage even when the system is composed of hosts with relatively poor and highly variable availability. Even so, our results can apply to any peer-to-peer system constructed from a similar collection of hosts.

The remainder of this paper examines these issues empirically by characterizing host availability in a large deployed peer-to-peer file sharing system over a 7-day period. We make four principal contributions in this work: First, we show that a minor methodological limitation of previous

This paper addresses a simple, yet fundamental question in the design of peer-to-peer systems: What does it mean when we say “availability” and how does this understanding