[ba-ohs-talk] Why Gnutella Can't Scale. No, Really.

To: ba-ohs-talk@bootstrap.org
From: Jack Park <jackpark@thinkalong.com>
Date: Wed, 16 Jan 2002 07:23:38 -0800
Message-id: <4.2.2.20020116072036.0236c5c0@thinkalong.com>

Got this link from slashdot.
http://www.darkridge.com/~jpr5/doc/gnutella.html    (01)

It's a mathematical analysis of Gnutella.  Some of the discussion may or 
may not apply to P2P in general; I'm not sure, so I thought I'd post it here 
for folks to discuss if interested.    (02)

"In the spring of 2000, when Gnutella was a hot topic on everyone's mind, a 
concerned few of us in the open-source community just sat back and shook 
our heads. Something just wasn't right. Any competent network engineer who 
observed a running Gnutella application would tell you, through simple 
empirical observation alone, that the application was an incredible burden 
on modern networks and would probably never scale. I myself was just 
stupefied at the gross abuse of my limited bandwidth, and that was just DSL 
-- god help the dialup folks! We wondered to ourselves, Is no one paying 
attention, was no one bothered?

That summer we all saw a rush of press on Gnutella, and the rumour mill 
started churning. Most stories covering Gnutella were grossly and 
inappropriately evangelical, praising the not-yet-analyzed Gnutella as a 
technology capable of delivering on wildly fantastic promises of fully 
distributed, undeterrable, unstoppable, larger-than-life file sharing on 
the grandest scale. Many folks were convinced that Gnutella was the next 
generation Napster. Gene Kan, the first to spearhead the Gnutella 
evangelical movement, claimed in one early interview: "Gnutella is going to 
kick Napster in the pants." Later Kan admitted "Gnutella isn't perfect", 
but still went on to say that "there's no huge glaring thing missing". 
Well, something just wasn't right, and though we couldn't see it, it did 
seem pretty glaring.

We all understood the excitement. Herein was a technology that could 
potentially prove the true magnitude of Metcalfe's Law. That realization 
evoked nothing short of the phrase "holy shit!". But what I couldn't 
understand was why no one was questioning the legitimacy of these claims. 
For several months the only analyses anyone heard of practical 
implementations were generalizations and speculative comments, without much 
scientific or mathematical basis.

So I quickly got fed up, and resolved to write a research paper. Sometime 
in late March, I had begun analyzing the network structure of the Gnutella 
system, trying to find a way to gauge the capacity of a GnutellaNet in 
generalized terms, and to predict its realistic limits. What later resulted 
was a set of mathematical equations that could describe reachability, 
capacity, and bandwidth throughput. I then fed those equations into 
Mathematica to produce 3-D plots depicting, much to my own satisfaction, 
visual realizations of exactly what didn't make sense.

At about the same time, a fellow colleague in the security industry wrote a 
short paper detailing the various and flagrant insecurities inherent in 
this particular implementation of a distributed system. Seth McGann's 
security advisory titled Self-Replication Using Gnutella centered on the 
characteristics an Internet Worm inside a GnutellaNet could thrive from, 
and also touched on a few other flaws that would be useful to an attacker. 
His advisory posted in May of 2000, and unfortunately went mostly unnoticed 
(or misunderstood, because of its technical nature).
  "    (03)
