Date of Completion


Embargo Period



Distributed Algorithms, Network Supercomputing, Resource Discovery, Self-Stabilizing Algorithms, Fault Tolerance

Major Advisor

Alexander A. Shvartsman

Associate Advisor

Alexander Russell

Associate Advisor

Laurent Michel

Field of Study

Computer Science and Engineering


Doctor of Philosophy

Open Access


Massive distributed cooperative computing in networks involves marshaling large collections of network nodes possessing the necessary computational resources. The computing power of these resources is used to solve partitionable, computation-intensive problems. This is referred to as Internet, or network, supercomputing. Traditional approaches to Internet supercomputing employ a master processor and many worker processors that execute a collection of tasks on behalf of the master. Despite the simplicity and advantages of centralized schemes, the master processor is a performance bottleneck and a single point of failure. Additionally, a phenomenon of increasing concern is that workers may return incorrect results.

In this thesis, we present algorithms for the problem of network supercomputing that eliminate the master and instead use a decentralized approach, where workers cooperate in performing the tasks. The problem is studied under a variety of failure models, and all algorithms are designed to deal with undependable and crash-prone workers. Additionally, we present an algorithm that estimates the reliability of workers.

In order for willing nodes to act in a concerted way in decentralized systems, they must first discover one another. This is the general setting of the Resource Discovery Problem (RDP), and it serves as a building block for any kind of decentralized collaborative computing.

For the resource discovery problem, this thesis explores solutions that can cope with intermittent failures; in particular, we design self-stabilizing algorithms that solve the resource discovery problem in a deterministic synchronous setting.