Garbage Collection of Timestamped Data in Stampede

Rishiyur S. Nikhil and Umakishore Ramachandran

To appear at Nineteenth Annual ACM SIGACT-SIGOPS Symposium on PRINCIPLES OF DISTRIBUTED COMPUTING (PODC 2000), Portland, Oregon, 16-19 July 2000


Abstract

Stampede is a parallel programming system to facilitate the programming of interactive multimedia applications on clusters of SMPs. In a Stampede application, a variable number of threads can communicate data items to each other via "channels", which are distributed, synchronized data structures containing timestamped data such as images from a video camera. Threads may produce and consume channel items out of timestamp order; they may produce and consume items sparsely (skipping timestamps), and multiple threads (including newly created threads) may consume an item in a channel. These flexibilities are required due to the complex dynamic parallel structure of applications, to support increased parallelism, and because of real-time requirements. Under these circumstances, a key issue is: "When can an item in a channel be garbage collected?" In this paper we specify precisely Stampede's semantics concerning timestamps, and derive two associated garbage collection conditions-- a weak condition, and a computationally more expensive but stronger condition. We describe a distributed, concurrent algorithm that implements these two GC conditions. We present performance numbers that show little or no application-level performance penalty to using this algorithm for aiding automatic garbage collection in a cluster of SMPs. We conclude with some thoughts on future work.