Saturday, June 17th
Tutorial: High-Level Executable Specification and Reasoning for Improving Distributed Algorithms (in conjunction with PLDI)
Organizers: Yanhong Annie Liu and Scott Stoller
Monday, June 19th
Workshop: Advanced Tools, Programming Languages, and PLatforms for Implementing and Evaluating algorithms for Distributed systems (ApPLIED)
Organizers: Yanhong Annie Liu and Elad Michael Schiller
Tutorial: Fault-Tolerant Distributed Optimization and Learning
Organizers: Lili Su and Nitin H. Vaidya
Abstract: Consider a set of agents wherein each agent has its own “local” cost function that depends on some parameters of interest. The goal for the agents is to determine a parameter vector that minimizes the aggregate or “global” cost over all the agents. For example, in the context of machine learning, the local cost function of an agent may represent the “loss function” corresponding to the agent’s local dataset. Similar optimization problems arise in the context of many other applications as well.
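In symbols, the problem described above can be written as follows. The notation is an assumption introduced here for illustration and does not come from the abstract: n is the number of agents, Q_i is the local cost function of agent i, and x is the parameter vector.

```latex
x^{*} \in \arg\min_{x \in \mathbb{R}^{d}} \; \sum_{i=1}^{n} Q_i(x)
```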
This tutorial will address distributed optimization algorithms for such cost functions, with an emphasis on fault-tolerant algorithms for distributed optimization. Intuitively, the goal of fault-tolerant optimization considered here is to solve the problem “correctly” despite the presence of some Byzantine faulty (or adversarial) agents. Such faulty agents may supply incorrect information during the optimization procedure. The first step towards achieving this goal is to define what a “correct” outcome should be in the presence of Byzantine faulty agents. The tutorial will define an ideal outcome, and discuss algorithms that may compute the ideal outcome exactly or approximately, depending on the assumptions that can be made about the local cost functions of the agents.
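As a rough illustration of the idea, the sketch below runs gradient descent on scalar quadratic costs while filtering reported gradients with a trimmed mean, one commonly studied robust aggregation rule. It is not necessarily the approach presented in the tutorial. The function names, the quadratic costs (x - a_i)^2, the step size, and the fault bound f are all assumptions made for this example.

```python
def trimmed_mean(values, f):
    """Drop the f smallest and f largest values, then average the rest."""
    if len(values) <= 2 * f:
        raise ValueError("need more than 2f values to trim f from each end")
    kept = sorted(values)[f:len(values) - f]
    return sum(kept) / len(kept)


def robust_gradient_descent(targets, faulty_reports, f, steps=200, lr=0.1):
    """Minimize the sum of (x - a)^2 over honest targets a (hypothetical costs).

    Honest agents report the true gradient 2 * (x - a) of their local cost.
    Faulty agents report the fixed, arbitrary values in faulty_reports.
    """
    x = 0.0
    for _ in range(steps):
        honest = [2.0 * (x - a) for a in targets]
        reports = honest + list(faulty_reports)
        # Filter out extreme reports before taking the descent step.
        x -= lr * trimmed_mean(reports, f)
    return x


if __name__ == "__main__":
    honest_targets = [1.0, 2.0, 3.0, 4.0, 5.0]
    estimate = robust_gradient_descent(honest_targets, faulty_reports=[1e6], f=1)
    print(estimate)
```

In this example the estimate stays within the range spanned by the honest agents' minimizers even though one report is wildly wrong. It need not coincide with the minimizer of the honest agents' aggregate cost, which is one reason a precise definition of a "correct" outcome is needed.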
The tutorial presentation will not assume any background in optimization or Byzantine fault-tolerance.
Friday, June 23rd
Workshop: Biological Distributed Algorithms (BDA)
Organizers: Arjun Chandrasekhar, Frederik Mallmann-Trenn, Yannic Maus, and Ted Pavlic
Tutorial: Cryptography in distributed protocols
Organizers: Rotem Oshman and Vinod Vaikuntanathan
Abstract: The tutorial will cover several basic cryptographic primitives and ideas, and focus on the ways in which computational assumptions allow us to circumvent fundamental impossibility results in distributed computing. Topics covered will include:
- Byzantine agreement and leader election (using signatures and verifiable random functions),
- Secure multiparty computation (using secret-sharing and oblivious transfer),
- Distributed certification of graph properties (using collision-resistant hash functions and succinct non-interactive arguments for P).
No prior background is assumed.
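To give a flavor of one primitive named in the list above, here is a minimal sketch of additive secret sharing, a simple building block used in secure multiparty computation. It is an illustration only and not taken from the tutorial. The modulus, the function names, and the choice of the additive scheme (rather than, say, a threshold scheme) are assumptions.

```python
import secrets

PRIME = 2**61 - 1  # a Mersenne prime, used as the modulus (assumed choice)


def share(secret, n):
    """Split secret into n additive shares modulo PRIME.

    Any n - 1 shares are uniformly random and reveal nothing about the secret.
    """
    if n < 2:
        raise ValueError("need at least two parties")
    if not 0 <= secret < PRIME:
        raise ValueError("secret must lie in [0, PRIME)")
    shares = [secrets.randbelow(PRIME) for _ in range(n - 1)]
    last = (secret - sum(shares)) % PRIME
    return shares + [last]


def reconstruct(shares):
    """Recover the secret by summing all shares modulo PRIME."""
    return sum(shares) % PRIME


def add_shares(a, b):
    """Each party adds its two shares locally; the result shares the sum."""
    return [(x + y) % PRIME for x, y in zip(a, b)]


if __name__ == "__main__":
    salary_a = share(5200, 3)
    salary_b = share(4800, 3)
    total = reconstruct(add_shares(salary_a, salary_b))
    print(total)
```

The parties can compute a sum of private inputs this way without any one of them seeing another's input, because each party only ever handles its own shares.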
Tutorial: Distributed computing with live streaming data
Organizers: Adrian Kosowski, Krzysztof Nowicki, and Przemyslaw Uznanski (Pathway.com)
Abstract: “The only constant thing in life is change”. A lot of modern data processing applications work with data streams and changing data inputs, and their objective is to provide up-to-date outcomes with low latency at high data throughput.
In this tutorial, we look at how to design dynamic algorithms in a systematic way, and how to implement them in an actual distributed streaming system. A major challenge here is designing dynamic algorithms that are ready for different input data scenarios: data streams with insertions, deletions, out-of-order arrival of data, backfilling, etc.
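The input scenarios above can be made concrete with a small sketch in plain Python. It does not use the API of Pathway, Flink, Spark, or Kafka Streams. It maintains vertex degrees of a graph under a stream of signed edge updates, where +1 is an insertion and -1 is a deletion. The update format and the function name are assumptions made for this example.

```python
from collections import Counter


def apply_updates(degree, updates):
    """Incrementally maintain vertex degrees under signed edge updates.

    Each update is ((u, v), diff), with diff = +1 for an edge insertion
    and diff = -1 for an edge deletion. Only the touched vertices are
    updated, so the work is proportional to the size of the change.
    """
    for (u, v), diff in updates:
        for node in (u, v):
            degree[node] += diff
            if degree[node] == 0:
                del degree[node]  # keep the state free of stale entries
    return degree


if __name__ == "__main__":
    stream = [
        (("a", "b"), +1),
        (("b", "c"), +1),
        (("a", "b"), -1),  # the edge a-b is later retracted
        (("c", "d"), +1),
    ]
    in_order = apply_updates(Counter(), stream)
    out_of_order = apply_updates(Counter(), list(reversed(stream)))
    print(dict(in_order))
    print(in_order == out_of_order)
```

Because every update is an additive difference, the final state does not depend on arrival order, which is what makes this particular computation tolerant of out-of-order data and backfilling. Iterative graph algorithms generally need more machinery than this.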
We center the discussion around designing iterative graph algorithms for time-changing data. For this task, we provide examples of code in industry-standard frameworks (Apache Flink, Spark Structured Streaming, Kafka Streams), as well as in Pathway. Pathway is a new, performant data processing framework for bounded and unbounded data streams, equipped with a Table API in Python and powered by a distributed incremental dataflow in Rust. It is particularly well suited for implementing “local-type” algorithms.
In the course of a hands-on code tutorial, you will learn how to make a fully functional streaming application. We will write an unsupervised graph learning algorithm, quickly integrate data sources, and present the outputs. When the application is deployed in streaming mode, it will take care of updating classification outcomes automatically as new data arrives.
We will close with some remarks on consistency and correctness promises that can be asked of distributed streaming systems when executing dynamic algorithms.