Microsoft Research
Oracle Labs

PODC Social Network Workshop



A New Perspective on Vertex Connectivity
Keren Censor-Hillel

Vertex and edge connectivity are fundamental graph-theoretic concepts, as they give measures for flows and robustness. While there is extensive knowledge in the literature about edge connectivity, using the essential tool of edge-disjoint spanning trees, much less is known about vertex connectivity.

In this talk I will argue that CDS (connected dominating set) partitions and packings are the natural analogues of edge-disjoint spanning trees, focusing on vertex-disjointness rather than edge-disjointness. The talk will describe new results that constructively show CDS packings and partitions with tight and almost tight sizes, respectively. The strength of our approach is illustrated by applying our construction to analyze the vertex-connectivity of a graph induced by a random sample of vertices of a k-vertex-connected graph.

As an additional important application, we show CDS packings to be tightly related to the throughput of routing-based algorithms and use our new toolbox to yield a routing-based broadcast algorithm with optimal throughput. I will describe the consequences of the above for information dissemination in social networks.

Based on joint work with Mohsen Ghaffari and Fabian Kuhn.


Aggregating information from a crowd
Anirban Dasgupta

Crowdsourcing is now widely used to replace judgement or evaluation by an expert authority with an aggregate evaluation from a number of non-experts, in applications ranging from rating and categorizing online content all the way to evaluation of student assignments in massively open online courses (MOOCs) via peer grading. A key problem in these settings is how to aggregate these evaluations, thereby giving an accurate estimate of the ground truth, given that the agents are of varying, unknown, expertise.

In this talk, we consider a model to formalize this question of aggregating information collected from a crowd. We first present a simple model where tasks are binary and each agent has an unknown fixed, reliability that determines the agent’s error rate in performing tasks. The problem is to determine the truth values of the tasks solely based on the agent evaluations. We present algorithms whose error guarantees depend on the expansion properties of the agent-task graph. We next discuss generalizations of this model by using gaussian distributions to capture tasks with continuous feedback, and present initial results.


Tight bounds for rumor spreading with graph expansion
George Giakkoupis

Rumor spreading is a basic randomized mechanism for information dissemination in networks. Each node periodically contacts a random neighbor, and the two nodes exchange any information they currently have.
Rumor spreading provides a simple, scalable, and robust algorithm for message broadcasting, which is particularly relevant for large, unknown or dynamically changing networks. Further, it is interesting from a sociological perspective, as it provides a simple model for how information, rumors, or ideas spread in social networks.

In this talk I will present some results that relate the speed of rumor spreading, that is, how quickly information spreads from a single source to all nodes in the networks, with standard expansion measures of the network.


“Going Viral” and the Structure of Online Diffusion
Sharad Goel

New products, ideas, norms and behaviors are often thought to propagate through a person-to-person diffusion process analogous to the spread of an infectious disease. Until recently, however, it has been prohibitively difficult to directly observe this process, and thus to rigorously quantify or characterize the structure of information cascades. In one of the largest studies to date, we describe the diffusion structure of billions of events across several domains. We find that the vast majority of cascades are small, and are described by a handful of simple tree structures that terminate within one degree of an initial adopting “seed.” While large cascades are extremely rare, the scale of our data allows us to investigate even the one-in-a-million events. To study these rare, large cascades, we develop a formal measure of what we label “structural virality” that interpolates between two extremes: content that gains its popularity through a single, large broadcast, and that which grows via a multi-generational cascade where any one individual is directly responsible for only a fraction of the total adoption. We find that the very largest observed events nearly always exhibit high structural virality, providing some of the first direct evidence that many of the most popular products and ideas grow through person-to-person diffusion. However, medium-sized events — having thousands of adopters — exhibit surprising structural diversity, and are seen to grow both through broadcast and viral means. Finally, we show that our empirical results are largely consistent with an SIR model of contagion on a scale-free network, reminiscent of previous work on the long-term persistence of computer viruses.


Title: TBA
Ravi Kumar
Abstract: TBA


Local Algorithms and Large Scale Graph Clustering
Silvio Lattanzi

Motivated by applications of large-scale graph clustering, we study random-walk-based local algorithms whose running times depend only on the size of the output cluster, rather than the entire graph. In particular, we develop a method with better theoretical guarantee compared to all previous work, both in terms of the clustering accuracy and the conductance of the output set. We also prove that our analysis is tight, and perform empirical evaluation to support our theory on both synthetic and real data. More specifically, our method outperforms prior work when the cluster is well-connected. In fact, the better it is well-connected inside, the more significant improvement we can obtain. Our results shed light on why in practice some random-walk-based algorithms perform better than its previous theory, and help guide future research about local clustering.

Joint work with: Zeyuan Allen Zhu, Vahab Mirrokni.


Large-scale graph (overlapping) clustering in MapReduce and Beyond
Vahab Mirrokni

I will discuss recent results about large-scale (overlapping) clustering of huge graphs. I will start by defining an overlapping clustering problem, will describe an approximation algorithm for the problem, and then discuss how to compute such clusters in large scale for big graphs. To achieve this goal, I will discuss computing graph clustering, and connected components in big graphs in distributed frameworks like MapReduce and a new asynchronous graph mining framework called ASYMP.


Title: TBA
Sandeep Pandey
Abstract: TBA


Network Discovery via Gossip: Analyzing Dynamically Evolving Distributed and Social Networks
Gopal Pandurangan

We study dynamic evolution of large-scale distributed networks like peer-to-peer or social networks that evolve by discovering new neighbors via localized gossip.

A well-studied problem in peer-to-peer networks is the resource discovery problem. There, the goal for nodes (hosts with IP addresses) is to discover the IP addresses of all other hosts. In social networks, nodes (people) discover new nodes through exchanging contacts with their neighbors (friends). In both cases the discovery of new nodes changes the underlying network – new edges are added to the network – and the process continues in the changed network. Rigorously analyzing such dynamic (stochastic) processes with a continuously self-changing topology remains a challenging problem with obvious applications.

We present and analyze two natural gossip-based discovery processes in dynamically changing networks. In the push process, each node repeatedly chooses two random neighbors and puts them in contact (i.e., “pushes” their mutual information to each other). In the pull discovery process, each node repeatedly requests or “pulls” a random contact from a random neighbor. Both processes are lightweight, local, and naturally robust due to their randomization.

Our main result is an almost-tight analysis of the time taken for these two randomized processes to converge. We show that in any undirected n-node graph both processes take O(n log^2 n) rounds to connect every node to all other nodes with high probability, whereas Omega(n log n) is a lower bound. In the directed case we give an O(n^2 log n) upper bound and an Omega(n^2) lower bound for strongly connected directed graphs. A key technical challenge that we overcome is the analysis of a randomized process that itself results in a constantly changing network which leads to complicated dependencies in every round.

Our processes are simple, lightweight, and easy to implement, and can be used for efficient resource discovery in distributed networks. Our processes can also give insight into the topology evolution of real-social networks such as LinkedIn and can help in designing algorithms and data structures to search and navigate such networks.

This is joint work with B. Haeupler, D. Peleg, R. Rajaraman, and Z. Sun.


Title: TBA
Bo Pang
Abstract: TBA


“Small World Navigability – Limits of Efficiency”
Philipp Woelfel

In 2000, Kleinberg introduced his seminal Small World Graph Model to study the performance of decentralized routing in social networks. Applications such as peer-to-peer networks are based on Kleinberg’s observation that a simple greedy routing algorithm can perform efficiently in randomly augmented grid and ring networks. In this talk I will discuss recent lower bound results for greedy routing in such augmented networks.