RANDOM GRAALS 2002: Random(ized) Graphs and Algorithms
27-31 May 2002

| | Mon 27 | Tue 28 | Wed 29 | Thu 30 | Fri 31 | Sat 1 |
|---|---|---|---|---|---|---|
| 08.00-09.00 | breakfast | breakfast | breakfast | breakfast | breakfast | |
| 09.15-10.15 | Joel Spencer | Uri Zwick | Eli Upfal | Luca Becchetti | jam session | departures |
| 10.15-10.30 | coffee break | coffee break | coffee break | coffee break | | |
| 10.30-11.30 | Aravind Srinivasan | Micah Adler | Alan Frieze | Leonard Schulman | | |
| 12.00-13.15 | lunch! | lunch! | lunch! | lunch! | lunch! | |
| 14.00-15.00 | Dana Randall | Berthold Vöcking | Muthu Muthukrishnan | Ashish Goel | departures | |
| 15.00-16.00 | Eric Vigoda | Jirka Matousek | jam session | Alessandro Panconesi | | |
| 16.00-16.15 | coffee | coffee | | jam session | | |
| 16.15-17.15 | Michael Mitzenmacher | Jaikumar Radhakrishnan | | | | |
| Arrival: | Sunday 26 May 2002 |
|---|---|
| Departure: | Friday 31 May - Saturday 1 June 2002 |
Bertinoro itself is picturesque, with many narrow streets and walkways winding around the central peak. The meeting will be held in a redoubtable ex-Episcopal fortress that has been converted by the University of Bologna into a modern conference center with computing facilities and Internet access. From the fortress you can enjoy a beautiful vista that stretches from the Tuscan Apennines to the Adriatic coast.
| Scientific Organizing Committee | Michael Mitzenmacher, Harvard University |
|---|---|
| | Alessandro Panconesi, University of Rome La Sapienza |
| | Aravind Srinivasan, University of Maryland, College Park |
| Local Organization | Andrea Bandini, Elena Della Godenza, Centro Congressi di Bertinoro |
| Sponsored by | BICI (Bertinoro International Center for Informatics) |
Tradeoffs in Probabilistic Packet Marking for IP Traceback
by Micah Adler, University of Massachusetts
There has been considerable recent interest in probabilistic
packet marking schemes for the problem of tracing a sequence
of network packets back to an anonymous source. An important
consideration for such schemes is b, the number of packet
header bits that need to be allocated to the marking
protocol. However, all previous schemes belong to a class of
protocols for which b must be at least log(n), where n is
the number of bits used to represent the path of the
packets. In this talk, we introduce a new marking technique
for tracing a sequence of packets sent along the same
path. This new technique is effective even when b=1. In
other words, the sequence of packets can be traced back to
their source using only a single bit in the packet
header. With this scheme, the number of packets required to
reconstruct the path is O(2^(2n)), but we also show that
Omega(2^n) packets are required for any protocol where b=1.
We also study the tradeoff between b and the number of
packets required. We provide a protocol and a lower bound
that together demonstrate that for the optimal protocol, the
number of packets required (roughly) increases exponentially
with n, but decreases doubly exponentially with b. The
protocol we introduce is simple enough to be useful in
practice. We also study the case where the packets are sent
along k different paths. For this case, we demonstrate that
any protocol must use at least log(2k-2) header bits. We
also provide a protocol that requires log(2k+1) header bits
in some restricted scenarios (to which the lower bound
applies). This protocol introduces a new coding technique
that may be of independent interest.
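The abstract does not spell out the b=1 protocol itself. As a rough illustration of the probabilistic-marking setting only, here is the classic single-field "node sampling" scheme (a different, simpler technique than the talk's): each router overwrites the mark with some probability, and the victim orders routers by how often their marks survive. Router names and parameters below are made up for the sketch.

```python
import random
from collections import Counter

def node_sampling_marks(path, p, num_packets, rng):
    """Each router on `path` (source first, victim last) overwrites the
    packet's single mark field with its own id with probability p."""
    marks = []
    for _ in range(num_packets):
        mark = None
        for router in path:
            if rng.random() < p:
                mark = router
        if mark is not None:
            marks.append(mark)
    return marks

def reconstruct_path(marks):
    """Routers nearer the victim overwrite later and so survive more
    often; order the observed ids by decreasing frequency."""
    return [r for r, _ in Counter(marks).most_common()]

rng = random.Random(0)
path = ["R1", "R2", "R3", "R4"]  # R4 is the router next to the victim
marks = node_sampling_marks(path, p=0.5, num_packets=20000, rng=rng)
print(reconstruct_path(marks))  # expected order: R4, R3, R2, R1
```

With p = 1/2, router R4 survives in about half the packets, R3 in a quarter, and so on, so the frequency ordering recovers the path from the victim outward. This scheme needs a full router id per mark, which is exactly the log(n)-bit cost the talk's b=1 technique avoids.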
Phase Transitions for Random Processes
by Joel Spencer, Courant Institute
It has been forty years since the discovery by Paul Erd\H{o}s and Alfred R\'enyi that the random graph undergoes a phase transition (they called it: double jump) near $p=\frac{1}{n}$, many small components rapidly melding into the ``giant component''. We now view this as a percolation effect, with classical connections to Mathematical Physics. In recent years a host of random processes coming from the fecund intersection of Discrete Math, Computer Science and Probability have shown strong percolation behavior. A particular set of examples involves what are called Achlioptas processes: each time unit two random edges are generated on $n$ vertices. Precisely one of them must be added to an evolving graph, initially empty. For example, add the first edge if isolated, otherwise add the second. There appears to be an explicit $t_0$ such that near time $t_0\frac{n}{2}$ a giant component suddenly emerges, but there are many open questions about the nature of the process near that critical time.
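The process above is easy to simulate with a union-find structure. The sketch below uses one reading of the example rule (take the first edge if both of its endpoints are still isolated, otherwise the second) and tracks the largest component; the choice of $n$ and the seed are arbitrary.

```python
import random

def achlioptas_process(n, steps, rng):
    """Achlioptas process: each step two random edges are offered and
    exactly one is added to the evolving graph.  Rule used here (one
    reading of the abstract's example): take the first edge if both its
    endpoints are still isolated, otherwise take the second."""
    parent = list(range(n))
    size = [1] * n
    degree = [0] * n

    def find(x):
        while parent[x] != x:
            parent[x] = parent[parent[x]]  # path halving
            x = parent[x]
        return x

    def add_edge(a, b):
        degree[a] += 1
        degree[b] += 1
        ra, rb = find(a), find(b)
        if ra == rb:
            return size[ra]
        if size[ra] < size[rb]:
            ra, rb = rb, ra
        parent[rb] = ra          # union by size
        size[ra] += size[rb]
        return size[ra]

    largest, history = 1, []
    for _ in range(steps):
        e1 = rng.sample(range(n), 2)
        e2 = rng.sample(range(n), 2)
        a, b = e1 if degree[e1[0]] == 0 and degree[e1[1]] == 0 else e2
        largest = max(largest, add_edge(a, b))
        history.append(largest)
    return history

rng = random.Random(1)
n = 10000
hist = achlioptas_process(n, n, rng)  # n steps, i.e. time t = 2
print(hist[n // 4 - 1], hist[-1])    # largest component at t = 0.5 and t = 2
```

At $t = 0.5$ the largest component is still tiny, while by $t = 2$ (past the critical time for this rule) a giant component occupies a constant fraction of the vertices.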
Non-Clairvoyant Scheduling to Minimize the
Total Flow Time on Single and Parallel Machines
by Luca Becchetti, La Sapienza
Scheduling a sequence of jobs released over time when the processing time of a job is only known at its completion is a classical problem in CPU scheduling in time sharing operating systems. A widely used measure for the responsiveness of the system is the average flow time of the jobs, i.e. the average time spent by jobs in the system between release and completion.
The scheduling policies of the Windows NT and Unix operating systems are based on the Multi-Level Feedback algorithm. In our work we proved that a randomized version of the Multi-Level Feedback algorithm is competitive for single and parallel machine systems, providing a theoretical validation of an idea that has proven effective in practice over the last two decades.
Joint work with Stefano Leonardi
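For concreteness, here is a minimal single-machine sketch of deterministic Multi-Level Feedback (the talk analyzes a randomized variant, not reproduced here). A job starts in queue 0 and gets a quantum of $2^i$ in queue $i$; lower queues have priority. As a simplification, a granted quantum is not preempted by new arrivals.

```python
import heapq

def mlf_total_flow_time(jobs):
    """Jobs is a list of (release_time, processing_time).  Queue level i
    grants quantum 2^i; a job that exhausts its quantum is demoted to
    level i+1.  Simplification: quanta are not preempted by arrivals.
    Returns the total flow time, sum over jobs of (completion - release)."""
    jobs = sorted(jobs)
    t, i, seq, flow = 0.0, 0, 0, 0.0
    ready = []  # heap of (level, arrival_order, release, remaining)
    while i < len(jobs) or ready:
        if not ready:                      # machine idle: jump to next release
            t = max(t, jobs[i][0])
        while i < len(jobs) and jobs[i][0] <= t:
            heapq.heappush(ready, (0, seq, jobs[i][0], float(jobs[i][1])))
            i += 1
            seq += 1
        level, s, rel, rem = heapq.heappop(ready)
        run = min(2.0 ** level, rem)       # run one quantum (or finish)
        t += run
        if rem - run > 1e-12:
            heapq.heappush(ready, (level + 1, s, rel, rem - run))
        else:
            flow += t - rel
    return flow

jobs = [(0, 1), (0, 8), (1, 1), (2, 2)]
print(mlf_total_flow_time(jobs))
```

Note how the long job (size 8) is repeatedly demoted, so the short jobs released later still finish quickly; this is the mechanism behind MLF's good average flow time without knowing job sizes in advance.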
Probabilistic Analysis of Symmetric Travelling Salesman Problems
by Alan Frieze, CMU
Let the edges of the complete graph $K_n$ be assigned independent uniform $[0,1]$ random edge weights. Let $Z_{TSP}$ and $Z_{2FAC}$ be the weights of the minimum length travelling salesman tour and minimum weight 2-factor respectively. We show that with high probability $|Z_{TSP}-Z_{2FAC}|=o(1)$. The proof is via the analysis of a polynomial time algorithm that finds a tour only a little longer than $Z_{2FAC}$.
Unique-sink orientations of cubes, a talk in two-parts: Part I
by Emo Welzl, ETH
There is an ongoing effort to design a provably fast method for solving Linear Programs and related optimization problems by combinatorial methods in the unit cost model (as opposed to the bit model, where polynomial methods are known). Among others, this research has brought forth randomized methods with subexponential running time, but also a number of interesting combinatorial models like Abstract Objective Functions (AOF), Unimodal Numberings, LP-type Problems, Abstract Optimization Problems (AOP), Unique Sink Orientations of Cubes (USO), et cetera. After some general introduction we will concentrate on the USO model.
Unique-sink orientations of cubes, a talk in two-parts: Part II
by Jirka Matousek, Charles University
The $n$-cube is considered as a graph (with vertex set $\{0,1\}^n$). A unique-sink orientation (USO) is an orientation of the edges of the $n$-cube such that every face of the cube has exactly one sink (directed cycles are allowed). Such orientations arise from several sources, such as linear programs (considering a generic linear function on a polytope isomorphic to the cube), certain linear complementarity problems, and certain convex programs. Algorithms have been studied for finding the global sink in a USO; the USO is specified by an oracle that, given a vertex, returns the orientation of the edges incident to that vertex. Upper and lower bounds for the complexity of such algorithms have recently been given by Welzl, Szab\'o, and Schurr; improving them significantly is the main challenge. The speaker has proved that the number of USOs is $2^{\Theta(2^n \log n)}$. The number of acyclic USOs is easily seen to be between $2^{\Omega(2^n)}$ and $2^{O(2^n\log n)}$; it would be nice to find better bounds.
Simultaneous Optimization with Concave and Convex Costs
by Ashish Goel, University of Southern California
In simultaneous optimization, we need to simultaneously approximate a whole
family of objective functions. In this talk, we will consider two problems:
a) Suppose we want to construct a tree for sending information from a
single source to k sinks. Further, suppose that the cost of sending
x units of information over a link is proportional to f(x), where f is concave,
f(0) = 0, and f is non-decreasing. This problem is known as single source
buy-at-bulk network design. We will present a simple randomized algorithm
that guarantees an O(log k) approximation to the cheapest tree for all such
functions f, simultaneously.
b) Suppose we want to assign jobs to machines, and the cost of an
allocation is some symmetric convex function of the loads on individual
machines. We will present a general framework for approximating all such
functions simultaneously, and apply it to the case when job sizes are
random variables.
(b) represents joint work with Adam Meyerson.
Building bipartite graphs with smaller bipartite graphs
by Jaikumar Radhakrishnan, TIFR Bombay
We say that a bipartite graph $G=(V,W,E)$ has an independent set of
size $k$ if there exist sets $S\subseteq V$ and $T \subseteq W$ of
size $k$ each with no edge connecting them.
K\H{o}vari, S\'{o}s and Tur\'{a}n proved that if $G$ has no
independent set of size $k$ ($n^{\epsilon} \leq k \leq
n^{1-\epsilon}$), then it has $\Omega(\frac{n^2}{k}\log n)$ edges.
We consider a generalization of this.
Given $n_1,n_2,\ldots,n_r$, we wish to construct a bipartite graph
$G$ (with vertex sets of size $n$) by putting together copies of
$K_{n_i,n_i}$ such that $G$ has no independent set of size $k$. The
theorem of K\H{o}vari, S\'{o}s and Tur\'{a}n refers to the situation
when all $n_i$ are $1$, and implies the following
necessary condition for our generalization:
\begin{equation}
\sum_{i} n_i^2 = \Omega(\frac{n^2}{k}\log n). \label{eq:kst}
\end{equation}
On the other hand, if we put copies of $K_{n_i,n_i}$ together at random,
the resulting graph has no independent set of size $k$ with high
probability provided
\begin{equation}
\sum_{n_i < k} \left(\frac{n_i}{n/k}\right)^2 + \sum_{n_i \geq k}
\left(\frac{n_i}{n/k}\right) \geq ck \log n. \label{eq:suff}
\end{equation}
Note that (\ref{eq:kst}) explains the first term on the left hand
side of (\ref{eq:suff}) but not the second.
We will show that the sufficient condition (\ref{eq:suff}) is also
necessary, and derive from this a lower bound on the size of depth-two
superconcentrators.
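On a toy scale the random construction can be tried directly: place copies of $K_{n_i,n_i}$ at uniformly random positions and brute-force check whether an independent set of size $k$ survives. The parameters below are illustrative only.

```python
import itertools
import random

def random_biclique_graph(n, block_sizes, rng):
    """For each n_i, choose a random n_i-subset S of V and a random
    n_i-subset T of W, and add the complete bipartite graph S x T."""
    adj = [set() for _ in range(n)]  # adj[v] = neighbours of v in W
    for ni in block_sizes:
        S = rng.sample(range(n), ni)
        T = rng.sample(range(n), ni)
        for v in S:
            adj[v].update(T)
    return adj

def has_independent_set(adj, n, k):
    """Brute force: are there S in V, T in W, each of size k, with no
    edge between them?  T exists iff at least k vertices of W avoid
    every neighbourhood of S."""
    for S in itertools.combinations(range(n), k):
        covered = set().union(*(adj[v] for v in S))
        if n - len(covered) >= k:
            return True
    return False

rng = random.Random(3)
n, k = 12, 3
adj = random_biclique_graph(n, [2] * 60, rng)  # 60 random copies of K_{2,2}
print(has_independent_set(adj, n, k))
```

The brute-force check is exponential in $k$, of course; it is only meant to make the construction concrete at small sizes.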
Approximate distance oracles
by Uri Zwick, Tel Aviv University
Let G=(V,E) be an undirected *weighted* graph with |V|=n and
|E|=m. Let k>1 be an integer. We show that G=(V,E) can be
preprocessed in O(kmn^{1/k}) expected time, constructing a data
structure of size O(kn^{1+1/k}), such that any subsequent
*distance query* can be answered, approximately, in O(k) time.
The approximate distance returned is of stretch at most 2k-1, i.e.,
the quotient obtained by dividing the estimated distance by the actual
distance lies between 1 and 2k-1. We show that a 1963 girth conjecture of
Erd\H{o}s implies that Omega(n^{1+1/k}) space is needed in the
worst case for any real stretch strictly smaller than 2k+1. The
space requirement of our algorithm is, therefore, essentially optimal.
The most impressive feature of our data structure is its *constant*
query time, hence the name ``oracle''. Previously, data
structures that used only O(n^{1+1/k}) space had a query time of
Omega(n^{1/k}) and a slightly larger, non-optimal, stretch.
Our algorithms are extremely simple and easy to implement efficiently.
They also provide faster constructions of sparse *spanners* of
weighted graphs, and improved *tree covers* and *distance
labelings* of weighted or unweighted graphs.
Joint work with Mikkel Thorup
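A toy instance of the k=2 case (stretch at most 3) can be sketched compactly. For simplicity the sketch computes exact all-pairs distances up front rather than using the paper's O(kmn^{1/k})-time construction, so only the query structure (pivots and bunches) is faithful; graph parameters and seeds are arbitrary.

```python
import heapq
import itertools
import random

def dijkstra(adj, s):
    """Single-source shortest paths; adj[v] = list of (w, weight)."""
    dist, done, pq = {s: 0.0}, set(), [(0.0, s)]
    while pq:
        d, v = heapq.heappop(pq)
        if v in done:
            continue
        done.add(v)
        for w, wt in adj[v]:
            if d + wt < dist.get(w, float("inf")):
                dist[w] = d + wt
                heapq.heappush(pq, (d + wt, w))
    return dist

class Oracle2:
    """Thorup-Zwick-style distance oracle for k=2: sample a pivot set A
    of expected size about sqrt(n); store each vertex's nearest pivot,
    its bunch (vertices strictly closer than A), and full tables for A."""
    def __init__(self, adj, rng):
        n = len(adj)
        d = {v: dijkstra(adj, v) for v in range(n)}   # sketch shortcut
        A = [v for v in range(n) if rng.random() < n ** -0.5]
        if not A:
            A = [rng.randrange(n)]
        self.from_A = {a: d[a] for a in A}
        self.pivot, self.bunch = {}, {}
        for v in range(n):
            a = min(A, key=lambda x: d[v][x])
            self.pivot[v] = (a, d[v][a])
            self.bunch[v] = {w: dv for w, dv in d[v].items() if dv < d[v][a]}

    def query(self, u, v):
        if v in self.bunch[u]:
            return self.bunch[u][v]       # exact
        if u in self.bunch[v]:
            return self.bunch[v][u]       # exact
        a, dA = self.pivot[u]             # detour through u's pivot
        return dA + self.from_A[a][v]

def random_graph(n, extra, rng):
    adj = [[] for _ in range(n)]
    def add(u, v):
        w = rng.uniform(1, 10)
        adj[u].append((v, w))
        adj[v].append((u, w))
    for v in range(1, n):                 # random spanning tree: connected
        add(rng.randrange(v), v)
    for _ in range(extra):
        add(*rng.sample(range(n), 2))
    return adj

rng = random.Random(5)
adj = random_graph(40, 60, rng)
oracle = Oracle2(adj, rng)
true = {v: dijkstra(adj, v) for v in range(40)}
ratios = [oracle.query(u, v) / true[u][v]
          for u, v in itertools.permutations(range(40), 2)]
print(max(ratios))
```

If v is outside u's bunch then d(u,A) <= d(u,v), so the detour through the pivot costs at most 3 d(u,v); the check above confirms every estimate lands in [1, 3].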
Distributions on level-sets with applications to approximation algorithms
by Aravind Srinivasan, University of Maryland - College Park
We consider a family of distributions on fixed-weight vectors of some $t$ bits; these distributions enjoy a natural negative correlation property and also satisfy pre-specified conditions on their marginal distributions. We show the existence of such families, and present a linear-time algorithm to sample from them. We then present applications to approximation algorithms.
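One concrete way to realize such a distribution (a sketch of the pairwise "dependent rounding" idea, with exact rational arithmetic for clarity) takes a fractional vector with integer total weight and repeatedly shifts mass between two fractional coordinates until the vector is 0/1, preserving both the weight and the marginals:

```python
import random
from fractions import Fraction

def sample_level_set(x, rng):
    """Round a fractional vector x (entries in [0,1], integer total) to a
    0/1 vector with the same total weight and P[X_i = 1] = x_i.
    Pairwise dependent rounding: pick two fractional coordinates and move
    mass between them so at least one becomes integral; the direction is
    chosen with probabilities that keep each coordinate's expectation."""
    x = [Fraction(v) for v in x]
    assert sum(x).denominator == 1, "total weight must be an integer"
    frac = [i for i, v in enumerate(x) if 0 < v < 1]
    while len(frac) >= 2:
        i, j = frac[0], frac[1]
        up = min(1 - x[i], x[j])      # raise x_i, lower x_j
        down = min(x[i], 1 - x[j])    # lower x_i, raise x_j
        if rng.random() < float(down / (up + down)):
            x[i] += up
            x[j] -= up
        else:
            x[i] -= down
            x[j] += down
        frac = [t for t in frac if 0 < x[t] < 1]
    return [int(v) for v in x]

rng = random.Random(2)
x = [Fraction(1, 2), Fraction(1, 4), Fraction(3, 4), Fraction(1, 2)]  # weight 2
samples = [sample_level_set(x, rng) for _ in range(20000)]
print(sum(s[1] for s in samples) / len(samples))  # marginal of x_1, about 1/4
```

Every sample has weight exactly 2, and each coordinate's empirical marginal matches its prescribed value; the coordinates are also negatively correlated, the property the talk's applications exploit.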
Building Low-Diameter P2P Networks
by Eli Upfal, Brown University
In a peer-to-peer (P2P) network, nodes connect into an existing
network and participate in providing and availing of services.
There is no dichotomy between a central server and distributed
clients. Current P2P networks (e.g., Gnutella) are constructed by
participants following their own uncoordinated (and often
whimsical) protocols; they consequently suffer from frequent
network overload and fragmentation into disconnected pieces
separated by choke-points with inadequate bandwidth.
In this paper we propose a simple scheme for participants to build
P2P networks in a distributed fashion, and prove that it results
in connected networks of constant degree and logarithmic diameter.
It does so with no global knowledge of all the nodes in the
network. In the most common P2P application to date (search),
these properties are crucial.
In evaluating the performance of the P2P protocol we focus on the
long term behavior of the system in a fully decentralized
environment in which nodes arrive and depart in an uncoordinated,
and unpredictable fashion. This setting is best modeled by
a stochastic, memoryless, continuous-time setting.
The arrival of new nodes is modeled by a
Poisson process with rate $\lambda$, and the duration of time a
node stays connected to the network has an exponential
distribution with parameter $\mu$. Let $G_t$ be the network at
time $t$ ($G_0$ has no vertices). We analyze the evolution in time
of the stochastic process ${\cal G}=(G_t)_{t\geq 0}$.
(Joint work with G. Pandurangan and P. Raghavan.)
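The arrival/departure model above is an M/M/infinity process, whose steady-state population is Poisson with mean lambda/mu. A minimal simulation of the population model only (not of the connection protocol), with arbitrary illustrative rates:

```python
import heapq
import random

def simulate_churn(lam, mu, horizon, rng):
    """Nodes arrive in a Poisson process of rate lam and stay for
    independent Exp(mu) lifetimes.  Returns the number of live nodes
    at time `horizon`."""
    t = 0.0
    departures = []  # min-heap of departure times of live nodes
    while True:
        t += rng.expovariate(lam)            # next arrival time
        if t > horizon:
            break
        while departures and departures[0] <= t:
            heapq.heappop(departures)        # nodes that already left
        heapq.heappush(departures, t + rng.expovariate(mu))
    while departures and departures[0] <= horizon:
        heapq.heappop(departures)
    return len(departures)

rng = random.Random(4)
lam, mu = 50.0, 1.0  # steady-state size is Poisson with mean lam/mu = 50
sizes = [simulate_churn(lam, mu, 20.0, rng) for _ in range(200)]
print(sum(sizes) / len(sizes))  # close to 50
```

This is the environment in which the protocol must maintain constant degree and logarithmic diameter: the network hovers around lambda/mu nodes, but its membership turns over completely on the timescale 1/mu.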
Tight Bounds for Worst-Case Equilibria
by Berthold Vöcking, Max Planck Institut
The central question of this talk is: how much does network performance suffer from the lack of regulation? As a first step toward formalizing this question mathematically, we assume that, in the absence of network regulation, users act in a purely selfish manner. Under this assumption, we can view network users as independent agents participating in a noncooperative game and expect the routes chosen by users to form a Nash equilibrium in the sense of classical game theory. We present some first results from this new and fascinating area of research.
Some general connections between phase transitions and
Markov chains
by Eric Vigoda, University of Chicago
We'll examine some general results relating the absence of a phase transition in certain statistical physics models to the convergence rates of associated Markov chains (known as Glauber dynamics). Recently, Cesi (PTRF 2001) proved some general connections using log-Sobolev inequalities. For the case of attractive systems we present a simpler, coupling-based proof of his result.
This talk is based on joint work with M. Dyer, A. Sinclair, and D. Weitz.
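As a concrete instance of the dynamics in question, here is heat-bath Glauber dynamics for the Ising model on a small torus; the temperatures, lattice size, and number of updates are illustrative. Above the critical temperature the chain equilibrates to a disordered state, below it strong local order appears.

```python
import math
import random

def glauber_ising(L, beta, steps, rng):
    """Heat-bath Glauber dynamics for the Ising model on an L x L torus:
    repeatedly pick a uniform site and resample its spin from the
    conditional distribution given its four neighbours."""
    spin = [[rng.choice((-1, 1)) for _ in range(L)] for _ in range(L)]
    for _ in range(steps):
        i, j = rng.randrange(L), rng.randrange(L)
        s = (spin[(i - 1) % L][j] + spin[(i + 1) % L][j]
             + spin[i][(j - 1) % L] + spin[i][(j + 1) % L])
        p_plus = 1.0 / (1.0 + math.exp(-2.0 * beta * s))  # P[spin = +1 | nbrs]
        spin[i][j] = 1 if rng.random() < p_plus else -1
    return spin

def edge_corr(spin):
    """Average nearest-neighbour correlation (1 = fully ordered)."""
    L = len(spin)
    tot = sum(spin[i][j] * (spin[(i + 1) % L][j] + spin[i][(j + 1) % L])
              for i in range(L) for j in range(L))
    return tot / (2 * L * L)

rng = random.Random(6)
L = 16
hot = glauber_ising(L, 0.2, 50 * L * L, rng)   # above the critical temperature
cold = glauber_ising(L, 1.0, 50 * L * L, rng)  # well below it
print(edge_corr(hot), edge_corr(cold))
```

The correlation contrast is the statics side of the story; the talk's subject is the corresponding dynamics statement, that absence of a phase transition goes hand in hand with fast convergence of exactly this chain.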
Dynamic Models for File Sizes and Double Pareto Distributions
by Michael Mitzenmacher, Harvard University
We introduce and analyze a new generative user model to explain the behavior of file size distributions. Our Recursive Forest File model combines ideas from recent work by Downey with ideas from recent work on random graph models for the Web. Unlike similar previous work, our Recursive Forest File model allows new files to be created and old files to be deleted over time, and our analysis covers problematic issues such as correlation among file sizes. Moreover, our model allows natural variations where files that are copied or modified are more likely to be copied or modified subsequently.
Previous empirical work suggests that file sizes tend to have a lognormal body but a Pareto tail. The Recursive Forest File model explains this behavior, yielding a double Pareto distribution, which has a Pareto tail and a body close to lognormal. We believe the Recursive Forest model may be useful for describing other power law phenomena in computer systems as well as other fields.
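A toy multiplicative model in the spirit of the abstract (the details below are our simplification, not the Recursive Forest File model itself) already shows the mechanism: files are either created fresh or copied-and-modified, with modification multiplying the size by a lognormal factor, so sizes spread over many orders of magnitude.

```python
import math
import random

def generate_file_sizes(steps, p_new, sigma, rng):
    """Toy generative model: with probability p_new create a fresh
    unit-size file; otherwise copy-and-modify a uniformly random
    existing file, multiplying its size by an independent lognormal
    factor exp(N(0, sigma^2))."""
    sizes = [1.0]
    for _ in range(steps):
        if rng.random() < p_new:
            sizes.append(1.0)
        else:
            sizes.append(rng.choice(sizes) * math.exp(rng.gauss(0.0, sigma)))
    return sizes

rng = random.Random(8)
sizes = generate_file_sizes(200000, 0.1, 1.0, rng)
sizes.sort()
# crude look at the spread: ratio of the largest file to the median
print(sizes[-1] / sizes[len(sizes) // 2])
```

Because a copied file's log-size is a sum of a geometrically-distributed number of Gaussian steps, the resulting size distribution is a mixture of lognormals with heavy upper and lower tails, which is the qualitative shape of a double Pareto distribution.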
Muirhead-Rado inequality for compact groups
by Leonard Schulman, Caltech
I will explain this classical majorization inequality, and discuss the compact, complex extension.
Surfing Streams: Randomized Techniques for Sublinear Algorithms
by S. Muthu Muthukrishnan, Rutgers University and AT&T Research
Say you go fishing in a pond. There are many different types of fish
and you catch a bounty. At the end of the day, how can you quickly
estimate which of the fish types you have caught appear only a few times in your catch?
Suppose now that your colleague goes fishing as well and she also has a
bountiful catch. At the end of the day, how do you quickly check
if the fish types you have caught are similar or dissimilar?
These problems are interesting when you cannot remember all the
fish types in your catch, and do not have the time to sort through them
to answer the questions above.
When you mine stream data from sensors, Internet routers, and other
sources, the problems you face are not unlike the fishing examples
above. I will present solutions to these and other problems in the data
streaming context. The solutions rely on a number of randomized techniques:
embeddings, construction of range-summable random variables, group testing
with algebraic tests, use of approximate min-wise hash functions, etc.
Problems become hard when one throws some of the catch back into the pond!
This is joint work with several algorithms and database researchers.
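The second fishing question (are the two catches similar?) maps directly onto min-wise hashing: each hash function contributes one signature coordinate, and the fraction of agreeing coordinates estimates the Jaccard similarity of the two sets. A sketch, using salted built-in hashes as a stand-in for a min-wise independent family:

```python
import random

def make_hashers(k, rng):
    """k salted hash functions; real schemes use (approximately)
    min-wise independent families, this is only a stand-in."""
    salts = [rng.getrandbits(64) for _ in range(k)]
    return [lambda x, s=s: hash((s, x)) for s in salts]

def minhash_signature(items, hashers):
    """For each hash function, keep the minimum hash over the set;
    two signatures agree in a coordinate with probability equal to
    the Jaccard similarity of the underlying sets."""
    return [min(h(x) for x in items) for h in hashers]

def estimate_jaccard(sig_a, sig_b):
    return sum(a == b for a, b in zip(sig_a, sig_b)) / len(sig_a)

rng = random.Random(9)
hashers = make_hashers(500, rng)
catch_a = set(range(0, 80))      # fish types in your catch
catch_b = set(range(40, 120))    # your colleague's catch
true_j = len(catch_a & catch_b) / len(catch_a | catch_b)  # = 1/3
est = estimate_jaccard(minhash_signature(catch_a, hashers),
                       minhash_signature(catch_b, hashers))
print(true_j, est)
```

The point for streams is that the signature is tiny and composable: each fisherman keeps 500 running minima, never the full catch, and the two signatures can be compared at the end of the day.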
Some Open Problems Concerning Distributed Algorithms
by Alessandro Panconesi, La Sapienza
Distributed graph algorithms is an area where the use of randomization
typically leads to simple and fast algorithms, whereas deterministic solutions
are almost invariably contrived and exponentially slower.
In this talk we shall present an overview of the state of the art and discuss
some open problems, the most important of which is whether
a separation result might be lurking in the background.
Simulated Tempering: Fast or Slow?
by Dana Randall, Georgia Institute of Technology
Simulated tempering is a compelling Markov chain heuristic used for random sampling when other Markov chains are known to be slow. The idea is to enhance the state space with a parameter modeling temperature, and to allow the temperature to vary during the simulation. At high temperature bottlenecks (which cause slow mixing) disappear, mixing occurs, and lowering the temperature recovers the stationary distribution of interest. The swapping algorithm is a variant of this method.
Recently Madras and Zheng analyzed the swapping algorithm on two bimodal distributions, including the mean-field Ising model, and showed that it is efficient. We generalize their results to asymmetric models and suggest a more general approach to implementing tempering-type algorithms.
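A minimal sketch of the swapping algorithm (the variant mentioned above) on a simple bimodal target, with an illustrative double-well potential and temperature ladder of our choosing: one Metropolis chain runs per temperature, and configurations at adjacent temperatures are occasionally exchanged, letting the cold chain hop between modes via the hot chains.

```python
import math
import random

def potential(x):
    """Double well on the integers: modes near -10 and +10, with a
    barrier at 0 that a cold fixed-temperature chain rarely crosses."""
    return (abs(x) - 10) ** 2 / 10.0

def metropolis_step(x, beta, rng):
    y = x + rng.choice((-1, 1))
    if rng.random() < math.exp(min(0.0, -beta * (potential(y) - potential(x)))):
        return y
    return x

def swapping_sampler(betas, steps, rng):
    """Swapping algorithm: one chain per inverse temperature in `betas`
    (increasing; betas[-1] targets the distribution of interest), plus
    attempted swaps of configurations at random adjacent temperatures."""
    xs = [0] * len(betas)
    pos = neg = 0  # time the target chain spends in each mode
    for _ in range(steps):
        for i, b in enumerate(betas):
            xs[i] = metropolis_step(xs[i], b, rng)
        i = rng.randrange(len(betas) - 1)
        # detailed-balance ratio for exchanging adjacent configurations
        delta = (betas[i] - betas[i + 1]) * (potential(xs[i]) - potential(xs[i + 1]))
        if rng.random() < math.exp(min(0.0, delta)):
            xs[i], xs[i + 1] = xs[i + 1], xs[i]
        if xs[-1] > 4:
            pos += 1
        elif xs[-1] < -4:
            neg += 1
    return pos, neg

rng = random.Random(11)
pos, neg = swapping_sampler([0.05, 0.25, 1.0], 50000, rng)
print(pos, neg)  # the cold chain visits both modes
```

A single Metropolis chain at the coldest temperature would get stuck in whichever well it falls into first; the swaps act as the "temperature moves" of tempering, carrying mode information down the ladder.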