RANDOM GRAALS 2002 Random(ized) Graphs and Algorithms
27-31 May 2002 |
![]() |
---|
08.00-09.00 | arrivals | breakfast | |||||
---|---|---|---|---|---|---|---|
09.15-10.15 | Joel Spencer | Uri Zwick | Eli Upfal | Luca Becchetti | jam session | departures | |
10.15-10.30 | coffee break | coffee break | coffee break | coffee break | |||
10.30-11.30 | Aravind Srinivasan | Micah Adler | Alan Frieze | Leonard Schulman | |||
12.00-13.15 | lunch! | lunch! | lunch! | lunch! | lunch! | ||
14.00-15.00 | Dana Randall | Berthold Vöcking | Muthu Muthukrishnan | Ashish Goel | departures | ||
15.00-16.00 | Eric Vigoda | Jirka Matousek | jam session | Alessandro Panconesi | |||
16.00-16.15 | coffee | coffee | jam session | ||||
16.15-17.15 | Michael Mitzenmacher | Jaikumar Radhakrishnan |
Arrival: | Sunday 26 May 2002 |
---|---|
Departure: | Friday 31 May - Saturday 1 June, 2002 |
Bertinoro itself is picturesque, with many narrow streets and walkways winding around the central peak. The meeting will be held in a redoubtable ex-Episcopal fortress that has been converted by the University of Bologna into a modern conference center with computing facilities and Internet access. From the fortress you can enjoy a beautiful the vista that stretches from the Tuscan Apennines to the Adriatic coast.
Scientific Organizing Committee | Michael Mitzenmacher Harvard University |
---|---|
Alessandro Panconesi, University La Sapienza of Rome | |
Aravind Srinivasan University of Maryland, College Park | |
Local Organization | |
Andrea Bandini, Elena Della Godenza, Centro Congressi di Bertinoro | |
Sponsored by |
![]() |
There has been considerable recent interest in probabilistic
packet marking schemes for the problem of tracing a sequence
of network packets back to an anonymous source. An important
consideration for such schemes is b, the number of packet
header bits that need to be allocated to the marking
protocol. However, all previous schemes belong to a class of
protocols for which b must be at least log(n), where n is
the number of bits used to represent the path of the
packets. In this talk, we introduce a new marking technique
for tracing a sequence of packets sent along the same
path. This new technique is effective even when b=1. In
other words, the sequence of packets can be traced back to
their source using only a single bit in the packet
header. With this scheme, the number of packets required to
reconstruct the path is O(2^(2n)), but we also show that
Omega(2^n) packets are required for any protocol where b=1.
Phase Transitions for Random Processes
It has been forty years since the discovery by
Paul Erd\H{o}s and Alfred R\'enyi that the random graph
undergoes a phase transition (they called it: double jump)
near $p=\frac{1}{n}$, many small components rapidly melding
into the ``giant component." We now view this as a percolation
effect, with classical connections to Mathematical Physics.
In recent years a host of random processes coming from the
fecund intersection of Discrete Math, Computer Science and
Probability have shown strong percolation behavior.
A particular set of examples involve what are called
Achlioptos processes: each time unit two random edges are
generated on $n$ vertices. Precisely one of them must be
added to an evolving graph, initially empty. For example,
add the first edge if isolated, otherwise add the second.
There appears to be an explicit $t_0$ such that near time
$t_0\frac{n}{2}$ a giant component suddenly emerges but
there are many open questions about the nature of the process
near that critical time.
Non-Clairvoyant Scheduling to Minimize the
Total Flow Time on Single and Parallel Machines
Scheduling a sequence of jobs released over time when the processing
time of a job is only known at its completion is a classical problem
in CPU scheduling in time sharing operating systems.
A widely used measure for the responsiveness of the system
is the average flow time of the jobs, i.e. the
average time spent by jobs in the system between release and
completion.
The Windows NT and the Unix operating system scheduling policies
are based on the Multi-level Feedback algorithm.
In our work we proved that a randomized version
of the Multi-level Feedback algorithm
is competitive for single and parallel machine systems, in our
opinion providing one theoretical validation of the goodness of
an idea that has proven effective in practice along the last two
decades.
Joint work with Stefano Leonardi
Probabilistic Analysis of Symmetric Travelling Salesman Problems
Let the edges of the complete graph $K_n$ be assigned
independent uniform $[0,1]$ random edge weights. Let
$Z_{TSP}$ and $Z_{2FAC}$ be the weights of the minimum length
travelling salesman tour and minimum weight 2-factor
respectively. We show that with high probability
$|Z_{TSP}-Z_{2FAC}|=o(1)$. The proof is via by the analysis
of a polynomial time algorithm that finds a tour only a little
longer than $Z_{2FAC}$.
Unique-sink orientations of cubes, a talk in two-parts: Part I
There is an ongoing attempt of designing a
provably fast method for solving
Linear Progams and related optimization
problems by combinatorial methods in the unit
cost model (as opposed to the bit-model,
where polynomial methods are known). Among
others, this research has brought forth
randomized methods with subexponential running
time, but also a number of interesting
combinatorial models like Abstract Objective
Functions (AOF), Unimodal Numberings,
LP-type Problems, Abstract Optimization Problems
(AOP), Unique Sink Orientations of Cubes (USO),
et cetera. After some general introduction we
will concentrate on the USO model.
Unique-sink orientations of cubes, a talk in two-parts: Part II
The $n$-cube is considered as a graph (with vertex set $\{0,1\}^n$).
An unique-sink orientation (USO) is an orientation
of the edges of the $n$-cube such that every face of the
cube has exactly one sink (directed cycles are allowed).
Such orientations arise from several sources, such as linear programs
(considering a generic linear function on a polytope isomorphic to the
cube), certain linear complementarity problems, and certain
convex programs. Algorithms have been studied for finding
the global sink in a USO; the USO is specified by an oracle that, given
a vertex, returns the orientation of the edges
incident to that vertex. Upper and lower bounds for the complexity
of such algorithms have recently been given by Welzl, Szab\'o, and Schurr;
improving them significantly is the main challenge.
The speaker has proved that the number of USO is
$2^{\Theta(2^n \log n)}$. The number of acyclic USO
is easily seen to be between $2^{\Omega(2^n)}$ and $2^{O(2^n\log n)}$;
it would be nice to find better bounds.
Simultaneous Optimization with Concave and Convex Costs
In simultaneous optimization, we need to simultaneously approximate a whole
family of objective functions. In this talk, we will consider two problems:
Building bipartite graphs with smaller bipartite graphs
We say that a bipartite graph $G=(V,W,E)$ has an independent set of
size $k$ if there exist sets $S\subseteq V$ and $T \subseteq W$ of
size $k$ each with no edge connecting them.
K\H{o}vari, S\'{o}s and Tur\'{a}n proved that if $G$ has no
independent set of size $k$ ($n^{\epsilon} \leq k \leq
n^{1-\epsilon}$), then it has $\Omega(\frac{n^2}{k}\log n)$ edges.
We consider a generalization of this.
Approximate distance oracles
Let G=(V,E) be an undirected *weighted* graph with |V|=n and
|E|=m. Let k>1 be an integer. We show that G=(V,E) can be
preprocessed in O(kmn^{1/k}) expected time, constructing a data
structure of size O(kn^{1+1/k}), such that any subsequent
*distance query* can be answered, approximately, in O(k) time.
The approximate distance returned is of stretch at most 2k-1, i.e.,
the quotient obtained by dividing the estimated distance by the actual
distance lies between 1 and 2k-1. We show that a 1963 girth conjecture of
Erdo"s, implies that Omega(n^{1+1/k}) space is needed in the
worst case for any real stretch strictly smaller than 2k+1. The
space requirement of our algorithm is, therefore, essentially optimal.
The most impressive feature of our data structure is its *constant*
query time, hence the name ``oracle''. Previously, data
structures that used only O(n^{1+1/k}) space had a query time of
Omega(n^{1/k}) and a slightly larger, non-optimal, stretch.
Joint work with Mikkel Thorup
Distributions on level-sets with applications to approximation algorithms
We consider a family of distributions on fixed-weight
vectors of some $t$ bits; these distributions enjoy a natural
negative correlation property and also satisfy pre-specified
conditions on their marginal distributions. We show the existence
of such families, and present a linear-time algorithm to sample
from them. We then present applications to approximation algorithms.
Building Low-Diameter P2P Networks
In a peer-to-peer (P2P) network, nodes connect into an existing
network and participate in providing and availing of services.
There is no dichotomy between a central server and distributed
clients. Current P2P networks (e.g., Gnutella) are constructed by
participants following their own un-coordinated (and often
whimsical) protocols; they consequently suffer from frequent
network overload and fragmentation into disconnected pieces
separated by choke-points with inadequate bandwidth.
Tight Bounds for Worst-Case Equilibria
The central question of this talk is how much does network
performance suffer from the lack of regulation? - As a first
step toward formalizing this question mathematically, we assume
that, in the absence of network regulation, users act in a purely
selfish manner. Under this assumption, we can view network users
as independent agents participating in a noncooperative game and
expect the routes chosen by users to form a Nash equilibrium
in the sense of classical game theory. We present some first
results from this new and fascinating area of research.
Some general connections between phase transitions and
Markov chains
We'll examine some general results relating absence of a phase
transition in certain statistical physics models with convergence
rates of associated Markov chains (known as Glauber dynamics).
Recently Cesi (PTRF 2001) proved some general connections using
Log-Sobolev inequalities. For the case of attractive systems we
present a simpler coupling based proof of his result.
This talk is based on a joint work with M. Dyer, A. Sinclair, and
D. Weitz.
Dynamic Models for File Sizes and Double Pareto Distributions
We introduce and analyze a new generative user model to explain the
behavior of file size distributions. Our Recursive Forest File model
combines ideas from recent work by Downey with ideas from recent work
on random graph models for the Web. Unlike similar previous work, our
Recursive Forest File model allows new files to be created and old
files to be deleted over time, and our analysis covers problematic
issues such as correlation among file sizes. Moreover, our model
allows natural variations where files that are copied or modified are
more likely to be copied or modified subsequently.
Previous empirical work suggests that file sizes tend to have a
lognormal body but a Pareto tail. The Recursive Forest File model
explains this behavior, yielding a double Pareto distribution, which
has a Pareto tail but close to a lognormal body. We believe the
Recursive Forest model may be useful for describing other power law
phenomena in computer systems as well as other fields.
Muirhead-Rado inequality for compact groups
I will explain this classical majorization inequality, and discuss the
compact, complex extension.
Surfing Streams: Randomized Techniques for Sublinear Algorithms
Say you go fishing in a pond. There are many different types of fish
and you catch a bounty. At the end of the day, how can you quickly
estimate the fish types you have caught that are only a few in your catch?
Suppose now that your colleague goes fishing as well and she also has a
bountiful catch. At the end of the day, how do you quickly check
if the fish types you have caught are similar or dissimilar?
These problems are interesting when you can not remember all the
fish types in your catch, and do not have the time to sort through them
for answering the questions above.
ABSTRACTS
Tradeoffs in Probabilistic Packet Marking for IP Traceback
by Micah Adler, University of Massachussets
We also study the tradeoff between b and the number of
packets required. We provide a protocol and a lower bound
that together demonstrate that for the optimal protocol, the
number of packets required (roughly) increases exponentially
with n, but decreases doubly exponentially with b. The
protocol we introduce is simple enough to be useful in
practice. We also study the case where the packets are sent
along k different paths. For this case, we demonstrate that
any protocol must use at least log(2k-2) header bits. We
also provide a protocol that requires log(2k+1) header bits
in some restricted scenarios (to which the lower bound
applies). This protocol introduces a new coding technique
that may be of independent interest.
by Joel Spencer, Courant Institute
by Luca Becchetti, La Sapienza
by Alan Frieze, CMU
by Emo Welzl, ETH
Jirka Matousek, Charles University
by Ashish Goel, University of Southern California
a) Suppose we want to construct a tree for sending information from a
single source to k sinks. Further, suppose that the cost of sending
x units of information over a link is proportional to f(x), where f is concave,
f(0) = 0, and f is non-decreasing. This problem is known as single source
buy-at-bulk network design. We will present a simple randomized algorithm
that guarantees an O(log k) approximation to the cheapest tree for all such
functions f, simultaneously.
b) Suppose we want to assign jobs to machines, and the cost of an
allocation is some symmetric convex function of the loads on individual
machines. We will present a general framework for approximating all such
functions simultaneously, and apply it to the case when job sizes are
random variables.
(b) represents joint work with Adam Meyerson.
by Jaikumar Radhakrishnan, TIFR Bombay
Given $n_1,n_2,\ldots,n_r$, we wish to construcuct a bipartite graph
$G$ (with vertex sets of size $n$) by putting together copies of
$K_{n_i,n_i}$ such that $G$ has no independent set of size $k$. The
theorem of K\"{o}vari, S\'{o}s and Tur\'{a}n refers to the situation
when all $n_i$ are $1$, and implies the following
necessary condition for our generalization:
\begin{equation}
\sum_{i} n_i^2 = \Omega(\frac{n^2}{k}\log n). \label{eq:kst}
\end{equation}
On the other hand, if we put copies of $K_{n_i,n_i}$ together at random,
the resulting graph has no independent set of size $k$ with high
probability provided
\begin{equation}
\sum_{n_i < k} \left(\frac{n_i}{n/k}\right)^2 + \sum_{n_i \geq k}
\left(\frac{n_i}{n/k}\right) \geq ck \log n. \label{eq:suff}
\end{equation}
Note that (\ref{eq:kst}) explains the first term on the left hand
side of (\ref{eq:suff}) but not the second.
We will show that the sufficient condition (\ref{eq:suff}) is also
necessary, and derive from this a lower bound on the size of depth-two
superconcentrators.
by Uri Zwick, Tel Aviv University
Our algorithms are extremely simple and easy to implement efficiently.
They also provide faster constructions of sparse *spanners* of
weighted graphs, and improved *tree covers* and *distance
labelings* of weighted or unweighted graphs.
by Aravind Srinivasan, University of Maryland - College Park
by Eli Upfal, Brown University
In this paper we propose a simple scheme for participants to build
P2P networks in a distributed fashion, and prove that it results
in connected networks of constant degree and logarithmic diameter.
It does so with no global knowledge of all the nodes in the
network. In the most common P2P application to date (search),
these properties are crucial.
In evaluating the performance of the P2P protocol we focus on the
long term behavior of the system in a fully decentralized
environment in which nodes arrive and depart in an uncoordinated,
and unpredictable fashion. This setting is best modeled by
a stochastic, memoryless, continuous-time setting.
The arrival of new nodes is modeled by
Poisson distribution with rate $\lambda$, and the duration of time a
node stays connected to the network has an exponential
distribution with parameter $\mu$. Let $G_t$ be the network at
time $t$ ($G_0$ has no vertices). We analyze the evolution in time
of the stochastic process ${\cal G}=(G_t)_{t\geq 0}
(Joint work with G. Pandurangan and P. Raghavan.)
by Berthold Vöcking, Max Planck Institut
by Eric Vigoda, University of Chicago
by Michael Mitzenmacher, Harvard University
by Leonard Schulman, Caltech
by S. Muthu Muthukrishnan, Rutgers University and AT&T Research
When you mine stream data from sensors, Internet routers, and other
sources, the problems one faces are not unlike the fishing examples
above. I will present solutions to these and other problems in data
streaming context. Solutions rely on a number of randomized techniques:
embeddings, construction of range summable random variables, group testing
with algebraic tests, use of approximate min-wish hash functions, etc.
Problems become hard when one throws some of the catch back into the pond!
This is joint work with several algorithms and database researchers.