Balls and Bins: A Study in Negative Dependence
DOI:
https://doi.org/10.7146/brics.v3i25.20006Resumé
This paper investigates the notion of negative dependence amongst random variables and attempts to advocate its use as a simple and unifying paradigm forthe analysis of random structures and algorithms.
The assumption of independence between random variables is often very convenient for the several reasons. Firstly, it makes analyses and calculations much simpler. Secondly, one has at hand a whole array of powerful mathematical concepts and tools from classical probability theory for the analysis, such as laws of large numbers, central limit theorems and large deviation bounds which are usually derived under the assumption of independence. Unfortunately, the analysis of most randomized algorithms involves random variables that are not independent. In this case, classical tools from standard probability
theory like large deviation theorems, that are valid under the assumption of independence between the random variables involved, cannot be used as such. It is
then necessary to determine under what conditions of dependence one can still use the classical tools.
It has been observed before [32, 33, 38, 8], that in some situations, even though the variables involved are not independent, one can still apply some of the standard
tools that are valid for independent variables (directly or in suitably modified form), provided that the variables are dependent in specific ways. Unfortunately, it
appears that in most cases somewhat ad hoc strategems have been devised, tailored to the specific situation at hand, and that a unifying underlying theory that delves
deeper into the nature of dependence amongst the variables involved is lacking. A frequently occurring scenario underlying the analysis of many randomised
algorithms and processes involves random variables that are, intuitively, dependent in the following negative way: if one subset of the variables is "high" then a disjoint
subset of the variables is "low". In this paper, we bring to the forefront and systematize some precise notions of negative dependence in the literature, analyse
their properties, compare them relative to each other, and illustrate them with several applications.
One specific paradigm involving negative dependence is the classical "balls and bins" experiment. Suppose we throw m balls into n bins independently at random.
For i in [n], let Bi be the random variable denoting the number of balls in the ith bin. We will often refer to these variables as occupancy numbers. This is a
classical probabilistic paradigm [16, 22, 26] (see also [31, sec. 3.1]) that underlies the
analysis of many probabilistic algorithms and processes. In the case when the balls
are identical, this gives rise to the well-known multinomial distribution [16, sec VI.9]: there are m repeated independent trials (balls) where each trial (ball) can result in one of the outcomes E1, ..., En (bins). The probability of the realisation of event Ei is pi for i in [n] for each trial. (Of course the probabilities are subject to the condition
Sum_i pi = 1.) Under the multinomial distribution, for any integers
m1, ..., mn such that
Sum_i mi = m the probability that for each i in [n], event Ei
occurs mi times is m!
m1! : : :mn!pm1
1 : : :pmn
n :
The balls and bins experiment is a generalisation of the multinomial distribution:
in the general case, one can have an arbitrary set of probabilities for each ball: the
probability that ball k goes into bin i is pi;k, subject only to the natural restriction
that for each ball k,
P
i pi;k = 1. The joint distribution function correspondingly
has a more complicated form.
A fundamental natural question of interest is: how are these Bi related? Note
that even though the balls are thrown independently of each other, the Bi variables are not independent; in particular, their sum is fixed to m. Intuitively, the Bi's
are negatively dependent on each other in the manner described above: if one set
of variables is "high", a disjoint set is "low". However, establishing such assertions
precisely by a direct calculation from the joint distribution function, though possible
in principle, appears to be quite a formidable task, even in the case where the balls are assumed to be identical.
One of the major contributions of this paper is establishing that the the Bi are negatively dependent in a very strong sense. In particular, we show that the Bi variables satisfy negative association and negative regression, two strong notions of negative dependence that we define precisely below. All the intuitively obvious assertions of negative dependence in the balls and bins experiment follow as easy corollaries. We illustrate the usefulness of these results by showing how to streamline and simplify many existing probabilistic analyses in literature.
Downloads
Publiceret
1996-01-25
Citation/Eksport
Dubhashi, D. P., & Ranjan, D. (1996). Balls and Bins: A Study in Negative Dependence. BRICS Report Series, 3(25). https://doi.org/10.7146/brics.v3i25.20006
Nummer
Sektion
Artikler
Licens
Authors who publish with this journal agree to the following terms:- Authors retain copyright and grant the journal right of first publication with the work simultaneously licensed under a Creative Commons Attribution License that allows others to share the work with an acknowledgement of the work's authorship and initial publication in this journal.
- Authors are able to enter into separate, additional contractual arrangements for the non-exclusive distribution of the journal's published version of the work (e.g., post it to an institutional repository or publish it in a book), with an acknowledgement of its initial publication in this journal.
- Authors are permitted and encouraged to post their work online (e.g., in institutional repositories or on their website) prior to and during the submission process, as it can lead to productive exchanges, as well as earlier and greater citation of published work (See The Effect of Open Access).