Description
Please solve the Question and show all the steps. you may need the note(pdf) but you cannot copy from.
2 attachmentsSlide 1 of 2attachment_1attachment_1attachment_2attachment_2
Unformatted Attachment Preview
MATH3143
Combinatorics
Lecture Notes, 2021
University of Leeds
Lecturer for 2021 Course: Rudolf Tange.
Acknowledgments
Apart from some changes by Rudolf Tange, these notes were written by Charles
Harris and Richard Elwes who used for Chapters 1-4 and 9 a previous version written
by Prof. Anand Pillay and Dr Alison Parker (with their permission).
Chapters 1-5 and 9 are based on material from How to Count: An Introduction to
Combinatorics by R. Allenby and A. Slomson.
Chapters 6-8 are based on material from Combinatorics: Topics, Techniques and
Algorithms, by P. Cameron.
These are the references books for the course. Copies are held in the library.
Contents
Chapter 1. Ordered and unordered selections, occupancy problems.
1
Chapter 2. Partitions of sets, Bell numbers, and Stirling numbers.
10
Chapter 3. The Inclusion–Exclusion principle and applications.
14
Chapter 4. Partitions of numbers.
18
Chapter 5. Generating functions. Euler’s generating function for partition
numbers.
22
1. Functions and Power Series.
22
2. Generating Functions.
24
3. Euler’s generating function for partition numbers.
26
Chapter 6. Systems of distinct representatives and Hall’s Theorem.
30
1. Hall’s (marriage) Theorem
30
2. How many SDR’s are there?
34
Chapter 7. Latin squares.
35
1. Latin squares and Latin rectangles
35
2. How many Latin Squares are there?
36
3. Youden squares
37
Chapter 8. Extremal Set Theory.
38
1. Intersecting Families
38
2. Sperner Families
40
3. The De Bruijn Erdös Theorem (Not Examinable)
43
Chapter 9. The pigeonhole principle & Ramsey numbers.
45
1. Preliminaries
45
2. The Pigeonhole Principle
46
3. Ramsey Theory
47
ii
CHAPTER 1
Ordered and unordered selections, occupancy problems.
There is some overlap between this chapter and Chapter 2 of Allenby–Slomson.
What is Combinatorics?
Combinatorics is about counting. But how can this be, when counting is something we all learn to do in early childhood? The difference is that in combinatorics
we count sets of objects which depend on one or more parameters, and we want the
answer as a function of these parameters; preferably an explicit formula.
The applications of Combinatorics are widespread in computer science, finance,
statistics, optimisation, and industrial engineering to name just a few. There is also
a particularly close connection between combinatorics and probability theory.
Many problems arise in the real world in the form “how many…?” Not all such
problems have easy solutions. During this module we will introduce techniques for
dealing with some of the more difficult problems of this kind.
We’ll start with some basic counting principles and examples.
Principle 1.1. Two ways of counting the same finite set give the same answer.
This is obvious of course. All the same, it is the basis of many of the proofs we
shall meet in this course. A closely related principle is
Principle 1.2. If there is a bijection between two finite sets, then these sets
have the same size.
Here is an application of the first principle.
Lemma 1.3 (Hand-shaking lemma). The number of delegates at a conference
who shake hands an odd number of times is even.
Proof. Let D1 ,. . .,Dn be the delegates, and let X = {(i, j) | Di and Dj shake hands}.
We count |X| in two ways.
First, |X| is twice the number of handshakes, so |X| is even.
Next, let xi be the number of times that Di shakes hands. Then |X| = x1 +
· · · + xn . Thus x1 + · · · + xn is even, which implies that evenly many of the xi are
odd, which is what we wanted to prove.
1
Principle 1.4 (Multiplication principle). If we make k successive choices, and
for 1 ≤ i ≤ k the number of ways of making the ith choice is ni , then the total
number of choices is n1 × · · · × nk .
One way of illustrating this is to use a tree. For example, suppose we are in a
restaurant ordering from a menu with 2 different starters (soup or pâté), 3 different
main courses (steak, fish or veggie) and 2 desserts (chocolate cake or lemon sorbet).
Pâté
Soup
Steak
Fish
Steak
Veggie
Fish
Veggie
Cake Sorbet Cake Sorbet Cake Sorbet Cake Sorbet Cake Sorbet Cake Sorbet
Since our choice at each course doesn’t affect the number of choices for subsequent courses, we see that there are 2 × 3 × 2 = 12 different total possible dinners.
We can also use this to illustrate a connection to probability theory. What is the
chance that someone selecting randomly from the menu will choose Soup, Veggie,
Sorbet? This is one choice among 12 equally likely possibilities, giving
1
.
12
(What
is the probability they will start with Soup and end with Sorbet?)
Corollary 1.5. For k ≤ n the number of ways of arranging (or permuting, or
selecting in order) k objects from a total of n, is n × (n − 1) × · · · × (n − k + 1) =
n!
.
(n−k)!
This number is also denoted P (n, k) (the “P” stands for permutations).
NB: to make our formulae consistent, it is convenient to set 0! = 1.
Example 1.6. In a race with 20 horses the number of ways in which the first 3
places can be filled (with no ties) is 20 × 19 × 18 = P (20, 3) = 6840.
Remark 1.7. Note that P (n, n) = n! = the number of permutations of a set of
n objects.
Example 1.8. How many ways can we place 8 rooks on an 8 by 8 chessboard
(with a distinguished corner) so that no two rooks can attack each other? This is
2
the same as asking how many ways are there placing 8 ones in an 8 by 8 matrix so
that no column or row has more than 1 one in it.
We place a rook in each row in turn. We first have 8 choices, then 7 choices for
the 2nd rook in the 2nd row and then 6 choices for the third rook in the third row
and so on. There are thus 8! = 40320 different arrangements.
Lemma 1.9. The number of ways of choosing k out of n objects (without regard
to order) is the same as the number of k-element subsets of an n-element set; it
equals
P (n,k)
k!
=
n×(n−1)×···×(n−k+1)
k!
=
(“C” for “choose” or “combination”)
n!
.
(n−k)!k!
n
or k .
We denote this number by C(n, k)
Proof. Each choice of k out of n objects gives rise to k! permutations of k out
of n objects. So C(n, k).k! = P (n, k).
Remark 1.10. Note that C(n, k) = C(n, n − k). (Why? Because to choose k
out of n is the same thing as choosing the complement.)
Example 1.11. The number of different poker hands (5 cards) drawn from a
normal pack of cards (containing 52 cards) is C(52, 5) =
P (52,5)
.
5!
Lemma 1.12. The number of subsets of an n-element set is 2n .
Proof. Let the set be {a1 , . . . , an }. A subset S of this set is determined by
choosing for each i whether ai ∈ S or ai ∈
/ S. By the multiplication principle there
are 2 × · · · × 2 = 2n choices, hence 2n subsets.
Corollary 1.13.
2n = C(n, 0) + C(n, 1) + · · · + C(n, n).
(Why? The left hand side is the total number of subsets, and the right hand side
is the number of subsets with 0 elements plus the number of subsets with 1 element,
etc.)
The numbers C(n, k) =
n
k
for k ≤ n are also known as the binomial coefficients
(on your calculator they may be denoted n Cr , Crn or n Cr ). There are numerous
algebraic identities involving these numbers!
Lemma 1.14. (i). (a + b)n =
Pn
k=0
C(n, k)ak bn−k whenever a and b commute.
(ii). C(n + 1, k) = C(n, k) + C(n, k − 1).
Proof. (i). (a + b)n = (a + b)(a + b) · · · (a + b). When we multiply this out
each term arises from picking either a or b in each bracket and multiplying them
together. The number of ways of getting ak bn−k is the number of ways of choosing k
3
of the n factors (then we pick a from these factors and b from the remaining n − k).
By definition, this number is C(n, k).
(ii). Let X = {1, . . . , n + 1}. Every subset Y ⊆ X where |Y | = k arises in one
of two ways:
(a) n + 1 ∈
/ Y , so Y is a subset of {1, . . . , n} of size k.
(b) n + 1 ∈ Y . In this case Y is determined by Y ∩ {1, . . . , n} which is a subset of
{1, . . . , n} of size k − 1.
There are C(n, k) possibilities in case (a), and C(n, k − 1) possibilities in case (b).
As cases (a) and (b) are mutually exclusive, C(n + 1, k) = C(n, k) + C(n, k − 1).
It is the second part of this lemma that gives us Pascal’s triangle.
n=0
1
n=1
1
n=2
1
n=3
2
1
n=4
1
n=5
1
..
.
1
3
4
1
3
6
5
..
.
10
..
.
1
4
10
..
.
1
5
..
.
1
..
.
Apart from the 1s along the edges, every entry is formed by adding together the two
numbers above it.
Lemma 1.15. For all positive integers r, n with r 6 n we have
r · C(n, r) = n · C(n − 1, r − 1).
Proof. We could prove this purely algebraically using the formula for C(n, r)
(this is an exercise). Instead let’s give a combinatorial proof.
Let X be an n-element set. Consider the set
def
S = {(x, A) | x ∈ A, A ⊆ X, |A| = r}.
We will count this set in two different ways, using the multiplication principle in
both cases.
Firstly, we can first choose A and then x. The number of choices for A is C(n, r)
and the number of choices for x, once A is chosen, is r. So |S| = C(n, r) · r.
Secondly, we can first choose x ∈ X and then A ⊆ X with |A| = r and x ∈ A.
The number of choices for x is n. Given x ∈ X we can choose A ⊆ X with |A| = r
and x ∈ A by choosing r − 1 elements from X {x} and then adding x to these to
4
obtain A. There are C(n − 1, r − 1) choices for such A. So |S| = n · C(n − 1, r − 1)
and therefore r · C(n, r) = n · C(n − 1, r − 1).
Example 1.16. How many different ice cream cones are there with 3 different
flavoured scoops of ice cream, where order doesn’t matter, if there are 10 different
possible flavours of ice-cream? How many are there if the order does matter?
Answer: If the order doesn’t matter, we want C(10, 3) =
10.9.8
3.2
= 120.
If the order does matter, the number we need is P (10, 3) = 10.9.8 = 720.
Before we move on, Lemma 1.12 gives us the total number of making a selection
from n objects, with no repetitions and without regard to order. What happens if
we do take the order into account?
Lemma 1.17. The total number of ordered selections, of different sizes without
repetitions, from n > 1 objects is
n
X
P (n, k) = be · n!c
k=0
(That is the greatest integer less than or equal to e · n!.)
Proof. It is obvious that the number of selections is
this is equal to e · n!, we shall use the fact that e =
1
0!
+
1
1!
Pn
k=0 P (n, k). To prove
+ 2!1 + . . .. The number
we are interested in is
n
X
P (n, k) =
k=0
=
=
=
=
n! n! n!
n!
+
+
+ … +
0!
1!
2!
n!
1
1
1
1
+ + + … +
n!
0! 1! 2!
n!
1
1
1
n! e −
−
−
− …
(n + 1)! (n + 2)! (n + 3)!
n!
n!
n!
+
+
+ …
n! · e −
(n + 1)! (n + 2)! (n + 3)!
1
1
1
+
+
+ … .
n! · e −
(n + 1) (n + 1)(n + 2) (n + 1)(n + 2)(n + 3)
5
So the number we want is an integer and is equal to n! · e less the error term (call
it E) inside the brackets. If we can show that E < 1 we will have finished. Well,
1
1
1
+
+
+ ...
(n + 1) (n + 1)(n + 2) (n + 1)(n + 2)(n + 3)
1
1
1
<
+
+
+ ...
2
n + 1 (n + 1)
(n + 1)3
E =
=
1
n+1
1−
1
n+1
=
1
6 1.
n
Example 1.18. Four people are having a picnic, when an ice-cream van arrives.
Some number between 0 and 4 of them go to queue up. How many distinct queues
are possible?
Solution We treat different orderings as different queues. In this case we need to
know the number of total permutations of between 0 and 4 objects from 4. This is
be · 4!c = 65.
We now add a little twist to the situation by allowing repetitions, meaning that
we can choose the same object more than once. Of course, we can consider this
scenario either with or without regard to order – it can get a bit fiddly! First we
consider the case where we select in order. In fact this is rather easy:
Lemma 1.19. Let n > 0 and k ≥ 0. The number of ways of selecting in order k
from n objects, with repetitions allowed, is nk .
Proof. For the first we have n choices, for the second n choices, . . . , and for
the k-th we have n choices. So altogether there are nk choices by the multiplication
principle.
One example of this sort is a piece of paper divided into k stripes, and we want
to colour each stripe with one of n colours. (We’re allowed to use the same colour
for different stripes.) This amounts to choosing k colours from n in a given order,
but with repetition allowed.
Another important class of example is that of occupancy problems, such as how
many ways are there of putting k distinguishable balls in n distinct boxes.
Example 1.20. How many ways are there to put 5 distinguishable balls into 3
distinct boxes?
Answer
Think of the balls as the numbers 1, 2, 3, 4, 5 and the boxes as A, B, C.
One way of making the assignment is putting ball 1 in box A, ball 2 in A, ball 3 in
6
C, ball 4 in B, ball 5 in A. This corresponds to the selection (in order) AACBA, of
5 (=k) from 3 (n) with repetitions allowed. By Lemma 1.19 there are 35 different
ways of doing this.
Now, things become a little more subtle when we want to choose k objects from
n, allowing repetitions, but without regard to order.
For example, suppose we wish to paint each of k canvases with one of n colours,
where, as before, we can choose the same colour twice, but now where the canvases
are identical (unlike the stripes in the example above). Notice that making choice
of paints is equivalent to deciding, for each of the n colours, how many canvases are
painted with that colour.
A corresponding occupancy problem is to place k indistinguishable balls in n
distinguishable boxes. Again, a choice here is equivalent to deciding, for each box,
how many of the balls are placed in it. (So this is different from Example 1.20,
as we now consider AACBA and ABACA to be the same choice, since the balls
are treated as indistinguishable.) The following result gives the general solution to
problems of this type:
Theorem 1.21. Let n > 0, k ≥ 0. The number of ways of selecting k objects
from n, without regard for order, and with repetitions allowed, is C(n + k − 1, k) =
C(n + k − 1, n − 1).1
The proof of this theorem will be in two steps. The first is:
Lemma 1.22. Let n > 0, k ≥ 0. The number of ways of selecting (without regard
for order) k out of n distinct objects, with repetitions allowed, is equal to the number
of solutions to e1 + · · · + en = k where the ei are nonnegative integers. That is to
say, quantity we want is the number of (ordered) n-tuples (e1 , . . . , en ) of nonnegative
P
integers such that ni=1 ei = k.
Proof. Let {x1 , . . . , xn } be our set of n objects. Consider some selection k
from these, without regard for order, and with repetitions allowed. Suppose that ei
is the number of times xi is chosen. Clearly then, because we have chosen k objects
P
altogether, we must have ni=1 ei = k.
P
Conversely, any tuple of nonnegative integers e1 , . . . , en where ni=1 ei = k corresponds to a unique choice of k objects from {x1 , . . . , xn }, by interpreting ei as the
number of times xi is chosen.
1The
corresponding occupancy problem is placing k indistinguishable balls into n distinguish-
able boxes. Note that the boxes in the occupancy problem play the role of the objects in the
corresponding selection problem: You make an unordered selection of k boxes from the total of n.
7
The second step is:
Lemma 1.23. The number of solutions to e1 + · · · + en = k where ei are nonnegative integers is C(n + k − 1, k) = C(n + k − 1, n − 1).
Proof. We use the bijection principle to prove this: we will give a bijection
onto another set which we can count easily. The other set is the set of sequences of
k stars (∗) and n − 1 bars (|). The bijection is given by
(e1 , . . . , en ) 7→ |∗ ·{z
· · ∗} | ∗| ·{z
· · ∗} | · · · · · · | |∗ ·{z
· · ∗}
e1 stars e2 stars
en stars
It is easy to see that this map is a bijection. Since the star-bar sequences have total
length n + k − 1 and are completely determined by where we choose the k stars,
there are C(n + k − 1, k) of them.
The previous two lemmas yield Theorem 1.21.
Example 1.24. How many ways are there to place 5 indistinguishable balls into
3 distinct boxes?
Answer:
The quantity we want is the number of ways of choosing 5 objects from
3 without regard to order and with repetitions allowed. So we may apply Theorem
1.21, taking n = 3 and k = 5.
This gives C(3 + 5 − 1, 2) = C(7, 2) =
7!
5!.2!
=
7.6
2
= 21.
Example 1.25. Using Lemma 1.23 we can now say exactly how many nonnegative integer solutions there are to equations like
x1 + x2 + x3 + x4 + x5 = 20.
We get that there are C(5 + 20 − 1, 5 − 1) = C(24, 4) solutions.
We can also now say exactly how many non-negative integer solutions there are
to the inequality
x1 + x2 + x3 + x4 + x5 < 20.
First note that as all the xi are integers this is the same as asking how many nonnegative integer solutions are there to:
x1 + x2 + x3 + x4 + x5 6 19.
8
We change this inequality to an equation by adding a slack variable x6 . The idea
being that x6 is the difference between 19 and x1 + x2 + x3 + x4 + x5 . This difference
has to be non-negative as the sum of the xi ’s is less than or equal to 19. (It’s equal
exactly when x6 = 0.) Thus the number of non-negative integer solutions there are
to the orginal inequality is the same as the non-negative integer solutions to the
equality
x1 + x2 + x3 + x4 + x5 + x6 = 19.
which is C(6 + 19 − 1, 6 − 1) = C(24, 5).
9
CHAPTER 2
Partitions of sets, Bell numbers, and Stirling numbers.
See Chapter 4 and 5 of Allenby–Slomson for more information about Stirling
numbers and Chapter 6 for partitions.
Definition 2.1. A (finite) partition of a set X is an (unordered) collection
{A1 , . . . , Ak } of nonempty, pairwise disjoint, subsets A1 , . . . , Ak of X whose union
is X.
To say the same things in symbols: Ai 6= ∅ for all i = 1, . . . , k, as well as
S
Ai ∩ Aj = ∅ for i 6= j, and ki=1 Ai = X.
The number of parts of the partition is by definition k.
Example 2.2. If X = {1, 2, 3} the partitions of X are:
n
o n
o n
o n
o n
o
{1}, {2}, {3} , {1, 2}, {3} , {1, 3}, {2} , {2, 3}, {1} , {1, 2, 3} .
(Note that the partition {{1, 2}, {3}} is the same as the partition {{3}, {1, 2}}.)
Definition 2.3. (i) The Bell number, B(n) is the number of partitions of an
n-element set. (Also we take B(0) = 1 by convention.)
(ii) The Stirling number, S(n, k) is the number of partitions of an n-element set into
k parts.
Example 2.4. By taking n = 3 and referring to the previous example we see
that B(3) = 5, S(3, 1) = 1, S(3, 2) = 3 and S(3, 3) = 1.
The sequence of Bell numbers begins 1, 1, 2, 5, 15, 52, 203, 877, 4140, 21147,
115975, . . .
Note that S(0, 0) = 1 and S(n, k) = 0 for k > n, and S(n, 0) = 0 and S(n, 1) =
S(n, n) = 1 for n ≥ 1. Now, the following is obvious by definition:
P
Lemma 2.5. We have B(n) = nk=0 S(n, k).
The aims of this chapter and the next are to find formulae for these quantities
and to relate them to counting and occupancy problems.
Lemma 2.6. The number of ways of distributing b distinct balls among c indistinguishable boxes, with no box being empty is S(b, c).
10
Proof. Suppose we have some distribution of the b balls with every box being
nonempty. Label the boxes 1, . . . , c. Let Ai be the set of balls in box i. Then clearly
the Ai are all nonempty, and are pairwise disjoint (no ball goes in more than one
box), and A1 ∪· · ·∪Ac is the original set of b balls. This establishes that {A1 , . . . , Ac }
is a partition of the set of b balls into c parts, and it is clearly independent of the
labelling of the boxes.
We give the inverse to the above correspondence. Suppose {A1 , . . . , Ac } is a
partition of the set of b balls into c parts. Then if we put all the balls in Ai into
the ith box, we have found a way to distribute the b balls among c non-empty
boxes. The indistinguishability of the boxes guarantees that this is independent of
the labelling of the sets in the partition. (For instance {A1 , A2 , . . . , Ac } represents
the same partition as {A2 , A1 , . . . , Ac }, and also gives rise to the same distribution
of balls into boxes because we cannot distinguish between box 1 and box 2.)
What if we wish to dispense with the no empty boxes clause? All we can do is
the following:
Corollary 2.7. The number of ways of distributing b distinct balls among c
indistinguishable boxes (allowing empty boxes) is S(b, 0) + S(b, 1) + . . . + S(b, c).
Definition 2.8. An equivalence relation on a set X is a binary relation R on
X which is reflexive, symmetric, and transitive. That is to say:
(i) xRx for all x ∈ X
(ii) xRy implies yRx for all x, y ∈ X
(iii) if xRy and yRz then xRz for all x, y, z ∈ X.
Furthermore, the equivalence class of y ∈ X under R is the set
[y]R := {x ∈ X : xRy}.
Example 2.9. Let X = Z (the set of integers). Define xRy if and only if x − y
is divisible by 2.
In this case R has two equivalence classes: the set of even numbers and the set
of odd numbers. Notice also that this pair comprise a partition of X.
For another example, consider the nine cells of a 3 × 3 grid. Given two cells x
and y, say xRy if there is a symmetry of the grid taking x to y. Here there are three
equivalence classes: the four corner cells, the four central edge cells, and the single
central cell.
The following generalises this observation, and establishes that partitions and
equivalence relations are essentially the same thing.
11
Proposition 2.10. If R is an equivalence relation on a set X, then the collection
of equivalence classes comprises a partition of X, and every partition of X arises
this way. This gives a 1-1 correspondence between equivalence relations on X and
partitions of X.
Proof. (Sketch — filling in the details is an exercise on coursework 2.) Given
an equivalence relation R on X, show that any two equivalence classes are either
equal or disjoint, that no equivalence class is empty, and that every element of X
lies in some equivalence class. Hence the set of distinct equivalence classes forms a
partition of X.
Conversely, if {A1 , A2 , . . . , Ak } is a partition of X, then define the relation R on
X by xRy if x and y are in the same member of the partition (i.e. in the same Ai ).
Then prove that R is an equivalence relation on X.
Finally, check the above maps are each others inverse.
P
Theorem 2.11. For n ≥ 1, B(n) = nk=1 C(n − 1, k − 1) · B(n − k).
Proof. Let X = {1, . . . , n}. Any partition P of X can be constructed as follows:
first pick a subset Y of X containing n. Then choose a partition {A1 , . . . , A` } of
X Y , and then put P = {A1 , . . . , A` , Y }.
Now (for fixed k) there are C(n − 1, k − 1) ways of choosing a subset Y of X
which has cardinality k and contains n. For each such choice of Y there are B(n−k)
partitions of X Y .
So, using the multiplication principle, the number of partitions of X where n lies
in a set of size k is C(n − 1, k − 1) · B(n − k). The final step is to sum this over all
P
k, giving us nk=1 C(n − 1, k − 1)B(n − k).
Proposition 2.12. For n > 0 and k > 1 we have S(n + 1, k) = k · S(n, k) +
S(n, k − 1).
Proof. This is a similar style of proof to that of the previous theorem. Let
X = {1, . . . n + 1}. Now, any partition of X into k parts arises in one of the
following two ways:
(i) Choose a partition {A1 , . . . , Ak } of {1, . . . , n} into k parts, then add n + 1
to one of the Ai to form a partition of X into k parts. There are S(n, k) · k
ways this can be done (using the multiplication principle).
(ii) Choose a partition {A1 , . . . , Ak−1 } of {1, . . . , n} into k −1 parts, then adjoin
{n + 1} to obtain the partition {A1 , . . . , Ak−1 , {n + 1}} of X into k parts.
There are S(n, k − 1) ways this can be done.
12
Furthermore these two are mutually exclusive, since (ii) arises precisely when
{n + 1} is one of the sets in the partition. Thus the total number of partitions of X
into k parts is S(n, k) · k + S(n, k − 1).
Proposition 2.13. The number of functions from an n-element set to a kelement set is k n .
Proof. Let the sets be X = {a1 , . . . , an }, Y = {b1 , . . . , bk }. A function f from
X to Y is determined by deciding the value of f (ai ) for each i. There are k choices for
f (a1 ). For each such choice there are k choices for f (a2 ), etc. So by multiplication
principle the number of choices is k × · · · × k = k n .
Notice that this is the same as the occupancy problem of distributing n distinguishable balls between k distinct boxes. This is also the same as selecting in order
n from k distinct objects allowing repetitions. (see Lemma 1.19, be careful about
the roles of n and k.)
What about the number of surjective functions from a set of size n to a set of
size k?
Lemma 2.14. The number of surjective (or onto) functions from an n-element
set to a k-element set is k!S(n, k).
Proof. Let A = {a1 , . . . , an } and B = {b1 , . . . , bk }. Observe that a surjective
function from A to B is given by a partition of A into k parts together with an ordering (or arrangement) of those parts. Namely the first part is {x ∈ A | f (x) = b1 },
second part is {x ∈ A | f (x) = b2 }, etc. Hence the number of surjective functions
from A to B is precisely the number of partitions of A into k parts multiplied by
the number of possible arrangements of k objects, which is S(n, k) × k!.
Corollary 2.15. The number of ways of distributing n distinct balls among k
distinct boxes leaving no box empty, is k!S(n, k).
Remark 2.16. The number of injective functions from a set of size k to a set of
size n, where k ≤ n is P (n, k). See Exercise 1, Workshop 2.
13
CHAPTER 3
The Inclusion–Exclusion principle and applications.
See also Chapter 4 of Allenby–Slomson and/or Chapter 2 of Slomson’s first
edition for more on Inclusion-Exclusion.
Our aim now is to find a formula for the Stirling numbers S(n, k). See Corollary
3.4 below. We will make use of the so-called inclusion-exclusion principle, which
also has many other uses.
First, recall the easy fact that if A and B are finite sets, then
|A ∪ B| = |A| + |B| − |A ∩ B|.
Similarly
|A ∪ B ∪ C| = |A| + |B| + |C| − |A ∩ B| − |A ∩ C| − |B ∩ C| + |A ∩ B ∩ C|.
We can now give the general formula:
Theorem 3.1 (The Inclusion–Exclusion Principle Theorem). Given finite subsets A1 , . . ., An of some given set,
|A1 ∪ · · · ∪ An | =
n
X
k=1
!
X
(−1)k+1
|Ai1 ∩ · · · ∩ Aik | .
1≤i1 1, an integer, (iv) B(4).
2
1
1 X
1
(−1)2−j C(2, j)j 3 =
−C(2, 1) · 1 + C(2, 2) · 23 = [−2 + 8] = 3.
2! j=1
2
2
(i) S(3, 2) =
4
1 X
(ii) S(8, 4) =
(−1)4−j C(4, j)j 8
4! j=1
1
−C(4, 1) · 1 + C(4, 2) · 28 − C(4, 3) · 38 + C(4, 4) · 48
24
1
1
−4 + 6 · 28 − 4 · 38 + 48 =
[1532 − 26, 244 + 65, 536] = 1, 701.
=
24
24
=
2
1 X
1
(iii) S(n, 2) =
(−1)2−j C(2, j)j n = [−C(2, 1) · 1 + C(2, 2) · 2n ]
2! j=1
2
=
(iv) B(4) =
1 n
[2 − 2] = 2n−1 − 1.
2
n
X
S(4, k) = S(4, 1)+S(4, 2)+S(4, 3)+S(4, 4)
k=1
16
We know S(4, 1) = 1, S(4, 2) = 23 − 1, by (iii). Also S(4, 4) = 1 so we need to
calculate S(4, 3).
3
1 X
S(4, 3) =
(−1)3−j C(3, j)j 4
3! j=1
1
C(3, 1) · 1 − C(3, 2) · 24 + C(3, 3) · 34
6
1
= [3 − 3 · 16 + 81] = 6.
6
So B(4) = 1 + 7 + 6 + 1 = 15.
=
Example 3.6. How many hands of 5 cards from a normal 52 card deck are there
for which there is at least one suit contributing no cards?
We let VS , VH , VD and VC be the five-card hands with no spades, hearts, diamonds or clubs respectively. We want to calculate |VS ∪ VH ∪ VD ∪ VC |. Using
inclusion-exclusion this is:
|VS |+|VH |+|VD |+|VC |−|VS ∩VH |−|VS ∩VD |−|VS ∩VC |−|VH ∩VD |−|VH ∩VC |−|VD ∩VC |
+|VS ∩VH ∩VD |+|VS ∩VH ∩VC |+|VS ∩VD ∩VC |+|VH ∩VD ∩VC |−|VS ∩VH ∩VD ∩VC |.
Now each suit has 13 cards in it so |VS | = C(39, 5) = 575, 757, we pick 5 cards from
the 39 cards which aren’t spades. Similarly, |VH | = |VD | = |VC | = 575, 757.
A hand in VS ∩ VH consists of 5 cards drawn from the 26 cards which are neither
spades nor hearts so |VS ∩VH | = C(26, 5) = 65, 780. Similarly, |VS ∩VD | = |VS ∩VC | =
|VH ∩ VD | = |VH ∩ VC | = |VD ∩ VC | = 65, 780.
Similarly |VS ∩ VH ∩ VD | = |VS ∩ VH ∩ VC | = |VS ∩ VD ∩ VC | = |VH ∩ VD ∩ VC | =
C(13, 5) = 1, 287. (5 cards from 13 cards).
Also, VS ∩ VH ∩ VD ∩ VC = ∅ as a hand must contain cards from at least one suit.
So, but noting that there are C(4, 1) ways to get one set by itself, C(4, 2) ways
to get 2 sets, C(4, 3) ways to get 3 sets and C(4, 4) ways to get 4 sets we have:
|VS ∪ VH ∪ VD ∪ VC | = C(4, 1)C(39, 5) − C(4, 2)C(26, 5) + C(4, 3)C(13, 5) − C(4, 4) · 0
= 4 · 575, 757 − 6 · 65, 780 + 4 · 1, 287 = 1, 913, 496.
17
CHAPTER 4
Partitions of numbers.
See Chapter 6 of Allenby-Slomson and Chapter 3 of Slomson (first edition).
Definition 4.1. Let n be an integer > 0. By a partition of the number n we
mean a representation of n as a sum of positive integers, where the order does not
matter (and repetition is allowed).
So (2, 2, 1, 3) and (2, 3, 1, 2) count as the same partition of 8. We may also write
this as 2 + 2 + 1 + 3.
The numbers that occur in the partition are called the parts or the sizes of parts.
Our convention will be to write the parts in decreasing order: (3, 2, 1) for the
partition representing 6 = 3 + 2 + 1. So an equivalent definition of a partition of the
number n is a decreasing sequence of positive integers which sum to n:
(n1 , n2 , . . . , nk ) where n1 ≥ · · · ≥ nk ≥ 1 and
k
X
ni = n.
i=1
The number of parts k is called the length of the partition.
Example 4.2. The partitions of 3 are (3), (2, 1) and (1, 1, 1).
The partitions of 5 are:
(5), (4, 1), (3, 2), (3, 1, 1), (2, 2, 1), (2, 1, 1, 1), (1, 1, 1, 1, 1).
Remark 4.3. A partition of the number n does not mean the same thing as
a partition of a set of size n. But what is the relationship? Take a partition
{A1 , . . . , Ak } of a set X where |X| = n. Arrange the parts Ai such that |A1 | ≥
|A2 | ≥ · · · |Ak | ≥ 1. Then |A1 | + |A2 | + · · · + |Ak | = n so (|A1 |, |A2 |, . . . , |Ak |) is
a partition of the number n. All partitions of n arise this way. But two distinct
partitions of an
n n elemento set can
n give rise oto the same partition of the number n.
For instance, {1, 2}, {3} and {1, 3}, {2} each correspond to the partition (2, 1)
of 3.
Suppose {A1 , . . . , Ak } and {B1 , . . . , B` } are partitions of a set X of size n, both
ordered by size. We could say they are equivalent if k = ` and |Ai | = |Bi | for all i.
Then equivalent partitions of X correspond to the same partition of n. Hence the
number of partitions of n is the number of equivalence classes of partitions of X.
18
Remark 4.4. The number of partitions of the number n with at most k parts
coincides with the number of ways of distributing n indistinguishable balls among
k indistinguishable boxes. (Why?)
Definition 4.5. If n and k are integers > 0, then
(i) p(n) is the number of partitions of n.
(ii) pk (n) is the number of partitions of n with at most k parts.
(iii) By convention p(0) = 1, pk (0) = 1 for k ≥ 0 and p0 (n) = 0 for n ≥ 1.
The sequence of partition numbers p(n) begins 1, 1, 2, 3, 5, 7, 11, 15, 22, 30, 42,
56, 77, 101, 135, 176, 231, 297, 385, 490, . . .
Remark 4.6 (Non-examinable). A very nice approximate expression for p(n)
was provided by Hardy and Ramanujan in 1918:
√ 2n
1
p(n) ∼ √ eπ· 3
4n 3
This is a limiting value in the sense that if you divide p(n) by this expression, the
result tends to 1 as n → ∞. We shall not get this far!
Lemma 4.7. Let n ≥ 0. Then
(a) p1 (n) = 1.
(b) For all k ≥ n, we have pk (n) = p(n).
(c) p2 (n) = b n2 c + 1,
where bxc is the largest integer ≤ x.
Proof. (a) is obvious, and (b) is true because if k ≥ n then any partition of
the number n has at most k parts.
(c) For n = 0 it’s obvious, so assume n > 1. By (a) there is exactly one partition
of n with one part: (n). The partitions of the number n with two parts are clearly
(n − 1, 1), (n − 2, 2), . . . , (n − b n2 c, b n2 c). So altogether there are b n2 c + 1 partitions
of n with at most two parts.
Definition 4.8. We let qk (n) be the number of partitions of n with all parts
(or summands) at most k.
Example 4.9. For example q5 (5) = p(5).
Also, q1 (5) = 1 (the only partition being (1, 1, 1, 1, 1)).
Theorem 4.10. For all integers k, n > 0, pk (n) = qk (n).
Proof. This theorem follows from an interesting duality for partitions, which
can itself be seen via so-called Ferrers diagrams. We start with an example.
19
The partition (7, 4, 3, 3, 1) of 18 can be represented by the following diagram of
dots:
• • • • • • •
• • • •
• • •
• • •
•
The first row is 7 dots, second row is 4 dots etc. The total number of dots is (of
course) 18. Note that the number of rows in the diagram is 5 which is the length
of the partition. Also, the number of columns is 7 which is the largest part of the
partition.
Here is the key insight: we can also read the diagram vertically rather than
horizontally, to obtain another partition of 18, namely (5, 4, 4, 2, 1, 1, 1). We call
this the dual (or conjugate or transpose) of the original partition.
We can do this in general: given a partition π of the number n, write down its
Ferrers diagram. Formally, if π = (n1 , n2 , . . . , nk ), then the Ferrers diagram consists
of a grid of k rows. In the ith row, the first ni spaces contain dots, and the remaining
spaces are empty.
Once again, reading the diagram vertically gives another partition of n. We
denote this π ∗ . Note that (π ∗ )∗ = π, and the map taking π to π ∗ gives a permutation
of the set of partitions of n.
Let us emphasize that π has k parts if and only if the largest part of π ∗ is k.
Similarly, π has at most k parts if and only if all the parts of π ∗ are ≤ k.
So
∗
induces a bijection between the set of partitions of n with at most k parts,
and the set of partitions of n with all parts ≤ k. Thus pk (n) = qk (n).
Theorem 4.11. Let 1 ≤ k ≤ n. Then
(a) pk (n) = pk−1 (n) + pk (n − k), and
(b) qk (n) = qk−1 (n) + qk (n − k).
Proof. By the previous theorem, it suffices to prove (a). Let (n1 , . . . , n` ) be a
partition of n into l 6 k parts. We have two cases:
Case (i). ` < k. Then the partition is already a partition of n into at most k − 1
parts. There are pk−1 (n) possibilities here.
Case (ii). ` = k, i.e. the partition has exactly k parts, meaning n1 , . . . , nk ≥ 1. But
then (n1 − 1) + (n2 − 1) + · · · + (nk − 1) = n − k. Now, some of these summands
may be zero. Deleting these leaves us with a partition of n − k into at most k parts.
There are precisely pk (n − k) possibilities here.
20
Now add up the possibilities for case (i) and for case (ii) to get the answer.
Before the next theorem let us first note some properties of b nk c. Assume 1 ≤
k ≤ n. We can write n = sk + r where s, r ∈ N and 0 6 r < k. Dividing by k we
get
where 0 6
r
k
< 1. So
r
k
n
r
=s+
k
k
is the fractional part of
n
k
and s = b nk c. By noting that
r = n − sk we get 0 6 n − sk < k. Also observe that n − (s − 1)k = n − sk + k =
r + k > k.
Theorem 4.12. Let 1 ≤ k ≤ n. Let s = bn/kc. Then
(a) qk (n) = qk−1 (n) + qk−1 (n − k) + qk−1 (n − 2k) + · · · + qk−1 (n − sk).
(b) pk (n) = pk−1 (n) + pk−1 (n − k) + pk−1 (n − 2k) + · · · + pk−1 (n − sk).
Proof. It is enough to prove (a), by Theorem 4.10. Iterating Theorem 4.11(b)
yields:
qk (n) = qk−1 (n) + qk (n − k)
qk (n − k) = qk−1 (n − k) + qk (n − 2k)
..
.
..
.
..
.
qk (n − (s − 1)k) = qk−1 (n − (s − 1)k) + qk (n − sk).
As remarked above n − sk < k, and thus no partition of n − sk can have a part of
size k. So qk (n − sk) = qk−1 (n − sk). Putting all this together:
qk (n) = qk−1 (n) + qk−1 (n − k) + · · · + qk−1 (n − (s − 1)k) + qk−1 (n − sk).
Remark 4.13. If we define p(k, n) to be the number of partitions of n of length
k and q(k, n) to be the number of partitions of n with largest part k, then
(i) p(0, 0) = 1, p(0, n) = 0 for n > 1, and p(k, n) = 0 for k > n > 0,
(ii) p(k, n) = q(k, n) for all k, n > 0,
(iii) p(k, n) = p(k − 1, n − 1) + p(k, n − k) for 1 6 k 6 n.
Assertion (i) is obvious, (ii) is Coursework 2 Question 4(i) and (iii) is Workshop 2
Question 5(i).
21
CHAPTER 5
Generating functions. Euler’s generating function for
partition numbers.
1. Functions and Power Series.
We shall consider power series both from an algebraic and from a functional
point of view in order to solve combinatorial problems.
Question 5.1. Consider the series given by the product:
(1 + x + x2 + x3 + · · · )(1 + x2 + x4 + x6 + · · · ) .
(5.1)
What is the coefficient of x7 in this product?
Answer. When we multiply out we obtain the following terms (also called summands) of degree 7:
x · x 6 , x 3 · x 4 , x 5 · x 2 , x 7 · x0 .
So the coefficient of x7 is 4.
Question 5.2. In how many ways can you write down 7 as the sum l + m of
nonnegative integers l, m where m is even?
Answer. 1 + 6, 3 + 4, 5 + 2, 7 + 0.
Note the correspondence! In fact we can see that the coefficient of xn in equation
(5.1) is the number of ways of representing n as the sum l+m of nonnegative integers
such that m is even. In other words, if an is the number of ways of writing n as the
sum of two nonnegative integers where the second is even, then
2
3
2
4
6
(1 + x + x + x + · · · )(1 + x + x + x + · · · )
=
∞
X
an x n .
(5.2)
n=0
Now, to obtain combinatorial results we consider power series in two ways: algebraically and functionally.
22
The Algebraic Approach. Here we view the series
a0 + a1 x + a2 x 2 + · · ·
∞
X
=
an x n
(5.3)
n=0
as the infinite sequence
(a0 , a1 , a2 , · · · )
=
(an )n>0
(5.4)
Note. A polynomial corresponds to the series in (5.3) with only finitely many
nonzero coefficients and so in (5.4) to a finite sequence.
Important Point. Coefficients of power series themselves obtained by multiplication and addition of power series can be obtained by multiplication and addition of
the corresponding sequences as set out below.
Addition.
(a0 , a1 , a2 , · · · ) + (b0 , b1 , b2 , · · · )
=
(a0 + b0 , a1 + b1 , a2 + b2 , · · · )
Multiplication.
(a0 , a1 , a2 , · · · ) ∗ (b0 , b1 , b2 , · · · )
where cn =
Pn
i=0
=
(c0 , c1 , c2 , · · · )
ai bn−i . Indeed, multiplying a0 +a1 x+a2 x2 +· · · and b0 +b1 x+b2 x2 +
· · · and gathering terms/summands of the same degree we see that the coefficient
of xn is:
a0 bn + a1 bn−1 + · · · + an b0
=
n
X
ai bn−i .
i=0
The Functional Approach. We regard a power series as a function of x. Thus,
in equation (5.2) we have
f (x) · g(x) = h(x)
(5.5)
P
P∞ 2n
P
n
n
where f (x) = ∞
and h(x) = ∞
n=0 x , g(x) =
n=0 x
n=0 an x . This equation
now tells us that the product f (x)·g(x) is the same as h(x) if both f (x) and g(x) are
defined. Thus equation (5.2) makes sense provided f (x) and g(x) are both defined
for some x ∈ R (i.e. if they both have radius of of convergence > 0.)
For example in equation (5.2) f (x) and g(x) correspond to geometric series and
so converge for any x ∈ R such that |x| < 1. So their product—i.e. the series
corresponding to h(x)—also converges for such x.
23
Combining the Algebraic and Functional Approaches. A standard theoP
n
rem (that we could easily prove) tells us that if, for some R > 0, the series ∞
n=0 an x
P
n
and ∞
n=0 bn x both converge for all x ∈ R with |x| < R, then
∞
X
an x n =
n=0
∞
X
bn xn for all x with |x| < R
⇔
an = bn for all n > 0.
(5.6)
n=0
Thus (under such conditions) an equation such as (5.2) is true regarded as a functional equation if and only if it is true when regarded as an algebraic equation
between (formal) power series /infinite sequences.
We often use equation (5.6) to argue that if two functions are the same, then the
coefficients in the power series represented by these functions must match exactly.
i.e. we use the method of equating coefficients.
2. Generating Functions.
Definition 5.3. The generating function of the sequence (an )n>0 is the function
x 7→
∞
X
an x n
n=0
(i.e. f (x) such that f (x) =
P∞
n=0
an xn ).
Note. We only use Definition 5.3 when
P∞
n=0
an xn has positive radius of con-
vergence, i.e. is defined for some x 6= 0, in which case we can use the method of
equating coefficients.
Example 5.4. Find the generating function of the (finite) sequence {C(m, n)}m
n=0 .
P
n
m
Answer. By the binomial theorem m
n=0 C(m, n)x = (1 + x) . Hence f (x) =
(1 + x)m is the generating function of the sequence {C(m, n)}m
n=0 .
(Remark. f (x) is clearly convergent. Why?)
1
is the generating function
1−x
where an = 1 for all n > 0—i.e. the constant sequence
Example 5.5. Show that the function f (x) =
of the sequence (an )n>0
(1, 1, 1, . . . ).
P
n
Answer. The geometric series ∞
n=0 x converges for any x ∈ R such that |x| < 1
P
1
n
and, for this range of values ∞
. Note that this
n=0 x = f (x) where f (x) =
1−x
P∞
series is of the form n=0 an xn with an = 1 for all n > 0. Hence the result.
Example 5.6. Determine the generating function of the sequence (n)n>0 .
24
Answer. We know that we can compute the derivative of a power series over its
domain of convergence by differentiating the series term by term. Thus, as
1
1−x
∞
X
=
xn
for x such that |x| < 1 ,
(5.7)
n=0
∞
X
1
by differentiating both sides of equation (5.7) we see that
=
nxn−1 . So
2
(1 − x)
n=0
x
(1 − x)2
∞
X
=
nxn
for x such that |x| < 1 .
n=0
x
Thus g(x) =
is the generating function for the sequence {n}.
(1 − x)2
Example 5.7. We can obtain the generating function f (x) of
1
n(n + 1) n>0
2
by
first finding the generating function h(x) for (n2 )n>0 (See Exercise 3 on Coursework
3) so that we get f (x) = 21 (h(x) + g(x)) where g(x) is the generating function for
(n)n>0 from above.
Alternatively we note that 21 n(n + 1) = 1 + 2 + · · · + n so that
1
n(n + 1) n>0
2
is the
sequence (an )n>0 where:
a0 = 0 and an+1 = an + (n + 1) for all n > 0.
Then
f (x) =
∞
X
an x n .
n=0
Multiplying the equation an+1 = an + (n + 1) by xn and summing we get:
∞
X
an+1 x
n
=
∞
X
n=0
n
an x +
n=0
n=0
However
∞
X
∞
X
nx +
n=0
∞
an+1 x
n
n
∞
X
xn .
n=0
∞
1X
1X
1
=
an+1 xn+1 =
an xn = f (x)
x n=0
x n=0
x
where we note that the second to last equality holds due to the fact that a0 =
1
0(0
2
+ 1) = 0. Thus, from these equations we deduce that:
1
x
1
f (x) = f (x) +
+
2
x
(1 − x)
1−x
1
= f (x) +
.
(1 − x)2
Hence
f (x) =
x
.
(1 − x)3
25
(5.8)
3. Euler’s generating function for partition numbers.
In this section we aim to solve the following type problem.
Problem (∗). Given an integer n > 0, is the number of ways of writing n as the
sum of odd positive integers the same as the number of ways of writing n as the
sum of positive integers that are all different?
Question 5.8. Check that the solution to Problem (∗) is yes for n = 7.
Answer. There are five partitions of 7 into parts that are all odd as follows:
7, 5 + 1 + 1, 3 + 3 + 1, 3 + 1 + 1 + 1 + 1, 1 + 1 + 1 + 1 + 1 + 1 + 1
and there are five partitions of 7 into distinct parts:
7, 6 + 1, 5 + 2, 4 + 3, 4 + 2 + 1 .
We will solve Problem (∗) in general by showing that the generating functions
for the sequences corresponding to these two types of numbers are identical.
Our next question exemplifies our approach.
Question 5.9. What is the coefficient of x89 in the series:
(1 + x3 + x6 + x9 + · · · )(1 + x8 + x16 + x24 + · · · )
(5.9)
Answer. By inspection we obtain x89 in the following four ways,
x9 · x80 , x33 · x56 , x57 · x32 , x81 · x8 ,
and hence the coefficient of x89 in the series is 4. This is the same as the number
of ways of writing 89 in the form 3m + 8n where m and n are positive integers. I.e.
the number of partitions of 89 using only parts of size 3 and 8.
Remark 5.10. So the general observation here is that a partition π = (n1 >
n2 > · · · > nk > 1) of n, can also be written in ascending order and then in
“exponential form”:
1, . . . , 1 , 2, . . . , 2 · · · · · · = 1m1 2m2 · · · ,
| {z } | {z }
m1 copies
m2 copies
where the mi , 1 6 i 6 n1 , are integers > 0 satisfying
Pn1
i=1
i · mi = n. Here mi = 0
means that i does not occur as a part in π. E.g. (4, 4, 2, 1, 1, 1) = 13 2 42 = 13 21 30 42 .
26
Now, we easily generalise the example in Question 5.9.
All Partitions. Since a partition of positive integer n may use parts of any size
1 6 t 6 n, a series whose coefficients count all the partitions of n (for whichever n
we choose) must be the infinite product of (all) infinite series corresponding to parts
of size 1, 2, 3, . . . Indeed the following result gives us the generating function for the
sequence p(n) n>0 of partition numbers. (Remember that p(0) = 1 by convention.)
Theorem 5.11 (Euler). If P (x) =
the sequence p(n) n>0 , then
P∞
n=0
p(n)xn is the generating function for
P (x) = (1 + x + x2 + x3 + · · · )(1 + x2 + x4 + x6 + · · · )(1 + x3 + x6 + x9 + · · · ) · · ·
=
∞
Y
k
(1 + x + x
2k
+x
3k
+ ···) =
k=1
∞
Y
1
.
(1 − xk )
k=1
Proof. Note firstly that, to compute the coefficients of xn in the first (or second)
expression for P (x), we do not need to consider factors (1 + xk + x2k + · · · ) of the
product where k > n. Hence, when multiplying out, each copy of xn arises as
xn = (x1 )m1 × (x2 )m2 × · · · × (xn )mn
(5.10)
where 0 6 mi 6 n for all1 1 6 i 6 n and
1 · m1 + 2 · m2 + · · · + n · mn = n .
(5.11)
So in the i-th factor we pick the (mi + 1)-th term. It follows that the coefficient
of xn in P (x) is the number of tuples (m1 , m2 , . . . , mn ) as above. By Remark 5.10,
this is the number of partitions of n.
Remark. It is not hard to show that the infinite product in Theorem 5.11
Q
converges for |x| < 1. A series product ∞
k=1 uk makes sense purely algebraically if
(1) all but finitely many uk ’s have constant term one and (2) the degree of the first
nonconstant term of uk tends to infinity when k → ∞. The latter can be rephrased
as saying that each power xn , n > 1, occurs in only finitely many uk ’s with nonzero
coefficient.
Now for some examples.
1Note,
for example, that the only case when we have mi = n for some i is when i = 1, in
which case mj = 0 for all 1 < j 6 n. On the other hand mn ∈ {0, 1} and in the case when mn = 1
clearly mk = 0 for all 1 6 k < n.
27
Example 5.12. Evaluate p(n) for 1 6 n 6 5.
Q
k
2k
3k
Answer. We restrict ourselves to terms in ∞
k=1 (1 + x + x + x + · · · ) of degree
at most 5. (Make sure at this point that you understand why!) I.e. the product
(1 + x1 + x2·1 + x3·1 + x4·1 + x5·1 )(1 + x2 + x2·2 )(1 + x3 )(1 + x4 )(1 + x5 ) .
Now, rewriting this product as a sum we see that—when we truncate the sum to
show only the first six terms—we have
1 + x + 2x2 + 3x3 + 5x4 + 7x5
since, for example there are 7 individual terms of degree 5 in the sum2:
x5 , x2 · x3 , x1 · x4 , x1 · x2·2 , x2·1 · x3 , x3·1 · x2 , x5·1
(and a similar analysis gives the other coefficients). Thus we see that:
p(1) = 1, p(2) = 2, p(3) = 3, p(4) = 5, and p(5) = 7 .
Reminder. In Problem (∗) we asked whether, for any n > 0 the number of ways
podd (n) of writing n as the sum of odd positive integers is the same as the number
of ways pdis (n) of writing n as the sum of integers that are all distinct.
Definition. Having defined podd (n) and pdis (n) as above, define Podd (x) and
Pdis (x) to be the generating functions for the sequences podd (n) n>0 and pdis (n) n>0 .
Lemma 5.13. Let Podd (x) and Pdis (x) be the generating functions defined above.
Then the following are true.
1
Podd (x) =
,
(1 − x)(1 − x3 )(1 − x5 ) · · ·
(a)
2
3
i.e. Podd (x) =
t=1
1
.
(1 − x2t−1 )
∞
Y
(1 + xt ) .
i.e. Pdis (x) =
4
Pdis (x) = (1 + x)(1 + x )(1 + x )(1 + x ) · · · ,
(b)
∞
Y
t=1
Proof. • podd (n) is the number of partitions of n into odd parts so the gener
ating function for the sequence podd (n) n>0 is obtained by taking the factors in the
product in Theorem 5.11 corresponding to parts of odd size. So
∞
Y
Podd (x) =
t
2t
(1 + x + x + · · · ) =
t=1
t>1, t odd
2Written
1
2·2
x ·x
0
out in full this is: x0 · x0 · x0 · x0 · x5 ,
0
0
·x ·x ·x , x
2·1
0
3
0
∞
Y
0
3·1
·x ·x ·x ·x , x
2
1
.
(1 − x2t−1 )
x0 · x2 · x3 · x0 · x0 ,
0
0
0
·x ·x ·x ·x , x
5·1
0
x1 · x0 · x0 · x4 · x0 ,
0
· x · x · x0 · x0 .
Note also that these terms correspond (respectively) to (all) the partitions of 5 as follows: (5),
(3, 2), (4, 1), (2, 2, 1) (3, 1, 1), (2, 1, 1, 1), (1, 1, 1, 1, 1).
28
• pdis (n) is the number of partitions of n into distinct parts, i.e. partitions of n in
which there is at most one part of any given size. So the generating function for the
sequence pdis (n) n>0 is obtained by taking the first two summands/terms in each
factor of the product. I.e.
∞
∞
.
.
Y
Y
t
2t
3t
Pdis (x) =
(1 + x + x + x + · · · ) =
(1 + xt ) .
t=1
t=1
The statement of the Lemma follows immediately from the above observations.
Now to solve Problem (∗) on page 26 it suffices to prove that the functions
Podd (x) and Pdis (x) are identical.
Proposition 5.14. For all integers n > 0, pdis (n) = podd (n).
Proof. Note that (1 − x2t ) = (1 + xt )(1 − xt ), so
∞
Q
Podd (x) = Q
∞
1
(1 −
t=1
=
x2t−1 )
(1 − x2t )
t=1
∞
Q
=
(1 −
xt )
(1 − x2t )
= (1 + xt ). Therefore
(1 − xt )
∞
Y
(1 − x2t )
t=1
(1 − xt )
∞
Y
(1 + xt ) = Pdis (x) ,
=
t=1
t=1
where the third expression is obtained from the second by multiplying numerator
∞
Q
and denominator by (1 − x2t ).
t=1
Thus the generating functions of the sequences pdis (n) n>0 and podd (n) n>0
are equal. But this is equivalent to saying that the coefficients of (the series corresponding to) these generating functions are equal. I.e. pdis (n) = podd (n) for all
n > 0.
29
CHAPTER 6
Systems of distinct representatives and Hall’s Theorem.
The Students Union has n clubs. Each club elects a delegate to the Executive
Committee. Clearly:
• The delegate must be a member of the club that he/she represents.
• No one person can represent more than one club.
Questions for the combinatorialist.
(1) Is it possible to choose representatives according to these rules?
(2) In how many ways can this be done?
We consider (1) here and touch on (2) which is more difficult.
Example. Jane and Peter are the only members of the Tennis Club, the Music
Club and the Chess Club. Then clearly the election is impossible.
More generally. If any m clubs contain between them less than m members then
the election is impossible. So a necessary condition for the election is that any m
clubs have between them at least m members. This condition also turns out to be
sufficient. This is the content of Hall’s Theorem.
1. Hall’s (marriage) Theorem
Continuing more mathematically, let A1 , A2 , . . . , An be sets which we suppose to
be subsets of some universal finite set X.
Note. In general the Ai “share” elements, i.e. any an element x ∈ X may be in
several Ai (i.e. x ∈ Ai1 ∩ · · · ∩ Aik for some i1 , . . . , ik ) or x may be in just one Ai
(or in no Ai ). We also allow for the possibility that some sets are equal.
Definition 6.1. A system of distinct representatives (SDR) for the (ordered)
family of sets A1 , . . . , An is an n-tuple (x1 , . . . , xn ) of elements of X with the
following properties.
(a) xi ∈ Ai for i = 1, . . . , n.
This says that the xi are representatives of the sets Ai .
(b) xi 6= xj for i 6= j. This says that the representatives are distinct.
30
Notation 6.2. For J = {i1 , . . . , ik } ⊆ {1, . . . , n} we will often use the shorthand:
A(J) =
[
Ai = Ai1 ∪ · · · ∪ Aik .
i∈J
Note that A(J1 ∪ J2 ) = A(J1 ) ∪ A(J2 ).
Definition 6.3. The family of sets A1 , . . . , An satisfies Hall’s condition if, for
any subset J ⊆ {1, . . . , n},
|A(J)| > |J| .
Theorem 6.4 (Hall’s Theorem). Let A1 , . . . , An be a family of subsets of X.
Then there exists a SDR for A1 , . . . , An ⇔ Hall’s condition holds (for this family).
In other words, Hall’s condition is a sufficient and necessary condition for A1 , . . . , An
to have a SDR.
Proof. (⇒) Suppose that there is a SDR—say (x1 , . . . , xn )—for the family
A1 , . . . , An . Consider any subset J ⊆ {1, . . . , n}. Then A(J) contains all the
S
sets Ai for i ∈ J (as A(J) = i∈J Ai ) and so contains the set of representatives
{ xi | i ∈ J }. Since the representatives are distinct we have |A(J)| > |J|. So Hall’s
condition holds.
(⇐) Let A1 , . . . , An be a family of sets satisfying Hall’s condition. We must prove
that a SDR can be found for the family. Our proof will be by induction on n. Firstly
we note the following.
• Base case n = 1. Hall’s condition guarantees that A1 is nonempty. (Why?) So
picking x1 ∈ A1 we have that (x1 ) is a SDR for the family {A1 }.
• So we assume that n > 1 and as Inductive Hypothesis (IH) that, any family
{B1 , . . . , Bm } such that m < n which satisfies Hall’s condition has a SDR.
Let us call a set J ⊆ {1, . . . , n} is critical if |A(J)| = |J|. (Note that in this case
all of the members of the sets Ai such that i ∈ J must be used as representatives
for these sets.)
We split the rest of the proof into 2 cases.
Case 1. No nonempty set is critical except possibly {1, . . . , n}. I.e. |A(J)| > |J|
for every nonempty proper subset of indices J.
• Choose any element in An as the representative xn for this set. (Note that
An 6= ∅ by Hall’s condition applied to J = {n}.) Then xn cannot be the
representative of any other set so we remove it.
31
• Let A0i = Ai {xn } for each 1 6 i 6 n − 1. Note that, for any nonempty
subset J ⊆ {1, . . . , n − 1} we have
|A0 (J)| > |A(J)| − 1 > |J| − 1
since |A(J)| > |J| in the present case. But this implies that |A0 (J)| > |J|,
so the family A01 , . . . , A0n−1 satisfies Hall’s condition. By the inductive
hypothesis this family has a SDR (x1 , . . . , xn−1 ) say.
• Hence (x1 , . . . , xn−1 , xn ) is a SDR for the original family A1 , . . . , An since
xn ∈ An and xn 6= xi for 1 6 i 6 n − 1 since xi belongs to A0i but xn does
not!
Case 2. There is a critical subset J ⊆ {1, . . . , n} such that 1 6 |J| 6 n − 1.
• By the inductive hypothesis we can choose a SDR xi i∈J for the family of
sets Ai i∈J (it clearly satisfies Hall’s condition).
• For any k ∈ {1, . . . , n} J let A∗k = Ak A(J). In other words, we obtain
S
A∗k by removing from Ak all the elements in A(J) = i∈J Ai as they have
all been used as representatives for the family Ai i∈J .
• We check Hall’s condition for the family of sets A∗k k∈{1,…,n}J . For K ⊆
{1, . . . , n} J we have A∗ (K) = A(K) A(J) = A(J ∪ K) A(J).1 So
|A∗ (K)| = |A(J ∪ K)| − |A(J)|
(6.1)
> |J ∪ K| − |J|
(6.2)
= |J| + |K| − |J|
(6.3)
= |K| ,
where (6.1) holds since A(J) ⊆ A(J ∪K), and (6.2) holds by Hall’s condition
and the fact that J is critical. Also, (6.3) holds since K ∩ J = ∅. Hence
the family of sets A∗k k∈{1,…,n}J satisfies Hall’s condition. By the inductive
hypothesis (as |{1, . . . , n}J| < n) we can find a SDR xk k∈{1,...,n}J for this
family. Putting this together with the previously chosen SDR for Ai i∈J
gives us a SDR for the whole family A1 , . . . , An .
This concludes the inductive step and hence the theorem follows by mathematical
induction on n > 1.
1Remember
A(J ∪ K) =
S
to read the expressions used in (6.1) as A∗ (K) =
i∈J ∪ K
Ai = A(J) ∪ A(K).
32
S
k∈K
A∗k , A(J) =
S
i∈J
Ai and
Remark (Marriage Interpretation). Assume given n women labelled 1, . . . , n
and for each i a set of men Ai which woman i would be happy to marry. Then all
n women can be happily married if and only if A1 , . . . , An has a SDR. Of course,
we used here that bigamy is illegal.
We can make this problem a bit less “sexist” by taking Ai to be the set of men
that woman i would be happy to marry and that would be happy to marry woman
i. Then the happiness above can be taken to be mutual.
In order to check Hall’s condition there is a lot of work to do in general. However
there is one situation in which we can guarantee Hall’s condition.
Proposition 6.5. Let A1 , . . . , An be a family of subsets of the set X. Suppose
that there is some k > 1 such that
(a) |Ai | > k for each index i = 1, . . . , n, and
(b) for each x ∈ X there are at most k indices i ∈ {1, . . . , n} with x ∈ Ai .
Then the family has a SDR.
Proof. We show that Hall’s condition holds.
Take J ⊆ {1, . . . , n} and consider the set of pairs
Y = { (i, x) | i ∈ J and x ∈ Ai } .
Then we see the following.
• |Y | > |J| · k, since |Ai | > k for all i.
• On the other hand consider any x such that (i, x) ∈ Y for some i ∈ J. Then
S
x ∈ A(J) = i∈J Ai . Also, there are 6 k indices i ∈ J ⊆ {1, . . . , n} with
x ∈ Ai . Hence there are 6 k pairs (i, x) ∈ Y with x as second component.
So |Y | 6 k · |A(J)|.
• We thus have that k · |A(J)| > |Y | > |J| · k. I.e. |A(J)| > |J|.
Example 6.6. The Fano Plane, see also the picture on the next page,
n
o
{1, 2, 3}, {1, 4, 5}, {1, 6, 7}, {2, 4, 6}, {2, 5, 7}, {3, 4, 7}, {3, 5, 6}
consists of seven subsets of X = {1, . . . , 7} called “lines”. The elements of X are
called “points”. So each line contains exactly 3 points and each point is on exactly
three lines. Thus we know by Proposition 6.5 (which applies with n = 7 and k = 3)
that the Fano Plane has a SDR. For example, (1, 4, 6, 2, 7, 3, 5) is a SDR.
For those of you who have heard of projective planes and finite fields, the Fano
plane is the projective plane P2 (F2 ) = P(F32 ) over the field F2 with 2 elements.
33
2. How many SDR’s are there?
If the family A1 , . . . , An does not satisfy Hall’s condition then it has no SDR.
But if it does satisfy Hall’s condition then it may have many SDR’s.
Proposition 6.7. Let A1 , . . . , An be a family of subsets of a set X which
satisfies Hall’s condition. Suppose that |Ai | > k for i = 1, . . . , n where k > 1. Then
the number of SDR’s of the family A1 , . . . , An is at least2
k!
if k 6 n ,
P (k, n) if k > n .
Proof. By induction on n. See Cameron’s book.
Corollary 6.8. Suppose that the hypotheses of Proposition 6.5 are satisfied
with k 6 n. Then there are at least k! distinct SDR’s of the family of sets.
2Remember
that, for 0 6 n 6 k, P (k, n) = k(k − 1) · · · (k − n + 1).
34
CHAPTER 7
Latin squares.
1. Latin squares and Latin rectangles
Definition 7.1. A Latin square of order n is an n × n array with each entry
of the array being one element from the set {1, . . . , n} with the property that each
element of the set {1, . . . , n} occurs exactly once in each row and exactly once in
each column of the array.
Example 7.2. The following is
1
2
3
4
5
a Latin square of order 6.
2 3 4 5 6
4 6 1 3 5
6 2 5 1 4
1 5 2 6 3
3 1 6 4 2
6 5 4 3 2 1
Notice that every row and every column can be seen as a permutation of {1, . . . , 6}.
Note. Sometimes we use a different set of symbols (i.e. not {1, . . . , n}) but the
definition of Latin square is still the same.
We also need a more general definition.
Definition 7.3. A Latin rectangle is a k × n array with k 6 n, having entries
from the set {1, . . . , n}, such that each element occurs exactly once in each row and
at most once in each column. Thus the first k rows of a Latin square form a Latin
rectangle.
We now show that, for any n > 1, we can construct a Latin square row by row
using Proposition 6.5.
Proposition 7.4. Let L be a k × n Latin rectangle with k < n. Then L can be
extended to a (k + 1) × n Latin rectangle.
Proof. • For i = 1, . . . , n Let Ai be the set of symbols which do not occur in
the ith column. Then |Ai | = n − k since there are k distinct symbols in each column
of L.
35
• Consider a symbol j. How many i are there with j ∈ Ai ? Well, we know that j
occurs k times in L—once in each row. However these occurrences are in k different
columns, so there are n − k columns in which j does not occur. I.e. there are n − k
indices i with j ∈ Ai .
• We have now verified the hypotheses of Proposition 6.5 for the family of sets
A1 , . . . , An . Hence we know that this family has a system of distinct representatives (x1 , . . . , xn ) say.
• Now, all of the xi are distinct so no symbol is repeated in (x1 , . . . , xn ). Also
xi ∈ Ai for each 1 6 i 6 n, i.e. xi does not occur in the ith column. Thus, by adding
(x1 , . . . , xn ) as the (k + 1)th row we obtain a (k + 1) × n Latin rectangle.
2. How many Latin Squares are there?
Theorem 7.5. The number of Latin squares of order n is at least
n! × (n − 1)! × · · · × 2! × 1! .
Proof. In the proof of Proposition 7.4 we replace Proposition 6.5 by Corol
lary 6.8 to conclude that the number of SDR’s of the family of sets A1 , . . . , An is
at least (n − k)! and so the number of ways of extending a k × n Latin rectangle to
a (k + 1) × n Latin rectangle is at least (n − k)!.
Note that there are n! many 1 × n Latin rectangles. Each such rectangle can be
extended to a 2 × n Latin rectangle in > (n − 1)! ways. Then each such choice can
be extended to a 3 × n Latin rectangle in > (n − 2)! ways etc. We thus deduce the
result.
Finding an upper bound for the number of Latin squares of order n.
2
• There is a trivial upper bound: nn which is the number of ways of putting
n symbols in an n × n array (ignoring the Latin square constraints.)
• Refining this we see that every row is a permutation of {1, . . . , n} so there
are 6 (n!)n Latin squares of order n.
• Refining this further. . . we see that, having chosen the first row, every other
row is a derangement (i.e. a permutation π such that π(i) 6= i for all 1 6
i 6 n) of it so the number of Latin squares is 6 n!(d(n))n−1 where d(n) is
n
X
(−1)i
the derangement number. (And it is known that d(n) = n!
.)
i!
i=0
36
• The exact number has been calculated for n 6 11 by exhaustive search. For
example McKay and Wanless, in the Annals of Combinatorics 2005, show
that the number of Latin squares of order 11 is:
11! × 10! × 5363937773277371298119673540771840
i.e. very very roughly 11 × (10!)2 × 5.36393777 × 1033 .
3. Youden squares
Latin rectangles can arise from families of subsets of {1, . . . , n} in the following
way.
Proposition 7.6. Let A1 , . . . , An be subsets of {1, . . . , n}. Suppose that, for
some k > 0 every set Ai has k elements and every element x ∈ {1, . . . , n} lies in
precisely k of the sets Ai . Then there is a k × n Latin rectangle M such that the
entries in the ith column of M form the set Ai .
Proof. • By Proposition 6.5 the family of sets has a SDR (x1 , . . . , xn ) which
we can take to be the first row of the rectangle.
• Now let A0i = Ai {xi } for i = 1, . . . , n. Clearly |A0i | = k − 1. Also, given any
x ∈ {1, . . . , n}, we have used x as the representative for one of the sets of which
it is a member. So there are precisely k − 1 indices with x ∈ A0i . So the family
{A0i , . . . , A0n } satisfies the conditions of Proposition 6.5 with k replaced by k − 1. We
continue this process with each SDR forming a new row until k = 0.
The Latin rectangles that represent a family of sets as above are sometimes called
Youden squares (although they aren’t squares).
37
CHAPTER 8
Extremal Set Theory.
Given X = {1, . . . , n} we know that |P(X)| = 2n (where P(X) denotes the
powerset of X, i.e. the set/family of subsets of X) and that the number of subsets of
X of fixed size k is C(n, k). We will now look at families of subsets of X (i.e. subsets
of the elements of P(X)) which satisfy some given constraint on the relationship
between members of the family. Our objective is to find an upper bound on the
cardinality of such a family and to try to find out which families achieve this upper
bound.
1. Intersecting Families
Definition 8.1. A family F of subsets of a set X is intersecting if, for all
A, B ∈ F, A ∩ B 6= ∅.
Notation 8.2. We say that a (unordered) pair of subsets {A, B} of a set X is
a complementary pair if B = X A. (I.e. if either {A, B} = {X, ∅} or {A, B} is a
partition of X into 2 parts.)
Proposition 8.3. An intersecting family F of subsets of X = {1, . . . , n} satisfies |F| 6 2n−1 . There exist intersecting families of size 2n−1 .
Proof. • We can partition the 2n subsets of X into 2n−1 complementary pairs
{A, X A}. Note that an intersecting family F contains at most one set from each
pair. Hence |F| 6 2n−1 .
• Let F 0 be the family of all subsets containing 1 (say). Then |F 0 | = 2n−1 =
|P({2, . . . , n})|
There are many intersecting families of subsets (of an n-element set X) of size
n−1
2
. We saw one example in the proof of Proposition 8.3. The following are further
examples.
Notation 8.4. For simplicity, we sometimes use the shorthand i1 i2 . . . ik to
denote the subset {i1 , i2 , . . . , ik } of some set X. For example, {12, 13, 14, 23, 24, 34}
denotes the family of subsets
{{1, 2}, {1, 3}, {1, 4}, {2, 3}, {2, 4}, {3, 4}}
38
of size 2 of the set X = {1, 2, 3, 4}.
Example 8.5. We suppose that X = {1, . . . , n}.
• Suppose firstly that n is odd. Then the family of subsets containing more than
n
2
elements has size/cardinality 2n−1 since it contains exactly one of each comple-
mentary pair of subsets. Note that this family is intersecting (see below for the
argument).
• If n is even, then any family F which contains all the subsets of size >
precisely one of each complementary pair of sets of size
n
2
n
2
and
(i.e. pairs {A, X A} such
that |A| = |X A| = n2 ) contains 2n−1 sets (one from each complementary pair). It
is also intersecting. Indeed if A, B ⊆ X are such that |A| >
|A ∩ B| = |A| + |B| −|A ∪ B| >
n
2
n
2
and |B| > n2 , then
+ n2 − n = 0; and if |A| = |B| =
n
2
and A ∩ B = ∅,
then A and B are complementary, so they cannot both belong to F.
• Specific examples of the second case above are (using the shorthand of Notation 8.4) for X = {1, 2, 3, 4} the two families
F1 = {1234, 123, 124, 134, 234, 12, 24, 23} ,
F2 = {1234, 123, 124, 134, 234, 13, 14, 34} .
Note that these intersecting families both have cardinality 8 and differ precisely in
the 3 sets of size 2. There are 6 more of such intersecting families of maximal size.
Example 8.6. Let X = {1, 2, 3, 4, 5, 6, 7} and let B be the family of seven subsets
{123, 145, 167, 246, 257, 347, 356} .
This is the Fano Plane from Example 6.6. It is also known that (X, B) is the only
Steiner Triple System of order 7. Let F be the family of all those subsets of X
which contain a member of B. Then F is intersecting and |F| = 64 = 27−1 (See
Workshop 5, Question 2).
Remark 8.7. Suppose that, given some universal n-element set X, we only
consider families F consisting of k-element subsets. Note the following.
(a) For n < 2k any family of k-element subsets is intersecting.
(b) Now assume n > 2k.
• If n = 2k, then, by the argument from Example 8.5, the intersecting
families F of k-element subsets are precisely those which contain at most
39
one of each complementary pair of subsets (as defined in Notation 8.2) of
X of size k. So
1
1 (2k)(n − 1)!
|F| 6 · C(n, k) = ·
=
2
2 (n − k)!k!
(n − 1)!
(n − 1) − (k − 1) !(k − 1)!
= C(n − 1, k − 1) .
Note that if we define for i ∈ X (n arbitrary), Fk (i) as the family of all
k-element subsets of the n-element set X containing the element i, then
|Fk (i)| = C(n − 1, k − 1).
• If n > 2k, then we still have for any intersecting family F of k-element
subsets of X that |F| 6 C(n − 1, k − 1). Moreover, if n > 2k and F is such
a family with |F| = C(n − 1, k − 1), then F = Fk (i) for some i ∈ X.
• We can generalise this further. Call a family F t-intersecting if |A ∩ B| > t
for all A, B ∈ F. Then we have the following result
Theorem 8.8. (Not Examinable) Given k > t > 1 there exist n1 and
n2 such that the following hold.
(a) If n > n1 , any t-intersecting family of k-element subsets of an nelement set has size at most C(n − t, k − t).
(b) If n > n2 , a t-intersecting family of k-element subsets of an n-element
set which has size C(n − t, k − t) consists of all k-element subsets
containing some fixed t-element subset (of the n-element set).
Theorem 8.8 and the t = 1 result for n > 2k before it are due to Erdös,
Ko and Rado.
2. Sperner Families
Definition 8.9. A family F of subsets of a set X is called a Sperner family if
no member of F properly contains any other member of F, i.e.
A, B ∈ F ⇒ A 6⊂ B and B 6⊂ A ,
where ⊂ denotes proper inclusion.
Note 8.10. Suppose that |X| = n. Then, for any fixed 1 6 k 6 n the family of
all k-element subsets of X forms a Sperner family containing C(n, k) sets. Notice
that the binomial coefficients increase to the midpoint and then decrease. So the
largest Sperner families of this type occur when k =
n+1
2
n
2
if n is even and k =
n−1
2
or
if n is odd. We will see that these are the largest Sperner families in general.
40
Theorem 8.11. (Sperner’s Theorem) Let F be a Sperner family of subsets of
the n-element set X. Then the following statements are true.
(a) |F| 6 C(n, b n2 c).
(b) If equality holds in (a) then F consists of all subsets of X of size b n2 c or F
consists of all subsets of X of size d n2 e.
Note. If n is even, then b n2 c = d n2 e = n2 .
Proof. We may assume X = {1, . . . , n}. Define a chain of subsets of X to be
a sequence of the form
∅ = A0 ⊂ A1 ⊂ · · · ⊂ An = X
Remark. Recall that “⊂” is proper inclusion. Thus |Aj+1 Aj | = 1 for all
j ∈ {0, . . . , n − 1}.
We now prove (a) and (b).
(a) Claim 1. There are n! chains.
Proof of Claim 1. If π is a permutation of X, then A0 ⊂ A1 ⊂ · · · ⊂
An is a chain where Ai = {π(1), . . . , π(i)} for 1 6 i 6 n (and A0 = ∅). Also,
if B0 ⊂ B1 ⊂ · · · ⊂ Bn is a chain, then π : X → X defined by setting π(i)
to be the element in Bi+1 Bi for all 0 6 i 6 n − 1 is a permutation of
X. Thus chains and permutations of X are in one-one correspondence. So
there are n! chains.
Claim 2. For any k-element subset A, A lies in k!(n − k)! chains.
Proof. Indeed, there are k! chains ∅ = A0 ⊂ · · · ⊂ Ak = A and (n−k)!
chains A = B0 ⊂ · · · ⊂ Bn−k = X (by the same reasoning). So there are
k!(n − k)! such chains in total.
Remark. Note also that
k!(n − k)! = n! ×
1
.
C(n, k)
Thus A (in the proof of Claim 2) is contained in
(8.1)
1
C(n,k)
of the overall number
of chains. Moreover, by symmetry we can deduce this directly from the fact
that there are C(n, k) k-element subsets and that for any two distinct kelement subsets the sets of chains to which they belong are disjoint. Using
this observation and equation (8.1) we obtain another proof of Claim 2.
41
Now consider a Sperner family F of subsets of X. By definition any chain
contains at most one member of F. So for the number N of chains containing a member of F we have
X
N =
|A|!(n − |A|)!
A∈F
|A|!(n − |A|)!
=
n!
n!
A∈F
!
X
1
= n!
.
C(n, |A|)
A∈F
X
However N 6 n! as there are only n! chains in all. Thus
X
N
1
=
6 1.
C(n, |A|)
n!
A∈F
But, as mentioned above the middle binomial coefficients are largest so their
reciprocals are smallest. Thus, setting m = b n2 c we have
X
1
1
|F| ×
6
6 1.
C(n, m)
C(n, |A|)
A∈F
Thus |F| 6 C(n, m).
(b) When is |F| = C(n, b n2 c)?
By inspection of the argument above, this only happens when every chain
contains a member of F (N = n!) and for all A ∈ F, C(n, |A|) = C(n, b n2 c).
So, noting that C(n, b n2 c) = C(n, d n2 e) we must have |A| ∈ {b n2 c, d n2 e} for
all A ∈ F. There are two cases.
(i) n is even. Then
n
2
= b n2 c = d n2 e and F consists of all subsets of size n2 .
(ii) n is odd. Again put m = b n2 c. Then m + 1 = d n2 e.
Claim. F either consists of all subsets of size m or all subsets of size m + 1.
Proof of Claim. • Suppose that F contains some A such that |A| =
m. Note that any other m-element subset B of X contains j elements
outside A and m − j elements in A, for a unique 1 6 j 6 m. We call B “jdifferent” to A. We prove by induction that all j-different (to A) m-element
subsets are in F (and so all m-element subsets are in F). For j = 1, suppose
A = Y ∪ {a} and B = Y ∪ {b}. Then A ⊂ Y ∪ {a, b} so Y ∪ {a, b} ∈
/ F.
Also B ⊂ Y ∪ {a, b} is part of a chain. Thus, as Y ∪ {a, b} ∈
/ F it must
be the case that B ∈ F (B and Y ∪ {a, b} are the only elements in such
42
a chain which have size m or m + 1, so one of these two must be in F).
Hence every “1-different” (to A) k-element subset is in F. Now suppose
that 1 6 j 6 m and that every “j-different” m-element subset is in F. Let
B be a “(j + 1)-different” m-element subset. Then choose any “j-different”
m-element subset C which differs from B on just one element. Then C ∈ F
and by the argument above we show that therefore also B ∈ F. Thus by
induction on 1 6 j 6 m we see that all m-element subsets are in F (as
each such subset is “j-different” (to A) for some 0 6 j 6 m). As there are
C(n, m) such subsets there are no other subsets in F.
• If F doesn’t contain some A with |A| = m, then |A| = m + 1 for all
A ∈ F. Since there are C(n, m + 1) = C(n, m) such subsets, F consists of
all the (m + 1)-element subsets of X.
This concludes the proof of Theorem 8.11.
Example 8.12.
(a) Show that the largest Sperner family of subsets of X =
{1, 2, 3, 4, 5, 6} containing the subsets {1}, {2}, {3, 4} and {5, 6} has cardinality 8 (i.e. contains precisely 8 subsets of X).
Solution. Suppose that F ⊆ P(X) is a Sperner family such that {1}, {2},
{3, 4}, {5, 6} ∈ F. Then, for any other B ∈ F, 1, 2 ∈
/ B; B contains 3 or 4,
but not both (otherwise B ⊂ {5, 6} or {3, 4} ⊂ B); and B contains 5 or 6,
but not both (otherwise B ⊂ {3, 4} or {5, 6} ⊂ B). Hence F contains the
four subsets above and some of the sets {3, 5}, {3, 6}, {4, 5}, {4, 6}. Note
that these four sets together with the initial four sets form a Sperner family.
So any Sperner family F containing the subsets {1}, {2}, {3, 4} and {5, 6}
satisfies |F| 6 8 and amongst these there is a unique one with |F| = 8.
(b) How does this compare with the bound N given by Sperners Theorem?
Solution. The bound N is achieved by a family F where F contains all
subsets of size b 26 c =
6
2
= 3. Thus N = |F| = C(6, 3) =
6×5×4
3×2×1
= 20.
3. The De Bruijn Erdös Theorem (Not Examinable)
This concerns a special case of intersecting families. Here, instead of assuming
that any two sets in the family F intersect over at least one element we assume that
they intersect in precisely one element.
43
Theorem 8.13. Let F be a family of subsets of X = {1, . . . , n}. Suppose that
any two sets of F have exactly one point in common. Then |F| 6 n. If equality
holds, then one of the following situations occurs.
(a) Up to renumbering the points and sets, we have F = {A1 , . . . , An } where
Ai = {i, n} for 1 6 i 6 n (so that An = {n}).
(b) Up to renumbering the points and sets, we have F = {A1 , . . . , An } where
An = {1, 2, . . . , n − 1} and Ai = {i, n} for all 1 6 i 6 n − 1.
(c) For some positive integer q we have n = q 2 + q + 1 with each set in F being
of size q + 1 and each point lying in q + 1 members of F.
Remarks 8.14.
1. The proof of Theorem 8.13 uses Hall’s Theorem on the existence of SDR’s.
2. When n = 3 (i.e. q = 1), then case (b) and case (c) describe the same case.
3. For q > 2 the structure described in case (c) is called a projective plane of order
q. Only for q a prime power examples are known. In this case the usual projective
plane P2 (Fq ) = P(F3q ) over the finite field Fq with q elements is an example. The
first example—i.e. for q = 2—is known as the Steiner triple system (X, F) of order
7 where X = {1, 2, 3, 4, 5, 6, 7} and F = {123, 145, 167, 246, 257, 347, 356}. This
configuration is also known as the Fano Plane, see Examples 6.6 and 8.6. Note that,
as q = 2, n = q 2 + q + 1 = 7 and each set has q + 1 = 3 elements whereas each point
lies in q + 1 = 3 sets of the family F.
44
CHAPTER 9
The pigeonhole principle & Ramsey numbers.
In this final chapter we discuss counting problems with a slightly different flavour.
We start with some preliminary definitions which will be useful to our understanding
of Ramsey Theory.
1. Preliminaries
We start by defining the notion of a graph.
Intuitively by a graph we will mean a certain set of “points”, called vertices,
where some pairs of points are joined by an edge (and others not).
The edges have no direction, we only allow distinct points to be joined by an edge,
and at most one edge can link any two points. These objects are sometimes called
simple graphs or “undirected graphs without loops and without multiple edges”.
More formally, a graph is a pair G = (V, E) such that V is a set and E is a
set which consists of 2-element subsets of V . So V consists of the vertices of G and
{x, y} ∈ E means that x and y are connected by an edge of G. We will be concerned
with finite graphs, namely graphs with a finite number of vertices, so also a finite
number of edges. In this case we often take the set of vertices to be {1, 2, . . . , n}.
By a subgraph of a graph G = (V, E) we mean a graph G0 = (V 0 , E 0 ) where
V 0 is a subset of V and E 0 consists of all edges in E which have endpoints in V 0 .
So a subgraph of G is completely determined by choosing a subset V 0 of the set of
vertices V of G.
Graphs are a useful mathematical way of modelling networks of various types.
For example, the Facebook network can be modelled by a node for every user, and an
edge between nodes if the people are friends. (Notice that the Twitter and Google+
networks are not immediately graphs in our sense, since ‘following’ and ‘encircling’
are not symmetric: I can follow Stephen Fry without him following me.)
The actual graphs that we will encounter in this chapter are complete graphs as
described in the following example.
45
Example 9.1. A graph G = (V, E) is called complete, if E consists of all 2element subsets of V . So in a complete graph any two (distinct) vertices are connected by an edge. Note that a subgraph of a complete graph is again complete.
We denote the complete graph with vertex set {1, . . . , n} by Kn and will refer to
any complete graph G = (V, E) with |V | = n as “a Kn ”.
So K3 has vertex set V = {1, 2, 3} while its edge set is E = {{1, 2}, {1, 3}, {2, 3}}
(namely all unordered pairs).
Figure 1. The complete graphs Kn for n = 1, 2, 3, 4, 5
2. The Pigeonhole Principle
Proposition 9.2 (Pigeonhole principle). Suppose that n > r ≥ 1. If n pigeons
are distributed among r pigeonholes then at least one pigeonhole will contain at least
two pigeons.
Here is a slight generalisation:
Proposition 9.3 (Generalised pigeonhole principle). Suppose that r, m ≥ 1 and
n > mr. Suppose that a set of n elements is written as a disjoint union of r subsets.
Then at least one of these subsets will contain at least m + 1 elements.
Proof. Notation: we shall use the square union symbol t to indicate pairwise
disjoint union. So, if S is our set of size n, then the hypothesis is that S = S1 t
S2 t · · · t Sr for some sets Si . Suppose for a contradiction that each Si has size
≤ m. Then n = |S| = |S1 t S2 t · · · t Sr | = |S1 | + |S2 | + · · · + |Sr | ≤ mr, which is a
contradiction.1
Example 9.4. Show that at least 29 numbers between 1 and 200 have the same
remainder when is divided by 7.
Let S := {1, 2, 3, . . . , 200}. For i ∈ {0, 1, 2, 3, 4, 5, 6}, let Si be the set of elements
of S whose remainder under division by 7 is i. Then S = S0 t S1 t S2 t . . . t S6 .
Since 200 > 28 · 7 = 196, the generalised pigeonhole principle tells us that at least
one Si contains at least 29 elements, meaning that at least 29 of the numbers in S
have remainder i when divided by 7.
1For
the proof we only need that S is the union of the Si , we don’t really need the disjointness.
46
Example 9.5. Suppose that each day, a child paints her calendar with 2 out of
a total set of 5 colours. Prove that over a period of 21 days there are at least 3 days
which she has painted with the same two colours.
Let S be the set of 21 days. There are C(5, 2) = 10 possible choices of 2 out of
5. For each such choice, i say, let Si be the set of days that she chooses i. Then
S = S1 t · · · t S10 . As 21 > 2 · 10, by the generalised pigeonhole principle, some Si
contains at least 3 elements. Namely the same choice is made on at least 3 days.
Here is a rather harder statement proven in 1935:
Theorem 9.6 (Erdős and Szekeres). Let r, s be integers ≥ 0. Then any sequence
a1 , a2 , . . ., an of n > rs distinct real numbers contains a subsequence of length
r + 1 which is strictly increasing or a subsequence of length s + 1 which is strictly
decreasing.
Proof. For each i = 1, 2, . . . , n, let ti be the length of the longest increasing
subsequence starting with ai . If some ti ≥ r + 1, the proposition is proved. So let’s
assume that each ti ≤ r, and we will try to find a decreasing subsequence of length
s + 1.
For k = 1, . . . , r, let Sk be the set of those i such that ti = k. So the sets Sk for
k = 1, . . . , r break up the set {1, 2, . . . , n} into r (disjoint) subsets (some of which
may be empty). By the generalised pigeonhole principle, as n > r · s, there is some
k such that Sk contains at least s + 1 elements.
Let Sk = {i1 , i2 , . . . , is+1 , . . .} with i1 < i2 < · · · < is+1 . Now suppose that
ail < aim for l, m with 1 6 l < m 6 s + 1. Then, since im ∈ Sk , there is an
increasing subsequence of length k starting with aim . Appending ail to the start of
the subsequence would then give an increasing sequence of length k + 1 beginning
with ail , contradicting il ∈ Sk . So ai1 > ai2 > · · · > ais+1 , which is a strictly
decreasing sequence of length s + 1 as required.
Remark. If in Theorem 9.6 we omit “distinct” and replace one of the two
occurrences of “strictly” by “weakly”, then the result is still true.
3. Ramsey Theory
The subject of Ramsey theory has the same flavour as the generalised pigeonhole principle and the Erdős and Szekeres Theorem. In a very general setting, with
apparently little information to work with, we want to find a large subset which is
well-behaved or structured in some sense.
47
More precisely, we shall work in the following context: we have a set S of n
elements, and the unordered pairs {a, b} from S are assigned colours red or blue
(according to some rule, algorithm, or random process about which we may know
nothing). The challenge is then to find a “large” subset S 0 ⊆ S such that all
unordered pairs of elements from S 0 have the same colour.
Recalling that unordered pairs can be thought of as the edges of a graph, it will
be convenient to think of this situation graph-theoretically. We consider our set S
of n elements as the set of vertices of the complete graph Kn . (Recall that in the
complete graph, all pairs of distinct vertices are joined by an edge.) The edges of
Kn are then coloured red or blue. Our challenge is to find a “large” subgraph where
all edges have the same colour.
Here is a key example, sometimes called the Party problem: At a party of 6
people there must be 3 people who have all met one another before, or 3 people who
are mutual strangers.
Lemma 9.7 (The Party Problem). Suppose |S| = 6, and unordered pairs from S
are assigned colours red or blue. Then there is a subset S 0 of S of size 3 such that
all unordered pairs of elements of S 0 are assigned the same colour.
Equivalently if the edges of K6 are coloured red or blue then K6 has a subgraph
on 3 vertices with all edges having the same colour.
Proof. Pick some fixed a ∈ S. Of the remaining 5 elements of S let X be the
set of those elements joined to a by a red edge, and Y the set of those joined to a
by a blue edge. Then S {a} = X t Y . So, by the generalised pigeon-hole principle,
either |X| ≥ 3 or |Y | ≥ 3.
Suppose first that |X| ≥ 3. Let b, c, d ∈ X be three distinct elements. If at
least 2 of them, for example b, c, are joined by a red edge, then {a, b, c} solves our
problem, namely each pair from {a, b, c} is joined by a red edge. Otherwise all pairs
from b, c, d are joined by a blue edge, and we have again solved the problem.
The case where |Y | ≥ 3 proceeds by the same argument.
Example 9.8. We cannot do the same thing with a set S of size 5. In the
following picture, interpret dashed lines as red edges, and solid lines as blue edges.
Then there is no red or blue triangle.
48
• Given integers n, p, q ≥ 1, we say that Ram(p, q, n)
Definition 9.9.
holds if: for any colouring of the edges of Kn with red and blue, Kn contains a subgraph on p vertices all of whose edges are red, or Kn contains a
subgraph on q vertices all of whose edges are blue.
• The Ramsey number R(p, q) is the least integer n, if one exists, such that
Ram(p, q, n) holds.
Of course, if given p, q there is no integer n satisfying Ram(p, q, n) then R(p, q)
is undefined.
Lemma 9.10.
(i) If Ram(p, q, n) holds then Ram(p, q, m) holds for all m ≥ n.
(ii) R(p, q) = R(q, p).
Proof.
(i) Let m ≥ n and let c be a 2-colouring of G = Km . Now look at
any n vertices of G and the subgraph G0 they form. Since Ram(p, q, n) holds,
there is a suitable red subgraph on p vertices of G0 —and so of G—or a suitable
blue subgraph on q vertices of G0 —and so of G.
(ii) Suppose that Ram(p, q, n) fails. Then there is some colouring c of Kn which
witnesses this. Now look at the reverse colouring c0 which has a red edge everywhere c has a blue edge, and vice versa. Then c0 witnesses that Ram(q, p, n)
fails.
We have actually already computed one Ramsey number:
Proposition 9.11. R(3, 3) = 6.
Proof. By Lemma 9.7 R(3, 3) exists and R(3, 3) ≤ 6. On the other hand, by
Example 9.8, R(3, 3) > 5.
Some other Ramsey numbers are easy to pin down:
Proposition 9.12.
(i) R(p, 1) = R(1, p) = 1 for all p ≥ 1.
(ii) R(p, 2) = R(2, p) = p for all p ≥ 2.
Proof. Part (i) is entirely trivial since a subgraph on one vertex has no edges,
and thus all the subgraph’s edges are vacuously red/blue. Thus we move swiftly on
to part (ii). By Remark 9.10, it is enough to show that R(p, 2) = p. As usual, we
do this in two steps:
Claim 1 R(p, 2) ≤ p.
Proof of Claim 1 We have to show that Ram(p, 2, p) in Definition 9.9 holds. Let
c be any colouring of Kp .
49
Case (i). All edges are coloured red. In this case Kp is a subgraph of itself with all
edges coloured red.
Case (ii). Some edge is coloured blue. Then the subgraph of Kp whose vertices are
the endpoints of one fixed blue edge gives a subgraph of Kp on 2 vertices all of whose
edges are blue. QED Claim 1
Claim 2 R(p, 2) > p − 1.
Proof of Claim 2 We shall show that Ram(p, 2, p − 1) does not hold. We need a
colouring c of the edges of Kp−1 red/blue with no totally red subgraph on p vertices
and no totally blue subgraph on 2 vertices. We can simply let c be the colouring
which colours all edges of Kp−1 red. QED Claim 2
More general Ramsey numbers are notoriously hard to pin down. But we can at
least establish that they exist, and give a (weak) bound:
Theorem 9.13.
(i) R(p, q) exists for all p, q ≥ 1.
(ii) R(p, q) ≤ R(p, q − 1) + R(p − 1, q) for all p, q ≥ 2.
Proof. We shall prove part (i) by induction on p+q, establishing part (ii) in the
process. By Proposition 9.12(i), we may assume that both p, q > 2. The inductive
assumption will be that R(x, y) exists for all x, y ≥ 1 such that x + y < p + q.
We shall prove that if G = (V, E) is a complete graph with R(p, q−1)+R(p−1, q)
vertices and its edges are coloured red or blue then there will be a subgraph on p
vertices with all edges red or a subgraph on q vertices with all edges blue. This will
establish both parts of the theorem.
Fix a vertex v ∈ V . Let X(v) be the set of vertices of G joined to v by a red
edge, and Y (v) the set of vertices of G joined to v by a blue edge. Then either
(I) |X(v)| ≥ R(p − 1, q), or
(II) |Y (v)| ≥ R(p, q − 1).
(Otherwise, if both (I) and (II) fail, then |V | = |V {v}| + 1 = |X(v)| + |Y (v)| + 1 6
R(p, q − 1) + R(p − 1, q) − 1.)
Suppose first that (I) holds. Then by the definition of R(p − 1, q), the subgraph
of G with X(v) as set of vertices has either (a) a subset of p − 1 vertices with all
edges between them red, or (b) a subset of q vertices with all edges between them
blue. Case (b) gives us a totally blue subgraph of G on q edges. In case (a), by
adjoining v, we get a totally red subgraph of G on p vertices.
There is a similar argument when (II) holds. This proves the theorem.
50
Corollary 9.14. For all p, q ≥ 1, R(p, q) ≤ C(p + q − 2, p − 1) = C(p + q −
2, q − 1).
Proof. This follows from Proposition 9.12(i), Theorem 9.13, induction, and the
fact (1.14 (ii)) that C(n + 1, k) = C(n, k) + C(n, k − 1). It will be left as as exercise
on the coursework.
We finish with a lower bound.
Proposition 9.15. For all p, q ≥ 1, R(p, q) > (p − 1)(q − 1).
Proof. By Proposition 9.12(i), we may assume p, q > 2. It is enough to show
that for n = (p − 1)(q − 1), Ram(p, q, n) does not hold. We will colour the edges of
the complete graph Kn on n vertices with colours red and blue such that there is
no red subgraph on p vertices and no blue subgraph on q vertices. Arrange the n
vertices in a rectangle with p − 1 rows and q − 1 columns. If u, v are in the same
row, colour the edge between them blue. Otherwise colour the edge between them
red. So clearly there is no subgraph on q vertices with all edges blue, because then
the vertices would have to be in the same row, but each row has only q − 1 elements.
Note also that there is no subgraph on p vertices with all edges red, because given
as there are only p − 1 rows, 2 of these p vertices would be in the same row (by the
pigeon-hole principle), so the edge between them would be coloured blue.
Apart from the easy values given by Proposition 9.12, here are all the Ramsey
numbers whose values are known exactly at time of writing:
R(3, 3) = 6, R(3, 4) = 9, R(3, 5) = 14, R(3, 6) = 18, R(3, 7) = 23, R(3, 8) = 28,
R(3, 9) = 36, R(4, 4) = 18, R(4, 5) = 25.
As you can see, finding the exact values of Ramsey numbers is extremely hard.
Paul Erdős is quoted as having said:
“Suppose aliens invade the earth and threaten to obliterate it in a year’s time
unless human beings can find the Ramsey number for red five and blue five. We
could marshal the world’s best minds and fastest computers, and within a year we
could probably calculate the value. If the aliens demanded the Ramsey number for
red six and blue six, however, we would have no choice but to launch a preemptive
attack.”
51
3. (a) Let A be a matrix with entries equal to 0 or 1. Let R be the set of
rows of A and let C be the set of columns of A (R and C are of course disjoint).
Assume given R’ CR and C C C such that R’ UC’ is of minimal size with the
property that every entry 1 in A lies on a row from R’ or on a column from C’.
Put p=R’| and q = |C’|.
i. Show that any set of entries 1 in A which all lie on different rows and on
different columns must have size
Purchase answer to see full
attachment
Tags:
graph theory
hall theorem
graph teory
set of entries
halls condition
User generated content is uploaded by users for the purposes of learning and should be used following Studypool’s honor code & terms of service.
Reviews, comments, and love from our customers and community: