1.11. Enumerative Combinatorics#

Main references for this section are [37, 60, 81].

1.11.1. Basic Counting#

There are two basic principles of counting: the sum rule and the product rule. The sum rule is derived from the cardinality of the union of disjoint sets. The product rule is derived from the cardinality of the Cartesian product of two sets.

1.11.1.1. Product Rule#

Theorem 1.39 (Product rule)

Suppose an experiment has two aspects. The first aspect can vary in \(n_1\) ways and the second aspect can vary in \(n_2\) ways. Assume that the two aspects are independent of each other. Then the total number of outcomes is

\[ n = n_1 n_2. \]

If there are \(k\) different aspects each of which can vary in \(n_i\) ways for \(i \in \{1,\dots,k\}\), then the total number of outcomes is

\[ n = n_1 n_2 \dots n_k = \prod_{i=1}^k n_i. \]

Examples

  1. If there are three possible coffee choices and two possible doughnut choices, then the number of different combinations of coffee and doughnut is six.

  2. If a t-shirt comes in three different sizes and four different colors, then there are twelve different varieties of the same t-shirt.
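The product rule is easy to check by direct enumeration. A small Python sketch of the two examples above (the specific coffee, doughnut, size, and color names are made up for illustration):

```python
from itertools import product

# Example 1: three coffee choices and two doughnut choices.
coffees = ["espresso", "latte", "mocha"]
doughnuts = ["glazed", "chocolate"]
combos = list(product(coffees, doughnuts))
print(len(combos))  # 3 * 2 = 6

# Example 2: three sizes and four colors of a t-shirt.
sizes = ["S", "M", "L"]
colors = ["red", "green", "blue", "black"]
varieties = list(product(sizes, colors))
print(len(varieties))  # 3 * 4 = 12
```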

1.11.1.2. Sum Rule#

Theorem 1.40 (Sum rule)

Suppose that the outcome of an experiment consists of two different cases \(A\) and \(B\) which are mutually exclusive of each other. In other words, if \(A\) occurs then \(B\) cannot occur and vice-versa. Suppose that there are \(n_1\) possible outcomes under the case \(A\) and \(n_2\) possible outcomes under the case \(B\). Then the total number of outcomes is

\[ n = n_1 + n_2. \]

In general, if the outcome of an experiment has \(k\) distinct cases which are mutually exclusive and if for each \(i\)-th case, there are \(n_i\) possible outcomes, then the total number of outcomes is

\[ n = n_1 + n_2 + \dots + n_k = \sum_{i=1}^k n_i. \]

Remark 1.23 (sum rule for sets)

The sum rule in set theory is as follows. If \(A\) and \(B\) are finite and disjoint sets, then

\[ | A \cup B| = |A | + |B|. \]

Examples

  1. Suppose a house has two floors. The first floor has \(3\) rooms and the second floor has \(5\) rooms. Then the number of ways you can select a room is \(3 + 5 = 8\).

  2. Suppose a team has \(3\) men and \(4\) women members. Then the number of ways you can select a team member is \(3 + 4 = 7\).

  3. Suppose we have to pick either a single letter or a single digit. There are \(26\) possible letters and \(10\) possible digits. Thus, there are \(36\) possible outcomes.

1.11.2. Permutations#

Permutations are different arrangements of a given finite set of objects.

Definition 1.119 (Permutation)

A permutation of \(n\) distinct objects is an arrangement of those objects in a definite order. If \(r \in \{1, \dots, n \}\) then an \(r\)-permutation of \(n\) objects is an arrangement of \(r\) of the \(n\) objects in a definite order.

Example 1.25 (Permutations of A,B,C)

Following are the possible permutations of the three letters A,B,C.

  1. A, B, C

  2. A, C, B

  3. B, A, C

  4. B, C, A

  5. C, A, B

  6. C, B, A

As we can see there are \(6\) different permutations.

Example 1.26 (\(2\)-permutations of A,B,C,D)

Following are the possible ways to pick and arrange 2 letters out of the set {A, B, C, D}.

  1. A, B

  2. A, C

  3. A, D

  4. B, A

  5. B, C

  6. B, D

  7. C, A

  8. C, B

  9. C, D

  10. D, A

  11. D, B

  12. D, C

There are 12 possible \(2\)-permutations.
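Both listings above can be reproduced with `itertools.permutations`; a quick sketch:

```python
from itertools import permutations

# All permutations of A, B, C (Example 1.25).
print(len(list(permutations("ABC"))))  # 6

# All 2-permutations of A, B, C, D (Example 1.26).
two_perms = list(permutations("ABCD", 2))
print(len(two_perms))  # 12
```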

Theorem 1.41 (Number of \(r\)-permutations)

The number of \(r\)-permutations of \(n\) objects, denoted by \(P(n, r)\), is given by

\[ P(n, r) = n (n -1) \dots (n - r + 1). \]

Proof. We proceed as follows.

  1. There are \(n\) ways of choosing the first object.

  2. Once the first object has been chosen, there are \(n-1\) ways of choosing the second object.

  3. Continuing in the same manner, once \(p\) objects have been chosen, there are \(n-p\) ways of choosing the \(p+1\)-th object.

  4. At the end, when \(r-1\) objects have been chosen, then there are \(n-(r-1)=n-r+1\) ways of choosing the last object.

  5. By the product rule, the total number of arrangements is

    \[ n (n -1) \dots (n -r +1). \]

Definition 1.120 (Factorial)

The factorial of \(n\), denoted as \(n!\) is defined as

\[ n! = n (n-1) \dots 1. \]

By convention, we define \(0! = 1\).

  1. It is clear that the number of permutations of \(n\) objects is \(n!\).

  2. We can also see that the number of \(r\)-permutations of \(n\) objects is given by the formula

    \[ P(n, r) = \frac{n!}{(n-r)!}. \]
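Both formulas can be checked against Python's built-in `math.perm`; a minimal sketch:

```python
from math import factorial, perm

def P(n, r):
    """Number of r-permutations of n objects: n! / (n - r)!."""
    return factorial(n) // factorial(n - r)

assert P(4, 2) == perm(4, 2) == 12   # the 2-permutations of A, B, C, D
assert P(3, 3) == factorial(3) == 6  # a full permutation of 3 objects
```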

Example 1.27 (Round table seating arrangement problem)

You are a family of three. You have invited five guests for a party. Your round dining table can seat all eight people. As hosts, the three of you do not want to sit next to one another at the table. The guests can be seated arbitrarily. How many seating arrangements are possible?

  1. We note that only the relative arrangements of people matters.

  2. Let your family members be called A, B, C.

  3. Let the guests be labeled as 1,2,3,4,5.

  4. We shall identify an arrangement starting from where A is sitting and then going clockwise.

  5. An example arrangement is A,1,2,B,3,4,C,5.

  6. Note that neither B nor C can be at the end as that would lead to them sitting next to A.

  7. Since A is always at the beginning of an arrangement, we can drop it.

  8. Let us replace all the guests by *.

  9. Then the arrangement looks like *, *, B, *, *, C, *.

  10. Forget about the seats for the moment and just look at this arrangement.

  11. We see that B and C must be placed in the gaps between the guests so that no two of A, B, C sit next to each other.

  12. We can think of slots between guests and denote them by |.

  13. Then the guests and slots between them can be represented as * | * | * | * | *.

  14. There are 4 slots between 5 guests.

  15. We can see that we can place B and C into either of these 4 slots.

  16. Since B and C are distinct, the order of slot assignment matters. The number of ways to assign 2 of the 4 slots to B and C is \(P(4,2) = 4 \times 3 = 12\).

  17. Once the slots have been picked up by B and C, the guests can arrange themselves in \(5!=120\) different ways.

  18. By the product rule, the total number of arrangements is \(12 \times 120 = 1440\).
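The answer \(1440\) can be verified by brute force: fix A at seat \(0\), try every arrangement of the remaining seven people, and keep those where no two hosts sit in adjacent seats around the circle. A sketch (the guest labels are illustrative):

```python
from itertools import permutations

hosts = {"A", "B", "C"}
others = ["B", "C", "g1", "g2", "g3", "g4", "g5"]  # hosts B, C and 5 guests

count = 0
for perm in permutations(others):
    table = ["A"] + list(perm)  # A fixed at seat 0; seats 0..7 form a circle
    # No two hosts may occupy adjacent seats (including the wrap-around pair).
    if all(not (table[i] in hosts and table[(i + 1) % 8] in hosts)
           for i in range(8)):
        count += 1
print(count)  # 1440
```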

1.11.3. Combinations#

Definition 1.121 (\(r\)-combination)

Assume that we have \(n\) distinct objects (where \(n \in \Nat\)). An \(r\)-combination of \(n\) objects is a subset consisting of \(r\) of those objects where \(0 \leq r \leq n\).

The number of \(r\) combinations of \(n\) objects is denoted by \(\binom{n}{r}\) or \(C(n, r)\).

  1. There is only 1 way to choose all \(n\) objects. Hence \(C(n, n) = 1\).

  2. There is only 1 way to choose none of the \(n\) objects. Hence \(C(n, 0) = 1\).

  3. There are \(n\) ways to choose one of the \(n\) objects. Hence \(C(n, 1) = n\).

  4. There are \(n\) ways to choose \(n-1\) of the \(n\) objects, as it is equivalent to dropping one of the \(n\) objects. Hence \(C(n, n-1) = n\).

Theorem 1.42 (Number of \(r\)-combinations)

The number of \(r\)-combinations of \(n\) objects is given by

\[ C(n, r) = \binom{n}{r} = \frac{n!}{r! (n-r)!}. \]

Proof. This follows from the fact that each \(r\)-combination results in \(r!\) \(r\)-permutations.

  1. By Theorem 1.41 the number of \(r\)-permutations of \(n\) objects is

    \[ P(n, r) = \frac{n!}{(n-r)!}. \]
  2. Let \(k = C(n, r)\).

  3. Then for each of the unordered subsets of \(r\) elements, there are \(r!\) ways that we can arrange them in a specific order.

  4. Hence

    \[ r! k = P(n, r) \]

    by the product rule.

  5. Hence we have

    \[ C(n, r) = k = \frac{P(n, r)}{r!} = \frac{n!}{r! (n-r)!}. \]

Example 1.28

What is the number of ways to choose 3 hearts from a deck of cards such that at least one of them is an ace or a king?

  1. There are 13 cards of the suit of hearts.

  2. Case 1: the selection includes the king but not the ace. The other two cards must be chosen from the \(11\) hearts that are neither the ace nor the king. Hence there are \(C(11, 2)\) ways.

  3. Case 2: the selection includes the ace but not the king. Again, there are \(C(11, 2)\) ways to choose the other two cards.

  4. Case 3: the selection includes both the ace and the king. The third card can be any of the remaining \(11\) hearts, giving \(C(11, 1)\) ways.

  5. Total number of ways is

    \[ C(11, 2) + C(11, 2) + C(11, 1) = 55 + 55 + 11 = 121. \]

Here is another way to divide the problem into two cases.

  1. The selection includes an ace. In this case the number of ways the remaining two cards can be selected is \(C(12, 2)\).

  2. The selection includes a king but not an ace. The number of ways the remaining two cards can be chosen is \(C(11, 2)\).

  3. Hence total number of cases is

    \[ C(12, 2) + C(11, 2) = 66 + 55 = 121. \]

Here is a third approach.

  1. Ace is included: \(C(12, 2)\) ways.

  2. King is included: \(C(12, 2)\) ways.

  3. Both of these cases include the case where both ace and king are included.

  4. The number of ways both ace and king can be selected is \(C(11, 1)\).

  5. Since this case is double counted, we need to subtract it once from the total.

  6. Thus, we get

    \[ C(12, 2) + C(12, 2) - C(11, 1) = 66 + 66 - 11 = 132 - 11 = 121. \]
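All three approaches give \(121\), which a brute-force enumeration over the 13 hearts confirms (the rank labels are only illustrative):

```python
from itertools import combinations
from math import comb

hearts = [str(v) for v in range(2, 11)] + ["J", "Q", "K", "A"]  # 13 hearts
count = sum(1 for hand in combinations(hearts, 3)
            if "A" in hand or "K" in hand)
print(count)  # 121

# A fourth way: all 3-card hands minus those containing neither ace nor king.
print(comb(13, 3) - comb(11, 3))  # 286 - 165 = 121
```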

1.11.4. Binomial Theorem#

Definition 1.122 (Binomial coefficient)

For any nonnegative integer \(n\) and any integer \(r\) with \(0 \leq r \leq n\), the binomial coefficient is defined as

\[ {n \choose r} = \frac{n!}{r! (n-r)!}. \]

This is another name for the number of \(r\)-combinations of \(n\) objects.

Theorem 1.43 (Binomial theorem)

For any \(a\) and \(b\) and any nonnegative integer \(n\), we have

\[ (a + b)^n = \sum_{r=0}^n \binom{n}{r} a^r b^{n - r}. \]

As a special case, we have

\[ (1 + x)^n = \sum_{r=0}^n \binom{n}{r} x^r. \]

Setting \(x=1\), we get

\[ 2^n = \sum_{r=0}^n \binom{n}{r}. \]

Proof. Consider the simpler cases first:

  1. For \(n=0\), both L.H.S. and R.H.S. reduce to \(1\).

  2. For \(n=1\), the L.H.S. is \(a + b\). The R.H.S. is

    \[ {1 \choose 0}b + {1 \choose 1}a = b + a = a + b. \]

We now consider the general case.

  1. We can write

    \[ (a + b)^n = (a + b) \dots (a + b). \]
  2. There are \(n\) factors of \((a + b)\) on the R.H.S..

  3. Consider the term \(a^r b^{n - r}\).

  4. This is attained by choosing \(a\) from \(r\) factors and \(b\) from remaining \(n-r\) factors.

  5. There are \(\binom{n}{r}\) ways to choose \(r\) out of \(n\) factors from which \(a\) can be picked.

  6. Thus, the coefficient of \(a^r b^{n - r}\) in the expansion of \((a + b)^n\) is \(n \choose r\).

  7. The smallest power of \(a\) in the expansion is \(0\) and the largest power is \(n\).

  8. Thus, we have

    \[ (a + b)^n = \sum_{r=0}^n \binom{n}{r} a^r b^{n - r}. \]
  9. The special case is obtained by setting \(a=x\) and \(b=1\).
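The theorem and its special cases are easy to spot-check numerically with `math.comb`; for instance:

```python
from math import comb

a, b, n = 3, 5, 7
lhs = (a + b) ** n
rhs = sum(comb(n, r) * a**r * b**(n - r) for r in range(n + 1))
assert lhs == rhs  # binomial theorem for a = 3, b = 5, n = 7

# Setting a = b = 1 recovers the sum of the binomial coefficients.
for n in range(10):
    assert sum(comb(n, r) for r in range(n + 1)) == 2 ** n
```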

Theorem 1.44

For every natural number \(n \geq 2\), we have

\[ \sum_{r=0}^n r \binom{n}{r}(-1)^{r -1} = 0. \]

Proof. From the binomial theorem we have:

\[ (1 + x)^n = \sum_{r=0}^n \binom{n}{r} x^r. \]

Differentiating on both sides, we get

\[ n(1 + x)^{n-1} = \sum_{r=0}^n r \binom{n}{r} x^{r-1}. \]

Putting \(x=-1\) and noting that \(n(1 + x)^{n-1} = 0\) at \(x = -1\) for \(n \geq 2\), we get

\[ 0 = \sum_{r=0}^n r \binom{n}{r} (-1)^{r-1}. \]
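A quick numerical check of the identity (note that it requires \(n \geq 2\)):

```python
from math import comb

for n in range(2, 12):
    total = sum(r * comb(n, r) * (-1) ** (r - 1) for r in range(n + 1))
    assert total == 0  # the alternating weighted sum vanishes
print("identity holds for n = 2, ..., 11")
```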

1.11.5. Bijections#

Recall from Definition 1.82 that two sets are equivalent if there exists a bijection (a bijective function) between them. In that case, the sets are said to have the same cardinality. In this section, we are dealing with finite sets only. Recall from Definition 1.84 that a set is called finite if there is a bijection between the set and a segment of natural numbers. Thus, if there exists a bijection between two finite sets, then they have the same number of elements. This gives us a powerful method of counting the number of elements in a finite set: if we can find a bijection between a given set and a set whose number of elements is known, then the given set has that same number of elements. This technique is known as the bijective technique.

We use this technique to provide a proof for the number of subsets of a finite set.

Theorem 1.45 (Number of subsets of a finite set)

Let \(A\) be a set of \(n\) elements. Then the number of subsets of \(A\) is \(2^n\).

Proof. We develop a bijection between subsets of \(A\) and binary strings of length \(n\).

  1. Let \(A = \{a_1, \dots, a_n \}\).

  2. Let each bit of a binary string \(b_1 b_2 \dots b_n\) correspond to an element of \(A\) (\(a_i \to b_i\)).

  3. For each subset \(X\) of \(A\), let the corresponding binary string be formed as follows:

    1. If \(a_i \in X\), then \(b_i = 1\).

    2. Otherwise, \(b_i = 0\).

  4. It is easy to see that this is a bijection between the set of subsets of \(A\) and the set of binary strings of length \(n\).

  5. The number of possible binary strings of length \(n\) is \(2^n\) since each bit can take exactly \(2\) values.

  6. Hence, due to the equivalence of the two sets, the number of subsets of \(A\) is \(2^n\).

Following is an example of mapping between subsets of the set \(\{ x, y, z \}\) and binary strings of length \(3\).

| subset | \(b_1\) | \(b_2\) | \(b_3\) |
| --- | --- | --- | --- |
| \(\EmptySet\) | 0 | 0 | 0 |
| \(\{ x \}\) | 1 | 0 | 0 |
| \(\{ y \}\) | 0 | 1 | 0 |
| \(\{ z \}\) | 0 | 0 | 1 |
| \(\{ x, y \}\) | 1 | 1 | 0 |
| \(\{ y, z \}\) | 0 | 1 | 1 |
| \(\{ x, z \}\) | 1 | 0 | 1 |
| \(\{ x, y, z \}\) | 1 | 1 | 1 |
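The bijection in the proof translates directly into code: the \(i\)-th bit of an integer counter plays the role of \(b_i\). A sketch:

```python
def subsets(elements):
    """All subsets of a list, via the bijection with binary strings."""
    n = len(elements)
    result = []
    for mask in range(2 ** n):  # each mask is a binary string of length n
        result.append({elements[i] for i in range(n) if mask >> i & 1})
    return result

subs = subsets(["x", "y", "z"])
print(len(subs))  # 2^3 = 8 subsets, as in the table above
```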

1.11.6. Combinatorial Proofs#

Theorem 1.46 (Combinatorial proofs)

If \(f(n)\) and \(g(n)\) are functions that count the number of solutions to the same problem involving \(n\) objects, then \(f(n) = g(n)\) for every \(n\).

Definition 1.123 (Combinatorial identity)

Let there be two functions \(f, g : \ZZ_+ \to \ZZ\). Suppose there exists a problem about \(n\) objects such that counting its solutions in one way gives \(f(n)\) and counting the solutions of the same problem in another way gives \(g(n)\) for every \(n\). Then this constitutes a combinatorial proof of the identity

\[ f(n) = g(n). \]

The equation \(f(n) = g(n)\) is known as a combinatorial identity.

Note

The concept of combinatorial proofs and identities is also applicable to problems with multiple variables.

Theorem 1.47

\[ {n \choose r} = {n \choose n - r}. \]

Proof. Consider the problem of choosing \(r\) objects from a set of \(n\) objects.

  1. By definition, the number of ways \(r\) objects can be chosen from a set of \(n\) objects is given by \(n \choose r\).

  2. Choosing \(r\) objects from a set of \(n\) objects is equivalent to leaving out the remaining \(n-r\) objects.

  3. The number of ways to leave \(n-r\) objects out of \(n\) objects is \(n \choose n - r\).

  4. Hence we have the combinatorial identity

    \[ {n \choose r} = {n \choose n - r}. \]

1.11.6.1. Sum of Binomial Coefficients#

Example 1.29

For every natural number \(n\), we have

\[ \sum_{r=0}^n {n \choose r} = 2^n. \]

We already showed this result in Theorem 1.43. Let us look at a combinatorial proof.

  1. By Theorem 1.45, the number of subsets of a set of \(n\) elements is \(2^n\).

  2. Another way of counting the number of subsets of \(n\) elements is as follows.

    1. Each subset has \(r\) elements where \(0 \leq r \leq n\).

    2. The number of subsets of \(r\) elements is \(n \choose r\).

    3. Hence total number of subsets is \(\sum_{r=0}^n {n \choose r}\).

  3. By Theorem 1.46, the two must be equal.

1.11.6.2. Pascal’s Identity#

Theorem 1.48 (Pascal’s identity)

For integers \(n\) and \(k\) with \(1 \leq k \leq n - 1\), we have

\[ {n \choose k} = {n -1 \choose k - 1} + {n - 1 \choose k}. \]

Proof. Let there be \(n\) objects.

  1. The L.H.S. is the number of ways of choosing \(k\) objects from \(n\) objects.

  2. Fix some object \(x\).

  3. In a selection of \(k\) objects, either \(x\) is selected or it is not selected.

  4. If \(x\) is selected, then there are \(n -1 \choose k - 1\) ways of selecting the remaining \(k-1\) objects.

  5. If \(x\) is not selected, then all the \(k\) objects must be selected from the remaining \(n-1\) objects.

  6. The two cases are mutually exclusive and add up to

    \[ {n -1 \choose k - 1} + {n - 1 \choose k} \]

    on the R.H.S..
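Pascal's identity, together with the boundary values \(C(n,0) = C(n,n) = 1\), gives a complete recursive procedure for computing binomial coefficients. A memoized sketch, checked against `math.comb`:

```python
from functools import lru_cache
from math import comb

@lru_cache(maxsize=None)
def C(n, k):
    """Binomial coefficient computed via Pascal's identity."""
    if k == 0 or k == n:
        return 1
    return C(n - 1, k - 1) + C(n - 1, k)

assert all(C(n, k) == comb(n, k) for n in range(12) for k in range(n + 1))
print(C(10, 4))  # 210
```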

1.11.6.3. Hockey Stick Identity#

Example 1.30 (Recursive rule for computing \(r\)-combinations)

Consider the problem of choosing \(r\) objects from a set of \(n\) objects. By definition, the number of ways we can choose \(r\) objects is \(n \choose r\). Following is another way to count the same.

  1. Let us label the objects as \(1,2, \dots, n\).

  2. We consider \(n\) different cases as follows.

  3. Each subset can be identified by the largest label inside it.

  4. In the \(k\)-th case, we consider all the subsets where the largest label is \(k\).

  5. Since this object is decided, the number of ways of choosing the remaining \(r-1\) objects from the \(k-1\) objects whose labels are smaller than \(k\) is given by \(k - 1 \choose r - 1\).

  6. Since any such subset contains \(r\) distinct labels, its largest label \(k\) must be at least \(r\).

  7. The maximum possible value of \(k\) is \(n\).

  8. Hence, the total number of elements is given by

    \[ \sum_{k=r}^n {k - 1 \choose r - 1}. \]
  9. We get the identity

    \[ {n \choose r} = \sum_{k=r}^n {k - 1 \choose r - 1}. \]
  10. By replacing \(k-1\) with \(i\), we get another version

    \[ {n \choose r} = \sum_{i=r-1}^{n-1} {i \choose r - 1}. \]

This identity enables us to compute \(n \choose r\) if \(i \choose r - 1\) is known for all values of \(r-1 \leq i \leq n-1\).

Theorem 1.49 (Hockey stick identity)

\[ \sum_{t = 0}^n {t \choose k} = {n + 1 \choose k + 1}. \]

Note that \({t \choose k} = 0\) for \(0 \leq t < k\).

Proof. Consider the set of \(n+1\) numbers \(S = \{ 0, 1, \dots, n\}\).

  1. The R.H.S. is the number of ways of choosing \(k+1\) numbers from \(S\).

  2. For any \(t \in S\), consider the ways of choosing \(k+1\) numbers from \(S\) such that \(t\) is the highest number in the subset.

  3. Since \(t\) is already chosen and all other numbers must be less than \(t\), hence there are \(t \choose k\) ways of choosing remaining \(k\) numbers.

  4. \(t\) can vary from \(0\) to \(n\).

  5. Each term on the L.H.S. is the number of ways of choosing \(k+1\) numbers so that \(t\) is the highest number.

  6. These are mutually exclusive cases.

  7. Hence, L.H.S. is also the number of ways \(k+1\) numbers can be chosen from \(S\).
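A numeric spot-check of the hockey stick identity (Python's `math.comb` already returns \(0\) when \(t < k\)):

```python
from math import comb

n, k = 10, 3
lhs = sum(comb(t, k) for t in range(n + 1))  # comb(t, k) = 0 for t < k
rhs = comb(n + 1, k + 1)
assert lhs == rhs
print(lhs)  # 330
```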

1.11.6.4. Vandermonde’s Identity#

Theorem 1.50 (Vandermonde’s identity)

For any nonnegative integers \(r, m, n\), we have

\[ {m + n \choose r } = \sum_{k = 0}^r {m \choose k}{n \choose r - k}. \]

Proof. We shall use a combinatorial double counting proof.

  1. Let \(S\) be a set of \(m + n\) objects which can be split into two disjoint subsets \(A\) of \(m\) objects and \(B\) of \(n\) objects.

  2. For example, a committee of \(m + n\) people can be split into subgroups of \(m\) men and \(n\) women.

  3. The number of ways to select \(r\) objects from \(S\) is given by \(m + n \choose r\).

  4. Each selection of \(r\) objects consists of \(k\) objects from the subset \(A\) and the remaining \(r - k\) objects from the subset \(B\) where \(0 \leq k \leq r\).

  5. There are \(m \choose k\) ways of selecting \(k\) objects from \(A\).

  6. There are \(n \choose r - k\) ways of selecting \(r - k\) objects from \(B\).

  7. Hence, by the product rule, there are

    \[ {m \choose k}{n \choose r - k} \]

    ways of selecting \(k\) objects from \(A\) and \(r - k\) objects from \(B\).

  8. By the sum rule over possible values of \(k\), the number of ways of choosing \(r\) objects from \(S\) is given by

    \[ \sum_{k = 0}^r {m \choose k}{n \choose r - k}. \]
  9. Hence, the L.H.S. and R.H.S. must be identical.

Corollary 1.3

For any nonnegative integer \(n\), we have

\[ {2 n \choose n } = \sum_{k = 0}^n {n \choose k}^2. \]

Proof. In Theorem 1.50, Let \(m=n\) and \(r=n\). Then we have

\[ {2 n \choose n } = \sum_{k = 0}^n {n \choose k}{n \choose n - k}. \]

However

\[ {n \choose k} = {n \choose n - k} \]

for every \(k\). Hence

\[ {2 n \choose n } = \sum_{k = 0}^n {n \choose k}^2. \]
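Both Vandermonde's identity and the corollary are straightforward to verify numerically:

```python
from math import comb

# Vandermonde's identity for one choice of m, n, r.
m, n, r = 5, 7, 6
assert comb(m + n, r) == sum(comb(m, k) * comb(n, r - k) for k in range(r + 1))

# Corollary: the central binomial coefficient as a sum of squares.
for n in range(10):
    assert comb(2 * n, n) == sum(comb(n, k) ** 2 for k in range(n + 1))
print("both identities verified")
```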

1.11.6.5. More Identities#

Example 1.31

Consider two sets \(A\) and \(B\) each of which consist of \(n\) objects. Assume that the sets \(A\) and \(B\) are disjoint. We wish to select \(r\) objects from \(A\) and \(B\) together with the condition that at least \(1\) object must be chosen from the set \(B\). Following are different ways of counting.

Method 1

  1. There are \(2 n \choose r\) ways of choosing \(r\) objects out of the \(2 n\) objects we have.

  2. There are \(n \choose r\) ways in which no object from the set \(B\) has been selected.

  3. Hence, the number of ways to select at least one object from \(B\) is given by

    \[ {2 n \choose r} - {n \choose r}. \]

Method 2

  1. We can consider the following \(r\) different cases: in the \(i\)-th case, \(i\) objects from the set \(B\) have been chosen, where \(1 \leq i \leq r\).

  2. For the \(i\)-th case, the total number of ways to choose \(i\) objects from the set \(B\) and the remaining \(r-i\) objects from the set \(A\) is given by

    \[ {n \choose i}{n \choose r -i}. \]
  3. Hence, the total number of ways of choosing \(r\) objects from \(A\) and \(B\) with at least one object from \(B\) is

    \[ \sum_{i=1}^r {n \choose i}{n \choose r -i}. \]
  4. By Theorem 1.46, we get the combinatorial identity

    \[ \sum_{i=1}^r {n \choose i}{n \choose r -i} = {2 n \choose r} - {n \choose r}. \]
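The identity obtained from the two methods can be checked directly:

```python
from math import comb

n, r = 6, 4
lhs = sum(comb(n, i) * comb(n, r - i) for i in range(1, r + 1))
rhs = comb(2 * n, r) - comb(n, r)
assert lhs == rhs
print(lhs)  # comb(12, 4) - comb(6, 4) = 495 - 15 = 480
```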

1.11.7. Counting with Repetitions#

We now consider a different type of problem. Rather than selecting objects from a given set of \(n\) distinct objects, we assume that there are \(n\) types of objects available and we are free to choose as many objects of a particular type as we want. Thus, we allow for the repetition of objects of the same type.

Example 1.32 (Two digit numbers)

How many 2 digit numbers are possible?

  1. On the first digit, we can have one of the nine nonzero digits.

  2. On the second digit, we can have any of the ten digits.

  3. Hence, total number of two digit numbers is:

    \[ 9 \times 10 = 90. \]
  4. Readers can verify that this includes all the numbers from \(10\) to \(99\).

  5. We allow for repetition (e.g. \(11, 22, 33\)).

Example 1.33 (Four digit pins)

We often use \(4\) digit pins for different applications. In this case, \(0\) may be allowed as the first digit. Then the number of possible \(4\) digit pins is \(10^4 = 10,000\).

1.11.7.1. Combinations with Repetitions#

Theorem 1.51 (\(r\)-combinations with repetitions)

The number of ways of choosing \(r\) objects from \(n\) types of objects (with replacement or repetition) is

\[ {n + r - 1 \choose r}. \]

Proof. The \(n\) categories of objects can be separated by \(n-1\) bars like below

\[ a_1 | a_2 | \dots | a_{n-1} | a_{n}. \]
  1. We have \(n\) slots (or urns) for the \(n\) categories.

  2. There are \(n-1\) bars between them.

  3. Suppose that the \(r\) chosen objects split as \(r_1\) objects of category \(a_1\), \(r_2\) objects of category \(a_2\), and so on.

  4. Then we have

    \[ r = r_1 + r_2 + \dots + r_n. \]
  5. Note that some of the \(r_i\) may be zero also.

  6. Let each object be represented by the symbol \(-\) (dash).

  7. Then the arrangement may look something like

    \[ -- | - | | --- | - | \dots | | | | ---- | -- \]
  8. \(r_1\) dashes followed by a bar, then \(r_2\) dashes followed by a bar and so on.

  9. Different arrangements of \(r\) dashes and \(n-1\) bars lead to different ways of selecting \(r\) objects from \(n\) categories of objects.

  10. We can think of the arrangement as a string of dashes and bars.

  11. Then, the problem is equivalent to counting the number of strings consisting of \(r\) dashes and \(n-1\) bars.

  12. The total length of such strings is \(n + r - 1\).

  13. The number of strings with \(r\) dashes is \(n + r -1 \choose r\).

  14. This is also the same as the number of strings with \(n-1\) bars, which is \(n + r - 1 \choose n - 1\).
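The stars-and-bars count matches a direct enumeration with `itertools.combinations_with_replacement`:

```python
from itertools import combinations_with_replacement
from math import comb

n, r = 4, 3  # choose r objects from n types, with repetition allowed
selections = list(combinations_with_replacement(range(n), r))
print(len(selections))  # comb(4 + 3 - 1, 3) = comb(6, 3) = 20
assert len(selections) == comb(n + r - 1, r)
```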

1.11.7.2. Multiset Permutations#

Theorem 1.52 (Multiset permutations)

Let \(A\) be a multiset of \(n\) objects of \(m\) types. Assume that \(A\) contains \(r_1\) objects of type \(1\), \(r_2\) objects of type \(2\) and so on. Accordingly

\[ n = r_1 + r_2 + \dots + r_m. \]

Then, the number of permutations of these \(n\) objects is

\[ \frac{n!}{r_1! r_2 ! \dots r_m!}. \]

Proof. Approach 1

  1. If all \(n\) objects were distinct, then the number of permutations would have been \(n!\).

  2. Now, when \(r_1\) of these objects are indistinguishable from each other, then all \(r_1!\) permutations of these \(r_1\) objects are identical.

  3. Hence, making the \(r_1\) objects indistinguishable reduces the number of arrangements to \(\frac{n!}{r_1!}\).

  4. Similarly, making the next \(r_2\) objects indistinguishable from each other reduces the number of arrangements to \(\frac{n!}{r_1! r_2!}\).

  5. Continuing in this fashion, making the last \(r_m\) objects indistinguishable from each other reduces the number of arrangements to \(\frac{n!}{r_1! r_2 ! \dots r_m!}\).

Approach 2

  1. In any arrangement of \(n\) objects, there are \(n\) slots.

  2. The number of ways we can choose \(r_1\) slots to place first \(r_1\) objects of type \(1\) (which are indistinguishable) is given by \(n \choose r_1\).

  3. The number of ways we can choose the next \(r_2\) slots to place next \(r_2\) objects of type \(2\) (which are indistinguishable) is given by \(n - r_1 \choose r_2\).

  4. Continuing in the same manner, the number of ways to choose the last \(r_m\) slots to place the last \(r_m\) objects of type \(m\) (which are indistinguishable) is given by \(n - r_1 - \dots - r_{m - 1} \choose r_m\).

  5. By the product rule, the total number of ways is

    \[ {n \choose r_1}{n - r_1 \choose r_2}\dots {n - r_1 - \dots - r_{m - 1} \choose r_m}. \]
  6. This expands to

    \[\begin{split} & \frac{n!}{r_1! (n-r_1)!} \frac{(n - r_1)!}{r_2! (n-r_1 - r_2)!} \dots \frac{(n - r_1 - \dots - r_{m-1})!}{r_m! 0!}\\ & = \frac{n!}{r_1! r_2 ! \dots r_m!}. \end{split}\]
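For a small multiset the formula can be verified by enumerating distinct arrangements directly:

```python
from itertools import permutations
from math import factorial

word = "aabbb"  # n = 5 objects: r_1 = 2 of type 'a', r_2 = 3 of type 'b'
distinct = set(permutations(word))  # identical objects collapse duplicates
print(len(distinct))  # 5! / (2! 3!) = 10
assert len(distinct) == factorial(5) // (factorial(2) * factorial(3))
```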

Example 1.34 (Volunteer selection)

You have 3 projects A, B and C. You have a team of 20 volunteers available.

  • Project A needs 3 people.

  • Project B needs 5 people.

  • Project C needs 9 people.

In how many ways can you assign volunteers to projects?

  1. We can introduce a fourth category D of volunteers who are not assigned to any project.

  2. The D category has \(20 - 3 - 5 - 9 = 3\) volunteers.

  3. Thus, we need to split 20 volunteers into 4 groups of sizes 3,5,9, and 3 respectively.

  4. A particular assignment might look like

    \[ AAABBBBBCCCCCCCCCDDD. \]
  5. Another assignment might look like

    \[ CDCABCABCCBBCCCDADCB. \]
  6. We can see that the assignment consists of permutations of 20 labels which are of 4 types (A, B, C, D).

  7. Hence the number of permutations is given by

    \[ \frac{20!}{3! 5! 9! 3!}. \]

1.11.7.3. Multinomial Theorem#

Definition 1.124 (Multinomial coefficient)

For nonnegative integers \(r_1, r_2, \dots, r_m\) such that \(\sum_{i=1}^m r_i = n\), the multinomial coefficient is defined as

\[ {n \choose r_1, r_2, \dots, r_m} = \frac{n!}{r_1! r_2 ! \dots r_m!}. \]

It is clear that the multinomial coefficient is equal to the number of permutations of \(n\) objects of \(m\) types where \(r_1\) objects are of type \(1\), \(r_2\) objects are of type \(2\) and so on.

Also note that for \(n=0\), we have \(r_i=0\) and the multinomial coefficient becomes

\[ \frac{0!}{0! 0! \dots 0!} = 1. \]

Theorem 1.53 (Multinomial theorem)

For arbitrary \(x_1, x_2, \dots, x_m\) and any nonnegative integer \(n\), we have

(1.4)#\[(x_1 + x_2 + \dots + x_m)^n = \sum_{k_1 + k_2 + \dots + k_m = n} {n \choose k_1, k_2, \dots, k_m} \prod_{1 \leq r \leq m}x_r^{k_r}.\]

The number of terms of the sum on the R.H.S. is given by

\[ {n + m - 1 \choose n}. \]

Proof. Consider the basic cases first.

  1. For \(m=1\), the result is trivial as there is exactly one term on the R.H.S. given by \(x_1^n\).

  2. For \(m=2\), (1.4) reduces to

    \[\begin{split} (x_1 + x_2)^n &= \sum_{k_1 + k_2 = n}{n \choose k_1, k_2}x_1^{k_1} x_2^{k_2}\\ &= \sum_{k=0}^n {n \choose k}x_1^k x_2^{n - k}. \end{split}\]

    We replaced \(k_1\) by \(k\) and \(k_2\) by \(n-k\).

  3. This is the binomial theorem already proven in Theorem 1.43.

  4. For \(n=0\), both L.H.S. and R.H.S. reduce to \(1\).

We now consider the general case.

  1. Each term on the R.H.S. is of the form

    \[ C \prod_{1 \leq r \leq m}x_r^{k_r} \]

    such that \(k_1 + \dots + k_m = n\) and \(C\) is some coefficient.

  2. There are \(m\) types of variables \(x_1, \dots, x_m\).

  3. In each such product, there are \(n\) variables in total.

  4. The number of ways \(n\) objects of \(m\) types can be selected is given by

    \[ n + m - 1 \choose n \]

    due to Theorem 1.51.

  5. This establishes the number of terms on the R.H.S..

  6. It remains to calculate \(C\).

  7. We can see that \(\prod_{1 \leq r \leq m}x_r^{k_r}\) arises on the R.H.S. from the different permutations of \(k_1\) variables of type \(x_1\), \(k_2\) variables of type \(x_2\), and so on up to \(k_m\) variables of type \(x_m\).

  8. The number of permutations of \(n\) variables of \(m\) types \(x_1, \dots, x_m\) with counts \(k_1, \dots, k_m\) is given by

    \[ \frac{n!}{k_1! k_2 ! \dots k_m!} \]

    due to Theorem 1.52.

  9. Hence

    \[ C = {n \choose k_1, k_2, \dots, k_m} \]

    as per the definition of the multinomial coefficient.
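A numerical check of the multinomial theorem for three variables, including the count of terms on the R.H.S.:

```python
from itertools import product
from math import comb, factorial

def multinomial(ks):
    """Multinomial coefficient n! / (k_1! ... k_m!) where n = sum(ks)."""
    out = factorial(sum(ks))
    for k in ks:
        out //= factorial(k)
    return out

xs, n = (2, 3, 5), 4
terms = [ks for ks in product(range(n + 1), repeat=3) if sum(ks) == n]
assert len(terms) == comb(n + 3 - 1, n)  # comb(6, 4) = 15 terms
rhs = sum(multinomial(ks) * xs[0]**ks[0] * xs[1]**ks[1] * xs[2]**ks[2]
          for ks in terms)
assert rhs == sum(xs) ** n  # 10^4 on both sides
```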

1.11.8. Recursion#

Definition 1.125 (Recursively defined sequence)

We say that a sequence \(\{t_n \} = t_1, t_2, \dots, t_n, \dots\) is recursively defined if, for every integer \(n\) greater than or equal to some bound \(b \geq 1\), the value of \(t_n\) depends on at least some of the values \(t_1, \dots, t_{n -1}\). The values \(t_1, \dots, t_{b-1}\) are given explicitly; these are referred to as the initial conditions for the recursively defined sequence. The equation that defines \(t_n\) in terms of \(t_1, \dots, t_{n-1}\) is called a recursive relation.

Definition 1.126 (Fibonacci sequence)

The famous Fibonacci sequence, denoted by \(f_0, f_1, f_2, \dots\), is defined as follows:

  1. \(f_0 = 1\).

  2. \(f_1 = 1\).

  3. for every \(n \geq 2\), we have

    \[ f_n = f_{n-1} + f_{n - 2}. \]

The sequence is attributed to the Italian mathematician Leonardo Pisano (“Fibonacci”) of the 13th century. However, it had been known to Indian mathematicians as early as the 6th century.

For recursive sequences, we aim to solve the following types of problems:

  1. Devise a formula for \(t_n\) such that one doesn’t have to compute every term in the recursion to compute \(t_n\).

  2. If such a closed form solution is not possible, then develop some upper or lower bounds on the value of \(t_n\).
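Computing a recursively defined sequence iteratively sidesteps the repeated work of a naive recursive evaluation; a sketch for the Fibonacci sequence as defined above (with \(f_0 = f_1 = 1\)):

```python
def fibonacci(n):
    """Return the terms f_0, ..., f_n with f_0 = f_1 = 1."""
    fs = [1, 1]
    for i in range(2, n + 1):
        fs.append(fs[i - 1] + fs[i - 2])  # the recursive relation
    return fs[: n + 1]

print(fibonacci(7))  # [1, 1, 2, 3, 5, 8, 13, 21]
```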

1.11.9. Induction#

The principle of mathematical induction plays a major role in many important results in enumerative combinatorics. This is especially relevant for recursively defined sequences. We suggest that readers review the definitions and results in Principle of Mathematical Induction.

Example 1.35

We shall show that

\[ n! \geq 2^n - 2 \]

for every integer \(n \geq 2\).

  1. Base case: for \(n=2\) we have \(2! = 2 = 2^2 - 2\).

  2. Inductive hypothesis: assume that the statement is true for some \(k \geq 2\); i.e.,

    \[ k! \geq 2^k - 2. \]
  3. Inductive step:

    1. For \(k+1\), we have

      \[ (k+1)! = (k+1) k! \geq (k + 1) (2^k - 2). \]
    2. Since \(k \geq 2\), hence \(k+1 \geq 3\).

    3. Hence

      \[ (k+1)! \geq 3 (2^k - 2) = 2 \cdot 2^k + 2^k - 6 = 2^{k+1} + 2^k - 6. \]
    4. Since \(k \geq 2\), hence \(2^k \geq 4\).

    5. Hence \(2^k -6 \geq 4 - 6 = -2\).

    6. Hence

      \[ (k+1)! \geq 2^{k+1} + 2^k - 6 \geq 2^{k+1} - 2. \]
    7. Thus, the inequality is true for \(k+1\) also.

  4. Hence, by mathematical induction, the inequality holds for every \(n \geq 2\).

Example 1.36 (Sum of first \(n\) natural numbers)

The sum for first \(n\) natural numbers is given by

\[ 1 + 2 + \dots + n = \frac{n (n + 1)}{2}. \]

We show this equality using a recursively defined sequence.

  1. Let \(s_1 = 1\).

  2. Let \(s_n = s_{n - 1} + n\) for every \(n \geq 2\).

  3. We can see that \(\{ s_n \}\) is a recursively defined sequence where the \(n\)-th term represents the sum of first \(n\) natural numbers.

We now prove that \(s_n = \frac{n (n + 1)}{2}\) using mathematical induction.

  1. For \(n=1\), \(s_1 = 1 = \frac{1 \cdot 2}{2}\).

  2. Assume that the equality is true for some \(n\).

  3. Then for \(n+1\), we have

    \[\begin{split} s_{n + 1} = s_n + (n+1) &= \frac{n (n + 1)}{2} + n + 1 \\ &= \frac{n (n + 1) + 2 (n + 1)}{2} \\ &= \frac{(n + 1)(n + 2)}{2}. \end{split}\]
  4. We can see that the equality holds for \(n+1\) also.

  5. Hence, by mathematical induction, it holds for all \(n \geq 1\).
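The recursion and the closed form can be checked side by side; a minimal Python sketch:

```python
# Check s_n = n (n + 1) / 2 against the recursion s_1 = 1, s_n = s_{n-1} + n.
s = 0
for n in range(1, 101):
    s += n                       # the recursive step s_n = s_{n-1} + n
    assert s == n * (n + 1) // 2
```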

1.11.9.1. Strong Induction#

The following examples require the principle of strong induction (Theorem 1.22).

Example 1.37

Consider the sequence \(\{ t_n \}\) defined as follows:

  1. \(t_1 = 2\).

  2. For every \(n \geq 2\),

    \[ t_n = \sum_{i=1}^{n-1} t_i. \]

We can see that

  1. \(t_2 = t_1 = 2\).

  2. \(t_3 = t_1 + t_2 = 2 + 2 = 4\).

  3. \(t_4 = t_1 + t_2 + t_3 = 2 + 2 + 4 = 8\).

We now claim that \(t_n = 2^{n-1}\) for every \(n \geq 2\). We shall use the strong induction principle (Theorem 1.22).

  1. We are given that \(t_1 = 2\).

  2. By strong induction hypothesis, let us assume that \(t_i = 2^{i - 1}\) for every \(i\) with \(2 \leq i \leq k\).

  3. By the recursive relation, we have

    \[\begin{split} t_{k + 1} &= \sum_{i=1}^k t_i \\ &= t_1 + \sum_{i=2}^k t_i \\ &= 2 + \sum_{i=2}^k 2^{i-1} \\ &= 1 + 2^0 + \sum_{i=1}^{k-1} 2^i\\ &= 1 + \sum_{i=0}^{k-1} 2^i\\ &= 1 + 2^k - 1\\ &= 2^k. \end{split}\]
  4. Thus, the equality is true for \(k+1\) also.

  5. Hence, by principle of strong induction, it is true for every \(n \geq 2\).
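The claim \(t_n = 2^{n-1}\) can be verified directly from the recursion; a minimal Python sketch:

```python
# Check t_n = 2**(n-1) for n >= 2, where t_1 = 2 and t_n = t_1 + ... + t_{n-1}.
terms = [2]                      # t_1
for n in range(2, 21):
    terms.append(sum(terms))     # t_n is the sum of all previous terms
    assert terms[-1] == 2 ** (n - 1)
```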

Example 1.38 (Stacking \(n\) disks)

You are given \(n\) disks. You need to arrange them into a single tower of disks. You can make small individual towers of disks and then stack one tower on top of another. You are free to make as many towers of disks as you need. Each move consists of stacking one tower on top of another. Once a tower has been formed, you are not allowed to break it. How many moves will it take to form a single tower?

We claim that it will take \(n-1\) moves.

  1. Base case: for \(n=1\), there is a single disk and it is the tower in itself. Hence nothing more is to be done. Number of moves required is \(0\).

  2. Strong induction hypothesis: Assume that for some \(n=k\), you can make the tower of \(i\) disks in \(i-1\) moves for every \(i \in 1,\dots,k\).

  3. Induction step: Consider the problem of stacking \(k+1\) disks.

    1. We can see that the last step consists of stacking the last remaining two towers into a single tower.

    2. Assume that the two remaining towers \(A\) and \(B\) have \(j\) and \(k+1 - j\) disks respectively.

    3. Both towers consist of at least \(1\) disk.

    4. Hence neither tower can have \(k+1\) disks.

    5. Clearly, \(j \leq k\) and \(k + 1 - j \leq k\).

    6. It will take \(j-1\) moves to form the tower \(A\) by induction hypothesis.

    7. It will take \(k - j\) moves to form the tower \(B\) by induction hypothesis.

    8. It will take \(1\) more move to stack \(A\) and \(B\) together.

    9. Hence total number of moves required is

      \[ j - 1 + k - j + 1 = k. \]
    10. Hence to stack a tower of \(k+1\) disks, we need \(k\) moves.

  4. We can see that by strong induction, for every \(n\), it takes \(n-1\) moves to stack the \(n\) disks into a single tower.
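The result says the merge order does not matter. A minimal Python sketch (the simulation and its function name are illustrative) merges randomly chosen towers and confirms the move count is always \(n - 1\):

```python
import random

def moves_to_single_tower(n: int, rng: random.Random) -> int:
    # Start with n single-disk towers and repeatedly stack one randomly
    # chosen tower onto another; count the moves needed.
    towers = [1] * n             # sizes of the current towers
    moves = 0
    while len(towers) > 1:
        i, j = rng.sample(range(len(towers)), 2)
        towers[i] += towers[j]   # stack tower j on top of tower i
        towers.pop(j)            # tower j no longer exists separately
        moves += 1
    return moves

rng = random.Random(0)
# Regardless of the random merge order, n disks always take n - 1 moves.
assert all(moves_to_single_tower(n, rng) == n - 1 for n in range(1, 30))
```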

1.11.10. Pigeonhole Principle#

The pigeonhole principle is one of the most powerful counting arguments.

Theorem 1.54 (Pigeonhole principle)

If there are \(n\) objects that fall into \(m\) different categories and \(n > m\), then at least two of the objects fall into the same category.

If there are \(m\) pigeonholes and \(n\) pigeons with \(n > m\), then at least one pigeonhole must hold more than one pigeon. The name of this principle comes from the example of pigeons as objects and pigeonholes as categories.

Proof. We argue as follows.

  1. Consider the first \(m\) objects.

  2. If any two of these objects belong to the same category, then we are done.

  3. Otherwise, there is exactly one object for each of the \(m\) categories from this set of first \(m\) objects.

  4. There are \(n - m > 0\) objects still left. Hence there is at least one more object.

  5. This object must fall into one of the \(m\) categories.

  6. This category will then have at least \(2\) items.

Example 1.39 (Sock-picking)

Assume that a drawer contains socks of two colors (blue and black). The socks can be worn on either foot. We are pulling the socks from the drawer without looking at them. What is the minimum number of socks we need to pull so that we are guaranteed to have a pair of same color?

  1. Socks are the objects.

  2. Each color can be considered as a category.

  3. There are thus \(m=2\) categories.

  4. We need more than one sock in at least one category.

  5. If \(n > m = 2\), then we have at least one color for which we have two or more socks.

  6. Minimum value of \(n\) satisfying \(n > m\) is \(3\).

  7. Thus, we need to draw at least \(3\) socks to guarantee that we have one pair of same color.

Example 1.40 (Hand-shaking)

Let there be \(n\) people in a room with \(n > 1\). They are shaking hands with each other. Then there is always a pair of people who shake hands with the same number of people.

  1. A person can shake hands with a minimum of \(0\) and a maximum of \(n-1\) people.

  2. We define the categories as the number of people with whom one has shaken hands.

  3. Thus, the categories are labeled \(0,1,2,\dots,n-1\).

  4. We can see that there are \(n\) categories and \(n\) people.

  5. However, the category \(0\) and category \(n-1\) cannot both have a person simultaneously.

    1. Suppose there is someone who has shaken hands with no one.

    2. Then there cannot be anyone who has shaken hands with everyone.

    3. Hence if category \(0\) is nonempty, then category \(n-1\) must be empty.

    4. Now suppose there is someone who has shaken hands with everyone.

    5. Then there cannot be anyone who has shaken hands with no one.

    6. Hence if category \(n-1\) is nonempty, then category \(0\) must be empty.

  6. Thus, category \(0\) or category \(n-1\) (or both) must be empty.

  7. Hence the \(n\) people must be fitted into at most \(n-1\) categories.

  8. By pigeonhole principle, at least one of these categories must have more than one person.
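The argument above can be tested empirically on random handshake patterns; a minimal Python sketch:

```python
import itertools
import random

# Generate random handshake patterns among n people and confirm that some
# two people always end up with the same handshake count.
rng = random.Random(1)
for trial in range(200):
    n = rng.randint(2, 8)
    # Each possible pair shakes hands independently with probability 1/2.
    pairs = [p for p in itertools.combinations(range(n), 2)
             if rng.random() < 0.5]
    degree = [0] * n
    for a, b in pairs:
        degree[a] += 1
        degree[b] += 1
    # Pigeonhole: the n counts fit into at most n - 1 usable categories,
    # so some count must repeat.
    assert len(set(degree)) < n
```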

Example 1.41 (Hair counting)

At any given time there live at least two people in California with the same number of hairs on their heads.

  1. The current population of California is around \(40\) million.

  2. Typical human head has an average of about \(150,000\) hairs.

  3. An upper bound on the number of hairs on human head is \(1\) million.

  4. The number of people (\(40\) million objects) far exceeds the number of possible hair counts (at most \(1\) million categories). Hence by the pigeonhole principle, there must be two people in California with the same number of hairs on their heads.

Example 1.42 (Subset sum)

Consider the set \(S = \{1,2,3,4,5,6,7,8,9 \}\). We claim that every \(6\) element subset of \(S\) must contain at least two numbers whose sum is \(10\).

  1. Let us first identify which pairs of numbers sum to \(10\).

  2. We can see that \(1+9=2+8=3+7=4+6=10\). There are \(4\) such pairs.

  3. We now split the \(9\) numbers to \(5\) categories as follows:

    1. \(1,9 \to A\)

    2. \(2,8 \to B\)

    3. \(3,7 \to C\)

    4. \(4,6 \to D\)

    5. \(5 \to E\).

  4. Consider any subset of \(S\) of \(6\) elements.

  5. Let its members be \(\{ n_1, n_2, n_3, n_4, n_5, n_6 \}\).

  6. Put these numbers into the \(5\) categories above, based on the rule that each number goes into the category assigned to it.

  7. There are \(5\) categories and \(6\) numbers.

  8. By pigeonhole principle, at least one of the categories should have more than one number.

  9. Category \(E\) cannot have two numbers. It can only have \(5\).

  10. Hence, at least one of \(A, B, C, D\) has two numbers.

  11. But each of these categories consists of two numbers whose sum is \(10\).

  12. Hence every \(6\) element subset of \(S\) must contain at least two numbers whose sum is \(10\).
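The claim is small enough to verify exhaustively; a minimal Python sketch:

```python
import itertools

S = {1, 2, 3, 4, 5, 6, 7, 8, 9}
# Every 6-element subset of S must contain two numbers summing to 10.
for subset in itertools.combinations(S, 6):
    assert any(a + b == 10 for a, b in itertools.combinations(subset, 2))
```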

Example 1.43 (Same color rectangle)

Let there be \(n\) given colors. Assume that each point of the plane has been assigned one of these \(n\) colors. Then there exists a rectangle whose four corners have the same color.

To prove this, we need to discretize the problem first so that we can apply the pigeonhole principle. We can do this by creating a grid of points.

  1. Consider an \(r \times c\) grid of points.

  2. If there are enough rows, then we can guarantee that some of the points in each column have the same color.

  3. Similarly, if there are enough columns, we can guarantee that some of the points in each row have the same color.

  4. If two columns have a pair of points of the same color at the same two rows, we have found a rectangle with same-color corners.

  5. How many rows and columns do we need?

  6. We consider a grid of points with \(n+1\) rows and \(n {n + 1 \choose 2} + 1\) columns.

  7. Let \(p = {n + 1 \choose 2}\). Hence, we have \(n p + 1\) columns.

  8. Let us first examine the points in each column.

  9. There are \(n+1\) points in each column.

  10. There are only \(n\) colors which can be assigned to a point.

  11. Hence, in each column, there exists a pair of points which has the same color.

  12. Now examine the pair of points in each column.

  13. Since there are \(n+1\) points, hence there are \(p = {n + 1 \choose 2}\) pairs of points in each column.

  14. There are \(p\) pairs of points and \(n\) colors.

  15. Hence there are \(n p\) ways of choosing a unique pair of points and a unique color.

  16. Thus, at most \(n p\) columns can all differ from one another in the locations or the color of their same-color pair.

  17. Hence if we have \(n p + 1\) columns, then there exist at least two columns such that the same color occupies the same two locations in both the columns.

  18. These four points form a rectangle whose points have the same color.

1.11.10.1. Generalized Pigeonhole Principle#

Theorem 1.55 (Generalized pigeonhole principle)

If there are \(n\) objects that fall into \(m\) different categories and \(n > k m\), for some positive integer \(k\), then at least \(k+1\) of the objects fall into the same category.

Proof. We argue as follows.

  1. Consider the first \(km\) objects.

  2. If \(k+1\) of them fall into the same category, we are done.

  3. Otherwise, exactly \(k\) of them must fall into each category.

  4. Since \(n > km\), hence there is one more object.

  5. This object must fall into one of these \(m\) categories.

  6. Since every category already has \(k\) objects, hence there will be \(k+1\) objects in the category to which the \(km + 1\)-th object falls.

We show the application of the generalized pigeonhole principle to prove Erdős–Szekeres theorem.

Theorem 1.56 (Erdős–Szekeres theorem)

For every pair of integers \(a,b \geq 1\), if \(S\) is a sequence of \(ab + 1\) distinct real numbers, then there is either an increasing subsequence of length \(a + 1\) or a decreasing subsequence of length \(b + 1\) in \(S\).

Proof. Let \(n = a b + 1\). Define a function \(f\) which maps every element \(x \in S\) to the length of the longest increasing subsequence that begins with \(x\).

  1. If there exists some \(x \in S\) such that \(f(x) \geq a + 1\) then we are done.

  2. Otherwise, we have that \(f(x) \leq a\) for every \(x \in S\).

  3. We note that for every \(x \in S\), there exists an increasing subsequence of length \(1\).

  4. Hence \(f(x) \geq 1\) for every \(x \in S\).

  5. Hence we have \(1 \leq f(x) \leq a\) for every \(x \in S\).

  6. Hence, there are \(a\) possible values of \(f(x)\).

  7. Since \(n > a b\), there exist at least \(b + 1\) elements of \(S\) with the same value of \(f\), due to the generalized pigeonhole principle.

  8. Let \(T = \{t_1, t_2, \dots, t_{b+1} \}\) be a subsequence of \(S\) such that \(f(x)\) is identical for every \(x \in T\).

  9. We now claim that \(T\) is a decreasing subsequence of \(S\).

  10. Let \(x, y \in T\) such that \(x\) appears before \(y\) in \(S\).

  11. Since all values in \(S\) are distinct, hence \(x \neq y\).

  12. Assume that \(x < y\).

  13. Then by taking \(x\) followed by the increasing subsequence of \(S\) of length \(f(y)\) starting at \(y\), we can form an increasing subsequence of length \(f(y) + 1\) starting at \(x\).

  14. This contradicts the fact that \(f(x) = f(y)\).

  15. Hence \(x > y\) must hold.

  16. Hence for every \(t_i, t_j \in T\), such that \(i < j\), we have \(t_i > t_j\).

  17. Hence we have \(t_1 > t_2 > \dots > t_{b + 1}\).

  18. Hence there exists a decreasing subsequence of length \(b + 1\).
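The theorem can be verified by brute force for small parameters. A minimal Python sketch (the helper function is illustrative) checks the case \(a = b = 2\): every sequence of \(2 \cdot 2 + 1 = 5\) distinct numbers has a monotone subsequence of length \(3\):

```python
import itertools

def longest_monotone(seq, increasing=True):
    # O(n^2) dynamic program for the longest increasing (or decreasing)
    # subsequence length; assumes the values in seq are distinct.
    best = [1] * len(seq)
    for j in range(len(seq)):
        for i in range(j):
            if (seq[i] < seq[j]) == increasing:
                best[j] = max(best[j], best[i] + 1)
    return max(best, default=0)

# Erdős–Szekeres with a = b = 2: every sequence of 5 distinct numbers has
# an increasing or a decreasing subsequence of length 3.
for seq in itertools.permutations(range(5)):
    assert (longest_monotone(seq, True) >= 3
            or longest_monotone(seq, False) >= 3)
```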

1.11.11. Inclusion Exclusion Principle#

Recall from Remark 1.23 that for finitely many mutually disjoint finite sets, the number of elements of their union is equal to the sum of the number of elements of individual sets.

Inclusion-exclusion principle is a way of counting the number of elements of unions of finite sets which may be overlapping.

  1. For two sets, we add the counts of individual sets, then subtract the count of their intersection.

  2. For three sets:

    • Add the counts of individual sets

    • Subtract the counts of their pairwise intersections

    • Add the count of their triple intersection.

  3. For four sets:

    • Add individual counts

    • Subtract pairwise intersection counts

    • Add counts of intersections of every selection of three sets

    • Subtract the count of the intersection of all sets.

  4. And so on.

Theorem 1.57 (Inclusion-exclusion principle for 2 sets)

Let \(A\) and \(B\) be two finite sets. Then

\[ | A \cup B | = |A | + |B | - | A \cap B |. \]

Proof. We argue as follows.

  1. Recall that \(A \setminus B\) and \(A \cap B\) are disjoint and

    \[ A = (A \setminus B) \cup (A \cap B). \]
  2. Hence \(|A| = |A \setminus B| + |A \cap B|\).

  3. Similarly \(|B| = |B \setminus A| + |A \cap B|\).

  4. Also recall that

    \[ A \cup B = (A \setminus B) \cup (B \setminus A) \cup (A \cap B) \]

    and these three sets are disjoint.

  5. Hence \(|A \cup B | = |A \setminus B| + |B \setminus A| + |A \cap B|\).

  6. Substituting from previous identities

    \[ |A \cup B | = |A| - |A \cap B| + |B| - |A \cap B| + |A \cap B| = |A | + |B | - | A \cap B | \]

    as desired.
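The two-set identity is easy to check on random examples; a minimal Python sketch:

```python
import random

# |A ∪ B| = |A| + |B| - |A ∩ B|, checked on random finite sets.
rng = random.Random(2)
for _ in range(100):
    A = set(rng.sample(range(20), rng.randint(0, 10)))
    B = set(rng.sample(range(20), rng.randint(0, 10)))
    assert len(A | B) == len(A) + len(B) - len(A & B)
```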

We can now generalize the principle to a finite number of sets.

Theorem 1.58 (Inclusion-exclusion principle for \(n\) sets)

Let \(\AAA = \{ A_1, A_2, \dots, A_n \}\) be a family of \(n\) finite sets which are not necessarily disjoint. Then the size (cardinality) of their union is given by:

\[\begin{split} | A_1 \cup A_2 \cup \dots \cup A_n | &= S_1 - S_2 + S_3 - \dots + (-1)^{n + 1} S_n\\ &= \sum_{k=1}^n (-1)^{k+1}S_k \end{split}\]

where \(S_k\) is the sum of the cardinalities of all \(k\)-set intersections among the sets \(A_1, \dots, A_n\). In particular,

\[\begin{split} & S_1 = \sum_{i=1}^n | A_i| \\ & S_2 = \sum_{1 \leq i < j \leq n} | A_i \cap A_j | \\ & S_3 = \sum_{1 \leq i < j < k \leq n} | A_i \cap A_j \cap A_k |\\ & \vdots\\ & S_n = | A_1 \cap A_2 \cap \dots \cap A_n |. \end{split}\]

In general for every \(k \in 1,\dots,n\), we can write:

\[ S_k = \sum_{1 \leq i_1 < i_2 < \dots < i_k \leq n} | A_{i_1} \cap A_{i_2} \cap \dots \cap A_{i_k} |. \]

One can see that there are \(n \choose k\) ways of choosing \(k\) sets from the family \(\AAA\) of \(n\) sets. Hence, each \(S_k\) is a sum of \(n \choose k\) terms.

Proof. We prove this result by mathematical induction.

  1. Base case: The statement is true for \(n=1\), since

    \[ | A_1 | = S_1 = | A_1|. \]
  2. Inductive hypothesis: Assume that the statement is true for some \(n\).

  3. We shall now prove it for \(n+1\).

  4. Let \(\AAA = \{ A_1, \dots, A_{n+1}\}\) be a family of \(n+1\) sets.

  5. Let \(A = A_1 \cup A_2 \cup \dots \cup A_n\).

  6. Then \(A_1 \cup A_2 \cup \dots \cup A_n \cup A_{n+1} = A \cup A_{n + 1}\).

  7. By Theorem 1.57, we have

    \[ | A \cup A_{n+1} | = |A | + | A_{n+1} | - | A \cap A_{n+1} |. \]
  8. By inductive hypothesis:

    \[ |A| = \sum_{k=1}^n (-1)^{k+1}S_k \]

    where

    \[ S_k = \sum_{1 \leq i_1 < i_2 < \dots < i_k \leq n} | A_{i_1} \cap A_{i_2} \cap \dots \cap A_{i_k} |. \]
  9. Let \(B_i = A_i \cap A_{n+1}\) for \(i=1,\dots,n\).

  10. Then \(A \cap A_{n+1} = B_1 \cup B_2 \cup \dots \cup B_n\).

  11. By inductive hypothesis:

    \[ |A \cap A_{n+1}| = \sum_{l=1}^n (-1)^{l+1}T_l \]

    where

    \[\begin{split} T_l &= \sum_{1 \leq i_1 < i_2 < \dots < i_l \leq n} | B_{i_1} \cap B_{i_2} \cap \dots \cap B_{i_l} | \\ &= \sum_{1 \leq i_1 < i_2 < \dots < i_l \leq n} | A_{i_1} \cap A_{i_2} \cap \dots \cap A_{i_l} \cap A_{n+1}|. \end{split}\]
  12. We can now write

    \[\begin{split} &| A_1 \cup A_2 \cup \dots \cup A_n \cup A_{n+1} | \\ &= |A | + | A_{n+1} | - | A \cap A_{n+1} |\\ &= \sum_{k=1}^n (-1)^{k+1}S_k + | A_{n+1} | - \left ( \sum_{l=1}^n (-1)^{l+1}T_l \right) \\ &= S_1 + | A_{n+1} | + \sum_{k=2}^n (-1)^{k+1}S_k + \sum_{l=1}^{n-1} (-1)^{l+2} T_l + (-1)^{n + 2} T_n\\ &= S_1 + | A_{n+1} | + \sum_{k=2}^n (-1)^{k+1} (S_k + T_{k -1}) + (-1)^{n + 2} T_n. \end{split}\]
  13. We now define \(U_j\) for \(j=1,\dots,n+1\) as follows:

    1. For \(j=1\) we define

      \[ U_1 = S_1 + | A_{n+1} | = |A_1 | + |A_2| + \dots + |A_n| + | A_{n+1} |. \]
    2. For \(j=2,\dots,n\) we define

      \[\begin{split} U_j &= S_j + T_{j -1} \\ &= \sum_{1 \leq i_1 < i_2 < \dots < i_j \leq n} | A_{i_1} \cap A_{i_2} \cap \dots \cap A_{i_j} |\\ &+ \sum_{1 \leq i_1 < i_2 < \dots < i_{j-1} \leq n} | A_{i_1} \cap A_{i_2} \cap \dots \cap A_{i_{j-1}} \cap A_{n+1}| \\ &= \sum_{1 \leq i_1 < i_2 < \dots < i_j \leq n+1} | A_{i_1} \cap A_{i_2} \cap \dots \cap A_{i_j} |. \end{split}\]
      1. We can see that the terms in \(S_j\) are the intersections of \(j\) sets from \(\AAA\) not including \(A_{n + 1}\)

      2. The terms in \(T_{j-1}\) are intersections of \(j\) sets from \(\AAA\) including \(A_{n+1}\).

      3. Hence the terms in \(U_j\) are the intersections of \(j\) sets from \(\AAA\).

      4. \(U_j\) is indeed the sum of cardinalities of all \(j\)-set intersections of \(\AAA\).

    3. For \(j=n+1\) we define

      \[ U_{n+1} = T_n = | A_1 \cap A_2 \cap \dots \cap A_n \cap A_{n+1}|. \]

      This is nothing but the one and only intersection of all sets in \(\AAA\).

  14. We can see that

    \[ | A_1 \cup A_2 \cup \dots \cup A_n \cup A_{n+1} | = \sum_{j=1}^{n+1} (-1)^{j+1}U_j \]

    where

    \[ U_j = \sum_{1 \leq i_1 < i_2 < \dots < i_j \leq n+1} | A_{i_1} \cap A_{i_2} \cap \dots \cap A_{i_j} |. \]
  15. Hence the inclusion-exclusion principle statement is true for \(n+1\) also.

  16. Hence by mathematical induction, the statement is true for all \(n\).
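The alternating sum in Theorem 1.58 translates directly into code. A minimal Python sketch (the function name is illustrative) computes \(\sum_k (-1)^{k+1} S_k\) and compares it with the size of the union:

```python
import itertools
import random
from functools import reduce

def inclusion_exclusion(sets):
    # |A_1 ∪ ... ∪ A_n| = Σ_k (-1)^(k+1) S_k, where S_k sums the sizes of
    # all k-set intersections among the sets.
    n = len(sets)
    total = 0
    for k in range(1, n + 1):
        s_k = sum(len(reduce(set.intersection, combo))
                  for combo in itertools.combinations(sets, k))
        total += (-1) ** (k + 1) * s_k
    return total

# Check against the directly computed union on random families of sets.
rng = random.Random(3)
for _ in range(50):
    sets = [set(rng.sample(range(15), rng.randint(1, 8))) for _ in range(4)]
    assert inclusion_exclusion(sets) == len(set.union(*sets))
```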

1.11.12. Generating Functions#

A generating function is a way of encoding an infinite sequence of numbers \(\{ t_n \}\) by treating them as the coefficients of a formal power series.

Definition 1.127 (Ordinary generating function)

The ordinary generating function of a sequence \(t_0, t_1, \dots, t_n, \dots\) is given by

\[ G(t_n; x) \triangleq \sum_{n=0}^{\infty} t_n x^n = t_0 + t_1 x^1 + t_2 x^2 + \dots + t_n x^n + \dots. \]

A generating function is not evaluated for any specific value of \(x\). It is a formal structure which enables us to manipulate the sequences more conveniently.

Definition 1.128 (Exponential generating function)

The exponential generating function of a sequence \(t_0, t_1, \dots, t_n, \dots\) is given by

\[ E(t_n; x) \triangleq \sum_{n=0}^{\infty} t_n \frac{x^n}{n!}. \]
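One reason generating functions are convenient: multiplying two ordinary generating functions multiplies the formal power series, so the coefficients of the product are the convolution \(c_n = \sum_k a_k b_{n-k}\) of the two coefficient sequences. A minimal Python sketch on truncated series:

```python
def convolve(a, b):
    # Coefficient sequence of the product of two (truncated) ordinary
    # generating functions: c_n = sum_k a_k * b_{n-k}.
    c = [0] * (len(a) + len(b) - 1)
    for i, ai in enumerate(a):
        for j, bj in enumerate(b):
            c[i + j] += ai * bj
    return c

# (1 + x)^2 * (1 + x) = (1 + x)^3 = 1 + 3x + 3x^2 + x^3
assert convolve([1, 2, 1], [1, 1]) == [1, 3, 3, 1]
# (1 + x + x^2 + ...)^2 = 1 + 2x + 3x^2 + ... (compared on the first 5 terms)
assert convolve([1] * 5, [1] * 5)[:5] == [1, 2, 3, 4, 5]
```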

1.11.12.1. Generalized Binomial Theorem#

Recall from Theorem 1.43 that for a nonnegative integer \(n\), we have

\[ (1 + x)^n = \sum_{r=0}^n \binom{n}{r} x^r. \]

We now wish to generalize it for negative \(n\) also.

Definition 1.129 (Generalized binomial coefficient)

For any real \(n\) and any nonnegative integer \(r\), the generalized binomial coefficient is defined as

\[ {n \choose r} = \frac{n (n-1) \dots (n - r + 1)}{r!}. \]

Note that for a nonnegative integer \(n\) and \(r \leq n\), the definition agrees with the binomial coefficient as defined in Definition 1.122.

Theorem 1.59 (Binomial coefficient for negative integers)

If \(n\) is a positive integer, then

\[ {-n \choose r} = (-1)^r {n + r - 1 \choose r}. \]

Proof. We have

\[ {-n \choose r} = \frac{-n (-n-1) \dots (-n - r + 1)}{r!}. \]
  1. Factoring \(-1\) out of each term in the R.H.S. numerator gives us

    \[\begin{split} & (-1)^r \frac{n (n+1) \dots (n + r - 1)}{r!} \\ &= (-1)^r \frac{(n + r - 1) \dots (n+1) n }{r!}\\ &= (-1)^r \frac{(n + r - 1) \dots (n+1) n (n-1) \dots 1 }{r! (n-1)!}\\ &= (-1)^r \frac{(n +r - 1)!}{r! (n-1)!} \\ &= (-1)^r {n + r - 1 \choose r}. \end{split}\]
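The identity of Theorem 1.59 can be checked numerically. A minimal Python sketch (the helper `gbinom` is an illustrative implementation of Definition 1.129):

```python
import math

def gbinom(n: float, r: int) -> float:
    # Generalized binomial coefficient: n (n-1) ... (n - r + 1) / r!.
    num = 1.0
    for i in range(r):
        num *= n - i
    return num / math.factorial(r)

# Theorem 1.59: C(-n, r) = (-1)^r C(n + r - 1, r) for positive integers n.
for n in range(1, 8):
    for r in range(8):
        assert gbinom(-n, r) == (-1) ** r * math.comb(n + r - 1, r)
```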

Theorem 1.60 (Generalized binomial theorem)

For any \(n \in \RR\), we have

\[ (1 + x)^n = \sum_{r=0}^{\infty} {n \choose r} x^r. \]

Example 1.44

We shall show that

\[ (1-x)^{-1} = 1 + x + x^2 + \dots. \]
  1. Let \(y = -x\).

  2. Then \((1-x)^{-1} = (1 + y)^{-1}\).

  3. By Theorem 1.60,

    \[ (1 + y)^{-1} = \sum_{r=0}^{\infty} {-1 \choose r} y^r \]
  4. By Theorem 1.59

    \[ {-1 \choose r} = (-1)^r {1 + r - 1 \choose r} = (-1)^r { r \choose r} = (-1)^r. \]
  5. Hence

    \[ (1 + y)^{-1} = \sum_{r=0}^{\infty} (-1)^r y^r. \]
  6. But \(y^r = (-x)^r = (-1)^r x^r\).

  7. Putting back, we get

    \[ (1 - x)^{-1} = \sum_{r=0}^{\infty} (-1)^r (-1)^r x^r = \sum_{r=0}^{\infty} x^r. \]
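The final step can be confirmed coefficient by coefficient. A minimal Python sketch (the helper `gbinom` is an illustrative implementation of Definition 1.129) shows that every coefficient of \((1 - x)^{-1}\) is \(1\):

```python
import math

def gbinom(n: float, r: int) -> float:
    # Generalized binomial coefficient: n (n-1) ... (n - r + 1) / r!.
    num = 1.0
    for i in range(r):
        num *= n - i
    return num / math.factorial(r)

# x^r coefficient of (1 - x)^{-1} = Σ C(-1, r) (-x)^r is C(-1, r) (-1)^r = 1.
coeffs = [gbinom(-1, r) * (-1) ** r for r in range(10)]
assert coeffs == [1.0] * 10
```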