The “New” Taylor Swift: A Study

For the first time in three years, Taylor Swift is releasing new music, and at least according to her, it’s supposed to be new music. In her own words, the old Taylor is dead. This post takes a look at all of her music to date and tries to figure out (with help from the Spotify API and some lyrics scraping) how it’s changed over time, and whether or not the two singles from her new album really represent a radical departure from her earlier stuff.

Before we look at anything, I should be clear about exactly which songs are under consideration here: everything from the albums Taylor Swift, Fearless, Speak Now, Red, and 1989 as well as the two singles from Reputation that have been released (“Look What You Made Me Do” and “Ready For It”). The deluxe tracks from all those albums are fair game as well, except the ones from Taylor Swift because they’re not on Spotify. What’s not fair game is the music from her holiday collection or anything she recorded for a soundtrack (including “I Don’t Wanna Live Forever.”) These lines are somewhat arbitrary, but I think the albums are the best benchmarks of her evolution since they’re polished final products. Taken together, this gives 86 songs under consideration.

Code for all of this can be found here.

1. The Music

For this section, we consider only the sound of the songs, ignoring any lyrics. We quantify the sound of songs using Spotify’s audio features. Specifically, I took what they call the acousticness, danceability, energy, liveness, loudness, tempo, and valence for each song.

1.1. Records and Rankings

The first thing I did was look at the extremes: the songs that had the highest and lowest value for each of the seven features. I thought that if one of the new singles showed up on this list, it would be a sign that her music was moving in a new direction.

Feature    Lowest   Highest
Acousticness State Of Grace Never Grow Up
Danceability This Love Hey Stephen
Energy Never Grow Up Haunted
Liveness The Story Of Us Better Than Revenge
Valence This Love Shake It Off
Tempo This Love Long Live
Loudness Sad Beautiful Tragic Picture To Burn

 

But neither of them did. The two things I did notice were that “This Love” is clearly out there for her and that the only song on here that was a big hit was “Shake It Off.” Maybe extremes aren’t good for chart success.

The other thing I did right off the bat was, for each feature, rank each of her full albums by the average value for that feature. This doesn’t say anything about her new music, but does give some sense of how she’s changed in the past.

Rank    Acousticness   Danceability   Energy   Liveness
1 Speak Now 1989 1989 Speak Now
2 Taylor Swift Red Speak Now Taylor Swift
3 Fearless Fearless Taylor Swift Fearless
4 1989 Speak Now Fearless 1989
5 Red Taylor Swift Red Red

 

Rank    Valence   Tempo   Loudness
1 Red Taylor Swift Speak Now
2 1989 Speak Now Taylor Swift
3 Taylor Swift 1989 Fearless
4 Speak Now Fearless 1989
5 Fearless Red Red

What really stood out to me here was that Red and 1989 nearly always appeared next to each other, with two exceptions: tempo, where they were separated by Fearless, and Energy, where they appeared on opposite ends of the list! So there’s something going on here where Red and 1989 are very similar, but different in some key way (a way that’s captured by the energy feature).

Also, in the acousticness and liveness features, the albums generally go from old to new, which is to be expected since her music became more manufactured over time.

1.2. Visualization

To get a feel for the data set as a whole, I found a two-dimensional representation using t-SNE. Here’s a plot (with song names truncated at 15 characters to improve readability):

tswift_album.png

That plot is pretty tough to read even with the truncation, so here’s a PDF that should be more accessible, especially if you zoom in.

In this graph, you can clearly see that 1989 is her most consistent album (there’s 10 songs in a relatively small area), followed by Red. Her first album on the other hand is all over the place. So, interpreted with some generosity, the graph shows her developing a sense of style over time.

Looking more closely at the area 1989 is concentrated in, I noticed two things. One is that this area also contains a lot of her big hits, like “You Belong With Me,” “We Are Never Ever Getting Back Together” and “Mine.” But more notably, it also contains both of the new singles! In two dimensions, Taylor Swift’s new music is indistinguishable from her old music.

1.3 Clustering

To make concrete the notions of clusters in the previous section, I did some clustering. Specifically, I ran k-means on the seven dimensional points and then visualized the results in a plot similar to the previous ones. I chose to look at 2 and 5 clusters, hoping that they would correspond to the old Taylor/new Taylor and her 5 full albums, respectively.

The two clusters respect the structure in the visualization almost exactly:

cluster2.png

Here’s the PDF. What’s more important is how this clustering interacts with the albums, which can be seen in this graph:

cluster2bar.png

This isn’t too informative. We’d hope that one of the clusters increases steadily from album to album, but the ratios seem to be roughly constant. Let’s recreate these same two graphs with 5 clusters:

cluster5scatter

cluster5.png

Again, the PDF. These graphs tell a much more interesting story: the first three albums are acoustically scattershot, drawing roughly equally from all the clusters. Then, in Red, she seems to split, toying equally with clusters 0 and 1. Between the two of them, it’s cluster 1 that wins out, and forms the bulk of 1989, with some small appearances from the others. And (as expected based on our earlier observations), the songs so far from Reputation also fit into cluster 5, making them nothing too new.

It’s worth taking a second to translate back from “clusters” to more meaningful terms. Looking at the scatter plot, cluster 0 seems to correspond to upbeat pop songs (like “Red” and “Shake It Off”) while cluster 1 has slower, mellower, songs (like “The Lucky One” and “Teardrops on My Guitar”). And on Red, all of the hit songs were from cluster 0, not cluster 1, maybe explaining her decision to switch over for her next album.

(The other clusters are harder to interpret, but if you figure anything out, let me know!)

2. The Lyrics

As any avid Taylor Swift fan knows though, the real meat of her songs is in the lyrics, so I turned to those next. Thankfully, most of the scraping work had already been done, so I just piggy-backed off that.

2.1. Word Clouds

I started by making word clouds for each of her full albums, just because it was easy and cute:

This slideshow requires JavaScript.

They’re not too informative, but they are fun to look at.

2.2. Topic Models

For more substantive analysis, I fit a topic model to the collection of lyrics. I tried latent Dirichlet allocation and non-negative matrix factorization, with the latter giving better results. Somewhat arbitrarily, I fit models with 5 topics. Here is, for each of the 5 topics the model found, the top 20 words:

  • Topic #1: love beautiful cause back baby time say way everything one home take eyes smile feel right better could little well
  • Topic #2: come back movie would gone sinks york miss could figured leave today spelling guess feeling dream new somethings somehow id
  • Topic #3: stay mad hey palm time lock quite want easy lovin hand grow funny late people let beautiful say best well
  • Topic #4: ever ooh talk together getting 22 friends grow back telling yeah nights feels alright mean called remember night used one
  • Topic #5: girl thats works trying want strong alone ever everybody worse yeah knows forever goes wait heart would place get rain

These aren’t too easy to make sense of, so let’s bring in some visualizations. First, treating the percentage contribution of each topic to a song as a feature, I used t-SNE to visualize the songs again:

tswift_lyric_albums.png

Another tough to read plot, so here’s the PDF. The relatively clusters are pretty promising, as is the fact that 1989 is again the most tightly clustered album – it’s both acoustically and thematically unified. The big swath through the bottom has a lot of canonical Taylor Swift love songs (“Fifteen” and “Speak Now” for example) that definitely belong together, but it’s not clear why something like “The Story of Us” was pushed off to the other cluster, since it’s clearly a love song as well.

But it turns out that this plot is a little misleading. If we recolor it by dominant topic (i.e. a songs color is determined by the topic that contributes most heavily to it), we get a different story:

topic_scatter.png

(The PDF.) Now it’s clear that (as many people correctly conclude without any analysis at all) nearly all of her songs are predominantly love songs drawing from Topic 1, and the clusters found by t-SNE are mostly artificial.

It’s interesting to look at the songs that aren’t dominated by this topic. A couple of them are clearly not love songs (“Welcome to New York” and “22”) but some are clearly love songs that the model misinterpreted. For example, the scatter plot suggests that the model put “Stay Stay Stay” and “All You Had To Do Was Stay” together into a topic of songs about staying, and looking at the key words from the topic confirms this.

Similarly, topic 5 picked up on the word “girl” in both “How You Get The Girl” and “Girl At Home,” as well as a line from the chorus of “A Place In This World” and thrust those together.

None of these other topics picked up on the new singles though. They are, at least according to the model, standard Taylor Swift songs.

I had hoped that the model would be able to separate, for example, songs about falling in love and songs about breaking up, but some combination of short songs and non-distinctive vocabulary prevents this from happening.

For one last graph, I plotted the topic composition across albums:

topic_bar.png

Obviously this graph is a little suspect because of the issues I mentioned above, but it does suggest that the canonical “Taylor Swift love song” became significantly less important to her work as she aged, in parallel with her shift toward more pop music.

3. Conclusion

Based on both the audio and lyrics of her songs, there’s a clear pivot of some kind in Taylor Swift’s music as she moved away from her country roots. But there’s no reason to believe that that shift continues in her new work, which is extremely similar to 1989. Contrary to what she’d have us believe, the old Taylor isn’t dead at all.

Advertisements

Percolation VI: Dependent Percolation

This is the final post of the series on percolation. It concludes the proof of Harris-Kesten by using a technique called dependent percolation to conclude that {p_T\geq \frac{1}{2}}. In conjunction with the result of the previous post, that {p_H=\frac{1}{2}}, and the obvious fact that {p_T\leq p_H}, this allows us to conclude that {p_T=\frac{1}{2}} as well.

1. Dependent Percolation

Let {G} be a graph and {\tilde{\mathop{\mathbb P}}} a probability measure on the set of site configurations of {G}. We call this measure {k}-independent if

\displaystyle |S-T|=\min_{s\in S, t\in T} |s-t| \geq k

means that the states of {S} are independent of the states of {G}. (The distance used here is the one induced by {G}.) The {k=1} case of this is just independent percolation, but {k>1} is a bit more interesting.

The result we’ll need is the following lemma:

Lemma 1. Suppose {k, \Delta\geq 2} are positive integers and {G} is a graph with maximum degree {\Delta}. There exist constants {p_1} and {a}, both depending on {k} and {\Delta}, such that if {\tilde{\mathop{\mathbb P}}} is a {k}-independent percolation measure in which each site is open with probability at most {p_1},

\displaystyle \tilde{\mathop{\mathbb P}}(|C_v|\geq n)\leq e^{-an}

for all {v\in V(G)}.

Proof: Suppose {|C_v|\geq n}. Then the open subgraph contains a tree with {n} vertices, include {v}. There are at most {(e\Delta)^{n-1}} such trees, so fix one. If {w} is some other vertex of {G}, at most

\displaystyle 1+\Delta+\cdots+\Delta^{k-1}\leq \Delta^k

vertices of {G} are within {k-1} units of {w}. So there is a subset {S} of size at least {n\Delta^{-k}} of the vertices of {T} so that any {a,b\in S} are at least {k} apart. (This can be constructed greedily.)

These vertices in {S} are open independently, so the probability of every vertex of {T} being open is at most {p_1^{|S|}}. Thus,

\displaystyle \tilde{\mathop{\mathbb P}}(|C_v|\geq n)\leq (e\Delta)^{n-1}p_1^{n\Delta^{-k}}\leq\left(e\Delta p_1^{1/\Delta^k}\right)^n.

Taking {p_1} small enough to make {r=e\Delta p_1^{1/\Delta^k}<1} and {a=-\log r}, we have the result. \Box

2. Removing Dependence

The last step in our proof is to make the leap from the result about dependent percolation to an analogous result about standard bond percolation.

Theorem 2. For bond percolation on {{\mathbb Z}^2}, if {p<\frac{1}{2}}, there is a constant {a} such that

\displaystyle \mathop{\mathbb P}_p(|C_0|\geq n)\leq e^{-an}.

Proof: Take some {p<\frac{1}{2}} and let {p_1} be the constant for which the result of the preceding lemma holds with {k=5} and {\Delta=4}. For convenience, set {c=\sqrt[4]{1-p}}. As usual, {\Lambda^*} is the dual of {\Lambda={\mathbb Z}^2}. Our result from the last post showing that {h_p(\rho n, n)} could be bounded below by {1-n^{-\gamma}} tells us that there is an {m} such that {h_{1-p}(3m,m)>c}.

Let {s=m+1}, and consider an {s\times s} square {S}. We can arrange four {3m\times m} rectangles to form an annulus whose interior is a square, which we can take to be {S}. The probability of each of these rectangles being crossed the long way to form an open path through the annulus is at least {c^4=1-p_1} by Harris’s lemma. If {B(S)} is the event that a site in {S} is connected by an open path to a site at distance {s} (this is the graph distance) from {S}, then the fact that this open path could not cross the open dual cycle gives {\mathop{\mathbb P}_p(B(S))\leq p_1}.

Now, define a site percolation measure {\tilde{\mathop{\mathbb P}}} in the following way. Take each {(x,y)\in {\mathbb Z}^2} to be open iff {B(S)} holds for the square

\displaystyle S_v=[sx+1, sx+s]\times [sy+1, sy+s].

Note that the event {B(S_v)} only depends on the bonds within distance {s} of {S_v}, so {\tilde{P}} is 5-independent and each {v} is open with probability {\leq p_1} (by the argument of the previous paragraph). Now let {C_0} be the open cluster containing the origin in bond percolation and {C_0'} the open cluster containing the origin in this new percolation.

From the lemma, we have

\displaystyle \tilde{P}(|C_0'|\geq n)\leq e^{-an}.

On the other hand, if {|C_0|> (4s+1)^2}, each site {w} of {C_0} is connected by an open path of length {2s} to another site. If {w\in S_v}, this means that {B(S_v} holds, and so if {|C_0|> (4s+1)^2}, {B(S_v)} holds for all the {v} such that {S_v} contains sites of {C_0}.

That is, a site {v} is open in the new percolation we defined if and only if {S(B_v)} hold sin bond percolation. In particular, every {v} such that {S_v} and {C_0} do not intersect is open in our new percolation, so that set is a subset of {C_0'}. Each {S_v} only contains {s^2} sites, so for {n\geq (4s+1)^2}, we have

\displaystyle \mathop{\mathbb P}_p(|C_0|\geq n)\leq \tilde{\mathop{\mathbb P}}(|C_0'|\geq n/s^2)\leq e^{-an/s^2},

giving the result. \Box

This now implies that {\chi(p)<\infty} for {p<\frac{1}{2}}, since exponential decay guarantees a finite expected cluster size. Since {p_T\leq p_H=\frac{1}{2}}, we finally conclude that for bond percolation on {{\mathbb Z}^2},

\displaystyle p_T=p_H=\frac{1}{2}.

Percolation V: Kesten’s Theorem

Last time we outlined the four steps of the proof of Harris-Kesten and worked through the first. This time, we’ll do the second: showing that for {p>1/2}, there is an infinite open cluster with probability 1, allowing us to conclude that {p_H=\frac{1}{2}}. We begin with two lemmas.

1. Some Lemmas

Recalling our discussion of the combinatorial hypercube, we say that a bond {e} is pivotal for an event {E} in a configuration {\omega} if switching the state of {e} takes {\omega} into or out of {E}. The influence of {e} on {E} is

\displaystyle I_p(e,E)=\mathop{\mathbb P}_p(e\text{ is pivotal for }E).

For {E} increasing, this simplifies to

\displaystyle \mathop{\mathbb P}_p(\omega^+\in E, \omega^{-}\not\in E),

where {\omega^{\pm}} are configurations that differ from {\omega} only at {e}.

Lemma 1. Let {R} be an {m\times n} rectangle in {{\mathbb Z}^2}. For a bond {e} in {R},

\displaystyle I_p(e, H(R))\leq 2\mathop{\mathbb P}_{1/2}(r(C_0)\geq \min\{m/2-1, (n-1)/2\})

for {0<p<1}.

Proof: For this proof, we ignore all bonds outside {R}. The argument is a little weird: we’ll prove that

\displaystyle I_p(e, H(R))\leq 2\mathop{\mathbb P}_p(r(C_0)\geq m/2-1)

and

\displaystyle O_p(e, H(R))\leq 2\mathop{\mathbb P}_{1-p}(r(C_0)\geq (n-1)/2).

Then, noting that {\mathop{\mathbb P}_p(r(C_0)\geq a)} is an increasing function of {p}, the first allows us to conclude for {p\leq 1/2} and the second for {p\geq 1/2}.

Since {\omega^+\in H(R)} and {\omega^-\not\in H(R)}, any horizontal crossing of {R} uses {e}. Thus, in {\omega}, each endpoint of {e} is joined by an open path to either end of {R}, at least one of which must have length {\geq m/2-1}, so the first bound given above follows.

Since {\omega^-\not \in H(R)}, a previous lemma tells us that {\omega^-\in V(R^h)}. Similarly, {\omega^+\not \in V(R^h)}, so {\omega^-} contains an open dual path crossing {R^h} and using the dual edge of {e}. So that edge has an endpoint which is the start of an open dual path of length {(n-1)/2}, and so we get the second bound stated above. \Box

Lemma 2. Fix {p>1/2} and an integer {\rho>1}. There exist constants {\gamma} (that depends on {\rho}) and {n_0} (that depends on {p} and {\rho}) such that

\displaystyle h_p(\rho n,n)\geq 1-n^{-\gamma}

for all {n\geq n_0}.

Proof: Recall that in a remark last time, we mentioned that

\displaystyle h(m_1+m_2-2n, 2n)\geq \frac{h(m_1, 2n)h(m_2, 2n)}{2}.

Setting {m_1=m, m_2=3n} gives

\displaystyle h(m+n, 2n)\geq \frac{h(m,2n)h(3n,2n)}{2}\geq \frac{h(m,2n)}{2^8},

and it follows from this that

\displaystyle h(kn,2n)\geq 2^{17-8k}

for {k\geq 3}. (We’re using some bounds on {h(3n, 2n)} from last post here.) Since {h(m, 2n+1)\geq h(m, 2n)}, there must be a constant {h_k} such that {h(kn,n)\geq h_k}. The consequence of this we need here is that {\mathop{\mathbb P}_{1/2}(H(R))\geq h_{\rho}.}

Now, in the proof of Harris’s theorem, we showed that

\displaystyle \mathop{\mathbb P}_{1/2}(r(C_0)\geq n)\leq n^{-c},

and combining this with the previous lemma gives

\displaystyle I_{p'}(e, H(R))\leq n^{-a}=\delta

for all bonds {e\in R} and {p'\in [0.5,p]}. Let {f(p')=\mathop{\mathbb P}_{p'}(H(R))}. Using a lemma proved in the second post in this series, we get the bound

\displaystyle \sum_{e\in H(R)} I_{p'}(e, H(R))\geq cf(p')(1-f(p'))\log(1/\delta).

But another result from that second post tells us that the sum on the LHS is the derivative of {f(p')}, so we let

\displaystyle g(p')=\log\frac{f(p')}{1-f(p')},

and write

\displaystyle \frac{d}{dp'}g(p')=\frac{1}{f(p')(1-f(p'))}\frac{d}{dp'}f(p')\geq c\log(1/\delta)=ac\log n.

Our observation above that {\mathop{\mathbb P}_{1/2}(H(R))\geq h_{\rho}} says that {g(1/2)} is bounded below by a constant depending on {\rho}. So for {n\geq n_0}, we get {g(p)\geq ac(p-0.5)\log n}, and the result follows. \Box

2. Kesten’s Theorem

This is the last piece in the proof that {p_H({\mathbb Z}^2)=\frac{1}{2}}.

Theorem 3 (Kesten’s Theorem). Let {E_{\infty}} be the event that there is an infinite open cluster. Then for bond percolation in {{\mathbb Z}^2}, if {p>1/2}, we have {\mathop{\mathbb P}_p(E_{\infty})=1.}

Proof: Take {p>1/2} and choose {\gamma} (depending on {p}) and {n_0} be as in the second lemma with {\rho=2}. Choose some large {n}, and let {R_1,\cdots} be a family of rectangles whose bottom left corner is the origin and which have sides of length {2^kn} and {2^{k+1}n}. The longer side is the vertical for even {k} and the horizontal for odd {k}.

Suppose {E_k} is the event that {R_k} is crossed in the longer direction by an open path. Any two such paths must meet, so if every {E_k} holds, so does {E_{\infty}} (since the open cluster containing the origin has unbounded radius). Now we write

\displaystyle \mathop{\mathbb P}_p(E_{\infty})\geq \mathop{\mathbb P}_p\left(\bigcap_{k=0}^{\infty} E_k\right)=1-\mathop{\mathbb P}_p\left(\bigcup_{k=0}^{\infty} E_k^c\right)\geq 1-\sum_{k=0}^{\infty} \mathop{\mathbb P}_p(E_k^c).

But by the final lemma of the previous section, and the fact that {n} is arbitrarily large,

\displaystyle \sum_{k=0}^{\infty} \mathop{\mathbb P}(E_k^c)\leq \sum_{k=0}^{\infty} (2^kn)^{-\gamma}=\frac{n^{-\gamma}}{1-2^{-\gamma}}<1.

So {\mathop{\mathbb P}_p(E_{\infty})>0}, and thus 1. \Box

The result proven at the end of the last post was that {p_H>\frac{1}{2}} while this result shows that {p_H\leq \frac{1}{2}}, so we conclude that in fact {p_H=\frac{1}{2}}. In the next and final post, we’ll turn our attention to {p_T}.

Percolation IV: Harris’s Theorem

We’re ready to start proving the Harris-Kesten theorem, so from now on, everything we say is about bond percolation in {{\mathbb Z}^2}. The global structure of the argument is in three steps:

  1. Show that {\theta(1/2)=0} (this implies {p_H\geq 1/2})
  2. Show that for {p>1/2}, there is an infinite open cluster with probability 1 (this implies {p_H\leq 1/2})
  3. Show that {\chi(p)<\infty} for {p<1/2} (this implies that {p_T\geq 1/2})
  4. Conclude that {p_T=p_H=1/2}

This post will accomplish the first step.

1. Crossing Probabilities

We start with some more definitions: a rectangle {R} in {{\mathbb Z}^2} is a set of sites of the form {[a,b]\times [c,d]}. We say this rectangle has size {k\times \ell}, where {k=b-a+1, \ell=d-c+1}. It contains {k\ell} sites and {2k\ell-k-\ell} bonds.

This rectangle has two duals: a horizontal dual

\displaystyle R^h=[a+0.5, b-0.5]\times [c-0.5, d+0.5]

and a vertical dual

\displaystyle R^v=[a-0.5, b+0.5]\times [c+0.5, d-0.5].

We have that {R^h} is a {k-1\times \ell+1} rectangle and {R^v} is a {k+1\times \ell-1} rectangle. Additionally,

\displaystyle \left(R^h\right)^v=\left(R^v\right)^h=R.

Finally, we define an open horizontal crossing of a rectangle {R} to be an open path in {R} connecting a site {(a,x)} to a site {(b,y)}, and similarly for an open vertical crossing. We write {H(R)} and {V(R)} for the events that such crossings exist.

The following lemma is, according to Bollobas, the “reason” why {p_H=1/2}. It is intuitively clear, but working through all the details is a bit of slog, so let’s just take it at face value.

Lemma 1. Let {R} be a rectangle in {{\mathbb Z}^2} or its dual. Then exactly one of {H(R)} and {V(R^h)} holds.

Why exactly is this lemma the “reason” cited above? The following corollary sheds some light on that.

Corollary 2. If {R} and {R'} are {k\times \ell-1} and {k-1\times \ell} rectangles in {{\mathbb Z}^2}, then

\displaystyle \mathop{\mathbb P}_p(H(R))+\mathop{\mathbb P}_{1-p}(V(R'))=1.

Proof: The bonds of the dual are open when the bonds of the original graph are closed and vice versa. So by Lemma 1, each configuration of bonds will satisfy exactly one of {H(R)} and {V(R^h)}, so

\displaystyle \mathop{\mathbb P}(H(R))+\mathop{\mathbb P}(V(R^h))=1.

But {R^h} is indistinguishable from {R'}, so the result follows. \Box

Now if we take {k=\ell} in this corollary, we get that for an {n+1\times n} rectangle {R},

\displaystyle \mathop{\mathbb P}_{1/2}(H(R))+\mathop{\mathbb P}_{1/2}(V(R'))=1.

But because the rectangle {R'} has dimension {n\times n+1}, the event {V(R')} has the same probability as {H(R')}, so they are both equal to {1/2}.

So the intuition is somehow that when {p>1/2}, it’s more likely you have a horizontal crossing than not, so you get an infinite open cluster. When {p<1/2}, it’s less likely than not, so you don’t. The critical behavior is at exactly {p=1/2}.

More importantly, we get that if {S} is an {n\times n} square,

\displaystyle \mathop{\mathbb P}_{1/2}(V(S))=\mathop{\mathbb P}_{1/2}(H(S))\geq \frac{1}{2},

where we have the inequality because this event must be more likely than the crossing for an {n\times n+1} rectangle. This is a bound on the crossing probabilities of a square. We now have to extend this result to rectangles.

2. Squares to Rectangles

This next lemma relates crossings of squares to crossings of rectangles. We use the notation {[n]=[0,n]}.

Lemma 3. Consider {R=[m]\times [2n]}, an {m\times 2n} rectangle with {m\geq n}. Let {X(R)} be the event that there are open paths {P_1, P_2} such that {P_1} crosses the {n\times n} square {S=[n]\times [n]} from top to bottom and {P_2} lies within {R} and joins some site on {P_1} to the right hand side of {R}. Then

\displaystyle \mathop{\mathbb P}_p(X(R))\geq \frac{\mathop{\mathbb P}_p(H(R))\mathop{\mathbb P}_p(V(S))}{2}.

Proof: Suppose {V(S)} occurs, so an open path {P_0} crosses {S} from top to bottom. Such a path cuts {S} into two pieces, one on the left and on the right. Let {LV(S)} be the left-most open vertical crossing (when one exists). For any {P_1}, the event {\{LV(S)=P_1\}} does not depend on bonds to the right of {P_1}.

The key claim is that

\displaystyle \mathop{\mathbb P}_p(X(R)\mid LV(S)=P_1)\geq \frac{\mathop{\mathbb P}(H(R))}{2}.

Since {V(S)} is a disjoint union of events of the form {\{LV(S)=P_1\}}, this implies that

\displaystyle \mathop{\mathbb P}_p(X(R)\mid V(S))\geq\frac{\mathop{\mathbb P}_p(H(R))}{2},

which rearranges to

\displaystyle \mathop{\mathbb P}_p(X(R)\cap V(S))\geq \frac{\mathop{\mathbb P}_p(H(R))\mathop{\mathbb P}_p(V(S))}{2}.

But {X(R)} occurring means that {V(S)} does as well, so {\mathop{\mathbb P}_p(X(R)\cap V(S))=\mathop{\mathbb P}_p(X(R))}, and the lemma is proven.

For the claim, let {P} be the path (that may or may not be open) formed by the union of {P_1} and it’s reflection {P_1'} in the horizontal axis of {R}, with a bond joining {P_1} to {P_1'}. This path is a vertical crossing of the rectangle {[n]\times [2n]}.

Now, there is an open path {P_3} crossing {R} horizontally with probability {\mathop{\mathbb P}_p(H(R))}. This path must meet {P}, and the probability it does so at a site of {P_1} is at least {\mathop{\mathbb P}_p(H(R))/2}. Thus the event that there is an open path {P_2} in {R} to the right of {P} joining a site of {P_1} to the right end of {R} has probability at least {\mathop{\mathbb P}_p(H(R))/2}. This event depends only bonds to the right of {P}, and all such bonds in {S} are to the right of {P_1}. These bonds are independent of {LV(S)=P_1}, so

\displaystyle \mathop{\mathbb P}_p(Y(P_1)\mid LV(S)=P_1)=\mathop{\mathbb P}_p(Y(P_1))\geq \mathop{\mathbb P}_p(H(R))/2.

If {Y(P_1)} and {LV(S)=P_1} both occur, then {X(R)} does as well. So

\displaystyle \mathop{\mathbb P}_p(X(R)\mid LV(S)=P_1)\geq \mathop{\mathbb P}_p(H(R))/2,

proving the claim and thus the lemma. \Box

Remark 1. If we instead take an {m_!\times 2n} rectangle {R} and an {m_2\times 2n} rectangle {R'}, this proof will show that

\displaystyle h(m_1+m_2-n, 2n)\geq \frac{h(m_1, 2n)h(m_2,2n)}{2^5},

and we will use this formulation later on.

(This argument is much simpler than it seems – drawing the picture goes a long way.)

3. Thinning the Rectangles Out

With this lemma, we can now bound the crossing probabilities for non-square rectangles. In particular, we look at a {3n\times 2n} rectangle. For convenience, let {h_p(m,n)=\mathop{\mathbb P}_p(H(R))}, and since it’s the case we really care about, {h(m,n)=h_{1/2}(m,n)}.

Corollary 4. We have {h(3n, 2n)\geq 2^{-7}}.

Proof: Take two square {2n\times 2n} rectangles {R, R'} that overlap in an {n\times 2n} strip. This {n\times 2n} strip contains two {n\times n} squares; call {S} the smaller of these. (Again, draw the picture!) Let {X'(R')} be the event {X(R)} described above, but reflected horizontally. Then we apply Lemma 4 to {R} (which is coincidentally a square) to get

\displaystyle \mathop{\mathbb P}_p(X'(R'))=\mathop{\mathbb P}_p(X(R))\geq \frac{\mathop{\mathbb P}_p(H(R))\mathop{\mathbb P}_p(V(S))}{2}.

Now, we call back to Harris’s lemma from the previous post: the events {X(R), X'(R'), H(S)} are all increasingly and positive correlated. If they all hold, so does {H(R\cup R')}, and so we have

\displaystyle h(3n,2n)=\mathop{\mathbb P}_{1/2}(H(R\cup R'))\geq \mathop{\mathbb P}_{1/2}(X'(R'))\mathop{\mathbb P}_{1/2}(X(R))\mathop{\mathbb P}_{1/2}(H(S)).

This is then

\displaystyle \geq \frac{\mathop{\mathbb P}_{1/2}(H(R))^2\mathop{\mathbb P}_{1/2}(V(S))^2\mathop{\mathbb P}_{1/2}(H(S))}{4}.

But by our observation at the end of Section 1, this is at least

\displaystyle \frac{(0.5^2)^3}{4}=2^{-7},

as desired. \Box

A {3n\times 2n} rectangle isn’t quite thing enough for our purposes, but the rest of the thinning can be done using the remark after Lemma 3. We have

\displaystyle h(5n,2n)\geq \frac{h(3m,2n)^2}{2^5}\geq 2^{-19},

and so

\displaystyle h(6n,2n)\geq \frac{h(5n,2n)h(2n,2n)}{2^5}\geq 2^{-25},

and this is the bound that our proof of {\theta(1/2)=0} will hinge on.

4. Harris’s Theorem

Theorem 5. For bond percolation on {{\mathbb Z}^2}, {\theta(1/2)=0}.

Proof: Let

\displaystyle r(C_0)=\sup\{d(x,0)\mid x\in C_0\}.

The actual result we’ll show is that

\displaystyle \mathop{\mathbb P}_{1/2}(r(C_0)\geq n)\leq n^{-c}

for a constant {c}, and the stated result follows from this.

When {p=1/2}, bonds in the dual lattice are open with probability {1/2}, so by the computation above, the probability a {6n\times 2n} rectangle has a crossing the long way is at least {2^{-25}}. Consider 4 such rectangles oriented to form an annulus around the origin, with the “hole” in the annulus being a {2n\times 2n} square centerd at the origin.

By Harris’s lemma, the probability each of these rectangles is crossed the long way by an open path in the dual is at least {\epsilon=2^{-100}}. The union of these paths is an open dual cycle around the annulus.

Let {A_k} be the square annulus centered on {(0.5, 0.5)} with inner radius {4^k} and outer radius {3\cdot 4^k}. Let {E_k} be the event that {A_k} contains an open dual cycle surrounding the inside of {A_k}, and thus the origin.

By our argument above, {\mathop{\mathbb P}(E_k)\geq \epsilon} for each {k}. All of the {E_k} are independent, and if {E_k} holds, then there is no path from inside {A_k} to outside {A_k}, so {r(C_0)\leq 1+3\cdot 4^k\leq 4^{k+1}.}

Then,

\displaystyle \mathop{\mathbb P}\left(r(C_0)\geq 4^{\ell+1}\right)\leq \mathop{\mathbb P}_{1/2}\left(\bigcap_{k=1}^{\ell} E_k^c\right)=\prod_{k=1}^{\ell} \mathop{\mathbb P}_{1/2}(E_k^c)\leq (1-\epsilon)^{\ell},

which is the desired bound. \Box

Percolation III: Some Lemmas

The next chapter of Bollobas is a collection of probabilistic tools used throughout the book; this post presents two of these results that will be key to our proof of the Harris-Kesten theorem. Both of these results are set in the framework of weighted hypercubes, which we now introduce.

Take a set {S} and its power set {\mathcal{P}(S)}. This power set can be identified with the set of functions {f: S\rightarrow \{0,1\}} by identifying {A\in \mathcal{P}(S)} with its characteristic function. In the case where {S} is finite (and this is the case we care about), we can think of {S} as {\{1,\cdots, n\}} and {\mathcal{P}(S)=\mathcal{P}(n)} as the cube {Q^n=\{0,1\}^n.} This cube is the set of binary sequences of length {n}, where a subset {A\subset S} is associated to a sequence that has a 1 in position {i} if {i\in A} and a 0 otherwise.

More geometrically, we can think of {Q^n} as a graph whose vertices are {\mathcal{P}(n)}. Two sets {A, B} are connected in this graph if {|A\Delta B|=1} (i.e. {A} and {B} differ at one entry in their sequences). Put yet another way, if {\mathbf{a}, \mathbf{b}\in \{0,1\}^n} are two vertices, they are connected if {a-b=\pm \mathbf{e}_i}, where {\mathbf{e}_i} is a standard basis vector. We can make {Q^n} a directed graph by saying the edge {\mathbf{a}\mathbf{b}} is oriented from {\mathbf{a}} to {\mathbf{b}} if {\mathbf{b}-\mathbf{a}=\mathbf{e}_i}. We write this as {\mathbf{a}\leq \mathbf{b}}.

Now, shifting gears, the points of {Q^n} are associated with the outcome of a sequence of {n} coin tosses. If {\mathbf{p}=(p_1,\cdots, p_n)} is the sequence of probabilities of success for these {n} coin tosses, then { \mathbf{p}} induces a natural probability measure on {Q^n}.

For example, if {\mathbf{p}=(0.5,\cdots, 0.5)} we obtain the uniform measure, and if {\mathbf{p}=(p,\cdots, p)}, we obtain {\mathop{\mathbb P}(A)=p^{|A|}(1-p)^{n-|A|}}. We write {Q_p^n} for the cube with this measure, and it is equivalent to the standard percolation models.

We write {Q_{\mathbf{p}}^n} for the probability space {(\mathop{\mathbb P}_{\mathbf{p}}, Q^n)} and distinguish two kinds of events. A subset {U\subset Q^n} is an up-set if {a,b\in Q^n, a\leq b\implies b\in U}. Similarly, {U} is a down-set if {a,b \in Q^n, a\geq b\implies b\in U}. (Keep in mind that {U} is a subset of {Q^n}, which is identified with {\mathcal{P}(S)}, so {a, b} are subsets of the original set {S}.) The corresponding events are referred to as increasing and decreasing respectively.

With that set up out of the way, we can state our first lemma.

1. Harris’s Lemma

This lemma is the intuitive fact that increasing events are positively correlated. Let’s do a bit more set up.

For a set {A\subset Q^n} and {t=0,1} we can define

\displaystyle A_t=\{(a_i)_{i=1}^{n-1}\mid (a_1,\cdots, a_{n-1}, t)\in A\}\subset Q^{n-1}.

If {A} is an up-set, {A_0\subset A_1}, and if {A} is a down-set, {A_1\subset A_0} (because {A_1, A_0} always differ at only one position, so their difference is always a basis vector, so it’s just a matter of sign).

If we write {\mathop{\mathbb P}} for the probability measure on both {Q_{(p_1,\cdots, p_n)}^n} and {Q_{(p_1,\cdots, p_{n-1})}^{n-1}}, we have

\displaystyle \mathop{\mathbb P}(A)=(1-p_n)\mathop{\mathbb P}(A_0)+p_n\mathop{\mathbb P}(A_1).

Finally, the result:

Lemma 1. Let {A, B} be subsets of {Q_{\mathbf{p}}^n}. If they are both up-sets or both down-sets, then {\mathop{\mathbb P}(A\cap B)\geq \mathop{\mathbb P}(A)\mathop{\mathbb P}(B)}. If {A} is an up-set and {B} is a down-set, then {\mathop{\mathbb P}(A\cap B)\leq \mathop{\mathbb P}(A)\mathop{\mathbb P}(B)}.

(This implies the correlation statement, since the first condition rearranges to {\mathop{\mathbb P}(A\mid B)\geq \mathop{\mathbb P}(A)} and similarly for the second.)

Proof: We induct on {n}. The case {n=1} is trivial, so suppose the result holds for {n-1}.

If {A} and {B} are both up-sets or both down-sets, then either {A_0\subset A_1} and {B_0\subset B_1} or {A_1\subset A_0} and {B_1\subset B_0}. This means that

\displaystyle (\mathop{\mathbb P}(A_0)-\mathop{\mathbb P}(A_1))(\mathop{\mathbb P}(B_0)-\mathop{\mathbb P}(B_1))\geq 0.

Our argument will rely on the equivalent fact that

\displaystyle \mathop{\mathbb P}(A_0)\mathop{\mathbb P}(A_1)+\mathop{\mathbb P}(B_0)\mathop{\mathbb P}(B_1)\geq \mathop{\mathbb P}(A_1)\mathop{\mathbb P}(B_0)+\mathop{\mathbb P}(A_0)\mathop{\mathbb P}(B_1).

Now we write

\displaystyle \mathop{\mathbb P}(A\cap B)=(1-p_n)\mathop{\mathbb P}(A_0\cap B_0)+p_n \mathop{\mathbb P}(A_1\cap B_1)

which is

\displaystyle \geq (1-p_n)\mathop{\mathbb P}(A_0)\mathop{\mathbb P}(B_0)+p_n\mathop{\mathbb P}(A_1)\mathop{\mathbb P}(B_1).

We cleverly rewrite this as

\displaystyle (1-p_n)^2\mathop{\mathbb P}(A_0)\mathop{\mathbb P}(B_0)+p_n^2\mathop{\mathbb P}(A_1)\mathop{\mathbb P}(B_1)+(1-p)p(\mathop{\mathbb P}(A_0)\mathop{\mathbb P}(B_0)+\mathop{\mathbb P}(A_1)\mathop{\mathbb P}(B_1)),

(yes, they’re really equal!) and use our earlier inequality to see that this is

\displaystyle \geq (1-p_n)^2\mathop{\mathbb P}(A_0)\mathop{\mathbb P}(B_0)+p_n^2\mathop{\mathbb P}(A_1)\mathop{\mathbb P}(B_1)+(1-p)p(\mathop{\mathbb P}(A_0)\mathop{\mathbb P}(B_1)+\mathop{\mathbb P}(A_1)\mathop{\mathbb P}(B_0).

But this last line is equal to

\displaystyle \left[(1-p_n)\mathop{\mathbb P}(A_0)+p_n\mathop{\mathbb P}(A_1)\right]\left[(1-p_n)\mathop{\mathbb P}(B_0)+p_n\mathop{\mathbb P}(B_1)\right]=\mathop{\mathbb P}(A)\mathop{\mathbb P}(B),

so we conclude.

From this first piece, the other inequality follows easily: if {A} is an up-set and {B} is a down-set, then {B^c} is an up-set, so

\displaystyle \mathop{\mathbb P}(A\cap B)\leq\mathop{\mathbb P}(A)-\mathop{\mathbb P}(A\cap B^c)\leq \mathop{\mathbb P}(A)-\mathop{\mathbb P}(A)\mathop{\mathbb P}(B^c),

from the previous result. This then is

\displaystyle \mathop{\mathbb P}(A)-\mathop{\mathbb P}(A)(1-\mathop{\mathbb P}(B))=\mathop{\mathbb P}(A)\mathop{\mathbb P}(B),

as desired. \Box

2. The Russo-Margulis Lemma

Take an event {A} in the weighted cube {Q_{\mathbf{p}}^n}. We want to understand how {\mathop{\mathbb P}(A)} varies with {\mathbf{p}}.

For a point {\omega\in Q^n}, we say that entry {i}, {\omega_i}, is pivotal for {A} if only one of {\omega} and {\tilde{\omega}} are in {A}, where {\tilde{\omega}} is {\omega} with the entry at {\omega_i} flipped. Then, we define the influence of entry {i} as

\displaystyle \beta_i(A)=\mathop{\mathbb P}(\omega_i\text{ is pivotal for } A)=\mathop{\mathbb P}(\{\omega\mid \omega_i\text{ is pivotal for }A\}).

Lemma 2. Let {A} be an increasing event in {Q_{\mathbf{p}}^n} where {\mathbf{p}=(p_1,\cdots, p_n)}. Then

\displaystyle \frac{\partial}{\partial p_i} \mathop{\mathbb P}(A)=\beta_i(A)\implies \frac{d}{dp}\mathop{\mathbb P}(A)=\sum_{i=1}^n \beta_i(A).

Proof: Assume without the loss of generality that {i=n}. If we have {\mathbf{x}=(x_1,\cdots, x_k)\in Q^k} for {k\leq n}, write {\mathbf{p}^{\mathbf{x}}} for the probability of {x} in the measure induced by {Q_{\mathbf{p}}^n} on {Q^k}. In particular, if {k=n}, then {\mathbf{p}^{\mathbf{x}}=\mathop{\mathbb P}(\{x\})}.

Also, for {\mathbf{x}\in Q^{n-1}}, let {\mathbf{x}_+} and {\mathbf{x}_-} be the events formed by extending {\mathbf{x}} with a 1 or 0 in the last slot, respectively. We have

\displaystyle \mathbf{p}^{\mathbf{x}_+}+\mathbf{p}^{\mathbf{x}_-}=\mathbf{p}^{\mathbf{x}}

by construction.

Now take an up-set {A}, and let {A_a} be the set of {\mathbf{x}\in Q^{n-1}} such that {\mathbf{x}_+} and {\mathbf{x}_-} are both in {A}, and {A_b} be the same set but such that only {\mathbf{x}_+} is. Then,

\displaystyle \mathop{\mathbb P}(A)=\sum_{\mathbf{x}\in A_a} \left(\mathbf{p}^{\mathbf{x}_+}+\mathbf{p}^{\mathbf{x}_-}\right)+\sum_{\mathbf{x}\in A_b}\mathbf{p}^{\mathbf{x}_+},

which, by our earlier observation, becomes

\displaystyle \sum_{\mathbf{x}\in A_a} \mathbf{p}^{\mathbf{x}}+p_n\sum_{\mathbf{x}\in A_b} \mathbf{p}^{\mathbf{x}}.

It’s clear now that

\displaystyle \frac{\partial}{\partial p_n} \mathop{\mathbb P}(A)= \sum_{\mathbf{x}\in A_b} \mathbf{p}^{\mathbf{x}},

but since {A_b} is exactly the set of {\mathbf{x}} for which {n} is pivotal, this sum is {\beta_n(A)}. \Box

That proof is a notational mess, but the idea is simple: think of an element of {Q^n} as an element of {Q^{n-1}}, plus a bit of information that tells you whether it is pivotal, and then use this decomposition to directly take the derivative.

Of course, this result is only useful if you actually understand things about the {\beta_i(A)}. I won’t prove it, but we have this bound: if {\mathop{\mathbb P}(A)=t} and {\beta_i(A)\leq \delta} for each {i}, then

\displaystyle \sum_{i=1}^n \beta_i(A)\geq ct(1-t)\log(1/\delta).

This bound will be the second ingredient necessary in our proof of Harris-Kesten, which we will finally begin with in the next post.

Percolation II: The Integer Lattice

This post presents a couple of results regarding percolation on {{\mathbb Z}^d} (and especially {{\mathbb Z}^2}), which are valuable mainly because they tell us that percolation on these graphs is non-trivial i.e. none of the critical probabilities is 0 or 1. But on a concrete level, they actually do more: they give explicit bounds on the critical probabilities.

Proposition 1. For bond percolation on {{\mathbb Z}^2}, we have {p_T\geq \frac{1}{3}}

This proof hinges on counting {\mu_n({\mathbb Z}^2)}, the number of self-avoiding walks of length {n} (in the language of graph theory, paths) on {{\mathbb Z}^2}. We will make use of the obvious bound

\displaystyle \mu_n({\mathbb Z}^2)\leq 4\cdot 3^{n-1}.

Proof: The middle inequality was discussed in the previous post, so it suffices to do the first and last ones. Let’s start with the first.

Choose some {p<\frac{1}{3}} and consider the bond percolation on {{\mathbb Z}^2} where every bond is open with probability {p}. Let {C_0} be the cluster containing the origin and note that any site in {C_0} must be connected to the origin by an open path starting at the origin. (It’s a little hard to see, but this is where the self-avoiding condition comes in: if the walk from the origin to the site intersected itself, we could “excise” the loop and then get another path that’s more likely to be open, because it is shorter.) The size of {C_0} is bounded by the number {X} of such paths and the probaibility of a path of length {n} being open is {p^n}, so

\displaystyle \mathop{\mathbb E}(|C_0|)\leq \mathop{\mathbb E}(X)=\sum_{n=0}^{\infty} p^n\mu_n\leq 1+\sum_{n=1}^{\infty} p^n\cdot 4\cdot 3^{n-1}<\infty,

because {p<\frac{1}{3}}. Since the expected size of {C_0} is finite for any {p<\frac{1}{3}}, we conclude that {p_T\geq \frac{1}{3}}. \Box

Proposition 2. For bond percolation on {{\mathbb Z}^2}, we have {p_H\leq \frac{2}{3}}

This proof hinges on studying the dual graph of {{\mathbb Z}^2} (more on this in a second) and using a technical lemma that I won’t prove.

For a graph {\Lambda}, the dual {\Lambda^*} is a graph whose vertices correspond to faces of {\Lambda} with edges connecting adjacent faces in {\Lambda}. For example,

\displaystyle \left({\mathbb Z}^2\right)^*={\mathbb Z}^2+(0.5,0.5).

In the context of percolation, we call the sites and bonds of {\Lambda^*} dual sites and dual bonds, with a dual bond taken to be closed when the original bond is open, or vice versa. (We’re associating bonds to dual bonds here by {e\mapsto e^*} where {e^*} is the edge of {\Lambda^*} connecting the faces of {\Lambda} that meet at {e}).

In {{\mathbb Z}^2}, if {H} is a finite subgraph of {{\mathbb Z}^2} with vertex set {C}, then {{\mathbb Z}^2\setminus C} must have some infinite component {C_{\infty}}. We define {\partial^{\infty} C} to be the bonds of {\left({\mathbb Z}^2\right)^*} that are dual to bonds of {{\mathbb Z}^2} joining {C} and {C_{\infty}}.

Put another way, you take some finite subgraph, and look at all the edges connecting it to the infinite component of it’s complement. This set of edges maps to some set of edges in the dual, and that set is {\partial^{\infty} C}. The lemma says that {\partial^{\infty} C} looks like the boundary of {H}.

Lemma 3. {\partial^{\infty} C} is a cycle containing {C}.

Proof: As before, take some {p>\frac{2}{3}}, and consider bond percolation on {{\mathbb Z}^2} where each bond is open with probability {p}. Take dual bonds {e^*} to be open if {e} is closed and vice versa, and call a cycle in the dual consisting of open bonds an open dual cycle.

Let {L_k} be the segment connecting the origin to {(k,0)} and {S} some dual cycle of length {2\ell} surrounding {L_k}. {S} must cut the {x}-axis between {k+0.5} and {0.5(2\ell-3)}, so there are fewer than {\ell} choices for the dual bond that cuts the {x}-axis. The rest of {S} is a path of length {2\ell-1}, and since the dual of {{\mathbb Z}^2} is isomorphic to {{\mathbb Z}^2}, there are {\mu_{2\ell-1}} possibilities for this portion of {S}. In total, we have fewer than {\ell \mu_{2\ell-1}} possibilities for {S}.

Let {Y_k} be the number of open dual cycles surrounding {L_k}. An open dual cycle of length {2\ell} occurs with probability {(1-p)^{2\ell}}, so

\displaystyle \mathop{\mathbb E}(Y_k)\leq \sum_{\ell=k+2}^{\infty} \ell\mu_{2\ell-1}(1-p)^{2\ell}\leq \sum_{\ell =k+2}^{\infty} \ell\cdot 4\cdot 3^{2\ell-2}(1-p)^{2\ell}<\infty,

because {p>\frac{2}{3}\implies 3(1-p)<1}.

This means that {\mathop{\mathbb E}(Y_k)\rightarrow 0} as {k\rightarrow \infty}, and in particular, {\mathop{\mathbb E}(Y_k)<1} eventually. If {A_k} is the event that {Y_k=0}, this means that {\mathop{\mathbb P}(A_k)>0} (if {P(A_k)} were never zero, then integrating over a set of measure 1 would have to give something that is at least 1). Now let {B_k} be the event that all the bonds in {L_k} are open, which occurs with probability {p^k}, which is independent of {A_k}.

If {A_k} and {B_k} both occur, then {C_0} cannot be contained in a cycle of the dual graph (since {Y_k=0}), so it must be infinite by our lemma. Thus

\displaystyle \theta_0(p)=\mathop{\mathbb P}(|C_0|=\infty)\geq \mathop{\mathbb P}(A_k\cap B_k)= \mathop{\mathbb P}(A_k)\mathop{\mathbb P}(B_k)\geq \mathop{\mathbb P}(A_k)p^k>0,

so there is an infinite open cluster with probability 1. This means that {p_H\geq p}, and since {p} is any probability {>\frac{2}{3}}, we have {p_H\leq \frac{2}{3}}. \Box

Remark 1. \leavevmode

  1. Recalling that {p_T\leq p_H}, we can chain these together into

    \displaystyle \frac{1}{3}\leq p_T\leq p_H\leq \frac{2}{3},

    showing that the critical probabilities for bond percolation on {{\mathbb Z}^2} are non-trivial.

     

  2. The bound of {\mu_n\leq 4\cdot 3^{n-1}} is crude, and these inequalities can be sharpened by using tighter inequalities. For example, noting that {\mu_{m+n}\leq \mu_m\mu_n} (you can just concatenate paths), we see by Fekete’s lemma that {\mu_n^{1/n}\rightarrow \lambda}. This {\lambda} is called the connective constant, and our result becomes

    \displaystyle \frac{1}{\lambda}\leq p_T\leq p_H\leq 1-\frac{1}{\lambda},

    so we just need to estimate this constant, which I won’t get into but can certainly be done.

     

  3. Since {{\mathbb Z}^d} contains {{\mathbb Z}^2} within it for {d\geq 2}, we can recall the result at the end of the previous post to obtain

    \displaystyle \frac{1}{2d-1}\leq p_T({\mathbb Z}^d)\leq p_H({\mathbb Z}^d)\leq \frac{2}{3},

    giving non-triviality of critical probabilities for {{\mathbb Z}^d}.

 

These results are only for bond percolation. We can use them to say things about site percolation as well though, by showing that percolation is “more likely” in the site model than the bond model.

Theorem 4. For any graph {\Lambda}, we have {p_H^{\mathrm{s}}\geq p_H^{\mathrm{b}}} and {p_T^{\mathrm{s}}\geq p_T^{\mathrm{b}}}.

Proof: The key claim is that for any site {x}, integer {n}, and probability {p},

\displaystyle \mathop{\mathbb P}^{\text{s}}(|C_x|\geq n)\leq p \mathop{\mathbb P}^{\text{b}}(|C_x|\geq n).

(Here, {\mathop{\mathbb P}^{\text{b}}} and {\mathop{\mathbb P}^{\text{s}}} are the probability measures induced by bond and site percolation where bonds and sites are open with probability {p}, respectively. Throughout this proof, we use such superscripts to indicate the type of percolation being considered.)

From this claim, we let {n\rightarrow \infty} to obtain {\theta_x^{\text{s}}\leq p\theta_x^{\text{b}}}. Then, if {\theta_x^{\text{b}}=0}, we have {\theta_x^{\text{s}}=0} as well, so {p_H^{\text{s}}\geq p_H^{\text{b}}}.

For {p_T}, we argue similarly:

\displaystyle \chi_x^{\text{s}}=\sum_{n=1}^{\infty} \mathop{\mathbb P}^{\text{s}}(|C_x|\geq n)\leq \sum_{n=1}^{\infty} \mathop{\mathbb P}^{\text{b}}(|C_x|\geq n)\leq p\chi_x^{\text{b}},

and so {\chi_x^{\text{b}}<\infty\implies \chi_x^{\text{s}}<\infty}, meaning that {p_T^{\text{s}}\geq p_T^{\text{b}}.}

It remains to show the claim. We begin by writing {\mathop{\mathbb P}^{\text{s}}(|C_x|\geq n)} as

\displaystyle \mathop{\mathbb P}^{\text{s}}(|C_x|\geq n\mid x\text{ is open})\mathop{\mathbb P}^{\text{s}}(x\text{ is open})+\mathop{\mathbb P}^{\text{s}}(|C_x|\geq n\mid x\text{ is closed})\mathop{\mathbb P}^{\text{s}}(x\text{ is closed}).

If {x} is closed, {C_x} is empty, so the second term is zero. Thus we see that

\displaystyle \mathop{\mathbb P}^{\text{s}}(|C_x|\geq n)=p\mathop{\mathbb P}^{\text{s}}(|C_x|\geq n\mid x\text{ is open}),

and so our claim is equivalent to

\displaystyle p\mathop{\mathbb P}^{\text{s}}(|C_x|\geq n\mid x\text{ is open})\geq \mathop{\mathbb P}^{\text{b}}(|C_x|\geq n).

Next, we note that since the event {|C_x^{\text{s}}|\geq n} depends only on vertices that are within {n} of {x}, it suffices to consider the subgraph {\Lambda_n} of points within {n} of {x}. The idea now is to construct a sequence of subsets of {V(\Lambda_n)} (the vertex set of {\Lambda_n)} denoted {\mathcal{T}=(R_t, D_t, U_t)_{t=1}^{\ell}} such that {R_{\ell}= C_x^{\text{s}}}. The constant {\ell} is random, and the letters {R, D, U} stand for reached, dead, and untested respectively.

The construction is as follows:

  1. Let {R_1=\{x\}, U_1=V(\Lambda)\setminus \{x\}, D_1=\emptyset}
  2. Pick a bond {e_t=y_tz_t} where {y_t\in R_t} and {z_t\in U_t}. Remove {z_t} from {U_t}. If {y_t} is open, add it to {R_t} and leave {D_t} alone; otherwise add it to {D_t} and leave {R_t} alone. This gives the new triplet.
  3. Repeat 2 until no such bond {e_t} exists, and set {t=\ell}, completing the process. (This must happen eventually because {\Lambda_n} is finite.)

In the end, {R_{\ell}} is a connected set of open sites with no neighbors in {U_{\ell}} that is disjoint from the sites of {D_{\ell}} which are all closed, so it is exactly {C_x^{\text{s}}}. Note that the conditional probability of {z_t} being open is always {p}.

Now, consider an analogous sequence for {C_x^{\text{b}}}, {\mathcal{T}'}. The difference here is that after picking {e_t}, we check whether {e_t} is open (conditioned on the sequence up to that open). This probability is always {p} as well, so {\mathcal{T}} and {\mathcal{T}'} have the same distribution.

In particular, {|C_x^{\text{s}}|=|R_{\ell}|} and {|R'_{\ell'}|} have the same distribution. Since {R'_{\ell'}} is contained in {C_{x}^{\text{b}}} (it is the vertex set of the subgraph spanned by the bond we found to be open), the claim follows, concluding the proof. \Box

So this shows that {p_H^{\text{s}}, p_T^{\text{s}}} are not zero. To show that they are also not 1, we need another inequality, namely:

Theorem 5. For percolation on an infinite connected graph with maximum degree {\Delta},

\displaystyle p_H^{\mathrm{s}}\leq 1-(1-p_H^{\mathrm{b}})^{\Delta-1}.

The proof is similar, but more involved, so I’ll skip it here. The upshot of all this is that {p_T^{\text{s}}, p_T^{\text{b}}, p_H^{\text{b}}, p_H^{\text{s}}} are all non-trivial for {{\mathbb Z}^2}. In the next few posts, we’ll take the next step and actually determine {p_T^{\text{b}}} and {p_H^{\text{s}}}.

Percolation I: Critical Probabilities

This is the first in a series of posts about percolation theory, which I’m studying this summer. The main text I’m using is Bollobas’s book on the subject, and I plan to follow it pretty closely. There should be around six posts, culminating in the proof of the Harris-Kesten theorem, a major result about percolation on {{\mathbb Z}^2}. This material corresponds to the first three chapters of Bollobas, with some omissions and rearrangement.

1. The Big Picture

The idea is to consider some graph (prototypically {{\mathbb Z}^d} but theoretically any graph), and choose a random subgraph of this graph by selecting each edge to be part of the subgraph with probability {p}. Alternatively, do the same thing with vertices instead of edges, and take the subgraph to be the one induced by the chosen vertices.

Some terminology:

  • The edges are bonds
  • The vertices are sites
  • The chosen edges/vertices are open
  • The chosen graph is the open subgraph
  • The connected components of the open subgraph are open clusters

According to this language, the two models described above are bond percolation and site percolation.

The physical motivation for this model is a liquid percolating through a solid lattice, where the open bonds/sites represent the bonds/sites that the liquid can pass through. The main question suggested by this interpretation is whether the liquid will be able to pass through the lattice completely, which corresponds to the existence of an infinite open cluster.

The Harris-Kesten theorem mentioned above answers this question for bond percolation on {{\mathbb Z}^2}. It shows that for {p>\frac{1}{2}}, there is an infinite open cluster with probability 1, and for {p<\frac{1}{2}} there is an infinite open cluster with probability 0.

This post takes the first step toward proving this result by defining two critical probabilities that are closely related to this question, and then computing them for some simple graphs. Before we can even take that step though, we must construct a probability space {(\Sigma, \mathop{\mathbb P})} in which to work. For a graph {G=(V,E)}, we take:

  • {\Omega=\{0,1\}^{|E|}} to be the set of all arrangements of open edges, where 0 represents a closed edge and 1 represents an open edge; 
  • {\Sigma} to be the {\sigma}-algebra on {\Omega} generated by the cylinders

    \displaystyle C(F, \sigma)=\{\omega \in \Omega\mid \omega_f = \sigma_f\text{ for } f\in F\},\quad F\subset E\text{ and }\sigma =\{0,1\}^{|F|};

     

  • {\mathop{\mathbb P}} to be the probability measure on {\Sigma} induced by

    \displaystyle \mathop{\mathbb P}(C(F, \sigma))=\left(\prod_{\substack{f\in F\\ \sigma_f=1}} p\right)\left(\prod_{\substack{f\in F\\ \sigma_f=0}} 1-p\right).

In everything that follows, all probabilities are defined according to this measure.

2. Critical Probabilities

For a site {x}, let {C_x} be the open cluster containing {x}, and let {\theta_x(p)} be the probability that {C_x} is infinite.

Proposition 1. There is a critical probability {p_H} such that {\theta_x(p)=0} if {p<p_H} and {\theta_x(p)>0} if {p<p_H}, for all {x}.

(The notation {p_H} is in honor of Hammersley, who first proposed this model.)

Proof: Take any two sites {x, y} such that {|x-y|=d} (this is the distance induced by {G}). Then {\theta_x(p)\geq p^d \theta_y(p)}, since if there is an infinite open cluster containing {y} and a path from {y} to {x}, there will be an infinite open cluster containing {x} as well. Thus, if {\theta_x(p)=0} for one site, then {\theta_x(p)=0} for all sites, and if {\theta_x(p)>0} for one site, then {\theta_x(p)>0} for all sites. If we show that {\theta_x(p)} is an increasing function of {p}, the existence of {p_H} follows.

To see the increasing behavior (which is intuitively clear), consider assigning each edge {e} an independent random variable {X_e\sim \text{Uni}[0,1]}, and then taking it to be open if {X_e<p}. If {p_1<p_2}, the subgraph obtained with probability {p_1} will always be contained in the subgraph obtained with probability {p_2}, so {\theta_x(p)} must be increasing. \Box

We now connect this {p_H} to our original question about the existence of an infinite open cluster.

Corollary 2. Let {E} be the event that there is an infinite open cluster. Then {\mathop{\mathbb P}(E)=1} if {p>p_H} and {\mathop{\mathbb P}(E)=0} if {p<p_H}.

Proof: The event {E} is a tail event, so by the Kolmogorov 0-1 law, it can only take the values 0 and 1. If {p<p_H}, then {\mathop{\mathbb P}(E)\leq \sum_{\text{sites }x} \theta_x(p)=0,} so {\mathop{\mathbb P}(E)=0}. If {p>p_H}, then {\mathop{\mathbb P}(E)\geq \theta_x(p)>0} for at least one site, so {\mathop{\mathbb P}(E)=1}. \Box

So to understand when there is an infinite open cluster, it suffices to understand {p_H}.

When {p<p_H}, any {C_x} is almost surely finite, but it’s expected size might not be. Set {\chi_x(p)=\mathop{\mathbb E}(|C_x|)}. By the same arguments as before, {\chi_x(p)} increases with {p}, and it is either finite for all sites or for no sites. Thus we define

\displaystyle p_T=\sup\{p\mid \chi_x(p)<\infty\},

with the notation in honor of Temperley this time.

We have been deliberately ambiguous until now about whether we are working with bond percolation or site percolation in this section, and in fact we mean both. That is, there are actually four critical probabilities: {p_T^{\text{b}}, p_T^{\text{s}}, p_H^{\text{b}}, p_H^{\text{s}}}. And since each of these depends on the choice of graph, we actually need to write {p_T^{\text{s}}(G)}. This notation is a bit cumbersome though, so we refrain from using it except where absolutely necessary.

One obvious relationship between {p_T} and {p_H} is that {p_T\leq p_H}. Some more difficult relations are known, but we’ll look at those in a later post. For now, let’s do an example.

3. Computing the Critical Probabilities

Determining {p_T} and {p_H} in general is quite hard (after all, the Harris-Kesten theorem basically boils down to doing this for {{\mathbb Z}^2}, and is already highly non-trivial), but there are some simple graphs for which it can be done. We do an example with bond percolation.

Suppose {G=T_k} is the tree in which every vertex has degree {k+1}, except for the root which has degree {k}. Call the root {v_0} and let {T_{k,n}} be the subgraph obtained by truncating this graph {n} levels down from the root. We write {\pi_n} for the probability that, in bond percolation with probability {p}, there is an open path of length {n} from {v_0} to a leaf of {T_{k,n}}.

For there to be an open path of length {n} from {v_0} to a leaf of {T_{k,n}}, there must be an open edge between {v_0} and some other vertex, and then an open path from that vertex to a leaf. This gives

\displaystyle \pi_n=1-(1-p\pi_{n-1})^k.

Consider the function {f(x)=1-(1-px)^k}. The {\pi_n} result from iterating this function starting with {\pi_0=1}, so we need to study it’s fixed points. We have {f(0)=0} and {f(1)<1}. If {f'(0)>1}, then {f(\epsilon)>\epsilon} for some small {\epsilon>0}, but {f(1)<1}, so there is some fixed point {x_0}. Otherwise, the unique fixed point is at {x=0}. Concretely, {f'(0)=pk}, so this condition reduces to {p>\frac{1}{k}}.

In the case where {p>\frac{1}{k}}, we see that {\pi_{n-1}\geq x_0\implies \pi_n\geq x_0}, and since {\pi_0=1\geq x_0}, we always have {\pi_n\geq x_0}. But this {\pi_n} is a lower bound on {\theta_{v_0}(p)}, since if there is an infinite open cluster containing {v_0} there must be a path to the bottom, so {\theta_{v_0}(p)>0}, and so {p_H\leq \frac{1}{k}}.

On the other hand, if {p\leq \frac{1}{k}}, the {\pi_n} converge to the unique fixed point at 0, meaning that {\theta_{v_0}(p)=0}. It follows that {p_H=\frac{1}{k}} for {T_k}.

We can also do {p_T}: a site at depth {\ell} is part of {C_{v_0}} with probability {p^{\ell}}, so

\displaystyle \chi_{v_0}(p)=\sum_{y\in T_k} \mathop{\mathbb P}(y\in C_{v_0})=\sum_{\ell=0}^{\infty} k^{\ell}p^{\ell},

which is finite if and only if {pk<1}. Thus {p_T=\frac{1}{k}} as well.

A few remarks about this calculation: the fact that {p_T=p_H} here is not unusual. This is true for a very general class of graphs.

Also, it turns out that this tree {T_k} has, in some sense, the lowest critical probability possible. To see this, take {G} to be a graph with maximum degree {\Delta}. If {y\in C_x}, then there is an open path from {y} to {x}, so we can bound {\chi_x(p)} by the number of paths in {G} that start at {x}. Using our information about the maximum degree, this gives

\displaystyle \chi_x(p)\leq 1+\sum_{\ell=1}^{\infty} \Delta(\Delta-1)^{\ell-1} p^{\ell},

which converges exactly when {p<\frac{1}{\Delta-1}}. This means that {p_T\geq \frac{1}{\Delta-1}}, and since {p_H\geq p_T}, this bound holds for {p_H} as well. In {T_k}, {\Delta=k+1}, and this bound is attained.

So the tree {T_k} is special in more ways than one. It turns out that the more “tree-like” a graph is, the more easily {p_T} and {p_H} can be computed. Unfortunately, {{\mathbb Z}^2} is nothing like a tree, so a different approach is required to tackle it.