For the first time in three years, Taylor Swift is releasing new music, and at least according to her, it’s supposed to be new music. In her own words, the old Taylor is dead. This post takes a look at all of her music to date and tries to figure out (with help from the Spotify API and some lyrics scraping) how it’s changed over time, and whether or not the two singles from her new album really represent a radical departure from her earlier stuff.
Before we look at anything, I should be clear about exactly which songs are under consideration here: everything from the albums Taylor Swift, Fearless, Speak Now, Red, and 1989, as well as the two singles from Reputation that have been released (“Look What You Made Me Do” and “Ready For It”). The deluxe tracks from all those albums are fair game as well, except the ones from Taylor Swift, because they’re not on Spotify. What’s not fair game is the music from her holiday collection or anything she recorded for a soundtrack (including “I Don’t Wanna Live Forever”). These lines are somewhat arbitrary, but I think the albums are the best benchmarks of her evolution since they’re polished final products. Taken together, this gives 86 songs under consideration.
Code for all of this can be found here.
1. The Music
For this section, we consider only the sound of the songs, ignoring any lyrics. We quantify the sound of songs using Spotify’s audio features. Specifically, I took what they call the acousticness, danceability, energy, liveness, loudness, tempo, and valence for each song.
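As a rough illustration of what the data looks like at this point, here’s a minimal sketch that turns a list of audio-feature records into a songs-by-features array. It doesn’t call the Spotify API; the `tracks` records and the `to_feature_matrix` helper are hypothetical, and the numbers are made up rather than real Spotify values.

```python
import numpy as np

# The seven Spotify audio features used in this post.
FEATURES = ["acousticness", "danceability", "energy", "liveness",
            "loudness", "tempo", "valence"]


def to_feature_matrix(tracks):
    """Turn a list of audio-feature dicts into an (n_songs, 7) float array."""
    return np.array([[t[name] for name in FEATURES] for t in tracks], dtype=float)


# Hypothetical records with made-up values (NOT real Spotify data).
# Extra keys such as "name" or "key" are simply ignored.
tracks = [
    {"name": "Song A", "acousticness": 0.82, "danceability": 0.45, "energy": 0.31,
     "liveness": 0.11, "loudness": -9.5, "tempo": 92.0, "valence": 0.30, "key": 5},
    {"name": "Song B", "acousticness": 0.05, "danceability": 0.71, "energy": 0.88,
     "liveness": 0.20, "loudness": -4.2, "tempo": 128.0, "valence": 0.95, "key": 7},
]

X = to_feature_matrix(tracks)
print(X.shape)
```

Note that loudness (in decibels) and tempo (in beats per minute) live on very different scales from the other five features, which matters for anything distance-based later on.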
1.1. Records and Rankings
The first thing I did was look at the extremes: the songs that had the highest and lowest value for each of the seven features. I thought that if one of the new singles showed up on this list, it would be a sign that her music was moving in a new direction.
|Feature|Lowest|Highest|
|---|---|---|
|Acousticness|State Of Grace|Never Grow Up|
|Danceability|This Love|Hey Stephen|
|Energy|Never Grow Up|Haunted|
|Liveness|The Story Of Us|Better Than Revenge|
|Valence|This Love|Shake It Off|
|Tempo|This Love|Long Live|
|Loudness|Sad Beautiful Tragic|Picture To Burn|
But neither of them did. The two things I did notice were that “This Love” is clearly an outlier for her (it sits at an extreme for danceability, valence, and tempo) and that the only big hit on the list is “Shake It Off.” Maybe extremes aren’t good for chart success.
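Finding the extremes is a one-liner per direction in pandas. This is only a sketch on a made-up table (the song names and values are placeholders, and only two of the seven features are shown), not the code behind the table above.

```python
import pandas as pd

# Made-up feature values, indexed by (placeholder) song name.
df = pd.DataFrame(
    {"energy": [0.31, 0.88, 0.55], "valence": [0.30, 0.95, 0.10]},
    index=["Song A", "Song B", "Song C"],
)

# idxmin/idxmax return, for each column, the index label of the extreme row.
extremes = pd.DataFrame({"lowest": df.idxmin(), "highest": df.idxmax()})
print(extremes)
```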
The other thing I did right off the bat was, for each feature, rank each of her full albums by the average value for that feature. This doesn’t say anything about her new music, but does give some sense of how she’s changed in the past.
|1||Speak Now||1989||1989||Speak Now|
|2||Taylor Swift||Red||Speak Now||Taylor Swift|
|1||Red||Taylor Swift||Speak Now|
|2||1989||Speak Now||Taylor Swift|
What really stood out to me here was that Red and 1989 nearly always appeared next to each other, with two exceptions: tempo, where they were separated by Fearless, and energy, where they appeared on opposite ends of the list! So Red and 1989 are very similar overall, but they differ in some key way that the energy feature captures.
Also, in the acousticness and liveness features, the albums generally go from old to new, which is to be expected since her music became more manufactured over time.
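The album rankings boil down to a group-by followed by a rank. Here’s a minimal sketch with placeholder albums and made-up values; treating rank 1 as the highest average is my assumption for the example.

```python
import pandas as pd

# Placeholder songs with made-up feature values.
songs = pd.DataFrame({
    "album": ["Album X", "Album X", "Album Y", "Album Y", "Album Z"],
    "energy": [0.4, 0.6, 0.8, 0.9, 0.3],
    "tempo": [120.0, 100.0, 90.0, 95.0, 150.0],
})

# Average each feature within an album ...
album_means = songs.groupby("album").mean()

# ... then rank the albums per feature (assumption: rank 1 = highest average).
ranks = album_means.rank(ascending=False).astype(int)
print(ranks)
```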
To get a feel for the data set as a whole, I found a two-dimensional representation using t-SNE. Here’s a plot (with song names truncated at 15 characters to improve readability):
That plot is pretty tough to read even with the truncation, so here’s a PDF that should be more accessible, especially if you zoom in.
In this graph, you can clearly see that 1989 is her most consistent album (there are 10 songs in a relatively small area), followed by Red. Her first album, on the other hand, is all over the place. So, interpreted with some generosity, the graph shows her developing a sense of style over time.
Looking more closely at the area 1989 is concentrated in, I noticed two things. One is that this area also contains a lot of her big hits, like “You Belong With Me,” “We Are Never Ever Getting Back Together” and “Mine.” But more notably, it also contains both of the new singles! In two dimensions, Taylor Swift’s new music is indistinguishable from her old music.
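For anyone who wants to try a similar embedding, here’s a minimal sketch using scikit-learn’s `TSNE`. The random matrix stands in for the real 86 × 7 feature table, and both the standardization step and the `perplexity` value are my assumptions, not necessarily what produced the plots above.

```python
import numpy as np
from sklearn.manifold import TSNE
from sklearn.preprocessing import StandardScaler

# Placeholder data: 86 "songs" with 7 "audio features" each.
rng = np.random.default_rng(0)
X = rng.normal(size=(86, 7))

# Put every feature on a comparable scale (assumption: loudness in dB and
# tempo in BPM would otherwise dominate the 0-to-1 features).
X_scaled = StandardScaler().fit_transform(X)

# Project down to two dimensions for plotting.
tsne = TSNE(n_components=2, perplexity=20, init="pca", random_state=0)
embedding = tsne.fit_transform(X_scaled)
print(embedding.shape)
```

Keep in mind that t-SNE layouts change with the perplexity and the random seed, so apparent clusters deserve a second look before reading much into them.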
To make the notion of clusters from the previous section concrete, I did some clustering. Specifically, I ran k-means on the seven-dimensional points and then visualized the results in a plot similar to the previous ones. I chose to look at 2 and 5 clusters, hoping that they would correspond to the old Taylor/new Taylor and her 5 full albums, respectively.
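Here’s a minimal sketch of that step with scikit-learn’s `KMeans`, again on placeholder data. The scaling, `n_init`, and `random_state` settings are assumptions made for the example.

```python
import numpy as np
from sklearn.cluster import KMeans
from sklearn.preprocessing import StandardScaler

# Placeholder data standing in for the 86 songs x 7 audio features.
rng = np.random.default_rng(0)
X = rng.normal(size=(86, 7))
X_scaled = StandardScaler().fit_transform(X)

# Fit k-means once for each cluster count of interest.
labels = {}
for k in (2, 5):
    km = KMeans(n_clusters=k, n_init=10, random_state=0)
    labels[k] = km.fit_predict(X_scaled)  # one integer label per song

print({k: np.bincount(v) for k, v in labels.items()})
```

scikit-learn numbers the clusters from 0, so a 5-cluster run yields labels 0 through 4.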
The two clusters respect the structure in the visualization almost exactly:
Here’s the PDF. What’s more important is how this clustering interacts with the albums, which can be seen in this graph:
This isn’t too informative. We’d hope that one of the clusters increases steadily from album to album, but the ratios seem to be roughly constant. Let’s recreate these same two graphs with 5 clusters:
Again, the PDF. These graphs tell a much more interesting story: the first three albums are acoustically scattershot, drawing roughly equally from all the clusters. Then, in Red, she seems to split, toying equally with clusters 0 and 1. Between the two of them, it’s cluster 0 that wins out and forms the bulk of 1989, with some small appearances from the others. And (as expected based on our earlier observations), the songs so far from Reputation also fit into cluster 0, making them nothing too new.
It’s worth taking a second to translate back from “clusters” to more meaningful terms. Looking at the scatter plot, cluster 0 seems to correspond to upbeat pop songs (like “Red” and “Shake It Off”) while cluster 1 has slower, mellower songs (like “The Lucky One” and “Teardrops on My Guitar”). And on Red, all of the hit songs were from cluster 0, not cluster 1, maybe explaining her decision to switch over for her next album.
(The other clusters are harder to interpret, but if you figure anything out, let me know!)
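The album-by-cluster breakdown discussed above is just a normalized cross-tabulation. A minimal sketch, with placeholder albums and made-up cluster labels:

```python
import pandas as pd

# One row per song: its album and the cluster it was assigned to (made up).
assignments = pd.DataFrame({
    "album": ["Album X", "Album X", "Album X", "Album Y", "Album Y"],
    "cluster": [0, 1, 0, 1, 1],
})

# Share of each album's songs that falls in each cluster (rows sum to 1).
shares = pd.crosstab(assignments["album"], assignments["cluster"], normalize="index")
print(shares)
```

Normalizing by row matters here because the albums have different numbers of songs; raw counts would make longer albums look more dominant in every cluster.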
2. The Lyrics
As any avid Taylor Swift fan knows, though, the real meat of her songs is in the lyrics, so I turned to those next. Thankfully, most of the scraping work had already been done, so I just piggy-backed off that.
2.1. Word Clouds
I started by making word clouds for each of her full albums, just because it was easy and cute:
They’re not too informative, but they are fun to look at.
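Under the hood, a word cloud is just a picture of word frequencies. Here’s a minimal standard-library sketch of that counting step; the sample text is invented (not real lyrics), and the tiny stop-word list and the `word_frequencies` helper are hypothetical.

```python
import re
from collections import Counter

# A deliberately tiny stop-word list, just for illustration.
STOPWORDS = {"the", "a", "and", "i", "you", "to", "it", "in", "of", "me"}


def word_frequencies(lyrics, top_n=5):
    """Lowercase, split into words, drop stop words, and count what remains."""
    words = re.findall(r"[a-z']+", lyrics.lower())
    return Counter(w for w in words if w not in STOPWORDS).most_common(top_n)


# Invented sample text, not actual song lyrics.
sample = "Stay with me, stay a while. I hope you stay, and the rain can wait."
print(word_frequencies(sample, top_n=3))
```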
2.2. Topic Models
For more substantive analysis, I fit a topic model to the collection of lyrics. I tried latent Dirichlet allocation and non-negative matrix factorization, with the latter giving better results. Somewhat arbitrarily, I fit models with 5 topics. Here are the top 20 words for each of the 5 topics the model found:
- Topic #1: love beautiful cause back baby time say way everything one home take eyes smile feel right better could little well
- Topic #2: come back movie would gone sinks york miss could figured leave today spelling guess feeling dream new somethings somehow id
- Topic #3: stay mad hey palm time lock quite want easy lovin hand grow funny late people let beautiful say best well
- Topic #4: ever ooh talk together getting 22 friends grow back telling yeah nights feels alright mean called remember night used one
- Topic #5: girl thats works trying want strong alone ever everybody worse yeah knows forever goes wait heart would place get rain
These aren’t too easy to make sense of, so let’s bring in some visualizations. First, treating the percentage contribution of each topic to a song as a feature, I used t-SNE to visualize the songs again:
Another tough-to-read plot, so here’s the PDF. The relatively distinct clusters are pretty promising, as is the fact that 1989 is again the most tightly clustered album: it’s both acoustically and thematically unified. The big swath through the bottom has a lot of canonical Taylor Swift love songs (“Fifteen” and “Speak Now,” for example) that definitely belong together, but it’s not clear why something like “The Story of Us” was pushed off to the other cluster, since it’s clearly a love song as well.
But it turns out that this plot is a little misleading. If we recolor it by dominant topic (i.e., a song’s color is determined by the topic that contributes most heavily to it), we get a different story:
(The PDF.) Now it’s clear that (as many people correctly conclude without any analysis at all) nearly all of her songs are predominantly love songs drawing from Topic 1, and the clusters found by t-SNE are mostly artificial.
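Picking the dominant topic is a row-wise argmax over the song-topic weights. A minimal sketch with a made-up weight matrix:

```python
import numpy as np

# Made-up song-topic weights: one row per song, one column per topic.
W = np.array([
    [0.70, 0.20, 0.10],
    [0.05, 0.15, 0.80],
    [0.40, 0.35, 0.25],
])

# Convert each row to proportions so that songs are comparable ...
proportions = W / W.sum(axis=1, keepdims=True)

# ... and take the column with the largest share as the dominant topic.
dominant = proportions.argmax(axis=1)
print(dominant)  # [0 2 0]
```

The indices here are 0-based, so index 0 corresponds to what the list above calls Topic #1.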
It’s interesting to look at the songs that aren’t dominated by this topic. A couple of them are clearly not love songs (“Welcome to New York” and “22”) but some are clearly love songs that the model misinterpreted. For example, the scatter plot suggests that the model put “Stay Stay Stay” and “All You Had To Do Was Stay” together into a topic of songs about staying, and looking at the key words from the topic confirms this.
Similarly, Topic 5 picked up on the word “girl” in both “How You Get The Girl” and “Girl At Home,” as well as in a line from the chorus of “A Place In This World,” and thrust those together.
None of these other topics picked up on the new singles though. They are, at least according to the model, standard Taylor Swift songs.
I had hoped that the model would be able to separate, for example, songs about falling in love and songs about breaking up, but some combination of short songs and non-distinctive vocabulary prevents this from happening.
For one last graph, I plotted the topic composition across albums:
Obviously this graph is a little suspect because of the issues I mentioned above, but it does suggest that the canonical “Taylor Swift love song” became significantly less important to her work as she aged, in parallel with her shift toward more pop music.
Based on both the audio and lyrics of her songs, there’s a clear pivot of some kind in Taylor Swift’s music as she moved away from her country roots. But there’s no reason to believe that that shift continues in her new work, which is extremely similar to 1989. Contrary to what she’d have us believe, the old Taylor isn’t dead at all.