Version 2.0 of Daniels' work was released in June and compares the lyrical range of 85 different artists (since the analysis requires at least 35,000 words in an artist's recorded body of work, it doesn't include newer acts or those with limited recordings). As Daniels explains, "I used a research methodology called token analysis to determine each artist’s vocabulary. Each word is counted once, so pimps, pimp, pimping, and pimpin are four unique words. To avoid issues with apostrophes (e.g., pimpin’ vs. pimpin), they’re removed from the dataset. It still isn’t perfect. Hip hop is full of slang that is hard to transcribe (e.g., shorty vs. shawty), compound words (e.g., king shit), featured vocalists, and repetitive choruses."
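For the curious, here is roughly what that kind of count could look like in Python. This is a minimal sketch of the idea as Daniels describes it, not his actual code: the function names, the lowercasing, the letters-and-digits tokenizer, and the choice to count only the first 35,000 words (so every artist is measured on the same amount of text) are all my assumptions.

```python
import re

MIN_WORDS = 35_000  # minimum body of work quoted above


def tokenize(lyrics: str) -> list[str]:
    """Split lyrics into lowercase tokens with apostrophes removed."""
    # Drop straight and curly apostrophes so "pimpin'" and "pimpin" match.
    cleaned = re.sub(r"['’‘]", "", lyrics.lower())
    # Assumption: a token is any run of ASCII letters or digits.
    return re.findall(r"[a-z0-9]+", cleaned)


def unique_word_count(lyrics: str, min_words: int = MIN_WORDS):
    """Count distinct tokens in the first `min_words` words.

    Returns None when the artist has too few words to be included.
    No stemming is applied, so "pimp" and "pimps" stay separate words.
    """
    tokens = tokenize(lyrics)
    if len(tokens) < min_words:
        return None
    # Assumption: truncate so every artist is compared on equal footing.
    return len(set(tokens[:min_words]))
```

Calling `unique_word_count("pimps pimp pimping pimpin’ pimpin", min_words=5)` counts four unique words, matching Daniels' example. Notice that nothing here handles shorty vs. shawty or compound words, which is exactly the imperfection he admits to.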
The analysis is published on the project website, and you can also purchase it in poster form from Pop Chart Lab. Aesop Rock is the runaway lyrical champion of the world, his vocabulary so impressive that Daniels had to change the x-axis to accommodate him. Of note, as Daniels says, "Wu-Tang Clan at #6 is fucking impressive given that ten members, with vastly different styles, are equally contributing lyrics."
On the other end of the scale, DMX sits dead last, joined near the bottom by a number of hip hop's biggest names, including Kanye, Lil Wayne, Snoop, and 2Pac. Goes to show, I guess, that lyrical virtuosity may not be the most important factor in moving records. That's something Sean Carter not only figured out, but let us know on The Black Album cut, "Moment of Clarity":
I dumbed down my audience to double my dollars
They criticized me for it, yet they all yell 'holla'
If skills sold, truth be told, I'd probably be
Lyrically Talib Kweli
Truthfully I wanna rhyme like Common Sense
But I did 5 mil - I ain't been rhyming like Common since
So Jay-Z is already down with data science. What did I tell you about owning the world?