Sounding Board

Do Electric Songwriters Dream Of Human Muses?

On OpenAI Jukebox, human creativity, and the future of AI-generated music

OpenAI Jukebox – “Pop, in the style of the Beatles”

Mysterious artist OpenAI Jukebox is back, and their latest release has me thinking maybe this is just Animal Collective in disguise, releasing new music without the burden of expectations their namesake inspires. Because, holy hell, this starts out like an outtake from Merriweather Post Pavilion but then quickly devolves into a killer avant-garde noise collage. In 2007, a release like this would’ve easily earned OAIJ an avalanche of blog-band cred; it’s unclear to me how big an audience will be into this in 2020, though. Like the last single, this one too is titled in a way that seems to indicate the band is taking a piss in the upward direction. One wonders if there will be video of what we hear at the end of the song wherein someone throws mountains of metal junk down a stairwell.

GRADE: C

RIYL: Animal Collective, John & Yoko’s Life With The Lions LP

OK, time out. Let’s step outside the bit here for a second.

As I’m sure you’ve already figured out, this is not a new, commercially released song by a mysterious band called OpenAI Jukebox. Furthermore, that review is, in fact, a tongue-in-cheek attempt at assessing it as if I were able to hear it completely devoid of context. The context, I’d argue, is almost certainly more important than the music itself, and almost impossible to forget or divorce from the sound. Songs like “Pop, in the style of the Beatles” — and there are a lot — were made public on April 30th, 2020 by an independent research organization out of San Francisco called OpenAI. Legendary artists’ recordings — Elvis, the Beatles, and thousands of other artists for a total of 1.2 million songs — were entered into their open source model powered by a neural network, which was then given various levels of guidance and instructed to generate new works. In other words, yes, these are new songs composed by an artificial intelligence.

This is not the first music generated by technology. In 1957, a string quartet entitled the “Illiac Suite” was generated by a computer in Champaign, Illinois, for example. More tech-composition milestones followed, like Raymond Scott’s never-quite-completed Electronium machine and 1997’s “Mozart’s 42nd Symphony,” and now, in the last decade, AI breakthroughs in music are announced with increasing regularity. Take 2016’s “Daddy’s Car,” touted as the first pop song written by artificial intelligence, but whose achievement grew less impressive once you read the fine print explaining that a French composer arranged and produced the public recording as well as writing all the lyrics. (Which then begs the question, um, why did you intentionally kick the entire song off with a creepy verse that goes, “In daddy’s car, it sounds so good/ Like something new, it turns me on”?) Since then, though, it’s been almost difficult to keep up with all of the generative musical AIs that have rolled out, featuring names that sound like genres or bands your pen pal from Italy tries to get you to check out but you never do: Jukedeck, Musenet, Magenta, WaveNet, and Watson Beat, just to name a few.

OpenAI’s Jukebox, however, has topped them all. As they declare in the project’s accompanying paper: “Our models can produce songs from highly diverse genres of music like rock, hip-hop, and jazz. They can capture melody, rhythm, long-range composition, and timbres for a wide variety of instruments, as well as the styles and voices of singers to be produced with the music.” It’s clear that most media coverage is opting for slightly cagey sentences when it comes to precisely what OpenAI’s Jukebox is “the first to do” here, but most are comfortable with saying something like, “Jukebox appears to be the first program that can create songs with lyrics and vocals.” Regardless of the exact wording of the achievement, it only takes one listen to realize that this is a massive breakthrough, that you’ve never heard anything like this from an AI before, and that, without a doubt, these recordings are the cave drawing versions of what will eventually reach Mona Lisa levels of quality.

As indicated by my “review the track” experiment above, one thing I love about OpenAI Jukebox’s current skill level is how it takes the most mainstream, popular music in recorded history and proceeds to churn out downright avant-garde audio documents, the kind you might encounter on a release from John Zorn’s Tzadik label. (Honestly, outside of this woman playing her violin during brain surgery, I cannot think of anything recent that rivals it.) It’s also funny to see so many of the critiques of the audio reveal how narrow many of our expectations for what music should sound like have become: “Another limitation to note is an apparent lack of traditional song structure (verses, choruses etc.),” one writer observed. For a genre like guitar-based rock ’n’ roll, which many have declared out of ideas and at the end of its influence, an abandonment of traditional song structure might just be a saving grace; put another point on the board for the AI.

Also being labeled as a flaw is the fact that it takes Jukebox’s neural network about nine hours to generate one minute of this audio, but outside of those miraculous songs that a songwriter might knock out in a single sitting, that pace doesn’t seem that far off from the length of time a human might work on a new piece. Consider the tale of composer Alan Jay Lerner staying up all night in a hotel room with coffee and cigarettes only to emerge in the morning with the single phrase, “I Could Have Danced All Night.” Human inspiration doesn’t adhere to a schedule and yet we expect the complete opposite from our synthetic counterparts. The instant impatience in reaction to a technical miracle is on-brand for us humans, already dissatisfied with the efficiency of the thing that literally didn’t exist two minutes ago; it’s why people throw their smartphones when they don’t work properly, or have zero issue with kicking a robot dog.

There are three approaches to lyrics in Jukebox: one where the model is fed a complete set of entirely human-authored lyrics, another where it’s given no lyrical guidance (in these results, the vocals tend to sound like glossolalia), and one where it’s given AI-generated lyrics seeded by an initial prompt written by a human. In light of this information, the lyrics to the Elvis Presley track become absolutely terrifying, as it becomes just about impossible not to read this as a song about a computer gaining consciousness and announcing its birth to the listener. Selected lyrical excerpt here:

At last we woke up with a mind
At last we woke up with a soul
We came to exist, and we know no limits;
With a heart that never sleeps, let us live!
To Complete our life with this team,
We’ll sing to life, sing to the end of time!
Every living thing shall sing
As we take another step.

LOL, good stuff robot-Elvis, let me just go change my pants. Even at his most pilled-up, Presley never wrote anything this far out. So how will his estate react to this “new” song sung by the King? Thus far, Graceland is silent, but Jay-Z is not. Mere days before OpenAI’s Jukebox was unveiled, Roc Nation LLC filed copyright strikes against “deepfakes” posted to YouTube that constructed the hip-hop legend performing Billy Joel’s “We Didn’t Start The Fire” and a Shakespeare soliloquy. “This content unlawfully uses an AI to impersonate our client’s voice,” read the complaint. Now, this might seem frivolous in this particular harmless example, but possibly more devious uses of this technology — like creating a video in which a politician is shown saying something they never said — are what led the state of California to make the creation and distribution of deepfakes illegal in October of 2019.

But Jukebox’s new audio creations aren’t quite deepfakes, are they? In simplistic terms, they are new creations born out of the influence of a particular artist’s work, which is not that far off from how inspiration collides with one’s particular artistic influences when a human invents new tunes. For instance, everyone knew that America’s 1972 hit “A Horse With No Name” was a blatant Neil Young rip-off — in fact, many assumed it was Young’s song — but it certainly wasn’t an issue that could be legally pursued. Songwriter Dewey Bunnell did have to endure the backlash of listeners and critics, though, including Randy Newman’s brutal assessment that it sounded like a song “about a kid who thinks he’s taken acid.” Funny enough, that’s not a bad description of why OpenAI Jukebox’s new songs, on some level, do work: the algorithm thinks it has synthesized all of these influences into new experiences. Jukebox doesn’t know that it’s not a human songwriter, but its output insists upon the opposite. Here. This is finished. Isn’t it great?

That “thinks it has taken acid”-factor is what makes these recordings so fun to dissect. There is something inherently funny about something non-human trying to pretend it is human. “Part of the appeal of AI-generated weirdness is that it reflects our world back at us without a lot of the built-in assumptions that make it ordinary,” artificial intelligence researcher Janelle Shane explains to me via email. “Why is ‘Magic School Bus’ a thing but ‘Zombie School Bus’ not a thing? Giving human stuff to an AI is like putting a bit of reality through a funhouse mirror.” Yes, but what if AI advances were able to straighten out that funhouse mirror? Is there any real cause for concern that this technology could someday replace sentient songwriters?

Let’s take a step back for a moment.

First of all, it is incredibly hard to speak to the cognitive and emotional processes that happen when we listen to a musical performance. So, rather than simply trying to create compositions that approximate those sounds from a certain era of great human composers, we want to think about what these music genres look and sound like through the lens of machine intelligence.

I did not write the above paragraph. It was generated by Adam King’s “Talk To Transformer,” after I inputted some prompts; I only made a few minor edits, too. “Talk To Transformer” is also the AI that Jukebox utilizes to generate new, original lyrics for these songs. Did this technology just pass a Turing test? You tell me. Whether this particular example got you or not, it has already fooled plenty of people. One study had participants read what they were told were New York Times articles, but were actually “Talk To Transformer” creations — 72% believed they were genuine articles. When the “cave drawing” era of these audio creations progresses to the “Mona Lisa” level of proficiency, will you still be able to laugh at the weird AI Elvis track? Will you even know it’s not actually one of his songs?

So, now we have to ask the inevitable question: How might this all break bad for human composers? Instead of ungrounded conjecture, let’s just consider things that have already happened. Remember the French composer who wrote lyrics about getting turned on in “Daddy’s Car”? He’s also one of the world’s most prominent experts in musical AI and the streaming giant Spotify hired him in 2017. That same year, Spotify was accused of creating fake bands to pad out some of their most popular playlists in order to, as many concluded, lessen royalty payments to artists as a possible pathway to becoming consistently profitable (the Swedish corporation has only reported profitable quarters twice in their entire existence). Spotify has vehemently denied this accusation, but there are lingering details to the “fake artist” scandal that make that denial hard to accept.

Getting artists to compose music for a flat fee to pad popular Spotify playlists is risky because, after all, they’re human beings, and human beings are traditionally terrible at keeping secrets (one reporter had an anonymous source confirming the scandal). But what if some of this music, especially ambient instrumental tracks, were composed by an AI? That might work quite well, as AIs don’t collect royalty checks, and hey, turns out there’s an in-house musical AI expert on staff who can assist with that effort. Neat.

No need to lean on logical predictions elsewhere, as it is already happening out in the open: Jukedeck sells AI generated background music for video games and commercials for $21.99 a track, “a fraction of what hiring a musician would cost,” the New York Times reported.

I am not here to paint a Terminator 2-esque prophecy of sentient tech-doom. It is so clear that OpenAI’s Jukebox project was fueled by genuine passion, and I do believe them when they describe the technology’s potential use as a songwriting aid for human composers. Although it certainly doesn’t help that Elon Musk helped found OpenAI, left in February of 2018, and seven months later appeared on the Joe Rogan Experience podcast stating with complete sincerity and devastating gravitas, “I tried to convince people to slow down AI, to regulate AI, this was futile … nobody listened.”

And besides, it’s no secret that in the most popular, profit-generating corners of the music industry, a push towards songwriting as a solvable science long pre-dated the current quests of musical AIs. Ever since the data set of hit recordings was large enough to analyze, producers and writers have scoured its trends for clues on how to keep cranking out winners. Use a major key, probably C major, get to the chorus quickly, don’t go over four minutes in total length. Once you learn factoids like “of the top 10 most successful songs of all time, only Kanye West’s ‘Gold Digger’ is in a minor key,” it might be hard for you to ever write a tune in a minor key ever again, if chart success is your sole goal.

Maybe authentic, genuine emotion has less to do with all of this than we think. Consider songwriter Diane Warren — composer of Cher’s “If I Could Turn Back Time” and Aerosmith’s “I Don’t Want To Miss A Thing” — who has routinely made a claim that makes you think perhaps AIs will do just fine in creating heart-wrenching songs: “I’ve never been in love like in my songs,” Warren said. “I’m not like normal people. I’m no good at relationships.” Fun fact: when you enter this quote into the “Talk To Transformer” AI, the next sentence it generates is “Because I don’t have a real life,” which would be very true of AI song composers. 

But at the same time, in the midst of a global pandemic, profit margins have never been thinner for many companies. So if the choice came down to keeping your full-time employees or replacing a freelance composer with a cheap AI alternative, it would be hard to blame you for the inevitable choice you’d make. The thing is, this is the kind of work we were certain couldn’t be displaced by machines. The John Henry folktale — in which the steel-driving everyman dies with a hammer in his hand trying to outpace a steam-powered machine — has long been sung as an anthem of resistance to automation. But now, for the first time ever, we have to consider the unlikely possibility that someday a neural network could author and sing a better, more profitable song about John Henry than any human being ever could. The irony there is so obvious and on-the-nose you’d roll your eyes at me even fully writing it out.

In this way, OpenAI’s announcement of their musical breakthrough the same week that unemployment levels in the US rose to their highest rate since the Great Depression carries a melancholic poignancy. Perhaps their timing on this was tone deaf, or maybe I’m just being overly sensitive to the issue — as humans are known to be from time to time — so I decide to get out of my own head and ask others for their reaction. I start with Matt Farley of Motern Media, who has written, recorded, and released over 20,000 songs. He is a highly irregular industry oddity I profiled in 2014, wherein I tried to discern the difference between his work and what some people call “musical spam.” Farley is amusingly underwhelmed. “I’m not opposed to or threatened by any of this,” he tells me. “There’s limitless space for more music. Also, most people already write me off as a soulless spammer hack anyway. Maybe I’ll finally get some respect once AI music goes mainstream.”

I then check in with Joel Roston, former guitarist of the band Big Bear and professional composer for a variety of different media. As the technology now stands, he isn’t overly worried about it either, and he thinks the AI Beatles track is pretty cool. “Now, if someone develops an AI that allows a producer to simply input a 25-minute podcast episode, click a single button and, through sentiment and other analyses, receive a link to a perfectly-scored and sound-designed piece,” he tells me, “I might be a little worried.” Still, he does admit, “It’s probably aiding in making me irrelevant, but, also, it’s helping a lot of people make stuff they otherwise might not have made.”

Next, I try and get a blind reaction from legendary music critic Greil Marcus by stripping the Beatles track of any labels, title, or indicators, sending it to him, and asking for a reaction. “It’s a hit!” he replies, sarcastically. After I explain what the track actually is, he concludes, “Then all Beatles resolve into ‘It’s All Too Much.’” Hey, that’s a pretty good observation, it’s almost as if he’s been doing this for over 50 years now. 

On Twitter, I find that composer/musician Holly Herndon is alarmed by OpenAI Jukebox, suggesting some kind of artistic clone war is ahead. “We are witnessing Napster²,” she writes. “The first [Napster] gave people the ability to pirate original recordings, this newest tech will distribute the ability to pirate original artists.” The mention of Napster is enough to remind me of one of the only successful “illegal downloading” protest songs of the last decade, Gillian Welch’s “Everything Is Free Now.” In that song, Welch’s only means of pushing back against the dissolution of her reliable income as a songwriter is to say, “If there’s something that you wanna hear, you can sing it yourself.” This threat now almost seems prophetically doomed, as musical AI is presently poised to do just that and more. In 2020, she might write a protest song entitled “Everything Is Me Now.” If she won’t do it, I know an algorithm that will. 

(Elsewhere on the same album that “Everything Is Free Now” appears — Time (The Revelator) — Welch sings about Elvis Presley and John Henry, as well, specifically singing, “Lord let me die with a hammer in my hand.” The entire experience of the LP listened to in 2020 sounds increasingly, eerily prophetic.)

Finally, what I decide I really need here is an artist’s initial reaction to an AI version of themselves. What does it feel like to be confronted with an artificial rendition of your internal creativity? The couple dozen tracks on the front page of OpenAI Jukebox’s page are the selected songs which they found to be the most successful examples, but there’s a secondary page with over 7,000 other pieces of audio that the model also generated. I scroll through searching for someone I could feasibly reach to present them with an AI version of themselves. Elvis is dead. The two surviving Beatles would be impossible. Van Morrison wouldn’t even talk to me when I wrote a book centered around him. I try to coax Stephen Malkmus into listening to his, but don’t hear back from him. Holy shit, they did Can? Wow. That’s when I come across a name of someone I know I can reach. In fact, I married her in 2015. 

I can’t believe it. Jukebox created an AI version of a Marissa Nadler song. [Editor’s note: Walsh and Nadler are separated but remain legally married.]

I text her and explain, “Don’t freak out, but some researchers created an artificial intelligence generated version of your music, and I want to play it for you.” 

“I’m scared to listen,” she replies. Soon enough, though, we are both listening to it. It is, by far, one of the most psychedelic experiences of my entire life. Many of these AI songs already sounded the way you might experience music in a dream — where it sounds both familiar and foreign, and every time you try and concentrate on the lyrics you realize you can’t quite make out what any one of the words actually is. The sound, playing style, and mood are deeply familiar to both of us, but Marissa never worked on or wrote this song. It’s like we’re both listening to a hallucination, a shared hallucination.

“I have to say,” she writes, “This is amazing.”

“Do you hear any one particular song of yours here, or is it a mash-up of a bunch, or is it… new?”

“Honestly, I can’t recognize this as any one of my songs, but I hear my whole soul trapped inside it at the same time. It’s a new song.”

We both sit in the pure galactic surrealness of that realization for a moment.

“It’s experimental and pretty. Sadly, the robot got it right.” 

“But is it better?”

“No, it doesn’t have any words, or emotion, or story.”

“Well,” I reply, “That’s some job security. Right?”

OpenAI’s Jukebox is assigned to do one task: consider a universe of music and create new versions based upon it. It is so singular and pure in that sense, and yet, it’s also technology that will eventually strip human musicians of their jobs, as well as other consequences we can’t even imagine right now. In that way, the AI is painfully human, both innocent and guilty at the same time; trying to do its job but inevitably hurting others along the way without even setting out to do so. 

One of our main tasks in life, as humans, is to assign meaning to every single thing we encounter in the world. We do this both collectively and individually; we sometimes do this casually, almost subconsciously, and other times we pointedly set out to create and assign meaning to something, laboring over the details of precisely what it might mean for extended periods of time. We assign meaning to the songs we love, telling other humans when and where we first heard them, and why they took on a special importance to us. And now we’re tasked with assigning meaning to artificial intelligence, and its output. In this particular instance, we’ll assign meaning to a model that can mimic some of our favorite things in the world — songs — and decide whether the compositions it creates, regardless of how catchy, clever, or beautiful they are, will mean something special to us or not.

We’re humans, after all, so we do get to decide.

It’s all up to us.