
The 4 Applications of AI in the Music Industry


A couple of weeks ago, Cyanite co-founder Jakob gave a lecture in a music publishing class at Berlin’s BIMM Institute. The topic was to show concrete examples of AI’s real use cases in today’s music industry. The goal was to get away from the overload of buzzwords surrounding AI and shed more light on its actual applications and benefits.

This lecture was well received by the students, so we decided to publish its main points on the Cyanite blog. We hope you enjoy the read!

Introduction

Many people, when they hear about “AI and music”, think of robots creating and composing music. Understandably, this comes with a fearful and critical perception of robots replacing human creators. But music created by algorithms represents merely a fraction of AI applications in the music industry.

Picture 1. AI Robot Writing Its Own Music
This article is intended to explore:

1. Four different kinds of AI in music.

2. Practical applications of AI in the music industry. 

3. Problems that AI can solve for music companies.

4. Pros and cons of each AI application.

How does AI work? 

Before we dive into the four kinds of AI in the music industry, here are some basic concepts of how AI works. These concepts are not only valuable to understand, they can also help you come up with new applications of AI in the future.

Just like humans, some AI methods like deep learning need data to learn from. In that regard, AI is like a child. Children absorb and learn to understand the world by trial and error. As a child, you point your finger at a cat and say “dog”. You then get corrected by your parents who say, “No, that’s a cat”. The brain stores all this information about the size, color, looks, and shape of the animal and identifies it as a cat from now on. 

AI is designed to follow the same learning principle. The difference is that AI is still not even close to the magical capacity of the human brain. A normal AI neural network has around 1,000 – 10,000 neurons in it, while the human brain contains 86 billion!

This means that AI can currently perform only a limited number of tasks and needs a lot of high-quality data to learn from.
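The trial-and-error learning described above can be illustrated with a toy example. The following sketch trains a single artificial neuron to tell (made-up) cats from dogs by repeatedly correcting its guesses, just like the parents in the analogy. The features, data, and learning rate are all invented for illustration; real music models are vastly larger and learn from audio, not two numbers.

```python
import numpy as np

# Toy "cat vs. dog" classifier: one artificial neuron that learns from
# labeled examples by being corrected after each wrong guess.
# Training data: [body size, ear pointiness], label 1 = cat, 0 = dog
X = np.array([[0.2, 0.9], [0.3, 0.8], [0.9, 0.2], [0.8, 0.1]])
y = np.array([1, 1, 0, 0])

w = np.zeros(2)  # the neuron's "memory" starts empty
b = 0.0

for _ in range(20):                    # repeated trial and error
    for xi, yi in zip(X, y):
        pred = 1 if xi @ w + b > 0 else 0
        error = yi - pred              # the "parental correction"
        w += 0.1 * error * xi          # nudge memory toward the right answer
        b += 0.1 * error

# After training, the neuron identifies a new small, pointy-eared animal
new_animal = np.array([0.25, 0.85])
print("cat" if new_animal @ w + b > 0 else "dog")  # → cat
```

The same correction loop, scaled up to millions of parameters and labeled audio snippets, is what lets a network learn to recognize genres or moods.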

One example of how data is used to train AI to detect objects in pictures is a process called reCAPTCHA. This is a system that asks you to select traffic lights in a picture to “prove you are human”.

The system collects highly valuable training data for neural networks to learn what traffic lights look like.

Picture 2. AI Learning with reCAPTCHA
If you are interested in learning more about how this process works for detecting genres in music, you can check out this article.

The 4 types of AI in music

Now that you understand the basic AI concept, here is an overview of the four main applications of AI in the music industry. Keep in mind that there are many more possible applications.

1. AI Music Creation

2. Search & Recommendation

3. Auto-tagging

4. AI Mastering

Let’s have a closer look at what problems each area addresses, how the solutions work, and also explore their pros and cons!

Application 1: AI-Generated Music

Problem

Problems that AI can solve in the AI creation field are not very apparent. AI-generated music is, first of all, a creative and artistic field. However, if we look at it from a business context, we can identify existing problems. When music needs to adapt to changing situations, for instance in video games or other interactive settings, AI-created music can adapt more natively to changing environments.

Solution

AI can be trained to create custom music. For that, AI needs input data and then it needs to be taught to make music, just like a human.

To understand current AI creation capabilities here are a couple of real-world examples:

Yamaha analyzed many hours of Glenn Gould’s performances to create an AI system that can potentially reproduce the famous pianist’s style and maybe even create an entirely new piece in his manner.

A team of Australian engineers won the AI “Eurovision Song Contest” by creating a song with samples of noises made by koalas and Tasmanian devils. The team trained a neural network on animal noises so it could produce an original sound and lyrics.

Who is AI-generated music for?

  • Game Studios
  • Art Galleries
  • Brands  
  • Commercials  
  • Films  
  • YouTubers  
  • Social Media Influencers

Implementation Examples

Pros of this solution

  • Cheap to produce new content
  • Customizable
  • Great potential for creative human & AI collaboration
  • Creative tools for artists.

Cons of this solution

  • The quality of fully synthesized AI music is still very low
  • No concrete application in the traditional music industry
  • Legal issues over copyright, including rights to folklore music 
  • Most AI creation models are trained on western music and can reproduce western sound only
  • Very high development cost.

Bottom line

It will take some time for AI-created music to sound adequate or have a straightforward use case. However, hybrid approaches that use AI to compose music with pre-recorded samples, loops, and one-shots show that the AI-generated future is not far away.

Application 2. Search & Recommendation

Problem

It can be hard to find that one song that fits the moment perfectly, whether it is a movie scene or a podcast. And the more music a catalog contains, the harder it is to search efficiently. With 500 million songs online and 300,000 new songs uploaded to the internet every day, this can easily be called an inhuman task. Platforms like Spotify have developed great recommendation algorithms for seamless and enjoyable listening experiences for music consumers. However, if we look at sync, it gets a lot more difficult. Imagine a music publisher who administers around 50,000 copyrights. Effectively, they can oversee maybe 10% of that catalog, leaving a lot of potential unused.

Solution

AI can be trained to detect sonic similarities between songs.
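One common way such a similarity search works, sketched here under the assumption that a neural network has already turned each song into an embedding vector: songs whose vectors point in a similar direction sound similar. The song names and vectors below are invented for illustration, and real embeddings have hundreds of dimensions, not three.

```python
import numpy as np

# Hypothetical catalog: song name -> embedding vector from an audio model
catalog = {
    "song_a": np.array([0.9, 0.1, 0.3]),
    "song_b": np.array([0.2, 0.8, 0.5]),
    "song_c": np.array([0.88, 0.15, 0.28]),
}

def cosine_similarity(u, v):
    """Closeness of two embeddings, 1.0 = same direction (very similar)."""
    return float(u @ v / (np.linalg.norm(u) * np.linalg.norm(v)))

def most_similar(query, catalog, top_n=2):
    """Rank catalog songs by sonic closeness to a reference embedding."""
    scores = {name: cosine_similarity(query, vec) for name, vec in catalog.items()}
    return sorted(scores, key=scores.get, reverse=True)[:top_n]

reference = np.array([0.85, 0.12, 0.3])   # embedding of the reference song
print(most_similar(reference, catalog))   # → ['song_a', 'song_c']
```

This is what makes “find me more songs that sound like this one” queries possible over a 50,000-song catalog in milliseconds.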

Who are Similarity Searches for?

  • Music publishers: using reference songs to search their catalog
  • Production music libraries and beat platforms
  • DSPs that don’t have their own AI team
  • Radio apps
  • More use cases in A&R (artist and repertoire) and beyond
  • DJs needing to hold the energy high after a particularly well-received track (in the post-Covid world)
  • Basically, anyone who starts sentences like “That totally sounds like…”
  • Managers targeting look-alike audiences. 

Implementation Examples

Pros of this solution

  • Finding hidden gems in a catalog which goes far beyond the human capacity for search. Here both AI-tagging and AI search & recommendation are employed
  • Low entry barrier when working with big catalogs
  • Great and intuitive search experiences for non-professional music searchers.

Cons of this solution

  • Technical similarity vs. perceived similarity – there is still quite a difference in how humans and AI function. Human perception is highly subjective and may assign higher or lower similarity to two songs than the AI does. 

Bottom line

All positive. Everyone should use Similarity Search algorithms every day.

Application 3. Auto-tagging

Problem

To find and recommend music, you need a well-categorized library to deliver the tracks that exactly correspond to a search request. The artist and the song name are “descriptive metadata”, while genre, mood, energy, tempo, voice, and language are “discovery metadata”. More on this topic here. The problem is that tagging music manually is one of the most tedious and subjective tasks in the music industry. You have to listen to a song and then decide which mood it evokes in you. Doing that for one song might be fine, but forget about it at scale. At the same time, tagging requires extreme accuracy and precision. Inconsistent and wrong manual tagging leads to a poor search experience, which results in music that can’t be found and monetized. Imagine tagging the 300,000 new songs uploaded to the internet every day.

Solution

Tagging music is a task that can be done with the help of AI. Just like in the example in the first part of this article, where an algorithm detects traffic lights, neural networks can be trained to learn how, for example, rock music differs from pop or rap music.
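In practice, a trained tagging model outputs one raw score (a logit) per genre for each track, which is converted to probabilities; the top genre becomes the tag. The sketch below shows that last step only, with invented scores. This is a generic illustration of classification, not Cyanite’s actual model.

```python
import numpy as np

GENRES = ["rock", "pop", "rap"]

def softmax(logits):
    """Turn raw model scores into probabilities that sum to 1."""
    z = np.exp(logits - np.max(logits))  # subtract max for numerical stability
    return z / z.sum()

def tag_genre(logits):
    """Pick the most probable genre and its confidence for one track."""
    probs = softmax(np.asarray(logits, dtype=float))
    return GENRES[int(probs.argmax())], float(probs.max())

genre, confidence = tag_genre([2.1, 0.3, -1.0])  # hypothetical model output
print(genre)  # → rock
```

Because this step is deterministic, the same audio always yields the same tags, which is where the consistency and reproducibility advantages listed below come from.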

Here is a Peggy Gou song, analyzed and tagged by Cyanite:

AI-tagged song
Who is AI-tagging for? 

For every music company that knows the pain of manual tagging. If you work in music, chances are pretty high that you had or will have to tag songs. If you pitch a song on Spotify for Artists, you have to tag a song. If you ever made a playlist – you most probably had to deal with its categorization and tagging. If you’re an A&R and present a new artist to your team and say something like, “This is my rap artist’s new party song,” you literally just tagged a song. In all these cases it is good to have an objective AI companion to tag a song for you. 

AI-tagging is a really powerful tool at scale. You just bought a new catalog with tons of untagged songs but want to utilize it for sync: AI-tagging is the way to go. You’re a distributor tired of your clients uploading unfinished or false metadata: AI-tagging can help. You’re a production music library that has inherited tons of legacy tags from years of manual tagging: the answer is also AI-tagging.

Implementation Example

In the BPM Supreme library, you can see the different moods, energy levels, voice presence, and energy dynamics neatly tagged by an AI.

Picture 3. BPM Supreme Cyanite Search Interface
Pros of this solution

  • Speed 
  • Consistency across catalog
  • Objectivity / reproducibility
  • Flexibility. Whenever something changes in the music industry, you can re-tag songs with new metadata at lightning speed.

Cons of this solution

  • Development cost and time (luckily, Cyanite has a ready-to-go solution)
  • High energy consumption of deep learning models, but still less resource-heavy compared to manual tagging.

Bottom line

AI tagging cannot replace human work completely. But it is a powerful and practical tool that dramatically reduces the need for manual tagging. AI-based tagging can increase the searchability of a music catalog with little to no effort.

Application 4. AI Mastering

Problem

Mastering your own music can be very expensive, especially for DIY and bedroom producers. These musicians often rely on technology to create new music. But in order to distribute music to Spotify or similar platforms, it needs to meet certain sound-quality criteria.

Solution

AI can be used to turn a mediocre-sounding music file into a great-sounding one. For that, AI is trained on popular mastering techniques and on what humans have learned to recognize as good sound.
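One small, concrete slice of what automated mastering involves is loudness normalization: bringing a quiet mix up to a consistent level without letting peaks distort. The sketch below shows only that step; real AI mastering also handles EQ, compression, and stereo imaging, and the -20 dBFS RMS target here is an arbitrary illustration, not an actual platform requirement.

```python
import numpy as np

def normalize_loudness(samples, target_dbfs=-20.0):
    """Scale audio so its RMS level hits target_dbfs, then hard-clip peaks."""
    rms = np.sqrt(np.mean(samples ** 2))          # current loudness
    target_rms = 10 ** (target_dbfs / 20)          # convert dBFS to linear
    gain = target_rms / rms                        # amplification factor
    return np.clip(samples * gain, -1.0, 1.0)      # apply gain, prevent overs

# A quiet 440 Hz test tone standing in for an under-leveled bedroom mix
quiet_track = 0.01 * np.sin(np.linspace(0, 2 * np.pi * 440, 48000))
mastered = normalize_loudness(quiet_track)
# The track's RMS is now ~0.1 (-20 dBFS) instead of ~0.007
```

Streaming platforms measure loudness with more sophisticated standards (LUFS rather than plain RMS), but the principle of analyzing the signal and applying corrective gain is the same.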

Who is AI mastering for?

  • DIY and bedroom producers
  • Professional musicians
  • Digital distributors 

Implementation Example

One company that is leading the field of AI mastering is LANDR. The Canada-based company has a huge community of creators and has already mastered 19 million songs. Other players include eMastered and Moises.

Picture 4. LANDR AI Mastering
Pros of this solution

  • Very affordable ($48/year for unlimited mastering of LO-MP3 files plus $4.99/track for other formats, vs. professional mastering starting at $30/song)
  • Fast
  • Easy for non-professionals. 

Cons of this solution

  • A standardized process that doesn’t allow room for experiments and surprises
  • Some say AI mastering is “lower quality compared to human mastering”.

Bottom line

AI mastering is an affordable tool for musicians with low budgets. For up-and-coming artists, it’s a great way to get professionally edited music out to DSPs. For professional songwriters, it’s the perfect means to make demos sound reasonably good. Professional mastering experts usually serve a different target group, so these fields complement each other rather than AI taking over human jobs.

Summary

To sum it up, we presented four concrete use cases for AI that work across almost every part of the value chain in the music industry. Still, the practical applications and quality differ. AI is far from having the same complex thinking and creativity as a professional music tagger, mastering expert, or musician. But it can already help creatives do their work or even completely take over some of the expensive and tedious tasks.

One of the biggest problems preventing us from embracing new technology is wrong expectations. There are often two extremes: on one side, people overestimate and expect more from AI than it can currently deliver, e.g. tagging 1M songs without a single mistake or always being spot-on with music recommendations. The other camp fears AI taking over their jobs.

The answer may lie somewhere in between. We can embrace technology and at the same time remain critical and not blindly rely on algorithms, as there are still many facets of the human brain that AI cannot imitate.

We hope you enjoyed this read and learned more about the 4 different use cases of AI in music. If you have any feedback, questions, or contributions, you are more than welcome to reach out to jakob@cyanite.ai. You can also contact our content manager Rano if you are interested in collaborations. 

An Analysis of Club Sounds with Cyanite


If we asked you to describe the vibe of your favourite nightclub, could you? Today, we show you how we would describe the sounds of some of our favourite clubs, with the help of the Cyanite music AI analysis software. 

We analysed album compilations of 9 well-loved clubs across Germany. From Berlin, these were Berghain, Griessmühle, About Blank, Golden Gate and Kater Blau. In addition, we analysed music from Hamburg’s Golden Pudel, Leipzig’s Institut für Zukunft, and Omen and Robert Johnson in Frankfurt.

The mood multi-label classifier provides the following labels:

tense, uplifting, relaxing, melancholic, dark, energetic, happy

Each label has a score ranging from 0 to 1, where 0 (0%) indicates that the track is unlikely to represent a given mood and 1 (100%) indicates a high probability that it does.

Since the mood of a track might not always be properly described by a single tag, the mood classifier is able to predict multiple moods for a given song instead of only one. A track could be classified with dark (Score: 0.9), while also being classified with aggressive (Score: 0.8).

The mood can be retrieved both averaged over the whole track and segment-wise over time at a 15s temporal resolution. In addition to the scores, the API also exposes a list of the most likely moods, or the term ambiguous in case the audio does not reflect any of our mood tags properly.
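The multi-label output described above can be read roughly as follows: every mood whose score clears a threshold becomes a tag, and “ambiguous” is returned when nothing does. The 0.5 threshold and the example scores are illustrative assumptions, not Cyanite’s actual values.

```python
def likely_moods(scores, threshold=0.5):
    """Return all moods scoring at or above the threshold, strongest first,
    or ["ambiguous"] when no mood clears it."""
    moods = [m for m, s in sorted(scores.items(), key=lambda kv: -kv[1])
             if s >= threshold]
    return moods or ["ambiguous"]

# Hypothetical per-track scores from the classifier
track_scores = {"dark": 0.9, "energetic": 0.8, "happy": 0.05, "tense": 0.4}
print(likely_moods(track_scores))                 # → ['dark', 'energetic']
print(likely_moods({"happy": 0.2, "dark": 0.1}))  # → ['ambiguous']
```

This is why a single track can legitimately carry both dark and energetic tags at once, as in the example above.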

Insights from Instrumental and Voice analysis: Berlin clubs lead the way for greatest use of Instrumental in their tracks.

Based on results from the CYANITE instrument and voice machine learning analysis, we see that while most of these club compilations’ tracks are heavily dominated by instrumentals, the four clubs with the most instrumental-heavy tracks were all from Berlin.

About Blank had the highest share of instrumentals, followed by Griessmühle, Berghain and Golden Gate. For all of these clubs, our analysis showed that instrumentals made up more than 80% of the tracks.

When we look at the results of the voice analysis, we see that the clubs with the most use of female voices in their tracks are clubs outside of Berlin. In first place, we have Institut für Zukunft, followed by Omen, and then Golden Pudel. Funnily enough, we also found that the four clubs with the least use of female vocals in their tracks were from Berlin! These clubs are: Kater Blau, About Blank, Griessmühle, and Golden Gate.

When looking at the presence of male vocals, we see that Golden Pudel is the club using the most male vocals amongst the clubs we are studying today. This is followed by Omen and Golden Gate. 

Based on the results from this analysis, the data from Golden Pudel intrigued us the most. We observe that Golden Pudel’s music, unlike the rest of the clubs’, has a slightly more even balance between instrumentals and vocals, almost a 50/50 split.

Insights from the CYANITE mood tagging technology: Berghain is the gloomiest of them all. 

Looking at the results, here’s what we found:

With its grim industrial aesthetic, it’s fitting that our analysis found Berghain to be the most melancholic club. Berghain ties with Institut für Zukunft for having the darkest sound.

Golden Gate, a favourite of ours for a good night of House music, takes the prize for being the most uplifting club, while Golden Pudel in Hamburg was found to be the most happy and relaxing. Our mood analysis also showed that the compilation from Frankfurt’s now-defunct Omen club is at once the most tense and most energetic of all the clubs’ compilations. A very apt result indeed: Omen was a prominent symbol of the unrestrained, pure fervour of ’90s rave culture, and one whose sound we definitely miss greatly.

Talking about the Berlin Sound

Comparing the clubs, we see that clubs in Berlin have a distinct, extreme skew towards the dark and melancholic, indicating the very characteristic moodiness that we so love and miss in these times!

Looking at the clubs elsewhere, we see that while dark and melancholic moods are still very much present, there isn’t as clear of a skew towards these two only. Instead, our data from the 4 clubs outside of Berlin show more diverse moods, with no clear skew in a certain direction. 

Genre Tagging with our music AI: Some interesting insights 

We see that for Berlin clubs, the CYANITE AI analysis of their club compilations reveals a strong skew towards Techno and Tech House. The top 3 places with the most amount of Techno in their songs are: in top place, About Blank, followed by Berghain, and then Griessmühle. For Tech House, the top 3 clubs are Golden Gate, Kater Blau, and About Blank. 

Outside of Berlin, we see a more varied mix of genres in the club compilations. Omen ranks highest in the amount of trance in the selection, a genre that was almost not found at all in the Berlin clubs we studied.  

You can listen to some of the compilations we analysed here:

About Blank: :// About Blank (2018), :// About Blank 002 (2017), :// About Blank 004 (2018), :// About Blank 006 (2019) and :// About Blank 007 (2019)

Kater Blau: Katermukke 150 Compilation (2017)

Berghain / Panorama Bar: Ostgut Ton – Zehn (2015)

Golden Gate: Compilation (2012)

Golden Pudel: Operation Pudel (2001)

Omen: Moka DJ Compilation (1996)

Institut für Zukunft: Various 5IVE (2019)

Robert Johnson:  Livesaver Compilation 2 (2015) & Livesaver Compilation 3 (2017)

 

Overall, our quick research into these clubs with AI showed us some very interesting things. It seems that with a larger data set, it might be possible to quantify the Berlin sound and perhaps also sounds for other key party cities.

Ellen Allien, Roman Flügel and Dub Isotope: An AI Analysis of Techno and Drum ‘n’ Bass


As a team of music lovers, the Cyanite team has been tuning in regularly to Berlin’s lockdown livestream DJ sets over the last few months. 

Some of us might be of the opinion that recording technology should be kept firmly away from the dancefloor in order for party people to truly revel in the night’s atmosphere. Right now, it seems the opposite is true. 

These livestream recordings have made it possible for music fans in Berlin (and the world) to experience electronic music and to feel connected to the electronic music community. Technology has proven itself as very much needed and welcome in the music space, and in this case, instrumental in keeping club culture alive during the coronavirus restrictions.

Decoding Electronic Music with AI

In this spirit of club culture, we ran some of our favourite mixes through our analysis models.

Set #1: Ellen Allien’s Griessmuehle set

A heavyweight in the techno scene, Ellen Allien’s combination of classic techno with a side of experimental and IDM is one-of-a-kind, and a definite favorite at Cyanite. 

Ellen Allien’s take-no-prisoners set @ Griessmuehle Berlin

In this set, techno alternated between dancey and contemplative. Our genre analysis results revealed that the set was predictably profiled as consisting mostly of “electronic dance”. In moments where the electronic dance genre was detected at a low level, the ambient genre was inversely detected as the dominant genre.

Result of Cyanite’s AI analysis on the emotional dynamics of Ellen Allien’s set

Looking at our emotion analysis results, the top quality detected in her hour-long set was “dark”. This was followed closely by “tense”, and then “energetic”. Characteristically, we observed that her set opened with the level of darkness at a high point, before hovering at a more or less consistent mid-to-high level, and ending high again.

Tenseness, however, was a different story altogether. In Ellen Allien’s set, techno is a tightrope walk. Listeners alternate between feeling almost about to tip over the edge and occasional moments of stability at the peak.

Sustained periods of high musical tension were found at the beginning, middle, and end of her set. Outside of those intervals, the level of tenseness peaked and plunged all throughout the set, often in sharp, spiky drops and rises. 

In her set, tension is also characterized by a frenetic level of energy: we saw that the level of energy detected very closely paralleled the pattern of tenseness.

Set #2: Roman Flügel’s Wilde Renate stream 

Roman Flügel’s Wilde Renate stream is another hot favorite. 

Roman Flügel’s never disappointing curation of eclectic sounds @ Wilde Renate

While the Ellen Allien set we listened to earlier veered towards the heavier side, this set takes us to a gentler side of techno. Flügel treated our ears to an hour of electro, techno and occasional ambient. 

A softer brand of techno does not mean happy techno though (if there can ever be such a concept). While the previously discussed set was ruled by high-strung techno energy, Flügel’s set is more muted.

Result of Cyanite’s AI analysis on the emotional journey of Roman Flügel’s set

Topmost of the qualities detected was ‘melancholic’. The atmosphere of melancholy hovered at a consistently high level throughout the entire set, with brief dips. In those moments where melancholy dropped, tenseness, which was at a base level throughout most of the set, climbed up slightly.

Almost as much as his set was melancholic, it was calm. The smooth melodies and synth swells in the set gave it an air of sereneness. Calmness was the second highest quality detected. The level of calm closely mirrored the level of melancholy throughout the set, although it had more well-defined plateaus during the calmest moments.

Flügel’s set was also comfortingly brooding (exactly how we love our techno). Underscoring the calmness and melancholy was darkness, which was profiled as the third top quality in the set. 

The haunting, sad effect of minor keys seems to be well favored in techno.

Both of these techno sets were detected to be mostly in minor keys: Ellen Allien’s was predominantly in B-flat minor, and Roman Flügel’s in F minor.

Set #3: Dub-Isotope’s VOID mix 

Pivoting away from techno, the third set we analyzed was a Drum N Bass one. We analyzed Dub Isotope’s set at VOID Berlin, a stellar venue for non-techno and techno music alike.


Dub Isotope’s stellar 171 BPM Drum n’ Bass journey @ VOID Berlin

While the two sets above were in a minor key, our analysis results showed that this set favored F major. Also, compared to the 105-130 BPM range of the techno sets, Dub Isotope’s set sat firmly at 171 BPM, a tempo characteristic of Drum N Bass music. 

Listening to Drum N Bass is quite a diverse emotional journey. The qualities detected in this set were at more moderate levels, compared to the earlier two techno sets. 

Cyanite’s AI mood analysis on Dub Isotope’s Drum n’ Bass trip

Our results signaled that this set was definitely more upbeat. While the techno sets had certain emotions at a distinctively high level and others at a significantly lower range (e.g. ‘relaxing’ at near rock-bottom levels for both), the various emotions detected for Dub Isotope’s set mostly occupied the mid-range. Among these, the top few to note were ‘calm’, ‘dark’ and ‘relaxing’, followed very closely by ‘happy’.

Looking at our genre analysis, Dub Isotope’s set was similarly detected to be electronic dance, with an interesting spurt of hip hop just after the halfway mark of the mix.

Analyze your own music

We built Cyanite in a way that everyone can use it to analyze their own music with AI. If you want to get insights on mood, genre, bpm, and key for your music, you can register here for free and try it out yourself. Contact us if you have feedback, ideas, or want to use our API to integrate Cyanite into your database.