
A Look Behind the Curtain: Podcast Interview with Roman Gebhardt

August 22, 2023

Last updated on September 19th, 2023 at 12:58 pm

A.I. Podcast featuring CAIO Roman Gebhardt

Our Chief Artificial Intelligence Officer, Roman Gebhardt, was a guest on this week's episode of The Illiac Suite – Music and Artificial Intelligence, a podcast by Dennis Kastrup. The episode features an A.I.-generated version of "Bohemian Rhapsody," sung not by Freddie Mercury but by Whitney Houston, which shows how far A.I.-generated music has already come. It has become fairly easy to train an A.I. model on a single vocalist, but what about entire songs? How do we make A.I. understand music to the point where it feels like asking a professional musician to find you the perfect song for any scenario, and it never misses?


Metadata

At the core of A.I.'s grasp of music is metadata: categories like moods, genres, instruments, and styles help an A.I. understand songs the way we do. We at Cyanite automatically tag these features using our A.I. once our customers, such as Slip.stream or BMG, upload songs into our system. Our text-to-music search, however, is not translated into tags; it maps text directly to music. So, unlike tag-based music search, Cyanite offers Free Text Search. How do we do that? We train our systems on paired sets of music and text descriptions, so they can understand everything that comes to your mind, as long as you are able to put it into words.
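Cyanite does not publish its model internals, but the idea of mapping text directly to music can be illustrated with a joint embedding space: a text encoder and an audio encoder are trained on paired descriptions and songs so that matching pairs land close together, and search becomes a similarity ranking. The sketch below assumes hypothetical embed_text and embed_audio stand-ins (here just deterministic random vectors) purely to show the search step.

```python
# Minimal sketch of a Free-Text-Search-style lookup in a joint text-audio
# embedding space. embed_text/embed_audio are hypothetical placeholders for
# trained encoders; Cyanite's actual models and APIs are not shown here.
import numpy as np

def embed_text(description: str) -> np.ndarray:
    # Placeholder: a real system would run a trained text encoder here.
    rng = np.random.default_rng(abs(hash(description)) % (2**32))
    v = rng.normal(size=128)
    return v / np.linalg.norm(v)

def embed_audio(track_path: str) -> np.ndarray:
    # Placeholder: a real system would run a trained audio encoder here.
    rng = np.random.default_rng(abs(hash(track_path)) % (2**32))
    v = rng.normal(size=128)
    return v / np.linalg.norm(v)

def free_text_search(query: str, catalog: list[str], top_k: int = 5) -> list[tuple[str, float]]:
    """Rank tracks by cosine similarity between the query text and each track."""
    q = embed_text(query)
    scored = [(track, float(np.dot(q, embed_audio(track)))) for track in catalog]
    return sorted(scored, key=lambda pair: pair[1], reverse=True)[:top_k]

if __name__ == "__main__":
    results = free_text_search("music to enjoy the moment",
                               ["track_a.wav", "track_b.wav", "track_c.wav"])
    for track, score in results:
        print(f"{track}: {score:.3f}")
```

Because the query is never reduced to a fixed tag vocabulary, any phrasing that the text encoder has learned to represent can be matched against the catalog.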


We are not searching for certain keywords that appear in a search. We directly map text to music. We make the system understand which text description fits a song. This is what we call Free Text Search.

Roman Gebhardt, CAIO at Cyanite

The Problem

Music is art. You can't unanimously define art, or it probably wouldn't be art in the first place. That's the beauty of it all: subjectivity. While we love that about music, it's one of the biggest challenges for our A.I. model. As Dennis Kastrup put it beautifully in the podcast: while some people listen to sad music and find solace, others might only get sadder. Images, for example, are much easier to analyze than music. Imagine a picture of a yellow wall with a TV standing in front of it. While everyone sees the colors slightly differently, no one would identify the TV as a record player. Now think of "Bohemian Rhapsody" by Queen. Good luck getting everyone to agree on the genre, or even the mood, of that song.

Never-Ending Solutions

The key to successfully training a model like ours is to provide meaningful audio descriptions with the songs that we feed into our system. From there, we do a lot of reverse engineering. We analyze our models and get feedback from our customers to see where our models lack understanding. Then, we search for meaningful metadata to make up for that. So, it’s a never-ending cycle in which we build and refine, just to build again and refine from there. On and on and on.
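To make the shape of that cycle concrete, here is a toy illustration of the loop described above: build, check where the model still disagrees with feedback, add curated descriptions for those songs, and go again. The functions and data are invented for illustration only; the real workflow involves retraining Cyanite's models and curating new metadata from customer feedback.

```python
# Toy illustration of the build-and-refine cycle. All names and data here are
# hypothetical placeholders, not Cyanite's actual pipeline.

def evaluate(model_tags: dict, feedback: dict) -> list:
    """Return the songs where the model's tags still disagree with feedback."""
    return [song for song, tags in feedback.items() if model_tags.get(song) != tags]

def refine(model_tags: dict, feedback: dict, weak_spots: list) -> dict:
    """Fold curated descriptions for the weak spots back into the model's tags."""
    return {**model_tags, **{song: feedback[song] for song in weak_spots}}

model_tags = {"song_a": {"mood": "happy"}, "song_b": {"mood": "calm"}}
feedback = {"song_a": {"mood": "happy"}, "song_b": {"mood": "melancholic"}}

# Build, evaluate, refine - and start over on the next batch of feedback.
while (weak_spots := evaluate(model_tags, feedback)):
    model_tags = refine(model_tags, feedback, weak_spots)
```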

Unleashing What We’ve Built

At the end of all of this, we work together with numerous companies that integrate our search technology into their systems and websites. A good example showcased in the podcast is Slip.stream. With Slip.stream, you can browse a massive royalty-free catalog and, with the power of Cyanite, find the perfect song for any situation using Free Text Search, for example, "music to enjoy the moment." Listen to what Cyanite came up with here. We at Cyanite are building the tools you need to shape tomorrow's music industry. Feel free to check out the podcast above.

Your Cyanite Team.

