Everything you’ve ever wanted to know about Cyanite (answering your FAQs)

Ready to explore your catalog? Sign up for Cyanite.

As music catalogs grow, finding the right track gets harder. Metadata doesn’t always keep up, but teams are still expected to deliver fast, reliable results.

Libraries, publishers, sync teams, and the technical leads supporting them need systems that make large catalogs easier to understand and search. Cyanite is designed to support that work.

This guide provides a clear, high-level introduction to how Cyanite works and how it’s used in practice, giving teams a simple starting point before diving deeper into specific topics.

Learn more: Explore our FAQs to dig deeper into how Cyanite works.

The problem of scaling modern music catalogs

Once a catalog reaches a certain size, searching it becomes an inconsistent process. Music is described through tags and metadata that were added by different people, at different times, often for different needs. As the catalog grows, those descriptions stop lining up, which makes tracks harder to compare and surface reliably.

Over time, the same song can become discoverable in one context and invisible in another. Familiar tracks tend to show up first, while large parts of the catalog stay beneath the surface simply because their sound isn’t clearly represented in the data.

Scaling a modern music catalog means creating a shared, consistent way to describe sound, so music can be worked with confidently across teams and workflows, no matter how large the catalog becomes.

What Cyanite is (and what it is not)

Cyanite is an intelligent music system that works directly with sound. It analyzes each track and translates what can be heard into structured information that stays consistent across the catalog. That information is used both to tag music automatically and to support sound-based search.

Teams can use Cyanite through the web app, integrate it into their own systems via an API, or access it directly within supported music CMS environments.

Cyanite is not a replacement for listening or creative judgment. It doesn’t decide what should be used, pitched, or licensed. It provides a consistent, sound-based foundation that helps teams work with music at scale while keeping human decision-making at the center.

How Cyanite analyzes music

Cyanite analyzes music through sound, not user behavior. Instead of relying on plays, clicks, or listening history, it focuses on the audio itself and produces a consistent, reliable sound description. This means each piece of music enters the system under the same logic, regardless of when it was added or who uploaded it.

Read more: How do music recommendation systems work?

Core capabilities

At its core, Cyanite helps teams organize and work with large music catalogs through music tagging and search. The same audio-based logic applied to every track creates consistent descriptions and keeps music easy to find, compare, and explore, even as catalogs grow.

A table showing Cyanite's AI-Tagging Taxonomy

To make large catalogs easier to work with, Cyanite applies consistent labeling based on each track’s full audio.

  • Auto-Tagging analyzes the audio to generate metadata like genre, mood, and tempo.
  • Auto-Descriptions generate concise, neutral descriptions that highlight how a track sounds and give teams quick context without having to listen first.

Sound-based search: Similarity, Free Text, and Advanced Search

To help teams find music, Cyanite offers multiple ways to search a catalog. 

  • Similarity Search finds tracks with a similar sound to a reference song, whether it’s from your catalog, an uploaded file, or a YouTube preview. It’s often a good fit when a brief starts with a musical reference rather than a written description.
  • Free Text Search allows teams to describe music in natural language, including full sentences and prompts in different languages. It then matches that intent to sound in the catalog.
  • Advanced Search, available through the API as an add-on for Similarity and Free Text Search, adds more control as searches become more specific. It enables filters and visibility into why tracks appear in the results, making it easier to refine and compare matches.

Privacy-first, IP-safe audio analysis

Cyanite is built for professional music catalogs, with all data processed and stored on servers in the EU in line with GDPR. Audio files are stored securely, can be deleted at any time on request, and are not shared with third parties. All analysis and search algorithms are developed in-house. For additional protection, Cyanite also supports spectrogram-based uploads, allowing audio to be analyzed without being reconstructable into playable sound.
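The general idea behind spectrogram-based protection can be sketched in a few lines. This is a generic illustration of the technique, not Cyanite’s actual pipeline: a magnitude-only STFT discards phase information, so the exact waveform can no longer be recovered from the representation.

```python
import numpy as np

def magnitude_spectrogram(audio: np.ndarray, frame_size: int = 1024, hop: int = 512) -> np.ndarray:
    """Return a magnitude-only STFT of a mono signal.

    Discarding the phase makes the representation non-invertible:
    the original waveform cannot be exactly reconstructed from it.
    """
    assert len(audio) >= frame_size, "signal must be at least one frame long"
    window = np.hanning(frame_size)
    n_frames = 1 + (len(audio) - frame_size) // hop
    frames = np.stack([
        audio[i * hop : i * hop + frame_size] * window
        for i in range(n_frames)
    ])
    # rfft along each frame; abs() keeps magnitudes only and drops phase
    return np.abs(np.fft.rfft(frames, axis=1))  # shape: (n_frames, frame_size // 2 + 1)

# Example: one second of a 440 Hz sine at 16 kHz
sr = 16000
t = np.arange(sr) / sr
spec = magnitude_spectrogram(np.sin(2 * np.pi * 440 * t))
```

The spectrogram still carries the spectral content a model needs for analysis, while the dropped phase prevents exact reconstruction of playable audio.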

How teams combine AI and human expertise

Cyanite is used for organizing, pitching, searching, and curating a catalog. Automation applies a consistent, sound-based foundation across every track, while teams add context, intent, and custom metadata where it matters. 

Because there are clear limits to what can be inferred from audio alone, most teams adopt a hybrid approach to their work. They use Cyanite to keep catalogs structured and searchable at scale, while human input shapes how the music is ultimately used.

How Cyanite fits into existing catalog systems

Cyanite is used at the point where teams need to explore a catalog for a pitch, brief, or curation task. It applies a consistent, sound-based foundation across all tracks, so decisions can be informed by reliable discovery results. With technology supporting the process, teams can confidently listen, compare, and narrow options, applying human judgment to make the selection.

Where to go deeper

Now that we’ve covered the basics, you can explore specific parts of Cyanite in more detail in our related articles.

Getting started with Cyanite

To evaluate Cyanite, the simplest starting point is a track sample analysis. Many teams begin with a small set of tracks to review tagging results and search behavior before deciding whether to scale further. This makes it easy to validate fit without committing a full catalog upfront.

For teams building products or integrating search into their own tools, integrating our API is a hands-on way to explore analysis, tagging, and similarity search in a live environment. You can create an API integration for free after registering via the web app.

When preparing for a larger evaluation, a bit of structure helps. Audio should be provided as MP3 files and grouped into clear folders or batches that reflect how the catalog is organized. Most teams start with a representative subset and expand in phases once results and timelines are clear. If you are not able to deliver your music as MP3 files, reach out to support@cyanite.ai.

Can Meta’s audio aesthetic model actually rate the quality of music?

Last year, Meta released Audiobox Aesthetics (AES), a research model that proposes scoring audio based on how people would rate it. The model outputs four scores: Production Quality (PQ), Production Complexity (PC), Content Enjoyment (CE), and Content Usefulness (CU). 

The study suggests that audio aesthetics can be broken into these axes, and that a reference-free model can predict these scores directly from audio. If that holds, the scores could start informing decisions and become signals people lean on when judging music at scale.

I took a closer look to understand how the model frames aesthetic judgment and what this means in practice. I ran Audiobox Aesthetics myself and examined how its scores behave with real music.

What Meta’s Audiobox Aesthetics paper claims

Before jumping into my evaluation, let’s take a closer look at what Meta’s Audiobox Aesthetics paper set out to do.

The paper introduces a research model intended to automate how audio is evaluated when no reference version exists. The authors describe human evaluations as costly and inconsistent, leading them to seek an automated stand-in for listening judgments.

To address this need, the authors propose breaking audio evaluation into four separate axes and predicting a separate score for each:

  • Production Quality (PQ) looks at technical execution, focusing on clarity and fidelity, dynamics, frequency balance, and spatialization.
  • Production Complexity (PC) reflects how many sound elements are present in the audio.
  • Content Enjoyment (CE) reflects how much listeners enjoy the audio, including their perception of artistic skill and overall listening experience.
  • Content Usefulness (CU) considers whether the audio feels usable for creating content.

The model is trained using ratings from human listeners who follow the same guidelines across speech, music, and sound effects. It analyzes audio in short segments of around 10 seconds. For longer tracks, the model scores each segment independently and provides an average. 

Beyond the audio itself, the model has no additional context. It does not know how a track is meant to be used or how it relates to other music. According to the paper, the scores tend to align with human ratings and could help sort audio when it’s not possible to listen to it all. In that way, the model is presented as a proxy for listener judgment.
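The segment-and-average behavior described in the paper is easy to picture in code. Here is a hedged sketch of that aggregation logic, where the per-segment scoring function is a stand-in for the model’s prediction, not Meta’s implementation:

```python
from typing import Callable, List

def score_track(duration_s: float,
                score_segment: Callable[[float, float], float],
                segment_s: float = 10.0) -> float:
    """Score a track by averaging independent per-segment scores,
    as the AES paper describes for longer audio.

    `score_segment(start, end)` stands in for the model's per-window prediction.
    """
    if duration_s <= 0:
        raise ValueError("duration must be positive")
    scores: List[float] = []
    start = 0.0
    while start < duration_s:
        end = min(start + segment_s, duration_s)
        scores.append(score_segment(start, end))
        start = end
    return sum(scores) / len(scores)

# Toy usage: a "model" that scores later segments higher.
# The track gets the mean of its four segment scores.
track_score = score_track(35.0, lambda s, e: s / 35.0)
```

One consequence of this design is that a track’s score reflects the average of its windows, so a single striking passage is diluted by the rest of the audio.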

Why I decided to evaluate the model

I wasn’t the only one curious about this model. Jeffrey Anthony’s “Can AI Measure Beauty? A Deep Dive into Meta’s Audio Aesthetics Model,” for instance, offers a philosophical examination of what it means to quantify aesthetic judgment, including its ontological implications. I decided to take a more hands-on approach, testing the model on real-world examples to see whether any interesting patterns emerge in its predictions.

What caught my attention most was how these scores are meant to be used. Once aesthetic judgments are turned into numbers, they start to feel reliable. They look like something you can sort by, filter on, or use to decide what gets heard and what gets ignored.

This matters in music workflows. Scores like these could influence how catalogs are cleaned up, how tracks are ranked for sync, and how large libraries of music are evaluated without listening. With a skeptical but open mindset, I set out to discover how these scores behave with real-world data.

What I found when testing the model

A) Individual-track sanity checks

I began with a qualitative sanity check using individual songs whose perceptual differences are unambiguous to human listeners. The tracks I selected represent distinct production conditions, stylistic intentions, and levels of artistic ambition.

I included four songs:

  • “Funky Town” by Lipps Inc. (a degraded, low-quality MP3)
  • “Giorgio by Moroder” by Daft Punk (audiophile-grade disco-funk)
  • “Blue Calx” by Aphex Twin (experimental electronic music)
  • “The Schumacher Song” by DJ Visage (formulaic late-90s pop-trance)

The motivation for this test was straightforward. A model claiming to predict Production Quality should assign a lower PQ to “Funky Town” (low-quality MP3) than to “Giorgio by Moroder.” A model claiming to estimate production or musical complexity should recognize “Blue Calx” by Aphex Twin as more complex than formulaic late-90s pop-trance such as DJ Visage’s “Schumacher Song.” Likewise, enjoyment and usefulness scores should not collapse across experimental electronic music, audiophile-grade disco-funk, old-school pop-trance, and degraded consumer audio.

You can see that the resulting scores, shown in the individual-track comparison plot above, contradict these expectations. “Funky Town” receives a PQ score only slightly lower than “Giorgio by Moroder,” indicating near insensitivity to codec degradation and mastering fidelity. Even more strikingly, “Blue Calx” is assigned the lowest Production Complexity among the four tracks, while “The Schumacher Song” and “Funky Town” receive higher PC scores. This directly inverts what most listeners would consider to be structural or compositional complexity.

Content Enjoyment is highest for “Funky Town” and lowest for “Blue Calx,” suggesting that the CE dimension aligns more closely with catchiness or familiarity than with artistic merit or aesthetic depth.

Taken together, these results indicate that AES is largely insensitive to audio fidelity. It fails to reflect musical or structural complexity, and instead appears to reward constant spectral activity and conventional pop characteristics. Even at the individual track level, the semantics of Production Quality and Production Complexity don’t match their labels.

B) Artist-level distribution analysis

Next, I tested whether AES produces distinct aesthetic profiles for artists with musical identities, production aesthetics, and historical contexts that are clearly different. I analyzed distributions of Production Quality, Production Complexity, Content Enjoyment, and Content Usefulness for Johann Sebastian Bach, Skrillex, Dream Theater, The Clash, and Hans Zimmer.

If AES captures musically meaningful aesthetics, we would expect to see systematic separation between these artists. For example, Hans Zimmer and Dream Theater might have a higher complexity score than The Clash. Skrillex’s modern electronic productions might have a higher quality score than early punk recordings. Bach’s works might show high complexity but variable enjoyment or usefulness depending on the recording and interpretation.

Instead, the plotted distributions show strong overlap across artists for CE, CU, and PQ, with only minor shifts in means. Most scores cluster tightly within a narrow band between approximately 7 and 8, regardless of artist. PC exhibits slightly more variation, but still fails to form clear stylistic groupings. Bach, Skrillex, Dream Theater, and Hans Zimmer largely occupy overlapping regions, while The Clash is not consistently separate.

This suggests that AES doesn’t meaningfully encode artist-level aesthetic or production differences. Despite extreme stylistic diversity, the model assigns broadly similar aesthetic profiles, reinforcing the interpretation that AES functions as a coarse estimator of acceptability or pleasantness rather than a representation of musical aesthetics.

C) Bias analysis using a balanced gender-controlled dataset

Scoring models are designed to rank, filter, and curate songs in large music catalogs. If these models encode demographic-correlated priors, they can silently amplify existing biases at scale. To test this risk, I analyzed whether AES exhibits systematic differences between tracks with female lead vocals and tracks without female lead vocals.

In our 2025 ISMIR paper, we showed that common music embedding models pick up non-musical singer traits, such as gender and language, and exhibit significant bias as a result. Because AES is intended to judge quality, aesthetics, and usefulness, it would be particularly problematic if it had similar biases. They could directly influence which music is considered “better” or more desirable.

I constructed a balanced dataset using the same methodology used in our 2025 paper, equalizing genre distribution and singer language across groups.

For each group, I computed score distributions for Content Enjoyment, Content Usefulness, Production Complexity, and Production Quality, visualized them, and performed statistical testing using Welch’s t-test alongside Cohen’s d effect sizes. For context, Welch’s t-test is a statistical test that compares whether the average scores between two groups are significantly different. Cohen’s d is a measure of effect size that quantifies how large that difference is in standardized units.
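For readers who want to reproduce these statistics, both measures are short enough to implement directly. A minimal pure-Python sketch follows; computing a p-value from the t statistic additionally needs a t-distribution CDF (e.g. from SciPy), which is omitted here:

```python
from math import sqrt
from statistics import mean, variance

def welch_t(a, b):
    """Welch's t statistic and Welch–Satterthwaite degrees of freedom
    for two independent samples with unequal variances."""
    m1, m2 = mean(a), mean(b)
    v1, v2 = variance(a), variance(b)  # sample variances (ddof=1)
    n1, n2 = len(a), len(b)
    se2 = v1 / n1 + v2 / n2
    t = (m1 - m2) / sqrt(se2)
    df = se2 ** 2 / ((v1 / n1) ** 2 / (n1 - 1) + (v2 / n2) ** 2 / (n2 - 1))
    return t, df

def cohens_d(a, b):
    """Cohen's d: the mean difference in units of the pooled standard deviation."""
    n1, n2 = len(a), len(b)
    pooled = sqrt(((n1 - 1) * variance(a) + (n2 - 1) * variance(b)) / (n1 + n2 - 2))
    return (mean(a) - mean(b)) / pooled
```

Conventionally, |d| around 0.2 is read as a small effect and around 0.5 as a moderate one, which is the scale the results below are described on.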

The results show consistent upward shifts for female-led tracks in CE, CU, and PQ. All three differences are statistically significant with small-to-moderate effect sizes. In contrast, there is virtually no difference in Production Complexity score between groups.

This pattern indicates that the model systematically assigns higher enjoyment, usefulness, and quality scores to material with female vocals, even under controlled conditions. Because complexity remains unaffected, the effect doesn’t appear to stem from structural musical differences. Instead, it likely reflects correlations in training data and human annotations, or the model treating certain vocal timbres and production styles associated with female vocals as implicit quality indicators.

These findings suggest that AES encodes demographic-correlated aesthetic priors, which is problematic for a model intended to judge musical quality, aesthetics, and usefulness.

When a measure becomes a target, it ceases to be a good measure.

Charles Goodhart

Economist

Why this matters for the industry

Economist Charles Goodhart famously observed that “when a measure becomes a target, it ceases to be a good measure.” He was describing what happens when a metric starts to drive decisions rather than just being an indicator. Once a number is relied on, it begins to shape how people think and choose.

That idea applies directly to aesthetic scoring. A score, once it exists, carries weight. It gets used as a shortcut in decisions, even when its meaning is incomplete. This matters in music workflows because aesthetic judgment depends on context and purpose. 

When a simplified score is treated as reliable, systems can start favoring what scores well rather than what actually sounds better or serves a creative goal. Over time, that can quietly steer decisions away from how audio is perceived and used in practice.

How we approach audio intelligence at Cyanite

At Cyanite, music isn’t judged in a vacuum, and neither are the decisions built on top of it. That’s why we don’t rely on single aesthetic scores. Instead, we focus on making audio describable and searchable in ways that stay transparent and grounded in context.

Aesthetic scoring can give the illusion of precision, but it often lumps together different technical qualities, genres, and styles. In music search and discovery, a single score doesn’t explain why a track is surfaced or excluded. That reasoning matters to us. Not to decide what’s “good,” but to give teams tools they can understand and trust.

We see audio intelligence as a way to expose structure, not replace judgment. Our systems surface identifiable musical attributes and relationships, knowing that the same track can be the right or wrong fit depending on how it’s used. The goal is to support human decision-making, not substitute it with scores.

Experimentation has a place, but in music, automation works best when it’s explainable and limit-aware.

What responsible progress in music AI should look like

Progress in music and AI is underpinned by transparency. Teams should be able to understand how a model was trained and how its outputs relate to the audio. When results are interpretable, people can see why a track surfaces and judge for themselves whether the signal makes sense in their own context.

That transparency depends on data choices. Music spans styles, cultures, eras, and uses, and models reflect whatever they are fed. Developers need to work with broad, representative data and be clear about where coverage is thin. Being open about what a model sees, and what it does not, makes its behavior more predictable and its limits easier to manage.

Clear communication matters just as much once tools are in use. For scores and labels to be applied responsibly, teams need a shared understanding of what those signals reflect and where their limits are. Otherwise, even well-intentioned metrics can be stretched beyond what they are able to support.

This kind of openness helps the industry build tools people can understand and trust in real workflows. 

We explored how these expectations show up in practice in “The state of AI transparency in music 2025,” a report developed with MediaTracks and Marmoset on how music licensing professionals make decisions around AI, creator background, and context. You can read the full report here.

So… does Meta’s model provide meaningful ratings for music?

Based on these tests, the answer is no. The model produces stable scores, but they don’t map cleanly to how musical quality or complexity are assessed in real catalog work. Instead, the model appears to align more with easily detectable production traits than with the distinctions people consistently make when judging music in context.

That doesn’t make Audiobox Aesthetics insignificant. It can support research by defining a clear scoring framework, showing how reference-free predictors can be trained across speech, music, and sound, and making its models and data available for inspection and comparison. It also illustrates where AES scores can be useful, particularly when large volumes of audio need to be filtered or monitored but full listening is impractical.

Problems emerge when scores like these begin shaping decisions. When a score is presented as a measure of quality, people need to know what it’s actually measuring so they can judge whether it applies to their use case. Without that clarity, it becomes easy to trust the number even when it’s not a good fit.

At Cyanite, we see this as a reminder of the importance of responsibility in music and AI. Progress is driven by systems that stay grounded in real listening behavior and make their assumptions visible.

How Melodie Music combines sound-based AI search and contextual metadata to spotlight original Australian artists

Ready to improve your music discovery workflows? Try Similarity Search in Cyanite.

Cyanite aligns with our philosophy because it doesn’t use AI to generate content; it uses AI to uncover it. It solves a genuine pain point for our users: the time-consuming nature of music search. We immediately saw that Cyanite could amplify our existing search system rather than overwrite it. It wasn’t a case of ‘AI versus humans’; it was AI empowering humans to find better music, faster.

Evan Buist

Managing Director, Melodie Music

Melodie is a music licensing platform that provides pre-cleared music for film, TV, advertising, and content creation. All artists and tracks on the platform are carefully curated and hand-selected for quality, originality, and emotional resonance. Ethics are at the core of Melodie’s company philosophy. It operates under a 50/50 revenue and royalty split, meaning Melodie doesn’t earn money on downloads until the artist does.

To make it easier to discover artists at scale, Melodie continues to refine how users navigate its catalog. AI helps users explore more quickly—but it doesn’t replace the human element behind editorial curation.

The rising tension between depth and speed

As Melodie’s catalog grew, a familiar tradeoff emerged: depth versus speed.

Despite thoughtful editorial tagging, the reality was that users often struggled to translate nuanced creative briefs into static keywords. “Describing music is inherently subjective; what sounds ‘uplifting’ to one person might sound ‘intense’ to another. As the saying goes, talking about music is like dancing about architecture,” explains Evan.

By relying solely on tags, users often found themselves in an experimental searching-listening-refining-repeating loop—a time-consuming effort that most editors and producers simply don’t have the bandwidth for.

Melodie recognized this problem early on and set out to improve the user experience in their library. As Evan puts it, “bridging the gap between ‘hearing it in your head’ and ‘finding it on the screen’ is the holy grail of music licensing.”

AI as an enabler, not a generator

Human curation is central to how Melodie operates. Tracks are not scraped or auto-generated. Over time, it became clear that tags on their own couldn’t support the kind of discovery users needed, so AI was added to help surface music intuitively and improve navigation.

Cyanite aligned naturally with that philosophy.

Rather than positioning AI as a substitute for curation, Cyanite’s AI search treats sound as data that can be understood, compared, and explored. What clicked for Melodie in their search for AI music analysis software was Cyanite’s approach: “The technology felt musical rather than just mathematical. The analysis is intuitive and forgiving, respecting the nuances of the tracks,” says Evan.

Thanks to this shared understanding, Cyanite became part of Melodie’s day-to-day music discovery process.

How Cyanite fits into Melodie’s workflow

Today, Melodie users move fluidly between different music discovery pathways depending on their working process.

Sound-based Similarity Search

With Cyanite’s Similarity Search, users can analyze a reference song and instantly explore tracks with a comparable emotional arc, energy, and sonic character. The reference can come from Spotify, YouTube, or a temporary edit.

This closes the gap between intuition and results in seconds.

A GIF showing the Similarity Search interface of Melodie Music

Prompt-based Free Text Search

Some users prefer to express what they are looking for in their own words. Prompt-based search allows them to describe mood, pacing, or instrumentation, even with spelling errors or mixed languages. Evan believes natural language search has done for music libraries what Google did for information in the late 90s: democratized access.

Regardless of how a user describes music, AI provides a laser-accurate shortlist in seconds. It turns discovery into exploration, allowing users to combine the speed of AI with Melodie’s human-tagged editorial filters to find the perfect track.

Evan Buist

Managing Director, Melodie Music

A screen recording showing a music similarity search and highlighting music tags

Cyanite has become a vital part of our ecosystem, helping us prove that technology can support culture, not replace it.

Evan Buist

Managing Director, Melodie Music

AI Music Discovery: How Marmoset Uses Cyanite | A Case Study

Founded in 2010, Marmoset is a full-service music licensing agency representing hundreds of independent artists and labels. At its heart, their core experience involves browsing for music. They offer music discovery for any moving visual media. From sync (movies...

AI-Powered Music Marketing feat. Chromatic Talents

Chromatic Talents acts as a music brand consultancy providing a comprehensive range of services in artist management, development, digital branding, and business development. Find out how they use AI-powered music marketing built on Cyanite. The goal of the...

From upload to output: how Cyanite turns audio into reliable metadata at scale

Explore how Cyanite turns sound into structured metadata: Just upload a couple of songs to our web app.

Managing a music catalog involves more than just storing files. As catalogs grow, teams start running into a different kind of challenge: music becomes harder to find, metadata becomes inconsistent, and strong tracks remain invisible simply because they are described differently than newer material.

Many teams still rely on manual tagging or have inherited metadata systems that were never designed for scale. Over time, this leads to uneven descriptions, slower search, and workflows that depend more on individual knowledge than on shared systems. Creative teams spend valuable time navigating the catalog instead of working with the music itself.

Cyanite’s end-to-end tagging workflow was built to address this challenge. It gives teams a stable, shared foundation they can build on, supporting human judgment—not replacing it. It complements subjective, manual labeling with a consistent, audio-based process that works the same way for every track, whether you’re onboarding new releases or making a legacy catalog more organized.

This article walks through how that workflow functions in practice—from the moment audio enters the system to the point where structured metadata becomes usable across teams and tools.

Why tagging workflows tend to break down as catalogs grow

Most tagging workflows start with care and intention. A small team listens closely, applies descriptive terms, and builds a shared understanding of the catalog. But as volume increases and more people get involved, the system begins to stretch.

As catalogs scale, the same patterns tend to appear across organizations:

  • Different editors describe the same sound in different ways.
  • Older metadata no longer aligns with newer releases.
  • Genre and mood definitions shift over time.
  • Search results reflect wording more than sound.

When this happens, teams increasingly rely on memory instead of the systems in place. This leads to strong tracks getting overlooked, response times increasing, and trust in the metadata eroding.

Cyanite’s workflow addresses this fragility by grounding metadata in the audio itself and applying the same logic across the entire catalog.

Preparing your catalog for audio-based tagging

Teams can adopt Cyanite quickly, as little preparation is involved. The system doesn’t require existing metadata, spreadsheets, or reference information. It listens to the audio file and derives all tags from the sound alone.

Getting started requires very little setup:

  • MP3 files up to 15 minutes in length
  • No pre-existing metadata
  • No manual pre-labeling
  • No changes to your current file structure

Even 128 kbit/s MP3s are usually sufficient, which means older archive files can be analyzed as they are—no need for additional audio preparation. Teams can then choose how they want to bring audio into Cyanite based on volume and workflow. Once that’s decided, tagging can begin immediately.
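Grouping a folder of MP3s into upload batches can be scripted in a few lines. Here is a small sketch using only the Python standard library; the batch size of 500 is an arbitrary example, not a Cyanite requirement:

```python
from pathlib import Path
from typing import List, Sequence

def chunk(items: Sequence, size: int) -> List[list]:
    """Split a sequence into consecutive batches of at most `size` items."""
    return [list(items[i : i + size]) for i in range(0, len(items), size)]

def mp3_batches(folder: str, batch_size: int = 500) -> List[List[Path]]:
    """Collect all MP3 files under `folder` (recursively) and group them
    into upload batches, sorted for a stable, reproducible order."""
    return chunk(sorted(Path(folder).rglob("*.mp3")), batch_size)
```

Sorting before batching keeps the grouping stable across runs, which makes it easier to track which batches have already been uploaded.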

If you’re unsure about uploading copyrighted audio to Cyanite, you can explore our security standards and privacy-first workflows, including options to process audio in a copyright-safe way using encrypted or abstracted data.

Bringing audio into Cyanite in a way that fits your workflow

Different organizations manage music in different ways, so Cyanite supports several ingestion paths that all lead to the same analysis results.

Teams working with smaller batches often start in the web app. This is common for sync teams reviewing submissions, catalog managers auditing older libraries, or teams testing Cyanite before deeper integration. Audio can be uploaded directly, selected from disk, or referenced via a YouTube link, with analysis starting automatically once the file is added.

Platforms and larger catalogs usually integrate via the API. In this setup, tagging runs inside the organization’s own systems. Audio is uploaded programmatically, and results are delivered automatically via webhook as structured JSON as soon as processing is complete. This approach supports continuous ingestion without manual steps and fits naturally into existing pipelines.
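On the receiving side, a catalog system would typically parse the webhook body and index a few fields. The payload field names in this sketch are hypothetical, chosen only to illustrate the pattern; the actual schema is defined in Cyanite’s API documentation.

```python
import json

def handle_webhook(raw_body: bytes) -> dict:
    """Parse a (hypothetical) analysis-complete webhook payload and pull out
    the fields a catalog system might index.

    The field names below are illustrative only -- the real schema comes
    from Cyanite's API documentation.
    """
    payload = json.loads(raw_body)
    return {
        "track_id": payload["trackId"],
        "genres": payload.get("genres", []),
        "moods": payload.get("moods", []),
    }

# Example payload (hypothetical shape)
example = json.dumps({
    "trackId": "abc-123",
    "genres": ["electro"],
    "moods": ["energetic"],
}).encode()
```

Because results arrive asynchronously, the handler should be idempotent: if the same notification is delivered twice, indexing the same track twice must be harmless.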

For very large catalogs, Cyanite can also provide a dedicated S3 bucket with CLI credentials. This allows high-throughput ingestion without relying on browser-based uploads. It’s often used during initial onboarding of catalogs containing thousands of tracks.

Some teams prefer not to upload files themselves at all. In those cases, audio can be shared via common transfer tools before the material is processed and delivered in the agreed format.

What happens once the analysis is complete?

Cyanite produces a structured, consistent description of how each track sounds, independent of who uploaded it or when it entered the catalog.

Metadata becomes available either in the web app library or directly inside your system via the API. We can also deliver an additional CSV and Google Spreadsheet export on request.

Each track receives a stable set of static tags and values, including:

  • Genres and free-genre descriptors
  • Moods and emotional dynamics
  • Energy and movement
  • Instrumentation and instrument presence
  • Valence–arousal values
  • The most representative part of the track
  • An Auto-Description summarizing key characteristics
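
To make the attribute list above concrete, here is a hypothetical per-track record and a simple filter built on it. The field names are illustrative, not Cyanite's exact output schema:

```python
# Illustrative shape of a per-track tag record -- field names are
# assumptions based on the attribute list above, not Cyanite's JSON schema.
track = {
    "id": "track-001",
    "genres": ["ambient", "electronica"],
    "moods": ["calm", "dreamy"],
    "energyLevel": "low",
    "instruments": ["synth", "piano"],
    "valence": 0.62,   # 0..1: negative -> positive feel
    "arousal": 0.31,   # 0..1: calm -> energetic
    "mostRepresentativeSection": {"start": 42.0, "end": 57.0},
    "autoDescription": "A calm ambient piece with soft synth textures.",
}

def matches(track: dict, mood: str, max_arousal: float = 1.0) -> bool:
    """A simple catalog filter built on consistent, audio-derived tags."""
    return mood in track["moods"] and track["arousal"] <= max_arousal

print(matches(track, "calm", max_arousal=0.5))  # True
```

Because every track carries the same fields, a filter like this behaves identically across legacy and newly ingested material.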

All tags are generated through audio-only analysis, which ensures that legacy tracks and new releases follow the same logic. Over time, this consistency becomes the foundation for faster search, clearer filtering, and more reliable collaboration across teams.

The full tagging taxonomy is available for teams that want deeper insight into how attributes are defined and structured. Explore Cyanite’s tagging taxonomy here.

Curious how the Google Spreadsheet export looks? Check out this sample.

How long does tagging take at different catalog sizes?

Cyanite processes audio quickly. A typical analysis time is around 10 seconds per track. Because processing runs in parallel, turnaround time depends more on workflow setup than on catalog size.

In practice, teams can expect:

  • Small batches to be ready almost instantly
  • Medium-sized libraries to complete within hours
  • Enterprise-scale catalogs to be onboarded within 5–10 business days, regardless of size

For day-to-day use via the API, results arrive in near real time via webhook as soon as processing finishes. This makes the workflow suitable both for large one-time onboarding projects and continuous ingestion as new music arrives.
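
The effect of parallel processing on turnaround can be sketched with simple arithmetic. The worker count below is an illustrative assumption, not a published figure:

```python
import math

def estimated_wall_time_seconds(n_tracks: int,
                                seconds_per_track: float = 10.0,
                                parallel_workers: int = 100) -> float:
    """Back-of-the-envelope turnaround: with parallel processing, wall
    time grows with catalog size divided by concurrency, not with raw
    track count. The worker count is an illustrative assumption."""
    batches = math.ceil(n_tracks / parallel_workers)
    return batches * seconds_per_track

print(estimated_wall_time_seconds(50))               # 10.0 -- one parallel pass
print(estimated_wall_time_seconds(100_000) / 3600)   # hours, not weeks
```

This is why small batches feel instant while very large catalogs are bounded by onboarding logistics rather than compute time.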

Understanding scores, tags, and why both matter

Cyanite’s models produce two complementary layers of information.

Numerical scores describe how strongly an attribute is present, both across the full track and within time-based segments. These values range from zero to one, with 0.5 representing a meaningful threshold.

Cyanite creates final tags by using an additional decision layer that considers how different attributes relate to one another. It doesn’t just apply a simple cutoff. This approach helps resolve ambiguities, stabilize hybrid sounds, and produce tags that make musical sense in context.

This means you get metadata that remains robust even for tracks that blend genres, moods, or production styles—a common challenge in modern catalogs.
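
The difference between a bare cutoff and a context-aware decision step can be illustrated with a toy heuristic. Cyanite's actual decision layer is a learned model, not the margin rule sketched here:

```python
# Contrast a naive 0.5 cutoff with a context-aware decision step.
# The margin rule below is a toy illustration only.
scores = {"rock": 0.55, "metal": 0.52, "folk": 0.48}

def naive_tags(scores: dict, threshold: float = 0.5) -> list:
    """Every score above the threshold becomes a tag."""
    return [name for name, s in scores.items() if s > threshold]

def contextual_tags(scores: dict, threshold: float = 0.5,
                    margin: float = 0.1) -> list:
    """Keep only attributes close in strength to the strongest one, so
    hybrid tracks get a small, coherent tag set instead of every
    borderline hit."""
    above = {n: s for n, s in scores.items() if s > threshold}
    if not above:
        return []
    best = max(above.values())
    return sorted(n for n, s in above.items() if best - s <= margin)

print(naive_tags(scores))       # ['rock', 'metal']
print(contextual_tags(scores))  # ['metal', 'rock'] -- chosen relative to each other
```

The point of the comparison: a contextual rule can keep two strong, related genres for a hybrid track while still rejecting a weak third, which a flat cutoff cannot express.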

Exporting metadata into your existing systems

Once tags are available, your team can export them in the format that best fits your workflow.

API users typically work with structured JSON, delivered automatically via webhook and accessible through authenticated requests. Cyanite’s Query Builder allows teams to explore available fields and preview real outputs before integration.

For one-time projects or larger deliveries, metadata can also be provided as CSV files. Web app users can request CSV export through Cyanite’s internal tools, which is especially useful during catalog cleanups or migrations.

Because the structure remains consistent across formats, metadata can be reused across systems without rework.
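
Because JSON and CSV share the same underlying structure, converting between them is mechanical. A minimal sketch with hypothetical records:

```python
import csv
import io
import json

# Hypothetical records in the JSON shape; the point is that the CSV
# export carries the same fields and values, so either format can feed
# downstream systems.
records = json.loads('[{"id": "t1", "genre": "jazz", "valence": 0.7},'
                     ' {"id": "t2", "genre": "ambient", "valence": 0.4}]')

def to_csv(rows: list) -> str:
    """Flatten a list of uniform dicts into CSV text."""
    buf = io.StringIO()
    writer = csv.DictWriter(buf, fieldnames=list(rows[0].keys()))
    writer.writeheader()
    writer.writerows(rows)
    return buf.getvalue()

print(to_csv(records))
```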

Learn how to quickly build your queries for the Cyanite API with our Query Builder.

How teams use tagged metadata in practice

Once audio-based tagging is in place, teams tend to notice changes quickly. Search becomes faster and more predictable. Creative teams can filter by sound instead of guessing keywords. Catalog managers spend less time fixing metadata and more time shaping the catalog strategically.

In practice, tagged metadata supports workflows such as:

  • Catalog management and cleanup
  • Creative search and curation
  • Ingestion pipelines
  • Licensing and rights
  • Sync briefs and pitching
  • Internal discovery tools
  • Audits and reporting

Over time, consistent metadata reduces friction between departments and makes catalog operations more resilient as libraries continue to grow.

Best practices from real-world usage

Teams see the smoothest results when they work with clean audio sources, batch large uploads, manage API credentials carefully, and switch to S3-based ingestion as catalogs become larger. Thinking about export formats early also helps avoid rework during onboarding projects.

None of this changes the outcome of the analysis itself, but it does make the overall process more predictable and easier to manage at scale.

With Cyanite, we have a partner whose technology truly matches the scale and diversity of our catalog. Their tagging is fast and reliable, and Similarity Search unlocks a whole new way to discover music, not just through filters, but through feeling. It’s a huge step forward in how we help creators connect with the right tracks.

Stan McLeod

Head of Product, Lickd

Final thoughts

Cyanite’s tagging workflow is designed to scale with your catalog without making your day-to-day work more complex. Whether you upload a handful of tracks through the web app or process tens of thousands via the API, the result will be the same: structured, consistent metadata that reflects how your music actually sounds.

If you’re ready to move away from manual tagging and toward a more stable foundation for search and discovery, explore the different ways to work with Cyanite and choose the setup that fits your workflow.

Want to work with Cyanite? Explore your options, and get in touch with our business team, who can provide guidance if you’re unsure how to start.

FAQs

Q: Do I need to send existing metadata to use Cyanite’s tagging workflow?

A: No. Cyanite analyzes the audio itself. It doesn’t rely on existing tags or descriptions.

Q: Can Cyanite handle both legacy catalogs and new releases?

A: Yes, it can. The same analysis logic applies to all tracks, which helps unify older and newer material under a single metadata structure.

Q: How are results delivered when using the API?

A: Results are sent automatically via webhook as structured JSON as soon as processing is complete.

Q: Is the tagging output consistent across export formats?

A: Yes. JSON and CSV exports use the same underlying structure and values.

Q: Who typically uses this workflow?

A: Music publishers, production libraries, sync teams, music-tech platforms, and catalog managers use Cyanite’s tagging workflow to support search, licensing, onboarding, and catalog maintenance.

Q: How long will it take to tag my music?

A: Small batches are tagged almost immediately. For larger catalogs, we usually need 5–10 business days for the complete setup.

How Cyanite protects your sensitive audio: privacy-first workflows for every catalog

Looking for secure AI music analysis? Discover Cyanite’s integration options. 

For many music teams, a significant hesitation about AI analysis is not about its capability or quality. It’s about trust. When teams explore AI-driven tagging or search, the conversation almost always leads to the same question: What happens to our audio once it leaves our system?

At Cyanite, we’ve built our technology around that concern from the very beginning. Rather than offering a single security promise, we provide multiple privacy-first workflows designed to meet different levels of sensitivity and compliance. This gives teams the flexibility to choose how their audio is handled, without compromising on tagging quality or metadata depth.

This article outlines the three privacy models Cyanite offers, explains how each one works in practice, and helps you decide which setup best fits your catalog and internal requirements.

Why audio privacy matters in modern music workflows

For those who manage it, audio represents creative identity, contractual responsibility, and, often, years of human effort. It’s not just another data type. Sending that material outside an organization can feel risky, even when the technical safeguards are strong and the operational benefits are clear.

Teams that evaluate our services often raise concerns about protecting unreleased material, complying with licensing agreements, and maintaining long-term control over how their catalogs are used. They look for assurances around:

  • Safeguarding confidential or unreleased content
  • Complying with NDAs and contractual obligations
  • Meeting internal legal or security standards
  • Maintaining full ownership and control

These are not edge cases. They reflect everyday realities for publishers, film studios, broadcasters, and music-tech platforms alike. That’s why Cyanite treats privacy as a core design principle.

Security option 1: GDPR-compliant processing on secure EU servers

For many organizations, strong data protection combined with minimal operational complexity is the right balance. In Cyanite’s standard setup, all audio is processed on secure servers located in the EU and handled in full compliance with GDPR.

In practical terms, this means:

  • Audio files are never shared with third parties.
  • Songs can be deleted anytime.
  • Ownership and control of the music always remain with the customer.

This model works well for publishers, production libraries, sync platforms, and music-tech companies that want to scale tagging and search workflows without maintaining their own infrastructure. For most catalogs, this level of protection is both robust and sufficient.

That said, not every organization is able to send audio outside its own environment, even under GDPR. For those cases, Cyanite offers additional options.

Learn more: See how AI music tagging works in Cyanite and how it supports large catalogs.

Security option 2: zero-audio pipeline—tagging without transferring audio

Some teams manage catalogs that cannot be transferred externally at all. These include confidential film productions, enterprise music departments, and archives operating under strict internal compliance rules. For these situations, Cyanite provides a spectrogram-based workflow that enables full tagging without the audio files ever being sent.

Spectrograms from left to right: Christina Aguilera, Fleetwood Mac, Pantera

Instead of uploading MP3s, audio is converted locally on the client side into spectrograms using a small Docker container provided by Cyanite. A spectrogram is a visual representation of frequency patterns over time. It contains no playable audio, cannot be converted back into a waveform without significant quality loss, and does not expose the original performance in any usable form.

From a metadata perspective, the results are identical to audio-based processing. From a privacy perspective, the original audio never leaves the customer’s environment. This makes the zero-audio pipeline a strong middle ground for teams that want AI-powered tagging while maintaining strict control over their content.
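
The idea behind the local conversion can be sketched in a few lines of NumPy. This only illustrates what a magnitude spectrogram is; Cyanite's Docker container performs the actual conversion with its own parameters and output format:

```python
import numpy as np

def spectrogram(signal: np.ndarray, frame_size: int = 256,
                hop: int = 128) -> np.ndarray:
    """Minimal local spectrogram: magnitudes of windowed FFT frames.

    Magnitudes only -- phase is discarded, so the original waveform
    cannot be recovered from what gets uploaded.
    """
    window = np.hanning(frame_size)
    frames = [signal[i:i + frame_size] * window
              for i in range(0, len(signal) - frame_size + 1, hop)]
    return np.abs(np.fft.rfft(np.array(frames), axis=1))

t = np.linspace(0, 1, 8000, endpoint=False)
audio = np.sin(2 * np.pi * 440 * t)  # 440 Hz test tone
spec = spectrogram(audio)
print(spec.shape)  # (time frames, frequency bins)
```

The transform runs entirely on the client machine; only the resulting frequency representation would ever be transmitted.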

From a product perspective, all Cyanite features can be fully leveraged.

For us at Synchtank, the spectrogram-based upload was key. Many of our clients are cautious about where their audio goes, and this approach lets us use high-quality AI tagging and search without transferring any copyrighted audio. That balance, confidence for our customers without compromising on quality, is what made the difference for us.

Amy Hegarty

CEO, Synchtank

Learn more: What are spectrograms, and how can they be applied to music?

Security option 3: pseudo-on-premise deployment via the Cyanite Audio Analyzer on the AWS Marketplace

For organizations with the highest security and compliance requirements, Cyanite also offers a pseudo-on-premise deployment option via the AWS Marketplace. In this setup, Cyanite’s tagging engine runs entirely inside the customer’s own AWS cloud infrastructure via the Cyanite Audio Analyzer.

This approach provides:

  • Complete pseudo-on-premise processing
  • Zero data transfer outside your AWS cloud environment
  • Full control over storage, access, and compliance
  • Tagging accuracy identical to cloud-based workflows

This option is typically chosen by film studios, broadcasters, public institutions, and organizations working with unreleased or highly sensitive material that must pass strict internal or external audits.

Because the pseudo-on-premise container operates in complete isolation (no internet connection), search-based features—including Similarity Search, Free Text Search, and Advanced Search—are not available in this setup. In pseudo-on-premise environments, Cyanite therefore focuses exclusively on audio tagging and metadata generation.

Important note: The rates on the AWS Marketplace are intentionally high to deter fraudulent activity. Please contact us for our enterprise rates and find the best plan for your needs.

Choosing the right privacy model for your catalog

Selecting the right setup depends less on catalog size and more on how tightly you need to control where your audio lives. A useful way to frame the decision is to consider how much data movement your internal policies allow.

In practice, teams tend to choose based on the following considerations:

  • GDPR cloud processing works well when secure external processing is acceptable.
  • Zero-audio pipelines suit teams that cannot transfer audio but can share abstract representations.
  • Pseudo-on-premise deployment is best for environments requiring complete isolation.

All three options deliver the same tagging depth, consistency, and accuracy. The difference lies entirely in how data moves, or doesn’t move, between systems.

Final thoughts

Using AI with music requires trust—trust that audio is handled responsibly, that ownership is respected, and that workflows adapt to real-world constraints rather than forcing compromises. Cyanite’s privacy-first architecture is designed to uphold that trust, whether you prefer cloud-based processing, a zero-audio pipeline, or a fully isolated pseudo-on-premise deployment.

If you’d like to explore which setup best fits your catalog, workflow, and compliance needs, you can review the available integration options.

FAQs

Q: Where is my audio processed when using Cyanite’s cloud setup?

A: In the standard setup, audio is processed on secure servers located in the EU and handled in full compliance with GDPR. Audio is not shared with third parties and remains your property at all times.

Q: Can I use Cyanite without sending audio files at all?

A: Yes. With the zero-audio pipeline, you convert audio locally into spectrograms and send only those abstract frequency representations to Cyanite. The original audio never leaves your environment, while full tagging results are still generated.

Q: What is the difference between the zero-audio pipeline and pseudo-on-premise deployment?

A: The zero-audio pipeline sends spectrograms to Cyanite’s cloud for analysis. The pseudo-on-premise deployment runs the Cyanite Audio Analyzer entirely inside your own AWS cloud infrastructure, which is cut off from the internet and connected only to your systems. Pseudo-on-premise offers maximum isolation but supports tagging only, without search features.

Q: Are Similarity Search and Free Text Search available in all privacy setups?

A: Similarity Search, Free Text Search, and Advanced Search are available in cloud-based and zero-audio pipeline workflows. In fully pseudo-on-premise deployments, Cyanite focuses exclusively on tagging and metadata generation due to the isolated environment.

Q: Which privacy option is right for my catalog?

A: That depends on your internal security, legal, and compliance requirements. Teams with standard protection needs often use GDPR cloud processing. Those with higher sensitivity choose the zero-audio pipeline. Organizations requiring full isolation opt for pseudo-on-premise deployment. Cyanite supports all three.