
Google AI tool uses written descriptions to create music

A paper describing a music-making artificial intelligence (AI) tool was published this week by Google researchers.

The AI music tool, called MusicLM, is not the first of its kind to be released. What sets it apart, the examples Google provides suggest, is its ability to create music from only a short set of descriptive words.

Such AI tools demonstrate how complex computer systems have been taught to behave in human-like ways.

Tools like ChatGPT can quickly produce written documents that are comparable to human efforts. ChatGPT and similar systems require powerful computers to run their complex machine-learning models. OpenAI, a company based in San Francisco, launched ChatGPT late last year.

These systems, including AI voice generators, are trained on vast amounts of data to learn and replicate various kinds of content, such as writing, design elements, art, or music.

ChatGPT has recently garnered a lot of attention for its capacity to generate intricate writings and other content from a straightforward natural language description.

MusicLM from Google

The MusicLM system is explained by Google engineers as follows:

A user’s first step is to think of a few words that best describe the kind of music they want the tool to make.

A user could, for instance, enter the following succinct phrase into the system: “a continuous calming violin backed by a soft guitar sound.” The descriptions that are entered may include various musical genres, instruments, or other sounds that already exist.
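MusicLM itself has not been released, but the interface the researchers describe can be pictured as a function that takes a text description and a desired length. The sketch below is a hypothetical stand-in, not the real MusicLM API: the function name, parameters, and return value are all assumptions made for illustration.

```python
# Hypothetical sketch of a text-to-music interface like the one described.
# MusicLM is not publicly available; this stub only prepares the kind of
# request a real text-conditioned generator would receive.

def generate_music(description: str, duration_seconds: int = 30) -> dict:
    """Stand-in for a text-conditioned music generator.

    A real system would return audio; this stub returns the request
    it would send to the model.
    """
    return {
        "prompt": description,
        "duration_seconds": duration_seconds,
        "status": "request prepared",
    }

request = generate_music(
    "a continuous calming violin backed by a soft guitar sound"
)
print(request["prompt"])
```

The point of the sketch is the shape of the interaction: a short natural-language phrase is the only input the user supplies.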

MusicLM produced a number of distinct music examples that were made available online. Some of the music that was made was based on just one or two words, like “jazz,” “rock,” or “techno.” Other examples were generated by the system from more in-depth descriptions that included entire sentences.

For example, Google researchers provided MusicLM with the following instructions: “The main soundtrack of an arcade game. It is fast-paced and upbeat, with a catchy electric guitar riff. The music is repetitive and easy to remember, but with unexpected sounds…”

In the final recording, the music stays very close to what was described. According to the team, the more detailed the description, the better the system's results can be.

The MusicLM model operates much like the machine-learning systems behind ChatGPT. Because they are trained on huge amounts of data, these tools can produce human-like results. The systems are fed a wide variety of materials to enable them to acquire complex skills for creating realistic works.

According to the team, the system can also create examples based on a person’s own singing, humming, whistling, or instrument playing, in addition to creating new music from written descriptions.

The tool “produces high-quality music…over several minutes, while being faithful to the text conditioning signal,” according to the researchers.

The MusicLM models have not yet been made available to the general public by the Google team. This is in contrast to ChatGPT, which was made accessible online in November for users to try out.

However, MusicCaps, a “high-quality dataset” composed of over 5,500 music-writing pairs prepared by professional musicians, was announced by Google. This action was taken by the researchers to aid in the creation of additional AI music generators.
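Each MusicCaps entry pairs a piece of music with a caption written by a professional musician. A minimal way to represent one such text-music pair in code is sketched below; the class and field names are assumptions for illustration, not the dataset's actual schema.

```python
# Hypothetical sketch of one MusicCaps-style record: an audio clip
# paired with an expert-written text caption.
from dataclasses import dataclass

@dataclass
class MusicCaption:
    clip_id: str   # identifier of the audio clip
    caption: str   # musician-written description of the music

# An illustrative pair, in the spirit of the dataset's 5,500 entries.
pair = MusicCaption(
    clip_id="example_0001",
    caption="A slow blues track with a warm electric guitar lead.",
)
print(pair.caption)
```

Datasets of such pairs are what let a model learn the mapping from text descriptions to music.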

According to the MusicLM researchers, they are confident that they have developed a novel instrument that will enable anyone to quickly and easily produce music selections of high quality. However, the team stated that it also recognizes some machine learning-related risks.

One of the most significant issues the researchers identified was “biases present in the training data.” Bias can mean including too much of one kind of material and not enough of another. The researchers stated that this “raises a question about appropriateness for music generation for cultures underrepresented in the training data.”

The team stated that it intends to continue to study any system results that could be regarded as cultural appropriation. Through additional development and testing, the objective would be to reduce biases.

In addition, the researchers stated that they intend to continue improving the system to include better voice and music quality, text conditioning, and the generation of lyrics.

Raeesa Sayyad
