Unravel AI with Latent Terrain and explore neural synthesis in Max

“I’m not particularly interested in typing prompts and building things; I’m interested in breaking and analyzing prompts,” writes Jasper Shuoyang Zheng while introducing Latent Terrain. Now you can play the strange world of neural audio codecs like an instrument and transform your own sounds into new textures, using intuitive, elegantly designed open source Max externals and a UI.

There’s really a lot to say about this general field of activity. Unlike the hyped AI in music technology that steals the headlines, the data source is your own sound. And the processing is done locally, not in a data center where it sucks water. And…well, it’s just beautiful and weird and can be played using instrumental techniques.

Naturally, this attracts musicians and experimentalists, because the neural network goes back to being a tool for discovering new sounds instead of making everything sound the same.

Latent Terrain is a great place to start because it’s very accessible to Max users, especially those who are already using FluCoMa or Data Knot. It generates a visual map, a kind of warped texture that you can navigate with your mouse or a controller as you like. You can even train your own tiny neural networks directly in Max and watch your tones crunch, crumble, and melt into new material. You can carefully build sound libraries and graph paths through them. It’s very personal.
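To make the idea of a playable map a little more concrete, here is a minimal Python sketch of the general concept: a tiny network that turns a 2D control position into a latent vector. This is an illustration under stated assumptions, not Latent Terrain's actual implementation. The names `make_terrain` and `trajectory`, the latent size, and the sinusoidal hidden layer are all hypothetical, and the weights are random rather than trained. In a real setup, each latent vector would be handed to an audio autoencoder's decoder to produce sound, a step omitted here.

```python
import math
import random

LATENT_DIM = 8   # hypothetical latent size; real autoencoders vary
HIDDEN = 16      # hypothetical hidden-layer width


def make_terrain(seed=0, latent_dim=LATENT_DIM, hidden=HIDDEN):
    """Build a tiny random (untrained) network: (x, y) -> latent vector."""
    rng = random.Random(seed)
    w1 = [[rng.gauss(0, 2.0) for _ in range(2)] for _ in range(hidden)]
    b1 = [rng.uniform(0, 2 * math.pi) for _ in range(hidden)]
    w2 = [[rng.gauss(0, 1.0 / math.sqrt(hidden)) for _ in range(hidden)]
          for _ in range(latent_dim)]

    def terrain(x, y):
        # A sinusoidal hidden layer gives the map a smooth, wavy shape.
        h = [math.sin(w[0] * x + w[1] * y + b) for w, b in zip(w1, b1)]
        # A linear output layer produces one value per latent dimension.
        return [sum(wi * hi for wi, hi in zip(row, h)) for row in w2]

    return terrain


def trajectory(terrain, start, end, steps):
    """Latent vectors along a straight path between two control points.

    Requires steps >= 2 so that both endpoints are included.
    """
    out = []
    for i in range(steps):
        t = i / (steps - 1)
        x = start[0] + t * (end[0] - start[0])
        y = start[1] + t * (end[1] - start[1])
        out.append(terrain(x, y))
    return out


terrain = make_terrain(seed=42)
path = trajectory(terrain, (0.0, 0.0), (1.0, 1.0), steps=5)
for z in path:
    print(" ".join(f"{v:+.2f}" for v in z))
```

Dragging a mouse or controller across the map corresponds to calling `terrain(x, y)` continuously. Because the mapping is smooth, nearby positions give nearby latent vectors, which is what makes gliding across a surface like this feel like playing an instrument.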

To get a feel for the sound, here’s a great example by Keigo Yoshida. It uses a technique that sonifies EEG measurements.

Beyond tech demos, there are also artistic works that treat the technology as a way to make sense of sound. I love Liu Jiatong’s work here, especially because on my first visit to Beijing I was struck by the hutongs and wanted to wander through them forever.

I’d like to get more details from Jasper, but consider this a first look. The project will also be presented at NIME, the International Conference on New Interfaces for Musical Expression, in London later this month.

Liu Jiatong’s nn/mémoire is a virtual gallery soundscape built from archival recordings of Beijing’s hutong districts, a rapidly disappearing urban soundworld. The terrain becomes a spatially navigable ambient archive. Liu explains that unpredictability is not a problem to be eliminated; “learning how to deal with unpredictability” is a core design issue.

It’s also all very well documented, so I encourage you to read it. This article by Jasper is a great guide to this work and its potential.

Latent Domain: Anatomy of Neural Audio Codecs

Project page with research content:

https://jasper-zheng.github.io/nn_terrain/

Example Max for Live devices are inbound and already support some common choices, allowing you to use different audio autoencoders for different results. More generally, there’s been a lot of movement on the Stable Audio front as well, so I’ll try to keep up.

Download for Max (and Max for Live) on macOS and Windows. (And I’m sure someone is thinking about Pure Data, so stay tuned…)

https://jasper-zheng.github.io/nn_terrain/installation


Tags: AI, autoencoder, external, FLOSS, free as in freedom, Jasper Shuoyang Zheng, latent space, latent terrain, Mac, machine learning, macOS, Max, Max for Live, neural audio, neural audio codec, neural network, NIME, open source, research, software, sound art, synthesis, Windows
