The Forgotten OGs of AI Music
Discover the pioneers who developed AI music when no one was watching.
Much has happened between the pioneering AI music of the 1980s and today's generative AI wave.
Along that journey, the connectionists' work was forgotten during the AI winter.
They didn't have fancy million-dollar compute clusters.
They had VAX-11/780 machines running at 0.1 MFLOPS.
This is the story of the connectionists: the OGs (original gangsters) whose work quietly influenced the models running today.
No GPUs required.
Only AI music and research from the 80s, 90s, and early 2000s.
AI winter facts
Expert systems were the dominant AI approach in the 1980s.
Expert systems used databases of expert knowledge.
Expert systems failed to scale and were expensive to build and maintain.
Disillusionment with expert systems led to reduced funding.
Connectionists were inspired by how the human brain works.
Connectionists use models known as artificial neural networks.
Connectionist methods faced skepticism due to the limited computational power and datasets available at the time.
Connectionists: 1980s and 1990s
Despite the funding drought, the AI winter period saw a steady series of works on algorithmic composition.
These works kept the field relevant from the '80s through the 2000s.
That is the contribution of the so-called connectionists to the field of AI music.
Yet these early works remain largely unknown to most contemporary researchers.
Todd, 1988: "A sequential network design for musical applications," in Proceedings of the Connectionist Models Summer School.
Lewis, 1988: "Creation by Refinement: A creativity paradigm for gradient descent learning networks," in International Conference on Neural Networks.
This first wave of work was initiated in 1988 by Lewis and Todd, who proposed using neural networks for automatic music composition.
Lewis used a multi-layer perceptron for his algorithmic approach to composition, called "creation by refinement", which is based, in essence, on the same idea as DeepDream: using gradients to create art.
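To make this concrete, here is a minimal sketch of the creation-by-refinement idea in modern PyTorch. It is not Lewis's original code: the critic network, its sizes, and the 16-step pattern are all made up for illustration. A small network scores a pattern, and gradient ascent on the input (rather than the weights) refines random noise toward a higher score.

```python
import torch
import torch.nn as nn

torch.manual_seed(0)

# Toy "critic": an MLP meant to score 16-step pitch patterns as "musical".
# In a real setup it would first be trained on actual examples; here it
# keeps its random initial weights purely to illustrate the refinement loop.
critic = nn.Sequential(nn.Linear(16, 32), nn.Tanh(), nn.Linear(32, 1))

# Creation by refinement: start from noise and ascend the critic's score
# by taking gradients with respect to the *input*, not the weights.
pattern = torch.randn(1, 16, requires_grad=True)
optimizer = torch.optim.SGD([pattern], lr=0.1)

for step in range(200):
    optimizer.zero_grad()
    score = critic(pattern).mean()
    (-score).backward()  # negate: optimizers minimize, we want to maximize
    optimizer.step()

print(pattern.detach().round())  # the refined pattern, quantized to integers
```

DeepDream does much the same thing two decades later, only with a deep convolutional network and images instead of an MLP and note patterns.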

Todd experimented with Jordan and Elman (auto-regressive) neural networks to generate music sequentially, a principle that is still valid after all these years.
Many kept using this idea (auto-regressive, next-token modelling) throughout the years, as the sketch after this list shows:
2000s: Eck and Schmidhuber, who proposed using LSTMs for sequential algorithmic composition (see below).
2020s: ChatGPT, to name a more recent example, makes use of this same causal principle.
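Here is a minimal sketch of that auto-regressive principle, independent of any particular architecture: a made-up bigram table over five pitches stands in for a learned model (Jordan/Elman RNN, LSTM, or transformer), and each sampled note is fed back as the context for the next one.

```python
import numpy as np

rng = np.random.default_rng(0)

# Toy bigram model over a pentatonic pitch set: P(next note | current note).
# These probabilities are invented for illustration; a real model would
# learn them (or a far richer context) from data.
notes = ["C", "D", "E", "G", "A"]
transition = np.array([
    [0.1, 0.4, 0.2, 0.2, 0.1],
    [0.3, 0.1, 0.4, 0.1, 0.1],
    [0.2, 0.3, 0.1, 0.3, 0.1],
    [0.2, 0.1, 0.3, 0.1, 0.3],
    [0.4, 0.1, 0.1, 0.3, 0.1],
])

# Auto-regressive loop: each generated note becomes the context for the next.
melody = [0]  # start on C
for _ in range(15):
    melody.append(rng.choice(5, p=transition[melody[-1]]))

print(" ".join(notes[i] for i in melody))
```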
But, if their ideas were correct, why did they not succeed?
Well, in Lewis' words: "it was difficult to compute much of anything".
While modern GPUs can deliver hundreds or even thousands of TFLOPS, the VAX-11/780 that Lewis used in 1988 offered just 0.1 MFLOPS: a gap of roughly ten orders of magnitude.
Although Lewis and Todd worked on algorithmic music composition, other connectionists explored different musical tasks.
In 1989, Laden and Keefe worked on chord classification.
In 1995, Matityaho and Furst classified spectrograms into pop or classical music.
And in 1997, Dannenberg et al. studied how to classify MIDI scores into music styles like "syncopated" or "pointillistic".
LSTMs: early 2000s
Today we are in the era of transformers (the T in ChatGPT).
But back then, right before the deep learning days, LSTMs were popular.
LSTMs are a type of artificial neural network that can learn long-term dependencies.
Eck and Schmidhuber used LSTMs to learn the (long-term) musical structure of blues music.
Eck & Schmidhuber, 2002: "Finding temporal structure in music: Blues improvisation with LSTM recurrent networks," in IEEE Workshop on Neural Networks for Signal Processing.
Listen to an LSTM blues improvisation by Eck & Schmidhuber from 2002.
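For the curious, here is a minimal PyTorch sketch of next-note training with an LSTM, in the spirit of (but far simpler than) Eck and Schmidhuber's setup; the toy blues-like pattern, vocabulary size, and hyperparameters are all assumptions made for illustration.

```python
import torch
import torch.nn as nn

torch.manual_seed(0)

# Toy corpus: a short repeating blues-like pitch pattern, encoded as integers.
VOCAB = 8
data = torch.tensor([0, 3, 5, 6, 5, 3] * 32)
inputs, targets = data[:-1], data[1:]

# Next-note model: embed the current note, let an LSTM carry the long-term
# context, and predict a distribution over the following note.
class NoteLSTM(nn.Module):
    def __init__(self, vocab=VOCAB, hidden=32):
        super().__init__()
        self.embed = nn.Embedding(vocab, hidden)
        self.lstm = nn.LSTM(hidden, hidden, batch_first=True)
        self.head = nn.Linear(hidden, vocab)

    def forward(self, x):
        h, _ = self.lstm(self.embed(x))
        return self.head(h)

model = NoteLSTM()
opt = torch.optim.Adam(model.parameters(), lr=1e-2)
loss_fn = nn.CrossEntropyLoss()

for epoch in range(50):
    opt.zero_grad()
    logits = model(inputs.unsqueeze(0))         # (1, T, VOCAB)
    loss = loss_fn(logits.squeeze(0), targets)  # next-note cross-entropy
    loss.backward()
    opt.step()

print(f"final loss: {loss.item():.3f}")
```

Generation then works exactly like the auto-regressive loop sketched earlier: sample a note from the model's output distribution and feed it back in.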
Further reading
Disclaimer. The views expressed are my own and do not reflect the opinions or positions of my employer.