// Xavier Fresquet — SCAI · Sorbonne Abu Dhabi
A lab where medieval chant manuscripts meet large language models, where Renaissance polyphony gets a GPU, and where the boundary between musicologist and ML engineer has long since evaporated.
Captal, from Latin capitalis by way of Gascon: "chief lord". A medieval feudal title in Gascony, immortalized by Jean Froissart in his Chronicles as the Captal de Buch, champion of the Black Prince. The name came back during a White Rose Fellowship in Sheffield, where I first read Froissart. It stuck.
I'm a musicologist who became suspicious of computers early on, then couldn't stop. Trained at Bordeaux Montaigne and Sorbonne, I did a PhD in Music & Digital Humanities (2011), with research stays at the University of Sheffield (White Rose Scholarship), Brown University, UC Berkeley, and a sabbatical year at UWA in Perth that confirmed: no matter how far you go, you're still thinking about medieval manuscripts.
Back in Paris, I co-founded SCAI — the Sorbonne Center for Artificial Intelligence — alongside mathematician Gérard Biau. Running SCAI meant organizing a few hundred seminars, managing European grants across three languages, and convincing a lot of scientists that the humanities might actually have something useful to say about AI.
In February 2026, I moved to Abu Dhabi to become Director of SCAI at Sorbonne Abu Dhabi. The lab followed. The medieval sources did not, but the servers talk to Eurasia just fine over SSH.
The CAPTAL Lab is where I house everything: AI models for music generation, computational musicology, open datasets, DH tools, and the occasional philosophical detour into AI ethics. The name stands for Computational Analysis of Polyphonic and Traditional Artworks Lab — which nobody actually says out loud.
The Musiconis database holds thousands of medieval musical representations — instruments, performers, angels with lutes, suspicious-looking theorists. We've built an agentic AI system that reasons over this corpus using multimodal LLMs. Article submitted to Digital Medievalist, June 2026.
A large-scale symbolic music corpus spanning roughly five centuries, with harmonic annotations. The pipeline runs on an HPC cluster named Eurasia. It is cooking. We check on it regularly.
Fine-tuning Gemma on French Baroque polyphony (think Charpentier, lots of it) with preference-based alignment. Multiple training phases, each more elaborately named than the last. Results pending, spirits high.
Can a small language model reason through counterpoint? We're building chain-of-thought reasoning traces from music theory treatises and fine-tuning DeepSeek-R1. It thinks before it writes notes. Very à la mode.
A tripartite Transformer + Diffusion + VAE architecture for symbolic music generation. Soprano inpainting: done. The model has been running on Eurasia long enough to know the cluster's maintenance schedule.
Computational philology of Latin music performance terminology. ~5,200 words, a full NLP pipeline, and more ablative absolutes than strictly necessary. Targeting Digital Humanities Quarterly.
52 sacred works by Marc-Antoine Charpentier, annotated and published on Zenodo (DOI: 10.5281/zenodo.20450425). CITATION.cff, full metadata, multi-format. It's out there. Properly.
Gradio app for Baroque style transfer: feed it a melody, get back something that sounds vaguely like Charpentier. Powered by a fine-tuned NotaGen. Running locally on port 8012. Works surprisingly often.
Transforming the Euterpe iconographic database into a proper RDF/SKOS knowledge graph, with a data paper and migration pipeline into Musiconis (~1,000 medieval records). Unglamorous, essential.
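For readers unfamiliar with SKOS, here is a minimal sketch of what one migrated record could look like once serialised as Turtle. It is illustrative only: the base IRI, the identifiers `harp` and `chordophone`, and the helper names are hypothetical and do not reflect the actual Euterpe or Musiconis schema. Only the SKOS namespace and the `skos:Concept`, `skos:prefLabel`, and `skos:broader` terms come from the W3C vocabulary.

```python
from typing import Optional

# Standard W3C SKOS namespace declaration for a Turtle file.
SKOS_PREFIX = "@prefix skos: <http://www.w3.org/2004/02/skos/core#> ."


def escape_literal(text: str) -> str:
    """Escape backslashes and double quotes for a Turtle string literal."""
    return text.replace("\\", "\\\\").replace('"', '\\"')


def concept_to_turtle(
    base_iri: str,
    concept_id: str,
    pref_label: str,
    lang: str = "en",
    broader_id: Optional[str] = None,
) -> str:
    """Serialise one concept as a Turtle block (hypothetical record layout)."""
    statements = [f'skos:prefLabel "{escape_literal(pref_label)}"@{lang}']
    if broader_id is not None:
        statements.append(f"skos:broader <{base_iri}{broader_id}>")
    body = " ;\n    ".join(statements)
    return f"<{base_iri}{concept_id}> a skos:Concept ;\n    {body} ."


if __name__ == "__main__":
    # Hypothetical example: "harp" filed under a broader "chordophone" concept.
    print(SKOS_PREFIX)
    print()
    print(concept_to_turtle("https://example.org/euterpe/", "harp", "harp",
                            broader_id="chordophone"))
```

A real migration would more likely use a dedicated RDF library than string formatting; the point here is only the shape of the triples.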
A structured lexicon of 19th-century music performance terminology, pulling from French, German (Meyers — yes, the whole thing), and English sources. 3,760 entries. Zenodo deposit imminent.
On plainchant that happens outside the book — informal, contextual, sung without a score. Article complete, titled "Bizarrifying the Past", targeting Filigrane. Related to a November 2025 cathedral concert.
145 automated tests covering fugue exposition detection, subject/answer identification, and voice-leading validation. Phases 0–4 complete. Bach would still find something to complain about.
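To give a flavour of what a voice-leading check involves, here is a minimal sketch that flags parallel fifths and octaves between two voices. It is not the project's code: the function name, the MIDI-number encoding, and the one-note-per-beat alignment are all assumptions made for illustration.

```python
from typing import List, Tuple

# Interval classes (semitones modulo 12) treated as perfect consonances.
PERFECT = {0: "octave/unison", 7: "fifth"}


def parallel_perfects(upper: List[int], lower: List[int]) -> List[Tuple[int, str]]:
    """Return (index, interval name) for each arrival on a parallel fifth or octave.

    Both voices are equal-length lists of MIDI pitch numbers, assumed to be
    aligned note against note.
    """
    if len(upper) != len(lower):
        raise ValueError("voices must have the same length")
    hits = []
    for i in range(1, len(upper)):
        prev = (upper[i - 1] - lower[i - 1]) % 12
        curr = (upper[i] - lower[i]) % 12
        # A positive product means both voices moved, and in the same direction.
        same_direction = (upper[i] - upper[i - 1]) * (lower[i] - lower[i - 1]) > 0
        if prev == curr and curr in PERFECT and same_direction:
            hits.append((i, PERFECT[curr]))
    return hits


if __name__ == "__main__":
    # G-A over C-D moves in parallel fifths; the last step lands on a third.
    print(parallel_perfects([67, 69, 71], [60, 62, 67]))
```

Because it compares interval classes modulo 12, this sketch treats compound intervals (a twelfth, a double octave) like their simple forms, and it ignores rests, ties, and hidden fifths.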
Applying phylogenetic and cladistic methods to the morphological evolution of the medieval harp via the Musiconis dataset. With Edmundo Camacho (UNAM). Article in co-author review, targeting Cahiers d'ethnomusicologie.
LoRA fine-tuning of Gemma on Renaissance polyphony. Training completed on Eurasia (~5h). The adapter exists; it sounds like it read Josquin but is unsure about Palestrina.
A retrieval-augmented generation assistant over the MusiSorbonne archival corpus. Useful for navigating large, heterogeneous musicological archives. Prototype running; article drafted.
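For the curious, the retrieval half of such an assistant can be sketched in a few lines. This is a toy stand-in, not the prototype: it swaps in bag-of-words cosine similarity where a real system would probably use learned embeddings, and the sample passages, function names, and prompt wording are all hypothetical.

```python
import math
import re
from collections import Counter
from typing import List


def tokenize(text: str) -> List[str]:
    """Lowercase the text and split it into word tokens."""
    return re.findall(r"\w+", text.lower())


def cosine(a: Counter, b: Counter) -> float:
    """Cosine similarity between two term-count vectors."""
    dot = sum(a[t] * b[t] for t in a.keys() & b.keys())
    norm = math.sqrt(sum(v * v for v in a.values())) * math.sqrt(
        sum(v * v for v in b.values())
    )
    return dot / norm if norm else 0.0


def retrieve(query: str, docs: List[str], k: int = 2) -> List[str]:
    """Return up to k passages most similar to the query, best first."""
    q = Counter(tokenize(query))
    scored = [(cosine(q, Counter(tokenize(d))), d) for d in docs]
    scored.sort(key=lambda pair: pair[0], reverse=True)
    return [d for score, d in scored[:k] if score > 0]


def build_prompt(query: str, passages: List[str]) -> str:
    """Assemble the text that would be handed to a language model."""
    context = "\n".join(f"- {p}" for p in passages)
    return f"Answer using only these archive passages:\n{context}\n\nQuestion: {query}"


if __name__ == "__main__":
    # Hypothetical archive snippets, invented for this example.
    archive = [
        "Concert programme for a harp recital, 1923",
        "Letter about organ restoration in the chapel",
        "Inventory of harp manuscripts and scores",
    ]
    question = "harp manuscripts"
    print(build_prompt(question, retrieve(question, archive)))
```

The generation step is deliberately left out: the prompt returned by `build_prompt` is what would be passed to whichever model the assistant uses.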
MELODY covers four interconnected research tracks, all running in parallel on the CAPTAL-LAB infrastructure (local machines + Eurasia GPU cluster). The target output: open datasets, reproducible models, and a substantial contribution to the ISMIR community. 2026 is just the start.
A large-scale symbolic music corpus spanning five centuries with harmonic annotations. Pipeline running on Eurasia. ~5,000 pieces expected. The infrastructure for everything else.
Two parallel approaches to symbolic music generation: a fine-tuned Gemma with preference alignment on French Baroque polyphony, and a Transformer+Diffusion+VAE tripartite architecture for soprano inpainting and conditioned generation.
Can a small language model reason through counterpoint? Chain-of-thought traces from music theory treatises, fine-tuning DeepSeek-R1 on voice-leading tasks. The model thinks before it writes notes. TISMIR target.
Open-source toolchain for symbolic music ingestion, conversion, and analysis. Drop a score file in; get ABC, MusicXML, AST, and AI-driven musicological analysis out. Freely available on GitHub and HAL.
Since February 2026, CAPTAL-LAB has had a Gulf outpost. With my move to direct SCAI at Sorbonne Abu Dhabi, the research agenda has expanded to include Gulf soundscapes, local heritage digitisation, and Year of AI 2026 coordination. The timezone is challenging; the light is not.
A DH seminar series at SUAD — informal by design, more garden table than podium. Topics move between AI methods, heritage data, and questions without clean answers yet. The project has a physical dimension: students are building an AI-managed garden on campus, with soil sensors and a Ma Hawa atmospheric water generator — a machine that makes water from air. In Abu Dhabi, that's not a metaphor.
Computational and ethnomusicological study of migrant musical worlds, cultural policy, and sonic space in Abu Dhabi. 175 entries, full draft complete. Targeting Ethnomusicology / Music & Politics.
A web application archiving and spatialising soundscape recordings from Abu Dhabi. Prototype operational. The data paper and Zenodo deposit are in progress. Listening to cities as data.
A web radio project for Sorbonne Abu Dhabi. Because a university should have a radio station, and someone should make playlists that algorithmically transition from Lassus to Mohamed Abdo without warning.
A slightly crazy idea, full of hope. CERN brought nations together around a particle accelerator and kept channels open across the Iron Curtain. What if Abu Dhabi — crossroads between blocs, acceptable to all sides — hosted a neutral public institution for frontier AI: open science, shared evaluation, safety research, and fellows from the Global South? Not a think-tank. An instrument. A paper is being written; the argument is more serious than it sounds.
An idea from Sophia Mahroug: a Digital Humanities lab focused on archiving censored and vanishing web content — legislation behind geofences, blogs erased after regime change, state sites rewritten in silence. Beyond preservation, a space to bring the DH community in Abu Dhabi together around shared tools, methods, and questions that matter. A seminar, a lab, and — if we get the timing right — a research project.
These are people I work with, have worked with for a long time, and genuinely like. Frédéric Billiet (Paris-Sorbonne) has been my closest collaborator in medieval musicology for over a decade — co-creator of Musiconis, and the kind of colleague whose rigour makes the computational experiments feel honest. Gérard Biau (Sorbonne) co-founded SCAI with me and has the rare quality of making a mathematician and a medievalist feel like they're working on the same problem.
Susan Boynton (Columbia) brought a wonderful chant scholarship perspective to the Musiconis project. Jérôme Nika (IRCAM) thinks about AI and musical creativity in ways I find consistently inspiring. Valérie Nunes-Le Page navigates the border between music, AI, and artistic practice with real elegance. Fouad Aouinti and Victoria Eyharabide (both at LIP6/Sorbonne) have been great collaborators on deep learning and medieval image analysis. Motasem Alrahabi and Pierre-Marie Chauvin (Sorbonne) are colleagues whose paths keep crossing mine in the best ways.
Edmundo Camacho (UNAM) spent 2024–2025 applying phylogenetics to medieval harps, which is exactly as interesting as it sounds. Sabrina Moura (UNICAMP) collaborated on the Arabian Gulf DH project, a genuinely stimulating collective effort. Ana Amorim dragged me into Unwalked Days, a beautiful project about mental maps and urban memory, and I'm glad she did.
In Abu Dhabi: David Wrisley (NYU Abu Dhabi) is the kind of DH interlocutor you hope to find when you land in a new city. Proscovia Svärd and Lama Tarsissi (SUAD) make the day-to-day research life here genuinely good. And Louvre Abu Dhabi has been an open and generous partner for thinking about AI and cultural heritage in the Gulf.
A selection of recent output — articles, preprints, and open datasets. For the full picture, see my HAL page.