SMC 2023 – Proceedings of the Sound and Music Computing Conference 2023
Royal College of Music and KTH Royal Institute of Technology, Stockholm, Sweden
Bresin, R., & Falkenberg, K. (Eds.). 2023. Proceedings of the 20th Sound and Music Computing Conference, June 15-17, 2023, Stockholm, Sweden.
DOI: 10.5281/zenodo.8136568
ISBN: 978-91-527-7372-7
Conference website: smcnetwork.org/smc2023/
Video recordings of the conference concerts and keynotes: www.youtube.com/@navetresearch

Table of Contents

CREPE NOTES: A NEW METHOD FOR SEGMENTING PITCH CONTOURS INTO DISCRETE NOTES (Xavier Riley and Simon Dixon), p. 1
Design Process in Visual Programming: Methods for Visual and Temporal Analysis (Jack Armitage, Thor Magnusson and Andrew McPherson), p. 6
Comparing various sensors for capturing human micromotion (Maham Riaz, Finn Upham, Kayla Burnim, Laura Bishop and Alexander Refsum Jensenius), p. 14
Introducing stateful conditional branching in Ciaramella (Paolo Marrone, Stefano D'Angelo and Federico Fontana), p. 21
F0 ANALYSIS OF GHANAIAN POP SINGING REVEALS PROGRESSIVE ALIGNMENT WITH EQUAL TEMPERAMENT OVER THE PAST THREE DECADES: A CASE STUDY (Iran Roman, Daniel Faronbi, Isabelle Burger-Weiser and Leila Adu-Gilmore), p. 27
Conditional sound effects generation with regularized WGAN (Yunyi Liu and Craig Jin), p. 34
INTERACTIVE MUSIC SCORE COMPLETION (Gregory Beller, Jacob Sello, Georg Hajdu and Thomas Görne), p. 41
STRUCTURING MUSIC FOR ANY SOURCES (Orestis Karamanlis), p. 45
Daisy Dub: a modular and updateable real-time audio effect for music production and performance (Rasmus Kjærbo, Leo Fogadic, Oliver Bjørk Winkel and Stefania Serafin), p. 49
The effect of actuating the bass trombone second valve on the quality of note transition in legato (Renato Lisboa, Gustavo Machado, Thiago Campolina and Maurício Loureiro), p. 58
TickTacking – Drawing trajectories with two buttons and rhythm (Davide Rocchesso, Alessio Bellino and Antonino Perez), p. 63
VIBROTACTILE FEEDBACK ENHANCES PERCEIVED AROUSAL AND LISTENING EXPERIENCE IN MUSIC (Hanna Järveläinen and Eric Larrieux), p. 72
WebChucK IDE: A Web-Based Programming Sandbox for ChucK (Terry Feng, Celeste Betancur, Michael Mulshine, Chris Chafe and Ge Wang), p. 79
XR etudes for augmented piano (Giovanni Santini), p. 84
Dynamical Complexity Measurement with Random Projection: A Metric Optimised for Realtime Signal Processing (Chris Kiefer), p. 89
Temporality Across Three Media: Inner Transmissions (Julia Mills), p. 96
VocalHUM: real-time whisper-to-speech enhancement for patients with vocal frailty (Francesco Roberto Dani, Sonia Cenceschi and Alessandro Trivilini), p. 103
A Programmable Linux-Based FPGA Platform for Audio DSP (Pierre Cochard, Maxime Popoff, Antoine Fraboulet, Tanguy Risset, Stephane Letz and Romain Michon), p. 110
DJeye: Towards an Accessible Gaze-Based Musical Interface for Quadriplegic DJs (Fabio Bottarelli, Nicola Davanzo, Giorgio Presti and Federico Avanzini), p. 117
EFFICIENT SIMULATION OF ACOUSTIC PHYSICAL MODELS WITH NONLINEAR DISSIPATION (Riccardo Russo, Michele Ducceschi, Stefan Bilbao and Matthew Hamilton), p. 125
musif: a Python package for symbolic music feature extraction (Ana Llorens, Federico Simonetta, Martín Serrano and Álvaro Torrente), p. 132
Score-Informed MIDI Velocity Estimation for Piano Performance by FiLM Conditioning (Hyon Kim, Marius Miron and Xavier Serra), p. 139
A Web-Based MIDI 2.0 Monitor (Federico Avanzini, Vanessa Faschi and Luca Andrea Ludovico), p. 148
Web Applications for Automatic Audio-to-Score Synchronization with Iterative Refinement (Adriano Baratè, Goffredo Haus, Luca Andrea Ludovico, Giorgio Presti, Stefano Di Bisceglie, Alessandro Minoli and Davide Andrea Mauro), p. 154
Multi-Source Contrastive Learning for Musical Audio (Christos Garoufis, Athanasia Zlatintsi and Petros Maragos), p. 162
A Microcontroller-Based Network Client Towards Distributed Spatial Audio (Thomas Rushton, Romain Michon and Stéphane Letz), p. 170
Principal Component Analysis of binaural HRTF pairs (Georgios Marentakis), p. 178
EXPLORING POLYPHONIC ACCOMPANIMENT GENERATION USING GENERATIVE ADVERSARIAL NETWORKS (Danae Charitou, Christos Garoufis, Athanasia Zlatintsi and Petros Maragos), p. 186
A real-time cent-sensitive strobe-like tuning software based on spectral estimates of the Snail-Analyser (Thomas Hélie, Charles Picasso, Robert Piéchaud, Michaël Jousserand and Tom Colinot), p. 194
Accessible Sonification of Movement: A case in Swedish folk dance (Olof Misgeld, Hans Lindetorp and Andre Holzapfel), p. 201
The "Collective Rhythms Toolbox": an audio-visual interface for coupled-oscillator rhythmic generation (Nolan Lem), p. 209
AUTOMATIC LEGATO TRANSCRIPTION BASED ON ONSET DETECTION (Simon Falk, Bob Sturm and Sven Ahlbäck), p. 214
Post-mix vocoding and the making of All You Need Is Lunch (Miller Puckette and Kerry Hagan), p. 222
Using Deep Learning and Low-Frequency Fourier Analysis to Predict Parameters of Coupled Non-Linear Oscillators for the Generation of Complex Rhythms (Luc Döbereiner), p. 227
Music Boundary Detection Using Local Contextual Information Based on Implication-Realization Model (Kaede Noto, Akira Maezawa, Yoshinari Takegawa, Takuya Fujishima and Keiji Hirata), p. 232
Sound Design Strategies For Latent Audio Space Explorations Using Deep Learning Architectures (Kivanç Tatar, Kelsey Cotton and Daniel Bisig), p. 239
POLYSPRING: A PYTHON TOOLBOX TO MANIPULATE 2-D SOUND DATABASE REPRESENTATIONS (Victor Paredes, Jules Françoise and Frederic Bevilacqua), p. 247
OUR SOUND SPACE (OSS) – AN INSTALLATION FOR PARTICIPATORY AND INTERACTIVE EXPLORATION OF SOUNDSCAPES (Maurizio Goina, Roberto Bresin and Romina Rodela), p. 255
Developing and evaluating a Musical Attention Control Training game application (Anja Volk, Ermis Chalkiadakis, Sander Bakkes, Laurien Hakvoort and Rebecca S. Schaefer), p. 261
Modeling Piano Fingering Decisions with Conditional Random Fields (David Randolph, Barbara Di Eugenio and Justin Badgerow), p. 269
Generating symbolic music using diffusion models (Lilac Atassi), p. 277
Embodied Tempo Tracking with a Virtual Quadruped (Alex Szorkovszky, Frank Veenstra, Olivier Lartillot, Alexander Jensenius and Kyrre Glette), p. 283
A COMPARATIVE ANALYSIS OF LATENT REGRESSOR LOSSES FOR SINGING VOICE CONVERSION (Brendan O'Connor and Simon Dixon), p. 289
Quantifying the Extended Acceptance of Pioneering Art Music Through the Creation of Electroacoustic Music (Soma Arai, Hiroyuki Yaguchi, Hidefumi Ohmura, Ludger Brümmer and Takuro Shibayama), p. 296
A Qualitative Investigation of Binaural Spatial Music in Virtual Reality (Sandra Mahlamäki), p. 304
A digital toolbox for musical analysis of computer music: exploring music and technology through sonic experience (Michael Clarke, Frédéric Dufeu and Keitaro Takahashi), p. 312
Citation is not Collaboration: Music-Genre Dependence of Graph-Related Metrics in a Music Credits Network (Giulia Clerici and Marco Tiraboschi), p. 317
Ding-dong: Meaningful Musical Interactions with Minimal Input (Ciaran Frame), p. 323
A Comparative Computational Approach to Piano Modeling Analysis (Riccardo Simionato and Stefano Fasciani), p. 330
RESURRECTING THE VIOLINO ARPA: A MUSEUM EXHIBITION (Simon Rostami Mosen, Stefania Serafin, Ali Adjorlu, Ulla Hahn Ranmar and Marie Martens), p. 338
Song Popularity Prediction using Ordinal Classification (Michael Vötter, Maximilian Mayerl, Eva Zangerle and Günther Specht), p. 346
Real-Time Implementation of the Kirchhoff Plate Equation using Finite-Difference Time-Domain Methods on CPU (Zehao Wang, Stefan Bilbao, Tom Erbe and Miller Puckette), p. 354
Playing the Design: Creating Soundscapes through Playful Interaction (Ricardo Atienza, Hans Lindetorp and Kjetil Falkenberg), p. 362
WHAT IS THE COLOR OF CHORO? COLOR PREFERENCES FOR AN INSTRUMENTAL BRAZILIAN POPULAR MUSIC GENRE (Philip Berrez, Tiago Maranhao, Martin Kihl and Roberto Bresin), p. 370
Sculpting Algorithmic Pattern: Informal and Visuospatial Interaction in Musical Instrument Design (Jack Armitage, Thor Magnusson and Andrew McPherson), p. 377
"Video Accompaniment": Synchronous Live Playback for Score-Aligned Animation (Kaitlin Pet, Nikki Pet and Christopher Raphael), p. 385
Heat-sensitive sonic textiles: increasing awareness of the energy we save by wearing warm fabrics (Vincenzo Madaghiele, Arife Dila Demir and Sandra Pauletto), p. 395
Sonifying energy consumption using SpecSinGAN (Sandra Pauletto, Adrián Barahona-Ríos, Vincenzo Madaghiele and Yann Seznec), p. 403
A Real-Time Cochlear Implant Simulator - Design and Evaluation (Christina Steinhauer, Tobias Lykke Sønderbo and Razvan Paisa), p. 410
AN ARTISTIC AUDIOTACTILE INSTALLATION FOR AUGMENTED MUSIC (Jeremy Marozeau), p. 419
Salient Sights and Sounds: Comparing Visual and Auditory Stimuli Remembrance using Audio Set Ontology and Sonic Mapping (Laura McHugh, Chih-Wei Wu, Xuanling Xu and Kjetil Falkenberg), p. 426
musif: a Python package for symbolic music feature extraction

Ana Llorens 1, Federico Simonetta 2, Martín Serrano 2, Álvaro Torrente 1,2
1 Department of Musicology, Universidad Complutense de Madrid, Madrid, Spain
2 Instituto Complutense de Ciencias Musicales, Madrid, Spain
{first letter of the first name}{surname}@iccmu.es

ABSTRACT

In this work, we introduce musif, a Python package that facilitates the automatic extraction of features from symbolic music scores. The package includes the implementation of a large number of features, which have been developed by a team of experts in musicology, music theory, statistics, and computer science. Additionally, the package allows for the easy creation of custom features using commonly available Python libraries. musif is primarily geared towards processing high-quality musicological data encoded in MusicXML format, but it also supports other formats commonly used in music information retrieval tasks, including MIDI, MEI, Kern, and others. We provide comprehensive documentation and tutorials to aid in the extension of the framework and to facilitate the introduction of new and inexperienced users to its usage.

1. INTRODUCTION

The abstraction represented in music scores, which are symbolic representations of music, has been shown to be highly relevant for both cognitive and musicological studies. In cognitive studies, the abstraction process used by human music cognition to categorize sound is important to understand how we identify and perceive different musical aspects, such as timbres, pitches, durations, and rhythms [1]. In musicological studies, the abstraction represented in music scores is important as it provides a direct source of information to understand how the music was constructed. Throughout history, these aspects have been encoded in different forms, with common Western music notation being the most widely used in the Western world for centuries. Therefore, music notation is considered of paramount importance in the field of musicology.

In the field of sound and music computing, however, research has primarily focused on analyzing music in the audio domain, while other modalities such as images and scores have received less attention [2]. Researchers interested in applying machine learning methods to the analysis of music scores will likely seek methods for representing them in a suitable way. In the context of modern deep learning and machine learning, two main approaches have emerged: feature learning [3] and feature extraction [4, 5].

Copyright: © 2023 Ana Llorens et al. This is an open-access article distributed under the terms of the Creative Commons Attribution 3.0 Unported License, which permits unrestricted use, distribution, and reproduction in any medium, provided the original author and source are credited.

Feature learning – or representation learning – involves using algorithms to learn the features from the data in a way that is optimal for the specific statistical inference problem and is mainly applied with neural networks [6–8]; feature extraction, instead, involves the computation of generic and hand-crafted features, requiring further steps such as feature selection and dimensionality reduction.
Both approaches have their own advantages and disadvantages, and the choice between them depends on the specific task and the available data. Here we focus exclusively on the latter.

Feature extraction has been widely used in various machine learning tasks and has been partially successful in music computing [9–11]. However, a major drawback is the effort and time required to craft useful features for a specific task. To address this issue, researchers have previously proposed software tools that assist in extracting features from music, such as audio files and scores. Additionally, with the advancement of modern computer languages such as Python and JavaScript, the implementation of new features has become easier and more accessible.

Musicologists may also resort to feature extraction, especially in the context of so-called corpus studies. In fact, existing software for symbolic music feature extraction – e.g. jSymbolic [5] – was partly designed to help musicologists obtain the data they required in a fast and accurate way. This is especially important because the computation of the features could hardly be achieved by the manual work of musicologists, who, as of today, devote time to manual annotations such as harmony [12] and cadence [13, 14]. Examples of such feature-driven, computational musicology can be found in studies of musical form [15], harmony [16], and compositional styles [17–21], among others.

In this work, we introduce a software tool named musif, which offers a comprehensive collection of features that are extracted from various file formats. The tool is designed to be easily extensible using the Python programming language and is specifically tailored for 18th-century opera arias, although it has been tested on a variety of other repertoires, including Renaissance and pop music. Furthermore, in contrast to previous software [4, 5], musif is developed with a focus on musicological studies and is thus geared towards high-quality music datasets, addressing the issue of limited data availability that is commonly encountered in feature learning methods.

To aid in its usage, musif is accompanied by detailed software documentation (https://musif.didone.eu). This documentation provides adequate information for both novice and advanced users, enabling them to take full advantage of the tool and add new features and file formats as needed. The project is developed using open-source methods and adopts GitHub to manage issues and pull requests, as well as to distribute the source code (https://github.com/DIDONEproject/musif).

2. DESIGN PRINCIPLES

The development of musif was guided by four key design principles.

The foremost principle was the ability to customize and extend the framework to meet the user's specific requirements.
This includes the capability to alter the feature extraction process by introducing new features coded by the user and by modifying the existing pipeline.

The second principle was to ensure the usability of the software by individuals with minimal technical expertise, with musicologists as the primary target audience. This principle mainly entailed providing a user-friendly interface for the entire feature extraction process, with default settings that are deemed optimal. Additionally, comprehensive documentation was produced to aid novice users in understanding the feature extraction process of symbolic music.

As musicologists were identified as the primary target audience, special attention was paid to the file types supported by the system. Specifically, an effort was made to find a combination of file formats that were both easy to create and able to represent musicological annotations, which could be used as sources for feature extraction.

The final principle that underpins the entire structure of musif is its suitability for big data analysis. Specifically, measures were taken to ensure that the framework is computationally efficient on commercially available computers.

3. IMPLEMENTATION

3.1 General pipeline

The implementation of musif is mainly based on music21 [4] and is divided into two primary stages, both of which are highly configurable. Fig. 1 shows a flowchart of the general pipeline.

Figure 1. Flow chart of the general pipeline for feature extraction with musif from a single score. First, features are extracted from a music score. If window-level features are used, a row is generated for each window of measures; otherwise, a single row is generated for the whole score. Then, the DataProcessor cleans the table by replacing NaN values with 0 and by merging or removing undesired columns.

The initial stage pertains to the actual extraction of features, during which a substantial number of features are derived from the data. Among these features, some are solely designed for the calculation of "second-order" features, which are derived from the primary ones. For instance, the number of notes in a score may not hold inherent significance, but it acquires meaning when considered in relation to the total length of the score. Therefore, an additional operation is required to compute the ratio between the number of notes and features that denote the total duration of the score, such as the total number of beats. As a result, certain "first-order" features may not be relevant for the specific task at hand.

To address this issue, we have implemented an additional step that we refer to as "post-processing". In this stage, certain "first-order" features are eliminated, while others are aggregated according to the user's specifications. For example, to lower the overall number of features and attain a more succinct representation, the user may choose to aggregate features that originate from similar instruments, such as strings, by utilizing statistical measures such as the mean, the variance, and other statistical moments. Another crucial task accomplished during post-processing is the standardization of the representation of missing data, such as NaN values or empty strings.

The aforementioned two steps correspond to two Python objects, namely the FeaturesExtractor and the DataProcessor. Both of these objects take as input an extensible configuration, which can be expressed in various ways, namely variadic Python arguments in the class constructor and/or a YAML file.
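For instance, a minimal configuration file could be written to disk and its path passed to the extractor, as in the sketch below. The YAML keys shown here simply mirror the constructor arguments used in Listing 1; they are an assumption to be checked against the online documentation rather than a complete or authoritative schema.

from pathlib import Path
from musif.extract.extract import FeaturesExtractor

# Hypothetical configuration file mirroring the keyword arguments of Listing 1.
Path("config.yml").write_text(
    "xml_dir: data_notation\n"
    "musescore_dir: data_harmony\n"
    "basic_modules: [scoring]\n"
    "features: [core, ambitus, interval, tempo]\n"
)

# The first constructor argument accepts the path to a YAML file (cf. Listing 1).
features = FeaturesExtractor("config.yml").extract()

Keyword arguments passed alongside such a file override the corresponding YAML entries, as noted in the comments of Listing 1.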
The configuration of the FeaturesExtractor object includes the path to the data, the features that should be extracted, the paths or objects containing custom features, and other similar requirements. For its part, the configuration of the DataProcessor object offers the flexibility to specify the columns that should be aggregated or removed, as well as the columns in which NaN values should be replaced with a default value, such as zero.

The outcome of the entire process is a tabular representation, with one column per feature and one row per musical score. Optionally, scores can be analyzed using moving windows, in which case the output table will have one row per window. When using windows, the window size and overlap can be specified as a number of measures, as shown in Fig. 2. A sample code that demonstrates the usage of the tool is provided in Listing 1.

from musif.extract.extract import FeaturesExtractor
from musif.process.processor import DataProcessor

features = FeaturesExtractor(
    # here we use `None`, but it could be the path to a YAML file containing
    # specifications
    None,
    # the options below override the YAML file if it is provided
    xml_dir="data_notation",
    musescore_dir="data_harmony",
    basic_modules=["scoring"],
    features=["core", "ambitus", "interval", "tempo", "density", "texture",
              "lyrics", "scale", "key", "dynamics", "rhythm"]
).extract()

# For the DataProcessor, the arguments are the extracted table and the path to a
# YAML file. As before, the YAML file can be overridden by variadic arguments.
processed_features = DataProcessor(features, None).process().data
# the output is a pandas DataFrame!

Listing 1: Example of feature extraction with default options and stock features.

Figure 2. Example of windowing on a music score. The window length and overlap are specified in measures. In this case, windows have a length of 3 and an overlap of 2. At the top of the score, in red, an example of harmonic analysis is shown.

3.2 File formats

Given that our primary objective was to develop a software tool for musicological applications, it was imperative to support file formats that are easily usable in musicological analysis. As such, we carefully considered file formats such as MusicXML, MEI, and IEEE 1599. These file formats can represent common Western music notation with a high degree of detail and have been utilized for both musicological and MIR tasks. However, it was determined that only MusicXML is fully supported by user-end graphical interfaces. Requiring users to possess both musicological training and the ability to effectively edit large XML files with advanced software is a rare combination and was therefore not deemed viable in the design of the system. Moreover, certain features implemented by musif are derived from functional, Roman-numeral harmonic analysis, which cannot be represented in the standard MusicXML format. To solve this issue, we have adopted the MuseScore file format, in line with previous works in this field [12, 13].

Overall, the recommended file formats for the musif system are MusicXML for notation parsing and MuseScore for harmonic annotations. However, if only MuseScore files are available, the MuseScore software can be utilized to generate the necessary MusicXML files. Additionally, alternative file formats may be employed in place of MusicXML by leveraging the music21 library for parsing notation files, which supports a comprehensive array of file formats. Furthermore, any file format supported by MuseScore can be utilized through automatic conversion to MusicXML. This pipeline is particularly recommended for extracting features from MIDI files.
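As a concrete illustration of the alternative route through music21, a MIDI file can also be converted to MusicXML up front and the result handed to the extractor; the file names below are purely illustrative, and the automatic MuseScore-based conversion described above does not require this manual step.

from music21 import converter

# Parse a MIDI file and export it as MusicXML (illustrative file names).
score = converter.parse("aria.mid")
score.write("musicxml", fp="aria.musicxml")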
However, the parsing approach adopted in this system may be relatively slow when working with a large number of files. To mitigate this issue, a caching system has been implemented that saves to disk any property, function, or method result that originates from music21 objects. This approach has been tested and has demonstrated a significant improvement in processing speed, with a 2 to 3 times speed-up observed when cached files are used. The caching system is particularly useful when designing or debugging feature extraction on a large number of files, as it allows for more efficient and expedient processing.

3.3 Customization

To facilitate customization of the feature extraction process, three main tools are available. These tools allow for more flexibility and precision in the feature extraction process, enabling users to tailor the process to their specific needs and requirements. They are described in the following list; a minimal custom-feature sketch is given after the list.

1. Custom features: The user can add custom features by developing two simple functions: one to extract features from each individual part in the score, and another to extract features from the entire score. The second function can optionally utilize the features extracted from the individual parts. Additionally, the user can specify the extraction order and feature dependencies, allowing previously extracted features to be used in the computation of newer features. The implementation of these custom features can be easily accomplished using the music21 Python library.

2. Hooks: Hooks are user-provided functions that are called at specific stages of the extraction process. In the current version of musif, only one type of hook is available, namely just after the parsing of the input files is completed and just before the caching mechanism is initialized. The user can provide a list of functions that accept the parsed score as input and that are run before the caching mechanism is initialized. When cached files are used, these hooks are no longer run. This hook is particularly useful for modifying the input scores before caching, such as deleting or modifying unsupported notation elements in music21 objects, thus mitigating the constraints of the caching mechanism, which only allows read-only operations on the scores.

3. Python mechanisms: The Python programming language offers a range of advanced methods for modifying and extending existing software. As musif is fully implemented in pure Python, these methods are fully applicable. They include, but are not limited to, class inheritance, method and property overriding, and type casting.
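The sketch below gives an idea of what such a pair of functions might look like for a trivial note-counting feature. The function names and signatures are hypothetical; the exact interface expected by musif, including how a custom feature module is registered in the configuration, is described in the online documentation.

from music21 import note

def part_note_count(part, part_features):
    # Hypothetical part-level feature: count the notes of a single part.
    notes = list(part.flatten().getElementsByClass(note.Note))
    part_features["NoteCount"] = len(notes)

def score_note_count(score, parts_features, score_features):
    # Hypothetical score-level feature: aggregate the per-part counts.
    score_features["TotalNoteCount"] = sum(
        pf.get("NoteCount", 0) for pf in parts_features
    )

In musif, such functions would then be made available to the FeaturesExtractor through its configuration, alongside the stock feature modules described in the next section.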
4. STOCK FEATURES

musif is distributed with a wide variety of features already implemented. These sets of features can be selected for extraction using the FeaturesExtractor's constructor arguments – see Listing 1 –, while the DataProcessor can be utilized for further refining the desired features. Each set corresponds to a specific Python sub-package. The total number of features varies based on the instrumentation used in the score and usually ranges from about 500 feature values for simple monophonic scores to more than 10,000 feature values for orchestral scores. In this section, we provide a brief summary of each of these modules. For those who wish to carefully select features, more detailed information can be found in the online documentation, including pre-made Python regular expressions that can be used to easily select the desired features; a short selection example is given after the list below.

In general, all the features were designed to be meaningful for musicologists and music theorists, giving value to studies attempting to explain statistical results on the basis of the features. The modular structure of the features also allows researchers to conveniently focus their analysis on only certain aspects of the music. Here, we will use the word sound to refer to a specific timbre – e.g. violin – which can be repeated multiple times in the score – e.g. violin I and violin II. Moreover, we will use family to refer to a family of instruments – e.g. strings, voices, brasses, and so on. The stock feature modules available in musif are as follows:

• Core: These features are essential for the identification of music scores and for subsequent elaboration. They are always required and include the total number of measures and notes, as well as the number of measures containing notes and their averages for each sound or part and for each family and/or score. Other examples of such features include the filename of the score, the time signature, and the key signature.

• Scoring: This module computes features related to the instrumentation and voices used in the score. Examples of features in this module include the instruments, families, and parts present in the score, as well as the number of parts for each instrument and family. This module can be used to get a better understanding of the orchestration used in the composition.

• Key: This module computes features related to the key signature and tonality, i.e., the key, of the piece. Examples of features in this module include the Krumhansl-Schmuckler tonality estimation [22], the key signature, and the mode (major or minor). This module allows for analyzing the underlying tonal system used in the composition.

• Tempo: This module computes features related to the tempo marking on the score. It should be noted that, since some features depend on the terminology used by the composer for the tempo indication, some of these features may not be reliable for all repertoires. In fact, as the composer's marking need not be expressed quantitatively – it is actually more typical in some repertoires to have just a verbal indication – the numerical values extracted by musif ultimately depend on the BPM value given during the engraving process, if available.

• Density: These features relate the number of notes to the total number of measures, as well as to the total number of measures that contain sound, for a single part, sound, or family. This module provides insights into the density of the sound in the composition and allows comparing the activity level of different parts or families in the score.
• Harmony: This is one of the largest feature modules; it computes features based on the harmonic annotations provided in the MuseScore files according to a previous standard [12, 13]. Examples of these features include the number of harmonic annotations; the number of chords performing the tonic, dominant, and sub-dominant functions; the harmonic rhythm – i.e. the rate of harmonic changes in relation to the number of beats or measures –; and features related to modulations annotated in the MuseScore files. This module can be used to get a better understanding of the harmonic structure of the composition and to analyze the harmonic progressions used in it.

• Rhythm: This module computes features related to note durations and to particular rhythmic figures, such as dotted and double-dotted rhythms. Examples of features in this module include the average note duration and the frequency of particular rhythmic figures. This module analyzes the rhythmic structure of the composition and the rhythmic patterns used in it.

• Scale: This module computes features related to specific melodic degrees with respect to the main key of the score, as computed in the Key module, and to the local key, as provided in the MuseScore harmonic annotations. Examples of features in this module include the frequency of specific scale degrees in a given part.

• Dynamics: This module computes features related to the distribution of dynamic markings across the score, by assigning numerical values to each dynamic marking according to its corresponding intensity. As is the case with tempo, the specific numerical value of a given dynamic marking is assigned during the engraving process, with some software assigning default values that the engraver may need to modify depending on the notation conventions. Similarly to other features, this module may not be completely generalizable to some repertoires, as the interpretation of dynamic markings can vary across different compositions and styles, or dynamic markings may be completely absent. Examples of features in this module include the frequency of specific dynamic markings, the average dynamic level, and the distribution of dynamic markings across the score. This module extracts information about the expressivity of the composition and analyzes the use of dynamic contrasts in it.

• Ambitus: This module computes the ambitus, or melodic range, of the piece in semitones, for the whole piece as well as for each individual part, sound, or family. It also computes the lowest and highest pitches and their note names.

• Melody: This module computes an extensive number of features related to the distribution and types of melodic intervals for each part, voice, sound, and family. This is the largest set of features within musif. Examples of features in this module include the frequency of specific interval types, the distribution of interval sizes, and the proportion of ascending and descending intervals. This module provides insights into the melodic structure of the composition by analyzing the use of specific intervals in it.

• Lyrics: This module considers the alignment between lyrics, if available, and the notes, and computes features related to their distribution. Examples of features in this module include the total number of syllables in each vocal part, the average number of notes per syllable, and the proportion of measures that contain notes for each vocal part in the score. This module can facilitate a more profound comprehension of the relationships between lyrics and music in the composition.

• Texture: This module computes the ratio of the number of notes between two parts, considering all possible pairs of parts. This feature can provide insight into the relative density and activity level of different parts in the score and can be used to analyze the texture of the composition.
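As an example of the column selection mentioned at the beginning of this section, the processed output is a pandas DataFrame, so subsets of feature columns can be picked with ordinary regular expressions. The patterns below are illustrative only; they are not the pre-made expressions shipped with the documentation, and the actual column names should be checked on a concrete extraction.

# `processed_features` is the DataFrame produced in Listing 1.
# Keep only columns whose names mention intervals or key-related information.
interval_columns = processed_features.filter(regex="Interval")
key_columns = processed_features.filter(regex="Key|Mode")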
5. DISCUSSION AND FUTURE WORKS

This work presents the musif module to the scientific community as a tool for the extraction of features from symbolic music scores. It is designed with a focus on extensibility and customization, while also providing good defaults for the novice user and supporting musicologically curated datasets. The module is implemented in Python, and it provides a comprehensive set of features covering various aspects of music scores, including harmony, rhythm, melody, and many more. The modular structure of musif makes it easy to use and customize according to the user's needs.

In comparison to existing software such as jSymbolic [5] and music21 [4], musif offers a significantly larger number of features, approximately twice as many. Additionally, jSymbolic computes features based on pure MIDI encoding, with only 2 features based on the MEI format. This is an essential aspect for musicological studies, as MIDI, although commonly used in the MIP field, is not capable of representing various characteristics of music notation, such as alterations, key signatures, rhythmic and dynamic annotations, chords, and lyrics.

music21 already implements several features based on its powerful parsing engine, which allows it to take full advantage of MusicXML, MEI, and Kern. However, musif expands upon this set of computable features while remaining completely based on music21 and allowing the automatic extraction of features at the window level. Furthermore, it includes a caching system that improves performance during the feature extraction process by saving the results of computations to disk, reducing the need to perform the same calculations multiple times. Thus, musif provides a more extensive set of features while being highly performant in its extraction process, making it a valuable tool for researchers in the fields of music information retrieval and musicology.

While this paper describes the release of musif 1.0, we are aware that there is wide room to improve musif further, making it faster, more general, more usable, and more accurate. Specifically, we want to improve three aspects of the software:

• Data visualization: we want to provide the user with tools that help the visualization of the data that musif extracts; this would be particularly useful for preliminary analyses.

• Repertoire: As of now, musif has been tested on several other types of corpora covering different music styles, including EWLD, the Humdrum database [23], piano scores and performances [24], and masses from the Renaissance [25]. It has additionally been utilized on an in-house corpus of more than 1600 opera arias. For this reason, most of the design choices and of the implemented features target this repertoire. We want to make it more powerful and efficient for other repertoires too.
• More numerical features: Although musif already provides a wide set of musical features, we are sure that many other features could be defined and included in musif, empowering both musicological analysis and data science studies.

We also plan to study in more depth the comparison between existing tools for music feature extraction, including benchmarks and performance tests. While we continue working on these paths, we hope that musif can be a valuable tool for the Sound and Music Computing community, and we welcome any suggestions or contributions to the software. We encourage the community to use and test musif and provide feedback so that we can continue to improve and develop it further. It is our goal to make musif a widely used and reliable tool for MIP and musicology research.

Acknowledgments

This work is a result of the Didone Project [26], which has received funding from the European Research Council (ERC) under the European Union's Horizon 2020 research and innovation program, Grant agreement No. 788986. It has also been conducted with funding from Spain's Ministry of Science and Innovation (IJC2020-043969-I/AEI/10.13039/501100011033).

6. REFERENCES

[1] D. Deutsch, The Psychology of Music, 3rd ed. Academic Press, 2013.

[2] F. Simonetta, S. Ntalampiras, and F. Avanzini, "Multimodal Music Information Processing and Retrieval: Survey and Future Challenges," in Proceedings of the 2019 International Workshop on Multilayer Music Representation and Processing. Milan, Italy: IEEE Conference Publishing Services, 2019, pp. 10–18. doi: 10.1109/mmrp.2019.00012

[3] Y. Bengio, A. Courville, and P. Vincent, "Representation Learning: A Review and New Perspectives," IEEE Transactions on Pattern Analysis and Machine Intelligence, vol. 35, no. 8, pp. 1798–1828, Aug. 2013. doi: 10.1109/TPAMI.2013.50

[4] M. S. Cuthbert, C. Ariza, and L. Friedland, "Feature extraction and machine learning on symbolic music using the music21 toolkit," in Proceedings of the 12th International Society for Music Information Retrieval Conference, ISMIR 2011, Miami, Florida, USA, October 24-28, 2011. University of Miami, 2011, pp. 387–392. doi: 10.5281/zenodo.1416288

[5] C. McKay, J. Cumming, and I. Fujinaga, "jSymbolic 2.2: Extracting features from symbolic music for use in musicological and MIR research," in Proceedings of the 19th International Society for Music Information Retrieval Conference, 2018, pp. 348–354. ISBN 978-2-9540351-2-3. doi: 10.5281/zenodo.1492421

[6] F. Simonetta, C. E. Cancino-Chacón, S. Ntalampiras, and G. Widmer, "A convolutional approach to melody line identification in symbolic scores," in Proceedings of the 20th International Society for Music Information Retrieval Conference. Delft, The Netherlands: ISMIR, Nov. 2019, pp. 924–931. doi: 10.5281/zenodo.3527966

[7] M. Prang and P. Esling, "Signal-domain representation of symbolic music for learning embedding spaces," Stockholm, Sweden, p. 10, Oct. 2020.

[8] P. Lisena, A. Meroño-Peñuela, and R. Troncy, "MIDI2vec: Learning MIDI embeddings for reliable prediction of symbolic music metadata," Semantic Web, vol. 13, no. 3, pp. 357–377, Jan. 2022. doi: 10.3233/SW-210446
[9] L. Bigo, M. Giraud, R. Groult, N. Guiomard-Kagan, and F. Levé, "Sketching sonata form structure in selected classical string quartets," in Proceedings of the 18th International Society for Music Information Retrieval Conference. Suzhou, China: ISMIR, Oct. 2017, pp. 752–759. doi: 10.5281/zenodo.1415020

[10] K. C. Kempfert and S. W. K. Wong, "Where does Haydn end and Mozart begin? Composer classification of string quartets," Journal of New Music Research, vol. 49, no. 5, pp. 457–476, Oct. 2020. doi: 10.1080/09298215.2020.1814822

[11] F. Simonetta, F. Avanzini, and S. Ntalampiras, "A Perceptual Measure for Evaluating the Resynthesis of Automatic Music Transcriptions," Multimedia Tools and Applications, 2022. doi: 10.1007/s11042-022-12476-0

[12] M. Neuwirth, D. Harasim, F. C. Moss, and M. Rohrmeier, "The Annotated Beethoven Corpus (ABC): A Dataset of Harmonic Analyses of All Beethoven String Quartets," Frontiers in Digital Humanities, vol. 5, 2018. doi: 10.3389/fdigh.2018.00016

[13] J. Hentschel, M. Neuwirth, and M. Rohrmeier, "The Annotated Mozart Sonatas: Score, Harmony, and Cadence," Transactions of the International Society for Music Information Retrieval, vol. 4, no. 1, pp. 67–80, May 2021. doi: 10.5334/tismir.63

[14] O. Raz, D. Chawin, and U. B. Rom, "The Mozart Expositional Punctuation Corpus: A Dataset of Interthematic Cadences in Mozart's Sonata-Allegro Expositions," Empirical Musicology Review, vol. 16, pp. 134–144, 2021. doi: 10/grq2fp

[15] F. Moss, W. Fernandes de Souza, and M. Rohrmeier, "Harmony and Form in Brazilian Choro: A Corpus-Driven Approach to Musical Style Analysis," Journal of New Music Research, vol. 49, pp. 416–437, 2020. doi: 10/grq2fm

[16] J. Hentschel, F. C. Moss, A. McLeod, M. Neuwirth, and M. Rohrmeier, "Towards a Unified Model of Chords in Western Harmony," in Music Encoding Conference 2021, 2022, pp. 143–149. doi: 10/grq2fk

[17] A. Llorens and A. Torrente, "Constructing opera seria in the Iberian Courts: Metastasian Repertoire for Spain and Portugal," Anuario Musical, vol. 76, pp. 73–110, Jul. 2021. doi: 10/grq2fn

[18] M. E. Cuenca and C. McKay, "Exploring Musical Style in the Anonymous and Doubtfully Attributed Mass Movements of the Coimbra Manuscripts: A Statistical Approach," in Medieval and Renaissance Music Conference, 2019.

[19] V. Anzani and A. Llorens, "Shaping Eighteenth-Century Opera: The Singer's Impact," in Tosc@ Junior Conference, 2021.

[20] A. Torrente, "Didone trasmutata: Aria Settings and the Expression of Emotions in Metastasian Operas," in Mapping Artistic Networks of Italian Theatre and Opera across Europe, 1600–1800, 2019.

[21] E. Rodriguez-Garcia and C. McKay, "Ave festiva ferculis: Exploring Attribution by Combining Manual and Computational Analysis," in Medieval and Renaissance Music Conference, 2021.

[22] C. L. Krumhansl, Cognitive Foundations of Musical Pitch. Oxford University Press, 1990. ISBN 0-19-505475-X

[23] C. S. Sapp, "Online Database of Scores in the Humdrum File Format," in Proceedings of the 6th International Conference on Music Information Retrieval, 2005, p. 2. doi: 10.5281/zenodo.1417281

[24] F. Foscarin, A. McLeod, P. Rigaux, F. Jacquemard, and M. Sakai, "ASAP: A dataset of aligned scores and performances for piano transcription," in Proceedings of the 21st International Society for Music Information Retrieval Conference, 2020. doi: 10.5281/zenodo.4245489

[25] J. Cumming, C. McKay, J. Stuchbery, and I. Fujinaga, "Methodologies for Creating Symbolic Corpora of Western Music Before 1600," in Proceedings of the ISMIR. Paris, France: ISMIR, Sep. 2018, pp. 491–498. doi: 10.5281/zenodo.1492459
[26] A. Torrente and A. Llorens, "The Musicology Lab: Teamwork and the Musicological Toolbox," in Music Encoding Conference 2021, 2022, pp. 9–20. doi: 10/grqp2b