1
Hidden magnetic fields of the quiet Sun derived from Hanle depolarization of lines of the "second solar spectrum" at the limb from Pic du Midi observations
https://arxiv.org/pdf/2505.17545
2
Moonbeam: A MIDI Foundation Model Using Both Absolute and Relative Music Attributes
https://arxiv.org/pdf/2505.12863
3
How to Infer Repeat Structures in MIDI Performances
https://arxiv.org/pdf/2505.05055
4
Aria-MIDI: A Dataset of Piano MIDI Files for Symbolic Music Modeling
https://arxiv.org/pdf/2504.15071
5
Music Information Retrieval on Representative Mexican Folk Vocal Melodies Through MIDI Feature Extraction
https://arxiv.org/pdf/2503.24243
6
Zero to 16383 Through the Wire: Transmitting High-Resolution MIDI with WebSockets and the Browser
https://arxiv.org/pdf/2503.09055
7
MIDI-GPT: A Controllable Generative Model for Computer-Assisted Multitrack Music Composition
https://arxiv.org/pdf/2501.17011
8
MIDIS: Quantifying the AGN component of X-ray-detected galaxies
https://arxiv.org/pdf/2501.11491
9
Annotation-Free MIDI-to-Audio Synthesis via Concatenative Synthesis and Generative Refinement
https://arxiv.org/pdf/2410.16785
10
End-to-end Piano Performance-MIDI to Score Conversion with Transformers
https://arxiv.org/pdf/2410.00210
11
Beat and Downbeat Tracking in Performance MIDI Using an End-to-End Transformer Architecture
Sebastian Murgul, Michael Heizmann
https://arxiv.org/pdf/2507.00466