Music Creation Using a Hybrid Model of LLM And LSTM
DOI: https://doi.org/10.53032/tvcr/2025.v7n2.40

Keywords: Hybrid model, LLM, LSTM

Abstract
Over the years, music generation using artificial intelligence and deep learning has made considerable progress. Platforms such as Magenta, MuseNet, and DeepBach incorporate deep learning and LSTM models in their architectures, while platforms such as MusicLM and Riffusion apply LLM-based learning methods. These models are useful, but very few studies combine LLM and LSTM architectures for music generation in a way that offsets their individual limitations and combines their strengths; that combination is the major focus of this research. LLM-based models offer sequential structural richness, while the LSTM ensures temporal consistency. The approach involves encoding MIDI sequences into an English text format, allowing the LLM to process musical structures with natural language processing techniques. The output is then refined using an LSTM-based decoder to enhance the coherence of note transitions and rhythmic consistency. This hybrid approach explores the potential of combining the two sequence-modelling paradigms for improved AI-driven music composition, an interesting blend of creativity and machines.
References
Time. (2023, December 6). How AI could transform music in 2023. Time. https://time.com/6340294/ai-transform-music-2023/
Riscutia, V. (2023). Large language models at work.
Empress. (2023, September 21). Music and technology: A brief history of AI in music. Empress. https://blog.empress.ac/music-and-technology-a-briefhistory-of-ai-in-music-clq1eh8fd361731wr3j9m23ehi/
Javatpoint. (n.d.). Deep learning for sequential data. Javatpoint. https://www.javatpoint.com/deeplearning-for-sequential-data
Liu, X., and Wang, J. (2020). Pre-training transformers for language modeling: A survey. arXiv. https://arxiv.org/abs/2006.09838
Berton, M., et al. (2022). Improving music generation using transformer-based models. arXiv. https://arxiv.org/pdf/2203.12105
Predicting music with an LLM. Medium. https://medium.com/@carneyr98/predicting-music-with-an-llm-d349296a2dd9
Vaswani, A., Shazeer, N., Parmar, N., Uszkoreit, J., Jones, L., Gomez, A. N., Kaiser, L., and Polosukhin, I. (2017). Attention is all you need. arXiv. https://arxiv.org/pdf/1706.03762
Noorfo, M. (2023, April 26). Noorfo music AI: A brief history. Machine Learning for Music. https://mct-master.github.io/machine-learning/2023/04/26/noorfomusic-ai-a-brief-history.html
Data Scientists Diary. (2023, March 3). Deep learning for music composition. Medium. https://medium.com/data-scientists-diary/deep-learning-for-music-composition-ed21ef1fccf7
A Study of Artificial Intelligence for Creative Uses in Music: A research paper submitted to the Department of Engineering and Society.
Eck, D., and Schmidhuber, J. (2002). Finding temporal structure in music: Blues improvisation with LSTM recurrent networks. Neural Networks in the Arts and Humanities.
Dong, H. W., Hsiao, W. Y., Yang, L. C., and Yang, Y. H. (2018). MuseGAN: Multi-track sequential generative adversarial networks for symbolic music generation and accompaniment. Proceedings of the AAAI Conference on Artificial Intelligence.
Hadjeres, G., Pachet, F., and Nielsen, F. (2017). DeepBach: A steerable model for Bach chorales generation. Proceedings of the 34th International Conference on Machine Learning (ICML).
Huang, C. Z. A., Vaswani, A., Uszkoreit, J., et al. (2018). Music Transformer: Generating music with long-term structure. arXiv preprint arXiv:1809.04281.
Payne, C. (2019). MuseNet: Composing with large-scale transformer models. OpenAI Blog.
Dhariwal, P., Jun, H., Payne, C., et al. (2020). Jukebox: A generative model for music. arXiv preprint arXiv:2005.00341.
Tony, S. M., and Sasikumar, S. (2024). Exploring hybrid GRU-LSTM networks for enhanced music generation. SSRG International Journal of Electronics and Communication Engineering, 11(7), 150-162. https://doi.org/10.14445/23488549/IJECEV11I7P115
Bhavanasree, K., Krishnan, K., and Shanaz, K. (2024). Indian classical music generation using LSTM and RNN. IJCRT, 3(4), 1-10. https://ijcrt.org/papers/IJCRT2403917.pdf
Li, S., and Sung, Y. (2023). MRBERT: Pre-training of melody and rhythm for automatic music generation. Mathematics, 11(4), 1-14. https://doi.org/10.3390/math11040114
Xu, X. (2020). LSTM Networks for Music Generation. arXiv preprint arXiv:2006.09838. https://arxiv.org/abs/2006.09838
Kotecha, N., Young, P. (2018). Generating Music using an LSTM Network. arXiv preprint arXiv:1804.07300. https://arxiv.org/abs/1804.07300
Conner, M., Gral, L., Adams, K., Hunger, D., Strelow, R., Neuwirth, A. (2022). Music Generation Using an LSTM. arXiv preprint arXiv:2203.12105. https://arxiv.org/abs/2203.12105
Ingale, V., Mohan, A., Adlakha, D., Kumar, K., Gupta, M. (2021). Music Generation using Three-layered LSTM. arXiv preprint arXiv:2105.09046. https://arxiv.org/abs/2105.09046
Mangal, S., Modak, R., Joshi, P. (2019). LSTM Based Music Generation System. arXiv preprint arXiv:1908.01080. https://arxiv.org/abs/1908.01080
License
Copyright (c) 2025 The Voice of Creative Research

This work is licensed under a Creative Commons Attribution-NonCommercial 4.0 International License.