Music Creation Using a Hybrid Model of LLM And LSTM

Authors

  • Kashish Bhatia, Student, Department of Computer Science, University of Mumbai, Vidya Nagari, Kalina, Santacruz East
  • Dr. Sampada Malhar Margaj, Assistant Professor, Department of Computer Science, Kirti M. Doongursee College, Dadar, Mumbai, Maharashtra, India
  • Dr. Jyotshna Jagdeo Dongardive, Associate Professor, Department of Computer Science, University of Mumbai, Mumbai, Maharashtra, India

DOI:

https://doi.org/10.53032/tvcr/2025.v7n2.40

Keywords:

Hybrid model, LLM, LSTM

Abstract

Over the years, music generation using artificial intelligence and deep learning has made substantial progress. Platforms such as Magenta, MuseNet, and DeepBach incorporate deep learning and LSTM models in their architectures, while platforms such as MusicLM and Riffusion apply LLM-based learning methods to MIDI and audio representations. These models are useful, but very little research combines the LLM and LSTM architectures for music generation so as to offset their individual limitations and unite their strengths; that combination is the major highlight of this research. The LLM-based component offers sequential structural richness, while the LSTM ensures temporal consistency. The approach involves encoding MIDI sequences into an English-text format, allowing the LLM to process musical structures with natural language processing techniques. The output is then refined by an LSTM-based decoder to improve the coherence of note transitions and rhythmic consistency. The hybrid model thus explores the potential of combining these sequence-modelling approaches for improved AI-driven music composition, making an interesting blend of creativity and machines.
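
As a rough illustration of this pipeline, the sketch below shows the two stages in miniature; it is an assumption-laden toy, not the authors' implementation. The hypothetical note_to_text encoder maps MIDI-style (pitch, duration) events to English-like tokens that an LLM prompt could carry, and a small PyTorch RefinerLSTM stands in for the LSTM-based refinement decoder; every name, vocabulary, and hyperparameter here is invented for illustration.

```python
# Hypothetical sketch only: illustrates the abstract's two stages
# (MIDI -> text for an LLM, then LSTM-based refinement). Not the paper's code.
# Requires: pip install torch
import torch
import torch.nn as nn

# --- Stage 1: encode MIDI-style (pitch, duration) events as text tokens ---
NOTE_NAMES = ["C", "C#", "D", "D#", "E", "F", "F#", "G", "G#", "A", "A#", "B"]
DURATIONS = {0.25: "sixteenth", 0.5: "eighth", 1.0: "quarter", 2.0: "half"}

def note_to_text(pitch: int, beats: float) -> str:
    """Map a MIDI pitch number and a duration in beats to a token like 'C4_quarter'."""
    name = NOTE_NAMES[pitch % 12] + str(pitch // 12 - 1)  # MIDI 60 -> C4
    return f"{name}_{DURATIONS.get(beats, 'quarter')}"

melody = [(60, 1.0), (62, 0.5), (64, 0.5), (65, 2.0)]  # toy note events
prompt = " ".join(note_to_text(p, d) for p, d in melody)
print("LLM prompt:", prompt)  # C4_quarter D4_eighth E4_eighth F4_half

# --- Stage 2: a small LSTM decoder that could rescore/refine LLM output ---
class RefinerLSTM(nn.Module):
    def __init__(self, vocab_size: int, emb_dim: int = 64, hidden: int = 128):
        super().__init__()
        self.embed = nn.Embedding(vocab_size, emb_dim)
        self.lstm = nn.LSTM(emb_dim, hidden, batch_first=True)
        self.head = nn.Linear(hidden, vocab_size)

    def forward(self, token_ids: torch.Tensor) -> torch.Tensor:
        out, _ = self.lstm(self.embed(token_ids))
        return self.head(out)  # next-token logits at each position

# Toy vocabulary built from the prompt (a real system would use a full note vocab).
vocab = sorted(set(prompt.split()))
stoi = {tok: i for i, tok in enumerate(vocab)}
token_ids = torch.tensor([[stoi[tok] for tok in prompt.split()]])

refiner = RefinerLSTM(vocab_size=len(vocab))
logits = refiner(token_ids)  # shape: (1, sequence_length, vocab_size)
print("refiner logits shape:", tuple(logits.shape))
```

In a full system, the LLM would continue such token prompts with new material, and the refiner would be trained on note-token corpora so that its logits can rescore or smooth the LLM's note transitions before the sequence is decoded back to MIDI.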

References

Time. (2023, December 6). How AI could transform music in 2023. Time. https://time.com/6340294/ai-transform-music-2023/

Riscutia, V. (2023). Large language models at work.

Empress. (2023, September 21). Music and technology: A brief history of AI in music. Empress. https://blog.empress.ac/music-and-technology-a-briefhistory-of-ai-in-music-clq1eh8fd361731wr3j9m23ehi/

Javatpoint. (n.d.). Deep learning for sequential data. Javatpoint. https://www.javatpoint.com/deeplearning-for-sequential-data

Liu, X., and Wang, J. (2020). Pre-training transformers for language modeling: A survey. arXiv. https://arxiv.org/abs/2006.09838

Berton, M., et al. (2022). Improving music generation using transformer-based models. arXiv. https://arxiv.org/pdf/2203.12105

Predicting music with an LLM. (n.d.). Medium. https://medium.com/@carneyr98/predictingmusic-with-an-llm-d349296a2dd9

Vaswani, A., Shazeer, N., Parmar, N., Uszkoreit, J., Jones, L., Gomez, A. N., Kaiser, L., and Polosukhin, I. (2017). Attention is all you need. arXiv. https://arxiv.org/pdf/1706.03762

Noorfo, M. (2023, April 26). Noorfo music AI: A brief history. Machine Learning for Music. https://mct-master.github.io/machine-learning/2023/04/26/noorfomusic-ai-a-brief-history.html

Data Scientists Diary. (2023, March 3). Deep learning for music composition. Medium. https://medium.com/data-scientists-diary/deep-learning-formusic-composition-ed21ef1fccf7

A study of artificial intelligence for creative uses in music: A research paper submitted to the Department of Engineering and Society.

Eck, D., and Schmidhuber, J. (2002). Finding temporal structure in music: Blues improvisation with LSTM recurrent networks. Neural Networks in the Arts and Humanities.

Dong, H. W., Hsiao, W. Y., Yang, L. C., and Yang, Y. H. (2018). MuseGAN: Multi-track sequential generative adversarial networks for symbolic music generation and accompaniment. Proceedings of the AAAI Conference on Artificial Intelligence.

Hadjeres, G., Pachet, F., and Nielsen, F. (2017). DeepBach: A steerable model for Bach chorales generation. Proceedings of the 34th International Conference on Machine Learning (ICML).

Huang, C. Z. A., Vaswani, A., Uszkoreit, J., et al. (2018). Music Transformer: Generating music with long-term structure. arXiv preprint arXiv:1809.04281.

Payne, C. (2019). MuseNet: Composing with large-scale transformer models. OpenAI Blog.

Dhariwal, P., Jun, H., Payne, C., et al. (2020). Jukebox: A generative model for music. arXiv preprint arXiv:2005.00341.

Tony, S. M., and Sasikumar, S. (2024). Exploring hybrid GRU-LSTM networks for enhanced music generation. SSRG International Journal of Electronics and Communication Engineering, 11(7), 150-162. https://doi.org/10.14445/23488549/IJECEV11I7P115

Bhavanasree, K., Krishnan, K., and Shanaz, K. (2024). Indian classical music generation using LSTM and RNN. IJCRT, 3(4), 1-10. https://ijcrt.org/papers/IJCRT2403917.pdf

Li, S., and Sung, Y. (2023). MRBERT: Pre-training of melody and rhythm for automatic music generation. Mathematics, 11(4), 1-14. https://doi.org/10.3390/math11040114

Xu, X. (2020). LSTM networks for music generation. arXiv preprint arXiv:2006.09838. https://arxiv.org/abs/2006.09838

Kotecha, N., and Young, P. (2018). Generating music using an LSTM network. arXiv preprint arXiv:1804.07300. https://arxiv.org/abs/1804.07300

Conner, M., Gral, L., Adams, K., Hunger, D., Strelow, R., and Neuwirth, A. (2022). Music generation using an LSTM. arXiv preprint arXiv:2203.12105. https://arxiv.org/abs/2203.12105

Ingale, V., Mohan, A., Adlakha, D., Kumar, K., and Gupta, M. (2021). Music generation using three-layered LSTM. arXiv preprint arXiv:2105.09046. https://arxiv.org/abs/2105.09046

Mangal, S., Modak, R., and Joshi, P. (2019). LSTM based music generation system. arXiv preprint arXiv:1908.01080. https://arxiv.org/abs/1908.01080

Published

2025-04-30

How to Cite

Kashish Bhatia, Dr. Sampada Malhar Margaj, & Dr. Jyotshna Jagdeo Dongardive. (2025). Music Creation Using a Hybrid Model of LLM And LSTM. The Voice of Creative Research, 7(2), 323–335. https://doi.org/10.53032/tvcr/2025.v7n2.40