Article | 11 Aug 2025 | Open access
RAGA: a reference-assisted genome assembly tool for efficient population-scale assembly
Ru-Peng Zhao1,†, Yu-Hong Luo1,†, Wen-Zhao Xie2,†, Zu-Wen Zhou1, Yong-Qing Qian3, Si-Long Yuan1, Dong-Ao Li1, Jiana Li3, Kun Lu3, Xingtan Zhang4, Jia-Ming Song3,*, and Ling-Ling Chen1,5,*
1State Key Laboratory for Conservation and Utilization of Subtropical Agro-bioresources, College of Life Science and Technology, Guangxi University, Nanning 530004, China
2Ministry of Education Key Laboratory of Molecular and Cellular Biology, Hebei Research Center of the Basic Discipline of Cell Biology, College of Life Sciences, Hebei Normal University, Shijiazhuang 050024, China
3College of Agronomy and Biotechnology, Southwest University, Chongqing 400715, China
4National Key Laboratory for Tropical Crop Breeding, Shenzhen Branch, Guangdong Laboratory for Lingnan Modern Agriculture, Genome Analysis Laboratory of the Ministry of Agriculture, Agricultural Genomics Institute at Shenzhen, Chinese Academy of Agricultural Sciences, Shenzhen, Guangzhou 518120, China
5Yazhouwan National Laboratory, Sanya 572025, China
*Corresponding authors. E-mail: jmsong@swu.edu.cn, llchen@gxu.edu.cn
†Ru-Peng Zhao, Yu-Hong Luo, and Wen-Zhao Xie contributed equally to the study.

Horticulture Research 12, Article number: uhaf207 (2025)
doi: https://doi.org/10.1093/hr/uhaf207

Received: 11 Dec 2024
Accepted: 29 Jul 2025
Published online: 11 Aug 2025

Abstract

High-quality reference genomes at the population scale are fundamental to pan-genomic research, but assembling them at that scale is costly and time-consuming. To overcome these limitations, we developed Reference-Assisted Genome Assembly (RAGA), a hybrid computational tool that combines de novo and reference-based assembly approaches. RAGA uses existing reference genomes from the same or closely related species, together with PacBio HiFi reads, to produce high-quality alternative long sequences. These sequences can be integrated with de novo assemblies to improve assembly quality across population-scale datasets. Applied to various plant genomes, RAGA reduced the number of contigs, decreased gaps, and corrected genome assembly errors. RAGA (available at https://github.com/wzxie/RAGA) streamlines population-scale genome assembly workflows, providing a robust foundation for comprehensive pan-genomic investigations and making large-scale genomic studies more accessible and efficient.
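
The abstract measures improvement by contig count and gap count. As an illustration of how such figures can be compared between two assemblies, the following is a minimal Python sketch. It is not part of RAGA and does not use its interface: the function names (parse_fasta, assembly_stats), the toy sequences, and the convention that any run of N bases counts as one gap are all assumptions made for this example.

```python
import re
from typing import Dict, Iterable, List, Tuple


def parse_fasta(lines: Iterable[str]) -> List[Tuple[str, str]]:
    """Parse FASTA-formatted lines into (name, sequence) pairs."""
    records: List[Tuple[str, str]] = []
    name, chunks = None, []
    for line in lines:
        line = line.strip()
        if not line:
            continue
        if line.startswith(">"):
            if name is not None:
                records.append((name, "".join(chunks)))
            header = line[1:].split()
            name, chunks = (header[0] if header else ""), []
        elif name is not None:
            chunks.append(line)
    if name is not None:
        records.append((name, "".join(chunks)))
    return records


def assembly_stats(records: List[Tuple[str, str]], min_gap: int = 1) -> Dict[str, int]:
    """Count sequences, contigs, gaps, non-gap bases, and contig N50.

    Assumption for this sketch: a gap is any run of at least `min_gap`
    N/n characters, and contigs are the pieces left after splitting
    each sequence at those runs.
    """
    gap_re = re.compile(r"[Nn]{%d,}" % min_gap)
    contig_lengths: List[int] = []
    n_gaps = 0
    for _, seq in records:
        n_gaps += len(gap_re.findall(seq))
        contig_lengths.extend(len(piece) for piece in gap_re.split(seq) if piece)

    contig_lengths.sort(reverse=True)
    total = sum(contig_lengths)

    # N50: length of the contig at which the running sum of lengths,
    # taken from longest to shortest, first reaches half the total.
    running, n50 = 0, 0
    for length in contig_lengths:
        running += length
        if running * 2 >= total:
            n50 = length
            break

    return {
        "sequences": len(records),
        "contigs": len(contig_lengths),
        "gaps": n_gaps,
        "total_bases": total,
        "contig_n50": n50,
    }


# Toy data (hypothetical): the same two sequences before and after gap filling.
draft = parse_fasta([">chr1", "ACGTACGTNNNNACGT", ">chr2", "ACGT"])
improved = parse_fasta([">chr1", "ACGTACGTACGTACGT", ">chr2", "ACGT"])

draft_stats = assembly_stats(draft)
improved_stats = assembly_stats(improved)
for key in draft_stats:
    print(f"{key}: {draft_stats[key]} -> {improved_stats[key]}")
```

In this toy case, filling the one gap merges two contigs into one, so the contig and gap counts drop while contig N50 rises. Real assemblies would be read from FASTA files rather than inline lists, and published benchmarks typically rely on dedicated assessment tools rather than a script like this.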