Article|01 Jul 2025|OPEN
Mango pangenome reveals dramatic impacts of reference bias on population genomic analyses
Bilal Ahmad1,†, Ying Su1,2,†, Yani Hao1,3,†, Tayyaba Razzaq1, Rida Arshad1, Yi Zhang1, Yingchun Zhang1, Xingyi Wang1,4, Guizhou Huang1, Xiangnian Su1, Ting Hou1, Chaochao Li5, Xuanwen Yang1, Chuanning Li6, Zhenzhou Chu2,6, Qiuyan Wang6, Yu Zhang7, Zhongxin Jin5, Qi Xu1, Xiaodong Xu1,5, Yanling Peng1, Guiqi Bi5, Chengjie Chen5, Yeyuan Chen8, Hua Xiao1,*, Jianfeng Huang5,*, Yongfeng Zhou1,5,*, and Xinmin Tian2,6,*
1State Key Laboratory of Tropical Crop Breeding, Shenzhen Branch, Guangdong Laboratory of Lingnan Modern Agriculture, Key Laboratory of Synthetic Biology, Ministry of Agriculture and Rural Affairs, Agricultural Genomics Institute at Shenzhen, Chinese Academy of Agricultural Sciences, Shenzhen 518000, China
2Xinjiang Key Laboratory of Biological Resources and Genetic Engineering, College of Life Science and Technology, Xinjiang University, Urumqi, Xinjiang 830046, China
3Department of Horticulture, Hainan Institute of Northwest A&F University, Sanya 572024, China
4College of Forestry, Beijing Forestry University, 100083 Beijing, China
5State Key Laboratory of Tropical Crop Breeding, Tropical Crops Genetic Resources Institute, Chinese Academy of Tropical Agricultural Sciences, Haikou 571100, China
6Key Laboratory of Ecology of Rare and Endangered Species and Environmental Protection (Ministry of Education) & Guangxi Key Laboratory of Landscape Resources Conservation and Sustainable Utilization in Lijiang River Basin, Guangxi University Engineering Research Center of Bioinformation and Genetic Improvement of Specialty Crops, Guangxi 541006, China
7Guangxi Subtropical Research Institute, Guangxi Academy of Agricultural Sciences, Nanning 530001, China
8Sanya Research Institute of Chinese Academy of Tropical Agricultural Sciences, Sanya 572025, China
*Corresponding authors. E-mail: xiaohua01@caas.cn, Huangjian1984xy@163.com, zhouyongfeng@caas.cn, tianxm333333@foxmail.com
†Bilal Ahmad, Ying Su, and Yani Hao contributed equally to the study.

Horticulture Research 12, Article number: uhaf166 (2025)
doi: https://doi.org/10.1093/hr/uhaf166

Received: 01 Mar 2025
Accepted: 16 Jun 2025
Published online: 01 Jul 2025

Abstract

Most genomic studies begin by mapping sequencing data to a reference genome. The quality of the reference assembly, its genetic relatedness to the studied population, and the mapping method employed directly affect variant-calling accuracy and downstream genomic analyses, and can thereby introduce reference bias and lead to erroneous conclusions. However, the impacts of reference bias have received limited attention. This study compared population genomic analyses across four reference genomes of mango (Mangifera indica): the two haploid assemblies of a haplotype-resolved telomere-to-telomere (T2T) genome assembly, a pangenome, and an older version of the reference genome available on NCBI. The choice of reference genome dramatically affected mapping efficiency and produced notable differences in the genetic variants called, particularly structural variations (SVs). Phylogenetic analysis was more sensitive to the choice of reference genome than genetic differentiation was. The results of population genomic analyses of artificial selection during domestication and of SV hotspot regions varied across reference genomes. Notably, gene enrichment analyses differed significantly in their top enriched biological processes depending on the reference genome used. Overall, the mango pangenome outperformed the other reference genomes across various metrics, followed by the T2T reference genomes, as these captured greater diversity and effectively reduced reference bias. Our findings highlight the role of the mango pangenome in reducing reference bias and underscore the critical role of reference genome selection, which appears to be among the most important factors in population genomic studies.