Article|06 Feb 2025|OPEN
The China National GeneBank Sequence Archive (CNSA) 2024 update
Weiwen Wang1,2,3,†, Cong Tan4,5,†, Ling Li1,2,3,†, Xia Li2,3, Lei Zhang5, Xiaoqiang Li5, Jieyu Wang2,3, Ziyi He5, Tao Yang2,3, Kailong Ma2,3, Qingjiang Hu2,3, Wenzhen Yang2,3, Zhiyong Li5, Mingwen Zhang5, Wensi Du2,3, Fan Yang2,3, Zhicheng Xu2,3, Xizheng Ma2,3, Jiawei Tong5, Jia Cai5, Cong Hua5, Fengzhen Chen3, Lijin You2,3, Liang Li2,3, Wenjun Zeng2,3, Bo Wang2,3,*, Xun Xu3,*, and Xiaofeng Wei1,2,3,*
1Guangdong Provincial Genomics Data Center, BGI Research, Beishan Road, Yantian District, Shenzhen 518120, China
2China National GeneBank, BGI Research, Jinsha Road, Dapeng District, Shenzhen 518120, China
3BGI Research, Beishan Road, Yantian District, Shenzhen 518083, China
4State Key Laboratory of Agricultural Genomics, BGI Bioverse, Yunhua Road, Yantian District, Shenzhen 518083, China
5Data Application Center, BGI Research, Kejisan Road, Donghu District, Wuhan 430074, China
*Corresponding authors. E-mail: wangbo@cngb.org, xuxun@genomics.cn, weixiaofeng@cngb.org
†Weiwen Wang, Cong Tan, and Ling Li contributed equally to the study.

Horticulture Research 12,
Article number: uhaf036 (2025)
doi: https://doi.org/10.1093/hr/uhaf036

Received: 14 Sep 2024
Accepted: 25 Jan 2025
Published online: 06 Feb 2025

Abstract

The China National GeneBank Sequence Archive (CNSA) is an open, freely accessible, curated data repository built for archiving, sharing, and reusing multiomics data. The remarkable advancement of sequencing technologies has triggered a paradigm shift in life science research, but it also poses tremendous challenges for the research community in data management and reusability. In particular, the rapid progress of technologies such as spatial transcriptome sequencing has brought an unprecedented explosion in sequence data and new requirements for data archiving. CNSA was established in 2017 as one of the fundamental infrastructures offering multiomics data archiving to the worldwide research community. Here, we present the latest enhancements of CNSA: the dramatic increase in the volume and variety of archived data, the newest features and services implemented in CNSA, and its continuing efforts to support global cooperation in biodiversity preservation and utilization. CNSA provides public archiving and open-sharing services for sequencing data and the associated metadata, covering the genome, transcriptome, metabolome, and proteome from the single-cell (including spatially resolved) level to the individual and population levels, as well as downstream analysis results. As of 2024, CNSA has archived >16.3 petabytes of data and provided data curation, preservation, and open-sharing services for >1581 publications from >560 institutions. It plays a pivotal role in supporting global scientific projects such as the 10 000 Plant Genomes Project. To date, CNSA has been recommended by various academic publishers, including Cell, Elsevier, and Oxford University Press. CNSA is accessible at https://db.cngb.org/cnsa/.