INTRODUCTION:

dbNSFP is a database developed for functional prediction and annotation of all potential non-synonymous single-nucleotide variants (nsSNVs) in the human genome. Its current version is based on Gencode release 29 / Ensembl version 94 and includes a total of 84,013,490 nsSNVs and ssSNVs (splicing-site SNVs). It compiles prediction scores from 43 prediction algorithms (SIFT, SIFT4G, Polyphen2-HDIV, Polyphen2-HVAR, LRT, MutationTaster2, MutationAssessor, FATHMM, MetaSVM, MetaLR, MetaRNN, CADD, CADD_hg19, VEST4, PROVEAN, FATHMM-MKL coding, FATHMM-XF coding, fitCons x 4, LINSIGHT, DANN, GenoCanyon, Eigen, Eigen-PC, M-CAP, REVEL, MutPred, MVP, gMVP, MPC, PrimateAI, DEOGEN2, BayesDel_addAF, BayesDel_noAF, ClinPred, LIST-S2, VARITY, ESM1b, EVE, AlphaMissense, ALoFT), 9 conservation scores (PhyloP x 3, phastCons x 3, GERP++, SiPhy and bStatistic) and other related information, including allele frequencies observed in the 1000 Genomes Project phase 3 data, UK10K cohorts data, ExAC consortium data, gnomAD data and the NHLBI Exome Sequencing Project ESP6500 data, various gene IDs from different databases, functional descriptions of genes, gene expression and gene interaction information, etc.

Some dbNSFP contents (which may not be up to date) can also be accessed through variant tools, ANNOVAR, KGGSeq, VarSome, UCSC Genome Browser's Variant Annotation Integrator, Ensembl Variant Effect Predictor, SnpSift and HGMD. Please cite our papers (see below) if you used dbNSFP contents through those tools. Please note that some component scores/contents of dbNSFP have specific requirements or licences for non-academic usage. dbNSFP does not grant non-academic usage of those scores/contents, so please contact the original score/content providers for that purpose.

Please join our Email group for news and updates from dbNSFP.

For whole genome annotation, we recommend our whole genome annotation pipeline WGSA, in which dbNSFP is a component resource.

We thank Dr. CS (Jonathan) Liu from Softgenetics and Amazon AWS Open Data Sponsorship Program for providing hosting space.

We welcome developers of functional prediction methods to provide their predictions and scores to the database. Please contact Dr. Xiaoming Liu (xmliu.uth{at}gmail.com).

DATABASE:

Selecting a database version below will direct you to the query page:




CITATION:

1. Liu X, Jian X, and Boerwinkle E. 2011. dbNSFP: a lightweight database of human non-synonymous SNPs and their functional predictions. Human Mutation. 32:894-899.

2. Liu X, Jian X, and Boerwinkle E. 2013. dbNSFP v2.0: A Database of Human Non-synonymous SNVs and Their Functional Predictions and Annotations. Human Mutation. 34:E2393-E2402.

3. Liu X, Wu C, Li C and Boerwinkle E. 2016. dbNSFP v3.0: A One-Stop Database of Functional Predictions and Annotations for Human Non-synonymous and Splice Site SNVs. Human Mutation. 37:235-241.

4. Liu X, Li C, Mou C, Dong Y, and Tu Y. 2020. dbNSFP v4: a comprehensive database of transcript-specific functional predictions and annotations for human nonsynonymous and splice-site SNVs. Genome Medicine. 12:103.

If you used dbNSFP v1.x, please cite our paper 1. If you used dbNSFP v2.x, please cite our papers 1 & 2. If you used dbNSFP v3.x, please cite our papers 1 & 3. If you used dbNSFP v4.x, please cite our papers 1 & 4.

If you used our ensemble scores (MetaSVM and MetaLR), which are based on 10 component scores (SIFT, PolyPhen-2 HDIV, PolyPhen-2 HVAR, GERP++, MutationTaster, Mutation Assessor, FATHMM, LRT, SiPhy, PhyloP) and the maximum frequency observed in the 1000 genomes populations, please cite:

1. Dong C, Wei P, Jian X, Gibbs R, Boerwinkle E, Wang K* and Liu X*. (2015) Comparison and integration of deleteriousness prediction methods for nonsynonymous SNVs in whole exome sequencing studies. Human Molecular Genetics 24(8):2125-2137. *corresponding authors [preprint]

VERSIONS:

NEW VERSION (November 2, 2023): dbNSFP v4.5 is released. ClinVar has been updated to 20231028. ESM1b, EVE and AlphaMissense scores have been added. AlphaMissense scores are for non-commercial research use only: "AlphaMissense Database Copyright (2023) DeepMind Technologies Limited. All predictions are provided for non-commercial research use only under CC BY-NC-SA license." The derived AlphaMissense_score, AlphaMissense_rankscore, and AlphaMissense_pred columns distributed in dbNSFP are also under the CC BY-NC-SA license and are only included in the "a" branch of dbNSFP. A copy of the CC BY-NC-SA license can be found at https://creativecommons.org/licenses/by-nc-sa/4.0/.

Two branches of dbNSFP are provided: dbNSFP4.5a suitable for academic use, which includes all the resources, and dbNSFP4.5c suitable for commercial use, which does not include Polyphen2, VEST, REVEL, ClinPred, CADD, LINSIGHT, GenoCanyon, and AlphaMissense. dbNSFP4.5a can be downloaded from Amazon or Box or googledrive or softgenetics ftp. The md5sum of the zip file can be found here. A README file is here. dbNSFP4.5c can be downloaded from Amazon or Box or googledrive or softgenetics ftp. The md5sum of the zip file can be found here. A README file is here.
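A downloaded zip file can be checked against the published md5sum before unzipping. The following is a minimal sketch in Python; the function names are made up for this illustration and are not part of dbNSFP, and the published checksum string has to be copied by hand from the md5sum file linked above.

```python
import hashlib


def md5_of_file(path, chunk_size=1 << 20):
    """Return the hex MD5 digest of a file, reading it in 1 MiB chunks."""
    digest = hashlib.md5()
    with open(path, "rb") as handle:
        # Read in chunks so that a multi-gigabyte zip does not have to fit in memory.
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def matches_published_md5(path, published_md5):
    """Compare a file's MD5 with a published value, ignoring case and whitespace."""
    return md5_of_file(path) == published_md5.strip().lower()
```

If `matches_published_md5` returns False, the download is probably incomplete or corrupted and should be repeated.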

Columns added (dbNSFP_variant): ESM1b_score, ESM1b_rankscore, ESM1b_pred, EVE_score, EVE_rankscore, EVE_Class10_pred, EVE_Class20_pred, EVE_Class25_pred, EVE_Class30_pred, EVE_Class40_pred, EVE_Class50_pred, EVE_Class60_pred, EVE_Class70_pred, EVE_Class75_pred, EVE_Class80_pred, EVE_Class90_pred, AlphaMissense_score, AlphaMissense_rankscore, AlphaMissense_pred
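Newly added columns such as these can be pulled out by name rather than by position, so that a script keeps working when later releases add columns. The sketch below assumes a tab-delimited table whose header line starts with "#" (please check the README file for the actual layout); the helper `select_columns` and the tiny demo table are made up for this illustration.

```python
import csv
import io


def select_columns(lines, wanted):
    """Yield one dict per data row, holding only the wanted columns."""
    reader = csv.reader(lines, delimiter="\t", quoting=csv.QUOTE_NONE)
    header = next(reader)
    # Assumption: the header line is marked with a leading '#'.
    header[0] = header[0].lstrip("#")
    missing = [name for name in wanted if name not in header]
    if missing:
        raise KeyError(f"columns not found in header: {missing}")
    index = {name: header.index(name) for name in wanted}
    for row in reader:
        yield {name: row[i] for name, i in index.items()}


# Made-up two-line table, only to show the call; the values are not real scores.
demo = io.StringIO(
    "#chr\tpos\tAlphaMissense_score\tESM1b_score\n"
    "1\t100\t0.12\t-3.4\n"
)
rows = list(select_columns(demo, ["chr", "AlphaMissense_score"]))
```

For a gzip-compressed file, a text handle from `gzip.open(path, "rt")` can be passed in place of the `io.StringIO` object.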

NEW VERSION (May 6, 2023): dbNSFP v4.4 is released. gMVP and VARITY scores have been added. Allele frequencies of ALFA (Allele Frequency Aggregator) have been added. dbSNP has been updated to b156. ClinVar has been updated to 20230430. phyloP30way_mammalian has been replaced by phyloP470way_mammalian. phastCons30way_mammalian has been replaced by phastCons470way_mammalian. A bug in MutPred scores (not all SNVs causing the same AA change had scores) has been fixed. We thank Mária Šurinová for reporting this bug.

Two branches of dbNSFP are provided: dbNSFP4.4a suitable for academic use, which includes all the resources, and dbNSFP4.4c suitable for commercial use, which does not include Polyphen2, VEST, REVEL, ClinPred, CADD, LINSIGHT, and GenoCanyon. dbNSFP4.4a can be downloaded from Amazon or Box or googledrive or softgenetics ftp. The md5sum of the zip file can be found here. A README file is here. dbNSFP4.4c can be downloaded from Amazon or Box or googledrive or softgenetics ftp. The md5sum of the zip file can be found here. A README file is here.

Columns added (dbNSFP_variant): gMVP_score, gMVP_rankscore, VARITY_R_score, VARITY_R_rankscore, VARITY_ER_score, VARITY_ER_rankscore, VARITY_R_LOO_score, VARITY_R_LOO_rankscore, VARITY_ER_LOO_score, VARITY_ER_LOO_rankscore, ALFA_European_AC, ALFA_European_AN, ALFA_European_AF, ALFA_African_Others_AC, ALFA_African_Others_AN, ALFA_African_Others_AF, ALFA_East_Asian_AC, ALFA_East_Asian_AN, ALFA_East_Asian_AF, ALFA_African_American_AC, ALFA_African_American_AN, ALFA_African_American_AF, ALFA_Latin_American_1_AC, ALFA_Latin_American_1_AN, ALFA_Latin_American_1_AF, ALFA_Latin_American_2_AC, ALFA_Latin_American_2_AN, ALFA_Latin_American_2_AF, ALFA_Other_Asian_AC, ALFA_Other_Asian_AN, ALFA_Other_Asian_AF, ALFA_South_Asian_AC, ALFA_South_Asian_AN, ALFA_South_Asian_AF, ALFA_Other_AC, ALFA_Other_AN, ALFA_Other_AF, ALFA_African_AC, ALFA_African_AN, ALFA_African_AF, ALFA_Asian_AC, ALFA_Asian_AN, ALFA_Asian_AF, ALFA_Total_AC, ALFA_Total_AN, ALFA_Total_AF

Column name changes (dbNSFP_variant): phyloP30way_mammalian changed to phyloP470way_mammalian, phyloP30way_mammalian_rankscore changed to phyloP470way_mammalian_rankscore, phastCons30way_mammalian changed to phastCons470way_mammalian, phastCons30way_mammalian_rankscore changed to phastCons470way_mammalian_rankscore
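Scripts written against an earlier release may still request the old column names. A minimal sketch of translating such a column list with the four v4.4 renames listed above follows; the names `RENAMED_IN_V4_4` and `update_column_names` are made up for this illustration.

```python
# Old name -> new name, taken from the v4.4 column name changes listed above.
RENAMED_IN_V4_4 = {
    "phyloP30way_mammalian": "phyloP470way_mammalian",
    "phyloP30way_mammalian_rankscore": "phyloP470way_mammalian_rankscore",
    "phastCons30way_mammalian": "phastCons470way_mammalian",
    "phastCons30way_mammalian_rankscore": "phastCons470way_mammalian_rankscore",
}


def update_column_names(columns, renames=RENAMED_IN_V4_4):
    """Return the column list with renamed columns replaced; other names pass through."""
    return [renames.get(name, name) for name in columns]


updated = update_column_names(["phyloP30way_mammalian", "GERP++_RS"])
```

The same pattern can be applied to the rename lists of other releases by building one dictionary per release.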

NEW VERSION (February 18, 2022): dbNSFP v4.3 is released. REVEL scores have been updated with transcript ids, i.e., the scores are now transcript-specific. Genotypes of Chagyrskaya Neandertals have been added. dbSNP has been updated to b155. ClinVar has been updated to 20220122. dbNSFP4.3a can be downloaded from Amazon or Box or googledrive. The md5sum of the zip file can be found here. A README file is here. dbNSFP4.3c can be downloaded from Amazon or Box or googledrive. The md5sum of the zip file can be found here. A README file is here.

Columns added (dbNSFP_variant): ChagyrskayaNeandertal

UPDATE (July 27, 2021): Allele Frequencies of gnomAD v3.1 mtDNA have been added. Users who want to use those annotations can download the corresponding files for v4.2a and v4.2c and the updated readme files for v4.2a and v4.2c.

UPDATE (July 14, 2021): There is a bug in the search program when adding columns of gene annotations. We thank Eun Gyo Kim for reporting it. Please re-download the search programs for v4.2a and v4.2c and unzip the files to the dbNSFP folder.

UPDATE (April 14, 2021): We added the support of searching SpliceAI to the command-line only version of the search programs for v4.2a and v4.2c. If you want to use those programs, please download the corresponding zip files. Then unzip the files to the dbNSFP folder and run them according to the updated search_dbNSFP readme file.

NEW VERSION (April 6, 2021): dbNSFP v4.2 is released. MetaRNN scores (https://doi.org/10.1101/2021.04.09.438706) have been added. MetaRNN is a deep-learning-based ensemble pathogenicity prediction score for nsSNVs and non-frameshift indels. MetaRNN uses a recurrent neural network (RNN) to integrate information from 16 high-level pathogenicity prediction scores, 8 conservation scores, and allele frequency information from the 1000 Genomes Project (1000GP), ExAC, and gnomAD. Allele frequencies of gnomAD exome have been updated to r2.1.1. Allele frequencies of gnomAD genome have been updated to v3.1. dbSNP has been updated to b154. ClinVar has been updated to 20210131. dbNSFP4.2a can be downloaded from ftp://dbnsfp:dbnsfp@dbnsfp.softgenetics.com/dbNSFP4.2a.zip or googledrive. The md5sum of the zip file can be found here. A README file is here. dbNSFP4.2c can be downloaded from ftp://dbnsfp:dbnsfp@dbnsfp.softgenetics.com/dbNSFP4.2c.zip or googledrive. The md5sum of the zip file can be found here. A README file is here.

Columns added (dbNSFP_variant): MetaRNN_score, MetaRNN_rankscore, MetaRNN_pred, gnomAD_exomes_non_neuro_AN, gnomAD_exomes_non_neuro_AF, gnomAD_exomes_non_neuro_nhomalt, gnomAD_exomes_non_cancer_AC, gnomAD_exomes_non_cancer_AN, gnomAD_exomes_non_cancer_AF, gnomAD_exomes_non_cancer_nhomalt, gnomAD_exomes_non_topmed_AC, gnomAD_exomes_non_topmed_AN, gnomAD_exomes_non_topmed_AF, gnomAD_exomes_non_topmed_nhomalt, gnomAD_exomes_non_neuro_AFR_AC, gnomAD_exomes_non_neuro_AFR_AN, gnomAD_exomes_non_neuro_AFR_AF, gnomAD_exomes_non_neuro_AFR_nhomalt, gnomAD_exomes_non_neuro_AMR_AC, gnomAD_exomes_non_neuro_AMR_AN, gnomAD_exomes_non_neuro_AMR_AF, gnomAD_exomes_non_neuro_AMR_nhomalt, gnomAD_exomes_non_neuro_ASJ_AC, gnomAD_exomes_non_neuro_ASJ_AN, gnomAD_exomes_non_neuro_ASJ_AF, gnomAD_exomes_non_neuro_ASJ_nhomalt, gnomAD_exomes_non_neuro_EAS_AC, gnomAD_exomes_non_neuro_EAS_AN, gnomAD_exomes_non_neuro_EAS_AF, gnomAD_exomes_non_neuro_EAS_nhomalt, gnomAD_exomes_non_neuro_FIN_AC, gnomAD_exomes_non_neuro_FIN_AN, gnomAD_exomes_non_neuro_FIN_AF, gnomAD_exomes_non_neuro_FIN_nhomalt, gnomAD_exomes_non_neuro_NFE_AC, gnomAD_exomes_non_neuro_NFE_AN, gnomAD_exomes_non_neuro_NFE_AF, gnomAD_exomes_non_neuro_NFE_nhomalt, gnomAD_exomes_non_neuro_SAS_AC, gnomAD_exomes_non_neuro_SAS_AN, gnomAD_exomes_non_neuro_SAS_AF, gnomAD_exomes_non_neuro_SAS_nhomalt, gnomAD_exomes_non_neuro_POPMAX_AC, gnomAD_exomes_non_neuro_POPMAX_AN, gnomAD_exomes_non_neuro_POPMAX_AF, gnomAD_exomes_non_neuro_POPMAX_nhomalt, gnomAD_exomes_non_cancer_AFR_AC, gnomAD_exomes_non_cancer_AFR_AN, gnomAD_exomes_non_cancer_AFR_AF, gnomAD_exomes_non_cancer_AFR_nhomalt, gnomAD_exomes_non_cancer_AMR_AC, gnomAD_exomes_non_cancer_AMR_AN, gnomAD_exomes_non_cancer_AMR_AF, gnomAD_exomes_non_cancer_AMR_nhomalt, gnomAD_exomes_non_cancer_ASJ_AC, gnomAD_exomes_non_cancer_ASJ_AN, gnomAD_exomes_non_cancer_ASJ_AF, gnomAD_exomes_non_cancer_ASJ_nhomalt, gnomAD_exomes_non_cancer_EAS_AC, gnomAD_exomes_non_cancer_EAS_AN, gnomAD_exomes_non_cancer_EAS_AF, 
gnomAD_exomes_non_cancer_EAS_nhomalt, gnomAD_exomes_non_cancer_FIN_AC, gnomAD_exomes_non_cancer_FIN_AN, gnomAD_exomes_non_cancer_FIN_AF, gnomAD_exomes_non_cancer_FIN_nhomalt, gnomAD_exomes_non_cancer_NFE_AC, gnomAD_exomes_non_cancer_NFE_AN, gnomAD_exomes_non_cancer_NFE_AF, gnomAD_exomes_non_cancer_NFE_nhomalt, gnomAD_exomes_non_cancer_SAS_AC, gnomAD_exomes_non_cancer_SAS_AN, gnomAD_exomes_non_cancer_SAS_AF, gnomAD_exomes_non_cancer_SAS_nhomalt, gnomAD_exomes_non_cancer_POPMAX_AC, gnomAD_exomes_non_cancer_POPMAX_AN, gnomAD_exomes_non_cancer_POPMAX_AF, gnomAD_exomes_non_cancer_POPMAX_nhomalt, gnomAD_exomes_non_topmed_AFR_AC, gnomAD_exomes_non_topmed_AFR_AN, gnomAD_exomes_non_topmed_AFR_AF, gnomAD_exomes_non_topmed_AFR_nhomalt, gnomAD_exomes_non_topmed_AMR_AC, gnomAD_exomes_non_topmed_AMR_AN, gnomAD_exomes_non_topmed_AMR_AF, gnomAD_exomes_non_topmed_AMR_nhomalt, gnomAD_exomes_non_topmed_ASJ_AC, gnomAD_exomes_non_topmed_ASJ_AN, gnomAD_exomes_non_topmed_ASJ_AF, gnomAD_exomes_non_topmed_ASJ_nhomalt, gnomAD_exomes_non_topmed_EAS_AC, gnomAD_exomes_non_topmed_EAS_AN, gnomAD_exomes_non_topmed_EAS_AF, gnomAD_exomes_non_topmed_EAS_nhomalt, gnomAD_exomes_non_topmed_FIN_AC, gnomAD_exomes_non_topmed_FIN_AN, gnomAD_exomes_non_topmed_FIN_AF, gnomAD_exomes_non_topmed_FIN_nhomalt, gnomAD_exomes_non_topmed_NFE_AC, gnomAD_exomes_non_topmed_NFE_AN, gnomAD_exomes_non_topmed_NFE_AF, gnomAD_exomes_non_topmed_NFE_nhomalt, gnomAD_exomes_non_topmed_SAS_AC, gnomAD_exomes_non_topmed_SAS_AN, gnomAD_exomes_non_topmed_SAS_AF, gnomAD_exomes_non_topmed_SAS_nhomalt, gnomAD_exomes_non_topmed_POPMAX_AC, gnomAD_exomes_non_topmed_POPMAX_AN, gnomAD_exomes_non_topmed_POPMAX_AF, gnomAD_exomes_non_topmed_POPMAX_nhomalt, gnomAD_genomes_MID_AC, gnomAD_genomes_MID_AN, gnomAD_genomes_MID_AF, gnomAD_genomes_MID_nhomalt, gnomAD_genomes_controls_and_biobanks_AC, gnomAD_genomes_controls_and_biobanks_AN, gnomAD_genomes_controls_and_biobanks_AF, gnomAD_genomes_controls_and_biobanks_nhomalt, 
gnomAD_genomes_non_neuro_AC, gnomAD_genomes_non_neuro_AN, gnomAD_genomes_non_neuro_AF, gnomAD_genomes_non_neuro_nhomalt, gnomAD_genomes_non_cancer_AC, gnomAD_genomes_non_cancer_AN, gnomAD_genomes_non_cancer_AF, gnomAD_genomes_non_cancer_nhomalt, gnomAD_genomes_non_topmed_AC, gnomAD_genomes_non_topmed_AN, gnomAD_genomes_non_topmed_AF, gnomAD_genomes_non_topmed_nhomalt, gnomAD_genomes_controls_and_biobanks_AFR_AC, gnomAD_genomes_controls_and_biobanks_AFR_AN, gnomAD_genomes_controls_and_biobanks_AFR_AF, gnomAD_genomes_controls_and_biobanks_AFR_nhomalt, gnomAD_genomes_controls_and_biobanks_AMI_AC, gnomAD_genomes_controls_and_biobanks_AMI_AN, gnomAD_genomes_controls_and_biobanks_AMI_AF, gnomAD_genomes_controls_and_biobanks_AMI_nhomalt, gnomAD_genomes_controls_and_biobanks_AMR_AC, gnomAD_genomes_controls_and_biobanks_AMR_AN, gnomAD_genomes_controls_and_biobanks_AMR_AF, gnomAD_genomes_controls_and_biobanks_AMR_nhomalt, gnomAD_genomes_controls_and_biobanks_ASJ_AC, gnomAD_genomes_controls_and_biobanks_ASJ_AN, gnomAD_genomes_controls_and_biobanks_ASJ_AF, gnomAD_genomes_controls_and_biobanks_ASJ_nhomalt, gnomAD_genomes_controls_and_biobanks_EAS_AC, gnomAD_genomes_controls_and_biobanks_EAS_AN, gnomAD_genomes_controls_and_biobanks_EAS_AF, gnomAD_genomes_controls_and_biobanks_EAS_nhomalt, gnomAD_genomes_controls_and_biobanks_FIN_AC, gnomAD_genomes_controls_and_biobanks_FIN_AN, gnomAD_genomes_controls_and_biobanks_FIN_AF, gnomAD_genomes_controls_and_biobanks_FIN_nhomalt, gnomAD_genomes_controls_and_biobanks_MID_AC, gnomAD_genomes_controls_and_biobanks_MID_AN, gnomAD_genomes_controls_and_biobanks_MID_AF, gnomAD_genomes_controls_and_biobanks_MID_nhomalt, gnomAD_genomes_controls_and_biobanks_NFE_AC, gnomAD_genomes_controls_and_biobanks_NFE_AN, gnomAD_genomes_controls_and_biobanks_NFE_AF, gnomAD_genomes_controls_and_biobanks_NFE_nhomalt, gnomAD_genomes_controls_and_biobanks_SAS_AC, gnomAD_genomes_controls_and_biobanks_SAS_AN, gnomAD_genomes_controls_and_biobanks_SAS_AF, 
gnomAD_genomes_controls_and_biobanks_SAS_nhomalt, gnomAD_genomes_non_neuro_AFR_AC, gnomAD_genomes_non_neuro_AFR_AN, gnomAD_genomes_non_neuro_AFR_AF, gnomAD_genomes_non_neuro_AFR_nhomalt, gnomAD_genomes_non_neuro_AMI_AC, gnomAD_genomes_non_neuro_AMI_AN, gnomAD_genomes_non_neuro_AMI_AF, gnomAD_genomes_non_neuro_AMI_nhomalt, gnomAD_genomes_non_neuro_AMR_AC, gnomAD_genomes_non_neuro_AMR_AN, gnomAD_genomes_non_neuro_AMR_AF, gnomAD_genomes_non_neuro_AMR_nhomalt, gnomAD_genomes_non_neuro_ASJ_AC, gnomAD_genomes_non_neuro_ASJ_AN, gnomAD_genomes_non_neuro_ASJ_AF, gnomAD_genomes_non_neuro_ASJ_nhomalt, gnomAD_genomes_non_neuro_EAS_AC, gnomAD_genomes_non_neuro_EAS_AN, gnomAD_genomes_non_neuro_EAS_AF, gnomAD_genomes_non_neuro_EAS_nhomalt, gnomAD_genomes_non_neuro_FIN_AC, gnomAD_genomes_non_neuro_FIN_AN, gnomAD_genomes_non_neuro_FIN_AF, gnomAD_genomes_non_neuro_FIN_nhomalt, gnomAD_genomes_non_neuro_MID_AC, gnomAD_genomes_non_neuro_MID_AN, gnomAD_genomes_non_neuro_MID_AF, gnomAD_genomes_non_neuro_MID_nhomalt, gnomAD_genomes_non_neuro_NFE_AC, gnomAD_genomes_non_neuro_NFE_AN, gnomAD_genomes_non_neuro_NFE_AF, gnomAD_genomes_non_neuro_NFE_nhomalt, gnomAD_genomes_non_neuro_SAS_AC, gnomAD_genomes_non_neuro_SAS_AN, gnomAD_genomes_non_neuro_SAS_AF, gnomAD_genomes_non_neuro_SAS_nhomalt, gnomAD_genomes_non_cancer_AFR_AC, gnomAD_genomes_non_cancer_AFR_AN, gnomAD_genomes_non_cancer_AFR_AF, gnomAD_genomes_non_cancer_AFR_nhomalt, gnomAD_genomes_non_cancer_AMI_AC, gnomAD_genomes_non_cancer_AMI_AN, gnomAD_genomes_non_cancer_AMI_AF, gnomAD_genomes_non_cancer_AMI_nhomalt, gnomAD_genomes_non_cancer_AMR_AC, gnomAD_genomes_non_cancer_AMR_AN, gnomAD_genomes_non_cancer_AMR_AF, gnomAD_genomes_non_cancer_AMR_nhomalt, gnomAD_genomes_non_cancer_ASJ_AC, gnomAD_genomes_non_cancer_ASJ_AN, gnomAD_genomes_non_cancer_ASJ_AF, gnomAD_genomes_non_cancer_ASJ_nhomalt, gnomAD_genomes_non_cancer_EAS_AC, gnomAD_genomes_non_cancer_EAS_AN, gnomAD_genomes_non_cancer_EAS_AF, gnomAD_genomes_non_cancer_EAS_nhomalt, 
gnomAD_genomes_non_cancer_FIN_AC, gnomAD_genomes_non_cancer_FIN_AN, gnomAD_genomes_non_cancer_FIN_AF, gnomAD_genomes_non_cancer_FIN_nhomalt, gnomAD_genomes_non_cancer_MID_AC, gnomAD_genomes_non_cancer_MID_AN, gnomAD_genomes_non_cancer_MID_AF, gnomAD_genomes_non_cancer_MID_nhomalt, gnomAD_genomes_non_cancer_NFE_AC, gnomAD_genomes_non_cancer_NFE_AN, gnomAD_genomes_non_cancer_NFE_AF, gnomAD_genomes_non_cancer_NFE_nhomalt, gnomAD_genomes_non_cancer_SAS_AC, gnomAD_genomes_non_cancer_SAS_AN, gnomAD_genomes_non_cancer_SAS_AF, gnomAD_genomes_non_cancer_SAS_nhomalt, gnomAD_genomes_non_topmed_AFR_AC, gnomAD_genomes_non_topmed_AFR_AN, gnomAD_genomes_non_topmed_AFR_AF, gnomAD_genomes_non_topmed_AFR_nhomalt, gnomAD_genomes_non_topmed_AMI_AC, gnomAD_genomes_non_topmed_AMI_AN, gnomAD_genomes_non_topmed_AMI_AF, gnomAD_genomes_non_topmed_AMI_nhomalt, gnomAD_genomes_non_topmed_AMR_AC, gnomAD_genomes_non_topmed_AMR_AN, gnomAD_genomes_non_topmed_AMR_AF, gnomAD_genomes_non_topmed_AMR_nhomalt, gnomAD_genomes_non_topmed_ASJ_AC, gnomAD_genomes_non_topmed_ASJ_AN, gnomAD_genomes_non_topmed_ASJ_AF, gnomAD_genomes_non_topmed_ASJ_nhomalt, gnomAD_genomes_non_topmed_EAS_AC, gnomAD_genomes_non_topmed_EAS_AN, gnomAD_genomes_non_topmed_EAS_AF, gnomAD_genomes_non_topmed_EAS_nhomalt, gnomAD_genomes_non_topmed_FIN_AC, gnomAD_genomes_non_topmed_FIN_AN, gnomAD_genomes_non_topmed_FIN_AF, gnomAD_genomes_non_topmed_FIN_nhomalt, gnomAD_genomes_non_topmed_MID_AC, gnomAD_genomes_non_topmed_MID_AN, gnomAD_genomes_non_topmed_MID_AF, gnomAD_genomes_non_topmed_MID_nhomalt, gnomAD_genomes_non_topmed_NFE_AC, gnomAD_genomes_non_topmed_NFE_AN, gnomAD_genomes_non_topmed_NFE_AF, gnomAD_genomes_non_topmed_NFE_nhomalt, gnomAD_genomes_non_topmed_SAS_AC, gnomAD_genomes_non_topmed_SAS_AN, gnomAD_genomes_non_topmed_SAS_AF, gnomAD_genomes_non_topmed_SAS_nhomalt

Column name changes (dbNSFP_variant): rs_dbSNP151 changed to rs_dbSNP

UPDATE (March 12, 2021): A bug has been fixed: some ALoFT scores/information were missing in dbNSFP. We thank Dr. Shuwei Li for reporting this bug. If you want to use ALoFT scores, please re-download the updated version. dbNSFP4.1a can be downloaded from softgenetics ftp or googledrive. The md5sum of the zip file can be found here. A README file is here. dbNSFP4.1c can be downloaded from softgenetics ftp or googledrive. The md5sum of the zip file can be found here. A README file is here.

UPDATE (Feb 10, 2021): A bug has been fixed: the gnomAD_pLI, gnomAD_pRec and gnomAD_pNull scores in dbNSFP4.1_gene.gz and dbNSFP4.1_gene.complete.gz did not always correspond to the canonical transcripts of the genes. We thank Dr. Raphaël Helaers for reporting this bug. If you want to use those scores, please download the updated version of dbNSFP4.1_gene.gz and dbNSFP4.1_gene.complete.gz to replace the old files.

UPDATE (Jan 27, 2021): Because the search programs provided in v4.1 cannot run in a command-line environment without X11 support, here we add back the command-line only version of the search programs for v4.1a and v4.1c. If you want to use those programs, please download the corresponding zip files. Then unzip the files to the dbNSFP folder and run them according to the updated search_dbNSFP readme file.

UPDATE (June 16, 2020): dbNSFP v4.1 is released. BayesDel (https://doi.org/10.1002/humu.23158), ClinPred (https://doi.org/10.1016/j.ajhg.2018.08.005) and LIST-S2 (https://doi.org/10.1093/nar/gkaa288) scores have been added. CADD has been updated to v1.6, and a CADD score based on the hg19 model has been added. gnomAD genomes have been updated to r3.0: populations AMI (Amish) and SAS (South Asian) have been added; controls have been removed. ClinVar and GTEx have been updated. HPO terms have been added to dbNSFP_gene. The search_dbNSFP programs now support searching SpliceAI (https://doi.org/10.1016/j.cell.2018.12.015) as an attached database; please refer to the readme files of the search_dbNSFP programs for details.

Two branches of dbNSFP are provided: dbNSFP4.1a suitable for academic use, which includes all the resources, and dbNSFP4.1c suitable for commercial use, which does not include Polyphen2, VEST, REVEL, ClinPred, CADD, LINSIGHT, and GenoCanyon.

dbNSFP4.1a can be downloaded from softgenetics ftp or googledrive. The md5sum of the zip file can be found here. A README file is here. dbNSFP4.1c can be downloaded from softgenetics ftp or googledrive. The md5sum of the zip file can be found here. A README file is here.

Columns added (dbNSFP_variant): BayesDel_addAF_score, BayesDel_addAF_rankscore, BayesDel_addAF_pred, BayesDel_noAF_score, BayesDel_noAF_rankscore, BayesDel_noAF_pred, LIST-S2_score, LIST-S2_rankscore, LIST-S2_pred, CADD_raw_hg19, CADD_raw_rankscore_hg19, CADD_phred_hg19, gnomAD_genomes_AMI_AC, gnomAD_genomes_AMI_AN, gnomAD_genomes_AMI_AF, gnomAD_genomes_AMI_nhomalt, gnomAD_genomes_SAS_AC, gnomAD_genomes_SAS_AN, gnomAD_genomes_SAS_AF, gnomAD_genomes_SAS_nhomalt

Column name changes (dbNSFP_variant): GTEx_V7_gene changed to GTEx_V8_gene, GTEx_V7_tissue changed to GTEx_V8_tissue

Columns deleted (dbNSFP_variant): gnomAD_genomes_controls_AC, gnomAD_genomes_controls_AN, gnomAD_genomes_controls_AF, gnomAD_genomes_controls_nhomalt, gnomAD_genomes_controls_AFR_AC, gnomAD_genomes_controls_AFR_AN, gnomAD_genomes_controls_AFR_AF, gnomAD_genomes_controls_AFR_nhomalt, gnomAD_genomes_controls_AMR_AC, gnomAD_genomes_controls_AMR_AN, gnomAD_genomes_controls_AMR_AF, gnomAD_genomes_controls_AMR_nhomalt, gnomAD_genomes_controls_ASJ_AC, gnomAD_genomes_controls_ASJ_AN, gnomAD_genomes_controls_ASJ_AF, gnomAD_genomes_controls_ASJ_nhomalt, gnomAD_genomes_controls_EAS_AC, gnomAD_genomes_controls_EAS_AN, gnomAD_genomes_controls_EAS_AF, gnomAD_genomes_controls_EAS_nhomalt, gnomAD_genomes_controls_FIN_AC, gnomAD_genomes_controls_FIN_AN, gnomAD_genomes_controls_FIN_AF, gnomAD_genomes_controls_FIN_nhomalt, gnomAD_genomes_controls_NFE_AC, gnomAD_genomes_controls_NFE_AN, gnomAD_genomes_controls_NFE_AF, gnomAD_genomes_controls_NFE_nhomalt, gnomAD_genomes_controls_POPMAX_AC, gnomAD_genomes_controls_POPMAX_AN, gnomAD_genomes_controls_POPMAX_AF, gnomAD_genomes_controls_POPMAX_nhomalt

Columns added (dbNSFP_gene): HPO_id, HPO_name

UPDATE (May 15, 2020): A minor bug is fixed in dbNSFP v4.0. In the previous release, the column Primate_AI_pred was not 100% correct. We thank Alex Kouris for reporting this issue. If you want to use Primate_AI_pred please download it again.

Two branches of dbNSFP are provided: dbNSFP4.0a suitable for academic use, which includes all the resources, and dbNSFP4.0c suitable for commercial use, which does not include Polyphen2, VEST, REVEL, CADD, LINSIGHT, and GenoCanyon.

dbNSFP4.0a can be downloaded from softgenetics ftp or googledrive. The md5sum of the zip file can be found here. A README file is here. dbNSFP4.0c can be downloaded from softgenetics ftp or googledrive. The md5sum of the zip file can be found here. A README file is here.

UPDATE (December 5, 2019): A minor bug is fixed in dbNSFP v4.0. In the previous release, the contents of the following columns were compressed, i.e., if the annotations for all transcripts were identical, only one annotation was presented: genename, cds_strand, refcodon, codonpos, codon_degeneracy, FATHMM_score, FATHMM_pred, Interpro_domain. In this new release, those columns are decompressed, i.e., they have the same number of annotations as the number of transcripts. A Java-based graphical user interface (GUI) search program (search_dbNSFP40a.jar or search_dbNSFP40c.jar) has been added. Users can double-click the jar file to launch the GUI (it also supports the command line; please check the search_dbNSFP readme pdf for details).
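With decompressed columns, the i-th annotation in a field corresponds to the i-th transcript, so the two fields can be paired position by position. The sketch below assumes that multiple annotations within a field are separated by ";" (please confirm the separator in the README file); the function name and the example ids are made up for this illustration.

```python
def pair_by_transcript(transcript_field, value_field, sep=";"):
    """Pair each transcript id with the annotation at the same position.

    Assumption: both fields list one entry per transcript, separated by `sep`.
    """
    transcripts = transcript_field.split(sep)
    values = value_field.split(sep)
    if len(values) != len(transcripts):
        # A compressed field (one value shared by all transcripts) ends up here.
        raise ValueError(
            f"{len(values)} annotations for {len(transcripts)} transcripts"
        )
    return list(zip(transcripts, values))


# Hypothetical ids and predictions, only to show the call.
pairs = pair_by_transcript("ENST_A;ENST_B", "T;D")
```

A length mismatch raises an error instead of being silently padded, which makes files from the earlier compressed release easy to spot.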

Two branches of dbNSFP are provided: dbNSFP4.0a suitable for academic use, which includes all the resources, and dbNSFP4.0c suitable for commercial use, which does not include Polyphen2, VEST, REVEL, CADD, LINSIGHT, and GenoCanyon.

dbNSFP4.0a can be downloaded from softgenetics ftp or googledrive. The md5sum of the zip file can be found here. A README file is here. dbNSFP4.0c can be downloaded from softgenetics ftp or googledrive. The md5sum of the zip file can be found here. A README file is here.

NEW VERSION (May 3, 2019): dbNSFP v4.0 is formally released. HGVS c. and p. presentations from ANNOVAR, SnpEff and VEP have been added. search_dbNSFP now supports search based on HGVS c. and p. presentations. Please refer to search_dbNSFP40a.readme.pdf or search_dbNSFP40c.readme.pdf for details. MedGen ID, OMIM ID and Orphanet ID from ClinVar have been added.

Two branches of dbNSFP are provided: dbNSFP4.0a suitable for academic use, which includes all the resources, and dbNSFP4.0c suitable for commercial use, which does not include Polyphen2, VEST, REVEL, CADD, LINSIGHT, and GenoCanyon. Please contact Dr. Xiaoming Liu (xmliu.uth{at}gmail.com) for commercial usage of dbNSFP.

dbNSFP4.0a can be downloaded from softgenetics ftp or googledrive. The md5sum of the zip file can be found here. A README file is here. dbNSFP4.0c can be downloaded from softgenetics ftp or googledrive. The md5sum of the zip file can be found here. A README file is here.

UPDATE (February 20, 2019): dbNSFP v4.0b2 is released for beta testing. Uniprot sprot_varsplic was included in the mapping from Uniprot to Ensembl. Fixed column title inconsistency between the README file and data file. (We thank Kevin Xin and Julius Jacobsen for pointing out the inconsistency.) dbMTS was added as an attached database. search_dbNSFP added support for searching dbMTS with option '-m'.

Two branches of dbNSFP are provided: dbNSFP4.0b2a suitable for academic use, which includes all the resources, and dbNSFP4.0b2c suitable for commercial use, which does not include Polyphen2, VEST, REVEL, CADD, LINSIGHT, and GenoCanyon. Please contact Dr. Xiaoming Liu (xmliu.uth{at}gmail.com) for commercial usage of dbNSFP.

dbNSFP4.0b2a can be downloaded from softgenetics ftp or googledrive. The md5sum of the zip file can be found here. A README file is here. dbNSFP4.0b2c can be downloaded from softgenetics ftp or googledrive. The md5sum of the zip file can be found here. A README file is here.

UPDATE (December 30, 2018): A bug that caused an id mapping issue from Uniprot to Ensembl, which in turn increased the missing rates of Polyphen2, MutationAssessor and DEOGEN2, has been found and fixed (we thank Dr. Daniele Raimondi). If you downloaded dbNSFP v4.0b1 before December 30, please download it again.

NEW VERSION (December 8, 2018): dbNSFP v4.0b1 is released for beta testing. The core set of nsSNVs and ssSNVs has been rebuilt based on Gencode 29/Ensembl 94 with human reference sequence hg38. Eight deleteriousness prediction scores (ALoFT, DEOGEN2, FATHMM-XF, MPC, MVP, PrimateAI, LINSIGHT, SIFT4G) have been added. Three conservation scores (phyloP17way_primate, phastCons17way_primate, bStatistic) have been added. Allele frequencies from the gnomAD controls subsets, eQTLs from the Geuvadis project, and genotypes of a Vindija33.19 Neanderthal have been added. Some resources have been updated, including VEST (We thank Dr. Karchin), CADD, M-CAP, ancestral alleles, dbSNP, ClinVar, GTEx and InterPro. The presentation of the prediction scores has been further improved by adding the correspondence to transcript/protein ids in a systematic way. APPRIS, GENCODE_basic, TSL and VEP_canonical have been added to facilitate the choice of appropriate transcripts. dbNSFP_gene has also been completely rebuilt using the up-to-date resources. HIPred, gene constraint scores from the gnomAD data, essential genes predictions based on CRISPR, gene-trap and gene networks have been added.

Two branches of dbNSFP are provided: dbNSFP4.0b1a suitable for academic use, which includes all the resources, and dbNSFP4.0b1c suitable for commercial use, which does not include Polyphen2, VEST, REVEL, CADD, LINSIGHT, and GenoCanyon. Please contact Dr. Xiaoming Liu (xmliu.uth{at}gmail.com) for commercial usage of dbNSFP.

dbNSFP4.0b1a can be downloaded from softgenetics ftp or googledrive. The md5sum of the zip file can be found here. A README file is here. dbNSFP4.0b1c can be downloaded from softgenetics ftp or googledrive. The md5sum of the zip file can be found here. A README file is here.

Columns added (dbNSFP_variant): VindijiaNeandertal, Uniprot_acc, Uniprot_entry, APPRIS, GENCODE_basic, TSL, VEP_canonical, MVP_score, MVP_rankscore, MPC_score, MPC_rankscore, PrimateAI_score, PrimateAI_rankscore, PrimateAI_pred, DEOGEN2_score, DEOGEN2_rankscore, DEOGEN2_pred, fathmm-XF_coding_score, fathmm-XF_coding_rankscore, fathmm-XF_coding_pred, bStatistic, bStatistic_rankscore, Aloft_Fraction_transcripts_affected, Aloft_prob_Tolerant, Aloft_prob_Recessive, Aloft_prob_Dominant, Aloft_pred, Aloft_Confidence, UK10K_AC, UK10K_AF, gnomAD_exomes_controls_AC, gnomAD_exomes_controls_AN, gnomAD_exomes_controls_AF, gnomAD_exomes_controls_nhomalt, gnomAD_exomes_controls_AFR_AC, gnomAD_exomes_controls_AFR_AN, gnomAD_exomes_controls_AFR_AF, gnomAD_exomes_controls_AFR_nhomalt, gnomAD_exomes_controls_AMR_AC, gnomAD_exomes_controls_AMR_AN, gnomAD_exomes_controls_AMR_AF, gnomAD_exomes_controls_AMR_nhomalt, gnomAD_exomes_controls_ASJ_AC, gnomAD_exomes_controls_ASJ_AN, gnomAD_exomes_controls_ASJ_AF, gnomAD_exomes_controls_ASJ_nhomalt, gnomAD_exomes_controls_EAS_AC, gnomAD_exomes_controls_EAS_AN, gnomAD_exomes_controls_EAS_AF, gnomAD_exomes_controls_EAS_nhomalt, gnomAD_exomes_controls_FIN_AC, gnomAD_exomes_controls_FIN_AN, gnomAD_exomes_controls_FIN_AF, gnomAD_exomes_controls_FIN_nhomalt, gnomAD_exomes_controls_NFE_AC, gnomAD_exomes_controls_NFE_AN, gnomAD_exomes_controls_NFE_AF, gnomAD_exomes_controls_NFE_nhomalt, gnomAD_exomes_controls_SAS_AC, gnomAD_exomes_controls_SAS_AN, gnomAD_exomes_controls_SAS_AF, gnomAD_exomes_controls_SAS_nhomalt, gnomAD_exomes_controls_POPMAX_AC, gnomAD_exomes_controls_POPMAX_AN, gnomAD_exomes_controls_POPMAX_AF, gnomAD_exomes_controls_POPMAX_nhomalt, gnomAD_exomes_nhomalt, gnomAD_exomes_AFR_nhomalt, gnomAD_exomes_AMR_nhomalt, gnomAD_exomes_ASJ_nhomalt, gnomAD_exomes_EAS_nhomalt, gnomAD_exomes_FIN_nhomalt, gnomAD_exomes_NFE_nhomalt, gnomAD_exomes_SAS_nhomalt, gnomAD_exomes_POPMAX_AC, gnomAD_exomes_POPMAX_AN, gnomAD_exomes_POPMAX_AF, 
gnomAD_exomes_POPMAX_nhomalt, gnomAD_exomes_flag, gnomAD_genomes_flag, gnomAD_genomes_nhomalt, gnomAD_genomes_AFR_nhomalt, gnomAD_genomes_AMR_nhomalt, gnomAD_genomes_ASJ_nhomalt, gnomAD_genomes_EAS_nhomalt, gnomAD_genomes_FIN_nhomalt, gnomAD_genomes_NFE_nhomalt, gnomAD_genomes_POPMAX_nhomalt, gnomAD_genomes_controls_AC, gnomAD_genomes_controls_AN, gnomAD_genomes_controls_AF, gnomAD_genomes_controls_nhomalt, gnomAD_genomes_controls_AFR_AC, gnomAD_genomes_controls_AFR_AN, gnomAD_genomes_controls_AFR_AF, gnomAD_genomes_controls_AFR_nhomalt, gnomAD_genomes_controls_AMR_AC, gnomAD_genomes_controls_AMR_AN, gnomAD_genomes_controls_AMR_AF, gnomAD_genomes_controls_AMR_nhomalt, gnomAD_genomes_controls_ASJ_AC, gnomAD_genomes_controls_ASJ_AN, gnomAD_genomes_controls_ASJ_AF, gnomAD_genomes_controls_ASJ_nhomalt, gnomAD_genomes_controls_EAS_AC, gnomAD_genomes_controls_EAS_AN, gnomAD_genomes_controls_EAS_AF, gnomAD_genomes_controls_EAS_nhomalt, gnomAD_genomes_controls_FIN_AC, gnomAD_genomes_controls_FIN_AN, gnomAD_genomes_controls_FIN_AF, gnomAD_genomes_controls_FIN_nhomalt, gnomAD_genomes_controls_NFE_AC, gnomAD_genomes_controls_NFE_AN, gnomAD_genomes_controls_NFE_AF, gnomAD_genomes_controls_NFE_nhomalt, gnomAD_genomes_controls_POPMAX_AC, gnomAD_genomes_controls_POPMAX_AN, gnomAD_genomes_controls_POPMAX_AF, gnomAD_genomes_controls_POPMAX_nhomalt, Geuvadis_eQTL_target_gene, clinvar_hgvs, clinvar_var_source, Eigen-raw_coding_rankscore, SIFT4G_score, SIFT4G_pred, SIFT4G_converted_rankscore, phyloP17way_primate, phyloP17way_primate_rankscore, phastCons17way_primate, phastCons17way_primate_rankscore

Column name changes (dbNSFP_variant): MutationAssessor_score_rankscore to MutationAssessor_rankscore, VEST3_score to VEST4_score, VEST3_rankscore to VEST4_rankscore, GenoCanyon_score_rankscore to GenoCanyon_rankscore, integrated_fitCons_score_rankscore to integrated_fitCons_rankscore, GM12878_fitCons_score_rankscore to GM12878_fitCons_rankscore, H1-hESC_fitCons_score_rankscore to H1-hESC_fitCons_rankscore, HUVEC_fitCons_score_rankscore to HUVEC_fitCons_rankscore, phyloP20way_mammalian to phyloP30way_mammalian, phyloP20way_mammalian_rankscore to phyloP30way_mammalian_rankscore, phastCons20way_mammalian to phastCons30way_mammalian, phastCons20way_mammalian_rankscore to phastCons30way_mammalian_rankscore, clinvar_golden_stars to clinvar_review, GTEx_V6p_gene to GTEx_V7_gene, GTEx_V6p_tissue to GTEx_V7_tissue, Eigen-raw to Eigen-raw_coding, Eigen-phred to Eigen-phred_coding, Eigen-PC-raw to Eigen-PC-raw_coding, Eigen-PC-phred to Eigen-PC-phred_coding, Eigen-PC-raw_rankscore to Eigen-PC-raw_coding_rankscore, rs_dbSNP150 to rs_dbSNP151, clinvar_rs to clinvar_id.
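For scripts that select columns by name, the renames above can be applied with a simple lookup table. The following is a minimal Python sketch, not part of dbNSFP itself: the names RENAMED and migrate_columns are hypothetical, and only a handful of the renames listed above are included, so the table should be extended with the rest before real use.

```python
# Old column name -> new column name, taken from the rename list above.
# Only a subset is shown here; add the remaining pairs as needed.
RENAMED = {
    "MutationAssessor_score_rankscore": "MutationAssessor_rankscore",
    "VEST3_score": "VEST4_score",
    "VEST3_rankscore": "VEST4_rankscore",
    "phyloP20way_mammalian": "phyloP30way_mammalian",
    "phastCons20way_mammalian": "phastCons30way_mammalian",
    "clinvar_golden_stars": "clinvar_review",
    "Eigen-raw": "Eigen-raw_coding",
    "rs_dbSNP150": "rs_dbSNP151",
    "clinvar_rs": "clinvar_id",
}


def migrate_columns(columns):
    """Return the column list with old names replaced by their new names.

    Names that are not in RENAMED are passed through unchanged.
    """
    return [RENAMED.get(name, name) for name in columns]


# Example: a column selection written against the older column names.
old_selection = ["rs_dbSNP150", "VEST3_score", "SIFT_score", "clinvar_rs"]
new_selection = migrate_columns(old_selection)
print(new_selection)
```

Note that some renames reflect an updated underlying resource (for example VEST3 to VEST4, or the 20-way to 30-way mammalian alignments), so values under the new name are not necessarily comparable to values stored under the old one.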

Columns deleted (dbNSFP_variant): Uniprot_acc_Polyphen2, Uniprot_id_Polyphen2, Uniprot_aapos_Polyphen2, MutationAssessor_UniprotID, MutationAssessor_variant, Transcript_id_VEST3, Transcript_var_VEST3, gnomAD_exomes_OTH_AC, gnomAD_exomes_OTH_AN, gnomAD_exomes_OTH_AF, gnomAD_genomes_OTH_AC, gnomAD_genomes_OTH_AN, gnomAD_genomes_OTH_AF, Eigen_coding_or_noncoding
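A script written against an earlier release may still request some of the deleted columns. The sketch below, in Python, flags such requests before a run fails on a missing column. The names REMOVED and removed_columns are hypothetical helpers introduced for illustration; the column names are copied from the list above.

```python
# Columns listed above as deleted from dbNSFP_variant.
REMOVED = frozenset({
    "Uniprot_acc_Polyphen2", "Uniprot_id_Polyphen2", "Uniprot_aapos_Polyphen2",
    "MutationAssessor_UniprotID", "MutationAssessor_variant",
    "Transcript_id_VEST3", "Transcript_var_VEST3",
    "gnomAD_exomes_OTH_AC", "gnomAD_exomes_OTH_AN", "gnomAD_exomes_OTH_AF",
    "gnomAD_genomes_OTH_AC", "gnomAD_genomes_OTH_AN", "gnomAD_genomes_OTH_AF",
    "Eigen_coding_or_noncoding",
})


def removed_columns(requested):
    """Return the requested column names that no longer exist, in input order."""
    return [name for name in requested if name in REMOVED]


# Example: check a column selection before querying the new release.
wanted = ["gnomAD_exomes_AF", "gnomAD_exomes_OTH_AF", "Transcript_id_VEST3"]
missing = removed_columns(wanted)
if missing:
    print("No longer available:", ", ".join(missing))
```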

Columns added (dbNSFP_gene): gnomAD_pLI, gnomAD_pRec, gnomAD_pNull, HIPred_score, HIPred, Essential_gene_CRISPR, Essential_gene_CRISPR2, Essential_gene_gene-trap, Gene_indispensability_score, Gene_indispensability_pred.

REMINDER: For whole genome annotation, we recommend our whole genome annotation pipeline WGSA. It currently supports SNP and indel annotation using hg19 and hg38 coordinates. dbNSFP v2.9.3 (the last dbNSFP release built natively on hg19) is one of its component resources.

REMINDER: If your SNP coordinates are based on hg19, remember to add the option "-v hg19" when using the search program, because the default coordinates are now hg38.
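When the search program is called from a wrapper script, making the genome build an explicit argument helps avoid silently querying hg38 with hg19 positions. The Python sketch below only assembles the command line. Only the "-v hg19" option comes from the reminder above; the class name search_dbNSFP40b1a, the "-i" and "-o" options, and the file names are assumptions used for illustration and should be checked against the README shipped with your download.

```python
def build_search_command(program, infile, outfile, genome="hg38"):
    """Assemble an argument list for the dbNSFP search program.

    program: Java class name of the search program (assumed, see README).
    genome:  coordinate system of the input positions; "-v" is added only
             when it differs from the hg38 default.
    """
    if genome not in ("hg19", "hg38"):
        raise ValueError("unsupported genome build in this sketch: " + genome)
    # "-i" and "-o" are assumed input/output options; verify in the README.
    command = ["java", program, "-i", infile, "-o", outfile]
    if genome != "hg38":
        command += ["-v", genome]
    return command


# Example with hypothetical file names and hg19-based input coordinates.
cmd = build_search_command("search_dbNSFP40b1a", "variants.in", "variants.out",
                           genome="hg19")
print(" ".join(cmd))
```

The resulting list can be passed to subprocess.run(cmd, check=True) from the directory that contains the search program and the database files.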

Contact Us:

         Xiaoming Liu, Ph.D.

         Associate Professor,

         USF Genomics,

Phone: 813-974-9865

Email: xiaomingliu@health.usf.edu

Lab Page: http://liulab.science

University of South Florida,
         3720 Spectrum Boulevard,
         Suite 304,
         Tampa, FL 33612




