Mendelian Disorder: Exome Annotations

Friday, February 17, 2012

Exome Annotations

I just posted a thread on 23andMe about which annotations I use for my exome data. Here's what I said:

I currently use ANNOVAR to annotate VCF files. ANNOVAR's output is not particularly intuitive, so I wrote a Perl script that generates a VCF-based report. I thought I would share the annotations I've been using and the ones I plan to add, and see if anyone else has other annotation ideas. These could be useful for annotating our own genomes (and potentially for 23andMe to provide in the future).

The annotations I've been including are:

Gene annotation (type of mutation--exonic, intronic, splicing, etc.)

Gene name

Mutational description (e.g. the specific amino acid change)

dbSNP130

dbSNP135

NHLBI Exome Variant Server (EVS), hosted at the University of Washington

Transcription Factor Binding Site (TFBS)

SIFT score

PolyPhen-2 score (PP2)

GWAS presence

Segmental duplication

(The reason I include both dbSNP130 and 135 is that 135 contains quite a few SNPs that are potentially meaningful from a disease and trait standpoint while 130 is mostly markers not directly affecting diseases and traits. 130 is a subset of 135. Also, the EVS is potentially more useful than either of them as a filtering device.)

Ones that I would like to include in the future:

VAAST

MIE sites/scores (Mendelian inheritance errors)

23andMe annotations (anything from 23andMe's SNP databases--can 23andMe help with that?)

Any other ideas for great annotations that should be included?

The idea behind these types of annotations is to give us a way to sift through the data and extract biologically meaningful results. For example, we are most interested in mutations that actually cause a protein coding change, that are uncommon in the population, and that are predicted to have a dramatic effect on function.

So far these types of annotations have allowed me to narrow very long lists of exome results (on the order of 30,000-50,000 variants) down to just a handful (1-20) of candidate mutations for particular Mendelian disorders.
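To make the filtering idea concrete, here is a minimal Python sketch of that kind of narrowing. It is not my actual Perl script. The `Variant` record, its field names, and the cutoffs are all assumptions made for illustration: the field names do not match ANNOVAR's output columns, 0.05 is a commonly used SIFT cutoff, and the PolyPhen-2 and frequency cutoffs are placeholders you would tune yourself.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class Variant:
    """Hypothetical annotated variant; field names are illustrative only."""
    gene: str
    region: str                # e.g. "exonic", "intronic", "splicing"
    effect: str                # e.g. "nonsynonymous", "synonymous", "stopgain"
    in_dbsnp130: bool          # True if the variant is listed in dbSNP130
    evs_freq: Optional[float]  # alternate allele frequency in EVS, None if absent
    sift: Optional[float]      # SIFT score (lower = more likely damaging)
    pp2: Optional[float]       # PolyPhen-2 score (higher = more likely damaging)


# Assumed set of effect labels that alter the protein product.
PROTEIN_CHANGING = {"nonsynonymous", "stopgain", "stoploss", "frameshift"}


def is_candidate(v, max_freq=0.01, sift_cutoff=0.05, pp2_cutoff=0.85):
    """Return True if the variant survives all three filters (assumed cutoffs)."""
    # 1. Keep only variants that change the protein coding sequence.
    if v.region != "exonic" or v.effect not in PROTEIN_CHANGING:
        return False

    # 2. Keep only variants that are uncommon in the population.
    if v.in_dbsnp130:
        return False
    if v.evs_freq is not None and v.evs_freq > max_freq:
        return False

    # 3. For missense changes, require a damaging prediction when scores exist.
    if v.effect == "nonsynonymous":
        has_score = v.sift is not None or v.pp2 is not None
        damaging = (
            (v.sift is not None and v.sift <= sift_cutoff)
            or (v.pp2 is not None and v.pp2 >= pp2_cutoff)
        )
        if has_score and not damaging:
            return False

    return True


# Made-up example records; the gene names are placeholders, not real findings.
variants = [
    Variant("GENE_A", "exonic", "nonsynonymous", False, 0.001, 0.01, 0.99),
    Variant("GENE_B", "exonic", "synonymous", False, None, None, None),
    Variant("GENE_C", "exonic", "nonsynonymous", True, 0.20, 0.40, 0.10),
    Variant("GENE_D", "intronic", "", False, None, None, None),
    Variant("GENE_E", "exonic", "stopgain", False, None, None, None),
]

candidates = [v.gene for v in variants if is_candidate(v)]
print(candidates)
```

In this sketch a variant with no SIFT or PolyPhen-2 score is kept rather than dropped, on the reasoning that a missing prediction is not evidence that the change is benign. Truncating changes such as stop gains skip the score check for the same reason.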

Anything I missed?

3 comments:

bioinformatico, April 13, 2012 at 12:17 AM
Hi,
I think you have a very good selection of annotations. It could be interesting to include the alternative allele frequency from the 1000 Genomes Project. An ANNOVAR annotation table for this project has been available since March.
Congratulations and thanks for the blog.
Jorge

Anonymous, September 23, 2013 at 12:33 PM
Any progress on "23andMe annotations (anything from 23andMe's SNP databases--can 23andMe help with that?)"?