Clinical Sequencing Data Sharing Is Essential



The past few decades have seen rapid advances in our knowledge of genetic diseases, which affect an estimated 25 million Americans. These advances can be measured by the growth of dbSNP (which now contains about 90 million validated genetic variants) and by the number of Mendelian disorders understood at the genetic level (over 5,000).

Some of the factors that have contributed to this progress include:

  • Big science. Ambitious, grant-supported, international efforts like the Human Genome Project, the HapMap Project, and the Cancer Genome Atlas yielded the public resources that form the foundation of modern human genetics research. Thank you, taxpayers.
  • Technology development. Revolutionary advances in genome interrogation technologies (high density SNP arrays, whole-genome sequencing, etc.) have made large-scale genetic studies feasible, both technically and financially.
  • Study participants. It’s important to remember that most (if not all) human genetics studies could not have happened without the patients and families who volunteered their samples, often with the knowledge that they’d get nothing in return.

The Unsolved Problem of Inherited Disease

Few areas have benefited as much from these advances as the study of rare genetic diseases. Exome sequencing has enabled the rapid genetic diagnosis of many patients, and the discovery of hundreds of new Mendelian disease genes. Yet even well-powered Mendelian disease studies can fail for a variety of reasons. There’s also a considerable gray area between success and failure: the implication of an unknown gene, or one that has never been associated with disease.

One particular challenge is that Mendelian diseases are rare by definition, and the variants definitively shown to cause them are rarer still. As a result, many variants detected in clinical sequencing projects end up with the label “variant of unknown significance,” or VUS. Even when given a classification, some variants are interpreted differently by different clinical laboratories.

As discussed in a report in the New England Journal of Medicine this week, another thing that has hampered our ability to discover and annotate clinically relevant genetic variation is the “silo effect,” in which research groups (both commercial and academic) maintain private databases of clinical sequencing results. A great example of this is Myriad Genetics, a company that’s probably sitting on the largest database of BRCA1/2 mutations in the world.

The problem, of course, is that not all of the clinical datasets for a given disease or gene end up in the same silo. Thus, researchers in group A might have a promising new disease gene that researchers in group B have also identified in a different kindred. If those datasets were shared, rather than kept isolated, these groups could cross-validate their findings with one another, and the research community as a whole would benefit.

Data Sharing in ClinVar

The NIH’s Clinical Genome Resource program (ClinGen) hopes to address some of these issues by developing community resources to improve our understanding of genomic variation and its use in clinical care. The cornerstone of this effort is ClinVar, a database of variants annotated with clinical data.

ClinVar Contributors

Over 300 different submitters have contributed to ClinVar thus far. Those submitters comprise research groups, clinical laboratories, locus-specific databases, and aggregate databases (like OMIM). Here’s a plot of the variants submitted for some of the major (or interesting) contributors:

[Figure: ClinVar Submitters]

The largest submitter by far is OMIM, which has contributed over 25,000 variants to ClinVar. It’s encouraging to see two of the leading genetic testing providers (GeneDx and Ambry Genetics) making substantial contributions. Among academic centers, the University of Chicago and Emory University are the clear leaders.

As of May 2015, ClinVar contained 172,055 variant submissions across 22,864 genes. More than 118,000 unique variants have clinical annotations, though 21% of those are classified as “variant of unknown significance.” Nevertheless, this rapidly growing resource illustrates the power of sharing clinical variant annotations in a centralized manner.

Discordant Clinical Annotations

Notably, 12,895 variants have clinical annotations (pathogenic, unknown, or benign) from at least two different laboratories, and 17% of the time those annotations do not agree. For example, at least 220 of the “pathogenic” variants pulled in from OMIM (the largest contributing database) are classified by clinical laboratories as either benign or of unknown significance.
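To make the idea of a discordance rate concrete, here is a minimal Python sketch of how one might count it from a table of shared submissions. Everything in it is an assumption for illustration: the record layout (variant, submitter, classification), the `discordance` function name, and the toy rows, which are made up and are not real ClinVar data. It is not how ClinVar or the ClinGen authors computed their figures.

```python
from collections import defaultdict

# Hypothetical submission records: (variant_id, submitter, classification).
# These rows are invented for illustration; they are not real ClinVar data.
submissions = [
    ("var1", "LabA", "pathogenic"),
    ("var1", "LabB", "pathogenic"),
    ("var2", "LabA", "pathogenic"),
    ("var2", "LabB", "benign"),
    ("var3", "LabA", "unknown"),
    ("var4", "LabA", "benign"),
    ("var4", "LabC", "benign"),
    ("var4", "LabC", "benign"),  # repeat submission from the same lab
]


def discordance(records):
    """Count variants annotated by at least two submitters, and how many
    of those carry more than one distinct classification."""
    # variant -> submitter -> set of classifications from that submitter
    by_variant = defaultdict(dict)
    for variant, submitter, classification in records:
        by_variant[variant].setdefault(submitter, set()).add(classification)

    n_multi = 0       # variants with two or more distinct submitters
    n_discordant = 0  # of those, variants whose classifications disagree
    for calls in by_variant.values():
        if len(calls) < 2:
            continue
        n_multi += 1
        distinct = set().union(*calls.values())
        if len(distinct) > 1:
            n_discordant += 1
    return n_multi, n_discordant


n_multi, n_discordant = discordance(submissions)
print(f"{n_discordant} of {n_multi} multi-lab variants are discordant")
```

Note that a count like this is only possible when submissions from different laboratories sit in the same table, which is exactly what the silo effect prevents.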

It is clear that the guidelines for variant interpretation differ between laboratories, and need to be standardized. Even so, adopting standards and making the effort to share clinical variant findings and annotations (along with the relevant phenotype data) is critical to the success of rare disease research. ClinVar seems to be taking us in the right direction.


Rehm HL, Berg JS, Brooks LD, Bustamante CD, Evans JP, Landrum MJ, Ledbetter DH, Maglott DR, Martin CL, Nussbaum RL, Plon SE, Ramos EM, Sherry ST, Watson MS, & ClinGen (2015). ClinGen – The Clinical Genome Resource. The New England Journal of Medicine. PMID: 26014595
