GRANT 37

Research Institution: Concord Hospital & Garvan Institute of Medical Research

Principal Investigator: Professor Garth Nicholson

Type: Australian – NSW, AUSTRALIA

Project title: “The next wave of Whole Genome Sequencing-based FSHD diagnostics, and clinical measures of progress”

Status: Active

Summary

Facioscapulohumeral dystrophy (FSHD) is the third most common muscle disorder and causes progressive wasting and weakness, particularly of the face (facio), shoulder (scapulo) and arm (humeral).
The genetics of FSHD are complex and, until now, obtaining a genetic diagnosis for this condition has been difficult and labour intensive, and has involved multiple laboratory tests. A new diagnostic method, Whole Genome Sequencing (WGS), is able to diagnose many genetic disorders and may offer a new paradigm for diagnosing FSHD. This project seeks to develop a WGS-based method to diagnose all forms of FSHD and to facilitate novel disease gene identification. To overcome the challenging genetics of FSHD pathogenesis, we will develop novel bioinformatic methods and utilise the brand new Chromium platform from 10x Genomics to help resolve the “D4Z4 repeat” length in FSHD patients. This will also provide valuable insights into the genetic basis and disease mechanisms underlying this disorder. We are also looking at ways to monitor the natural history and progression of the disease, including specialised MRI scans. This will in turn allow treatment trials to be designed in the future. This project brings together experts in FSHD genetics and diagnostic testing with experts in clinical WGS and bioinformatics to develop this new test.

PROGRESS REPORTS


Update September 2019

Grant 37: The next wave of Whole Genome Sequencing-based FSHD diagnostics and clinical measures of progression

We have developed a single diagnostic method for FSHD using 10x Genomics Chromium library
preparation with whole genome sequencing (WGS), which is able to: distinguish between the
permissive chromosome 4qA haplotype and the non-permissive haplotypes (4qB, 10q); identify
contractions of the D4Z4 repeat; and genotype the SNP in the polyadenylation signal of the DUX4
gene, which determines whether a specific haplotype is permissive or not. This analysis can
diagnose FSHD1 patients (Aims 1 and 2) and can also diagnose FSHD2 patients by examining the
known and related FSHD2 disease genes.

The major challenge we have faced in this phase of the project has been getting the software
used to analyse Chromium sequencing data to work reliably. This software is used for genome
assembly and haplotyping. To overcome this, we are pursuing several alternatives: fixing bugs
in the software in consultation with 10x Genomics; seeking out alternative analysis tools
developed by other researchers who have experienced the same reliability issues with the
proprietary tools; and evaluating alternative sequencing approaches that do not use 10x
Chromium but still produce long-read data.

Aims 1 and 2.
We have developed a single-experiment diagnostic method for FSHD using 10x Genomics
Chromium library preparation with WGS. Chromium library preparation tags long DNA molecules
before short-read genome sequencing. This allows large haplotypes to be phased, and the genome
to be sequenced accurately, in one experiment. The method distinguishes between the permissive
chromosome 4qA haplotype and the non-permissive haplotypes (4qB, 10q) by using short DNA reads
anchored to unique regions on chromosomes 4 and 10 (Figure 1). We have established a custom
reference genome with alternate contigs for the chromosome 4qA and 4qB haplotypes. This allows
us to take advantage of structural variation calling algorithms to identify the haplotypes
carried by each patient. The 4qA and 4qB haplotypes are analysed separately, then connected to
chromosome 4 as ‘virtual’ structural variations.
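
The linking step described above can be illustrated with a minimal sketch. It is not the project's software: it assumes each aligned read has already been reduced to a (barcode, contig) pair, and the contig labels "chr4_anchor", "4qA" and "4qB" are hypothetical names for the unique chromosome 4 region and the two alternate contigs. Reads that share a Chromium barcode are treated as coming from the same long DNA molecule, so a barcode seen on both the chromosome 4 anchor and an alternate contig is counted as evidence that the patient carries that haplotype on chromosome 4.

```python
from collections import Counter, defaultdict
from typing import Dict, Iterable, Set, Tuple


def haplotype_link_counts(
    reads: Iterable[Tuple[str, str]],
    anchor: str = "chr4_anchor",          # hypothetical label for the unique chr4 region
    alt_contigs: Tuple[str, ...] = ("4qA", "4qB"),  # hypothetical alternate contig names
) -> Counter:
    """Count barcodes that link the anchor region to each alternate contig."""
    # Collect the set of contigs touched by each barcode (one barcode ~ one molecule).
    contigs_by_barcode: Dict[str, Set[str]] = defaultdict(set)
    for barcode, contig in reads:
        contigs_by_barcode[barcode].add(contig)

    # A barcode supports a haplotype only if it also has a read on the anchor.
    counts: Counter = Counter()
    for contigs in contigs_by_barcode.values():
        if anchor in contigs:
            for alt in alt_contigs:
                if alt in contigs:
                    counts[alt] += 1
    return counts


# Made-up reads for illustration only.
example_reads = [
    ("b1", "chr4_anchor"), ("b1", "4qA"),
    ("b2", "chr4_anchor"), ("b2", "4qB"),
    ("b3", "chr10_anchor"), ("b3", "4qA"),   # linked to chr10, so not counted
    ("b4", "chr4_anchor"), ("b4", "4qA"),
    ("b5", "4qA"),                           # no anchor read, so not counted
]
example_counts = haplotype_link_counts(example_reads)
```

In practice a threshold on these counts, or a proper structural variant caller as the report describes, would be needed to separate real links from barcode collisions.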

We identify contractions of the D4Z4 repeat by counting DNA molecules that span across the
repeat. The larger the number of repeats, the fewer DNA molecules will be able to span
across it. To identify D4Z4 repeat contractions, we have written software which uses a
normalised count of DNA molecules spanning the D4Z4 region for each haplotype individually. We
have shown that this measure is strongly negatively correlated with the number of repeats and
can therefore be used to estimate the size of a repeat contraction in a patient with FSHD1.
Finally, the genotype of the SNP in the polyadenylation signal of the DUX4 gene is read
directly from each sequence alignment. This SNP determines whether the DUX4 gene is active or
not. Combined, these analyses allow the specific genetic makeup of each patient to be read from
a single sequencing assay.
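
The spanning-molecule measure can be sketched as follows. This is an illustration under stated assumptions rather than the software written for the project: molecules are represented as (start, end) intervals already assigned to one haplotype, the count is normalised by the total number of molecules on that haplotype (the report does not say which normalisation is used), and a straight-line fit to samples of known repeat number stands in for whatever calibration the authors apply. All function names and the calibration values are hypothetical.

```python
from typing import List, Sequence, Tuple

Molecule = Tuple[int, int]  # (start, end) of one barcoded molecule on a haplotype


def normalised_spanning_count(
    molecules: List[Molecule], left_anchor: int, right_anchor: int
) -> float:
    """Fraction of molecules covering both flanks of the D4Z4 region."""
    if not molecules:
        raise ValueError("no molecules assigned to this haplotype")
    spanning = sum(
        1 for start, end in molecules
        if start <= left_anchor and end >= right_anchor
    )
    return spanning / len(molecules)


def fit_line(xs: Sequence[float], ys: Sequence[float]) -> Tuple[float, float]:
    """Ordinary least-squares fit of y = slope * x + intercept."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    if sxx == 0:
        raise ValueError("calibration scores must not all be equal")
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    return slope, mean_y - slope * mean_x


def estimate_repeat_units(score: float, slope: float, intercept: float) -> float:
    """Predict the repeat number from a normalised spanning count."""
    return slope * score + intercept


# Made-up calibration data: higher spanning score, fewer repeats.
calibration_scores = [0.30, 0.20, 0.10]
calibration_repeats = [5.0, 15.0, 25.0]
slope, intercept = fit_line(calibration_scores, calibration_repeats)

# Made-up molecules on one haplotype, with flanks at positions 20 and 90.
example_score = normalised_spanning_count(
    [(0, 100), (10, 50), (0, 90), (20, 120)], left_anchor=20, right_anchor=90
)
```

The negative slope of the fitted line reflects the relationship described above: a longer repeat array leaves fewer molecules long enough to cover both flanks.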

Aim 3.
We have recruited FSHD1-negative patients and prepared samples for genome sequencing and
analysis of SMCHD1 and DNMT3B. These patients will be sequenced on the clinically accredited
genome sequencing platform at KCCG.