Human Brain Disorders and Repeated DNA

The scientific teams in our lab are working to recognize the biology that underlies human brain health and illness, and the ways in which human genes, inherited genetic variation, and somatic mutations conspire to shape this biology. The biological basis for most brain disorders is unknown today: most are understood mainly in terms of collections of symptoms, neuropathological observations – such as cortical thinning, protein aggregates, or death of a specific kind of cell – and human-genetic associations. We need to deeply understand these disorders as biological entities so that we can develop new and innovative ways to monitor and treat them.

Our lab is particularly focused on (i) DNA-repeat disorders and (ii) the disruptions of mental health commonly known as “psychiatric disorders”, especially schizophrenia.

Our research team brings together people with experiences, approaches and insights from biology, human genetics, statistics, and computer science. Our approaches to questions tend to involve one, two or all three of the following:

  1. developing new experimental approaches that turn key aspects of brain biology into “big data” problems (for example, droplet-based single-cell RNA-seq);
  2. developing new computational and statistical ways to analyze high-volume biological data sets; and
  3. applying these approaches to reach insights about the biology that underlies human brain health and illness.

Though we develop new experimental and computational approaches to answer questions, we are question-driven rather than technique-driven. Our goal is always to answer critical questions; we develop whatever experimental or computational approaches we think a biological question needs.

Three current areas of focus are described below. These areas overlap, and many scientists in the lab contribute ideas and work to more than one of them. We are also increasingly finding that much of the underlying biology is shared across diverse brain disorders; this compels us to think about clusters of brain disorders that are united by shared underlying mechanisms.

DNA repeats and DNA-repeat disorders

A silhouette of a head overlaid with the CAG DNA sequence repeated many times

Thousands of regions within the human genome contain stretches of DNA in which a DNA sequence is repeated a substantial and variable number of times. More than 40 human diseases are known to be caused by expansions of simple DNA sequence repeats; intriguingly, most of these are primarily diseases of the brain.

DNA repeats are mutable and exhibit length variation both across people (polymorphism) and within people (mosaicism). DNA repeats provide fascinating opportunities to study genetic effects on human biology, as they provide allelic series with clear quantitative relationships between DNA variation (number of repeats), cellular phenotypes, and human phenotypes such as illness.

DNA-repeat disorders are caused by inherited alleles with expansions of simple DNA sequence repeats. The classic DNA-repeat disorder, Huntington’s Disease (HD), affects people who have inherited an allele of the Huntingtin gene with at least 36 CAG repeats in its first exon. (Most people inherit alleles with 15-30 CAG repeats.) Our team recently made a surprising discovery: such inherited alleles may have no inherent toxicity. Instead, they are somatically unstable: they expand in specific cell types throughout a person’s life. Only upon becoming very long (>150 repeats) do they begin to acquire toxicity, a finding which helps explain the disease’s midlife onset and its cell-type-specific pathology. We are just finishing a preprint on this work; a preview is available from a 2023 conference.
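The repeat-length thresholds described above can be made concrete with a small sketch. This is only an illustration, not the lab's analysis method: the function names, the category labels, and the toy sequence are assumptions introduced here, and real repeat genotyping relies on specialized sequencing and analysis tools rather than a simple pattern match. Only the two thresholds (at least 36 repeats for an HD-associated inherited allele, more than 150 repeats for the onset of toxicity) come from the text.

```python
import re

# Thresholds quoted in the text; everything else here is illustrative.
INHERITED_HD_THRESHOLD = 36   # "at least 36 CAG repeats"
TOXICITY_THRESHOLD = 150      # "very long (>150 repeats)"


def longest_cag_run(sequence: str) -> int:
    """Return the number of units in the longest uninterrupted CAG run."""
    runs = re.findall(r"(?:CAG)+", sequence.upper())
    return max((len(run) // 3 for run in runs), default=0)


def classify_repeat_length(n_repeats: int) -> str:
    """Map a CAG repeat count onto the ranges described in the text.

    The label strings are hypothetical names chosen for this sketch.
    """
    if n_repeats > TOXICITY_THRESHOLD:
        return "somatically expanded, toxic range (>150)"
    if n_repeats >= INHERITED_HD_THRESHOLD:
        return "HD-associated inherited range (36-150)"
    return "below HD threshold (<36)"


if __name__ == "__main__":
    # Toy sequence: a 20-unit CAG tract flanked by arbitrary bases.
    toy_allele = "TTC" + "CAG" * 20 + "CAACCG"
    n = longest_cag_run(toy_allele)
    print(n, classify_repeat_length(n))
```

The same classifier could be applied per cell rather than per person, which is the distinction the text draws: an inherited allele may sit in the 36-150 range while individual cells in vulnerable cell types carry much longer, somatically expanded copies.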

We are working to better understand this mechanism in HD, as well as in other DNA-repeat disorders that we think may share this mechanism with HD.

Copy number variations (CNVs) involve tandem repeats of longer sequences, often several kilobases in length, and sometimes encompass entire genes. Students in our lab discovered that complex variation of the complement component 4 (C4) genes generates the human genome’s largest common effects on schizophrenia, lupus, and Sjögren’s syndrome, and that these effects help us understand the sex bias of these disorders: schizophrenia is more common in males, while lupus and Sjögren’s are much more common in females.

We are also finding, in collaborations with Po-Ru Loh’s lab, that smaller variable-number-of-tandem-repeat (VNTR) polymorphisms – often involving individual gene exons or regulatory sequences – create some of the human genome’s largest common effects on many human phenotypes.

Somatic mosaicism and clonal expansions

Schematic of brain with clonal mutations represented by ink blots

The brain and other tissues acquire somatic mutations throughout life. Our lab is interested in a specific subset of somatic mutations that recur again and again in different people. These include (i) DNA-repeat expansions, which we have found expand throughout life in a cell-type-specific manner, and (ii) proliferation-promoting mutations, which we have found drive clonal expansions of mutated cells in the blood and brain.

DNA-repeat disorders are caused by inherited alleles with expansions of simple DNA sequence repeats. The iconic DNA-repeat disorder, Huntington’s Disease, is caused by alleles in which a CAG repeat in the Huntingtin gene is repeated at least 36 times (vs. normally 15-30). We have recently found that this disease is fundamentally driven by somatic mutations that cause this repeat to expand beyond 150 repeat units in specific, vulnerable cell types.

Another kind of recurring somatic mutation propels clonal expansion of mitotic cells. Scientists in the lab discovered that, as we age, our blood is increasingly generated by populations of clonally expanded cells with somatic mutations. We subsequently found that inherited genetic variation conspires with somatic mutations in surprising ways to bring about somatically expanded clones.

We have recently discovered that analogous clonal dynamics occur in the human brain. Scientists in the lab are deepening our understanding of mosaicism and clonal expansions by developing new detection and analysis methods to understand the cell-type-specific dynamics and effects, and applying these new approaches to brain tissue from hundreds of human donors.

Biology of mental health and illness

Two head silhouettes: one with a coil in the brain representing brain health and one with a tangled rope in the brain representing mental illness

Mental health and mental illness present a compelling unmet need: disorders such as schizophrenia and bipolar disorder cause enormous suffering, yet are not understood as biological entities. Our goal is to reveal and understand the biology that underlies these disorders so that more effective therapies can be developed.

Since human genetics provides unbiased information – unlimited by the frameworks and hypotheses of any biological field, and without the assumptions of models – we work to understand what humans and their genetics are trying to teach us about the biology of these disorders, focusing on what we can learn by analyzing human brain tissue and human genetics data in new ways. However, genetic findings on mental health disorders have been hard to interpret biologically: unlike the genetics of DNA-repeat disorders, the genetics of mental health disorders is highly complex, with influences from more than 100 genomic loci. Recognizing the underlying biology requires different kinds of approaches. It was to meet this need that we originally developed droplet-based single-cell RNA-seq, a technology that scientists in the lab are further developing to analyze single synapses, nuclear proteins, and synaptic proteins.

One of our approaches is to identify the large-scale cellular and molecular systems that underlie brain health, and to identify the specific systems that are impaired in each disorder. To do this, we have been developing new ways to analyze single-cell RNA-seq data generated from hundreds of human brain donors. One such method, based on machine learning, recently led us to discover that cortical neurons and astrocytes closely coordinate their gene expression, in a program we call “SNAP” – a collaboration that appears to be central to protection from schizophrenia and cognitive aging.

We are currently working to more deeply understand SNAP at the molecular and cellular levels by using animal models, cellular models, and deeper analyses of human brain tissue.

In earlier work, students in the lab showed that the human genome’s largest common influence on risk of schizophrenia – the Major Histocompatibility Complex (MHC) locus – surprisingly does not arise, as previously thought, from HLA genes, but rather from many different alleles of the complement component 4 (C4) genes. We also found that C4 protein localizes to synapses. These results demonstrated that molecular events at synapses – rather than an infection or autoimmune response – are the likely explanation for this genetic effect on schizophrenia.