Long somatic DNA-repeat expansion drives neurodegeneration in Huntington’s Disease


Additional resources for Handsaker, Kashin, Reed et al. Cell 2025.

Overview

Huntington’s disease (HD) is a fatal genetic brain disorder in which most of a person’s striatal projection neurons (SPNs) degenerate and die. Science has long sought to understand why SPNs are so vulnerable in HD, why this pathology follows decades of apparent health, and how the disease-causing inherited DNA repeat (CAGn, n > 36) in the huntingtin (HTT) gene leads to this neurodegeneration. This DNA repeat exhibits somatic mosaicism (its length varies from cell to cell); we developed a way to measure its length together with genome-wide RNA expression in the same individual cells. We found that, in persons with typical inherited HD-causing alleles (of < 50 CAG repeats), the CAG-repeat tract routinely expanded to 100-500+ CAG repeats in SPNs but rarely if ever did so in striatal interneurons or glia. Surprisingly, gene expression in these persons’ individual SPNs exhibited no apparent relationship to those SPNs’ CAG-repeat lengths across a wide range (36-150 repeats). In contrast, sparse SPNs with longer (150-500+) CAG repeats had profound gene-expression distortions, which affected hundreds of genes, escalated alongside further repeat expansion, and culminated in widespread gene de-repression and expression of senescence/apoptosis genes. Our experiments, analyses, and simulations suggest that individual SPNs undergo decades of biologically quiet DNA repeat expansion, then asynchronously enter a brief toxicity phase before dying. We conclude that, at any given time, most SPNs in persons with HD actually have a benign (but somewhat unstable) huntingtin gene, and that HD is a DNA process for almost all of a neuron’s life.

Published version (Cell, open access): www.cell.com

Preprint (bioRxiv): doi:10.1101/2024.05.17.592722 (download PDF)

Explore the data in our data browser

The data browser allows interactive exploration of the data set generated as part of this study. The browser provides access to single-cell data underlying the pooled analysis of 53 controls and 50 HD donors (the “cell village” data) as well as the single-cell data from the six more deeply sequenced donors.

Animations and simulations of the DNA-repeat expansion process

We developed mathematical models and simulations to explore the dynamics of somatic repeat expansion in vulnerable neurons.

Check out the simulations here to learn more.
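As a rough illustration of the kind of dynamics such simulations explore, the sketch below treats somatic expansion in one neuron as a sequence of single-repeat gains whose rate rises with the current repeat length. This is a toy model written for this page's readers, not the model used in the study: the function names, the power-law rate, and every parameter value (`base_rate`, `ref_len`, `exponent`) are assumptions chosen only for illustration. The 150-repeat threshold is taken from the overview above.

```python
import random


def years_to_threshold(start_len, threshold=150, base_rate=0.3,
                       ref_len=40, exponent=3.0, max_years=100.0, rng=None):
    """Return the age (in years) at which one simulated neuron first reaches
    `threshold` repeats, or None if it does not get there within `max_years`.

    Assumption (hypothetical): expansions occur one repeat at a time, with a
    rate of base_rate * (length / ref_len) ** exponent events per year.
    """
    rng = rng or random.Random()
    length, age = start_len, 0.0
    while length < threshold:
        rate = base_rate * (length / ref_len) ** exponent
        # Waiting time to the next +1 expansion is exponentially distributed.
        age += rng.expovariate(rate)
        if age > max_years:
            return None
        length += 1
    return age


def fraction_past_threshold(start_len, age, n_neurons=1000, seed=0):
    """Fraction of simulated neurons that reach the threshold by `age` years."""
    rng = random.Random(seed)
    crossed = sum(
        years_to_threshold(start_len, max_years=age, rng=rng) is not None
        for _ in range(n_neurons)
    )
    return crossed / n_neurons


if __name__ == "__main__":
    for age in (20, 40, 60, 80):
        print(age, fraction_past_threshold(42, age))
```

Because the assumed rate grows with length, a simulated neuron spends most of its time at shorter lengths and then expands quickly, and different neurons cross the threshold at different ages. Whether the numbers resemble real tissue depends entirely on the made-up parameters, so the linked simulations remain the reference.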

Detailed lab protocol for making single-cell measurements of HTT CAG repeat

Published protocol (January 2025): HTT-CAG Lab Protocol

Download the data

Raw and processed data are available from NeMO, the Neuroscience Multi-omic Archive, under accession dat-ztfn3cc. The deposition (which is embargoed and will be released for access upon publication of the manuscript) includes both open-access and controlled-access components. The controlled-access components are available for “general research use”: users must agree not to try to identify brain donors or their family members but can use the data for research purposes. Data that could be used to identify individuals (such as raw reads that contain allelic information) are placed in the controlled-access components; all other data are placed in the open-access components.

Open access components

• Single-cell-level count data on gene expression (“DGE” (gene-by-cell) matrices of UMI counts in h5 format)

• Metacells by cell type for snRNA-seq village experiments

• CAG-repeat length measurements from individual cells

• Assignments of cells to individual donors and cell types

• Donor metadata (age, sex, Vonsattel grade)
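The per-cell CAG-repeat lengths and the cell-type assignments listed above can be combined to ask how many cells of each type carry long repeats. The following is a minimal sketch of that join, assuming the two tables have already been read into dictionaries keyed by a cell identifier. The identifiers, the dictionary layout, and the toy values are hypothetical and do not reflect the actual file formats in the deposition (see the README for those). The 150-repeat cutoff comes from the overview, and treating exactly 150 as "long" is an assumption.

```python
from collections import Counter

# Cutoff between the two repeat-length ranges described in the overview.
LONG_REPEAT_THRESHOLD = 150


def repeat_class(cag_length):
    """Label a measured CAG-repeat length as 'long' or 'shorter'."""
    return "long" if cag_length >= LONG_REPEAT_THRESHOLD else "shorter"


def count_by_cell_type(cag_by_cell, type_by_cell):
    """Count cells per (cell type, repeat class).

    Cells that have a repeat measurement but no cell-type assignment
    are skipped.
    """
    counts = Counter()
    for cell_id, cag_length in cag_by_cell.items():
        cell_type = type_by_cell.get(cell_id)
        if cell_type is None:
            continue
        counts[(cell_type, repeat_class(cag_length))] += 1
    return counts


# Toy values for illustration only; these are not real measurements.
cag_by_cell = {"cell_1": 43, "cell_2": 212, "cell_3": 41, "cell_4": 44}
type_by_cell = {"cell_1": "SPN", "cell_2": "SPN", "cell_3": "interneuron"}

summary = count_by_cell_type(cag_by_cell, type_by_cell)
for (cell_type, label), n in sorted(summary.items()):
    print(cell_type, label, n)
```

Keeping the threshold in one named constant makes it easy to rerun the tally with a different cutoff.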

Controlled access components

• SNP array data on each donor (Illumina Global Screening Array)

• Raw reads from snRNA-seq experiments (Illumina FASTQ files)

• Aligned reads from snRNA-seq experiments (in bam format)

• Aligned PacBio reads from HTT-CAG experiments (in bam format)

Data set links

NeMO data set: nemo:dat-ztfn3cc

Direct links to NeMO open-access components

README file

Donor metadata and assignments of cells to donors and cell types: 10x processed data

Count data for village and deep dive experiments and metacells (h5 format): 10x count data

HTT-CAG repeat-length measurements: PacBio processed data

All open-access project data