The Babraham Institute Publications database contains details of all publications resulting from our research groups and scientific facilities. Pre-prints by Institute authors can be viewed on the Institute's bioRxiv channel. We believe that free and open access to the outputs of publicly funded research offers significant social and economic benefits, as well as aiding the development of new research. We are working to provide Open Access to as many publications as possible and these can be identified below by the padlock icon. Where this hasn't been possible, subscriptions may be required to view the full text.
It is currently assumed that 3D chromosomal organization plays a central role in transcriptional control. However, depletion of cohesin and CTCF affects the steady-state levels of only a minority of transcripts. Here, we use high-resolution Capture Hi-C to interrogate the dynamics of chromosomal contacts of all annotated human gene promoters upon degradation of cohesin and CTCF. We show that a majority of promoter-anchored contacts are lost in these conditions, but many contacts with distinct properties are maintained, and some new ones are gained. The rewiring of contacts between promoters and active enhancers upon cohesin degradation associates with rapid changes in target gene transcription as detected by SLAM sequencing (SLAM-seq). These results provide a mechanistic explanation for the limited, but consistent, effects of cohesin and CTCF depletion on steady-state transcription and suggest the existence of both cohesin-dependent and -independent mechanisms of enhancer-promoter pairing.
RNA localization is an important regulatory layer of gene expression and cell functioning. This protocol describes the Proximity RNA-seq method, in which RNA molecules are sequenced in their spatial, cellular context to derive RNA co-localization and transcriptome organization. Transcripts in individual subcellular particles from chemically crosslinked cells are tagged with the same, unique DNA barcode in water-in-oil emulsion droplets. First, single DNA barcodes are PCR amplified and immobilized on single, small magnetic beads in droplets. Subsequently, 3' ends of bead-bound barcode copies are tailed with random pentadecamers. Then beads are encapsulated again into droplets together with crosslinked subcellular particles containing RNA. Reverse transcription using random pentadecamers as primers is performed in droplets, which optimally contain one bead and one particle, in order to tag RNAs co-localized to the same particle. Sequencing such cDNA molecules identifies the RNA molecule and the barcode. Subsequent analysis of transcripts that share the same barcode, i.e., co-barcoding, reveals RNA co-localization and interactions. The technique is not restricted to pairs of RNAs but can also detect groups of transcripts and estimate local RNA density or connectivity for individual transcripts. We provide here a detailed protocol to perform and analyze Proximity RNA-seq on cell nuclei to study spatial, nuclear RNA organization.
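The co-barcoding idea described above can be illustrated with a minimal sketch. It assumes that sequencing reads have already been reduced to (barcode, transcript) pairs; that input format, the function name and the toy barcode and gene names are hypothetical and are not taken from the published protocol or its analysis software.

```python
from collections import Counter, defaultdict
from itertools import combinations


def cobarcoding_counts(reads):
    """Count how often each pair of transcripts shares a barcode.

    `reads` is an iterable of (barcode, transcript) tuples; this input
    format is an assumption made for illustration only.
    """
    # Collect the distinct transcripts observed with each barcode.
    transcripts_by_barcode = defaultdict(set)
    for barcode, transcript in reads:
        transcripts_by_barcode[barcode].add(transcript)

    # Every unordered pair of transcripts under one barcode is one
    # co-barcoding event.
    pair_counts = Counter()
    for transcripts in transcripts_by_barcode.values():
        # Sorting gives each unordered pair a single canonical key.
        for pair in combinations(sorted(transcripts), 2):
            pair_counts[pair] += 1
    return pair_counts


# Toy data with hypothetical barcodes and transcript names.
reads = [
    ("BC1", "geneA"), ("BC1", "geneB"), ("BC1", "geneC"),
    ("BC2", "geneA"), ("BC2", "geneB"),
    ("BC3", "geneC"),
]
counts = cobarcoding_counts(reads)
```

In this toy input, geneA and geneB share two barcodes and every other pair shares one. A real analysis would also need to compare such counts with a background expectation, which this sketch does not attempt.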
Pancreatic cancer is a rare but fatal form of cancer, the fourth highest in absolute mortality. Known risk factors include obesity, diet, and type 2 diabetes; however, the low incidence rate and interconnection of these factors confound the isolation of individual effects. Here, we use epidemiological analysis of prospective human cohorts and parallel tracking of pancreatic cancer in mice to dissect the effects of obesity, diet, and diabetes on pancreatic cancer. Through longitudinal monitoring and multi-omics analysis in mice, we found distinct effects of protein, sugar, and fat dietary components, with dietary sugars increasing Mad2l1 expression and tumor proliferation. Using epidemiological approaches in humans, we find that dietary sugars give a MAD2L1 genotype-dependent increased susceptibility to pancreatic cancer. The translation of these results to a clinical setting could aid in the identification of the at-risk population for screening and potentially harness dietary modification as a therapeutic measure.
In metazoans, the secreted proteome participates in intercellular signalling and innate immunity, and builds the extracellular matrix scaffold around cells. Compared with the relatively constant intracellular environment, conditions for proteins in the extracellular space are harsher, and low concentrations of ATP prevent the activity of intracellular components of the protein quality-control machinery. Until now, only a few bona fide extracellular chaperones and proteases have been shown to limit the aggregation of extracellular proteins. Here we performed a systematic analysis of the extracellular proteostasis network in Caenorhabditis elegans with an RNA interference screen that targets genes that encode the secreted proteome. We discovered 57 regulators of extracellular protein aggregation, including several proteins related to innate immunity. Because intracellular proteostasis is upregulated in response to pathogens, we investigated whether pathogens also stimulate extracellular proteostasis. Using a pore-forming toxin to mimic a pathogenic attack, we found that C. elegans responded by increasing the expression of components of extracellular proteostasis and by limiting aggregation of extracellular proteins. The activation of extracellular proteostasis was dependent on stress-activated MAP kinase signalling. Notably, the overexpression of components of extracellular proteostasis delayed ageing and rendered worms resistant to intoxication. We propose that enhanced extracellular proteostasis contributes to systemic host defence by maintaining a functional secreted proteome and avoiding proteotoxicity.
Zygotic genome activation (ZGA) is an essential transcriptional event in embryonic development that coincides with extensive epigenetic reprogramming. Complex manipulation techniques and maternal stores of proteins preclude large-scale functional screens for ZGA regulators within early embryos. Here, we combined pooled CRISPR activation (CRISPRa) with single-cell transcriptomics to identify regulators of ZGA-like transcription in mouse embryonic stem cells, which serve as a tractable, in vitro proxy of early mouse embryos. Using multi-omics factor analysis (MOFA+) applied to ∼200,000 single-cell transcriptomes comprising 230 CRISPRa perturbations, we characterized molecular signatures of ZGA and uncovered 24 factors that promote a ZGA-like response. Follow-up assays validated top screen hits, including the DNA-binding protein Dppa2, the chromatin remodeler Smarca5, and the transcription factor Patz1, and functional experiments revealed that Smarca5's regulation of ZGA-like transcription is dependent on Dppa2. Together, our single-cell transcriptomic profiling of CRISPRa-perturbed cells provides both system-level and molecular insights into the mechanisms that orchestrate ZGA.
Rule-based modeling is an approach that permits constructing reaction networks based on the specification of rules for molecular interactions and transformations. These rules can encompass details such as the interacting sub-molecular domains and the states and binding status of the involved components. Conceptually, fine-grained spatial information such as locations can also be provided. Through "wildcards" representing component states, entire families of molecule complexes sharing certain properties can be specified as patterns. This can significantly simplify the definition of models involving species with multiple components, multiple states, and multiple compartments. The systems biology markup language (SBML) Level 3 Multi Package Version 1 extends the SBML Level 3 Version 1 core with the "type" concept in the Species and Compartment classes. Therefore, reaction rules may contain species that can be patterns and exist in multiple locations. Multiple software tools such as Simmune and BioNetGen support this standard that thus also becomes a medium for exchanging rule-based models. This document provides the specification for Release 2 of Version 1 of the SBML Level 3 Multi package. No design changes have been made to the description of models between Release 1 and Release 2; changes are restricted to the correction of errata and the addition of clarifications.
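To make the role of wildcards concrete, the following sketch shows how one pattern can select a whole family of species. It uses plain Python dictionaries rather than SBML Multi syntax or any SBML library; the component names, state names and the `matches` helper are hypothetical and serve only to illustrate the concept.

```python
WILDCARD = "*"


def matches(pattern, species):
    """Return True if every component state in `pattern` fits `species`.

    Both arguments map component names to state names. A pattern state
    equal to WILDCARD accepts any state of that component.
    """
    return all(
        component in species and state in (WILDCARD, species[component])
        for component, state in pattern.items()
    )


# Three hypothetical species of one molecule with two components.
species_list = [
    {"site1": "phosphorylated", "site2": "unbound"},
    {"site1": "phosphorylated", "site2": "bound"},
    {"site1": "unphosphorylated", "site2": "bound"},
]

# One pattern covers every species whose site1 is phosphorylated,
# whatever the state of site2.
pattern = {"site1": "phosphorylated", "site2": WILDCARD}
selected = [s for s in species_list if matches(pattern, s)]
```

Here a single pattern stands in for two fully specified species, which is the kind of simplification that becomes significant as the number of components and states grows.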
Genomic imprinting is an epigenetic phenomenon leading to parental allele-specific expression. Dosage of imprinted genes is crucial for normal development and its dysregulation accounts for several human disorders. This unusual expression pattern is mostly dictated by differences in DNA methylation between parental alleles at specific regulatory elements known as imprinting control regions (ICRs). Although several approaches can be used for methylation inspection, we lack an easy and cost-effective method to simultaneously measure DNA methylation at multiple imprinted regions. Here, we present IMPLICON, a high-throughput method measuring DNA methylation levels at imprinted regions with base-pair resolution and over 1000-fold coverage. We adapted amplicon bisulfite-sequencing protocols to design IMPLICON for ICRs in adult tissues of inbred mice, validating it in hybrid mice from reciprocal crosses for which we could discriminate methylation profiles in the two parental alleles. Lastly, we developed a human version of IMPLICON and detected imprinting errors in embryonic and induced pluripotent stem cells. We also provide rules and guidelines to adapt this method for investigating the DNA methylation landscape of any set of genomic regions. In summary, IMPLICON is a rapid, cost-effective and scalable method, which could become the gold standard in both imprinting research and diagnostics.
Circular DNA can arise from all parts of eukaryotic chromosomes. In yeast, circular ribosomal DNA (rDNA) accumulates dramatically as cells age; however, little is known about the accumulation of other chromosome-derived circles or the contribution of such circles to genetic variation in aged cells. We profiled circular DNA in Saccharomyces cerevisiae populations sampled when young and after extensive aging. Young cells possessed highly diverse circular DNA populations, but 94% of the circular DNA was lost after ∼15 divisions, whereas rDNA circles underwent massive accumulation to >95% of circular DNA. Circles present in both young and old cells were characterized by replication origins, including circles from unique regions of the genome and repetitive regions: rDNA and telomeric Y' regions. We further observed that circles can have flexible inheritance patterns: [HXT6/7circle] normally segregates to mother cells but in low glucose is present in up to 50% of cells, the majority of which must have inherited this circle from their mother. Interestingly, [HXT6/7circle] cells are eventually replaced by cells carrying stable chromosomal HXT6 HXT6/7 HXT7 amplifications, suggesting circular DNAs are intermediates in chromosomal amplifications. In conclusion, the heterogeneity of circular DNA offers flexibility in adaptation, but this heterogeneity is remarkably diminished with age.
The lipid kinase VPS34 orchestrates diverse processes, including autophagy, endocytic sorting, phagocytosis, anabolic responses and cell division. VPS34 forms various complexes that help adapt it to specific pathways, with complexes I and II being the most prominent ones. We found that physicochemical properties of membranes strongly modulate VPS34 activity. Greater unsaturation of both substrate and non-substrate lipids, negative charge and curvature activate VPS34 complexes, adapting them to their cellular compartments. Hydrogen/deuterium exchange mass spectrometry (HDX-MS) of complexes I and II on membranes elucidated structural determinants that enable them to bind membranes. Among these are the Barkor/ATG14L autophagosome targeting sequence (BATS), which makes autophagy-specific complex I more active than the endocytic complex II, and the Beclin1 BARA domain. Interestingly, even though Beclin1 BARA is common to both complexes, its membrane-interacting loops are critical for complex II, but have only a minor role for complex I.
An amendment to this paper has been published and can be accessed via the original article.
The receptor-linked protein tyrosine phosphatases (RPTPs) are key regulators of cell-cell communication through the control of cellular phosphotyrosine levels. Most human RPTPs possess an extracellular receptor domain and tandem intracellular phosphatase domains, comprising an active membrane-proximal (D1) domain and an inactive distal (D2) pseudophosphatase domain. Here we demonstrate that PTPRU is unique amongst the RPTPs in possessing two pseudophosphatase domains. The PTPRU-D1 displays no detectable catalytic activity against a range of phosphorylated substrates, and we show that this is due to multiple structural rearrangements that destabilise the active site pocket and block the catalytic cysteine. Upon oxidation, this cysteine forms an intramolecular disulphide bond with a vicinal "backdoor" cysteine, a process thought to reversibly inactivate related phosphatases. Importantly, despite the absence of catalytic activity, PTPRU binds substrates of related phosphatases, strongly suggesting that this pseudophosphatase functions in tyrosine phosphorylation by competing with active phosphatases for the binding of substrates.
One of the major bottlenecks in building systems biology models is identification and estimation of model parameters for model calibration. Searching for model parameters from published literature and models is an essential, yet laborious task.
How the epigenetic landscape is established in development is still being elucidated. Here, we uncover developmental pluripotency associated 2 and 4 (DPPA2/4) as epigenetic priming factors that establish a permissive epigenetic landscape at a subset of developmentally important bivalent promoters characterized by low expression and poised RNA-polymerase. Differentiation assays reveal that Dppa2/4 double knockout mouse embryonic stem cells fail to exit pluripotency and differentiate efficiently. DPPA2/4 bind both H3K4me3-marked and bivalent gene promoters and associate with COMPASS- and Polycomb-bound chromatin. Comparing knockout and inducible knockdown systems, we find that acute depletion of DPPA2/4 results in rapid loss of H3K4me3 from key bivalent genes, while H3K27me3 is initially more stable but lost following extended culture. Consequently, upon DPPA2/4 depletion, these promoters gain DNA methylation and are unable to be activated upon differentiation. Our findings uncover a novel epigenetic priming mechanism at developmental promoters, poising them for future lineage-specific activation.
Collagen I is a major tendon protein whose polypeptide chains are linked by covalent cross-links. It is unknown how the cross-linking contributes to the mechanical properties of tendon or whether cross-linking changes in response to stretching or relaxation. Since their discovery, imine bonds within collagen have been recognized as being important in both cross-link formation and collagen structure. They are often described as acidic or thermally labile, but no evidence is available from direct measurements of cross-link levels whether these bonds contribute to the mechanical properties of collagen. Here, we used MS to analyze these imine bonds after reduction with sodium borohydride while under tension and found that their levels are altered in stretched tendon. We studied the changes in cross-link bonding in tail tendon from 11-week-old C57Bl/6 mice at 4% physical strain, at 10% strain, and at breaking point. The cross-links hydroxy-lysino-norleucine (HLNL), dihydroxy-lysino-norleucine (DHLNL), and lysino-norleucine (LNL) increased or decreased depending on the specific cross-link and amount of mechanical strain. We also noted a decrease in glycated lysine residues in collagen, indicating that the imine formed between circulating glucose and lysine is also stress-labile. We also carried out mechanical testing, including cyclic testing at 4% strain, stress relaxation tests, and stress-strain profiles taken at breaking point, both with and without sodium borohydride reduction. The results from both the MS studies and mechanical testing provide insights into the chemical changes during tendon stretching and directly link these chemical changes to functional collagen properties.
Noncoding RNA plays essential roles in transcriptional control and chromatin silencing. At Arabidopsis thaliana FLC, antisense transcription quantitatively influences transcriptional output, but the mechanism by which this occurs is still unclear. Proximal polyadenylation of the antisense transcripts by FCA, an RNA-binding protein that physically interacts with RNA 3' processing factors, reduces transcription. This process genetically requires FLD, a homolog of the H3K4 demethylase LSD1. However, the mechanism linking RNA processing to FLD function had not been established. Here, we show that FLD tightly associates with LUMINIDEPENDENS (LD) and SET DOMAIN GROUP 26 (SDG26) in vivo, and, together, they prevent accumulation of monomethylated H3K4 (H3K4me1) over the gene body. SDG26 interacts with the RNA 3' processing factor FY (WDR33), thus linking activities for proximal polyadenylation of the antisense transcripts to FLD/LD/SDG26-associated H3K4 demethylation. We propose this demethylation antagonizes an active transcription module, thus reducing H3K36me3 accumulation and increasing H3K27me3. Consistent with this view, we show that Polycomb Repressive Complex 2 (PRC2) silencing is genetically required by FCA to repress FLC. Overall, our work provides insights into RNA-mediated chromatin silencing.
This paper presents a high-throughput reverse transcription quantitative PCR (RT-qPCR) assay for Caenorhabditis elegans that is fast, robust, and highly sensitive. This protocol obtains precise measurements of gene expression from single worms or from bulk samples. The protocol presented here provides a novel adaptation of existing methods for complementary DNA (cDNA) preparation coupled to a nanofluidic RT-qPCR platform. The first part of this protocol, named 'Worm-to-CT', allows cDNA production directly from nematodes without the need for prior mRNA isolation. It increases experimental throughput by allowing the preparation of cDNA from 96 worms in 3.5 h. The second part of the protocol uses existing nanofluidic technology to run high-throughput RT-qPCR on the cDNA. This paper evaluates two different nanofluidic chips: the first runs 96 samples and 96 targets, resulting in 9,216 reactions in approximately 1.5 days of benchwork. The second chip type consists of six 12 x 12 arrays, resulting in 864 reactions. Here, the Worm-to-CT method is demonstrated by quantifying mRNA levels of genes encoding heat shock proteins from single worms and from bulk samples. Provided is an extensive list of primers designed to amplify processed RNA for the majority of coding genes within the C. elegans genome.
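The reaction counts quoted above follow from simple multiplication, which can be checked with a short sketch. The function name and its parameters are hypothetical; only the chip dimensions come from the abstract.

```python
def reactions_per_chip(samples, targets, arrays=1):
    """Number of reactions when every sample meets every target.

    `arrays` is the number of independent sample-by-target arrays
    on one chip.
    """
    return samples * targets * arrays


# First chip type: 96 samples against 96 targets.
full_chip = reactions_per_chip(96, 96)

# Second chip type: six arrays of 12 x 12 each.
six_array_chip = reactions_per_chip(12, 12, arrays=6)
```

The two values, 9,216 and 864, match the figures stated in the abstract.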
An issue often encountered when acquiring image data from fixed or anesthetized C. elegans is that worms cross and cluster with their neighbors. This problem is aggravated with increasing density of worms and creates challenges for imaging and quantification. We developed a FIJI-based workflow, Worm-align, that can be used to generate single- or multi-channel montages of user-selected, straightened and aligned worms from raw image data of C. elegans. Worm-align is a simple and user-friendly workflow that does not require prior training of either the user or the analysis algorithm. Montages generated with Worm-align can aid the visual inspection of worms, their classification and representation. In addition, the output of Worm-align can be used for subsequent quantification of fluorescence intensity in single worms, either in FIJI directly, or in other image analysis software platforms. We demonstrate this by importing the Worm-align output into Worm_CP, a pipeline that uses the open-source CellProfiler software. CellProfiler's flexibility enables the incorporation of additional modules for high-content screening. As a practical example, we have used the pipeline on two datasets: the first comprises images of heat shock reporter worms that express green fluorescent protein (GFP) under the control of the promoter of the heat shock inducible gene hsp-70, and the second comprises images obtained from fixed worms stained for fat stores with a fluorescent dye.
Axons are diverse. They have different lengths, different branching patterns, and different biological roles. Methods to study axon degeneration are also diverse. The result is a bewildering range of experimental systems in which to study mechanisms of axon degeneration, and it is difficult to extrapolate from one neuron type and one method to another. The purpose of this chapter is to help readers to do this and to choose the methods most appropriate for answering their particular research question.
An amendment to this paper has been published and can be accessed via a link at the top of the paper.
Regulatory T (Treg) cell populations are composed of functionally quiescent resting Treg (rTreg) cells which differentiate into activated Treg (aTreg) cells upon antigen stimulation. How rTreg cells remain quiescent despite chronic exposure to cognate self- and foreign antigens is unclear. The transcription factor BACH2 is critical for early Treg lineage specification, but its function following lineage commitment is unresolved. Here, we show that BACH2 is repurposed following Treg lineage commitment and promotes the quiescence and long-term maintenance of rTreg cells. Bach2 is highly expressed in rTreg cells but is down-regulated in aTreg cells and during inflammation. In rTreg cells, BACH2 binds to enhancers of genes involved in aTreg differentiation and represses their TCR-driven induction by competing with AP-1 factors for DNA binding. This function promotes rTreg cell quiescence and long-term maintenance and is required for immune homeostasis and durable immunosuppression in cancer. Thus, BACH2 supports a "division of labor" between quiescent rTreg cells and their activated progeny in Treg maintenance and function, respectively.
Genetic variations underlying susceptibility to complex autoimmune and allergic diseases are concentrated within noncoding regulatory elements termed enhancers. The functions of a large majority of disease-associated enhancers are unknown, in part owing to their distance from the genes they regulate, a lack of understanding of the cell types in which they operate, and our inability to recapitulate the biology of immune diseases in vitro. Here, using shared synteny to guide loss-of-function analysis of homologues of human enhancers in mice, we show that the prominent autoimmune and allergic disease risk locus at chromosome 11q13.5 contains a distal enhancer that is functional in CD4+ regulatory T (Treg) cells and required for Treg-mediated suppression of colitis. The enhancer recruits the transcription factors STAT5 and NF-κB to mediate signal-driven expression of Lrrc32, which encodes the protein glycoprotein A repetitions predominant (GARP). Whereas disruption of the Lrrc32 gene results in early lethality, mice lacking the enhancer are viable but lack GARP expression in Foxp3+ Treg cells, which are unable to control colitis in a cell-transfer model of the disease. In human Treg cells, the enhancer forms conformational interactions with the promoter of LRRC32, and enhancer risk variants are associated with reduced histone acetylation and GARP expression. Finally, functional fine-mapping of 11q13.5 using CRISPR-activation (CRISPRa) identifies a CRISPRa-responsive element in the vicinity of risk variant rs11236797 capable of driving GARP expression. These findings provide a mechanistic basis for association of the 11q13.5 risk locus with immune-mediated diseases and identify GARP as a potential target in their therapy.
While colocalization within a bacterial operon enables coexpression of the constituent genes, the mechanistic logic of clustering of nonhomologous monocistronic genes in eukaryotes is not immediately obvious. Biosynthetic gene clusters that encode pathways for specialized metabolites are an exception to the classical eukaryote rule of random gene location and provide paradigmatic exemplars with which to understand eukaryotic cluster dynamics and regulation. Here, using 3C, Hi-C, and Capture Hi-C (CHi-C) organ-specific chromosome conformation capture techniques along with high-resolution microscopy, we investigate how chromosome topology relates to transcriptional activity of clustered biosynthetic pathway genes in Arabidopsis thaliana. Our analyses reveal that biosynthetic gene clusters are embedded in local hot spots of 3D contacts that segregate cluster regions from the surrounding chromosome environment. The spatial conformation of these cluster-associated domains differs between transcriptionally active and silenced clusters. We further show that silenced clusters associate with heterochromatic chromosomal domains toward the periphery of the nucleus, while transcriptionally active clusters relocate away from the nuclear periphery. Examination of chromosome structure at unrelated clusters in maize, rice, and tomato indicates that integration of clustered pathway genes into distinct topological domains is a common feature in plant genomes. Our results shed light on the potential mechanisms that constrain coexpression within clusters of nonhomologous eukaryotic genes and suggest that gene clustering in the one-dimensional chromosome is accompanied by compartmentalization of the 3D chromosome.
Lipopolysaccharide (LPS) endotoxin stimulates pro-inflammatory pathways and is a key player in the pathological mechanisms involved in the development of endometritis. This study aimed to investigate LPS-induced DNA methylation changes in bovine endometrial epithelial cells (bEECs), which may affect endometrial function. Following in vitro culture, bEECs from three cows were either untreated (0) or exposed to 2 and 8 μg/mL LPS for 24 h.
The alarm cytokine interleukin-1β (IL-1β) is a potent activator of the inflammatory cascade following pathogen recognition. IL-1β production typically requires two signals: first, priming by recognition of pathogen-associated molecular patterns leads to the production of immature pro-IL-1β; subsequently, inflammasome activation by a secondary signal allows cleavage and maturation of IL-1β from its pro-form. However, despite the important role of IL-1β in controlling local and systemic inflammation, its overall regulation is still not fully understood. Here we demonstrate that peritoneal tissue-resident macrophages use an active inhibitory pathway to suppress IL-1β processing, which can otherwise occur in the absence of a second signal. Programming by the transcription factor Gata6 controls the expression of prostacyclin synthase, which is required for prostacyclin production after lipopolysaccharide stimulation and optimal induction of IL-10. In the absence of a secondary signal, IL-10 potently inhibits IL-1β processing, providing a previously unrecognized control of IL-1β in tissue-resident macrophages.