Publications | Sagiv Shifman Lab

Publications

2023
Rosenski, J., Shifman, S. & Kaplan, T. Predicting gene knockout effects from expression data. BMC Med Genomics 16, 26 (2023).
BACKGROUND: The study of gene essentiality, which measures the importance of a gene for cell division and survival, is used for the identification of cancer drug targets and understanding of tissue-specific manifestation of genetic conditions. In this work, we analyze essentiality and gene expression data from over 900 cancer lines from the DepMap project to create predictive models of gene essentiality. METHODS: We developed machine learning algorithms to identify those genes whose essentiality levels are explained by the expression of a small set of "modifier genes". To identify these gene sets, we developed an ensemble of statistical tests capturing linear and non-linear dependencies. We trained several regression models predicting the essentiality of each target gene, and used an automated model selection procedure to identify the optimal model and hyperparameters. Overall, we examined linear models, gradient boosted trees, Gaussian process regression models, and deep learning networks. RESULTS: We identified nearly 3000 genes for which we accurately predict essentiality using gene expression data of a small set of modifier genes. We show that both in the number of genes we successfully make predictions for, as well as in the prediction accuracy, our model outperforms current state-of-the-art works. CONCLUSIONS: Our modeling framework avoids overfitting by identifying the small set of modifier genes, which are of clinical and genetic importance, and ignores the expression of noisy and irrelevant genes. Doing so improves the accuracy of essentiality prediction in various conditions and provides interpretable models. Overall, we present an accurate computational approach, as well as interpretable modeling of essentiality in a wide range of cellular conditions, thus contributing to a better understanding of the molecular mechanisms that govern tissue-specific effects of genetic disease and cancer.
Alkelai, A. et al. Genetic insights into childhood-onset schizophrenia: The yield of clinical exome sequencing. Schizophr Res 252, 138-145 (2023).
Childhood-onset schizophrenia (COS) is a rare form of schizophrenia with an onset prior to 13 years of age. Although genetic factors play a role in COS etiology, only a few causal variants have been reported to date. This study presents a diagnostic exome sequencing (ES) in 37 Israeli Jewish families with a proband diagnosed with COS. By implementing a trio/duo ES approach and applying a well-established diagnostic pipeline, we detected clinically significant variants in 7 probands (19 %). These single nucleotide variants and indels were mostly inherited. The implicated genes were ANKRD11, GRIA2, CHD2, CLCN3, CLTC, IGF1R and MICU1. In a secondary analysis that compared COS patients to 4721 healthy controls, we observed that patients had a significant enrichment of rare loss of function (LoF) variants in LoF intolerant genes associated with developmental diseases. Taken together, ES could be considered as a valuable tool in the genetic workup for COS patients.
2022
Shohat, S., Vol, E. & Shifman, S. Gene essentiality in cancer cell lines is modified by the sex chromosomes. Genome Res (2022). doi:10.1101/gr.276488.121
Human sex differences arise from gonadal hormones and sex chromosomes. Studying the direct effects of sex chromosomes in humans is still challenging. Here we studied how the sex chromosomes can modulate gene expression and the outcome of mutations across the genome by exploiting the tendency of cancer cell lines to lose or gain sex chromosomes. We inferred the dosage of the sex chromosomes in 355 female and 408 male cancer cell lines and used it to dissect the contribution of the Y and X Chromosomes to sex-biased gene expression. Furthermore, based on genome-wide CRISPR screens, we identified genes whose essentiality is different between male and female cells depending on the sex chromosomes. The most significant genes were X-linked genes compensated by Y-linked paralogs. Our sex-based analysis identifies genes that, when mutated, can affect male and female cells differently and reinforces the role of the X and Y-Chromosomes in sex-specific cell function.
Herman, N., Kadener, S. & Shifman, S. The chromatin factor ROW cooperates with BEAF-32 in regulating long-range inducible genes. EMBO Rep e54720 (2022). doi:10.15252/embr.202254720
Insulator proteins located at the boundaries of topological associated domains (TAD) are involved in higher-order chromatin organization and transcription regulation. However, it is still not clear how long-range contacts contribute to transcriptional regulation. Here, we show that relative-of-WOC (ROW) is essential for the long-range transcription regulation mediated by the boundary element-associated factor of 32kD (BEAF-32). We find that ROW physically interacts with heterochromatin proteins (HP1b and HP1c) and the insulator protein (BEAF-32). These proteins interact at TAD boundaries where ROW, through its AT-hook motifs, binds AT-rich sequences flanked by BEAF-32-binding sites and motifs. Knockdown of row downregulates genes that are long-range targets of BEAF-32 and bound indirectly by ROW (without binding motif). Analyses of high-throughput chromosome conformation capture (Hi-C) data reveal long-range interactions between promoters of housekeeping genes bound directly by ROW and promoters of developmental genes bound indirectly by ROW. Thus, our results show cooperation between BEAF-32 and the ROW complex, including HP1 proteins, to regulate the transcription of developmental and inducible genes through long-range interactions.
Dvir, E., Shohat, S., Flint, J. & Shifman, S. Identification of genetic mechanisms for tissue-specific genetic effects based on CRISPR screens. Genetics (2022). doi:10.1093/genetics/iyac134
A major challenge in genetic studies of complex diseases is to determine how the action of risk genes is restricted to a tissue or cell type. Here we investigate tissue specificity of gene action using CRISPR screens from 786 cancer cell lines originating from 24 tissues. We find that the expression pattern of the gene across tissues explains only a minority of cases of tissue-specificity (9%), while gene amplification and the expression levels of paralogs account for 39.5% and 15.5%, respectively. Additionally, the transfer of small molecules to mutant cells explains tissue-specific gene action in blood. The tissue-specific genes we found are not specific just for human cancer cell lines: we found that the tissue-specific genes are intolerant to functional mutations in the human population and are associated with human diseases more than genes that are essential across all cell types. Our findings offer important insights into genetic mechanisms for tissue specificity of human diseases.
2021
Alkelai, A. et al. Expansion of the GRIA2 phenotypic representation: a novel de novo loss of function mutation in a case with childhood onset schizophrenia. J Hum Genet (2021). doi:10.1038/s10038-020-00846-1
Childhood-onset schizophrenia (COS) is a rare form of schizophrenia with an onset before 13 years of age. There is rising evidence that genetic factors play a major role in COS etiology, yet, only a few single gene mutations have been discovered. Here we present a diagnostic whole-exome sequencing (WES) in an Israeli Jewish female with COS and additional neuropsychiatric conditions such as obsessive-compulsive disorder (OCD), anxiety, and aggressive behavior. Variant analysis revealed a de novo novel stop gained variant in GRIA2 gene (NM_000826.4: c.1522 G > T (p.Glu508Ter)). GRIA2 encodes for a subunit of the AMPA sensitive glutamate receptor (GluA2) that functions as ligand-gated ion channel in the central nervous system and plays an important role in excitatory synaptic transmission. GluA2 subunit mutations are known to cause variable neurodevelopmental phenotypes including intellectual disability, autism spectrum disorder, epilepsy, and OCD. Our findings support the potential diagnostic role of WES in COS, identify GRIA2 as possible cause to a broad psychiatric phenotype that includes COS as a major manifestation and expand the previously reported GRIA2 loss of function phenotypes.
Shohat, S., Amelan, A. & Shifman, S. Convergence and Divergence in the Genetics of Psychiatric Disorders From Pathways to Developmental Stages. Biol Psychiatry (2021). doi:10.1016/j.biopsych.2020.05.019
In the past decade, the identification of susceptibility genes for psychiatric disorders has become routine, but understanding the biology underlying these discoveries has proven extremely difficult. The large number of potential risk genes and the genetic overlap between disorders are major obstacles for studying the etiology of these conditions. Systems biology approaches relying on gene ontologies, gene coexpression, and protein-protein interactions are used to identify convergence of the genes in relation to biological processes, cell types, brain areas, and developmental stages. Across psychiatric disorders, there is a clear enrichment for genes expressed in the brain and especially in the cortex, but a higher resolution is vastly dependent on sample size and statistical power. There is indication that susceptibility genes tend to be expressed in the brain during periods preceding the typical onset of the disorders. Thus, the role of genes in prenatal brain development is more pronounced for childhood-onset disorders, such as autism spectrum disorder and attention-deficit/hyperactivity disorder, but is much less so for bipolar disorder and depression. One of the most consistent findings across multiple disorders and classes of genetic variants is the role of genes intolerant to mutations in psychiatric disorders, yet this association is more pronounced for disorders with a clear neurodevelopmental component. Notwithstanding, a detailed understanding of the neurobiology of psychiatric disorders is still lacking. It is possible that it will only be revealed by studying the risk genes at the level of the development and function of neuronal networks and circuits.
2020
Winek, K. et al. Transfer RNA fragments replace microRNA regulators of the cholinergic poststroke immune blockade. Proc Natl Acad Sci U S A (2020). doi:10.1073/pnas.2013542117
Stroke is a leading cause of death and disability. Recovery depends on a delicate balance between inflammatory responses and immune suppression, tipping the scale between brain protection and susceptibility to infection. Peripheral cholinergic blockade of immune reactions fine-tunes this immune response, but its molecular regulators are unknown. Here, we report a regulatory shift in small RNA types in patient blood sequenced 2 d after ischemic stroke, comprising massive decreases of microRNA levels and concomitant increases of transfer RNA fragments (tRFs) targeting cholinergic transcripts. Electrophoresis-based size-selection followed by qRT-PCR validated the top six up-regulated tRFs in a separate cohort of stroke patients, and independent datasets of small and long RNA sequencing pinpointed immune cell subsets pivotal to these responses, implicating CD14+ monocytes in the cholinergic inflammatory reflex. In-depth small RNA targeting analyses revealed the most-perturbed pathways following stroke and implied a structural dichotomy between microRNA and tRF target sets. Furthermore, lipopolysaccharide stimulation of murine RAW 264.7 cells and human CD14+ monocytes up-regulated the top six stroke-perturbed tRFs, and overexpression of stroke-inducible tRF-22-WE8SPOX52 using a single-stranded RNA mimic induced down-regulation of immune regulator Z-DNA binding protein 1. In summary, we identified a "changing of the guards" between small RNA types that may systemically affect homeostasis in poststroke immune responses, and pinpointed multiple affected pathways, which opens new venues for establishing therapeutics and biomarkers at the protein and RNA level.
Amir, N. et al. Value-complexity tradeoff explains mouse navigational learning. PLoS Comput Biol 16, e1008497 (2020).
We introduce a novel methodology for describing animal behavior as a tradeoff between value and complexity, using the Morris Water Maze navigation task as a concrete example. We develop a dynamical system model of the Water Maze navigation task, solve its optimal control under varying complexity constraints, and analyze the learning process in terms of the value and complexity of swimming trajectories. The value of a trajectory is related to its energetic cost and is correlated with swimming time. Complexity is a novel learning metric which measures how unlikely is a trajectory to be generated by a naive animal. Our model is analytically tractable, provides good fit to observed behavior and reveals that the learning process is characterized by early value optimization followed by complexity reduction. Furthermore, complexity sensitively characterizes behavioral differences between mouse strains.
Suliman-Lavie, R. et al. Pogz deficiency leads to transcription dysregulation and impaired cerebellar activity underlying autism-like behavior in mice. Nat Commun 11, 5836 (2020).
Several genes implicated in autism spectrum disorder (ASD) are chromatin regulators, including POGZ. The cellular and molecular mechanisms leading to ASD impaired social and cognitive behavior are unclear. Animal models are crucial for studying the effects of mutations on brain function and behavior as well as unveiling the underlying mechanisms. Here, we generate a brain specific conditional knockout mouse model deficient for Pogz, an ASD risk gene. We demonstrate that Pogz deficient mice show microcephaly, growth impairment, increased sociability, learning and motor deficits, mimicking several of the human symptoms. At the molecular level, luciferase reporter assay indicates that POGZ is a negative regulator of transcription. In accordance, in Pogz deficient mice we find a significant upregulation of gene expression, most notably in the cerebellum. Gene set enrichment analysis revealed that the transcriptional changes encompass genes and pathways disrupted in ASD, including neurogenesis and synaptic processes, underlying the observed behavioral phenotype in mice. Physiologically, Pogz deficiency is associated with a reduction in the firing frequency of simple and complex spikes and an increase in amplitude of the inhibitory synaptic input in cerebellar Purkinje cells. Our findings support a mechanism linking heterochromatin dysregulation to cerebellar circuit dysfunction and behavioral abnormalities in ASD.
Dinstein, I. et al. The National Autism Database of Israel: a Resource for Studying Autism Risk Factors, Biomarkers, Outcome Measures, and Treatment Efficacy. J Mol Neurosci 70, 1303-1312 (2020).
Mandric, I. et al. Profiling immunoglobulin repertoires across multiple human tissues using RNA sequencing. Nat Commun 11, 3126 (2020).
Profiling immunoglobulin (Ig) receptor repertoires with specialized assays can be cost-ineffective and time-consuming. Here we report ImReP, a computational method for rapid and accurate profiling of the Ig repertoire, including the complementary-determining region 3 (CDR3), using regular RNA sequencing data such as those from 8,555 samples across 53 tissue types from 544 individuals in the Genotype-Tissue Expression (GTEx v6) project. Using ImReP and GTEx v6 data, we generate a collection of 3.6 million Ig sequences, termed the atlas of immunoglobulin repertoires (TAIR), across a broad range of tissue types that often do not have reported Ig repertoire information. Moreover, the flow of Ig clonotypes and inter-tissue repertoire similarities across immune-related tissues are also evaluated. In summary, TAIR is one of the largest collections of CDR3 sequences and tissue types, and should serve as an important resource for studying immunological diseases.
2019
Monderer-Rothkoff, G. et al. AUTS2 isoforms control neuronal differentiation. Mol Psychiatry (2019). doi:10.1038/s41380-019-0409-1
Mutations in AUTS2 are associated with autism, intellectual disability, and microcephaly. AUTS2 is expressed in the brain and interacts with polycomb proteins, yet it is still unclear how mutations in AUTS2 lead to neurodevelopmental phenotypes. Here we report that when neuronal differentiation is initiated, there is a shift in expression from a long isoform to a short AUTS2 isoform. Yeast two-hybrid screen identified the splicing factor SF3B1 as an interactor of both isoforms, whereas the polycomb group proteins, PCGF3 and PCGF5, were found to interact exclusively with the long AUTS2 isoform. Reporter assays showed that the first exons of the long AUTS2 isoform function as a transcription repressor, but the part that consists of the short isoform acts as a transcriptional activator, both influenced by the cellular context. The expression levels of PCGF3 influenced the ability of the long AUTS2 isoform to activate or repress transcription. Mouse embryonic stem cells (mESCs) with heterozygote mutations in Auts2 had an increase in cell death during in vitro corticogenesis, which was significantly rescued by overexpressing the human AUTS2 transcripts. mESCs with a truncated AUTS2 protein (missing exons 12-20) showed premature neuronal differentiation, whereas cells overexpressing AUTS2, especially the long transcript, showed an increase in expression of pluripotency markers and delayed differentiation. Taken together, our data suggest that the precise expression of AUTS2 isoforms is essential for regulating transcription and the timing of neuronal differentiation.
Filo, S. et al. Disentangling molecular alterations from water-content changes in the aging human brain using quantitative MRI. Nat Commun 10, 3403 (2019).
It is an open question whether aging-related changes throughout the brain are driven by a common factor or result from several distinct molecular mechanisms. Quantitative magnetic resonance imaging (qMRI) provides biophysical parametric measurements allowing for non-invasive mapping of the aging human brain. However, qMRI measurements change in response to both molecular composition and water content. Here, we present a tissue relaxivity approach that disentangles these two tissue components and decodes molecular information from the MRI signal. Our approach enables us to reveal the molecular composition of lipid samples and predict lipidomics measurements of the brain. It produces unique molecular signatures across the brain, which are correlated with specific gene-expression profiles. We uncover region-specific molecular changes associated with brain aging. These changes are independent from other MRI aging markers. Our approach opens the door to a quantitative characterization of the biological sources for aging, that until now was possible only post-mortem.
Chow, J. et al. Dissecting the genetic basis of comorbid epilepsy phenotypes in neurodevelopmental disorders. Genome Med 11, 65 (2019).
BACKGROUND: Neurodevelopmental disorders (NDDs) such as autism spectrum disorder, intellectual disability, developmental disability, and epilepsy are characterized by abnormal brain development that may affect cognition, learning, behavior, and motor skills. High co-occurrence (comorbidity) of NDDs indicates a shared, underlying biological mechanism. The genetic heterogeneity and overlap observed in NDDs make it difficult to identify the genetic causes of specific clinical symptoms, such as seizures. METHODS: We present a computational method, MAGI-S, to discover modules or groups of highly connected genes that together potentially perform a similar biological function. MAGI-S integrates protein-protein interaction and co-expression networks to form modules centered around the selection of a single "seed" gene, yielding modules consisting of genes that are highly co-expressed with the seed gene. We aim to dissect the epilepsy phenotype from a general NDD phenotype by providing MAGI-S with high confidence NDD seed genes with varying degrees of association with epilepsy, and we assess the enrichment of de novo mutation, NDD-associated genes, and relevant biological function of constructed modules. RESULTS: The newly identified modules account for the increased rate of de novo non-synonymous mutations in autism, intellectual disability, developmental disability, and epilepsy, and enrichment of copy number variations (CNVs) in developmental disability. We also observed that modules seeded with genes strongly associated with epilepsy tend to have a higher association with epilepsy phenotypes than modules seeded at other neurodevelopmental disorder genes. Modules seeded with genes strongly associated with epilepsy (e.g., SCN1A, GABRA1, and KCNB1) are significantly associated with synaptic transmission, long-term potentiation, and calcium signaling pathways. On the other hand, modules found with seed genes that are not associated or weakly associated with epilepsy are mostly involved with RNA regulation and chromatin remodeling. CONCLUSIONS: In summary, our method identifies modules enriched with de novo non-synonymous mutations and can capture specific networks that underlie the epilepsy phenotype and display distinct enrichment in relevant biological processes. MAGI-S is available at https://github.com/jchow32/magi-s .
Oron, O. et al. Gene network analysis reveals a role for striatal glutamatergic receptors in dysregulated risk-assessment behavior of autism mouse models. Transl Psychiatry 9, 257 (2019).
Autism spectrum disorder (ASD) presents a wide, and often varied, behavioral phenotype. Improper assessment of risks has been reported among individuals diagnosed with ASD. Improper assessment of risks may lead to increased accidents and self-injury, also reported among individuals diagnosed with ASD. However, there is little knowledge of the molecular underpinnings of the impaired risk-assessment phenotype. In this study, we have identified impaired risk-assessment activity in multiple male ASD mouse models. By performing network-based analysis of striatal whole transcriptome data from each of these ASD models, we have identified a cluster of glutamate receptor-associated genes that correlate with the risk-assessment phenotype. Furthermore, pharmacological inhibition of striatal glutamatergic receptors was able to mimic the dysregulation in risk-assessment. Therefore, this study has identified a molecular mechanism that may underlie risk-assessment dysregulation in ASD.
Shohat, S. & Shifman, S. Genes essential for embryonic stem cells are associated with neurodevelopmental disorders. Genome Res 29, 1910-1918 (2019).
Mouse embryonic stem cells (mESCs) are key components in generating mouse models for human diseases and performing basic research on pluripotency, yet the number of genes essential for mESCs is still unknown. We performed a genome-wide screen for essential genes in mESCs and compared it to screens in human cells. We found that essential genes are enriched for basic cellular functions, are highly expressed in mESCs, and tend to lack paralog genes. We discovered that genes that are essential specifically in mESCs play a role in pathways associated with their pluripotent state. We show that 29.5% of human genes intolerant to loss-of-function mutations are essential in mouse or human ESCs, and that the human phenotypes most significantly associated with genes essential for ESCs are neurodevelopmental. Our results provide insights into essential genes in the mouse, the pathways which govern pluripotency, and suggest that many genes associated with neurodevelopmental disorders are essential at very early embryonic stages.
2018
Mangul, S. et al. ROP: dumpster diving in RNA-sequencing to find the source of 1 trillion reads across diverse adult human tissues. Genome Biol 19, 36 (2018).
High-throughput RNA-sequencing (RNA-seq) technologies provide an unprecedented opportunity to explore the individual transcriptome. Unmapped reads are a large and often overlooked output of standard RNA-seq analyses. Here, we present Read Origin Protocol (ROP), a tool for discovering the source of all reads originating from complex RNA molecules. We apply ROP to samples across 2630 individuals from 54 diverse human tissues. Our approach can account for 99.9% of 1 trillion reads of various read length. Additionally, we use ROP to investigate the functional mechanisms underlying connections between the immune system, microbiome, and disease. ROP is freely available at https://github.com/smangul1/rop/wiki .
2017
Barbash, S. et al. Neuronal-expressed microRNA-targeted pseudogenes compete with coding genes in the human brain. Transl Psychiatry 7, e1199 (2017).
MicroRNAs orchestrate brain functioning via interaction with microRNA recognition elements (MRE) on target transcripts. However, the global impact of potential competition on the microRNA pool between coding and non-coding brain transcripts that share MREs with them remains unexplored. Here we report that non-coding pseudogene transcripts carrying MREs (PSG+MRE) often show duplicated origin, evolutionary conservation and higher expression in human temporal lobe neurons than comparable duplicated MRE-deficient pseudogenes (PSG-MRE). PSG+MRE participate in neuronal RNA-induced silencing complexes (RISC), indicating functional involvement. Furthermore, downregulation cell culture experiments validated bidirectional co-regulation of PSG+MRE with MRE-sharing coding transcripts, frequently not their mother genes, and with targeted microRNAs; also, PSG+MRE single-nucleotide polymorphisms associated with schizophrenia, bipolar disorder and autism, suggesting interaction with mental diseases. Our findings indicate functional roles of duplicated PSG+MRE in brain development and cognition, supporting physiological impact of the reciprocal co-regulation of PSG+MRE with MRE-sharing coding transcripts in human brain neurons.
Shohat, S., Ben-David, E. & Shifman, S. Varying Intolerance of Gene Pathways to Mutational Classes Explain Genetic Convergence across Neuropsychiatric Disorders. Cell Rep 18, 2217-2227 (2017).
Genetic susceptibility to intellectual disability (ID), autism spectrum disorder (ASD), and schizophrenia (SCZ) often arises from mutations in the same genes, suggesting that they share common mechanisms. We studied genes with de novo mutations in the three disorders and genes implicated in SCZ by genome-wide association study (GWAS). Using biological annotations and brain gene expression, we show that mutation class explains enrichment patterns more than specific disorder. Genes with loss-of-function mutations and genes with missense mutations were associated with different pathways across disorders. Conversely, gene expression patterns were specific for each disorder. ID genes were preferentially expressed in the cortex; ASD genes were expressed in the fetal cortex, cerebellum, and striatum; and genes associated with SCZ were expressed in the adolescent cortex. Our study suggests that convergence across neuropsychiatric disorders stems from common pathways that are consistently vulnerable to genetic variations but that spatiotemporal activity of genes contributes to specific phenotypes.