Here we showcase exceptional, recently published open source tools and open data that could be useful for data-driven life science research, as well as research resulting from their use. The Data Highlights are written by the SciLifeLab Data Platform editorial team or external contributors. Please click on the button below to suggest work that should be the subject of a data highlight. You can suggest your own tool/data/research, or someone else’s, and the editorial team will get in touch with you as soon as possible to discuss your suggestion.
February 24, 2023
Discovery of sex determination genes in a malaria parasite that are essential for mosquito transmission
Russell et al. (2023) detail the discovery of sex determination genes in the malaria parasite Plasmodium berghei and show that these genes are essential for the transmission of malaria via mosquitoes. Data and code are shared openly.
January 20, 2023
Schaal et al. (2022) found that long-read sequencing was more sensitive than Sanger sequencing for detecting mutations associated with the development of resistance to certain cancer drugs. Custom software produced as part of this study is shared openly.
December 19, 2022
Cryo-electron tomography allows new knowledge about poliovirus replication and assembly sites in situ
Dahmane et al. (2022) used cryo-electron tomography to provide an integrated structural framework for multiple stages of the poliovirus life cycle. Data and code are shared openly.
October 21, 2022
A study by Jamy et al. shows that hundreds of transitions have occurred between marine and non-marine habitats over the course of two billion years of eukaryotic evolution. Code and data are shared openly.
October 13, 2022
GenErode pipeline can compare patterns of genomic erosion using genomic data from historical, ancient and modern samples
The GenErode pipeline is the first bioinformatics pipeline that can process and analyse ancient, historical, and modern sequencing data from the same species with the aim of generating comparable estimates of genomic erosion indices.
October 6, 2022
The study uses a Bayesian deep-learning model trained using fossil evidence of mammal-plant interactions to explore the origin and expansion of open grassland in North America. The study used open data sources, and openly shares the code and functionalities produced.
September 29, 2022
‘SubCellBarCode’ – a subcellular proteome resource and analysis pipeline now available on the Data Platform
The SubCellBarCode project focusses on determining the subcellular localisation of proteins. We explore two major publications produced by the project in detail as well as the resources openly shared from it.
August 12, 2022
New study from Feiner and colleagues shows that environmental stressors induce DNA methylation in D. magna and that this epigenetic effect is heritable. Sequence data and code to analyse DNA-methylation and fitness are openly shared.
July 7, 2022
Recent publication shows how scientists at SciLifeLab use cryo-EM to reveal the sequential steps involved in mitoribosome assembly. Data has been made publicly available.
June 21, 2022
The study shows that deep learning frameworks are a viable approach for estimating patterns of biodiversity over a large area. Publicly available data sources were used in the study, and all code related to the deep learning framework is openly shared.
May 10, 2022
Leo and colleagues used multi-omic analysis to study childhood acute lymphoblastic leukemia (ALL). A new tool for improving childhood ALL treatment was developed and shared.
May 6, 2022
Study gives insight into host-viral interactions of Crimean-Congo hemorrhagic fever, an infectious disease without available treatments. Raw RNAseq, mass spectrometry proteomics data, and code shared.
May 5, 2022
Novel FoldDock pipeline uses AlphaFold2 to provide accurate predictions of heterodimeric complex structures
A new pipeline called FoldDock uses AlphaFold2 to provide accurate predictions of heterodimeric complex structures. This pipeline has the potential to rapidly expand knowledge about structural protein interactions at a low cost. The code required to run FoldDock and reproduce the analysis has been published on GitLab.
April 25, 2022
Expression of 602 host proteins was evaluated in populations of infected and non-infected cells using immunofluorescence. ≈75,000 images have been published as a resource for further studies.