Data Highlights

Highlighting new open source tools and open datasets, and research based on them.

Here we showcase exceptional, recently published open source tools and open data that could be useful for data-driven life science research, as well as research resulting from their use. The Data Highlights are written by the SciLifeLab Data Platform editorial team or external contributors.

Please click on the button below to suggest work that should be the subject of a Data Highlight. You can suggest your own tool/data/research, or someone else’s, and the editorial team will get in touch with you as soon as possible to discuss your suggestion.
October 21, 2022
A study by Jamy et al. shows that hundreds of transitions have occurred between marine and non-marine habitats over the course of two billion years of eukaryotic evolution. Code and data are shared openly.
October 13, 2022
The GenErode pipeline is the first bioinformatics pipeline that can process and analyse ancient, historical, and modern sequencing data from the same species with the aim of generating comparable estimates of genomic erosion indices.
October 6, 2022
The study uses a Bayesian deep-learning model trained on fossil evidence of mammal-plant interactions to explore the origin and expansion of open grassland in North America. The study uses open data sources and openly shares the code and functionality produced.
September 29, 2022
The SubCellBarCode project focusses on determining the subcellular localisation of proteins. We explore two major publications produced by the project in detail as well as the resources openly shared from it.
August 12, 2022
New study from Feiner and colleagues shows that environmental stressors induce DNA methylation in D. magna and that this epigenetic effect is heritable. Sequence data and code to analyse DNA methylation and fitness are openly shared.
July 7, 2022
Recent publication shows how scientists at SciLifeLab use cryo-EM to reveal the sequential steps involved in mitoribosome assembly. Data has been made publicly available.
June 21, 2022
The study shows that deep learning frameworks are a viable approach to estimating patterns of biodiversity over a large area. Publicly available data sources were used in the study, and all code related to the deep learning framework is openly shared.
May 10, 2022
Leo and colleagues used multi-omic analysis to study childhood acute lymphoblastic leukemia (ALL). A new tool for improving childhood ALL treatment was developed and shared.
May 6, 2022
Study gives insight into host-viral interactions of Crimean-Congo hemorrhagic fever, an infectious disease without available treatments. Raw RNAseq, mass spectrometry proteomics data, and code shared.
May 5, 2022
New pipeline called FoldDock uses AlphaFold2 to provide accurate predictions of heterodimeric complex structures. This pipeline has the potential to rapidly expand knowledge about structural protein interactions at a low cost. The code required to run FoldDock and reproduce the analysis has been published on GitLab.
April 25, 2022
Expression of 602 host proteins was evaluated in populations of infected and non-infected cells using immunofluorescence. Approximately 75,000 images have been published as a resource for further studies.