It is now possible for non-bioinformaticians to create knowledge networks – a powerful way for biologists to visualise deep connections between genes and phenotypes – quickly and efficiently, thanks to the integration of Rothamsted Research’s open-source KnetMiner software into the Genestack platform.
These new software tools make it easier for plant breeders and others to mine genomics data to find novel ways to improve the performance of all kinds of crops.
Dr. Keywan Hassani-Pak, head of bioinformatics at Rothamsted Research and leader of the KnetMiner project, explains:
“Genotype to phenotype analysis is at the core of what biologists do. With KnetMiner we have created software that enables biologists to take their own high-throughput experimental data and to see them in the context of all the public knowledge that is out there. This can help them interpret their own data faster and more effectively.
“For a particular target species, such as a crop plant, KnetMiner integrates all the relevant genomics and omics information that is present in more than 25 sources under a multitude of formats. KnetMiner brings it together in the form of a heterogeneous knowledge network. We don’t only integrate the data; we also create new relationships based, for example, on co-occurrences of genes and phenotypes in the scientific literature. We are the first in the UK to develop such detailed networks and make them mineable. We are talking about up to a million nodes here.”
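The idea of deriving new relationships from co-occurrences in the literature can be illustrated with a toy sketch. It is not KnetMiner's actual text-mining method, which is not described here; the function name, the `min_count` threshold, and the gene and phenotype labels are all hypothetical. It simply counts how many documents mention a gene and a phenotype together and keeps the pairs that recur.

```python
from collections import Counter
from itertools import product


def cooccurrence_edges(documents, min_count=2):
    """Return {(gene, phenotype): count} for pairs seen together in
    at least `min_count` documents (hypothetical threshold)."""
    counts = Counter()
    for doc in documents:
        # Use sets so each pair is counted at most once per document.
        for gene, phenotype in product(set(doc["genes"]), set(doc["phenotypes"])):
            counts[(gene, phenotype)] += 1
    return {pair: n for pair, n in counts.items() if n >= min_count}


# Made-up example data: each entry stands for one publication.
documents = [
    {"genes": ["GENE_A", "GENE_B"], "phenotypes": ["drought tolerance"]},
    {"genes": ["GENE_A"], "phenotypes": ["drought tolerance", "grain yield"]},
    {"genes": ["GENE_B"], "phenotypes": ["grain yield"]},
]

edges = cooccurrence_edges(documents)
print(edges)
```

Here only GENE_A and "drought tolerance" appear together in two documents, so that pair alone becomes a candidate relationship; a real system would need entity recognition and some statistical scoring rather than a raw count.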
Plant scientists and others saw the potential of KnetMiner and approached Rothamsted to help them create a secure system they could use with their own data.
KnetMiner is an exciting visualisation tool, but creating a network for a new species could take many months and the software was complex to use. With the benefit of Innovate UK funding, Rothamsted worked with Cambridge-based Genestack to migrate KnetMiner onto the Genestack platform.
Dr. Misha Kapushesky of Genestack explains: “The Rothamsted researchers could spend months collecting all the data that was available for a particular organism, cleaning the data and writing scripts to transfer it into a format that was usable in KnetMiner and then presenting it so that other scientists could use the information.
“We migrated the visualisation software and automated the collection process by making it part of our Genestack ecosystem. It is now possible to simply ‘point and click’ on data that is in the public domain to create a network and then overlay your own data, using KnetMiner to visualise it. You can build your own network with collaborators in a secure environment. It is no longer a fixed set of data on the Rothamsted website but a dynamic tool that can be made commercially available.”
Genestack now hosts over 40 plant and crop networks, as well as a prototype human disease network. Although it originated in agri-research, network mining for gene discovery is generic and Genestack provides an environment for building and distributing these large-scale knowledge networks.
Knowledge networks are a way of showing visually the connections between the phenotypes and the genotype of a given species. Nodes of different shapes represent various biological entities (such as genes, publications, or pathways), and they are connected by relevant relationships (such as encodes, published_in, interacts_with). Such networks are well suited to showing complex and highly interconnected biological data.
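As a rough illustration of this structure, a heterogeneous network can be sketched as typed nodes joined by labelled relationships. The class below is a minimal sketch, not KnetMiner's or Genestack's data model: the `KnowledgeNetwork` class, its methods, the "Protein" node type, and all identifiers are assumptions made for the example. Only the relationship names come from the description above.

```python
from collections import defaultdict
from dataclasses import dataclass


@dataclass(frozen=True)
class Node:
    node_id: str
    kind: str  # entity type, e.g. "Gene", "Publication", "Pathway"


class KnowledgeNetwork:
    """Toy heterogeneous network: typed nodes, labelled directed edges."""

    def __init__(self):
        self.nodes = {}
        self.edges = defaultdict(list)  # source id -> [(relation, target id)]

    def add_node(self, node_id, kind):
        self.nodes[node_id] = Node(node_id, kind)

    def add_edge(self, source, relation, target):
        # Refuse edges that point at nodes the network does not contain.
        for node_id in (source, target):
            if node_id not in self.nodes:
                raise KeyError(f"unknown node: {node_id}")
        self.edges[source].append((relation, target))

    def neighbours(self, node_id, relation=None):
        """Nodes reachable from `node_id`, optionally filtered by relation."""
        return [
            self.nodes[target]
            for rel, target in self.edges.get(node_id, [])
            if relation is None or rel == relation
        ]


# Hypothetical identifiers, for illustration only.
net = KnowledgeNetwork()
net.add_node("GENE_A", "Gene")
net.add_node("GENE_B", "Gene")
net.add_node("PROTEIN_A", "Protein")
net.add_node("PAPER_1", "Publication")
net.add_edge("GENE_A", "encodes", "PROTEIN_A")
net.add_edge("GENE_A", "interacts_with", "GENE_B")
net.add_edge("GENE_A", "published_in", "PAPER_1")

for node in net.neighbours("GENE_A"):
    print(node.kind, node.node_id)
```

Filtering by relationship, for example `net.neighbours("GENE_A", "published_in")`, returns only the publication node, which is the kind of question a typed network makes easy to ask.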
“There are a lot of tools out there that will return a list of ranked genes when you are conducting a gene candidate analysis, and of course KnetMiner also does that with its evidence-based gene rank algorithm. But most of them also stop there,” explains Dr. Hassani-Pak.
“KnetMiner is unique as it allows users to see how and why the prediction was made. They can fully understand the results because the process is completely transparent and the provenance is visualised. There is no black box approach here.”
With “Network view”, users can draw on the information present in the network to make new discoveries and form hypotheses; this in turn can spur ideas for further analysis.
Dr. Hassani-Pak and Dr. Kapushesky believe that this approach supports human-augmented knowledge discovery, which puts human experts – rather than machines – at the core of the decision-making process.
“The human brain is so powerful; we need to free it from tedious tasks,” comments Dr. Kapushesky. “By reducing the complexity it makes it easier for researchers to see the patterns and links that push the frontiers of science further, and the tools also make it possible for others to apply the findings in a commercial environment.”
To download the KnetMiner user guide, request access to a free tester account, and register your interest for an upcoming webinar on KnetMiner, visit: www.genestack.com/landings/ebook-knetminer-multi-omics-network-mining/
To discover the potential of KnetMiner running on Genestack, watch: www.youtube.com/watch?v=5z0wGxywArQ
Image shows: Misha Kapushesky, Genestack CEO, [left] with Keywan Hassani-Pak, head of bioinformatics at Rothamsted Research [right] – image credit: Rothamsted Research