ARTIFICIAL INTELLIGENCE IN GENETICS

Artificial intelligence algorithms have brought genetic analysis into a new era of capabilities. Find out how DARWIN used advanced systems to boost predictive performance.

1.1 billion genetic variations

4.7 million publications on genetics

9% annual growth in scientific understanding

ARTIFICIAL INTELLIGENCE AND GENETIC DATA

The number of scientific publications in the field of genetics has begun to grow exponentially. It is clear that humans can no longer keep track of the scientific knowledge that is essential for interpreting an individual’s genomic data.

In addition, the field of genetics is beginning to move beyond the traditional approach of “One Gene – one Effect” and to look at the interactions of many thousands of genes and patterns in the genome. Find out how DARWIN companies tackle these new challenges by using artificial intelligence in the new era of genetics.

THE GROWTH OF OUR SCIENTIFIC UNDERSTANDING

In 2021, there were more than 28 million scientific publications listed in the life science database PubMed. Of these, around 4.7 million focused on genetics. These publications contain everything humanity knows about the effects of specific genetic variations on an individual’s health and traits. And our understanding is growing fast. At an average growth rate of 9% per year, there will be more than 10 million genetics publications by 2030.

Even today, 1000 scientists would need 5 years to read every existing publication on genetics. By the time they finished, another 2.5 million new studies would have been published. It is clear that humans can no longer keep track of the advances in genetics, and we need to find other solutions if we ever want to perform a comprehensive analysis of a person’s genome.
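The figures above can be checked with a few lines of arithmetic. The sketch below is only an illustration: it assumes a constant 9% growth rate compounded yearly from the 2021 count, and the function names are made up for this example.

```python
# Figures quoted in the text (genetics publications on PubMed, 2021).
GENETICS_PUBLICATIONS_2021 = 4_700_000
ANNUAL_GROWTH_RATE = 0.09  # assumed constant and compounded once per year


def projected_publications(year: int) -> float:
    """Compound the 2021 count forward to the given year."""
    return GENETICS_PUBLICATIONS_2021 * (1 + ANNUAL_GROWTH_RATE) ** (year - 2021)


def new_publications(start_year: int, years: int) -> float:
    """Publications added during `years` years, starting at `start_year`."""
    return projected_publications(start_year + years) - projected_publications(start_year)


def papers_per_reader_per_day(readers: int, years: float) -> float:
    """Reading pace needed for `readers` people to cover the 2021 corpus."""
    return GENETICS_PUBLICATIONS_2021 / readers / (years * 365)


if __name__ == "__main__":
    print(f"2030 projection: {projected_publications(2030) / 1e6:.1f} million")
    print(f"New in 5 years:  {new_publications(2021, 5) / 1e6:.1f} million")
    print(f"Reading pace:    {papers_per_reader_per_day(1000, 5):.1f} per scientist per day")
```

Under these assumptions the projection lands just above 10 million publications in 2030, about 2.5 million new studies appear during the five years of reading, and each of the 1000 scientists would have to read between two and three publications every single day.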

USING AI TO CAPTURE SCIENTIFIC KNOWLEDGE

In 2019, DARWIN began building DEEP GENOME AI, a deep-learning neural network engine that is learning to read and understand publications on human genetics. The system extracts relevant data such as the type and identity of a genetic variation, the disease or trait it causes, the likelihood that it causes that trait, the population in which it was studied, and other relevant information. The gathered data is automatically curated and saved to a high-performance database suitable for automated genome interpretation.

As we teach the system how to understand a new publication, it can apply this knowledge to decipher hundreds of other publications, and thus the database and the AI system continue to grow and expand. This work is building towards a future in which a human genome, with its 3.2 billion genetic letters, can be uploaded and automatically interpreted according to the current state of science. The following day, when roughly another 1 150 publications have been published, the system will analyse them automatically and update the previous genome interpretation with this new scientific knowledge.
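To make the idea of an extracted record and a daily update more concrete, here is a minimal sketch. It is not DARWIN’s actual schema or code: the class names, field names and identifiers are hypothetical, and a plain in-memory dictionary stands in for the high-performance database.

```python
from dataclasses import dataclass, field


@dataclass(frozen=True)
class VariantFinding:
    """One fact extracted from one publication (hypothetical schema)."""
    variant_id: str      # identity of the genetic variation (placeholder IDs below)
    variant_type: str    # type of variation, e.g. "SNV" or "deletion"
    trait: str           # disease or trait the variation causes
    likelihood: float    # reported likelihood of the trait, between 0 and 1
    population: str      # population in which the effect was studied
    publication_id: str  # publication the finding was extracted from


@dataclass
class KnowledgeBase:
    """In-memory stand-in for the curated database, indexed by variant."""
    findings: dict[str, list[VariantFinding]] = field(default_factory=dict)

    def add(self, finding: VariantFinding) -> None:
        """Store a newly extracted finding under its variant."""
        self.findings.setdefault(finding.variant_id, []).append(finding)

    def interpret(self, genome_variants: set[str]) -> list[VariantFinding]:
        """Return every stored finding for the variants present in a genome."""
        return [f for v in sorted(genome_variants) for f in self.findings.get(v, [])]


if __name__ == "__main__":
    kb = KnowledgeBase()
    genome = {"variant-A", "variant-B"}  # placeholder variant identifiers

    kb.add(VariantFinding("variant-A", "SNV", "example trait 1", 0.30, "population X", "pub-1"))
    print(len(kb.interpret(genome)))  # interpretation on day one

    # The next day a new publication is processed and the genome is re-interpreted.
    kb.add(VariantFinding("variant-B", "deletion", "example trait 2", 0.10, "population Y", "pub-2"))
    print(len(kb.interpret(genome)))  # the same genome now yields more findings
```

The point of the sketch is the design choice rather than the details: because findings are stored per variation, re-running the interpretation of an uploaded genome picks up any newly added publication without touching the genome data itself.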

FREE ACCESS FOR EVERYONE

DARWIN provides limited access (one genetic variation at a time) through Genopedia, its free “Wikipedia” for genetic variations. Members of the general public and scientists in the field of genetics can browse through more than 1 billion genetic variations known to science. Each genetic variation has its own “Facebook page” listing all of the relevant information, such as:

Distribution and frequency in different ethnicities
A list of all scientific publications evaluating this particular variation
An AI-interpreted summary of the effect this variation has on the body

Browse through the current curated data DEEP GENOME AI has generated.

THE FUTURE OF GENOPEDIA AND DEEP GENOME AI

As Genopedia and DEEP GENOME AI continue to grow, so does their analytical power for genome analysis. This technology and data platform initially allows users to upload their own genetic data and use Genopedia as their personal genome browser, a gateway to the information hidden within their genes. DEEP GENOME AI keeps feeding the database, so the scope of possible interpretations will grow daily.

In future iterations of Genopedia, users will be able to run “Gene Apps” on their genetic code to determine overall risks and traits and to receive concrete action plans for nutrition, athletic performance, therapy optimization and health benefits.

But the technology will be able to do much more. Automated, scientifically up-to-date genome interpretation will revolutionize medical diagnostics, newborn screening and all other applications in the field of medical genetics. DARWIN will be able to provide the technology back end for the next generation of personalized medicine and human genetics.

USING AI TO FIND NEW GENETIC INTERACTIONS

Traditionally, the field of genetics has relied on the human-curated principle of “single genetic defect = single trait”. AI allows us to move beyond this overly simplistic interpretation of human genetics. By training DEEP GENOME AI on 100 000 human genome sequences, complete with medical and trait data, we are able to find interactions among thousands of genes and give more accurate disease risk predictions than traditional approaches. This powerful approach will give us.