AI framework can identify and track new and concerning COVID-19 variants

2024-03-11

Scientists at The Universities of Manchester and Oxford have developed an AI framework that can identify and track new and concerning COVID-19 variants and could help with other infections in the future.

The framework combines dimension reduction techniques and a new explainable clustering algorithm called CLASSIX, developed by mathematicians at The University of Manchester. This allows groups of viral genomes that might present a risk in the future to be identified quickly from huge volumes of data.

The study, presented this week in the journal PNAS, could support traditional methods of tracking viral evolution, such as phylogenetic analysis, which currently require extensive manual curation.

Since the emergence of COVID-19, we have seen multiple waves of new variants, heightened transmissibility, evasion of immune responses, and increased severity of illness.

"Scientists are now intensifying efforts to pinpoint these worrying new variants, such as alpha, delta and omicron, at the earliest stages of their emergence. If we can find a way to do this quickly and efficiently, it will enable us to be more proactive in our response, such as tailored vaccine development, and may even enable us to eliminate the variants before they become established," said Roberto Cahuantzi, researcher at The University of Manchester and first and corresponding author of the paper.

Like many other RNA viruses, SARS-CoV-2, the virus that causes COVID-19, has a high mutation rate and a short time between generations, meaning it evolves extremely rapidly. As a result, identifying new strains that are likely to be problematic in the future requires considerable effort.

Currently, there are almost 16 million sequences available on the GISAID database (the Global Initiative on Sharing All Influenza Data), which provides access to genomic data of influenza viruses.

Mapping the evolution and history of all COVID-19 genomes from this data currently requires extremely large amounts of computing and human time.

The described method allows such tasks to be automated. The researchers processed 5.7 million high-coverage sequences in only one to two days on a standard modern laptop, which would not be possible with existing methods. Because it needs fewer resources, the approach puts identification of concerning pathogen strains in the hands of more researchers.

Thomas House, Professor of Mathematical Sciences at The University of Manchester, said: "The unprecedented amount of genetic data generated during the pandemic demands improvements to our methods to analyse it thoroughly. The data is continuing to grow rapidly but without showing a benefit to curating this data, there is a risk that it will be removed or deleted.

"We know that human expert time is limited, so our approach should not replace the work of humans all together but work alongside them to enable the job to be done much quicker and free our experts for other vital developments."

The proposed method works by breaking down genetic sequences of the COVID-19 virus into smaller "words" (called 3-mers) represented as numbers by counting them. Then, it groups similar sequences together based on their word patterns using machine learning techniques.
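The counting step described above can be illustrated with a minimal Python sketch. This is not the authors' code: the function names (`kmer_vector`, `normalise`, `euclidean`), the decision to skip windows containing ambiguous bases, and the toy sequences are all assumptions made for illustration. It turns each sequence into a 64-element vector of 3-mer counts and compares the resulting vectors by distance, which is the kind of numeric input a dimension reduction and clustering step could then work on.

```python
from collections import Counter
from itertools import product
from math import sqrt

K = 3
# All 4**3 = 64 possible 3-mers over the nucleotide alphabet, in a fixed order.
KMERS = ["".join(p) for p in product("ACGT", repeat=K)]


def kmer_vector(sequence):
    """Count every overlapping 3-mer in a sequence and return a 64-element list.

    Windows containing characters other than A, C, G, T (for example the
    ambiguity code N) are ignored; this is an assumption of the sketch.
    """
    seq = sequence.upper()
    counts = Counter(seq[i:i + K] for i in range(len(seq) - K + 1))
    return [counts.get(kmer, 0) for kmer in KMERS]


def normalise(vector):
    """Scale counts to frequencies so sequences of different length compare fairly."""
    total = sum(vector)
    return [v / total for v in vector] if total else list(vector)


def euclidean(a, b):
    """Plain Euclidean distance between two equal-length vectors."""
    return sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))


if __name__ == "__main__":
    # Toy sequences (hypothetical, not real viral genomes).
    seqs = {
        "seq_a": "ACGTACGTACGTTTGACC",
        "seq_b": "ACGTACGTACGATTGACC",   # one substitution relative to seq_a
        "seq_c": "GGGGCCCCAAAATTTTGG",
    }
    vectors = {name: normalise(kmer_vector(s)) for name, s in seqs.items()}
    for other in ("seq_b", "seq_c"):
        print("seq_a vs", other, round(euclidean(vectors["seq_a"], vectors[other]), 4))
```

In a pipeline of the kind the article describes, vectors like these would be passed to a dimension reduction step and then to a clustering algorithm such as CLASSIX; the pairwise distance here only stands in for that grouping step.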

Stefan Güttel, Professor of Applied Mathematics at the University of Manchester, said: "The clustering algorithm CLASSIX we developed is much less computationally demanding than traditional methods and is fully explainable, meaning that it provides textual and visual explanations of the computed clusters."

Roberto Cahuantzi added: "Our analysis serves as a proof of concept, demonstrating the potential use of machine learning methods as an alert tool for the early discovery of emerging major variants without relying on the need to generate phylogenies.

"Whilst phylogenetics remains the 'gold standard' for understanding the viral ancestry, these machine learning methods can accommodate several orders of magnitude more sequences than the current phylogenetic methods and at a low computational cost."

Source: news-medical
