ViralMSA: massively scalable reference-guided multiple sequence alignment of viral genomes


Combating the spread of viruses and their associated disease burden is a tremendous challenge requiring significant research efforts and dedicated measures that inform public health, veterinary care, and agricultural strategies. Viral sequence data is a major asset in the characterization of pathogens. Understanding the processes that generate genetic diversity assists in the struggle against viral infections and enhances our understanding of past evolutionary and epidemiological events. It can also help in identifying the origins of new epidemics, in monitoring the effectiveness of therapeutic strategies, and eventually in predicting the behavior of viral epidemics. Advances in bioinformatics have led to improved approaches to investigating viral outbreaks, which have been successfully applied to viruses including human immunodeficiency virus (HIV), Ebola virus, dengue and Zika viruses, chikungunya virus, and influenza virus.

The Virus Evolution and Molecular Epidemiology (VEME) workshop provides both theoretical and practical training in phylogenetic inference and evolutionary hypothesis testing as applied in virology and molecular epidemiology. This covers sequence analysis, phylogenetics, phylodynamics methods, and large-scale methods for next-generation sequencing (NGS) analytics. The VEME workshop is recognized as one of the best international virus bioinformatics courses.

I was invited to give a talk at the VEME 2021 workshop, which was hosted by the University of Florida. My talk was titled "ViralMSA: massively scalable reference-guided multiple sequence alignment of viral genomes".