Whole-Genome Sequencing-based characterization of Staphylococcus aureus in hospital settings within East Africa
Abstract
Background: Staphylococcus aureus, a Gram-positive facultative anaerobic bacterium, is a prominent pathogen responsible for a wide range of infections, from minor skin conditions to severe, life-threatening diseases. Particularly concerning is methicillin-resistant Staphylococcus aureus (MRSA), a major nosocomial pathogen exhibiting resistance to multiple antibiotics. In East Africa, S. aureus infections, especially MRSA, pose significant public health challenges, with high prevalence rates observed in hospital settings.
Objective: This study aimed to assess the population structure, antibiotic resistance, virulence genes, housekeeping genes, mobile genetic elements, prophage content, and pangenome of whole-genome sequenced Staphylococcus aureus obtained from hospital and community settings within East Africa.
Methods: This cross-sectional study assessed whole-genome sequencing (WGS) data generated from
research studies carried out within East Africa in hospital settings. The WGS data were downloaded from
the National Center for Biotechnology Information (NCBI) database using the Sequence Read Archive
(SRA) toolkit. A total of 496 WGS S. aureus samples were obtained across three countries: 41 from
Uganda (PRJEB40863), 95 and 185 from two studies in Kenya (PRJEB23611 and PRJEB15413), and
10 and 165 from two studies in Tanzania (PRJEB75012 and PRJEB71932). The downloaded sequences,
generated by Illumina paired-end sequencing, underwent quality control checks
using FastQC. Poor-quality bases and adapters were trimmed using Trimmomatic, and an overall quality
report was generated with MultiQC. Both de novo assembly and reference genome mapping were
performed using SPAdes to generate scaffolds and draft assemblies for downstream analysis, which
were polished using Pilon. The quality of assemblies was assessed with QUAST, and annotations were
performed using Prokka. Housekeeping genes were identified with the Multi-Locus Sequence Typing
(MLST) tool, while the ABRicate tool was used to determine virulence and antimicrobial resistance
genes. Comparative genome analysis was conducted using dRep to identify clusters based on genome
similarities, followed by pangenome analysis using the Roary pipeline. Mobile genetic elements
(MGEs) were identified using PlasmidFinder, PHASTER, MobileElementFinder, SCCmecFinder,
AlienHunter, and IslandViewer. Prophages were detected using PHASTER and integrated into the
analysis.
Results: A total of 94 antimicrobial resistance (AMR) genes were identified across the 496 genomes, linked to resistance against 21 different drugs, indicating a significant level of multidrug resistance among S. aureus isolates in East Africa. The presence of virulence genes such as the F subunit of Panton–Valentine leukocidin (lukF-PV), hyaluronate lyase (hysA), and icaD suggests a heightened pathogenic potential, with implications for severe clinical outcomes. MLST analysis revealed diverse sequence types among the isolates, highlighting the genetic variability within S. aureus populations. Additionally, the detection of prophages suggests potential horizontal gene transfer mechanisms, further complicating the resistance landscape. Comparative genomic analyses revealed distinct clusters of S. aureus isolates, indicating genetic diversity and potential transmission pathways across different settings. Pangenome analysis underscored the genomic flexibility of S. aureus, highlighting its capacity to thrive in diverse environments through a combination of shared essential genes and a large pool of variable, strain-specific genes.
Conclusion: The findings highlight the alarming presence of antimicrobial resistance and virulence factors in S. aureus populations in East Africa. The study underscores the urgent need for enhanced surveillance and infection control strategies to address the rising threat of MRSA and other S. aureus infections in the region, as well as the importance of monitoring mobile genetic elements and prophages that may facilitate the spread of resistance genes.
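As an illustration of the pangenome partition mentioned in the Results (shared genes versus variable, strain-specific genes), the following minimal Python sketch bins gene families by the share of genomes that carry them. The cut-offs (99%, 95%, 15%) follow the core, soft-core, shell, and cloud categories that Roary reports in its summary statistics; the abstract does not state which thresholds the study used, so they are an assumption here. The function names, gene names, and genome IDs are hypothetical and serve only as a toy input.

```python
from collections import Counter
from typing import Dict, Set


def classify_gene(n_present: int, n_genomes: int) -> str:
    """Assign a gene family to a pangenome category by its frequency.

    Assumed Roary-style bands: core >= 99%, soft core >= 95%,
    shell >= 15%, cloud < 15% of genomes. Integer arithmetic avoids
    floating-point surprises at the band boundaries.
    """
    if n_genomes <= 0 or not 0 <= n_present <= n_genomes:
        raise ValueError("counts must satisfy 0 <= n_present <= n_genomes, n_genomes > 0")
    if n_present * 100 >= 99 * n_genomes:
        return "core"
    if n_present * 100 >= 95 * n_genomes:
        return "soft_core"
    if n_present * 100 >= 15 * n_genomes:
        return "shell"
    return "cloud"


def summarize_pangenome(presence: Dict[str, Set[str]], n_genomes: int) -> Counter:
    """Count gene families per category from a gene -> {genome IDs} mapping."""
    return Counter(classify_gene(len(genomes), n_genomes) for genomes in presence.values())


# Toy input: 20 hypothetical genomes and four hypothetical gene families.
toy_genomes = [f"genome_{i}" for i in range(20)]
toy_presence = {
    "gene_shared": set(toy_genomes),         # present in 20/20 genomes
    "gene_common": set(toy_genomes[:19]),    # present in 19/20 genomes
    "gene_variable": set(toy_genomes[:8]),   # present in 8/20 genomes
    "gene_rare": set(toy_genomes[:1]),       # present in 1/20 genomes
}

print(summarize_pangenome(toy_presence, len(toy_genomes)))
```

Under these assumed bands, a gene in a 496-genome collection would need to be present in at least 492 genomes to count as core, and in fewer than 75 to count as cloud.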