An improved catalogue for whole-genome sequencing prediction of bedaquiline resistance in Mycobacterium tuberculosis using a reproducible algorithmic approach
Adlard D., Joseph L., Webster H., O'Reilly A., Knaggs J., Peto TEA., Crook DW., Omar SV., Fowler PW.
Bedaquiline (BDQ), approved for use just over a decade ago, is a key drug for treating multidrug-resistant tuberculosis; however, rising levels of resistance threaten its effectiveness. Catalogues of mutations associated with BDQ resistance are key to detecting resistance genetically, whether for diagnosis or surveillance. At present, building a catalogue requires considerable expert knowledge, often relies on complex grading rules and is not reproducible. We developed an automated method, catomatic, that associates genetic variants with resistance (or susceptibility) using a two-tailed binomial test with a stated background rate, and applied it to a dataset of 11,867 Mycobacterium tuberculosis samples with both whole-genome sequencing and BDQ susceptibility testing data. Using this framework, we investigated how best to classify variants and the phenotypic significance of minor alleles. We find that the genes mmpS5 and mmpL5 are not directly associated with BDQ resistance, and our catalogue of Rv0678, atpE and pepQ variants attains a cross-validated sensitivity and specificity of 79.4±1.8% and 98.5±0.3%, respectively, for 94±0.4% of samples. Identifying samples with subpopulations containing Rv0678 variants improves sensitivity, and detection thresholds in bioinformatic pipelines should therefore be lowered. By using a more permissive and deterministic algorithm trained on a sufficient number of resistant samples, we have reproducibly constructed a catalogue of BDQ resistance-associated variants that is comprehensive and accurate.
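To make the statistical idea concrete, the following is a minimal sketch of how a two-tailed binomial test against a background rate could be used to label a single variant. It is not the catomatic implementation: the function names, the labels "R", "S" and "U", the background rate of 0.1 and the significance level of 0.05 are all assumptions chosen for illustration, and the two-tailed p-value uses the common convention of summing the probabilities of all outcomes no more likely than the one observed.

```python
from math import comb


def binom_pmf(k: int, n: int, p: float) -> float:
    """Probability of exactly k successes in n trials with success rate p."""
    return comb(n, k) * p**k * (1 - p) ** (n - k)


def two_tailed_binomial_p(k: int, n: int, p0: float) -> float:
    """Exact two-tailed p-value for observing k successes in n trials under
    the null rate p0: the summed probability of every outcome that is no
    more likely than the observed one."""
    observed = binom_pmf(k, n, p0)
    cutoff = observed * (1 + 1e-7)  # small tolerance for floating-point ties
    total = 0.0
    for i in range(n + 1):
        prob = binom_pmf(i, n, p0)
        if prob <= cutoff:
            total += prob
    return min(1.0, total)


def classify_variant(
    n_resistant: int,
    n_total: int,
    background_rate: float = 0.1,  # assumed value, for illustration only
    alpha: float = 0.05,           # assumed significance level
) -> str:
    """Label a variant from the phenotypes of the samples that carry it.

    Returns "R" if the resistant proportion is significantly above the
    background rate, "S" if significantly below it, and "U" (undetermined)
    otherwise. These labels are hypothetical.
    """
    if n_total == 0:
        return "U"
    p_value = two_tailed_binomial_p(n_resistant, n_total, background_rate)
    if p_value >= alpha:
        return "U"
    return "R" if n_resistant / n_total > background_rate else "S"


if __name__ == "__main__":
    # Hypothetical counts: (resistant samples, total samples) per variant.
    for counts in [(9, 10), (0, 100), (1, 5)]:
        print(counts, classify_variant(*counts))
```

With these assumed settings, a variant seen in ten samples of which nine are resistant is labelled "R", one seen in 100 samples with no resistant cases is labelled "S", and one seen in five samples with a single resistant case remains "U" because the evidence is too weak to reject the background rate in either direction.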