A comparison of clustering models for inference of T cell receptor antigen specificity.

Source: biostudies-literature
Authors: Basham M, Hudson D, Lubbock A, Koohy H
Journal: Immunoinformatics (Amsterdam, Netherlands), 13, 2024 Mar
Funders: Rosalind Franklin Institute; National Institute for Health and Care Research; Wellcome Trust; Biotechnology and Biological Sciences Research Council; MRC
Grants: 203141_Z_16_Z; MC_UU_12010/3; 203141; BB_T008784_1
Identifiers: S-EPMC10955519; PMC10955519; 38525047; doi:10.1016/j.immuno.2024.100033
URL: https://www.ebi.ac.uk/biostudies/studies/S-EPMC10955519
Record dates: 2024-01-01T00:00:00Z; 2024-11-14T21:30:02.905Z

Abstract

The vast potential sequence diversity of TCRs and their ligands has presented an historic barrier to computational prediction of TCR epitope specificity, a holy grail of quantitative immunology. One common approach is to cluster sequences together, on the assumption that similar receptors bind similar epitopes. Here, we provide the first independent evaluation of widely used clustering algorithms for TCR specificity inference, observing some variability in predictive performance between models, and marked differences in scalability. Despite these differences, we find that different algorithms produce clusters with high degrees of similarity for receptors recognising the same epitope. Our analysis strengthens the case for use of clustering models to identify signals of common specificity from large repertoires, whilst highlighting scope for improvement of complex models over simple comparators.