<HashMap><database>biostudies-literature</database><scores/><additional><submitter>Heaton H</submitter><funding>British Heart Foundation</funding><funding>Medical Research Council</funding><funding>National Institute for Health Research (NIHR)</funding><funding>Wellcome Trust</funding><pagination>615-620</pagination><full_dataset_link>https://www.ebi.ac.uk/biostudies/studies/S-EPMC7617080</full_dataset_link><repository>biostudies-literature</repository><omics_type>Unknown</omics_type><volume>17(6)</volume><pubmed_abstract>Methods to deconvolve single-cell RNA-sequencing (scRNA-seq) data are necessary for samples containing a mixture of genotypes, whether they are natural or experimentally combined. Multiplexing across donors is a popular experimental design that can avoid batch effects, reduce costs and improve doublet detection. By using variants detected in scRNA-seq reads, it is possible to assign cells to their donor of origin and identify cross-genotype doublets that may have highly similar transcriptional profiles, precluding detection by transcriptional profile. More subtle cross-genotype variant contamination can be used to estimate the amount of ambient RNA. Ambient RNA is caused by cell lysis before droplet partitioning and is an important confounder of scRNA-seq analysis. Here we develop souporcell, a method to cluster cells using the genetic variants detected within the scRNA-seq reads. We show that it achieves high accuracy on genotype clustering, doublet detection and ambient RNA estimation, as demonstrated across a range of challenging scenarios.</pubmed_abstract><journal>Nature methods</journal><pubmed_title>Souporcell: robust clustering of single-cell RNA-seq data by genotype without reference genotypes.</pubmed_title><pmcid>PMC7617080</pmcid><funding_grant_id>206194</funding_grant_id><funding_grant_id>207492</funding_grant_id><funding_grant_id>RG/18/13/33946</funding_grant_id><funding_grant_id>HDR-9004</funding_grant_id><funding_grant_id>WT098051</funding_grant_id><funding_grant_id>G1100339</funding_grant_id><funding_grant_id>MR/L003120/1</funding_grant_id><funding_grant_id>WT098503</funding_grant_id><funding_grant_id>RG/13/13/30194</funding_grant_id><funding_grant_id>206194/Z/17/Z</funding_grant_id><funding_grant_id>WT207492</funding_grant_id><funding_grant_id>RG/13/13/30194; RG/18/13/33946</funding_grant_id><funding_grant_id>098051</funding_grant_id><pubmed_authors>Talman AM</pubmed_authors><pubmed_authors>Heaton H</pubmed_authors><pubmed_authors>Imaz M</pubmed_authors><pubmed_authors>Gaffney DJ</pubmed_authors><pubmed_authors>Lawniczak MKN</pubmed_authors><pubmed_authors>Knights A</pubmed_authors><pubmed_authors>Hemberg M</pubmed_authors><pubmed_authors>Durbin R</pubmed_authors></additional><is_claimable>false</is_claimable><name>Souporcell: robust clustering of single-cell RNA-seq data by genotype without reference genotypes.</name><description>Methods to deconvolve single-cell RNA-sequencing (scRNA-seq) data are necessary for samples containing a mixture of genotypes, whether they are natural or experimentally combined. Multiplexing across donors is a popular experimental design that can avoid batch effects, reduce costs and improve doublet detection. By using variants detected in scRNA-seq reads, it is possible to assign cells to their donor of origin and identify cross-genotype doublets that may have highly similar transcriptional profiles, precluding detection by transcriptional profile. More subtle cross-genotype variant contamination can be used to estimate the amount of ambient RNA. Ambient RNA is caused by cell lysis before droplet partitioning and is an important confounder of scRNA-seq analysis. Here we develop souporcell, a method to cluster cells using the genetic variants detected within the scRNA-seq reads. We show that it achieves high accuracy on genotype clustering, doublet detection and ambient RNA estimation, as demonstrated across a range of challenging scenarios.</description><dates><release>2020-01-01T00:00:00Z</release><publication>2020 Jun</publication><modification>2026-06-02T22:02:36.449Z</modification><creation>2025-04-06T22:40:48.731Z</creation></dates><accession>S-EPMC7617080</accession><cross_references><pubmed>32366989</pubmed><doi>10.1038/s41592-020-0820-1</doi></cross_references></HashMap>