<HashMap><database>biostudies-arrayexpress</database><scores/><additional><submitter>Stephen Sansom</submitter><organism>Homo sapiens</organism><full_dataset_link>https://www.ebi.ac.uk/biostudies/studies/E-MTAB-14722</full_dataset_link><description>A single cell atlas of developing human embryonic proximal and distal interphalangeal finger joints. This deposition holds the raw 10x single-cell sequencing data in FASTQ format and processed data in anndata format. All of the anndata objects contain normalised counts in the .X slot and gene names in the anndata.var. They hold the cluster assignments for the related figure(s) and are ready for use with cellxgene. The six anndata objects are described below:  (1) anndata_allQCed_cells_SF2.h5ad. This is the primary object which additionally holds the raw counts for all cells in the .layers[‘counts’] slot and the Ensembl gene IDs in the .var table. It is recommended to use this object for data analysis. The object contains the data shown in Extended Data Fig. S2a. Integration with scVI was performed with default parameters with regression of  % mito  and cell cycle phases (both S and G2M). The following parameters were used for the analysis: nearest neighbour graph (NNG): nSCVI_comp=30, k=15; UMAP: min.dist 0.1; Clustering: Leiden, resolution 2. No cell types were removed.  (2) anndata_Fig1.h5ad. This object contains cells understood to form the joint tissue as shown in Fig. 1. Integration with scVI was performed with regression of % mito content and cell cycle phases (both S and G2M). The following parameters were used for the analysis: NNG: nSCVI_comp=30, k=20; UMAP:min.dist 0.5; Clustering: Leiden, resolution 2. The following cell types were removed: keratinocytes, myocyte-related cells, erythrocytes, cluster with high mito content, NRK+IRF+ cluster.  (3) anndata_stromal1_SF3.h5ad. The object contains all the stromal cells as shown in Extended Data Fig. 3a (stromal cells 1). 
Integration with scVI was performed with regression of % mito content and cell cycle phases (both S and G2M) . The following parameters were used for the analysis: NNG: nSCVI_comp=30, k=15; UMAP:min.dist 0; Clustering: Leiden, resolution 1.2. The following cell types were removed: myeloid cells, endothelial cells, pericytes, lymphatic endothelial cells, glial cells.  (4) anndata_stromal2_Fig2_SF4.h5ad. The object contains the stromal cells as shown in Extended Data Fig. 4a, Fig. 2a (stromal cells 2). Integration with scVI was performed with regression of cell cycle gene expression . The following parameters were used for the analysis: NNG: k=30, UMAP: min dist 0.1, Leiden clustering resolution 1 (with further manual curation). The following cell types were removed from the 'stromal 1' subset: Skin precursors (TWIST2+), tenocytes, MSX1+ cells, low quality cells.  (5) anndata_STFs_Fig3a.h5ad. The object contains the soft-tissue fibroblasts as shown in Fig.3a. Integration with scVI was performed with regression of cell cycle phases (both S and G2M). The following parameters were used for the analysis: NNG: k=20, Leiden clustering resolution 0.5. The following cell types were removed from the 'stromal 2' subset: Cartilage and CZSC.   (6) anndata_CZSC_chondro_Fig3b.h5ad. The object contains the cartilage cells shown in Fig. 3b. Integration with scVI was performed with regression of cell cycle phases (both S and G2M). The following parameters were used for the analysis: NNG: k=20, Leiden clustering resolution 0.5 . The following cell types were removed from the 'stromal 2' subset: soft-tissue fibroblasts.  
The following columns are present in the .obs of the anndata objects: "sample_id": sample identifier; "alias": the sample name; "subject_id": Anonymized subject identifier; "sex": Biological sex; "post_conception_week": Post conception week that sample was derived from; "joint_development_stage": ‘early’ corresponds to 8-9 post conception weeks and ‘late’ corresponds to 12-14 weeks post conception; "joint": DIP = distal interphalangeal joint, PIP = proximal interphalangeal joint; "group": Categories: ‘PIP_early’, ‘DIP_early’, ‘PIP_late’, ‘DIP_late’; "cluster_id" and "cluster": Cluster assignments for all single cells in the object. "cluster_id" corresponds to the numeric cluster (details in Table above), while "cluster" is a given name used for the respective Figure; "pct_mitochondrial": percentage mitochondrial reads; "S_score": Cell cycle S phase score; "G2M_score": cell cycle G2M phase score.</description><repository>biostudies-arrayexpress</repository><sample_protocol>Sequencing - Libraries were sequenced on an Illumina NovaSeq6000 sequencer.</sample_protocol><sample_protocol>Library Construction - As per 10x Genomics Single Cell 3' Reagent Kits v3.1 manufacturer user guide (CG000330 Rev B)</sample_protocol><sample_protocol>Nucleic Acid Extraction - As per 10x Genomics Single Cell 3' Reagent Kits v3.1 manufacturer user guide (CG000330 Rev B)</sample_protocol><sample_protocol>Sample Collection - Skin and tendon material were removed from each finger, and DIP and PIP joints isolated by cutting either side of the joint space. Joints were cut into pieces and enzymatically digested with 1mg/ml collagenase D in DMEM, at 37 degrees C for 1h 30mins. Supernatant containing released cells was collected every 10-15 mins, neutralized in DMEM 10% FBS, and stored at 4 degrees C. Once supernatant was removed, fresh enzyme was applied to continue sample digestion. 
After digestion, the remaining tissue was manually dissociated, and combined with collected supernatants, in DMEM 10% FBS. All isolated cells were then passed through a 70μm filter. Where possible, cells were sorted for viability using the 7-AAD dye (Biolegend 420403) and loaded onto the 10X Genomics Chromium Controller (Chip G) following the manufacturer’s protocol to capture 10,000 cells per reaction.</sample_protocol><sample_protocol>Sample Collection - Skin and tendon material were removed from each finger, and DIP and PIP joints isolated by cutting either side of the joint space. Joints were cut into pieces and enzymatically digested with 0.1mg/ml of liberaseTL in DMEM, at 37 degrees C for 1h 30mins. Supernatant containing released cells was collected every 10-15 mins, neutralized in DMEM 10% FBS, and stored at 4 degrees C. Once supernatant was removed, fresh enzyme was applied to continue sample digestion. After digestion, the remaining tissue was manually dissociated, and combined with collected supernatants, in DMEM 10% FBS. All isolated cells were then passed through a 70μm filter. Where possible, cells were sorted for viability using the 7-AAD dye (Biolegend 420403) and loaded onto the 10X Genomics Chromium Controller (Chip G) following the manufacturer’s protocol to capture 10,000 cells per reaction.</sample_protocol><figure_sub>Organization</figure_sub><figure_sub>MINSEQE Score</figure_sub><figure_sub>Assays and Data</figure_sub><figure_sub>Processed Data</figure_sub><figure_sub>MAGE-TAB Files</figure_sub><data_protocol>Data Transformation - scRNAseq data were aligned and quantified using the 10x Genomics Cell Ranger pipeline (version 4.0.0, human cellranger reference GRCh38 and Ensembl 98, script ‘pipeline_cellranger.py’ in https://github.com/sansomlab/tenx/). The quality filtering and initial data processing was performed using the script ‘pipeline_scxl.py’ (https://github.com/sansomlab/tenx/ ). 
Cells with more than 10% mitochondrial reads or fewer than 500 genes were excluded from further analysis. The difference between G2M and S phase was estimated using the expression of known cell cycle marker genes (CellCycleScoring, Seurat package). Doublets were removed using the mean(scrublet score) + 3SD (filter value: 0.26, scrublet v0.2.3). For integration, batch correction and dimensionality reduction, we used scVI with correction for mitochondrial content and cell cycle phase (3000 highly variable genes, 30 scVI latent variables, scVI v0.12.0). The neighbour graph was created from the scVI components using hnsw (as implemented in scVelo v0.2.5). To analyse cell types in more detail, various stages of removing populations, re-embedding and re-clustering were performed (see the description of the anndata files in the General Information).</data_protocol><omics_type>Metabolomics</omics_type><omics_type>Unknown</omics_type><omics_type>Transcriptomics</omics_type><omics_type>Genomics</omics_type><omics_type>Proteomics</omics_type><instrument_platform>Illumina NovaSeq 6000</instrument_platform><study_type>RNA-seq of coding RNA from single cells</study_type><species>Homo sapiens</species><pubmed_authors>Stephen Sansom</pubmed_authors></additional><is_claimable>false</is_claimable><name>Single-cell RNA-sequencing data from developing embryonic human finger joints</name><description>A single cell atlas of developing human embryonic proximal and distal interphalangeal finger joints. This deposition holds the raw 10x single-cell sequencing data in FASTQ format and processed data in anndata format. All of the anndata objects contain normalised counts in the .X slot and gene names in the anndata.var. They hold the cluster assignments for the related figure(s) and are ready for use with cellxgene. The six anndata objects are described below:  (1) anndata_allQCed_cells_SF2.h5ad. 
This is the primary object which additionally holds the raw counts for all cells in the .layers[‘counts’] slot and the Ensembl gene IDs in the .var table. It is recommended to use this object for data analysis. The object contains the data shown in Extended Data Fig. S2a. Integration with scVI was performed with default parameters with regression of  % mito  and cell cycle phases (both S and G2M). The following parameters were used for the analysis: nearest neighbour graph (NNG): nSCVI_comp=30, k=15; UMAP: min.dist 0.1; Clustering: Leiden, resolution 2. No cell types were removed.  (2) anndata_Fig1.h5ad. This object contains cells understood to form the joint tissue as shown in Fig. 1. Integration with scVI was performed with regression of % mito content and cell cycle phases (both S and G2M). The following parameters were used for the analysis: NNG: nSCVI_comp=30, k=20; UMAP:min.dist 0.5; Clustering: Leiden, resolution 2. The following cell types were removed: keratinocytes, myocyte-related cells, erythrocytes, cluster with high mito content, NRK+IRF+ cluster.  (3) anndata_stromal1_SF3.h5ad. The object contains all the stromal cells as shown in Extended Data Fig. 3a (stromal cells 1). Integration with scVI was performed with regression of % mito content and cell cycle phases (both S and G2M) . The following parameters were used for the analysis: NNG: nSCVI_comp=30, k=15; UMAP:min.dist 0; Clustering: Leiden, resolution 1.2. The following cell types were removed: myeloid cells, endothelial cells, pericytes, lymphatic endothelial cells, glial cells.  (4) anndata_stromal2_Fig2_SF4.h5ad. The object contains the stromal cells as shown in Extended Data Fig. 4a, Fig. 2a (stromal cells 2). Integration with scVI was performed with regression of cell cycle gene expression . The following parameters were used for the analysis: NNG: k=30, UMAP: min dist 0.1, Leiden clustering resolution 1 (with further manual curation). 
The following cell types were removed from the 'stromal 1' subset: Skin precursors (TWIST2+), tenocytes, MSX1+ cells, low quality cells.  (5) anndata_STFs_Fig3a.h5ad. The object contains the soft-tissue fibroblasts as shown in Fig.3a. Integration with scVI was performed with regression of cell cycle phases (both S and G2M). The following parameters were used for the analysis: NNG: k=20, Leiden clustering resolution 0.5. The following cell types were removed from the 'stromal 2' subset: Cartilage and CZSC.   (6) anndata_CZSC_chondro_Fig3b.h5ad. The object contains the cartilage cells shown in Fig. 3b. Integration with scVI was performed with regression of cell cycle phases (both S and G2M). The following parameters were used for the analysis: NNG: k=20, Leiden clustering resolution 0.5 . The following cell types were removed from the 'stromal 2' subset: soft-tissue fibroblasts.  The following columns are present in the .obs of the anndata objects: "sample_id": sample identifier; "alias": the sample name; "subject_id": Anonymized subject identifier; "sex": Biological sex; "post_conception_week": Post conception week that sample was derived from; "joint_development_stage": ‘early’ corresponds to 8-9 post conception weeks and ‘late’ corresponds to 12-14 weeks post conception; "joint": DIP = distal interphalangeal joint, PIP = proximal interphalangeal joint; "group": Categories: ‘PIP_early’, ‘DIP_early’, ‘PIP_late’, ‘DIP_late’; "cluster_id" and "cluster": Cluster assignments for all single cells in the object. 
"cluster_id" corresponds to the numeric cluster (details in Table above), while "cluster" is a given name used for the respective Figure; "pct_mitochondrial": percentage mitochondrial reads; "S_score": Cell cycle S phase score; "G2M_score": cell cycle G2M phase score.</description><dates><release>2026-04-24T00:00:00Z</release><modification>2026-04-24T10:12:06.014Z</modification><creation>2024-12-23T17:34:53.994Z</creation></dates><accession>E-MTAB-14722</accession><cross_references><ENA>ERP167400</ENA><EFO>EFO_0002944</EFO><EFO>EFO_0004170</EFO><EFO>EFO_0005684</EFO><EFO>EFO_0005518</EFO><EFO>EFO_0003816</EFO><EFO>EFO_0004184</EFO></cross_references></HashMap>