Machine learning approach for distinguishing malignant and benign lung nodules utilizing standardized perinodular parenchymal features from CT.
ABSTRACT: PURPOSE:Computed tomography (CT) is an effective method for detecting and characterizing lung nodules in vivo. With the growing use of chest CT, the detection frequency of lung nodules is increasing. Noninvasive methods to distinguish malignant from benign nodules have the potential to decrease the clinical burden, risk, and cost involved in follow-up procedures on the large number of false-positive lesions detected. This study examined the benefit of including perinodular parenchymal features in machine learning (ML) tools for pulmonary nodule assessment. METHODS:Lung nodule cases with pathology-confirmed diagnosis (74 malignant, 289 benign) were used to extract quantitative imaging characteristics from computed tomography scans of the nodule and perinodular parenchyma tissue. An ML tool development pipeline was employed using k-medoids clustering and information theory to determine efficient predictor sets for different amounts of parenchyma inclusion and build an artificial neural network classifier. The resulting ML tool was validated using an independent cohort (50 malignant, 50 benign). RESULTS:The inclusion of parenchymal imaging features improved the performance of the ML tool over exclusively nodular features (P < 0.01). The best performing ML tool included features derived from nodule diameter-based surrounding parenchyma tissue quartile bands. We demonstrate similar high-performance values on the independent validation cohort (AUC-ROC = 0.965). A comparison using the independent validation cohort with the Fleischner pulmonary nodule follow-up guidelines demonstrated a theoretical reduction in recommended follow-up imaging and procedures. CONCLUSIONS:Radiomic features extracted from the parenchyma surrounding lung nodules contain valid signals with spatial relevance for the task of lung cancer risk classification. 
Through standardization of feature extraction regions from the parenchyma, ML tool validation performance of 100% sensitivity and 96% specificity was achieved.
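The validation figures reported above reduce to simple confusion-matrix arithmetic. The following is a minimal sketch, assuming the 100% sensitivity and 96% specificity refer to the independent cohort of 50 malignant and 50 benign nodules; the individual counts (50 true positives, 0 false negatives, 48 true negatives, 2 false positives) are inferred from those percentages and are not stated in the abstract.

```python
def sensitivity(tp, fn):
    """Fraction of malignant cases correctly flagged as malignant."""
    return tp / (tp + fn)


def specificity(tn, fp):
    """Fraction of benign cases correctly flagged as benign."""
    return tn / (tn + fp)


# Counts inferred from the reported percentages (an assumption, see above).
tp, fn = 50, 0   # 50 malignant validation cases
tn, fp = 48, 2   # 50 benign validation cases

print(f"sensitivity = {sensitivity(tp, fn):.2%}")
print(f"specificity = {specificity(tn, fp):.2%}")
```

Under these assumed counts, two benign nodules would be misclassified as malignant and no malignant nodule would be missed.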
Project description:Radiomics, which extracts large numbers of quantitative image features from diagnostic medical images, has been widely used for prognostication, treatment response prediction, and cancer detection. The treatment options for lung nodules depend on their diagnosis, benign or malignant. Conventionally, lung nodule diagnosis is based on invasive biopsy. Recently, radiomics features, obtained noninvasively from clinical images, have shown high potential in lesion classification and treatment outcome prediction. Lung nodule classification using radiomics based on computed tomography (CT) image data was investigated, and a 4-feature signature was introduced for lung nodule classification. Retrospectively, 72 patients with 75 pulmonary nodules were collected. Radiomics feature extraction was performed on non-enhanced CT images with contours delineated by an experienced radiation oncologist. Among the 750 image features in each case, 76 features were found to differ significantly between benign and malignant lesions. A radiomics signature was composed of the best 4 features: Laws_LSL_min, Laws_SLL_energy, Laws_SSL_skewness and Laws_EEL_uniformity. The accuracy of the signature in benign versus malignant classification was 84%, with a sensitivity of 92.85% and a specificity of 72.73%. The classification signature based on radiomics features demonstrated very good accuracy and high potential for clinical application.
Project description:We test the hypothesis that a model including clinical and computed tomography (CT) features may allow discrimination between benign and malignant lung nodules in patients with soft-tissue sarcoma (STS). Seventy-one patients with STS undergoing their first lung metastasectomy were examined. The performance of multiple logistic regression models including CT features alone, clinical features alone, and combined features, was tested to evaluate the best model in discriminating malignant from benign nodules. The likelihood of malignancy increased by more than 11, 2, 6 and 7 fold, respectively, when histological synovial sarcoma sub-type was associated with the following CT nodule features: size ≥ 5.6 mm, well-defined margins, increased size from baseline CT, and new onset at preoperative CT. Likewise, in the case of grade III primary tumor, the odds ratio (OR) increased by more than 17 times when the diameter of pulmonary nodules (PNs) was >5.6 mm, more than 13 times with well-defined margins, more than 7 times with PNs increased from baseline CT, and more than 20 times when there were new-onset nodules. Finally, when the CT nodule was ≥5.6 mm in size, had well-defined margins, had increased in size from baseline CT, or was of new onset at preoperative CT, and this was concomitant with residual primary tumor R2, the risk of malignancy increased by more than 10, 6, 25 and 28 times, respectively. The combination of clinical and CT features has the highest predictive value for detecting the malignancy of pulmonary nodules in patients with soft tissue sarcoma, allowing early detection of nodule malignancy and treatment options.
Project description:BACKGROUND:Lung cancer is the most commonly diagnosed cancer worldwide. Its survival rate can be significantly improved by early screening. Biomarkers based on radiomics features have been found to provide important physiological information on tumors and considered as having the potential to be used in the early screening of lung cancer. In this study, we aim to establish a radiomics model and develop a tool to improve the discrimination between benign and malignant pulmonary nodules. METHODS:A retrospective study was conducted on 875 patients with benign or malignant pulmonary nodules who underwent computed tomography (CT) examinations between June 2013 and June 2018. We assigned 612 patients to a training cohort and 263 patients to a validation cohort. Radiomics features were extracted from the CT images of each patient. Least absolute shrinkage and selection operator (LASSO) was used for radiomics feature selection and radiomics score calculation. Multivariate logistic regression analysis was used to develop a classification model and radiomics nomogram. Radiomics score and clinical variables were used to distinguish benign and malignant pulmonary nodules in logistic model. The performance of the radiomics nomogram was evaluated by the area under the curve (AUC), calibration curve and Hosmer-Lemeshow test in both the training and validation cohorts. RESULTS:A radiomics score was built and consisted of 20 features selected by LASSO from 1288 radiomics features in the training cohort. The multivariate logistic model and radiomics nomogram were constructed using the radiomics score and patients' age. Good discrimination of benign and malignant pulmonary nodules was obtained from the training cohort (AUC, 0.836; 95% confidence interval [CI]: 0.793-0.879) and validation cohort (AUC, 0.809; 95% CI: 0.745-0.872). 
The Hosmer-Lemeshow test also showed good performance for the logistic regression model in the training cohort (P = 0.765) and validation cohort (P = 0.064). Good alignment with the calibration curve indicated the good performance of the nomogram. CONCLUSIONS:The established radiomics nomogram is a noninvasive preoperative prediction tool for malignant pulmonary nodule diagnosis. Validation revealed that this nomogram exhibited excellent discrimination and calibration capacities, suggesting its clinical utility in the early screening of lung cancer.
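The AUC values quoted above can be read as the probability that a randomly chosen malignant nodule receives a higher model score than a randomly chosen benign one. The sketch below computes that quantity directly by pairwise comparison; the function name `auc` and the example scores are hypothetical and are not taken from the study.

```python
def auc(scores_malignant, scores_benign):
    """Area under the ROC curve via pairwise comparison.

    Counts the fraction of (malignant, benign) pairs in which the
    malignant case has the higher score; ties count as half a win.
    """
    wins = 0.0
    for m in scores_malignant:
        for b in scores_benign:
            if m > b:
                wins += 1.0
            elif m == b:
                wins += 0.5
    return wins / (len(scores_malignant) * len(scores_benign))


# Hypothetical predicted probabilities of malignancy (illustration only).
malignant = [0.9, 0.8, 0.4]
benign = [0.7, 0.3, 0.2]

print(f"AUC = {auc(malignant, benign):.3f}")
```

An AUC of 0.5 corresponds to chance-level ranking and 1.0 to perfect separation, which is why values around 0.8 are described as good discrimination.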
Project description:Each year, millions of pulmonary nodules are discovered by computed tomography and subsequently biopsied. Because most of these nodules are benign, many patients undergo unnecessary and costly invasive procedures. We present a 13-protein blood-based classifier that differentiates malignant and benign nodules with high confidence, thereby providing a diagnostic tool to avoid invasive biopsy on benign nodules. Using a systems biology strategy, we identified 371 protein candidates and developed a multiple reaction monitoring (MRM) assay for each. The MRM assays were applied in a three-site discovery study (n = 143) on plasma samples from patients with benign and stage IA lung cancer matched for nodule size, age, gender, and clinical site, producing a 13-protein classifier. The classifier was validated on an independent set of plasma samples (n = 104), exhibiting a negative predictive value (NPV) of 90%. Validation performance on samples from a nondiscovery clinical site showed an NPV of 94%, indicating the general effectiveness of the classifier. A pathway analysis demonstrated that the classifier proteins are likely modulated by a few transcription regulators (NF2L2, AHR, MYC, and FOS) that are associated with lung cancer, lung inflammation, and oxidative stress networks. The classifier score was independent of patient nodule size, smoking history, and age, which are risk factors used for clinical management of pulmonary nodules. Thus, this molecular test provides a potential complementary tool to help physicians in lung cancer diagnosis.
Project description:We hypothesized that distinct protein expression features of benign and malignant pulmonary nodules may reveal novel candidate biomarkers for the early detection of lung cancer. We performed proteome profiling by liquid chromatography-tandem mass spectrometry to characterize 34 resected benign lung nodules, 24 untreated lung adenocarcinomas (ADCs), and biopsies of bronchial epithelium. Group comparisons identified 65 proteins that differentiate nodules from ADCs and normal bronchial epithelium and 66 proteins that differentiate ADCs from nodules and normal bronchial epithelium. We developed a multiplexed parallel reaction monitoring (PRM) assay to quantify a subset of 43 of these candidate biomarkers in an independent cohort of 20 benign nodules, 21 ADCs, and 20 normal bronchial biopsies. PRM analyses confirmed significant nodule-specific abundance of 10 proteins including ALOX5, ALOX5AP, CCL19, CILP1, COL5A2, ITGB2, ITGAX, PTPRE, S100A12, and SLC2A3 and significant ADC-specific abundance of CEACAM6, CRABP2, LAD1, PLOD2, and TMEM110-MUSTN1. Immunohistochemistry analyses for seven selected proteins performed on an independent set of tissue microarrays confirmed nodule-specific expression of ALOX5, ALOX5AP, ITGAX, and SLC2A3 and cancer-specific expression of CEACAM6. These studies illustrate the value of global and targeted proteomics in a systematic process to identify and qualify candidate biomarkers for noninvasive molecular diagnosis of lung cancer.
Project description:Low-dose CT (LDCT) is widely accepted as the preferred method for detecting pulmonary nodules. However, the determination of whether a nodule is benign or malignant involves either repeated scans or invasive procedures that sample the lung tissue. Noninvasive methods to assess these nodules are needed to reduce unnecessary invasive tests. In this study, we have developed a pulmonary nodule classifier (PNC) using RNA from whole blood collected in RNA-stabilizing PAXgene tubes that addresses this need. Samples were prospectively collected from high-risk and incidental subjects with a positive lung CT scan. A total of 821 samples from 5 clinical sites were analyzed. Malignant samples were predominantly stage 1 by pathologic diagnosis and 97% of the benign samples were confirmed by 4 years of follow-up. A panel of diagnostic biomarkers was selected from a subset of the samples assayed on Illumina microarrays that achieved a ROC-AUC of 0.847 on independent validation. The microarray data were then used to design a biomarker panel of 559 gene probes to be validated on the clinically tested NanoString nCounter platform. RNA from 583 patients was used to assess and refine the NanoString PNC (nPNC), which was then validated on 158 independent samples (ROC-AUC = 0.825). The nPNC outperformed three clinical algorithms in discriminating malignant from benign pulmonary nodules ranging from 6-20 mm using just 41 diagnostic biomarkers. Overall, this platform provides an accurate, noninvasive method for the diagnosis of pulmonary nodules in patients with non-small cell lung cancer. SIGNIFICANCE: These findings describe a minimally invasive and clinically practical pulmonary nodule classifier that has good diagnostic ability at distinguishing benign from malignant pulmonary nodules.
Project description:OBJECTIVE:Lung cancer usually presents as a solitary pulmonary nodule (SPN) on diagnostic imaging during the early stages of the disease. Since the early diagnosis of lung cancer is very important for treatment, the accurate diagnosis of SPNs has much importance. The aim of this study was to evaluate the discriminant power of dual time point imaging (DTPI) PET/CT in the differentiation of malignant and benign FDG-avid solitary pulmonary nodules by using neighborhood gray-tone difference matrix (NGTDM) texture features. METHODS:Retrospective analysis was carried out on 116 patients with SPNs (35 benign and 81 malignant) who had DTPI 18F-FDG PET/CT between January 2005 and May 2015. Both PET and CT images were acquired at 1 h and 3 h after injection. The SUVmax and NGTDM texture features (coarseness, contrast, and busyness) of each nodule were calculated on dual time point images. Patients were randomly divided into training and validation datasets. Receiver operating characteristic (ROC) curve analysis was performed on all texture features in the training dataset to calculate the optimal threshold for differentiating malignant SPNs from benign SPNs. For all the lesions in the testing dataset, two visual interpretation scores were determined by two nuclear medicine physicians based on the PET/CT images with and without reference to the texture features. RESULTS:In the training dataset, the AUCs of delayed busyness, delayed coarseness, early busyness, and early SUVmax were 0.87, 0.85, 0.75 and 0.75, respectively. In the validation dataset, the AUCs of visual interpretations with and without texture features were 0.89 and 0.80, respectively. CONCLUSION:Compared to SUVmax or visual interpretation, NGTDM texture features derived from DTPI PET/CT images can be used as good predictors of SPN malignancy. 
Improvement in discriminating benign from malignant nodules using SUVmax and visual interpretation can be achieved by adding busyness extracted from delayed PET/CT images.
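Choosing an optimal threshold from ROC analysis, as described in the methods above, can be sketched as follows. The abstract does not say which optimality criterion was used; this example assumes Youden's J statistic (sensitivity + specificity - 1) and assumes that higher feature values indicate malignancy. The function name and the feature values are hypothetical.

```python
def youden_threshold(values, labels):
    """Return (threshold, J) maximizing Youden's J = sens + spec - 1.

    labels: 1 = malignant, 0 = benign.
    A case is called malignant when its value is >= threshold.
    Youden's J is an assumed criterion, not one named in the abstract.
    """
    n_pos = sum(labels)
    n_neg = len(labels) - n_pos
    best_t, best_j = None, -1.0
    for t in sorted(set(values)):
        tp = sum(1 for v, y in zip(values, labels) if v >= t and y == 1)
        tn = sum(1 for v, y in zip(values, labels) if v < t and y == 0)
        j = tp / n_pos + tn / n_neg - 1.0
        if j > best_j:  # keep the lowest threshold on ties
            best_t, best_j = t, j
    return best_t, best_j


# Hypothetical texture-feature values for seven nodules (illustration only).
values = [0.1, 0.2, 0.3, 0.35, 0.4, 0.6, 0.8]
labels = [0, 0, 0, 1, 0, 1, 1]

threshold, j = youden_threshold(values, labels)
print(f"threshold = {threshold}, Youden's J = {j:.2f}")
```

A threshold chosen this way on the training data would then be applied unchanged to the held-out cases.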
Project description:Rationale: Screening for non-small cell lung cancer is associated with earlier diagnosis and reduced mortality but also increased harm caused by invasive follow-up of benign pulmonary nodules. Lung tumorigenesis activates the immune system, components of which could serve as tumor-specific biomarkers. Objectives: To profile tumor-derived autoantibodies as peripheral biomarkers of malignant pulmonary nodules. Methods: High-density protein arrays were used to define the specificity of autoantibodies isolated from B cells of 10 resected lung tumors. These tumor-derived autoantibodies were also examined as free or complexed to antigen in the plasma of the same 10 patients and matched benign nodule control subjects. Promising autoantibodies were further analyzed in an independent cohort of 250 nodule-positive patients. Measurements and Main Results: Thirteen tumor B-cell-derived autoantibodies isolated ex vivo showed greater than or equal to 50% sensitivity and greater than or equal to 70% specificity for lung cancer. In plasma, 11 of 13 autoantibodies were present both complexed to and free from antigen. In the larger validation cohort, 5 of 13 tumor-derived autoantibodies remained significantly elevated in cancers. A combination of four of these autoantibodies could detect malignant nodules with an area under the curve of 0.74 and had an area under the curve of 0.78 in a subcohort of indeterminate (8-20 mm in the longest diameter) pulmonary nodules. Conclusions: Our novel pipeline identifies tumor-derived autoantibodies that could effectively serve as blood biomarkers for malignant pulmonary nodule diagnosis. This approach has future implications for both a cost-effective and noninvasive approach to determine nodule malignancy for widespread low-dose computed tomography screening.
Project description:The aim of this study was to compare the performance of 2- (2D) and 3-dimensional (3D) quantitative computed tomography (CT) methods for classifying lung nodules as lung cancer, metastases, or benign. Using semiautomated software and computerized analysis, we analyzed more than 50 quantitative CT features of 96 solid nodules in 94 patients, in 2D from a single slice and in 3D from the entire nodule volume. Multivariable logistic regression was used to classify nodule types. Model performance was assessed by the area under the receiver operating characteristic curve (AUC) using leave-one-out cross-validation. The AUC for distinguishing 53 primary lung cancers from 18 benign nodules and 25 metastases ranged from 0.79 to 0.83 and was not significantly different for 2D and 3D analyses (P = 0.29-0.78). Models distinguishing metastases from benign nodules were statistically significant only by 3D analysis (AUC = 0.84). Three-dimensional CT methods did not improve discrimination of lung cancer, but may help distinguish benign nodules from metastases.
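Leave-one-out cross-validation, as used above to assess AUC, scores each nodule with a model fitted on all the other nodules. The sketch below illustrates the procedure only: in place of the study's multivariable logistic regression it uses a deliberately simple stand-in model (signed distance from the midpoint of the two class means on a single feature), and all names and data are hypothetical.

```python
def loo_scores(x, y):
    """Leave-one-out scores for a single feature.

    For each case i, a stand-in model is fitted on every other case:
    the score is the signed distance of x[i] from the midpoint of the
    class means. This is NOT logistic regression; it is a simple
    substitute used to illustrate the cross-validation loop.
    """
    scores = []
    for i in range(len(x)):
        train = [(xj, yj) for j, (xj, yj) in enumerate(zip(x, y)) if j != i]
        pos = [xj for xj, yj in train if yj == 1]
        neg = [xj for xj, yj in train if yj == 0]
        mean_pos = sum(pos) / len(pos)
        mean_neg = sum(neg) / len(neg)
        midpoint = (mean_pos + mean_neg) / 2.0
        direction = 1.0 if mean_pos >= mean_neg else -1.0
        scores.append(direction * (x[i] - midpoint))
    return scores


def auc(scores, labels):
    """AUC from pooled held-out scores by pairwise comparison."""
    pos = [s for s, y in zip(scores, labels) if y == 1]
    neg = [s for s, y in zip(scores, labels) if y == 0]
    wins = sum(1.0 if p > n else 0.5 if p == n else 0.0
               for p in pos for n in neg)
    return wins / (len(pos) * len(neg))


# Hypothetical single CT feature for six nodules (illustration only).
x = [1.0, 2.0, 3.0, 4.0, 5.0, 6.0]
y = [0, 0, 1, 0, 1, 1]

held_out = loo_scores(x, y)
print(f"leave-one-out AUC = {auc(held_out, y):.3f}")
```

Because every score comes from a model that never saw the case being scored, the pooled AUC is less optimistic than one computed on the training data, which matters for small cohorts such as the 96 nodules analyzed here.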