Project description:Mini-Abstract: ChatGPT is an artificial intelligence (AI) technology that has begun to transform academia through its ability to generate human-like text. This has raised ethical concerns about its use in writing scientific literature. Our aim is to highlight the benefits and risks that this technology may pose to the surgical field.
Project description:Deep learning models learn from existing data and can then analyze and make predictions about new data. Applied to education, this technology can drive innovation and reform in teaching modes, methods, and curricula. This article explores the application of deep learning to teaching resource recommendation. Within the study, a recommendation algorithm for teaching resources is devised that integrates deep learning and cognitive diagnosis (ADCF). The model consists of two core elements, the Multi-layer Perceptron (MLP) and Generalized Matrix Factorization (GMF), which operate together through stages of linear representation and nonlinear learning of the interaction function. The empirical analysis shows that the ADCF model achieves 0.626 in hit ratio (HR) and 0.339 in Normalized Discounted Cumulative Gain (NDCG), outperforming the traditional model and signifying its potential to add significant value to teaching resource recommendation.
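As a concrete illustration of the two-branch design described above (a linear GMF interaction fused with a nonlinear MLP interaction), the sketch below wires up such a recommender in PyTorch. The class name, embedding sizes, layer widths, and fusion head are assumptions made for illustration; they are not the authors' ADCF implementation.

```python
# Hedged sketch of a GMF + MLP hybrid recommender in the spirit of the ADCF
# description above. All names and sizes are illustrative assumptions.
import torch
import torch.nn as nn

class GMFMLPRecommender(nn.Module):
    def __init__(self, n_students, n_resources, dim=32):
        super().__init__()
        # Separate embeddings for the linear (GMF) and nonlinear (MLP) branches.
        self.gmf_user = nn.Embedding(n_students, dim)
        self.gmf_item = nn.Embedding(n_resources, dim)
        self.mlp_user = nn.Embedding(n_students, dim)
        self.mlp_item = nn.Embedding(n_resources, dim)
        self.mlp = nn.Sequential(
            nn.Linear(2 * dim, dim), nn.ReLU(),
            nn.Linear(dim, dim // 2), nn.ReLU(),
        )
        # Fuse the GMF element-wise product with the MLP output for the final score.
        self.head = nn.Linear(dim + dim // 2, 1)

    def forward(self, student_idx, resource_idx):
        gmf = self.gmf_user(student_idx) * self.gmf_item(resource_idx)          # linear interaction
        mlp = self.mlp(torch.cat([self.mlp_user(student_idx),
                                  self.mlp_item(resource_idx)], dim=-1))        # nonlinear interaction
        return torch.sigmoid(self.head(torch.cat([gmf, mlp], dim=-1))).squeeze(-1)

# Example: predicted relevance of four hypothetical (student, resource) pairs.
model = GMFMLPRecommender(n_students=1000, n_resources=500)
scores = model(torch.tensor([0, 1, 2, 3]), torch.tensor([10, 20, 30, 40]))
print(scores)
```

In a ranking evaluation, each student's candidate resources would be scored with such a model and HR@k and NDCG@k computed over the resulting top-k lists.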
Project description:Background: Letters of recommendation (LORs) are an important part of applications for residency and fellowship programs. Despite anecdotal use of a "code" in LORs, research on program director (PD) perceptions of the value of these documents is sparse. Objective: We analyzed PD interpretations of LOR components and discriminated between perceived levels of applicant recommendations. Methods: We conducted a cross-sectional, descriptive study of pediatrics residency and fellowship PDs. We developed a survey asking PDs to rate 3 aspects of LORs: 13 letter features, 10 applicant abilities, and 11 commonly used phrases, using a 5-point Likert scale. The 11 phrases were grouped using principal component analysis. Mean scores of components were analyzed with repeated-measures analysis of variance. Median Likert score differences between groups were analyzed with Mann-Whitney U tests. Results: Our survey had a 43% response rate (468 of 1079). "I give my highest recommendation" was rated the most positive phrase, while "showed improvement" was rated the most negative. Principal component analysis generated 3 groups of phrases with moderate to strong correlation with each other. The mean Likert score for each group from the PD rating was calculated. Positive phrases had a mean (SD) of 4.4 (0.4), neutral phrases 3.4 (0.5), and negative phrases 2.6 (0.6). There was a significant difference among all 3 pairs of mean scores (all P < .001). Conclusions: Commonly used phrases in LORs were interpreted consistently by PDs and influenced their impressions of candidates. Key elements of LORs include distinct phrases depicting different degrees of endorsement.
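The analysis pipeline above (principal component analysis to group the 11 Likert-rated phrases, then nonparametric comparisons of group scores) can be sketched as follows. The simulated ratings, the loading-based grouping rule, and the residency-versus-fellowship split are illustrative assumptions, not the study's data or code.

```python
# Hedged sketch: group Likert-rated phrases with PCA, then compare two
# respondent groups with a Mann-Whitney U test. Data are simulated.
import numpy as np
from sklearn.decomposition import PCA
from scipy.stats import mannwhitneyu

rng = np.random.default_rng(0)
ratings = rng.integers(1, 6, size=(468, 11)).astype(float)   # respondents x phrases, 5-point Likert

# Extract 3 components and assign each phrase to the component it loads on most strongly.
pca = PCA(n_components=3).fit(ratings)
phrase_group = np.abs(pca.components_).argmax(axis=0)         # one group label per phrase

# Respondent-level mean score for the phrases grouped with the first phrase.
grp = phrase_group == phrase_group[0]
group_scores = ratings[:, grp].mean(axis=1)

# Hypothetical split into residency vs. fellowship PDs, compared nonparametrically.
residency, fellowship = group_scores[:300], group_scores[300:]
stat, p = mannwhitneyu(residency, fellowship)
print(f"U = {stat:.1f}, p = {p:.3f}")
```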
Project description:Objective: To investigate whether or not gender influences letters of recommendation for cardiothoracic surgery fellowship. Methods: From applications to an Accreditation Council for Graduate Medical Education cardiothoracic surgery fellowship program between 2016 and 2021, applicant and author characteristics were examined with descriptive statistics, analysis of variance, and Pearson χ2 tests. Linguistic software was used to assess communication differences in letters of recommendation, stratified by author and applicant gender. An additional higher-level analysis was then performed using a generalized estimating equations model to examine linguistic differences among author-applicant gender pairs. Results: Seven hundred thirty-nine recommendation letters extracted from 196 individual applications were analyzed; 90% (n = 665) of authors were men and 55.8% (n = 412) of authors were cardiothoracic surgeons. Compared with women authors, authors who are men wrote more authentic (P = .01) and informal (P = .03) recommendation letters. When writing for women applicants, authors who are men were more likely to display their own leadership and status (P = .03) and to discuss women applicants' social affiliations (P = .01), such as the occupation of the applicant's father or husband. Women authors wrote longer letters (P = .03) and discussed applicants' work (P = .01) more often than authors who are men. They also mentioned leisure activities (P = .03) more often when writing for women applicants. Conclusions: Our work identifies gender-specific differences in letters of recommendation. Women applicants may be disadvantaged because their recommendation letters are significantly more likely to focus on their social ties, leisure activities, and the status of the letter writer. Author and reviewer awareness of gender-biased use of language will aid in improvements to the candidate selection process.
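A minimal sketch of the higher-level analysis mentioned above, a generalized estimating equations model that accounts for letters clustered within applications, is given below using statsmodels. The column names, the synthetic data, and the Gaussian/exchangeable specification are assumptions for illustration only.

```python
# Hedged sketch: GEE for a linguistic score, clustering the 739 letters
# within their 196 applications. Data and variable names are simulated assumptions.
import numpy as np
import pandas as pd
import statsmodels.api as sm

rng = np.random.default_rng(1)
n = 739
df = pd.DataFrame({
    "application_id": rng.integers(0, 196, size=n),            # cluster: letters within an application
    "author_gender": rng.choice(["man", "woman"], size=n),
    "applicant_gender": rng.choice(["man", "woman"], size=n),
    "authenticity": rng.normal(50, 10, size=n),                 # e.g., a linguistic summary score
})

# GEE with an exchangeable working correlation across letters of the same application.
model = sm.GEE.from_formula(
    "authenticity ~ author_gender * applicant_gender",
    groups="application_id",
    data=df,
    family=sm.families.Gaussian(),
    cov_struct=sm.cov_struct.Exchangeable(),
)
print(model.fit().summary())
```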
Project description:Artificial intelligence (A.I.) increasingly suffuses everyday life. However, people are frequently reluctant to interact with A.I. systems. This challenges both the deployment of beneficial A.I. technology and the development of deep learning systems that depend on humans for oversight, direction, and regulation. Nine studies (N = 3,300) demonstrate that social-cognitive processes guide human interactions across a diverse range of real-world A.I. systems. Across studies, perceived warmth and competence emerge prominently in participants' impressions of A.I. systems. Judgments of warmth and competence systematically depend on human-A.I. interdependence and autonomy. In particular, participants perceive systems that optimize interests aligned with human interests as warmer and systems that operate independently from human direction as more competent. Finally, a prisoner's dilemma game shows that warmth and competence judgments predict participants' willingness to cooperate with a deep-learning system. These results underscore the generality of intent detection to perceptions of a broad array of algorithmic actors.
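One simple way to test the final claim above, that warmth and competence judgments predict willingness to cooperate, is a logistic regression of cooperation decisions on the two ratings. The sketch below uses simulated data; the variable names and effect sizes are assumptions, not the study's materials.

```python
# Hedged sketch: do warmth and competence ratings predict cooperation with an
# A.I. agent in a prisoner's dilemma round? Data are simulated assumptions.
import numpy as np
import statsmodels.api as sm

rng = np.random.default_rng(2)
n = 300
warmth = rng.normal(0, 1, n)
competence = rng.normal(0, 1, n)

# Simulate cooperation as more likely when the system is judged warm and competent.
p_coop = 1 / (1 + np.exp(-(0.8 * warmth + 0.5 * competence)))
cooperate = rng.binomial(1, p_coop)

# Logistic regression of the cooperation decision on the two judgments.
X = sm.add_constant(np.column_stack([warmth, competence]))
result = sm.Logit(cooperate, X).fit(disp=False)
print(result.params)   # intercept, warmth, competence coefficients
```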
Project description:Background: Work circumstances can substantially negatively impact health. To explore this, large occupational cohorts of free-text job descriptions are manually coded and linked to exposure. Although several automatic coding tools have been developed, accurate exposure assessment is only feasible with human intervention. Methods: We developed OPERAS, a customizable decision support system for epidemiological job coding. Using 812,522 entries, we developed and tested classification models for the Professions et Catégories Socioprofessionnelles (PCS)2003, Nomenclature d'Activités Française (NAF)2008, International Standard Classification of Occupations (ISCO)-88, and ISCO-68. Each code comes with an estimated correctness measure to identify instances potentially requiring expert review. Here, OPERAS' decision support enables an increase in efficiency and accuracy of the coding process through code suggestions. Using the Formaldehyde, Silica, ALOHA, and DOM job-exposure matrices, we assessed the classification models' exposure assessment accuracy. Results: We show that, using expert-coded job descriptions as the gold standard, OPERAS realized a 0.66-0.84, 0.62-0.81, 0.60-0.79, and 0.57-0.78 inter-coder reliability (Cohen's kappa) on the first, second, third, and fourth coding levels, respectively. These exceed the respective expert inter-coder reliabilities of 0.59-0.76, 0.56-0.71, 0.46-0.63, and 0.40-0.56 on the same levels, enabling a 75.0-98.4% exposure assessment accuracy and an estimated 19.7-55.7% minimum workload reduction. Conclusions: OPERAS secures a high degree of accuracy in occupational classification and exposure assessment of free-text job descriptions, substantially reducing workload. As such, OPERAS significantly outperforms both expert coders and other current coding tools. This enables large-scale, efficient, and effective exposure assessment, securing healthy work conditions.
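The correctness-measure idea described above, auto-accepting codes the model is confident about and routing the rest to expert coders, can be sketched as follows. The threshold, the simulated codes and confidence scores, and the resulting numbers are illustrative assumptions, not OPERAS itself.

```python
# Hedged sketch: threshold an estimated correctness score to decide which
# suggested job codes skip expert review, then check agreement with Cohen's kappa.
import numpy as np
from sklearn.metrics import cohen_kappa_score

rng = np.random.default_rng(3)
n = 10_000
expert_codes = rng.integers(0, 50, size=n)                      # gold-standard first-level codes
predicted = np.where(rng.random(n) < 0.8, expert_codes,         # model correct ~80% of the time
                     rng.integers(0, 50, size=n))
confidence = np.where(predicted == expert_codes,                # correctness estimate per suggestion
                      rng.beta(8, 2, size=n), rng.beta(2, 5, size=n))

threshold = 0.7
auto = confidence >= threshold                                  # accepted without expert review
print("estimated workload reduction:", auto.mean())
print("kappa on auto-accepted codes:",
      cohen_kappa_score(expert_codes[auto], predicted[auto]))
```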
Project description:Background: IBM Watson for Oncology (WFO) is a cognitive computing system that helps physicians quickly identify key information in a patient's medical record, surface relevant evidence, and explore treatment options. This study assessed the possibility of using WFO for clinical treatment in lung cancer patients. Methods: We evaluated the level of agreement between WFO and a multidisciplinary team (MDT) for lung cancer. From January to December 2018, newly diagnosed lung cancer cases in Chonnam National University Hwasun Hospital were retrospectively examined using WFO version 18.4 according to four treatment categories (surgery, radiotherapy, chemoradiotherapy, and palliative care). Treatment recommendations were considered concordant if the MDT recommendations were designated 'recommended' by WFO. Concordance between MDT and WFO was analyzed by Cohen's kappa value. Results: In total, 405 cases (340 male, 65 female) with different histology (adenocarcinoma 157, squamous cell carcinoma 132, small cell carcinoma 94, others 22 cases) were enrolled. Concordance between MDT and WFO occurred in 92.4% (k=0.881, P<0.001) of all cases, and concordance differed according to clinical stage. The strength of agreement was very good in stage IV non-small cell lung carcinoma (NSCLC) (100%, k=1.000) and extensive disease small cell lung carcinoma (SCLC) (100%, k=1.000). In stage I NSCLC, the agreement strength was good (92.4%, k=0.855). The concordance was moderate in stage III NSCLC (80.8%, k=0.622) and relatively low in stage II NSCLC (83.3%, k=0.556) and limited disease SCLC (84.6%, k=0.435). There were discordant cases in surgery (7/57, 12.3%), radiotherapy (2/12, 16.7%), and chemoradiotherapy (15/129, 11.6%), but no discordance in metastatic disease patients. Conclusions: Treatment recommendations made by WFO and MDT were highly concordant for lung cancer cases, especially in the metastatic stage. However, WFO was only an assisting tool in stage I-III NSCLC and limited disease SCLC, so the patient-doctor relationship and shared decision making may be more important at these stages.
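The concordance analysis above reduces to a cross-tabulation of MDT and WFO treatment categories and a Cohen's kappa on the paired labels. The sketch below reproduces that calculation on simulated recommendations; the agreement rate is an assumption, not the study data.

```python
# Hedged sketch: agreement table and Cohen's kappa for paired MDT/WFO
# treatment recommendations. Recommendations here are simulated.
import numpy as np
import pandas as pd
from sklearn.metrics import cohen_kappa_score

categories = ["surgery", "radiotherapy", "chemoradiotherapy", "palliative care"]
rng = np.random.default_rng(4)
mdt = rng.choice(categories, size=405)
# Assume WFO agrees with the MDT about 92% of the time; otherwise pick another category.
wfo = np.where(rng.random(405) < 0.92, mdt, rng.choice(categories, size=405))

print(pd.crosstab(pd.Series(mdt, name="MDT"), pd.Series(wfo, name="WFO")))
print("Cohen's kappa:", cohen_kappa_score(mdt, wfo))
```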
Project description:Purpose: To evaluate the value of artificial intelligence (AI) for recommending a pupil dilation test based on a medical interview and basic ophthalmologic examinations. Design: Retrospective, cross-sectional study. Subjects: Medical records of 56,811 patients who visited our outpatient clinic for the first time between 2017 and 2020 were included in the training dataset. Patients who visited the clinic in 2021 were included in the test dataset. Among these, 3,885 asymptomatic patients, including eye check-up patients, were initially included in test dataset I. Subsequently, 14,199 symptomatic patients who visited the clinic in 2021 were included in test dataset II. Methods: All patients underwent a medical interview and basic ophthalmologic examinations such as uncorrected distance visual acuity, corrected distance visual acuity, non-contact tonometry, auto-keratometry, slit-lamp examination, dilated pupil test, and fundus examination. A clinically significant lesion in the lens, vitreous, or fundus was defined by subspecialists, and the need for a pupil dilation test was determined when the participant had one or more clinically significant lesions in either eye. Input variables of the AI consisted of the medical interview and basic ophthalmologic examinations, and the AI was evaluated on its predictive performance for the need for a pupil dilation test. Main outcome measures: Accuracy, sensitivity, specificity, and positive predictive value. Results: Clinically significant lesions were present in 26.5% and 59.1% of patients in test datasets I and II, respectively. In test dataset I, the model performances were as follows: accuracy, 0.908 (95% confidence interval (CI): 0.880-0.936); sensitivity, 0.757 (95% CI: 0.713-0.801); specificity, 0.962 (95% CI: 0.947-0.977); positive predictive value, 0.878 (95% CI: 0.834-0.922); and F1 score, 0.813. In test dataset II, the model had an accuracy of 0.949 (95% CI: 0.934-0.964), a sensitivity of 0.942 (95% CI: 0.928-0.956), a specificity of 0.960 (95% CI: 0.927-0.993), a positive predictive value of 0.971 (95% CI: 0.957-0.985), and an F1 score of 0.956. Conclusion: The AI model using a medical interview and basic ophthalmologic examinations to determine the need for a pupil dilation test had good sensitivity and specificity for symptomatic patients, although there was a limitation in identifying asymptomatic patients.
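For reference, the reported metrics (accuracy, sensitivity, specificity, positive predictive value, F1) all derive from the binary confusion matrix of the "needs pupil dilation" prediction. The sketch below computes them on simulated labels; the prevalence and error rate are assumptions for illustration.

```python
# Hedged sketch: computing the evaluation metrics named above from a binary
# confusion matrix. Labels and predictions are simulated.
import numpy as np
from sklearn.metrics import confusion_matrix, f1_score

rng = np.random.default_rng(5)
n = 14_199
y_true = rng.binomial(1, 0.59, size=n)                          # ~59% prevalence, as in test dataset II
y_pred = np.where(rng.random(n) < 0.95, y_true, 1 - y_true)     # ~95% of predictions correct

tn, fp, fn, tp = confusion_matrix(y_true, y_pred).ravel()
print("accuracy:   ", (tp + tn) / n)
print("sensitivity:", tp / (tp + fn))
print("specificity:", tn / (tn + fp))
print("PPV:        ", tp / (tp + fp))
print("F1:         ", f1_score(y_true, y_pred))
```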
Project description:Background: Artificial intelligence (AI) has been extensively used in a range of medical fields to promote therapeutic development. The development of diverse AI techniques has also contributed to early detection, disease diagnosis, and referral management. However, concerns about the value of advanced AI in disease diagnosis have been raised by health care professionals, medical service providers, and health policy decision makers. Objective: This review aimed to systematically examine the literature, focusing in particular on the performance comparison between advanced AI and human clinicians, to provide an up-to-date summary of the extent of the application of AI to disease diagnosis. By doing so, this review discusses the relationship between current advanced AI development and clinicians with respect to disease diagnosis and thus therapeutic development in the long run. Methods: We systematically searched articles published between January 2000 and March 2019, following the Preferred Reporting Items for Systematic Reviews and Meta-Analyses (PRISMA) guidelines, in the following databases: Scopus, PubMed, CINAHL, Web of Science, and the Cochrane Library. According to the preset inclusion and exclusion criteria, only articles comparing the medical performance of advanced AI and human experts were considered. Results: A total of 9 articles were identified. A convolutional neural network was the most commonly applied advanced AI technology. Owing to the variation in medical fields, individual studies differ in terms of classification, labeling, training process, dataset size, and algorithm validation of the AI. Performance indices reported in the articles included diagnostic accuracy, weighted errors, false-positive rate, sensitivity, specificity, and the area under the receiver operating characteristic curve. The results showed that the performance of AI was on par with that of clinicians and exceeded that of clinicians with less experience. Conclusions: Current AI development has a diagnostic performance that is comparable with that of medical experts, especially in image recognition-related fields. Further studies can be extended to other types of medical imaging, such as magnetic resonance imaging, and to other medical practices unrelated to images. With the continued development of AI-assisted technologies, the clinical implications underpinned by clinicians' experience and guided by patient-centered health care principles should be constantly considered in future AI-related and other technology-based medical research.
Project description:Background: This survey aims to identify the relative value and the critical components of anesthesiology letters of recommendation (LORs) from the perspective of Program Directors (PDs) and Associate/Assistant Program Directors (APDs). Knowledge and insights originating from this survey might add to the understanding of the anesthesiology residency selection process and mitigate unintended linguistic biases. Methodology: Anonymous online surveys were sent to anesthesiology PDs/APDs from the Accreditation Council for Graduate Medical Education (ACGME) accredited anesthesiology residency programs in the United States, as listed on the ACGME website and the American Medical Association Fellowship and Residency Electronic Interactive Database (AMA FREIDA) Residency Program Database. The survey authors were blinded to the identity of the respondents. Results: 62 of 183 (33.8%) invited anesthesiology PDs/APDs completed the survey anonymously. In our survey, LORs are reported as more important in granting an interview than in making the rank list. 64% of respondents prefer narrative LORs. 77.4% of respondents look for specific keywords in LORs. Keywords such as 'top % of students' and 'we are recruiting this candidate' indicate a strong letter of recommendation, while keywords such as 'I recommend to your program' or non-superlative descriptions indicate a weak letter of recommendation. Other key components of LORs include the specialty of the letter writer, according to 84% of respondents, with anesthesiology rated as the most valuable specialty. Although narrative LORs are preferred, 55.1% of respondents are not satisfied with the content of narrative LORs. Conclusion: LORs containing specific keywords play an important role in the application to anesthesiology residency, particularly when submitted by an anesthesiologist. While narrative LORs are still the preferred format, most of our respondents feel they need improvements. The authors suggest specific LOR improvements, including creating formalized LOR training, adding a style guide, and applying comparative scales with standardized vocabulary in the narrative LOR.