Project description: Background: The AGREE II instrument is the most commonly used guideline appraisal tool. It includes 23 appraisal criteria (items) organized within six domains. AGREE II also includes two overall assessments (overall guideline quality, recommendation for use). Our aim was to investigate how strongly the 23 AGREE II items influence the two overall assessments. Methods: An online survey of authors of publications on guideline appraisals with AGREE II and guideline users from a German scientific network was conducted between 10th February 2015 and 30th March 2015. Participants were asked to rate the influence of the AGREE II items on a Likert scale (0 = no influence to 5 = very strong influence). The frequencies of responses and their dispersion were presented descriptively. Results: Fifty-eight of the 376 persons contacted (15.4%) participated in the survey and the data of the 51 respondents with prior knowledge of AGREE II were analysed. Items 7-12 of Domain 3 (rigour of development) and both items of Domain 6 (editorial independence) had the strongest influence on the two overall assessments. In addition, Items 15-17 (clarity of presentation) had a strong influence on the recommendation for use. Great variations were shown for the other items. The main limitation of the survey is the low response rate. Conclusions: In guideline appraisals using AGREE II, items representing rigour of guideline development and editorial independence seem to have the strongest influence on the two overall assessments. In order to ensure a transparent approach to reaching the overall assessments, we suggest the inclusion of a recommendation in the AGREE II user manual on how to consider item and domain scores. For instance, the manual could include an a-priori weighting of those items and domains that should have the strongest influence on the two overall assessments. The relevance of these assessments within AGREE II could thereby be further specified.
Project description: Introduction: The Appraisal of Guidelines for Research & Evaluation (AGREE) II instrument is the most commonly used guideline appraisal tool. It includes 23 appraisal criteria (items) organized within 6 domains and 2 overall assessments (1. overall guideline quality; 2. recommendation for use). The aim of this systematic review was twofold. Firstly, to investigate how often AGREE II users conduct the 2 overall assessments. Secondly, to investigate the influence of the 6 domain scores on each of the 2 overall assessments. Materials and methods: A systematic bibliographic search was conducted for publications reporting guideline appraisals with AGREE II. The impact of the 6 domain scores on the overall assessment of guideline quality was examined using a multiple linear regression model. Their impact on the recommendation for use (possible answers: "yes", "yes, with modifications", "no") was examined using a multinomial regression model. Results: 118 relevant publications including 1453 guidelines were identified. 77.1% of the publications reported results for at least one overall assessment, but only 32.2% reported results for both overall assessments. The results of the regression analyses showed a statistically significant influence of all domains on overall guideline quality, with Domain 3 (rigour of development) having the strongest influence. For the recommendation for use, the results showed a significant influence of Domains 3 to 5 ("yes" vs. "no") and Domains 3 and 5 ("yes, with modifications" vs. "no"). Conclusions: The 2 overall assessments of AGREE II are underreported by guideline assessors. Domains 3 and 5 have the strongest influence on the results of the 2 overall assessments, while the other domains have a varying influence. Within a normative approach, our findings could be used as guidance for weighting individual domains in AGREE II to make the overall assessments more objective.
Alternatively, a stronger content analysis of the individual domains could clarify their importance in terms of guideline quality. Moreover, AGREE II should require users to transparently present how they conducted the assessments.
Project description: Purpose: To quantify differences between computationally estimated computed tomography (CT) organ doses for patient-specific voxel phantoms and estimated organ doses in matched computational phantoms using different matching criteria. Materials and methods: Fifty-two patient-specific computational voxel phantoms were created through CT image segmentation. In addition, each patient-specific phantom was matched to six computational phantoms of the same gender based, respectively, on age and gender (reference phantoms), height and weight, effective diameter (both central slice and exam range average), and water equivalent diameter (both central slice and exam range average). Each patient-specific phantom and matched computational phantom were then used to simulate six different torso examinations using a previously validated Monte Carlo CT dosimetry methodology that accounts for tube current modulation. Organ doses for each patient-specific phantom were then compared with the organ dose estimates of each of the matched phantoms. Results: Relative to the corresponding patient-specific phantoms, the root mean square of the difference in organ dose was 39.1%, 20.3%, 22.7%, 21.6%, 20.5%, and 17.6%, for reference, height and weight, effective diameter (central slice and scan average), and water equivalent diameter (central slice and scan average), respectively. The average magnitude of difference in organ dose was 24%, 14%, 16.9%, 16.2%, 14%, and 11.9%, respectively. Conclusion: Overall, these data suggest that matching a patient to a computational phantom in a library is superior to matching to a reference phantom. Water equivalent diameter is the superior matching metric, but it is less feasible to implement in a clinical and retrospective setting. For these reasons, height-and-weight matching is an acceptable and reliable method for matching a patient to a member of a computational phantom library with regard to CT dosimetry.
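The two summary statistics reported in this abstract can be illustrated with a small sketch. It assumes that the per-organ difference is expressed as a percentage of the patient-specific dose, which the abstract implies but does not spell out. The function names and dose values are hypothetical and are not taken from the study.

```python
import math


def percent_differences(patient_doses, matched_doses):
    """Per-organ difference of the matched-phantom dose, in % of the patient-specific dose."""
    return [100.0 * (m - p) / p for p, m in zip(patient_doses, matched_doses)]


def root_mean_square(values):
    """Root mean square of a list of differences."""
    return math.sqrt(sum(v * v for v in values) / len(values))


def mean_magnitude(values):
    """Average absolute value (magnitude) of a list of differences."""
    return sum(abs(v) for v in values) / len(values)


# Hypothetical organ doses in mGy, not data from the study.
patient_specific = [10.0, 20.0, 40.0]
matched_phantom = [11.0, 18.0, 40.0]

diffs = percent_differences(patient_specific, matched_phantom)
rms_difference = root_mean_square(diffs)
mean_abs_difference = mean_magnitude(diffs)
print(f"RMS difference: {rms_difference:.1f}%")
print(f"Mean magnitude of difference: {mean_abs_difference:.1f}%")
```

The root mean square weights large deviations more heavily than the mean magnitude does, which would be consistent with the RMS figures in the abstract being larger than the corresponding averages.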
Project description: Background & aims: Over the last decade, clinical experiences and research studies raised concerns regarding use of proton pump inhibitors (PPIs) as part of the diagnostic strategy for eosinophilic esophagitis (EoE). We aimed to clarify the use of PPIs in the evaluation and treatment of children and adults with suspected EoE to develop updated international consensus criteria for EoE diagnosis. Methods: A consensus conference was convened to address the issue of PPI use for esophageal eosinophilia using a process consistent with standards described in the Appraisal of Guidelines for Research and Evaluation II. Pediatric and adult physicians and researchers from gastroenterology, allergy, and pathology subspecialties representing 14 countries used online communications, teleconferences, and a face-to-face meeting to review the literature and clinical experiences. Results: Substantial evidence documented that PPIs reduce esophageal eosinophilia in children, adolescents, and adults, with several mechanisms potentially explaining the treatment effect. Based on these findings, an updated diagnostic algorithm for EoE was developed, with removal of the PPI trial requirement. Conclusions: EoE should be diagnosed when there are symptoms of esophageal dysfunction and at least 15 eosinophils per high-power field (or approximately 60 eosinophils per mm²) on esophageal biopsy and after a comprehensive assessment of non-EoE disorders that could cause or potentially contribute to esophageal eosinophilia. The evidence suggests that PPIs are better classified as a treatment for esophageal eosinophilia that may be due to EoE than as a diagnostic criterion, and we have developed updated consensus criteria for EoE that reflect this change.
Project description: Background: Different methodological choices such as inclusion/exclusion criteria and analytical models can yield different results and inferences when meta-analyses are performed. We explored the range of such differences, using several methodological choices for indirect comparison meta-analyses to compare nalmefene and naltrexone in the reduction of alcohol consumption as a case study. Methods: All double-blind randomized controlled trials (RCTs) comparing nalmefene to naltrexone or one of these compounds to a placebo in the treatment of alcohol dependence or alcohol use disorders were considered. Two reviewers searched for published and unpublished studies in MEDLINE (August 2017), the Cochrane Library, Embase, and ClinicalTrials.gov and contacted pharmaceutical companies, the European Medicines Agency, and the Food and Drug Administration. The indirect comparison meta-analyses were performed according to different inclusion/exclusion criteria (based on medical condition, abstinence of patients before inclusion, gender, somatic and psychiatric comorbidity, psychological support, treatment administered and dose, treatment duration, outcome reported, publication status, and risk of bias) and different analytical models (fixed and random effects). The primary outcome was the vibration of effects (VoE), i.e. the range of different results of the indirect comparison between nalmefene and naltrexone. The presence of a "Janus effect" was investigated, i.e. whether the 1st and 99th percentiles in the distribution of effect sizes were in opposite directions. Results: Nine nalmefene and 51 naltrexone RCTs were included. No study provided a direct comparison between the drugs. We performed 9216 meta-analyses for the indirect comparison with a median of 16 RCTs (interquartile range = 12-21) included in each meta-analysis.
The standardized effect size was negative at the 1st percentile (-0.29, favouring nalmefene) and positive at the 99th percentile (0.29, favouring naltrexone). A total of 7.1% (425/5961) of the meta-analyses with a negative effect size and 18.9% (616/3255) of those with a positive effect size were statistically significant (p < 0.05). Conclusions: The choice of inclusion/exclusion criteria and analytical models for meta-analysis can result in entirely opposite results. VoE evaluations could be performed when overlapping meta-analyses on the same topic yield contradictory results. Trial registration: This study was registered on October 19, 2016, in the Open Science Framework (OSF, protocol available at https://osf.io/7bq4y/).
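The following is a minimal sketch of how an indirect comparison and a Janus-effect check might be computed. It assumes an adjusted indirect comparison through the shared placebo arm (the Bucher approach); the abstract does not name the exact method, so this is an illustration rather than the authors' procedure. It also assumes that negative values favour the first drug, as in the abstract. All function names and numbers are hypothetical.

```python
import math
import statistics


def bucher_indirect(d_a_placebo, se_a_placebo, d_b_placebo, se_b_placebo):
    """Indirect effect of A vs. B from two placebo-controlled pooled effects.

    Assumption: both pooled effects are on the same standardized scale.
    """
    effect = d_a_placebo - d_b_placebo
    std_error = math.sqrt(se_a_placebo ** 2 + se_b_placebo ** 2)
    return effect, std_error


def vibration_of_effects(effect_sizes):
    """Return the 1st and 99th percentiles of a list of effect sizes."""
    cuts = statistics.quantiles(effect_sizes, n=100, method="inclusive")
    return cuts[0], cuts[98]


def has_janus_effect(effect_sizes):
    """True if the 1st and 99th percentiles point in opposite directions."""
    p1, p99 = vibration_of_effects(effect_sizes)
    return p1 < 0 < p99


# Hypothetical pooled effects versus placebo, not values from the study.
indirect_effect, indirect_se = bucher_indirect(-0.20, 0.05, -0.15, 0.12)
print(f"Indirect effect: {indirect_effect:.2f} (SE {indirect_se:.2f})")

# Hypothetical effect sizes from meta-analyses run under different choices.
effects = [-0.30, -0.10, 0.00, 0.10, 0.30]
print("Janus effect:", has_janus_effect(effects))
```

In a full VoE analysis, one such indirect effect would be computed for every combination of inclusion/exclusion criteria and analytical model, and the percentile check would be applied to the resulting distribution.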
Project description: Lung cancer (LC) screening often focuses on heavy smokers as the target group for screening. Heavy smoking can thus be regarded as an LC pre-screening test with sensitivities and specificities being different in various populations due to the differences in smoking histories. We derive here expected sensitivities and specificities of various criteria to preselect individuals for LC screening in 27 European countries with diverse smoking prevalences. Sensitivities of various heavy-smoking-based pre-screening criteria were estimated by combining sex-specific proportions of people meeting these criteria in the target population for screening with associations of heavy smoking with LC risk. Expected specificities were approximated by the proportion of individuals not meeting the heavy smoking definition. Estimated sensitivities and specificities varied widely across countries, with sensitivities being generally higher among men (range: 33-80%) than among women (range: 9-79%), and specificities being generally lower among men (range: 48-90%) than among women (range: 70-99%). Major variation in sensitivities and specificities was also seen across different pre-selection criteria for LC screening within individual countries. Our results may inform the design of LC screening programs in European countries and serve as benchmarks for novel alternative or complementary tests for selecting people at high risk for CT-based LC screening.
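One plausible way to combine a prevalence with a risk association is sketched below. It assumes that the association is summarized as a single relative risk of LC for people meeting the heavy-smoking criterion versus everyone else, so that the sensitivity equals the share of LC cases arising among those who meet the criterion. This formula is an assumption for illustration, since the abstract does not state the exact calculation. The specificity approximation follows the abstract. Function names and input values are hypothetical.

```python
def expected_sensitivity(prevalence, relative_risk):
    """Share of LC cases expected among people meeting the criterion.

    prevalence: proportion of the target population meeting the criterion.
    relative_risk: LC risk of those meeting the criterion relative to the rest
    (an assumed summary of the association, see the note above).
    """
    if not 0.0 < prevalence < 1.0:
        raise ValueError("prevalence must lie strictly between 0 and 1")
    if relative_risk <= 0.0:
        raise ValueError("relative_risk must be positive")
    cases_meeting = prevalence * relative_risk
    cases_not_meeting = 1.0 - prevalence
    return cases_meeting / (cases_meeting + cases_not_meeting)


def expected_specificity(prevalence):
    """Approximate specificity as the proportion not meeting the criterion."""
    return 1.0 - prevalence


# Hypothetical inputs: 20% heavy smokers, tenfold LC risk.
sensitivity = expected_sensitivity(0.20, 10.0)
specificity = expected_specificity(0.20)
print(f"Sensitivity: {sensitivity:.1%}, specificity: {specificity:.1%}")
```

Under these assumptions a higher prevalence of heavy smoking raises the sensitivity and lowers the specificity, which matches the pattern reported for men versus women.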
Project description: Multiple guidelines on cutaneous melanoma (CM) are available from several consortia and countries. To provide up-to-date guidance in the rapidly changing field of melanoma treatment, guideline developers have to provide regular updates without compromising quality. We performed a systematic search in guideline databases, Medline and Embase to identify guidelines on CM. The methodological quality of the identified guidelines was independently assessed by five reviewers using the instruments "Appraisal of Guidelines for Research and Evaluation" (AGREE II) and "Recommendation EXcellence" (AGREE-REX). We performed descriptive analysis, explored subgroup differences using the Kruskal-Wallis (H) test and examined the relationship between distinct domains and items of the instruments with Spearman's correlation. Six guidelines by consortia from Australia, France, Germany, Scotland, Spain and the United States of America were included. The German guideline fulfilled 71%-98% of criteria in AGREE II and 78%-96% for AGREE-REX, obtaining the highest scores. Deficiencies in the domains of "applicability" and "values and preferences" were observed in all guidelines. The German and Spanish guidelines significantly differed from each other in most of the domains. The domains "applicability" and "values and preferences" were identified as methodological weaknesses requiring careful revision and improvement in the future.
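The percentages of fulfilled criteria are presumably scaled domain scores of the kind defined in the AGREE II user manual, although the abstract does not say so explicitly. A minimal sketch of that scaling follows, assuming items rated from 1 to 7 by each reviewer. The function name and the ratings are hypothetical.

```python
def scaled_domain_score(ratings, min_rating=1, max_rating=7):
    """Scaled domain score in percent.

    ratings: one list of item ratings per reviewer, all of equal length.
    The obtained total is rescaled between the minimum and maximum
    totals that the reviewers could have given.
    """
    n_reviewers = len(ratings)
    n_items = len(ratings[0])
    if any(len(r) != n_items for r in ratings):
        raise ValueError("every reviewer must rate the same number of items")
    obtained = sum(sum(r) for r in ratings)
    minimum = min_rating * n_items * n_reviewers
    maximum = max_rating * n_items * n_reviewers
    return 100.0 * (obtained - minimum) / (maximum - minimum)


# Hypothetical ratings of a three-item domain by five reviewers.
domain_ratings = [
    [5, 6, 6],
    [6, 6, 7],
    [2, 4, 3],
    [3, 3, 2],
    [4, 5, 5],
]
score = scaled_domain_score(domain_ratings)
print(f"Scaled domain score: {score:.1f}%")
```

Because the minimum possible total is subtracted, a domain in which every reviewer gives the lowest rating scores 0% rather than 1/7 of the maximum.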
Project description: Purpose: Clinical practice guidelines provide recommendations for the management of diseases. In orphan conditions such as uveal melanoma (UM), guideline developers are challenged to provide practical and useful guidance even in the absence of high-quality evidence. Here, we assessed the methodological quality and identified deficiencies of international guidelines on UM as a basis for future guideline development. Methods: A systematic search was carried out in guideline databases, Medline and Embase until 27th May 2019 for guidelines on UM published between 2004 and 2019. Five independent reviewers assessed the methodological quality of the identified guidelines using the instruments "Appraisal of Guidelines for Research and Evaluation II" (AGREE II) and AGREE-REX (Recommendation EXcellence). Descriptive analysis was performed and subgroup differences were explored with the Kruskal-Wallis (H) test. The relationship between the individual domains and items of the instruments was examined using Spearman's correlation. Results: Five guidelines published from 2014 to 2018 by consortia of the United States of America, Canada and the United Kingdom (UK) were included. The highest scores were obtained by the UK guideline fulfilling 48-86% of criteria in AGREE II and 30-60% for AGREE-REX. All guidelines showed deficiencies in the domains "editorial independence", "applicability", and "recommendation". Subgroup differences were identified only for the domain "editorial independence". Conclusion: The UK guideline achieved the highest scores with both instruments and may serve as a basis for future guideline development in UM. The domains "editorial independence", "recommendation", and "applicability" were identified as methodological weaknesses and require particular attention and improvement in future guidelines.
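Spearman's correlation between two domains can be sketched as the Pearson correlation of rank-transformed scores, with tied values sharing their average rank. The implementation below uses only the standard library. The helper names and domain scores are hypothetical, and the abstract does not state which software the authors used.

```python
import math


def average_ranks(values):
    """Rank values from 1 upwards; tied values share their average rank."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    start = 0
    while start < len(order):
        end = start
        while end + 1 < len(order) and values[order[end + 1]] == values[order[start]]:
            end += 1
        shared_rank = (start + end) / 2.0 + 1.0
        for k in range(start, end + 1):
            ranks[order[k]] = shared_rank
        start = end + 1
    return ranks


def pearson(x, y):
    """Pearson correlation coefficient of two equally long lists."""
    mean_x = sum(x) / len(x)
    mean_y = sum(y) / len(y)
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    var_x = sum((a - mean_x) ** 2 for a in x)
    var_y = sum((b - mean_y) ** 2 for b in y)
    return cov / math.sqrt(var_x * var_y)


def spearman(x, y):
    """Spearman's rank correlation as the Pearson correlation of ranks."""
    return pearson(average_ranks(x), average_ranks(y))


# Hypothetical scaled domain scores (%) of five guidelines.
editorial_independence = [20, 35, 50, 65, 80]
applicability = [30, 25, 45, 60, 70]
rho = spearman(editorial_independence, applicability)
print(f"Spearman's rho: {rho:.2f}")
```

With only five guidelines, as in this abstract, such a coefficient rests on very few data points and would need to be interpreted with caution.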