Authors' results and conclusions: The final search yielded 428 references; 390 documents remained after 38 duplicates were discarded. After the first selection, based on title and abstract, 315 papers were excluded, leaving 75 to be read in full. Of these, 68 papers were excluded after the inclusion and exclusion criteria were applied, so that 11 original studies were finally included in the review. All of them were observational studies without a control group, and most assessed more than one score. Three good-quality studies compared the accuracy of the scores with that of empirical assessment. They found no relevant difference in diagnostic performance between these two options for estimating the pre-test probability of pulmonary embolism. When the scores were compared with each other, the results varied widely. The two-category Wells score proved more specific than sensitive in two good-quality studies, in which sensitivity ranged from 60% to 62%. The three-category Wells score was assessed in nine studies. Two studies assessed it in patient groups with a low frequency of pulmonary embolism and reported differing sensitivity and specificity (one estimated a sensitivity of 92% with moderate specificity, while the other reported a sensitivity of 54%), although both agreed on a high negative predictive value for the score (over 96%). In patients with a moderate or high frequency of pulmonary embolism, the studies agreed that the three-category Wells score was more sensitive than specific: sensitivity was over 90% in patients with a moderate frequency of embolism (two studies) and ranged from 66% to 91% (five studies) in patients with a high frequency of embolism. The Geneva score, which was assessed in five studies, was characterised as more sensitive than specific.
Its results agreed well with those of the Wells scores in the only study that applied the score prospectively; both achieved a sensitivity of over 70%. The Wells algorithm, the Charlotte rules, and the Rodger and Pisa models were each assessed in only one study. Of these, only the Pisa model showed a statistically significantly better diagnostic performance than the Wells and Geneva scores (area under the ROC curve of 94% for this model, compared with 54% for the Geneva score and 75% for the Wells score). Although this is a good-quality study, problems of external validity limit its applicability. None of these studies reported results on the clinical effectiveness of applying the scores.
Authors' recommendations: The diagnostic performance of the scores used to determine the pre-test probability of pulmonary embolism does not differ relevantly from that of empirical assessment. On the basis of the available evidence, the superiority of any specific score over the others in diagnostic performance cannot be established. The three-category Wells score and the Geneva score have been studied the most. The three-category Wells score showed high sensitivity (above 90%) in patients with a moderate frequency of pulmonary embolism, but in patients with a high frequency the sensitivity data were inconsistent (between 66% and 91%). In good-quality studies that assessed the accuracy of the Geneva score in patients with a high frequency of pulmonary embolism, sensitivity was high (over 86%). The diagnostic performance studies identified in this review did not assess the clinical effectiveness of these scores in terms of health outcomes.
