For a given sensitivity and specificity, the utility of a diagnostic criterion depends on the prevalence of the disease. Here you can explore this problem. In the following, H stands for Healthy and D for Disease; (+) indicates a positive test result and (-) a negative one. P(+/H) is the probability of a positive test among healthy people, and so on.
The red line shows the Positive Predictive Value (PPV), the probability of disease given a positive test. The blue line shows the Negative Predictive Value (NPV), the probability of being healthy given a negative test. You can move the prevalence to evaluate the utility of a test with the indicated sensitivity and specificity. Also, for a fixed prevalence, you can explore the sensitivity and specificity required to attain useful predictive values.
Both the PPV and the NPV should be greater than 0.5; below that value, a test result is more likely to be wrong than right. The horizontal black line at 0.5 marks this limit. You should explore, for different prevalences, whether a test with a given sensitivity and specificity meets this requirement.
Low prevalence means that the disease affects few people. In that case, it is easy to attain a high NPV. However, it will be difficult to achieve a high PPV, because even a small false-positive rate applied to the large healthy group can produce more false positives than true positives. The reverse occurs when the disease is highly prevalent.
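The dependence on prevalence follows from Bayes' theorem. The following is a minimal Python sketch of that calculation, not the code behind the plot; the function name `predictive_values` and the example values (sensitivity and specificity of 0.90) are assumptions chosen only for illustration.

```python
def predictive_values(sensitivity, specificity, prevalence):
    """Return (PPV, NPV) for a test applied at a given prevalence.

    Hypothetical helper: all three arguments are probabilities in (0, 1).
    """
    true_pos = sensitivity * prevalence                # P(+ and D)
    false_pos = (1 - specificity) * (1 - prevalence)   # P(+ and H)
    true_neg = specificity * (1 - prevalence)          # P(- and H)
    false_neg = (1 - sensitivity) * prevalence         # P(- and D)
    ppv = true_pos / (true_pos + false_pos)            # P(D/+)
    npv = true_neg / (true_neg + false_neg)            # P(H/-)
    return ppv, npv


# Same test, different prevalences (illustrative values only).
for prevalence in (0.01, 0.10, 0.50, 0.90):
    ppv, npv = predictive_values(0.90, 0.90, prevalence)
    print(f"prevalence={prevalence:.2f}  PPV={ppv:.3f}  NPV={npv:.3f}")
```

With these assumed values, a prevalence of 0.01 gives a PPV of about 0.08 while the NPV stays above 0.99, which is the low-prevalence pattern described above.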
In this exercise, we explore the effect of selecting a given value of a biomarker to discriminate between healthy and diseased people. You can move the diagnostic point and see the resulting sensitivity and specificity. It is important to note that moving the diagnostic point to increase sensitivity decreases specificity, and vice versa. The distribution of the biomarker in each population determines the performance of any criterion. The ROC curve evaluates the discriminating ability of the biomarker in a given scenario, and the AUC (area under the ROC curve) summarizes the performance in each case. An AUC of 0.5 indicates that the chosen biomarker performs no better than tossing a coin. The best discrimination is attained for high values of AUC (close to 1).
Suppose you have a test and apply it to two samples, one of healthy people and one of diseased people. In this exercise, you can compute the resulting sensitivity and specificity as a function of the diagnostic point you choose to separate positive from negative results. Data are generated randomly from two normal distributions.
For simplicity, we will assume that the disease increases the value of the biomarker, i.e. the mean value is higher for people suffering from the disease.
Based on the data, the (+) and (-) results of the test are computed according to the diagnostic point defined.
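As a rough illustration of this step, the sketch below draws two normal samples and classifies each value against a diagnostic point. It is not the exercise's own code: the means, standard deviation, sample size, seed, and the names `simulate` and `sens_spec` are all assumptions, as is the rule that a value strictly above the diagnostic point counts as (+).

```python
import random


def simulate(n, mean, sd, rng):
    """Draw n biomarker values from a normal distribution."""
    return [rng.gauss(mean, sd) for _ in range(n)]


def sens_spec(healthy, diseased, cutoff):
    """Sensitivity and specificity when values above `cutoff` are (+)."""
    true_pos = sum(x > cutoff for x in diseased)   # diseased with (+)
    true_neg = sum(x <= cutoff for x in healthy)   # healthy with (-)
    return true_pos / len(diseased), true_neg / len(healthy)


rng = random.Random(1)  # fixed seed so the run is repeatable
healthy = simulate(500, 100.0, 15.0, rng)    # assumed healthy distribution
diseased = simulate(500, 120.0, 15.0, rng)   # assumed higher mean in disease

for cutoff in (95.0, 110.0, 125.0):
    sens, spec = sens_spec(healthy, diseased, cutoff)
    print(f"cutoff={cutoff:.0f}  sensitivity={sens:.3f}  specificity={spec:.3f}")
```

Raising the diagnostic point can only remove positives, so sensitivity never increases and specificity never decreases as the cutoff moves up.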
Data are randomly generated as two samples of biomarker values, one for each population. In each case, the AUC is computed to assess test performance.
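One way to compute the AUC from two samples is to use its probabilistic interpretation: the AUC equals the probability that a randomly chosen diseased person has a higher biomarker value than a randomly chosen healthy person, with ties counted as one half. The sketch below applies this by comparing every pair; the exercise may compute the AUC differently, and the function name `auc` and the distribution parameters are assumptions.

```python
import random


def auc(healthy, diseased):
    """AUC as the fraction of (diseased, healthy) pairs ranked correctly.

    Ties count as half a correct pair. Hypothetical helper, O(n * m).
    """
    score = 0.0
    for d in diseased:
        for h in healthy:
            if d > h:
                score += 1.0
            elif d == h:
                score += 0.5
    return score / (len(healthy) * len(diseased))


rng = random.Random(2)  # fixed seed so the run is repeatable
healthy = [rng.gauss(100.0, 15.0) for _ in range(300)]

# The further apart the assumed means, the closer the AUC gets to 1.
for disease_mean in (100.0, 115.0, 145.0):
    diseased = [rng.gauss(disease_mean, 15.0) for _ in range(300)]
    print(f"disease mean={disease_mean:.0f}  AUC={auc(healthy, diseased):.3f}")
```

When both samples come from the same distribution, the AUC should fall near 0.5, the coin-toss level; it approaches 1 as the two distributions separate.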