Sample statistics

Mean and median

The mean of a sample $(x_1,x_2,\ldots,x_n)$ of size $n$ of a quantitative variable is computed as: $$\bar{x}=\frac{\sum_{\forall i}x_i}{n}$$ The median is the value that splits the sample in two, so that 50% of the observations are below the median and the other 50% are above it. In this exercise you can compare the mean and the median of a sample, along with the different statistics related to a boxplot. The residual for each observation is the value $(x_i-\bar{x})$. Here, the residual for the minimum value is shown. In the variance option, we will examine this concept for all the points in the sample.
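As a minimal sketch of these definitions, the snippet below computes the mean, the median, and the residual of the minimum value using Python's standard `statistics` module. The sample values are hypothetical and chosen only for illustration; they are not data from the application.

```python
import statistics

# Hypothetical sample (illustrative values, not taken from the application)
sample = [1.0, 2.0, 3.0, 4.0, 10.0]

mean = statistics.mean(sample)      # sum of the values divided by n
median = statistics.median(sample)  # middle value of the sorted sample

# Residual of the minimum value: its deviation from the sample mean
residual_min = min(sample) - mean

print("mean:", mean)
print("median:", median)
print("residual of the minimum:", residual_min)
```

With a single large value in the sample, the mean is pulled above the median, which is the kind of difference the boxplot comparison is meant to make visible.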

Meaning of the variance

The sample variance is computed as: $$var(x)=\frac{\sum_{\forall i} (x_i-\bar{x})^2}{n-1}$$ The term $(x_i-\bar{x})$ is known as the residual and measures the deviation of an observation with respect to the mean of the sample. The variance is therefore the average of the squares of these residuals, except that the sum is divided by $n-1$ instead of $n$. In this application you can fix the population mean and standard deviation and obtain a sample, assuming the biomarker values are normally distributed. In the panel, you can select a data point and see its residual value.
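The variance formula can be sketched directly from the residuals. In the snippet below, `sample_variance` is a hypothetical helper name, and the population mean, standard deviation, sample size, and random seed are assumed values for illustration only. The result is compared against `statistics.variance`, which also uses the $n-1$ divisor.

```python
import random
import statistics

def sample_variance(xs):
    """Sum of squared residuals divided by n - 1."""
    mean = sum(xs) / len(xs)
    residuals = [x - mean for x in xs]  # deviation of each point from the mean
    return sum(r * r for r in residuals) / (len(xs) - 1)

# Small hypothetical sample whose residuals are easy to check by hand
small_sample = [1.0, 2.0, 3.0, 4.0, 10.0]
print(sample_variance(small_sample))

# Assumed population parameters; the sample is drawn from a normal distribution
population_mean, population_sd = 3.0, 0.5
rng = random.Random(0)  # fixed seed so the run is repeatable
normal_sample = [rng.gauss(population_mean, population_sd) for _ in range(20)]

# Both values should agree up to floating-point rounding
print(sample_variance(normal_sample), statistics.variance(normal_sample))
```

For the small sample, the residuals are $-3, -2, -1, 0, 6$, so the sum of squares is 50 and the variance is $50/4 = 12.5$.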

Observation number

Quantiles and percentiles

The percentile $x_q$ is the value of the variable for which $q\%$ of the values are below $x_q$. For instance, if in a population 95% of the males have a biomarker value below 3.24 mg/ml, then this value is the 95th percentile for this biomarker. The quartiles are the 25%, 50%, and 75% percentiles. In this exercise, we show the sample quartiles as computed in the boxplot. We also include a statistical summary of the groups that result from dividing the sample by quartiles. Finally, you can obtain a given percentile for the sample using 9 different methods. For small samples, the different methods produce different results.
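To see how the choice of method affects the result in a small sample, the sketch below computes the 25%, 50%, and 75% percentiles with the two methods offered by Python's `statistics.quantiles` (`"exclusive"` and `"inclusive"`). These two are used only as an illustration; they are not claimed to be the methods the application uses, and the sample values are hypothetical.

```python
import statistics

# Hypothetical small sample (illustrative values only)
sample = [1.0, 2.0, 3.0, 4.0, 10.0]

# n=4 asks for the three cut points that split the data into four groups,
# i.e. the 25%, 50%, and 75% percentiles.
q_exclusive = statistics.quantiles(sample, n=4, method="exclusive")
q_inclusive = statistics.quantiles(sample, n=4, method="inclusive")

print("exclusive:", q_exclusive)
print("inclusive:", q_inclusive)

# n=100 gives the 1%..99% percentiles; index 94 is the 95% percentile.
p95_exclusive = statistics.quantiles(sample, n=100, method="exclusive")[94]
p95_inclusive = statistics.quantiles(sample, n=100, method="inclusive")[94]
print("95% percentile:", p95_exclusive, p95_inclusive)
```

With only five observations the two methods agree on the median but disagree on the other percentiles, because each interpolates between neighbouring observations in a different way.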

Desired percentile

Reference intervals

A reference interval of probability $(1-\alpha)$ is an interval $(a,b)$ defined by: $$P(X\leq a)=\alpha/2$$ $$P(X\leq b)=1-\alpha/2$$ Thus, $a$ is the $(\alpha/2)$ percentile, and $b$ is the $(1-\alpha/2)$ percentile. For example, with $\alpha=0.05$, $a$ is the 2.5% percentile and $b$ is the 97.5% percentile. In this exercise, you can obtain the reference interval estimated from the sample using 9 different methods. The actual values, assuming the sample comes from a normal distribution, are shown in blue. The confidence interval for the reference limits can also be plotted. For small samples, the results are not precise. Try increasing the sample size to obtain a good result. If the intervals are not shown, you should change the range of the plot using the slider.
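The definition above can be sketched in two ways: the limits implied by a normal distribution with known parameters, and limits estimated from a sample. Both function names below are hypothetical, as are the population parameters, sample size, and seed. The sample estimate uses the `"inclusive"` method of `statistics.quantiles`, which is one possible choice and not necessarily one the application offers.

```python
import random
import statistics

def normal_reference_interval(mu, sigma, alpha):
    """Limits a, b with P(X <= a) = alpha/2 and P(X <= b) = 1 - alpha/2."""
    dist = statistics.NormalDist(mu, sigma)
    return dist.inv_cdf(alpha / 2), dist.inv_cdf(1 - alpha / 2)

def sample_reference_interval(xs, alpha):
    """Estimate the limits from a sample.

    Assumes 2/alpha is an integer (true for alpha = 0.1, 0.05, 0.01), so the
    first and last cut points fall at alpha/2 and 1 - alpha/2.
    """
    n_groups = round(2 / alpha)
    cuts = statistics.quantiles(xs, n=n_groups, method="inclusive")
    return cuts[0], cuts[-1]

# Assumed population parameters and sample size, for illustration only
mu, sigma, alpha = 3.0, 0.5, 0.05
rng = random.Random(1)  # fixed seed so the run is repeatable
sample = [rng.gauss(mu, sigma) for _ in range(200)]

print("normal theory:", normal_reference_interval(mu, sigma, alpha))
print("sample estimate:", sample_reference_interval(sample, alpha))
```

Rerunning the sketch with a smaller sample size typically moves the estimated limits further from the normal-theory values, which mirrors the advice to increase the sample size.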

Probability of the interval: 0.9, 0.95, 0.99

Show CI for reference limits?

Select number of points

Population parameters

(c) Albert Sorribas, Ester Vilaprino, Montse Rue, Rui Alves. Biomodels Grup, Departament de Ciencies Mediques Basiques. Universitat de Lleida, Institut de Recerca Biomedica de Lleida (IRBLleida)