Whether you are a site or Investigator just starting out in clinical trials or already running a steady volume of studies, it is worthwhile to develop an appreciation of the statistical principles used to interpret trial results and to evaluate the safety and effectiveness of the health care interventions you administer.

Besides managing a site, conducting the visits and looking after the participants, site personnel have an opportunity to gain meaningful insight into the interpretation of clinical trial data, as well as an appreciation of the considerations required to reach appropriate decisions about progressive evaluation of interventions and their potential adoption into standard clinical practice.

With this in mind, it can be advantageous for clinical trial coordinators and Investigators to understand some of the summary statistics commonly used to describe certain types of clinical data and to estimate treatment effects, as well as the role of p-values and confidence intervals. Although these may be a little confusing to begin with, they provide an additional foundation for judging whether what we are doing is statistically and clinically meaningful (or not).

Clinical trials are prospective studies designed to discover the effects of interventions on the health of humans (Martin & Stockler, 2016). Understanding a clinical trial therefore requires an evaluation of its methods and an interpretation of its findings before an overall judgement can be made about what it means. With this in mind, the significance of a trial for site and patient management includes not only its results, but also the conclusions that can be drawn from these results and the implications these have for subsequent research, practice and policy. A significant component of the trial appraisal process is accordingly assessing the alignment between the study question (the aim of the trial), the study’s methods (design and conduct), and the interpretation of the study’s findings (are the inferences, conclusions, and recommendations justified?).

To be able to thoroughly interpret a clinical trial it is handy to first acknowledge the Phases of clinical development. Briefly:

  • The objective of a Phase 1 trial is to determine the feasibility and safety of intervention/treatment regimens for subsequent evaluation.
  • The objective of a Phase 2 trial is to determine if an intervention has sufficient activity and an acceptable safety profile to justify undertaking a larger, more definitive trial.
  • Phase 3 trials aim to determine the role of an intervention in clinical practice by comparing it to the best available standard therapy.

The ideal overall design of a clinical trial, including its population, objectives, endpoints and subsequent analyses, differs for each of these phases. It is fair to say that both Phases 1 and 2 are conceived to guide further research, whereas Phase 3 studies are primarily designed to inform or guide clinical practice.

Questions about further research into an intervention are important, but are of less consequence than questions about changes to conventional practice. The clinical and statistical evidence required to decide how research should proceed in a limited study sample and a highly regulated trial context is less than that needed to determine how all subsequent patients should be treated in conventional clinical practice. Careful appraisal of the evidence from a clinical trial to inform future decisions therefore requires a full understanding of the trial’s study population, the balance and composition of its control (reference) groups, and the selection of comparator interventions/treatments and endpoints.


The types of data collected in a clinical trial can be typically classified as:

Continuous – values or numbers that fall on a continuum such as height, blood pressure or weight, including Time to Event (TTE) data.

  • Mean and median are the most common measures of central tendency for continuous data.
  • TTE data may be presented in graphical format (i.e. a survival curve) and/or as estimates of the median TTE.
  • For continuous data, the difference between two treatments is usually summarized as difference between two means.
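
As a rough illustration of the summaries described above, the sketch below computes the mean and median for each arm and the difference between the two means. The blood pressure readings are invented for the example and do not come from any trial.

```python
from statistics import mean, median

# Hypothetical end-of-study systolic blood pressure readings (mmHg);
# these values are made up purely for illustration.
treatment = [128, 131, 125, 140, 122, 135, 129]
control = [138, 142, 135, 150, 133, 157, 139]


def summarise(values):
    """Return the two most common measures of central tendency."""
    return {"mean": mean(values), "median": median(values)}


treatment_summary = summarise(treatment)
control_summary = summarise(control)

# Treatment effect for continuous data: difference between the two means
mean_difference = treatment_summary["mean"] - control_summary["mean"]

print(treatment_summary, control_summary, mean_difference)
```

A negative difference here would mean lower average blood pressure in the treatment arm; whether that difference reflects a true effect is a separate question.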

Categorical – quantities that take one of a number of categories. Categorical data can be further classified as ordinal (e.g. pain severity: mild, moderate, and severe), nominal (e.g. pain quality: sharp, dull, aching, throbbing, or gnawing) or binary (e.g. yes/no).

  • Percentages are often used to summarise categorical data (e.g. what percentage of participants responded to treatment X?).
  • For categorical data where the outcome is binary (response/non-response) there are a number of ways to summarise a treatment difference, such as Odds Ratio, Relative Risk, Relative Risk Reduction, and Number Needed to Treat.
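
The measures listed for binary outcomes can all be derived from the same four counts. The sketch below assumes the "event" is an undesirable outcome (for example non-response), uses invented counts, and uses a hypothetical function name. It also reports the absolute risk reduction, which is not named above but is the quantity from which Number Needed to Treat is calculated.

```python
def binary_effect_measures(events_treat, n_treat, events_ctrl, n_ctrl):
    """Summarise a treatment difference for a binary outcome.

    Assumes the event is undesirable and that the two risks differ
    (Number Needed to Treat is undefined when they are equal).
    """
    risk_treat = events_treat / n_treat
    risk_ctrl = events_ctrl / n_ctrl

    relative_risk = risk_treat / risk_ctrl
    odds_treat = risk_treat / (1 - risk_treat)
    odds_ctrl = risk_ctrl / (1 - risk_ctrl)
    absolute_risk_reduction = risk_ctrl - risk_treat

    return {
        "relative_risk": relative_risk,
        "relative_risk_reduction": 1 - relative_risk,
        "odds_ratio": odds_treat / odds_ctrl,
        "absolute_risk_reduction": absolute_risk_reduction,
        "number_needed_to_treat": 1 / absolute_risk_reduction,
    }


# Hypothetical counts: 10 of 100 events on treatment, 20 of 100 on control
measures = binary_effect_measures(10, 100, 20, 100)
for name, value in measures.items():
    print(f"{name}: {value:.3f}")
```

With these made-up counts the relative risk is 0.5 while the absolute risk reduction is 0.1, which shows how the same data can look quite different on a relative and an absolute scale.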

The key characteristics of clinical data can be summarised with descriptive statistics, which support interpretation of study results and provide evidence of effect. Relative measures of treatment effect (Relative Risk, Relative Risk Reduction) can be thought of as characterising a property of an intervention, whereas an absolute measure of treatment effect reflects the benefit of the treatment to the individual. Even though a particular summary statistic may numerically favour one treatment arm over another, this does not guarantee that a true treatment effect actually exists.

An alternative explanation for results like this is bias. Bias in this context signifies something unrelated to the treatment has had a systematic effect on results. Another possible reason is chance. Differences between study groups or interventions in a clinical trial may be interpreted as evidence of a true treatment effect provided that bias and chance can be ruled out as plausible explanations.

In this regard, creating and adhering to robust study designs and meticulous study conduct procedures are essential approaches to protect against bias and data errors. Principles of statistical inference are used to quantify whether the observed study results could be attributed to chance.


The probability of chance alone producing a result at least as extreme as that observed, when in reality no treatment effect exists, is quantified by a statistic known as the p-value. P-values and point estimates do not supply all the information required to fully interpret the statistical outcomes of a clinical trial. These statistics do not reveal how big or small real differences could be, nor what range of treatment differences is consistent with the data collected from the trial. These further questions can be addressed by calculating a confidence interval.
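
To make the definition of a p-value more concrete, here is a minimal sketch of a two-sided test comparing event rates in two arms. It assumes a normal approximation (a two-proportion z-test), which is only one of several possible tests; the function name and the counts are hypothetical, and a real trial would use the method specified in its statistical analysis plan.

```python
from math import sqrt
from statistics import NormalDist


def two_proportion_p_value(events_a, n_a, events_b, n_b):
    """Two-sided p-value for a difference in proportions (normal approximation)."""
    p_a = events_a / n_a
    p_b = events_b / n_b
    # Under the assumption of no treatment effect, both arms share one event rate
    pooled = (events_a + events_b) / (n_a + n_b)
    std_error = sqrt(pooled * (1 - pooled) * (1 / n_a + 1 / n_b))
    z = (p_a - p_b) / std_error
    # Probability of a result at least this extreme, in either direction
    return 2 * (1 - NormalDist().cdf(abs(z)))


# Hypothetical counts: 10 of 100 events on treatment, 20 of 100 on control
p_value = two_proportion_p_value(10, 100, 20, 100)
print(f"p = {p_value:.3f}")
```

For these invented counts the p-value falls a little under 0.05, but on its own it says nothing about how large the underlying difference might be.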

Confidence Interval:

Confidence intervals can be calculated at various levels of confidence; the most commonly used is the 95% confidence interval (95% CI). One interpretation of a confidence interval, although not quite technically correct, is that it contains the range of plausible values for the true treatment effect, where the 95% reflects the level of plausibility. A more technically correct interpretation is that if the current clinical trial were repeated many times with the exact same design under identical conditions, 95% of these studies would produce confidence intervals that include the true treatment effect. The implication, when the interval excludes the value representing no treatment effect, is that chance alone is not a likely explanation for the results that were detected.
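
As an illustration, the sketch below calculates a 95% confidence interval for the difference in event rates between two arms. It assumes a simple normal-approximation (Wald) interval, which is one of several available methods and can be unreliable with small samples or rare events; the function name and counts are hypothetical.

```python
from math import sqrt
from statistics import NormalDist


def risk_difference_ci(events_treat, n_treat, events_ctrl, n_ctrl, level=0.95):
    """Point estimate and Wald confidence interval for a risk difference."""
    risk_treat = events_treat / n_treat
    risk_ctrl = events_ctrl / n_ctrl
    estimate = risk_treat - risk_ctrl
    std_error = sqrt(
        risk_treat * (1 - risk_treat) / n_treat
        + risk_ctrl * (1 - risk_ctrl) / n_ctrl
    )
    # Multiplier for the chosen confidence level (about 1.96 for 95%)
    z = NormalDist().inv_cdf(1 - (1 - level) / 2)
    return estimate, estimate - z * std_error, estimate + z * std_error


# Hypothetical counts: 10 of 100 events on treatment, 20 of 100 on control
estimate, lower, upper = risk_difference_ci(10, 100, 20, 100)
print(f"risk difference {estimate:.3f} (95% CI {lower:.3f} to {upper:.3f})")
```

With these made-up counts the interval runs from roughly -0.198 to -0.002. It only just excludes zero, so the data are consistent with anything from a sizeable benefit to a negligible one.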

Accordingly, if a 95% confidence interval excludes the value representing no treatment effect (for example, a difference of zero or a ratio of one), we would expect the p-value to be below 0.05 and the result to be considered statistically significant. It is important to remember that a statistically significant treatment effect does not necessarily mean that the results are clinically meaningful: the point estimate and confidence interval for the treatment effect may lie in a region that is considered clinically insignificant.
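
The distinction between statistical and clinical significance can be sketched as a simple check on a confidence interval. The threshold used here (a minimal clinically important difference of 5 units) and the intervals are hypothetical, and the rule is deliberately simplified: it treats a result as clinically meaningful only when the whole interval lies at least that far from no effect.

```python
def interpret_interval(lower, upper, no_effect=0.0, mcid=5.0):
    """Classify a confidence interval for a treatment difference.

    mcid is a hypothetical minimal clinically important difference,
    expressed as a distance from the no-effect value.
    """
    # Statistically significant if the interval excludes the no-effect value
    excludes_null = not (lower <= no_effect <= upper)
    # Distance from no effect of the least favourable end of the interval
    closest = min(abs(lower - no_effect), abs(upper - no_effect))
    clinically_meaningful = excludes_null and closest >= mcid
    return excludes_null, clinically_meaningful


# Hypothetical 95% CIs for a difference in mean blood pressure (mmHg)
print(interpret_interval(-3.1, -0.4))  # significant, but below the threshold
print(interpret_interval(-9.0, -6.0))  # significant and beyond the threshold
print(interpret_interval(-4.0, 2.0))   # interval includes no effect
```

In practice the clinically important difference is a judgement made for each condition and endpoint, and intervals that straddle the threshold are usually described as inconclusive rather than forced into one of two categories.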

Of note, estimates from a single-arm trial do not provide a direct assessment of the treatment effect, because there is no comparator arm. Any inference about treatment effects requires some form of comparison between comparable groups.

A complete statistical analysis should ideally provide a point estimate for the effect (the best single estimate available of the true effect), a p-value and a confidence interval. An appreciation of these statistical principles, data types and summary measures can be very useful when learning to both analyse and interpret trial results.


Martin, A., & Stockler, M. (2016). Interpreting Clinical Trial Results – An Introduction. University of Sydney.