
Principles of Biostatistics

Subject: Health | Wordcount: 3097 words | Published: 21st Sep 2017

Even though it is common practice to dichotomize continuous variables prior to analysis, it is not always a good idea. Dichotomization of an outcome variable is a process in which researchers convert a numeric (in our case: continuous) variable into a dichotomous variable by splitting the scale of the variable’s values at some point, for example into High and Low. This cut-off point may be the sample median, but it may also be some other fixed point, such as the mean, a standard deviation above or below the mean, or any point the researchers choose for some reason. The most common rationale for dichotomizing is the claim that it simplifies the interpretation and the analysis, or that the researchers have reason to believe they are dealing with heavily skewed or nonlinear data.


In our case, we want to design a clinical trial to evaluate the ability of two treatment regimens to reduce cholesterol. The interpretation of the outcome variable, the decrease in cholesterol, would be very different if we treat it as dichotomous rather than continuous. If we were only interested in whether or not the two treatments reduce cholesterol beyond a certain cut-off point, we could treat the outcome variable as dichotomous. In this case, two categories would be combined into one variable, for example high cholesterol/low cholesterol. The cut-off point could, for example, be a decrease in cholesterol of 50 mg/dL: below this level we would consider the result “Unfavorable” (a small decrease), at or above it “Favorable” (a substantial decrease). The results would be easy to interpret, as we are only interested in whether or not the treatment brings cholesterol to a Low versus a High level, and we could express them as proportions or odds ratios.

It would tell us little, however, about the extent to which the treatment lowers cholesterol, by what amounts, and what the individual differences are. In fact, we would be saying that there is no difference between a decrease in cholesterol level of 60 and one of 80, nor between decreases of 20 and 45, while at the same time making a huge distinction between decreases of 50 and 51. This is not logical; in other words, where interpretation may seem easier, there is a trade-off, because we simultaneously lose a lot of information if we treat the outcome as dichotomous.

Another argument mentioned above is that dichotomization would simplify the analysis. This could be interpreted as desirable, but also (primarily) as undesirable. A binary split (for example, at the median) leads to a comparison of groups of individuals with high or low values of cholesterol, leading in the simplest case to a t-test or χ2 test and an estimate of the difference between the groups (with its confidence interval). [1]

Some authors have suggested, however, that there is in general no good reason to suppose that an underlying dichotomy exists, and that if one does exist, there is no reason why it should be at the median. [2] Treating the decrease in cholesterol as dichotomous would also ignore the issue of measurement error. Every observed cholesterol score, a numeric measure in our study, is made up of two parts: a “true” score, which is never seen, plus some error. The more reliable the scale, the smaller the error and the closer the observed score is to the true score. But since no measurement has a reliability of 1.00, every score has some degree of error associated with it; we assume the errors are random with a mean of 0. [3] If we treat cholesterol levels as numbers along a continuum, measurement error may misplace a person to some degree; but if we dichotomize, that same error can put the person in the wrong group altogether. Treating the decrease in cholesterol as dichotomous therefore carries a consequent risk of a Type II error, meaning that we would fail to detect real differences.
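This misclassification risk can be illustrated with a small simulation. The sketch below uses assumed values (true decreases ~ N(50, 20) mg/dL, measurement error ~ N(0, 5), cut-off at 50 mg/dL); none of these figures come from the study itself:

```python
import numpy as np

rng = np.random.default_rng(0)
n = 10_000
true = rng.normal(50, 20, n)     # unobserved "true" decreases (mg/dL)
error = rng.normal(0, 5, n)      # random measurement error, mean 0
observed = true + error          # observed score = true score + error

# On the continuum the observed scores stay close to the truth, but a
# binary split at 50 mg/dL puts some subjects in the wrong group.
misclassified = np.mean((true >= 50) != (observed >= 50))
print(f"{misclassified:.1%} of subjects land on the wrong side of the cut-off")
```

With these assumed values, the split misclassifies roughly one subject in thirteen, even though the measurement itself is fairly reliable.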

Apart from this risk of misclassification, dichotomizing also affects the statistical significance of our tests. If we treat the decrease in cholesterol as dichotomous, we lose effect size in our study population, with a corresponding effective loss in sample size, which can affect the outcome of tests of statistical significance. [2] As mentioned, dichotomization discards the detailed information we have about cholesterol levels, for example when we test the difference between group means, so the test results will be less powerful. The loss of power caused by dichotomization can alternatively be viewed as an effective loss of sample size. [4] Our cut-off point of a 50 mg/dL decrease may be clinically meaningful, but it may not be the best cut-off point statistically. In fact, previous research determined that a dichotomized outcome is at best only 67% as efficient as a continuous one; that is, if you need 50 subjects in each group to demonstrate statistical significance on a continuous scale, you would need 75 subjects per group to show the same effect after dichotomizing. [5] Moreover, the more the split deviates from 50-50, the more the correlation is reduced: by the time the division is 90-10, the correlation is reduced by 41%. [6] In other words, where the statistical analysis may seem simpler or more straightforward at first, there is a trade-off, because treating our outcome as dichotomous will decrease the statistical power of our tests, increase the probability of a Type II error, and may even induce spurious results because of sampling error.
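The power loss can be checked directly by simulation. The sketch below compares a t-test on the raw decreases with a chi-square test after a median split; the parameter values (mean decreases of 50 and 60 mg/dL, SD 20, 50 subjects per group) are illustrative assumptions, not figures taken from this passage:

```python
import numpy as np
from scipy import stats

rng = np.random.default_rng(42)

def simulate_power(n=50, mu1=50, mu2=60, sigma=20, alpha=0.05, reps=1000):
    """Monte Carlo power of (a) a t-test on the raw decreases and
    (b) a chi-square test after splitting both groups at the pooled median."""
    t_hits = chi_hits = 0
    for _ in range(reps):
        a = rng.normal(mu1, sigma, n)
        b = rng.normal(mu2, sigma, n)
        if stats.ttest_ind(a, b).pvalue < alpha:      # continuous analysis
            t_hits += 1
        cut = np.median(np.concatenate([a, b]))       # median split
        table = [[(a >= cut).sum(), (a < cut).sum()],
                 [(b >= cut).sum(), (b < cut).sum()]]
        _, p, _, _ = stats.chi2_contingency(table)    # dichotomized analysis
        if p < alpha:
            chi_hits += 1
    return t_hits / reps, chi_hits / reps

power_cont, power_dich = simulate_power()
print(power_cont, power_dich)  # the dichotomized analysis detects less often
```

Under these assumptions the continuous analysis detects the treatment difference noticeably more often than the median-split analysis, in line with the efficiency figures cited above.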

The last reason offered for dichotomizing applies when we have reason to believe that we are dealing with skewed data, or that the relation between the treatments and the decrease in cholesterol level is nonlinear. For example, after studying the diagnostic plots we may conclude that our data deviate heavily from normality (for instance, when we have many observations with a value of zero or one and a few with very high values). If transforming the outcome variable would not help, we could dichotomize. The same applies if we suspect an S-shaped relation: not linear at the extreme values, but perhaps linear around the mean. Even with these good reasons to treat the outcome as dichotomous, however, there are disadvantages. For example, we would not be able to compare our results to those of another study, because our dichotomy becomes useless if the exact cut-off point changes. In other words, where we deal with skewed data or nonlinear relationships it could be a good idea to treat the outcome as dichotomous, but there is a trade-off in comparability with other studies.

In short, the trade-offs involved in treating the outcome variable, the decrease in cholesterol, as dichotomous rather than continuous are:

-Easier to interpret => loss of information.

-Statistical analysis easier (t-test, χ2) => loss of effect size and statistical power (especially with a small sample size, which could lead to spurious study results), and an increased risk of a Type II error.

-Dichotomization can deal with skewed data or nonlinear relationships (most parametric statistical tests assume that the variables are normally distributed) => our study would not be comparable to other, similar studies of treatments and decreases in cholesterol level.


The power of a statistical test is the probability that a false null hypothesis will be rejected. In other words, it is the ability of a statistical test to avoid a Type II error. A Type II error (also known as the β-error) means that we fail to reject the null hypothesis when the alternative hypothesis is true; that is, we draw a negative conclusion about a hypothesis that is actually true. So if, in the actual situation, the null hypothesis is false and the alternative hypothesis is true, but the test result leads us to fail to reject H0, we are committing a Type II error. The power of a statistical test is thus equal to 1 - β. Studies are conventionally considered acceptable if they have a power of 0.8 or 80% (although we can of course choose different values for the calculated power).

The power of a test is influenced by several factors:

-the α-level:

This is the significance level: the upper bound for the probability that the null hypothesis is rejected when it is in fact true (Type I error).

If the α-level increases (for example, from 1% to 5%), a test’s power will also increase (power is an increasing function of the significance level), because the critical values become less extreme; we increase the size of the rejection regions and make rejection of the null hypothesis more likely. This also illustrates the trade-off between Type I and Type II errors: if we increase the α-level, we decrease the probability of a Type II error but increase the probability of a Type I error.

If α ↑ => power ↑

-the standardized effect size (effect size and variation/variability):

Effect size refers to the difference between the sample mean and the hypothesized population mean.

The larger the effect size, the more extreme the value of t will be, so power is an increasing function of the effect size.

If gap between µ0 and µ ↑ => power ↑

-the sample size:

The sample size n of a study determines how likely we are to detect a false null hypothesis, because a larger n reduces the estimated standard error Sx̄ and thus increases the calculated value of t.

Power is also an increasing function of sample size n.

If n ↑ => power ↑

-the sampling error/random error:

The sampling error reflects the variability of the individual observations. Lower variability means the sample standard deviation is smaller, so the estimated standard error Sx̄ is also lower and the calculated value of t will be more extreme. In other words, we are more likely to fall within one of the two rejection regions (left or right).

So: power is a decreasing function of the sampling error:

If sampling error ↓ => power ↑ (or, of course, the reverse)
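All four relationships above can be verified with the usual normal-approximation power formula for a two-sided, two-sample comparison of means. The baseline values below (difference 10 mg/dL, SD 20, 50 per group) are assumptions chosen to be consistent with the roughly 71% power reported later in this paper, not figures stated in this section:

```python
from scipy.stats import norm

def power_two_sample(delta, sigma, n, alpha=0.05):
    """Approximate power of a two-sided two-sample z-test for a
    difference in means delta, common SD sigma, n subjects per group."""
    z_alpha = norm.ppf(1 - alpha / 2)
    ncp = delta / (sigma * (2 / n) ** 0.5)  # standardized difference
    return norm.cdf(ncp - z_alpha) + norm.cdf(-ncp - z_alpha)

base = power_two_sample(delta=10, sigma=20, n=50)
print(round(base, 2))                                   # 0.71
print(power_two_sample(10, 20, 50, alpha=0.10) > base)  # larger alpha => more power
print(power_two_sample(15, 20, 50) > base)              # larger effect => more power
print(power_two_sample(10, 20, 85) > base)              # larger n => more power
print(power_two_sample(10, 30, 50) < base)              # more variability => less power
```

Each comparison prints True, matching the arrows above.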


=> The calculation of the sample size depends on the study design (in our case: RCT), the outcome variable (continuous or dichotomous) and the statistical test used. If we treat the outcome variable, the decrease in cholesterol, as a continuous variable, we base our analysis on means (statistical test: t-test) and to calculate the sample size, we need information about:

  • σ: the standard deviation of the outcome;
  • the difference we wish to detect, |µ0 − µ1|, or the effect size of interest (and the margin of error). We need to know the smallest effect of interest, the “minimal clinically relevant difference”, so that we can detect the minimal difference between the two groups;
  • 1 - β: the power of the test. β is the probability of concluding that there is no difference between two groups or treatments when in reality there is one (Type II error); the power (1 - β) is the probability of detecting a difference |µ0 − µ1| if it exists;
  • α: the significance level, conventionally 0.05 (α is the probability of falsely rejecting H0 – Type I error).

=> To calculate sample size for the continuous outcome variable, we must make assumptions:

  • Normal distribution: the population data are normally distributed (the CLT allows this assumption to be made, provided that random samples of sufficient size are used, n > 30);
  • Equal variances in the two groups;
  • Independent samples: the values in one sample carry no information about those in the other sample.
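Under those assumptions, the standard per-group sample size is n = 2σ²(z₁₋α/₂ + z₁₋β)² / Δ². As a sketch (σ = 20 mg/dL is an assumed value consistent with the n = 85 reported later in this paper; it is not stated in this section):

```python
import math
from scipy.stats import norm

def n_per_group_means(delta, sigma, alpha=0.05, power=0.90):
    """Sample size per group for a two-sided two-sample comparison of
    means: n = 2 * sigma^2 * (z_{1-alpha/2} + z_{1-beta})^2 / delta^2."""
    z_a = norm.ppf(1 - alpha / 2)  # 1.96 for alpha = 0.05
    z_b = norm.ppf(power)          # 1.28 for power = 0.90
    return math.ceil(2 * sigma ** 2 * (z_a + z_b) ** 2 / delta ** 2)

print(n_per_group_means(delta=10, sigma=20))  # 85 per group for 90% power
```

Rounding up with `math.ceil` ensures the computed n never falls below the target power.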

=> If we treat the outcome variable as dichotomous, we compare proportions rather than means. (If, instead, we kept the outcome continuous but could not assume normality, the nonparametric Wilcoxon rank-sum or Mann-Whitney U test would be the analogue of the parametric independent-samples t-test; to calculate the required sample size for that nonparametric analysis, we would still have to make an assumption about the distribution of the values, and depending on the actual distribution of the data we might need more or fewer patients.)

To calculate the sample size when treating the outcome variable as dichotomous, we work with an anticipated proportion rather than a standard deviation. If we have chosen a (conventional) α-level, decided on the margin of error, and have an anticipated proportion p of favorable outcomes, we can derive a z-score and calculate the sample size:

Sample Size = (z-score)² * p * (1 − p) / (margin of error)²
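For example, with the conventional α = 0.05 (z = 1.96) the formula evaluates as below; the p = 0.5 used here is the worst-case illustration (it maximizes p(1 − p)), not a value from the study:

```python
import math
from scipy.stats import norm

def n_single_proportion(p, margin, alpha=0.05):
    """Normal-approximation sample size for estimating a proportion:
    n = z^2 * p * (1 - p) / margin^2."""
    z = norm.ppf(1 - alpha / 2)  # 1.96 for alpha = 0.05
    return math.ceil(z ** 2 * p * (1 - p) / margin ** 2)

print(n_single_proportion(p=0.5, margin=0.05))  # 385, the worst case
```

A smaller anticipated p (or a wider acceptable margin of error) reduces the required n.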


At α = 0.05 (2-sided), we will be able to detect the 10-mg/dL difference between the two groups with a power of approximately 71% (0.7054). (SAS: 0.71)



At α = 0.05 (2-sided), if we want to detect the 10-mg/dL difference between the groups with a probability of 0.9, we need to enroll 85 subjects in each group. (SAS: 85)



Based on the information in Question 4:

.5000 (50%) of the Diet group will have a Favorable change in cholesterol

.6915 (69.15%) of the Diet + Exercise group will have a Favorable change in cholesterol

We will be able to detect the difference in these proportions with a power of .5000 (50%) (SAS: 0.5)
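The proportions 0.5000 and 0.6915, and the 50% power, can be reproduced under back-calculated assumptions consistent with the reported figures (mean decreases of 50 and 60 mg/dL, common SD 20, 50 subjects per group, cut-off at a 50 mg/dL decrease; these values are not stated here):

```python
from scipy.stats import norm

# Assumed, back-calculated parameters (see lead-in): not study values.
mu_diet, mu_both, sigma, n, cut = 50.0, 60.0, 20.0, 50, 50.0

p1 = norm.sf((cut - mu_diet) / sigma)   # P(decrease >= 50), Diet
p2 = norm.sf((cut - mu_both) / sigma)   # P(decrease >= 50), Diet + Exercise
print(round(p1, 4), round(p2, 4))       # 0.5 0.6915

# Power of a two-sided two-proportion z-test at alpha = 0.05
pbar = (p1 + p2) / 2
se = (pbar * (1 - pbar) * 2 / n) ** 0.5
z = abs(p2 - p1) / se
power = norm.cdf(z - norm.ppf(0.975))
print(round(power, 2))                  # 0.5
```

The same 10 mg/dL treatment difference that gives about 71% power on the continuous scale yields only about 50% power once the outcome is dichotomized.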

The proportions would look like this:

Diet: 50.00% Favorable, 50.00% Unfavorable

Diet + Exercise: 69.15% Favorable, 30.85% Unfavorable

Even though it is common practice to dichotomize continuous variables prior to analysis, we would not recommend changing the outcome variable, the decrease in cholesterol level, from a continuous measure into a binary one: Favorable versus Unfavorable, with the cut-off point at a decrease of 50 mg/dL.

As we have seen in the results of Questions 4 and 6, dichotomizing would reduce the statistical power of our test from 71% to 50%; that is, it would reduce the probability that a false null hypothesis is rejected, or the probability that we avoid a Type II error. We would thus be unlikely to demonstrate an association or causal relationship between the treatments and the decrease in cholesterol level. If we dichotomize, our study has only a 50% chance of producing a p-value that indicates a statistically significant treatment effect (whereas 80% is conventionally an acceptable level of power, so even the 71% is below that). We also lose effect size, the difference between the sample mean and the hypothesized population mean, if we dichotomize.


Beyond the loss of statistical power, Question 7 shows another loss: we are only able to draw conclusions about the cut-off point and cannot say anything about individual differences. We would be saying that there is no difference between a decrease in cholesterol level of 50 and one of 60, or between decreases of 20 and 45, even though we know such decreases would be deemed clinically significant. And since we have chosen this specific cut-off point for a decrease to count as “Favorable”, it has also become impossible to compare the results of this trial to other, similar studies.


Since we fail to reject the null hypothesis of no difference, we cannot conclude that the two treatment regimens differ: statistically, there is no significant relationship between the type of treatment received (diet or diet + exercise) and a Favorable decrease in cholesterol level (chi-square with one degree of freedom = 3.4048, p = 0.0650).
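The reported p-value follows directly from the quoted chi-square statistic; as a quick check:

```python
from scipy.stats import chi2

# p-value implied by the reported chi-square statistic with 1 df
stat = 3.4048
p_value = chi2.sf(stat, df=1)  # survival function: P(X^2 >= stat)
print(round(p_value, 4))       # 0.065, i.e. the reported p = 0.0650
```

Since 0.0650 exceeds the chosen α of 0.05, we fail to reject the null hypothesis.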

Note that our results are in line with the SAS output of the chi-square test. We do not look at the results of Fisher’s exact test, because none of the cells in the study results table has an expected frequency smaller than 5; the chi-square test assumes that each cell has an expected frequency of five or more, and Fisher’s exact test is preferred only when that assumption fails.


[1] Altman DG. The cost of dichotomizing continuous variables. BMJ 2006;332:1080.

[2] MacCallum RC, Zhang S, Preacher KJ, Rucker DD. On the practice of dichotomization of quantitative variables. Psychological Methods 2002;7(1):19-40.

[3] Streiner DL. Breaking up is hard to do: the heartbreak of dichotomizing continuous data. Can J Psychiatry 2002;47:262-266.

[4] Cohen J. The cost of dichotomization. Applied Psychological Measurement 1983;7:249-253.

[5] Suissa S. Binary methods for continuous outcomes: a parametric alternative. J Clin Epidemiol 1991;44:241-248.

[6] Hunter JE, Schmidt FL. Dichotomization of continuous variables: the implications for meta-analysis. J Appl Psychol 1990;75:334-349.


