Case-control studies are one type of epidemiological study design. They are used to identify factors that may contribute to a medical condition by comparing a group of patients who have that condition with a group of patients who do not.

Case-control studies are a relatively inexpensive and frequently used type of epidemiological study. They can be carried out by small teams or individual researchers in single facilities, which is often not possible for more structured trials. They have pointed the way to a number of important discoveries and advances, but their very success has led some to place excessive faith in them, to the point where their credibility has been significantly undermined. This is largely the result of misconceptions about the nature of such studies, which are particularly widespread in the medical community.

The great triumph of the case-control study was the demonstration of the link between tobacco smoking and lung cancer by Sir Richard Doll and others after him. Doll was able to show a statistically significant association between the two in a large case-control study. Skeptics, usually backed by the tobacco industry, argued (correctly) for many years that this type of study cannot prove causation, but the eventual results of prospective cohort studies confirmed the causal link that the case-control studies had suggested, and it is now accepted that tobacco smoking is the cause of about 87% of all lung cancer mortality in the US.


Case-control studies

While the 'gold standard' of study design is the double-blind randomized controlled trial, such trials cannot ethically be used to evaluate the effects of toxic substances, since that would mean deliberately exposing participants to harm. Studying infrequent events such as death from cancer using randomized clinical trials or other controlled prospective studies requires that large populations be tracked for lengthy periods to observe disease development. In the case of lung cancer this could involve 20 to 40 years, longer than the careers of most epidemiologists. In addition, these studies, which generally rely on government funding, are unlikely to be supported when only a small fraction of the study population is expected to develop the disease. Case-control studies instead start with patients who already have a disease or other condition and look back to see whether their characteristics differ from those of people who do not have the disease.

The case-control study provides a cheaper and quicker study of risk factors; if the evidence found is convincing enough, then resources can be allocated to more "credible" and comprehensive studies.

Study methodology

Comparison with cross-sectional studies

Cross-sectional studies (usually from "snapshot" surveys), sometimes called prevalence studies, can frequently be carried out on pre-existing data, such as that collected by the Census Bureau or the Centers for Disease Control. Such studies can cover study groups as large as the entire population of the United States; others are small and geographically limited.

Cross-sectional studies can contain individual-level data (one record per individual, as in national health surveys). Others convey only group-level information: no individual records are available to the researcher, and the data are instead aggregated by group, for example by zip code, urban zone, state or province, or country. Census data, for instance, are normally released only in aggregated form, because private information must be protected.

Cross-sectional studies have two main limitations. First, although they confirm, for example, that people who consume large amounts of alcohol also show high rates of many other diseases, they cannot determine with certainty which variable is the cause and which is the effect. They show only that the variables were "associated" at some point in time; the order of events is not established.

Second, a problem arises if the study gathers information only at the group level (or the administrator of the survey releases only group-level data), an arrangement known as an ecological design. In that case, the information needed to assess the contributions of other variables may not be accessible. For instance, high alcohol consumption is also associated with improper nutrition and hygiene, high rates of smoking and abuse of illegal drugs, and many other risk factors for disease. Cross-sectional studies of this kind cannot differentiate between these possible causes, but case-control studies can determine that gastrointestinal bleeding, say, is directly associated with high alcohol consumption, whereas memory deterioration is more associated with improper nutrition among alcoholics.

The advantage of case-control studies over cross-sectional studies that contain only group-level information, then, is the ability to determine the association between potential cause and effect on an individual basis. In such a cross-sectional study, individual variables are aggregated over the population as a whole, and an association is then sought between the aggregated variables. These limitations do not apply to case-control studies containing individual-level data, where the "ecological fallacy" is not present. A cross-sectional design with individual-level data also allows computation of absolute risks (from prevalence) and of relative risks or odds ratios.

In the case-control study, the association is determined for each individual case-control pair, then aggregated. This provides a more specific analysis of the possible associations, and potentially determines more accurately which possible causes are directly related to the effect being studied, and which are merely related by a common cause.
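The pair-by-pair analysis described above can be sketched for the simplest situation, in which each case is matched to exactly one control. This is a minimal illustration, not a full analysis: the function name `matched_pair_odds_ratio` and all pair counts are hypothetical. For 1:1 matched pairs, the usual odds ratio estimate depends only on the discordant pairs, that is, pairs in which exactly one member was exposed.

```python
from collections import Counter


def matched_pair_odds_ratio(pairs):
    """Odds ratio for 1:1 matched case-control pairs.

    Each pair is a tuple (case_exposed, control_exposed). Pairs in which
    both or neither member was exposed do not affect the estimate; the
    result is the number of pairs where only the case was exposed divided
    by the number of pairs where only the control was exposed.
    """
    counts = Counter(pairs)
    case_only = counts[(True, False)]
    control_only = counts[(False, True)]
    if control_only == 0:
        raise ValueError("no pairs with only the control exposed; odds ratio undefined")
    return case_only / control_only


# Hypothetical study of 100 matched pairs (invented counts, for illustration only).
pairs = (
    [(True, True)] * 12      # both exposed
    + [(True, False)] * 30   # only the case exposed
    + [(False, True)] * 10   # only the control exposed
    + [(False, False)] * 48  # neither exposed
)

pair_or = matched_pair_odds_ratio(pairs)
print(pair_or)  # 3.0
```

With these invented counts, exposure was three times as common among cases as among their matched controls in the discordant pairs, giving an odds ratio of 3.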

One benefit of cross-sectional studies is that they are "hypothesis generating": they often provide clues to exposure/disease relationships, which can then be investigated with other designs such as case-control studies, cohort studies, or sometimes even randomized trials.

Problems with case-control studies

One major disadvantage of case-control studies is that they do not give any indication of the absolute risk of the factor in question. For instance, a case-control study may tell you that a certain behavior may be associated with a tenfold increased risk of death as compared with the control group. Although this sounds alarming, it would not tell you that the actual risk of death would change from one in ten million to one in one million, which is quite a bit less alarming. For that information, data from outside the case-control study must be consulted.
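The arithmetic behind this example can be made explicit. The sketch below uses the figures from the paragraph above (a baseline risk of one in ten million and a tenfold relative risk); the function name `absolute_risks` is hypothetical. It assumes the baseline risk is already known from an outside source, since the case-control study itself does not supply it.

```python
from fractions import Fraction


def absolute_risks(baseline_risk, relative_risk):
    """Return (risk in the exposed group, excess risk due to exposure).

    baseline_risk must come from data outside the case-control study;
    relative_risk is the ratio reported by the study.
    """
    exposed_risk = baseline_risk * relative_risk
    return exposed_risk, exposed_risk - baseline_risk


baseline = Fraction(1, 10_000_000)  # one in ten million, as in the text
exposed, excess = absolute_risks(baseline, 10)

print(exposed)  # 1/1000000
print(excess)   # 9/10000000
```

A tenfold relative risk here corresponds to fewer than one additional death per million people exposed, which is why the relative figure alone can mislead.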

Another problem is that of confounding. The nature of case-control studies is such that it is difficult, often impossible, to separate the chooser from the choice. For example, studies of road accident victims found that those wearing seat belts were 80% less likely to suffer serious injury or death in a collision, but data comparing rates for those collisions involving two front-seat occupants of a vehicle, one belted and one unbelted, show a measured efficacy only around half that. Many case-control studies have shown a link between bicycle helmet use and reductions in head injury, but long-term trends - including from countries which have substantially increased helmet use through compulsion - show no such benefit. Analysis of the studies shows substantial differences between the 'case' and 'control' populations, with much of the measured benefit being due to fundamental differences between those who choose to wear helmets voluntarily and those who do not.

More controversially, a significant number of case-control studies identified a link between combined hormone replacement therapy (HRT) and reductions in incidence of coronary heart disease (CHD) in women. Credible mechanisms were advanced as to why this might be, and a consensus arose that HRT was protective against CHD (e.g. Estrogen replacement therapy and coronary heart disease: a quantitative assessment of the epidemiological evidence. Stampfer M, Colditz G. Int Jour Epid 2004;33:445-53). The evidence was sufficiently compelling that a full clinical trial was initiated. The trial indicated that the effect was both far smaller and in the opposite direction: combined HRT showed a small but significant increase in risk of CHD in the study population. Subsequent analysis has shown that the women opting for HRT were predominantly from higher socio-economic groups and therefore had, on average, better diet and exercise habits. The studies had falsely attributed the benefits of these confounding factors to the intervention itself (see The hormone replacement - coronary heart disease conundrum: is this the death of observational epidemiology? Lawlor DA, Smith GD & Ebrahim S, International Journal of Epidemiology, 2004;33:464-467). There have been similar controversies regarding links between vitamins and cancer; MMR and autism; antibiotics and asthma; and cannabis and psychosis. All of these were identified through small-scale case-control studies but fail to show any effect in whole-population time series or other investigations.
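How a confounder such as socio-economic group can manufacture an apparent protective effect can be shown with a small numerical sketch. All counts below are invented for illustration and are not taken from any HRT study; the function names are hypothetical. The pooled (crude) odds ratio is compared with the Mantel-Haenszel odds ratio, a standard way of combining stratum-specific 2x2 tables.

```python
from fractions import Fraction


def odds_ratio(a, b, c, d):
    """Odds ratio for one 2x2 table.

    a = exposed cases, b = unexposed cases,
    c = exposed controls, d = unexposed controls.
    """
    return Fraction(a * d, b * c)


def mantel_haenszel_or(strata):
    """Mantel-Haenszel odds ratio pooled over a list of (a, b, c, d) strata."""
    numerator = sum(Fraction(a * d, a + b + c + d) for a, b, c, d in strata)
    denominator = sum(Fraction(b * c, a + b + c + d) for a, b, c, d in strata)
    return numerator / denominator


# Hypothetical counts (invented): exposure is common and disease is rare in
# the higher socio-economic stratum; the reverse holds in the lower stratum.
strata = [
    (20, 10, 80, 40),  # higher socio-economic group
    (10, 60, 10, 60),  # lower socio-economic group
]

# Crude analysis: ignore the strata and add the tables together.
crude = odds_ratio(*(sum(column) for column in zip(*strata)))
adjusted = mantel_haenszel_or(strata)

print(crude)     # 10/21
print(adjusted)  # 1
```

In this invented data set the exposure has no effect within either stratum (both stratum odds ratios equal 1), yet the crude analysis gives an odds ratio of about 0.48, an apparent halving of risk that is entirely due to the confounder. Stratification helps only when the confounder has been measured, which is one reason results like these can survive in published studies.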

A comparison with the tobacco/cancer link is instructive. Here the case-control studies pointed the way, but further confirmation was available in the form of time series showing rates of lung cancer tracking levels of smoking in whole populations, and in the form of laboratory experiments on animals.

Recent research has shown that a substantial majority of highly cited case-control studies are subsequently contradicted, or found to have substantially overstated their results, when more rigorous investigations are conducted.

As a result the following guidelines have been proposed when assessing case-control evidence [1]:

  • Do not turn a blind eye to contradiction. Do not ignore contradictory evidence but try to understand the reasons behind the contradictions.
  • Do not be seduced by mechanism. Even where a plausible mechanism exists, do not assume that we know everything about that mechanism and how it might interact with other factors.
  • Suspend belief. Of the researchers defending observational studies, Petitti says this: "belief caused them to be unstrenuous in considering confounding as an explanation for the studies". Do not be seduced by your desire to prove your case.
  • Maintain scepticism. Question whether the factor under investigation can really be that important; consider what other differences might characterise the case and control groups. Do not extrapolate results beyond the limits of reasonable certainty (e.g. with grandiose forecasts of "lives saved").

Case-control studies are a valuable investigative tool, providing rapid results at low cost, but caution should be exercised unless results are confirmed by other, more robust evidence.


  1. Hormone replacement therapy and coronary heart disease: four lessons. Petitti D, International Journal of Epidemiology, 2004;33:461-463
  • BMJ (formerly British Medical Journal) on Case-control and cross-sectional studies
  • Schlesselman, JJ. Case-Control Studies: Design, Conduct, Analysis. New York: Oxford University Press, 1982, xv + 354 pp. Still a very useful book, and a great place to start, but now a bit out of date.

This article is licensed under the GNU Free Documentation License. It uses material from the Wikipedia article "Case-control". A list of authors is available in Wikipedia.