The breast cancer screening programmes in the United Kingdom currently invite women aged 50–70 years for screening mammography every 3 years. Since the time the screening programmes were established, there has been debate, at times sharply polarised, over the magnitude of their benefit and harm, and the balance between them. The expected major benefit is reduction in mortality from breast cancer. The major harm is overdiagnosis and its consequences; overdiagnosis refers to the detection of cancers on screening, which would not have become clinically apparent in the woman's lifetime in the absence of screening.
Professor Sir Mike Richards, National Cancer Director, England, and Dr Harpal Kumar, Chief Executive Officer of Cancer Research UK, asked Professor Sir Michael Marmot to convene and chair an independent panel to review the evidence on benefits and harms of breast screening in the context of the UK breast screening programmes. The panel, authors of this report, reviewed the extensive literature and heard testimony from experts in the field who were the main contributors to the debate.
The nature of information communicated to the public, which has also sparked debate, was not part of the terms of reference of the panel, which are listed in Appendix 1.
1.2 Relative mortality benefit
The purpose of screening is to advance the time of diagnosis so that prognosis can be improved by earlier intervention. A consequence of earlier diagnosis is that it increases the apparent incidence of breast cancer in a screened population and extends the average time from diagnosis to death, even if screening were to confer no benefit. The appropriate measure of benefit, therefore, is reduction in mortality from breast cancer in women offered screening compared with women not offered screening.
In the panel's judgement, the best evidence for the relative benefit of screening on mortality reduction comes from 11 randomised controlled trials (RCTs) of breast screening. Meta-analysis of these trials with 13 years of follow-up estimated a 20% reduction in breast cancer mortality in women invited for screening. The relative reduction in mortality will be higher for women actually attending screening, but by how much is difficult to say because women who do not attend are likely to have a different background risk. Three types of uncertainties surround this estimate of 20% reduction in breast cancer mortality. The first is statistical: the 95% confidence interval (CI) around the relative risk (RR) reduction of 20% was 11–27%. The second is bias: there are a number of potential sources of distortion in the trials that have been widely discussed in the literature ranging from suboptimal randomisation to problems in adjudicating cause of death. The third is the relevance of these old trials to the current screening programmes. The panel acknowledged these uncertainties, but concluded that a 20% reduction is still the most reasonable estimate of the effect of the current UK screening programmes on breast cancer mortality. Most other reviews of the RCTs have yielded similar estimates of relative benefit.
The RCTs were all conducted at least 20–30 years ago. More contemporary estimates of the benefit of breast cancer screening come from observational studies. The panel reviewed three types of observational studies. The first were ecological studies comparing areas, or time periods, when screening programmes were and were not in place. These have generated diverse findings, partly because of the major advances in treatment of breast cancer, which have a demonstrably larger influence on mortality trends than does screening, and partly because of the difficulty of excluding imbalances in other factors that could affect breast cancer mortality. The panel did not consider these studies helpful in estimating the effect of screening on mortality. The other two types of studies, case–control studies and incidence-based mortality studies, showed breast screening to confer a greater benefit than did the trials. Although these studies, in general, attempted to control for non-comparability of screened and unscreened women, the panel was concerned that residual bias could inflate the estimate of benefit. However, the panel notes that these studies' findings are in the same direction as the trials.
1.3 Absolute mortality benefit
Estimates of absolute benefit of screening have varied from one breast cancer death avoided for 2000 women invited to screening to one avoided for about 100 women screened, about a 20-fold difference. Major determinants of that large variation are the age of women screened, and the durations of screening and follow-up. The age of the women invited is important, as mortality from breast cancer increases markedly with age. The panel therefore applied the relative mortality reduction of 20% to the observed cumulative absolute risk of breast cancer mortality over the ages 55–79 years for women in the United Kingdom, assuming that women who began screening at 50 years would gain no benefit in the first 5 years, but that the mortality reduction would continue for 10 years after screening ended. This yielded the estimate that for every 235 women invited to screening, one breast cancer death would be prevented; correspondingly, 180 women would need to be screened to prevent one breast cancer death. Uncertainties in the figure of a 20% RR reduction would carry through to these estimates of absolute mortality benefit. Nonetheless, the panel's estimate of benefit is in the range of one breast cancer death prevented for ∼250 women invited, rather than the range of one in 2000.
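The way uncertainty in the relative reduction carries through to the absolute figures can be sketched as follows. This is an illustration only, not the panel's calculation: it assumes that the absolute benefit scales in direct proportion to the relative risk reduction (so the number needed to invite scales inversely), and it uses the 95% CI of 11–27% quoted in section 1.2. The function name and the proportionality assumption are introduced here and do not come from the report.

```python
# Figures quoted in the text.
NNI_CENTRAL = 235       # women invited per breast cancer death prevented
NNS_CENTRAL = 180       # women screened per breast cancer death prevented
RRR_CENTRAL = 0.20      # central estimate of the relative risk reduction
RRR_CI = (0.11, 0.27)   # 95% confidence interval for that reduction


def number_needed_to_invite(rrr: float) -> float:
    """Rescale the central number needed to invite for a different reduction.

    ASSUMPTION (illustrative): the absolute benefit is proportional to the
    relative risk reduction, so the number needed to invite is inversely
    proportional to it.
    """
    if rrr <= 0:
        raise ValueError("relative risk reduction must be positive")
    return NNI_CENTRAL * RRR_CENTRAL / rrr


# Number needed to invite at each end of the confidence interval.
nni_range = tuple(round(number_needed_to_invite(r)) for r in RRR_CI)

# Ratio of the two quoted figures: the share of invited women who would
# need to attend for 180 screened to correspond to 235 invited.
implied_attendance = NNS_CENTRAL / NNI_CENTRAL

print(f"Number needed to invite across the CI: {nni_range}")
print(f"Screened-to-invited ratio implied by 180 vs 235: {implied_attendance:.0%}")
```

Under this simple scaling, the number needed to invite runs from roughly 170 to roughly 430, which remains far from the 1 in 2000 end of the published range.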
1.4 Overdiagnosis
The major harm of screening considered by the panel was that of overdiagnosis. Given the definition of an overdiagnosed cancer, either invasive or non-invasive, as one diagnosed by screening, which would not otherwise have come to attention in the woman's lifetime, a long follow-up is needed to assess the frequency of overdiagnosis. In the view of the panel, some cancers detected by screening will be overdiagnosed, but the uncertainty surrounding the extent of overdiagnosis is greater than that for the estimate of mortality benefit because there are few sources of reliable data. The issue for the UK screening programmes is the magnitude of overdiagnosis in women who have been in a screening programme from age 50 to 70, then followed for the rest of their lives. There are no data to answer this question directly. Any estimate will therefore be, at best, provisional.
Although the definition of an overdiagnosed case, and thus the numerator in a ratio, is clear, the choice of denominator has been the source of further variability in published estimates. Different studies have used: only the cancers found by screening; cancers found during the whole screening period, both screen-detected and interval; cancers diagnosed during the screening period and for the remainder of the women's lifetime. The panel focused on two estimates: the first from a population perspective using as the denominator the number of breast cancers, both invasive and ductal carcinoma in situ (DCIS), diagnosed throughout the rest of a woman's lifetime after the age that screening begins, and the second from the perspective of a woman invited to screening using the total number of breast cancers diagnosed during the screening period as the denominator.
The panel thought that the best evidence came from three RCTs that did not systematically screen the control group at the end of the screening period and followed these women for several more years. The frequency of overdiagnosis was of the order of 11% from a population perspective, and about 19% from the perspective of a woman invited to screening. Trials that included systematic screening of the control group at the end of the active part of the trial were not considered to provide informative estimates of the frequency of overdiagnosis.
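The effect of the choice of denominator can be illustrated with a short sketch. It uses the per-10 000 figures quoted in section 1.5 of this summary (129 overdiagnosed cancers out of 681 diagnosed during the screening period). The longer-term denominator is not stated in the text, so the value below is back-calculated from the quoted 11% and should be read as an assumption made for illustration, not as a figure from the trials.

```python
def overdiagnosis_fraction(overdiagnosed: float, denominator: float) -> float:
    """Return overdiagnosed cancers as a fraction of the chosen denominator."""
    if denominator <= 0:
        raise ValueError("denominator must be positive")
    return overdiagnosed / denominator


# Figures per 10 000 women invited, as quoted in section 1.5.
OVERDIAGNOSED = 129
CANCERS_DURING_SCREENING_PERIOD = 681

# Perspective of a woman invited: cancers diagnosed during the screening period.
invited_view = overdiagnosis_fraction(OVERDIAGNOSED, CANCERS_DURING_SCREENING_PERIOD)

# ASSUMPTION: the longer-term denominator is not given in the text. It is
# back-calculated from the quoted 11%, purely to show the size of the gap.
implied_long_term_cancers = OVERDIAGNOSED / 0.11

print(f"Invited-woman perspective: {invited_view:.0%}")
print(f"Implied longer-term denominator: about {implied_long_term_cancers:.0f} cancers")
```

The numerator is the same in both perspectives; only the pool of cancers it is divided by changes, which is why the two percentages differ by a factor of nearly two.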
Information from observational studies was also considered. One method that has been used is investigation of time trends in incidence rates of breast cancer for different age groups over the period that population screening was introduced. The published results of these studies varied greatly and have been interpreted as providing either reassurance or cause for alarm. So great was the variation in results that the panel conducted an exercise by varying the assumptions and statistical methods underlying these studies, using the same data sets; estimates of overdiagnosis rates were found to vary across the range of 0–36% of invasive breast cancers diagnosed during the screening period. The panel had no reason to favour one set of estimates over another, and concluded that this method could give no reliable estimate of the extent of overdiagnosis.
Were it possible to distinguish at screening those cancers that would not otherwise have come to attention from those that, untreated, would lead to death, the overdiagnosis problem could be much reduced, at least in terms of unnecessary worry and treatment. Currently this is not possible, so neither the woman nor her doctor can know whether a screen-detected cancer is an ‘overdiagnosed’ case or not. In particular, DCIS, most often diagnosed at screening, does not inevitably equate to overdiagnosis – screen-detected DCIS, after wide local excision (WLE) only, is associated with subsequent development of invasive breast cancer in 10% of women within 10 years.
The consequences of overdiagnosis matter: women are turned into patients unnecessarily, surgery and other forms of cancer treatment are undertaken, and quality of life and psychological well being are adversely affected.
1.5 The balance of benefit and harm
The panel estimates that an invitation to breast screening delivers about a 20% reduction in breast cancer mortality. For the UK screening programmes, this currently corresponds to about 1300 deaths from breast cancer being prevented each year, or equivalently about 22 000 years of life being saved. However, this benefit must be balanced against the harms of screening, especially the risk of overdiagnosis. In the panel's view, overdiagnosed cancers certainly occur, but the frequency in a screening programme of 20 years' duration is unknown. Estimates from trials of shorter duration suggest overdiagnosis of about 11% as a proportion of breast cancer incidence during the screening period and for the remainder of the woman's lifetime, or equivalently about 19% as a proportion of cancers diagnosed during the screening period. Any excess mortality stemming from the investigation and treatment of breast cancer is considered by the panel to be small and considerably outweighed by the benefits of treatment. Some other harms, including increased anxiety and discomfort caused by screening, are also acknowledged.
Notionally, for 10 000 women invited to screening, from age 50 for 20 years, it is estimated that 681 cancers (invasive and DCIS) will be diagnosed, of which 129 will represent overdiagnosis (using the 19% estimate of overdiagnosis) and 43 deaths from breast cancer will be prevented.
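The arithmetic behind these notional figures can be checked with a minimal sketch. It takes the 681 cancers, the 19% overdiagnosis estimate and the figure of one death prevented per 235 women invited directly from the text; the variable names are illustrative, and rounding to whole women is an assumption about how the quoted numbers were presented.

```python
WOMEN_INVITED = 10_000
CANCERS_DIAGNOSED = 681         # invasive and DCIS, over 20 years of screening
OVERDIAGNOSIS_FRACTION = 0.19   # share of cancers diagnosed in the screening period
NUMBER_NEEDED_TO_INVITE = 235   # women invited per breast cancer death prevented

# Overdiagnosed cancers among those diagnosed during the screening period.
overdiagnosed = round(CANCERS_DIAGNOSED * OVERDIAGNOSIS_FRACTION)

# Deaths prevented among the women invited.
deaths_prevented = round(WOMEN_INVITED / NUMBER_NEEDED_TO_INVITE)

print(f"Overdiagnosed cancers per 10 000 invited: {overdiagnosed}")
print(f"Deaths prevented per 10 000 invited: {deaths_prevented}")
```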
Given that the treatment for breast cancer has improved, is screening no longer relevant? The panel's view is that the benefits of screening and those of better treatments are reasonably considered independent. Uncertainty about possible interaction between the benefits of screening and of contemporary treatments is not a reason for stopping breast screening.
The panel was not asked to comment on costs, either of interventions or of the consequences of overdiagnosis. With accurate figures, an estimate of cost-benefit could be made and compared with other interventions, but this would be a significant piece of work in its own right.
An individual woman cannot know whether she is among those who will benefit from screening or among those who will be harmed by it. If she chooses to be screened, it should be in the knowledge that she is accepting the chance of benefit, having her life extended, while knowing that there is also a risk of overdiagnosis and unnecessary treatment. Similarly, a woman who declines the invitation to screening needs to recognise that she runs a slightly higher risk of dying from breast cancer.
1.6 Conclusions and recommendations
Breast screening extends lives. The panel's review of the evidence on benefit – the older RCTs, and those more recent observational studies – points to a 20% reduction in mortality in women invited to screening. A great deal of uncertainty surrounds this estimate, but it represents the panel's overview of the evidence. This corresponds to one breast cancer death averted for every 235 women invited to screening for 20 years, and one death averted for every 180 women who attend screening.
The panel's best estimate is that the breast screening programmes in the United Kingdom, inviting women aged 50–70 every 3 years, prevent about 1300 breast cancer deaths a year, a most welcome benefit to women and to the public health.
However, there is a cost to women's well being. In addition to extending some lives by early detection and treatment, mammographic screening detects cancers, proven to be cancers by pathological testing, that would not have come to clinical attention in the woman's life were it not for screening; this is called overdiagnosis. The consequence of overdiagnosis is that women have their cancer treated by surgery, radiotherapy and medication, but neither the woman nor her doctor can know whether this particular cancer would be one that could possibly lead to death, or one that would have remained undetected for the rest of the woman's life.
The panel sought to estimate the level of overdiagnosis in women screened for 20 years and followed to the end of their lives. Estimates of overdiagnosis abound, from near to zero to 50%, but there is a paucity of reliable data to answer this question. There has not even been agreement on how to measure overdiagnosis. On the basis of follow-up of three RCTs, the panel estimated that in women invited to screening, about 11% of the cancers diagnosed in their lifetime constitute overdiagnosis, and about 19% of the cancers diagnosed during the period that women are actually in the screening programme; but the panel emphasises these figures are the best estimates from a paucity of reliable data.
Putting together benefit and overdiagnosis from the above figures, the panel estimates that for 10 000 UK women invited to screening from age 50 for 20 years, about 681 cancers will be found, of which 129 will represent overdiagnosis, and 43 deaths from breast cancer will be prevented. In round terms, therefore, for each breast cancer death prevented, about three overdiagnosed cases will be identified and treated. Of the ∼307 000 women aged 50–52 who are invited to screening each year, just over 1% would have an overdiagnosed cancer during the next 20 years. Given the uncertainties around the estimates, the figures quoted give a spurious impression of accuracy.
The panel concludes that the UK breast screening programmes confer significant benefit and should continue. The greater the proportion of women who accept the invitation to be screened, the greater is the benefit to the public health in terms of reduction in mortality from breast cancer. However, for each woman the choice is clear. On the positive side, screening confers a likely reduction in mortality from breast cancer because of early detection and treatment. On the negative side is the knowledge that she has perhaps a 1% chance of having a cancer diagnosed, and treated with surgery and other modalities, that would never have caused problems had she not been screened.
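The round-number statements in this paragraph follow from the per-10 000 figures, as the sketch below shows. It also scales the figures to the annual cohort of about 307 000 women. That last step assumes the per-10 000 rates apply uniformly to the whole cohort, which is an illustrative simplification; the text does not say whether the panel derived its annual figure of about 1300 deaths in this way.

```python
OVERDIAGNOSED_PER_10000 = 129
DEATHS_PREVENTED_PER_10000 = 43
NUMBER_NEEDED_TO_INVITE = 235
ANNUAL_COHORT = 307_000   # women aged 50-52 invited each year (approximate)

# Overdiagnosed cases identified and treated per breast cancer death prevented.
cases_per_death_prevented = OVERDIAGNOSED_PER_10000 / DEATHS_PREVENTED_PER_10000

# Chance that an invited woman has an overdiagnosed cancer within 20 years.
overdiagnosis_risk = OVERDIAGNOSED_PER_10000 / 10_000

# ASSUMPTION: the same rates hold across the whole annual cohort.
overdiagnosed_in_cohort = ANNUAL_COHORT * overdiagnosis_risk
deaths_prevented_in_cohort = ANNUAL_COHORT / NUMBER_NEEDED_TO_INVITE

print(f"Overdiagnosed cases per death prevented: {cases_per_death_prevented:.1f}")
print(f"Risk of overdiagnosis per woman invited: {overdiagnosis_risk:.2%}")
print(f"Overdiagnosed cancers per annual cohort: {overdiagnosed_in_cohort:.0f}")
print(f"Deaths prevented per annual cohort: {deaths_prevented_in_cohort:.0f}")
```

The cohort-level result for deaths prevented comes out close to the figure of about 1300 a year quoted earlier in this summary, which is at least consistent with the two numbers being related in this way.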
Evidence from a focus group conducted by Cancer Research UK and attended by two panel members, and in line with previous similar studies, was that this was an offer many women will feel is worth accepting: the treatment of overdiagnosed cancer may cause suffering and anxiety, but that suffering is worth the gain from the potential reduction in breast cancer mortality. Clear communication of these harms and benefits to women is of utmost importance and goes to the heart of how a modern health system should function. There is a body of knowledge on how women want information presented, and this should inform the design of information to the public.
2.1 The UK NHS breast screening programmes
The NHS breast cancer screening programme in England began inviting women to be screened in 1988. This followed the recommendations made by Professor Sir Patrick Forrest in his report on breast screening in 1986 (Forrest, 1986). The breast screening programmes in the United Kingdom currently invite women aged 50–70 years for a screening mammography every 3 years. The mammography is designed to detect changes in the breast tissue that may indicate the presence of cancer. The screening programme in England is currently conducting a randomised trial to ascertain whether there would be benefit in extending the age at which women are invited to 47–73 years.
2.2 Principles of screening
Screening is concerned with the detection of disease at an early stage, with the expectation that treatment will be more effective if begun earlier in the disease process. Screening is therefore based on the principle of there being an effective treatment. It is well recognised that an apparent benefit of increased survival time could be illusory because of simply bringing forward the time of diagnosis without changing the course of the disease. Therefore, the appropriate way to assess benefit is to look at breast cancer mortality of screened and unscreened cohorts rather than just survival time from diagnosis (see section 3).
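The point about illusory survival gains can be made concrete with a small sketch. The ages used are hypothetical and chosen only for illustration: the same woman dies at the same age in both scenarios, yet her survival measured from diagnosis is longer when the cancer is found earlier by screening.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class DiseaseCourse:
    """Age at diagnosis and at death for one woman under one scenario."""
    age_at_diagnosis: float
    age_at_death: float

    @property
    def survival_from_diagnosis(self) -> float:
        return self.age_at_death - self.age_at_diagnosis


# HYPOTHETICAL ages: screening advances diagnosis but does not change
# the age at death in this example.
without_screening = DiseaseCourse(age_at_diagnosis=62, age_at_death=67)
with_screening = DiseaseCourse(age_at_diagnosis=59, age_at_death=67)

# Time by which screening brought the diagnosis forward.
lead_time = without_screening.age_at_diagnosis - with_screening.age_at_diagnosis

# Apparent gain in survival from diagnosis.
apparent_gain = (
    with_screening.survival_from_diagnosis
    - without_screening.survival_from_diagnosis
)

print(f"Survival from diagnosis without screening: {without_screening.survival_from_diagnosis} years")
print(f"Survival from diagnosis with screening: {with_screening.survival_from_diagnosis} years")
print(f"Apparent gain: {apparent_gain} years; lead time: {lead_time} years")
```

In this example the whole of the apparent gain is lead time, and mortality is unchanged, which is why a comparison of mortality between groups is the more appropriate measure.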
As the principle of screening is to diagnose cases earlier, at any particular time point during the period of successive screenings, there will be more cases of breast cancer in a group of screened women compared with a similar group of unscreened women. However, it is possible that some of these additional cases may be cancers that would not otherwise have been diagnosed or caused the woman any problem during her lifetime. These cancers are referred to as overdiagnosis (see section 4).
2.3 The debate over benefits and harms of breast screening
Since the screening programmes were established, there has been debate over the potential benefits and harms. Recently, the debate has focussed on the reduction in mortality attributable to screening, the numbers of women overdiagnosed, and the way that the risks and benefits are communicated to women invited for screening. The arguments have become quite polarised between those who believe that the benefit of decreased breast cancer mortality outweighs the harms and those who believe the harms outweigh the benefit. These differing views of the evidence have arisen, in part, from disagreements over the validity and applicability of the available RCTs of breast screening, and from questions about the usefulness and interpretation of observational data on breast cancer incidence and mortality.
The debate over the benefits and harms of breast screening is not unique to the UK and the NHS breast screening programmes. In 2002, the International Agency for Research on Cancer at the World Health Organisation reviewed the evidence on breast screening, and put forward recommendations on further research and on implementing screening programmes (IARC, 2002). The US Preventive Services Task Force in 2009 re-examined the efficacy of different screening modalities. They recommended that women under the age of 50 not be routinely screened, and that women aged 50–74 have biennial rather than annual screens (Woolf, 2010). The Canadian Task Force on Preventive Health Care updated their guidelines on breast screening in 2011, and concluded that the reduction in mortality associated with screening mammography is small for women aged 40–74 years at average risk of breast cancer. They also found a greater reduction in mortality for women aged ⩾50 compared with those <50, and that harms of overdiagnosis and unnecessary biopsy may be greater for younger women than for older women. They recommended that women aged 50–74 be routinely screened but stated that appreciable uncertainty exists around the evidence for this (Canadian Task Force on Preventive Health Care, 2011). Published reports from the Nordic Cochrane Centre concluded that, despite their substantial methodological limitations, the trials of screening showed that screening saved lives, but at the cost of considerable harm from overdiagnosis (Gøtzsche and Nielsen, 2011).
2.4 Breast cancer in the UK
Incidence and mortality
In the United Kingdom, breast cancer remains the most commonly diagnosed cancer in women (48 417 cases in 2009) and is the second most common cause of death from cancer in women (11 556 deaths in 2010). UK breast cancer incidence rates have been rising in all age groups since the late 1970s (Figure 1A). The causes of these increasing rates are thought to include: increased use of hormone replacement therapy; later age at child birth; lower parity; and increasing obesity and alcohol intake in women. Also, there is believed to be better ascertainment, especially in older women. In common with most countries, the introduction of the screening programme for women aged 50–64 in 1988 and those aged 65–70 in 2001 led to additional increases in incidence (Figure 1A).
Figure 1. (A) European age-standardised incidence rates per 100 000 population, females, by age, Great Britain. (B) Breast cancer mortality at ages 35–69, UK, 1950–2009 (World Health Organization, 2012).
By contrast with incidence rates, since the early 1990s, the mortality rates for breast cancer have been decreasing – shown both as annual mortality rates and 35-year cumulative risk of dying from breast cancer (Figure 1B). It is believed that the causes of these decreases may include: improvement in treatment, in particular adjuvant therapies; specialisation and better organisation of cancer care; screening; and increased breast awareness (Appendix 2).
Contribution of screening to decreased breast cancer mortality
It is widely agreed that screening alone cannot be the major factor responsible for the decrease in breast cancer mortality over the last 20 years. Improvements in treatment and service delivery are likely to have made the largest contribution to decreased mortality (Berry et al, 2005). Indeed, without effective treatment, screening for breast cancer is redundant. However, it is important to establish what contribution, if any, screening makes, given that it requires the use of substantial resources within the health system, and nearly two million women each year in England alone accept the invitation and agree to be screened (The NHS Information Centre, 2012).
2.5 Independent review of breast screening
It is within this context that Professor Sir Mike Richards, National Cancer Director, England and Dr Harpal Kumar, Chief Executive Officer of Cancer Research UK, asked Professor Sir Michael Marmot to chair an independent panel to review breast screening. The panel's terms of reference are shown in Appendix 1. This panel has reviewed the extensive literature and heard testimony from many of the experts in the field. This report details its findings and recommendations for the breast screening programme in England.
2.6 Independent review panel membership
The independent panel consisted of nationally and internationally recognised experts in epidemiology and/or medical statistics, as well as in current breast cancer diagnosis and treatment practices. A patient advocate was an integral member of the panel. No panel member had previously published on breast screening, thus helping to ensure an objective and independent assessment of the evidence.
The panel was chaired by Professor Sir Michael G Marmot, Director of the Institute of Health Equity, University College London; Chair, WHO Commission on Social Determinants of Health; Chair, Marmot Review – Strategic Review of Health Inequalities in England after 2010; Chair, European Review on the Social Determinants of Health and the Health Divide; MRC Research Professor of Epidemiology and Public Health, University College London with long-standing research on social determinants of health and health inequalities.
The other panellists were:
Professor Douglas G Altman, Director of the Centre for Statistics in Medicine and Cancer Research UK Medical Statistics Group, University of Oxford. Doug's varied research interests include the use and abuse of statistics in medical research, studies of prognosis, regression modelling, systematic reviews, randomised trials, and studies of medical measurement. He is actively involved in efforts to improve the quality of scientific publications by promoting transparent and accurate reporting of health research.
Professor David A Cameron, Clinical Director of the Edinburgh Cancer Research Centre, Director of Cancer Services at NHS Lothian, and Professor of Oncology at Edinburgh University. Previously, David was the Director of the NIHR National Cancer Research Network and Professor of Oncology at Leeds University. His research interests are in translational and clinical trials in breast cancer, and he is the principal investigator of several clinical trials looking at treatment of early breast cancer. Before qualifying as a medical doctor, he completed an undergraduate degree in Mathematics.
Professor John A Dewar, Consultant and honorary Professor of Clinical Oncology. Until recently, John was Head of Oncology at Ninewells Hospital, Dundee. John has a long-standing interest in the management of patients with breast cancer and has been closely involved in clinical trials of both radiotherapy and systemic therapy for breast cancer.
Professor Simon G Thompson, Director of Research in Biostatistics at the University of Cambridge. Simon's research interests are in meta-analysis and evidence synthesis, clinical trial methodology, health economic evaluation, and cardiovascular epidemiology. He has collaborated on a number of major clinical trials, recently including all the major UK national trials of screening and treatment for abdominal aortic aneurysms.
Maggie Wilcox, patient advocate. Maggie was a health visitor for many years before working as Clinical Nurse Specialist in palliative care before her breast cancer diagnosis in 1997. After early retirement following her treatment, she became involved in patient advocacy in cancer services and research. She now provides a patient voice at national and local level as a member of various organisations, including the National Cancer Research Institute Breast Clinical Study Group and the Surrey, West Sussex and Hampshire Network Breast Site Specific Group.
2.7 Independent review process and role of secretariat
As set out in the review's terms of reference, the secretariat provided initial key literature on breast cancer screening, including publications recommended from both sides of the debate. The panel then called on a range of experts (see Appendix 1 for full list) to give evidence.
Cancer Research UK and the Department of Health provided the secretariat function for the review, comprising:
Dr Dulcie McBride, Consultant in Public Health Medicine, Department of Health
Sara Hiom, Director of Information, Cancer Research UK
Nick Ormiston-Smith, Data Analysis and Research Manager, Cancer Research UK
Dr Martine Bomb, Programme Manager, Cancer Research UK
Samantha Harrison, Programme Officer, Cancer Research UK
The secretariat acted purely as support to the panel in the practical, writing, and dissemination functions and had no say in the conclusions or recommendations. Further information can be found in Appendix 1.
3. The effect of breast screening on mortality
This section summarises the panel's views of the effect of breast screening on mortality. Specifically, the aim is to estimate the effect of the current national screening programmes in the United Kingdom on breast cancer mortality. Estimates of relative risk reduction, absolute risk reduction, and increase in life expectancy are discussed.
Randomised controlled trials potentially provide the most reliable information about the effects of breast screening. Well-conducted RCTs are prone to fewer distorting effects, or biases, than observational studies. Systematic reviews and meta-analyses of RCTs are widely accepted as the highest level of evidence for guiding policy decisions on medical interventions. For this reason, our quantitative estimate of the benefits of breast screening comes from the randomised trials of breast screening. Given the wealth of observational studies on this issue, in section 3.6 we look to observational studies as a possible guide to more contemporary estimates of the effects of screening on mortality.
Randomised controlled trials, however, are not without their problems in practice. Lack of internal validity, for example, through failures in proper randomisation, losses to follow-up and misclassification of end points, can lead to biased estimates of effects. Differences between the trials and the current UK context, for example, in the type of screening undertaken or in the length of follow-up, lead to a lack of external validity. Both the internal and external validity of the RCTs of breast screening have been widely discussed.
A specific issue raised by some commentators is that most of the randomised trials of breast screening date from the 1980s or earlier. Treatment and overall management of breast cancer have improved considerably since that time. Are the trials still relevant? Such a question can be asked of any area of medical investigation and treatment; trials refer to the past and our use of interventions relates to the future. It is an important area of judgement and one that the panel kept at the forefront of its consideration.
The purpose of screening is to prolong survival, but length of survival from diagnosis of breast cancer to death cannot be used as an end point in the RCTs, because the cancers diagnosed by screening are diagnosed earlier than those diagnosed without screening. Thus, even in the absence of any therapy, a cancer diagnosed earlier by screening will appear to have a longer survival than the same cancer presenting later symptomatically. Mortality after invitation to screening is the appropriate end point. However, concerns have been raised about the use of breast cancer mortality. If the adjudication of a death as due to breast cancer is influenced by the woman's screening history, then the estimate of the effects on breast cancer mortality can become biased. For this reason, some have argued that death from all cancers, or indeed all-cause mortality, should be the primary outcome of interest in the trials. The panel disagrees with this view (section 3.5). We also comment on the estimation of absolute risk differences, as opposed to RRs, and the difference between the effects expressed per woman invited and per woman screened.
The panel's view is that although the trials are far from perfect, they offer the most reliable evidence on the RR reduction in breast cancer mortality to be derived from screening.
3.2 Available randomised trials
Eleven randomised trials have been undertaken and reported (New York health insurance plan (HIP), Malmö I and II, Swedish Two County (Kopparberg and Östergötland), Canada I and II, Stockholm, Göteborg, UK Age trial, and Edinburgh; Table 1). The three trials with two parts have sometimes but not always been reported separately in publications. Three other randomised trials are mentioned in the Cochrane Review (Gøtzsche and Nielsen, 2011), but were excluded because they compared multiple interventions (not just mammography), or made major post-randomisation exclusions. We also exclude these three studies from our assessment.
Table 1. Characteristics of the randomised trials of breast cancer screening
All the trials compared women invited to screening with a control group not invited. However, they varied considerably, for example, in terms of the method of randomisation, age group of women invited, type of mammography employed, whether physical examination or self-examination was also used in either the invited or control groups, interval between screens, number of screens, length of follow-up, and system used for adjudicating breast cancer deaths (Table 1).
The invited and control groups in the trials were constructed either by randomising individuals, or by randomising clusters (geographical areas or general practices), or by allocation according to day of birth. Individual randomisation, with adequate allocation concealment, is rightly regarded as the most reliable method. For population screening studies, however, cluster randomisation can also be adequate, provided sufficient clusters are randomised and balance in social and other characteristics is achieved. Women are identified through existing registers, and so it is unlikely that participation bias, which afflicts some cluster trials (Puffer et al, 2003), would apply (for example, through women moving between areas in order to avoid or obtain an invitation to breast screening). Similarly, using allocation by day of birth would seem to be adequate for population screening trials. Of the trials considered, the Edinburgh trial suffered the most problems in terms of its cluster randomisation (Gøtzsche and Nielsen, 2011), with some re-allocations and post-randomisation exclusions of clusters, which led to severe baseline imbalances (26% of women in the control group and 53% in the invited group were in the highest socioeconomic group). For this reason, like the Cochrane Review, we exclude the Edinburgh trial from our main summary and comment on its results separately (section 3.5).
The trials recruited women of different ages (Table 1). Most overlapped extensively with the age group 50–70 years, relevant to the UK programmes, but some (e.g. UK Age trial, Malmö II) did not. We base our primary conclusions about RR on all the trials, as the RR appears fairly constant across age groups (Nyström et al, 2002). There is some evidence, however, that the RR may be attenuated in women under age 50 (Canadian Task Force on Preventive Health Care, 2011), so we also consider an analysis that excludes these women.
Duration of follow-up
Even in the pre-screening era, the median survival from diagnosis of breast cancer was several years, so any benefits of screening in terms of mortality are not immediate, but will accrue over time. So the best evidence would come from a trial with a long duration of follow-up, comparing the invited group with a control group who are never invited to screening. The data that come nearest to this are for the age group 55–69 in Malmö I, with a follow-up of 19 years. Most of the trials, however, started systematic screening of the control group after 4–10 years. Little effect on mortality is seen within the first 5 years of screening, so we regard a follow-up period of about 10–15 years after randomisation as providing the most reliable estimate of the RR. A shorter follow-up time would put too much weight on the early period after initial screening, whereas a longer period would include a greater diluting effect of screening in the control group. So we base our primary conclusions about breast cancer mortality on the data reported in the Cochrane Review, which provided results for 13 years of follow-up of the groups as randomised (Gøtzsche and Nielsen, 2011).
Adjudicating cause of death
Potential biases from classifying cause of death have been a major source of contention, especially in the Swedish trials. Ascribing a death as primarily due to breast cancer, or not due to breast cancer, is not always easy or reliable. So, when the screening history of a woman is known, or when a prior diagnosis of breast cancer has been made, this could influence the adjudicated cause of death. There are two ways in which this could distort the results of the trials. The first is overt bias, in which investigators closely involved with the trial adjudicate cause of death and tend to avoid ascribing the cause of death as breast cancer when the woman has been screened (and conversely when she has not). This would exaggerate any beneficial effect of screening. This bias (which may be subconscious) is avoided by the use of an independent end point committee to ascribe causes of death, or by the use of death certificates from national registries. These methods, however, do not avoid a second way in which a trial's results might be affected: screening increases the number of breast cancers diagnosed, and such a diagnosis may lead preferentially to classifying a subsequent death as due to breast cancer rather than any other cause. This second bias operates against any beneficial effect of screening.
Most trials used an independent end point committee to adjudicate causes of death or took the underlying cause of death from national registries (Table 1). Some of the Swedish trials were criticised for using trial investigators to ascribe cause of death, but subsequent evaluations were made using independent and consensus committees and national registry statistics (Nyström et al, 2002; Tábar et al, 2011). Although the exact numbers of deaths from breast cancer were not the same when adjudication was made using different methods, the overall estimates of RR of breast cancer mortality did not change very much. Thus, although this issue is certainly one of the major criticisms of the trials, the panel does not think it would exaggerate the estimates of RR reduction obtained from individual trials, or indeed from a meta-analysis of trials. We comment on the use of other mortality end points in section 3.5.
Many other aspects of the trials have been discussed in the literature, some of which we mention here. The numbers of women reported in each randomised group have not been identical across the multiple publications from certain trials. Although this is somewhat concerning, it is perhaps not surprising, given that population and other registers are not always fully reliable, and data checks over time reveal duplicates and other problems. Moreover, some publications are based on birth cohorts and others on exact age groups (Nyström et al, 2002). The trials report excluding women with a prior diagnosis of breast cancer. Although this is sensible, it can lead to problems if the exclusions are more easily made in the invited group (for example, because of more information obtained at screening) than in the control group. Some trials include physical examination or self-examination in either or both of the randomised groups. However, there is no evidence that these procedures influence breast cancer mortality (Canadian Task Force on Preventive Health Care, 2011).
We acknowledge the problems and biases discussed above, but judge them as unlikely to have had a major distorting effect on the overall result from a meta-analysis of the trials. Moreover, the biases considered do not all operate in the same direction, with some favouring screening and some acting against it. Although it is easy to be critical of many detailed aspects of the breast screening trials, the relevant judgement is whether the biases are so great as to make their results too misleading for guiding policy. The panel does not believe this to be the case, especially in contrast to the problems in interpreting the results from observational studies (section 3.6).
3.3 Meta-analysis of RRs
As discussed above, we focus on the deaths ascribed to breast cancer in 10 of the 11 randomised trials (excluding Edinburgh) and the meta-analysis conducted in the Cochrane Review, using 13 years of follow-up (analysis 1.2 in Gøtzsche and Nielsen, 2011). We do not distinguish the trials labelled ‘adequately randomised' and ‘sub-optimally randomised' in the Cochrane Review, but consider the totality of evidence across all the trials. We also use random-effects rather than fixed-effect meta-analysis to estimate an average effect across the trials. Using random effects acknowledges that the trials may be estimating different quantities, which is likely given their clinical heterogeneity, whereas a fixed-effect analysis estimates an assumed common effect across all the trials. The results are shown in Figure 2 along with the RRs of breast cancer mortality. The overall RR, comparing invited with control women, is 0.80 (95% CI 0.73–0.89). There was some heterogeneity in the RRs from different trials, but this was not statistically significant (Figure 2). Thus, the RR reduction in breast cancer mortality in the groups invited to screening is estimated as 20% (95% CI 11–27%).
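The distinction between fixed-effect and random-effects pooling can be illustrated with a minimal sketch. It assumes the common DerSimonian-Laird estimator of between-trial variance, which the text does not name, and the per-trial inputs are hypothetical placeholders rather than the trial data. The conversion from the pooled RR to an RR reduction uses the figures quoted above.

```python
import math

def rr_to_reduction(rr, ci_low, ci_high):
    """Convert an RR and its CI to an RR reduction and its CI (as fractions)."""
    # The upper RR limit gives the lower limit of the reduction, and vice versa.
    return 1 - rr, 1 - ci_high, 1 - ci_low

def pool(log_rrs, ses):
    """Pool log RRs by inverse variance: fixed effect and DerSimonian-Laird random effects."""
    w = [1 / s ** 2 for s in ses]
    fixed = sum(wi * y for wi, y in zip(w, log_rrs)) / sum(w)
    # Cochran's Q measures heterogeneity around the fixed-effect estimate.
    q = sum(wi * (y - fixed) ** 2 for wi, y in zip(w, log_rrs))
    c = sum(w) - sum(wi ** 2 for wi in w) / sum(w)
    tau2 = max(0.0, (q - (len(log_rrs) - 1)) / c) if c > 0 else 0.0
    # Random-effects weights add the between-trial variance to each trial's variance.
    w_re = [1 / (s ** 2 + tau2) for s in ses]
    random = sum(wi * y for wi, y in zip(w_re, log_rrs)) / sum(w_re)
    return {
        "fixed": (math.exp(fixed), math.sqrt(1 / sum(w))),
        "random": (math.exp(random), math.sqrt(1 / sum(w_re))),
        "tau2": tau2,
    }

# Pooled estimate quoted in the text: RR 0.80 (95% CI 0.73-0.89).
reduction = rr_to_reduction(0.80, 0.73, 0.89)
print([f"{x:.0%}" for x in reduction])

# HYPOTHETICAL per-trial RRs and standard errors of log RR (not the trial data).
example_rrs = [0.75, 0.85, 0.70, 0.95, 0.80]
example_ses = [0.10, 0.12, 0.15, 0.08, 0.11]
result = pool([math.log(r) for r in example_rrs], example_ses)
print(result)
```

Because the between-trial variance is never negative, the random-effects standard error in this sketch is always at least as large as the fixed-effect one, which is why a random-effects CI is typically the wider of the two.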
Figure 2. Meta-analysis of the breast cancer screening trials: RR of breast cancer mortality after 13 years of follow-up. Adapted from the Cochrane Review (Gøtzsche and Nielsen, 2011). Note: Malmö II is excluded because follow-up approximating 13...
The RR for women invited to screening is attenuated compared with that for women who actually attend screening (Cuzick et al, 1997). This is because some invited women do not attend, and they may be assumed to get no benefit from the invitation. If the underlying rate of breast cancer mortality in non-attenders is the same as in attenders, one may estimate the RR reduction in attenders as the RR reduction in those invited divided by the (average) attendance rate. Taking the typical attendance in the trials as about 80% (Table 1), this would give 20% divided by 0.80, or 25%. However, this calculation is incorrect as the underlying risk is different in those not attending screening (Zackrisson et al, 2004; Moss et al, 2006). Without this extra information, which is not available for all trials, the calculation of the RR reduction in those attending screening is not possible. In contrast, the calculation can be made, irrespective of underlying risk differences, for the absolute risk reduction (section 3.4). We note that the coverage rate in the UK NHS screening programme is similar to that in the trials, at 77% (The NHS Information Centre). Some non-systematic (opportunistic) screening occurred in the control groups of the trials, but detailed information is not available. This is ignored in our calculations, and will lead to the effect of attending screening being somewhat underestimated.
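A small sketch may help show why dividing by the attendance rate works for the absolute risk reduction but not, in general, for the RR reduction. The baseline risks and the 25% effect in attenders are hypothetical values chosen for illustration. The model assumes, as the text does, that non-attenders get no benefit from the invitation.

```python
def invited_vs_control(attendance, risk_attenders, risk_non_attenders, rrr_attenders):
    """Return (RR reduction, absolute risk reduction) for invited versus control women."""
    # Control group: a mixture of would-be attenders and non-attenders, none screened.
    control = attendance * risk_attenders + (1 - attendance) * risk_non_attenders
    # Invited group: only attenders have their risk reduced.
    invited = (attendance * risk_attenders * (1 - rrr_attenders)
               + (1 - attendance) * risk_non_attenders)
    arr = control - invited
    return arr / control, arr

ATTENDANCE = 0.80       # typical trial attendance quoted in the text
TRUE_RRR = 0.25         # HYPOTHETICAL true RR reduction in attenders
RISK_ATTENDERS = 0.02   # HYPOTHETICAL baseline risk in attenders

# Case 1: non-attenders share the attenders' baseline risk.
same = invited_vs_control(ATTENDANCE, RISK_ATTENDERS, 0.02, TRUE_RRR)
# Case 2: non-attenders have a higher baseline risk (hypothetical 3%).
diff = invited_vs_control(ATTENDANCE, RISK_ATTENDERS, 0.03, TRUE_RRR)

for label, (rrr, arr) in (("equal risk", same), ("unequal risk", diff)):
    print(label,
          f"naive RRR in attenders = {rrr / ATTENDANCE:.3f}",
          f"ARR in attenders = {arr / ATTENDANCE:.4f}")
```

In the equal-risk case the naive division recovers the assumed 25%. In the unequal-risk case it does not, whereas the absolute risk reduction divided by attendance equals the attenders' baseline risk multiplied by the assumed effect in both cases.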
Other estimates of overall RR
Other meta-analyses of the breast cancer screening trials have given different estimates of the RR reduction. We summarise some of these below.
The Cochrane Review undertook a fixed-effect meta-analysis of the above trials with 13 years follow-up, and reported an estimated RR of 0.81 (95% CI 0.74–0.87). As expected, the fixed-effect analysis gives a slightly narrower CI, but the estimated average RR reduction of 19% is similar to the figure of 20% above.
If women <50 years in the above trials are excluded, the overall RR reported in the Cochrane Review (analysis 1.6, Gøtzsche and Nielsen, 2011) is 0.77 (95% CI 0.69–0.86). So the RR reduction is estimated as 23%, slightly more than the 20% above based on all age groups.
The Cochrane Review (Gøtzsche and Nielsen, 2011) focused on the Canada, Malmö, and UK Age trials as the only ‘adequately randomised' trials. The estimated RR of breast cancer mortality over 13 years follow-up for invited vs control groups in these trials was 0.90 (95% CI 0.79–1.02), whereas in the trials considered ‘sub-optimally randomised' it was 0.75 (0.67–0.83). As a compromise between these two estimates, the authors concluded that a 15% RR reduction was plausible.
The US Task Force (Nelson et al, 2009) provided estimated RRs of breast cancer mortality of 0.86 (95% CI 0.75–0.99) for women aged 50–59 years invited to screening, and of 0.68 (95% CI 0.54–0.87) for those aged 60–69 years. These correspond to RR reductions of 14% and 32%, respectively, with an inverse variance weighted average of 19%.
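The weighted average of the two age-specific estimates can be reproduced approximately as follows. This sketch assumes that the weighting is done on the log RR scale, with standard errors recovered from the reported 95% CIs. The text does not state the exact method, so this is one plausible reading rather than the Task Force's calculation.

```python
import math
from statistics import NormalDist

Z_975 = NormalDist().inv_cdf(0.975)  # about 1.96

def log_rr_and_se(rr, ci_low, ci_high):
    """Return log RR and its standard error, inferred from a 95% CI."""
    se = (math.log(ci_high) - math.log(ci_low)) / (2 * Z_975)
    return math.log(rr), se

# Age-specific estimates quoted in the text (Nelson et al, 2009).
estimates = [
    (0.86, 0.75, 0.99),  # ages 50-59 years
    (0.68, 0.54, 0.87),  # ages 60-69 years
]

logs = [log_rr_and_se(*e) for e in estimates]
weights = [1 / se ** 2 for _, se in logs]
pooled_log = sum(w * y for w, (y, _) in zip(weights, logs)) / sum(weights)
pooled_rrr = 1 - math.exp(pooled_log)
print(f"Pooled RR reduction: {pooled_rrr:.1%}")
```

Under these assumptions the pooled RR reduction comes out close to the 19% quoted above. The younger age group carries more weight because its CI is narrower.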
The Canadian Task Force (Canadian Task Force on Preventive Health Care, 2011) gave an estimate of the RR of breast cancer mortality for invited vs control groups of 0.79 (95% CI 0.68–0.90) for women aged 50–69 years, a RR reduction of 21%. Routinely screening for breast cancer with mammography every 2–3 years for this age group was rated as a weak recommendation, based on moderate-quality evidence according to GRADE criteria (Schünemann et al, 2011).
A review by Duffy et al (2012) of all the trials and age groups gave an overall RR of 0.79 (95% CI 0.73–0.86) comparing invited with control groups, corresponding to a 21% RR reduction in breast cancer mortality.
Different meta-analyses include different trials, durations of follow-up, and definitions of outcome. Nevertheless, there is general agreement in their estimates, of about a 20% RR reduction in breast cancer mortality from invitation to screening.
Generalisability of RRs
A key issue is whether the RR reduction in breast cancer mortality observed in the trials may be taken as applying, at least approximately, to the current UK screening programmes. This is a judgement about external validity, rather than an issue for which much direct empirical evidence is available. As always in policy decision making, we need to use evidence from studies undertaken in the past to make an inference about what is likely in the future. Although RRs are often much more generalisable across contexts than absolute risk differences, it is clearly plausible that RRs could change in new situations. Of particular concern in breast screening is that many of the trials were undertaken a long time ago, that the techniques of mammography have changed considerably, that DCIS is now commonly diagnosed through screening (section 4.6), that the treatments for breast cancer, particularly the drug treatment that can eradicate microscopic spread, have become more effective, and that the overall mortality rate from breast cancer has decreased in the United Kingdom and other countries. These points were put to the panel by some expert witnesses. One could therefore argue that breast screening is now less effective/relevant because even later stage cancers can be treated and/or cured, so there is less need to diagnose breast cancers earlier. However, there is a counter argument that because the systemic drug treatments are only partially effective, it could be that the major improvements that drug treatments have brought in cure rates are in fact in part due to breast screening: by diagnosing more cancers at an earlier stage, contemporary drug treatments have a better chance of eradicating microscopic disease, and thus the gains in survival would not have been as great if breast screening did not exist.
Both views have some supporting arguments, but the panel found no convincing evidence that one or other was more likely to be correct. Thus, the panel's view is that the benefits of screening and those of better treatments are best regarded as independent, and thus that the estimates of the relative reduction in breast cancer mortality achieved with screening are the same now as 20 years ago. However, the uncertainty about whether there could be an interaction between the benefits of screening and of contemporary treatments is not a reason for stopping breast screening.
Particular aspects for which there is at least some evidence about the external validity of the trials relate to age, screening intensity, and follow-up time. The RR does not appear to change much across the age range 50–69 years (Nyström et al, 2002), but it may be reduced below the age of 50 (Canadian Task Force on Preventive Health Care, 2011). The RR does not appear to depend strongly on the number of screens, or the screening interval, at least across the ranges studied in the trials. The only randomised trial that compared different screening intervals is inconclusive (Breast Screening Frequency Trial Group, 2002). Reports from trials with long follow-up suggest that little benefit in terms of breast cancer mortality is seen in the first 5 years after starting screening, and that the benefit lasts for at least 10 years after cessation of screening. This is not surprising, given the slow progression rates of many breast cancers.
The panel concludes that the current screening programmes in the United Kingdom, which invite women aged 50–70 every 3 years to undergo mammography, are likely to deliver about a 20% reduction in breast cancer mortality at ages 55–79 years. Clearly, there is uncertainty in this figure. In addition to the uncertainty owing to the limited numbers of breast cancer deaths across the trials, there are potential biases in the trials and concerns about the generalisability of results from the trials to the current UK screening programmes. We note, however, that the level of disagreement in the literature about the RR reduction is minor in comparison to the controversy about the absolute risk reduction.
3.4 Absolute risk reduction
The above discussion suggests a natural way to estimate the absolute risk reduction that applies to the current screening programmes in the United Kingdom. For women aged 50 invited to screening, we assume no benefit in breast cancer mortality until age 55, a 20% reduction at ages 55–79, and no change in the rates of other causes of death. An estimated 1.70% of UK women aged 50 are currently expected to die from breast cancer between the ages of 55 and 79; this is calculated from UK mortality rates (2008–2010) and takes into account the risks of dying from other causes. Since the UK programme has existed since the late 1980s, one may assume that this risk has already been reduced by 20% through screening. Hence, the risk without the screening programme would have been 2.13% (as 1.70/2.13=0.80), and the estimated absolute risk reduction is 2.13−1.70=0.43%.
The number of women needed to be invited for screening for 20 years starting at age 50 in order to prevent one death from breast cancer is therefore 1/0.43%=235 (using the unrounded reduction of 0.425%). An alternative way of expressing this is that, for every 10 000 women invited into the screening programme at age 50, about 43 deaths from breast cancer would be prevented.
The absolute risk reduction for women attending screening can be estimated as the absolute risk reduction in those invited divided by the average coverage rate in the NHS breast screening programme (77%), so about 0.43%/0.77=0.56%. The number of women needed to be screened for 20 years to prevent one death from breast cancer is then 1/0.56%=180. For every 10 000 women attending screening from age 50–70 years, about 56 deaths from breast cancer would be prevented.
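The arithmetic in this section can be retraced with a short sketch that uses only the figures quoted above (1.70% current risk, 20% RR reduction, 77% coverage). The variable names are illustrative. Small differences from the published numbers arise because the report rounds intermediate results.

```python
RISK_WITH_SCREENING = 0.0170  # risk of breast cancer death at ages 55-79, from the text
RRR = 0.20                    # assumed RR reduction from invitation to screening
COVERAGE = 0.77               # NHS breast screening programme coverage, from the text

# Back out the risk that would apply without a screening programme.
risk_without = RISK_WITH_SCREENING / (1 - RRR)

# Per woman invited.
arr_invited = risk_without - RISK_WITH_SCREENING
number_needed_to_invite = 1 / arr_invited

# Per woman attending: divide the absolute reduction by coverage.
arr_attending = arr_invited / COVERAGE
number_needed_to_screen = 1 / arr_attending

print(f"Risk without screening: {risk_without:.3%}")
print(f"ARR per woman invited:  {arr_invited:.3%}")
print(f"Number needed to invite: {number_needed_to_invite:.0f}")
print(f"ARR per woman attending: {arr_attending:.3%}")
print(f"Number needed to screen: {number_needed_to_screen:.0f}")
```

With unrounded intermediate values, the number needed to invite is about 235, as in the text, and the number needed to screen is about 181, which the report gives as 180 after rounding the absolute risk reduction to 0.56%.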
The above calculations are based on the same principles as those used in some publications (Advisory Committee on Breast Cancer Screening, 2006). Essentially, the RR reduction from the trials is regarded as approximately generalisable to the current UK screening programmes, and the corresponding absolute risk reduction is calculated by applying this RR reduction to the national rates of breast cancer mortality for an appropriate age group. The considerable uncertainty in the estimated RR reduction of 20%, as discussed in section 3.3, of course carries through to these estimates of absolute risk reduction.
The NHS screening programme estimates that 1400 lives are saved per year in England owing to breast screening (Advisory Committee on Breast Cancer Screening, 2006). For comparison and illustrative purposes, the panel estimates that for the 307 000 women (aged 50–52) who each year receive their first invitation to a 20-year screening programme (3-year average 2008/2009–2010/2011, The NHS Information Centre), 0.43% of 307 000, or about 1300, deaths from breast cancer per year are prevented. This is close to the NHS screening programme's estimate.
Different methods and estimates in the literature
The marked difference in estimates of absolute risk reduction proposed in the literature is one of the greatest sources of controversy about the value of breast cancer screening (McPherson, 2010). The different estimates stem from the very varied methods used for their calculation. When calculations are made directly from the trials' data themselves, the absolute risk reduction depends overwhelmingly on the underlying risk of breast cancer, which is principally governed by the age groups considered, the length of follow-up, and the population studied. Although this is obvious, it has also been empirically shown by comparing different durations of follow-up in the Swedish Two County trial (Tábar et al, 2011).
The Cochrane Review (Gøtzsche and Nielsen, 2011) focused on the Canada, Malmö, and UK Age trials as the only ‘adequately randomised' trials. The absolute risk of breast cancer death in the control groups of these trials was low (overall rate of 0.33%), partly because of the inclusion of the large UK Age trial (women initially aged 39–41) and the 13-year follow-up period considered rather than the 25-year period from age 55–79, used above by the panel. With the Cochrane Review authors' estimated 15% RR reduction, this leads to an estimated absolute risk reduction of 0.05%, or equivalently that 2000 women need to be invited to screening to prevent one breast cancer death.
An entirely different estimate is given by Duffy et al (2010) based on 22 years of follow-up for those aged 50–69 in the Swedish Two County trial, which estimated a 38% reduction in breast cancer mortality. The calculation considers the absolute risk reduction per woman screened across the 7 years of screening in the trial, and makes the strong assumption that the absolute benefits can simply be multiplied up to reflect the 20 years of screening in the UK programmes. This leads to an estimated absolute risk reduction of 0.88% in women screened, or equivalently that 113 women need to be screened to prevent one breast cancer death.
The US Task Force (Nelson et al, 2009) considered a period of 7 years of invitation to screening and 13 years of follow-up after first invitation. For ages 50–59 years, they estimated that 1339 women needed to be invited to prevent one death from breast cancer. For ages 60–69 years, their corresponding estimate was 377 women.
The Canadian Task Force (Canadian Task Force on Preventive Health Care, 2011) estimated from the trials that screening 720 women aged 50–69 years once every 2–3 years for about 11 years would prevent one death from breast cancer.
Beral et al (2011) summarised various published estimates of absolute risk reduction from the literature, and concluded that around one breast cancer death would be prevented in the long term for every 400 women aged 50–70 years regularly screened over a 10-year period, based on a previous review (Advisory Committee on Breast Cancer Screening, 2006).
From the above examples, it is clear that different methods of estimation give about a 20-fold difference in the estimates of absolute risk reduction. The panel's view is that to estimate the impact of the UK screening programmes on absolute risk of dying of breast cancer, it is necessary to consider the relevant underlying risk of breast cancer to which the RR reduction from the trials should apply. The panel believes this is best derived from the current UK national rate of breast cancer deaths for women aged 55–79 years. Calculations made directly from the absolute risks observed in the trials are heavily, and often misleadingly, influenced by the age groups included and the length of follow-up available (Beral et al, 2011). Estimates also depend on whether they are expressed per woman invited or per woman screened. We note, however, that to the extent that the absolute rate of breast cancer mortality in the United Kingdom is currently declining, the absolute risk reduction from the UK screening programme would also be expected to decline correspondingly in the future.
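The spread of published estimates can be tabulated with a brief sketch that uses the numbers needed to invite or screen quoted in this section. The labels are shorthand. The estimates mix "per woman invited" and "per woman screened", as well as different age groups and follow-up periods, so the comparison is only indicative.

```python
# Number of women needed to be invited or screened to prevent one breast
# cancer death, as quoted in the text above.
numbers_needed = {
    "Cochrane Review (invited)": 2000,
    "US Task Force, ages 50-59 (invited)": 1339,
    "Canadian Task Force (screened)": 720,
    "Beral et al (screened)": 400,
    "US Task Force, ages 60-69 (invited)": 377,
    "Panel (invited)": 235,
    "Panel (screened)": 180,
    "Duffy et al (screened)": 113,
}

for source, n in sorted(numbers_needed.items(), key=lambda item: -item[1]):
    # The absolute risk reduction is the reciprocal of the number needed.
    print(f"{source:38s} {n:5d}  ARR = {1 / n:.2%}")

fold_difference = max(numbers_needed.values()) / min(numbers_needed.values())
print(f"Largest / smallest: {fold_difference:.1f}-fold")
```

The ratio of the largest to the smallest figure is a little under 18, consistent with the roughly 20-fold difference described above.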
Life expectancy gained
A reduction in the risk of death from breast cancer will lead to an increase in life expectancy. As breast cancer is only one of many causes of death, the average gain in life expectancy from the UK screening programme is likely to appear modest. An estimate can easily be derived by contrasting the life expectancy for women aged 50, using current national rates of breast cancer mortality and deaths from other causes, to that which would apply if the rates of breast cancer mortality were 25% higher in each year from age 55–79 years. (25% higher corresponds to the assumed 20% benefit from screening, as 1.25=1/0.80.) This calculation leads to an estimate of 0.073 years (or 27 days) of life gained on average for each woman aged 50 invited to screening. To put this in perspective, the panel noted that the complete abolition of deaths from breast cancer would add 159 days on average to life expectancy for women aged 50.
We also note that this is a crude average of a zero gain for the vast majority of women and a substantial gain for a few. Alternative but equivalent ways of expressing this gain are as follows: (a) for the 307 000 women aged 50–52 who are invited for screening each year, about 22 000 years of life will be saved; (b) for each 10 000 women invited to screening, 730 years of life will be saved; (c) for each 10 000 attending screening, about 950 years of life will be saved; (d) given that 1 in about 180 women attending screening avoid breast cancer death, such a woman would expect to gain on average an extra 17 years of life.
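The equivalent expressions (a) to (d) follow from the average gain of 0.073 years by simple arithmetic, as the sketch below shows. It uses only figures quoted in the text, and the variable names are illustrative. The underlying life-table calculation of the 0.073 years is not reproduced here.

```python
GAIN_YEARS_PER_INVITED = 0.073  # average life years gained per woman invited
COHORT_SIZE = 307_000           # women receiving a first invitation each year
COVERAGE = 0.77                 # NHS programme coverage
NUMBER_NEEDED_TO_SCREEN = 180   # women screened per breast cancer death avoided

gain_days = GAIN_YEARS_PER_INVITED * 365.25

# (a) Life years saved for each annual cohort of newly invited women.
cohort_years = GAIN_YEARS_PER_INVITED * COHORT_SIZE
# (b) Life years saved per 10 000 women invited.
years_per_10k_invited = GAIN_YEARS_PER_INVITED * 10_000
# (c) Life years saved per 10 000 women attending.
years_per_10k_attending = years_per_10k_invited / COVERAGE
# (d) Average gain for a woman who avoids a breast cancer death.
years_per_death_avoided = years_per_10k_attending / 10_000 * NUMBER_NEEDED_TO_SCREEN

print(f"Gain per woman invited: {gain_days:.0f} days")
print(f"(a) {cohort_years:.0f} years per annual cohort")
print(f"(b) {years_per_10k_invited:.0f} years per 10 000 invited")
print(f"(c) {years_per_10k_attending:.0f} years per 10 000 attending")
print(f"(d) {years_per_death_avoided:.1f} years per death avoided")
```

The results agree with the rounded figures in the text: about 22 000 years per cohort, 730 and about 950 years per 10 000 women, and about 17 years per death avoided.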
3.5 Other considerations
The Edinburgh trial was the only UK trial in an age group that is within that of the national screening programme. However, as discussed in section 3.2, we excluded this trial because problems in the cluster randomisation led to a severe imbalance in socioeconomic status of the women between the groups, and socioeconomic status influences, in opposite directions, the risk of developing breast cancer and of dying from breast cancer. At 14 years of follow-up, the unadjusted results showed a 13% reduction in breast cancer mortality. However, on adjusting for socioeconomic status, the rate ratio was 0.79 (95% CI 0.60–1.02), a RR reduction of 21% (Alexander et al, 1999). Thus, although doubts must remain about the validity of this latter estimate, we note that it is very much in line with the figure of 20% we have used above.
In the preceding sections, we have focused exclusively on breast cancer mortality. Owing to the concerns about whether such deaths are reliably adjudicated in the trials, some authors have suggested that this has led to exaggerated estimates of the RR reduction, and that the outcomes of death from any cancer, or death from any cause, are the appropriate ones for judging the impact of breast screening on mortality. The panel disagrees with this: evaluating all-cancer or all-cause deaths in the trials will lack power because breast cancer deaths represent only a small proportion within these categories. In particular, a 20% RR reduction in breast cancer deaths for ages 55–79 years would yield only 3.0% and 1.2% RR reductions in all-cancer and all-cause deaths, respectively. The trials are not of sufficient size (in terms of numbers of women and length of follow-up) to allow such small RR reductions to be reliably estimated. Hence, a statistically non-significant effect for all-cancer or all-cause deaths in the trials cannot be interpreted as evidence against a reduction in breast cancer deaths.
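The dilution argument can be made concrete with a rough sketch. The implied shares of breast cancer deaths are inferred from the percentages quoted above, on the assumption that screening leaves other death rates unchanged. The event-count formula is a standard approximation for a two-group comparison with equal allocation (events roughly equal to 4(z₁ + z₂)² / (ln RR)²). It is not a calculation made by the panel, and the 5% significance level and 80% power are illustrative choices.

```python
import math
from statistics import NormalDist

def events_needed(rr, alpha=0.05, power=0.80):
    """Approximate total events needed to detect a given RR (1:1 allocation)."""
    z = NormalDist()
    z_sum = z.inv_cdf(1 - alpha / 2) + z.inv_cdf(power)
    return 4 * z_sum ** 2 / math.log(rr) ** 2

RRR_BREAST = 0.20       # RR reduction in breast cancer deaths, from the text
RRR_ALL_CANCER = 0.030  # corresponding reduction in all-cancer deaths, from the text
RRR_ALL_CAUSE = 0.012   # corresponding reduction in all-cause deaths, from the text

# Implied share of breast cancer deaths within each broader category,
# assuming screening has no effect on deaths from other causes.
share_of_cancer_deaths = RRR_ALL_CANCER / RRR_BREAST
share_of_all_deaths = RRR_ALL_CAUSE / RRR_BREAST
print(f"Implied share of all-cancer deaths: {share_of_cancer_deaths:.0%}")
print(f"Implied share of all-cause deaths:  {share_of_all_deaths:.0%}")

for label, rrr in (("breast cancer", RRR_BREAST),
                   ("all cancer", RRR_ALL_CANCER),
                   ("all cause", RRR_ALL_CAUSE)):
    print(f"{label:14s} events needed = {events_needed(1 - rrr):,.0f}")
```

Under these assumptions, detecting a 1.2% reduction in all-cause deaths would need several hundred times as many events as detecting a 20% reduction in breast cancer deaths, which illustrates why a non-significant all-cause result is uninformative.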
Some authors have argued that changes in the incidence of more advanced breast cancer, whether defined as above a certain tumour size or with spread to the ipsilateral axillary nodes, are a useful surrogate indicator of the effect of screening on breast cancer mortality in the trials, as the ultimate risk of dying of breast cancer depends in part on the stage of disease at first presentation. Although, on average, one could expect a breast cancer screening programme to lead to diagnosis of breast cancers at an earlier stage, this approach cannot, however, directly exclude lead time effects. The situation is further complicated by the issue of interval cancers, which have been shown in more than one study, as compared with screen-detected cancers, to be more often high grade, which is itself predictive of a poorer prognosis. However, what is less clear is whether the prognosis of a breast cancer is determined only by the stage when diagnosed, or whether in the absence of a screening programme the underlying biology is the main determinant of outcome, and this in turn influences when the cancers present. Thus, for those cancers diagnosed earlier by screening, it is not clear which, if any, of the clinical markers of prognosis (stage, size, grade etc.) is the best predictor of ultimate outcome; or is it some other fundamental characteristic only assessable by molecular biology?
Therefore, there appears to be little reason to use these surrogate outcomes as evidence for or against the benefits of screening, as substantial assumptions are needed to estimate the consequent effect on breast cancer mortality. Only if one wanted to disregard completely the evidence about breast cancer mortality from the trials, would the use of such surrogate outcomes have value.
There are possibilities of specific harms of screening in terms of induction of other cancers through the X-rays used in mammography or the radiotherapy or drug therapy used to treat breast cancer, and of coronary damage and deaths through radiotherapy (especially of the left breast). These potential harms are discussed in section 5.2.
Statistical and other uncertainties
It is conventional that results from statistical analyses, including meta-analyses, are presented with a measure of statistical uncertainty such as 95% confidence limits. Although these are helpful in giving an impression of the possible influence of the play of chance (given the sample sizes that are available in the studies considered), they fail to represent the uncertainties because of possible biases (from lack of internal validity of the studies) or owing to generalisation from the trials to a new context (external validity). So, the CI given for the RR reduction of breast cancer mortality from a meta-analysis of the trials is an understatement of the uncertainty about the RR reduction that applies to the UK screening programmes. A RR reduction of 20% represents the panel's judgement of the evidence, and should be regarded as an approximate figure rather than a precise estimate.
3.6 Observational studies
In addition to the trials, the panel also considered the value of observational studies in estimating the impact of screening on breast cancer mortality. The RCTs of mammographic screening were conducted at least 20 years ago and most over 30 years ago. Observational studies may help to quantify the effects of screening in an era with major improvements in diagnostic imaging, clinical care, and patient outcomes, as many of the observational studies are more recent than the trials. Both proponents and critics of screening have suggested that the observational studies are more relevant today than the RCTs. However, these studies are beset by many more biases with consequent problems of interpretation. It is also possible that they are more prone to selective reporting than trials, in that the results obtained determine the enthusiasm of the authors and journals for publication.
The biases inherent in observational studies differ by type of study. All share the common problem of potential lack of comparability of screened and unscreened women. It is this feature that the RCTs are designed to address. Each observational study design has strengths and weaknesses and, within each class, specific studies vary in their methods and credibility. The relative merits and problems of the various observational study designs are hotly contested both in the literature and in the evidence the panel heard.
Ecological and time-trend studies
Some observational studies compare time trends for breast cancer mortality in countries or areas before and after the introduction of screening, or concurrently between areas with and without screening. In the first type of study, extrapolation of time trends demands that decisions are made, for example, about the linearity or otherwise of the trend, the choice of time periods considered as ‘before' and ‘after' screening, and the age groups included. In the second type of study, choices have to be made about the areas to include, the time period considered, and the age groups included. Such decisions, which can appear to have been made rather arbitrarily, can have a profound impact on the estimates obtained. Lack of comparability and different time trends in the groups being contrasted could lead to substantial bias. For these reasons the panel does not consider that these types of studies provide reliable evidence on the effect of screening on breast cancer mortality, and amongst observational study designs we focus instead on case–control studies and incidence-based mortality studies.
Case–control studies
Case–control studies
Case–control studies compare the history of breast screening attendance between women dying of breast cancer and control women who did not die of breast cancer. They are prone to a number of potential biases. The main problem is that women who attend breast screening differ from those who do not attend. This is referred to as self-selection bias or the ‘healthy screened effect’. Attendance is influenced by social and demographic factors that are also likely to be related to the risk of dying from breast cancer, with the resulting bias potentially exaggerating the estimated effect of screening. Also, the existence of a breast screening programme in an area may be associated with better treatment of breast cancer. Women diagnosed with breast cancer in an area with a breast screening programme may therefore also receive more effective treatment than women in areas with no such programme. This would bias the study in favour of screening. Attempts are made to correct for the resulting biases by choice of controls and statistical adjustment (Connor et al, 2000; Duffy et al, 2002).
Some of the expert witnesses who gave evidence to the panel felt that case–control studies provided the most reliable form of observational data while others believed the opposite. The panel undertook a review of the individual characteristics of a number of case–control studies to assess the potential bias of each one (Appendix 3). In general, the studies matched controls to cases by both age and residence but some matched on just one of these variables. Self-selection bias was discussed in around three-quarters of the studies and statistically controlled for, using a variety of methods, in less than half of the studies (Appendix 3).
The case–control studies show a more favourable benefit of screening than the trials. The panel believes that this is plausibly because of inadequate control for self-selection bias rather than because screening is actually far more beneficial now than in the trials. Attempts to correct for self-selection bias were based on information from outside the study itself (either from a previous time period, or from other geographical areas) that may not be fully relevant. When adjustment was made, the apparent benefit of screening was diminished. The bias that screening could be associated with better treatment was controlled for in studies conducted in countries with uniform treatment services.
In conclusion, the panel notes that the beneficial effects of screening seen in the case–control studies are in the same direction as those seen in the trials, but that control for self-selection bias may be inadequate in many of the studies.
Incidence-based mortality studies
Njor et al (2012) conducted a review of European studies on the impact of service mammography screening on breast cancer mortality using incidence-based mortality. In these studies, only breast cancer deaths occurring in women with breast cancer diagnosed after their first invitation to screening are included. They classified the studies according to the type of comparison group: (1) women not yet invited, (2) historical data from the same region as well as historical and current data from a region without screening, and (3) a historical comparison group combined with data for non-participants.
They found that the effect of screening on breast cancer mortality varied across studies. The RRs were 0.76–0.81 in group 1; 0.75–0.90 in group 2; and 0.52–0.89 in group 3. Study databases overlapped in both Swedish and Finnish studies, adjustment for lead time was not optimal in all studies, and some studies had various other methodological limitations. There was less variability in the RRs after allowing for the methodological shortcomings. On the basis of evidence from the most reliable incidence-based mortality studies, they concluded that the most likely impact of European breast screening programmes was a breast cancer mortality reduction of 26% (95% CI 13–36%) among women invited for screening and followed up for 6–11 years.
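The relative risks quoted above can be read as percentage reductions in breast cancer mortality using the relation reduction = (1 − RR) × 100. The short sketch below is illustrative only: the function names are hypothetical, the RR ranges are those reported in the text for the three comparison groups, and rounding to whole percentages is an assumption made for display.

```python
def percent_reduction(rr):
    """Convert a relative risk into a percentage mortality reduction."""
    return round((1 - rr) * 100)


def reduction_range(rr_low, rr_high):
    """Return (smallest, largest) percentage reduction for an RR range.

    A lower RR means a larger reduction, so the order is reversed.
    """
    return percent_reduction(rr_high), percent_reduction(rr_low)


# RR ranges reported in the text for the three types of comparison group.
rr_ranges = {
    "group 1": (0.76, 0.81),
    "group 2": (0.75, 0.90),
    "group 3": (0.52, 0.89),
}

for group, (rr_low, rr_high) in rr_ranges.items():
    low, high = reduction_range(rr_low, rr_high)
    print(f"{group}: {low}-{high}% reduction in breast cancer mortality")
```

Read this way, the summary estimate of a 26% reduction corresponds to an RR of 0.74, which lies just below the range for group 1 and within the ranges for groups 2 and 3.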
Many observational studies have been published, and their conclusions hotly contested. In general, the more contemporaneous case–control and incidence-based mortality studies support the evidence from the trials that screening does have a beneficial effect on mortality. The panel's view is that the trials provide more reliable evidence for an estimate of mortality reduction. Nevertheless, the observational studies support the hypothesis that screening continues to be beneficial in an era of improved treatment.
The purpose of breast screening is to detect cancer early, before it has come to clinical attention. If all cancers would eventually be clinically recognised and treatment was the same and equally effective no matter when the tumour was diagnosed, then screening would be redundant. However, the understanding is that if the cancer is diagnosed earlier, then treatment will be more effective. This is the assumption on which screening is based. The evidence reviewed in section 3 supports that assumption.
As cancers are detected earlier because of screening, we expect the cancer incidence to be higher among screened women during the screening period. The time between the detection of a cancer at screening and when it would have presented clinically is the ‘lead time’, and it is an inevitable part of screening. In principle, when screening ceases the incidence should fall back so that by the end of the screening period plus lead time, the cumulative incidence in the screened and control populations should be the same.
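The expected pattern can be illustrated with a deliberately simple toy cohort. The sketch below is not a model used by the panel; it assumes a constant number of cancers presenting clinically each year, a fixed lead time for every screen-detected cancer, perfect detection while screening is running, and no overdiagnosis. The function name and all numbers are hypothetical.

```python
def cumulative_incidence(years, rate, screen_years=0, lead_time=0):
    """Cumulative diagnoses at the end of each year in a toy cohort.

    Assumes `rate` cancers would present clinically every year, that
    screening finds each cancer exactly `lead_time` years early, and that
    `years` is at least `screen_years + lead_time`.
    """
    diagnosed = [0] * years
    for t in range(years):  # t = year of clinical presentation without screening
        if screen_years and t - lead_time < screen_years:
            # Screen-detected; cancers already detectable at the first
            # screen are all found in year 0.
            year = max(0, t - lead_time)
        else:
            year = t  # presents clinically, as it would without screening
        diagnosed[year] += rate

    totals, running = [], 0
    for n in diagnosed:
        running += n
        totals.append(running)
    return totals


control = cumulative_incidence(years=30, rate=10)
screened = cumulative_incidence(years=30, rate=10, screen_years=20, lead_time=3)

# Excess during screening equals rate x lead time; it disappears once the
# screening period plus lead time has elapsed (end of year index 22).
print(screened[19] - control[19], screened[22] - control[22])
```

Under these assumptions the screened cohort carries an excess of 30 diagnoses (10 per year × 3 years of lead time) throughout the screening period, and the two cumulative curves meet once the screening period plus lead time has passed. Any excess that persists beyond that point in real data cannot be explained by lead time alone.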
Some screen-detected cancers, however, may never progress to become symptomatic (clinically detectable) while some women would die from another cause before the cancer became evident. This adverse consequence (harm) of screening is called overdiagnosis or overdetection. It is variously defined as the ‘detection of cancers on screening that would not have been found were it not for the screening test’ (IARC, 2002), or ‘that would never have clinically surfaced in the absence of screening’ (