Methodology and definitions
Updated 20 January 2026
Applies to England
Summary of data sources and analytical data sets used in this report
This report uses the following analytical data sets to produce the chapter outputs of the Tuberculosis Annual Report, England, 2025 (data up until 31 December 2024):
National Tuberculosis Surveillance (NTBS) is used in:
- TB incidence and epidemiology, England, 2024
- TB prevention, England, 2024; section on contact tracing of close contacts of people notified with active TB in NTBS and rate of active cases at ICB sub locations
- TB diagnosis and microbiology, England, 2024
- TB treatment and outcomes, England, 2024
- TB in children, England, 2024
Pre-entry TB screening data set is used in:
- TB prevention, England, 2024; section on pre-entry screening for active pulmonary TB in people applying for long term visas to the UK
National NHS England (NHSE) Latent TB Infection Testing data set (LTBI) is used in:
- TB prevention, England, 2024; section on LTBI testing of new entrant migrants to the UK with linkage to:
- (i) GP registration data for people assigned a type 4 flag to identify the population eligible for NHSE LTBI testing
NTBS data set
TB notifications
People who are diagnosed with TB in England, Wales, Northern Ireland and Scotland must be notified through NTBS. This report only includes data for individuals with TB who are resident in England or are treated in England (including individuals who are homeless or visiting from abroad).
Only individuals with disease caused by Mycobacterium tuberculosis complex (MTBC) are reported. Individuals were denotified and removed from the data set if the infective agent was identified as non MTBC or M. bovis Bacillus Calmette Guérin (BCG) subspecies.
Data production
In 2021, NTBS was launched and replaced 2 historical surveillance systems:
- the Enhanced Tuberculosis Surveillance system (ETS)
- the London TB Register (LTBR)
Data sets from 2018 onwards were extracted from ETS and LTBR and were merged with NTBS following a series of data migrations between July and December 2021. Data reported here was obtained from the merged data sets (NTBS, ETS, LTBR). Data was analysed from the last full extract on 23 April 2025. Further changes to the source data and the data set were applied after this date resulting from responses to specific data queries. A further extract of contact tracing fields was done on 4 August 2025.
Data cleaning to improve data quality
Denotifications
People with BCGosis, on chemoprophylaxis for latent TB infection or with a non tuberculous mycobacterial infection who were notified in error were identified using comments fields, and denotified. People with culture-confirmed TB who had been denotified were queried with clinics, and lab contaminations were removed, or people were renotified if they were found to have been denotified in error.
In addition, a probabilistic matching process was carried out for notifications between January 2022 and December 2024 to identify people with more than one notification within a 12-month period. Identified duplicates were denotified with any missing information transferred from the duplicate to the original notification.
Geography
The postcode field (used to map postcodes to geographic areas) was cleaned by identifying invalid postcodes based on matching to the May 2024 Postcode Directory from ONS. Where cleaning was necessary, the correct postcode was identified using the address fields.
For people who were homeless or who had a residence outside the UK, but were notified in England, the postcode of the clinic or hospital at which they were treated was assigned to the notification. For people with no postcode or treatment clinic or hospital, the local authority and UK Health Security Agency (UKHSA) centre were updated using the local authority field recorded based on the area that the notifying case manager was located in.
UKHSA region was derived from UKHSA region of residence based on the individual’s residential postcode. If missing, UKHSA region in which treatment occurred (most recently, as care may have been transferred) was used, for example if a person had no fixed abode.
Cleaned postcodes were assigned boundary layers and merged with boundaries for clinical commissioning groups, integrated care boards, upper tier local authorities (UTLAs) and local authorities sourced from the Central Lookups Database within the UKHSA Data Lake which is managed by the Public Health Data Science (PHDS) team. These are available in the UKHSA layers of the map software (GIS).
Site of disease
The site of disease was reclassified to pulmonary if a positive sputum smear (microscopy) sample was recorded or if a positive culture was grown from a pulmonary laboratory specimen. People with laryngeal TB were included in pulmonary breakdowns, and people with miliary TB were included in both pulmonary and extra pulmonary breakdowns. Site of disease for people with culture confirmation was reclassified based on the site in the body from which the specimen was taken. Site of disease classifications were also updated using the free text field for site of disease.
Social risk factors including prison and asylum status
The presence or absence of each social risk factor (current or a history of drug misuse, current alcohol misuse, current or history of homelessness, current or history of prison, current mental health needs and current asylum status, including if remanded in an immigration detention centre) was updated from missing or unknown if relevant information was found in the free text comments fields within NTBS.
Homelessness was updated to ‘yes’ if mentioned in the comments fields or if the address given was ‘no fixed abode’ or a shelter or hostel for homeless people was named.
Prison (current or in the past) was updated to ‘yes’ if mentioned in the comments fields, if HMP or a prison name was recorded as the address or if the residential postcode corresponded with a prison. Up until 2020, data on incident TB individuals reported to the Public Health in Prisons (PHiP) log were used to further identify people who had been imprisoned, but this has not been conducted since.
The immigration detainee variable was updated if the address given at notification, comments fields or occupation field showed the person to be an immigration detainee.
The asylum seeker variable (newly introduced in NTBS) was updated as asylum seeker if recorded in the occupation field subcategory of ‘no occupation’. For analysis, asylum seeker was then recoded as ‘yes’ if either asylum seeker variable was present. The asylum seeker variable was further updated so that all UK born individuals with a missing value for this variable were updated to ‘no’.
Demographic characteristics
Sex is a mandatory field in NTBS and reported as male or female and is intended to reflect biological sex at birth rather than gender. There are known mechanisms by which biological sex can affect TB risk and its clinical expression. Both sex and gender may be associated with behavioural risk factors and healthcare. Where missing from the raw data (pre-2022), it was derived from the name of the individual where names were unambiguous. In 2024 sex was withheld for one individual and this notification was therefore excluded from age and sex breakdowns.
Age and age groups were derived from the date of notification and date of birth. Notification demographics were used for tracing against the Personal Demographics Service (PDS). The Demographics Batch Service (DBS) enables a user to submit a file of patient demographics for tracing against the PDS, providing back the NHS number and most up-to-date demographics where an exact match is found. Those with conflicting values for age or inconsistent mortality information were cross referenced against the matched PDS data to resolve and checked with case managers.
UK and non-UK born status occurs in the raw data. It was amended if missing and the country of birth indicated non-UK birth.
Entry to the UK is entered as year only by NTBS users. A proxy date of 15 July of the reported year is then assigned and years since entry derived as notification date minus the proxy UK entry date.
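The derivation of years since entry can be sketched as follows. This is a minimal illustration only: the function name is hypothetical and the conversion from days to years (dividing by 365.25) is an assumption, as the report does not state how the difference is expressed.

```python
from datetime import date


def years_since_entry(notification_date: date, entry_year: int) -> float:
    """Years between the proxy UK entry date and the notification date.

    The proxy entry date is 15 July of the reported entry year, as described
    in the text. Dividing by 365.25 is an assumption made for illustration.
    """
    proxy_entry = date(entry_year, 7, 15)
    return (notification_date - proxy_entry).days / 365.25


print(years_since_entry(date(2024, 7, 15), 2020))
```

Because the proxy falls mid-year, a person notified before 15 July in their year of entry would receive a small negative value under this sketch; how such values are handled is not described in the report.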
HIV co-infection
HIV test result data is not collected within user entered NTBS. Instead, results are obtained from the UKHSA HIV and AIDS Reporting Section (HARS), including individuals aged 15 or over diagnosed with HIV in England up to 31 December 2024 (extracted 9 May 2025). TB data were extracted from NTBS on 23 April 2025 and included individuals aged 15 or over notified with TB between 2000 and 2024. Matching involved an initial probabilistic step conducted by the TB unit using SQL on a secure UKHSA cluster (code stored in a UKHSA GitHub repository), followed by deterministic matching by HARS. Matching variables included country of birth, surname Soundex, initial, date of birth, gender, ethnicity, postcode, LSOA, hospital, TB flag within the HIV data set, and date of death. Due to reduced postcode completeness in recent years, LSOA was added, although its variability over time reduced match rates as the algorithm penalised mismatches. Major changes to the matching algorithm were not feasible; therefore, existing matches up to 2023 were retained, and 2024 matches appended using the same approach.
Pre-entry screening data set
This data set comprises person level data of the results of screening for active pulmonary TB in people when applying for long term (more than 6 months) visas to visit the UK.
Data sources, collection and data set production
The pre-entry screening data set comprises data collected from International Organisation for Migration (IOM) and non-IOM clinics. IOM data was collected by IOM panel physicians, entered via a secure web-based IOM system and collated by the central office in Manila. This data was then securely transferred to UKHSA. Data from non-IOM providers was collected by the clinics, collated via the Home Office United Kingdom Visas and Immigration (UKVI) unit and securely transferred to UKHSA.
Data from IOM clinics was updated via their web-based portal prior to submission to UKHSA.
Personal data (age, sex, and visa type) is collected directly during screening at IOM clinics and stored and shared securely on the IOM web-based platform, ensuring complete records. For non-IOM clinics, personal data is not shared due to data sensitivity. Instead, this information is obtained from the Home Office using the individual’s passport number. Linkage to obtain personal data is only possible for those who have undergone screening and successfully applied for and received a UK visa. As a result, personal data from non-IOM clinics may be incomplete.
Data from the period up until 31 December 2024, as received by 1 June 2025, was used in this report. The number of screening episodes was reported for the period of January 2014 to December 2024. The number of people with confirmed TB and confirmed TB case detection rates were reported for the period January 2018 to December 2024 only, because data quality in the period January 2014 to December 2017 was too poor to determine people with confirmed TB.
Data production and cleaning
Data from IOM and non-IOM clinics was analysed separately due to the different methods through which this data is recorded and differences in data quality.
In both non-IOM and IOM clinics, an individual person was defined as a unique passport number, but this may not be accurate as one individual may have more than one passport number.
In data from non-IOM clinics, one screening episode was defined as one unique passport number, examination date and clinic. This variable was used to link to the sputum updates data. In data from IOM clinics, one screening episode was defined as one unique system identification number.
NHSE LTBI data set
This comprises 2 data subsets: a data set of people eligible for testing and a data set of individuals for whom LTBI test data was received and who appeared to be eligible. The LTBI test data set was matched to the eligible population data set to determine the proportion of the eligible population tested. This was then matched to the cleaned NTBS analytical data set to derive data sets of: (i) eligible individuals who were not tested through the programme and who developed active TB disease, and (ii) eligible individuals who were tested through the programme who subsequently developed active TB disease.
Data sources, collection and processing
New entrant migrant population eligible for NHSE LTBI testing (Type 4) data set
The Patient Registration Data System (PRDS), managed by the NHS, holds records of all patients registered with GPs in England and Wales. When individuals whose previous address was outside the UK and who have spent more than three months abroad register with a local GP, they are assigned a specific code (Flag 4). This data is also known as Type 4 GP data.
This data is processed by the TB Unit to provide TB services in participating integrated care boards (ICBs) and ICB sub locations with lists of eligible new entrant migrants in their areas. The Flag 4 data is obtained from the Spine Demographic Service (SDS) on a quarterly basis by NHS Digital via a secure encrypted user interface and provided to the TB Unit. The data is then deduplicated and cleaned, including country of birth and age as key eligibility criteria. The TB Unit then sends the data to participating ICB subsections in an encrypted format in encrypted emails. A few (5) TB services choose other methods to determine newly registered patients.
As of 1 July 2022, clinical commissioning groups (CCGs) were replaced by ICBs. Therefore this report presents data by ICBs reported in Supplementary Table 13 of the TB prevention in England data set.
Data cleaning to improve data quality of the eligible population data set (Type 4 data)
Data extracted from the LTBI portal was deduplicated by accepting the first valid test result and NHS number. This data is then added to the database the LTBI programme queries. If the first test result was valid but indeterminate the subsequent retest result was used, if available. To improve the completeness and accuracy of patient identifiable information, records were matched to NHS Spine, NTBS and Type 4 GP data sets using NHS number. Where no NHS number was available, the forename, surname and date of birth were used. Data for NHS Spine were sent with the unique data set identifier to match the data to the correct record. Postcodes were matched to those in the ONS NSPL (National Statistics Postcode Lookup) look up to obtain geographical locations. After cleaning, records of individuals who were not born in or travelled from an eligible country or not aged 16 to 35 years at the time of testing (based on date of birth and first LTBI test date) or those that were not tested in programmatic ICBs were removed from this analytical data set.
Records with implausible dates including: treatment date being earlier than test date, retest date before test date and treatment start date after treatment completion date were excluded from the data set.
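The date plausibility rules above can be sketched as a single check. The function and argument names are hypothetical, and treating a missing date as passing the check is an assumption, since the report does not say how missing dates were handled at this step.

```python
from datetime import date
from typing import Optional


def has_implausible_dates(
    test: Optional[date],
    retest: Optional[date],
    treatment_start: Optional[date],
    treatment_end: Optional[date],
) -> bool:
    """Return True if a record breaks any of the three date-order rules."""

    def earlier(a: Optional[date], b: Optional[date]) -> bool:
        # Only compare when both dates are present (assumption).
        return a is not None and b is not None and a < b

    return (
        earlier(treatment_start, test)  # treatment before test
        or earlier(retest, test)  # retest before test
        or earlier(treatment_end, treatment_start)  # completion before start
    )
```

Records for which this returns True would be the ones excluded from the analytical data set.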
Matching of data sets to NTBS analytical data set
After cleaning, single person records were deterministically matched from the eligible population data set with those from the LTBI tested data set and the cleaned NTBS analytical data set using various combinations of NHS number (including PDS matched NHS number), forename, initial, family name, family name soundex, date of birth, gender and postcode. This was undertaken using SQL on a secure server and the code saved in a UKHSA internal GitHub repository.
Limitations of the data
The estimate of the eligible population for NHSE LTBI testing is based on the Flag 4 GP data for participating services. Readers should be aware that not all new entrant migrants will register with a GP and will thus be missed by the programme. The estimate also does not capture new entrant migrants who would be eligible but who register with a GP in a non participating area of England, typically one with low TB incidence.
The LTBI test data set is also likely an underestimate of the total tests completed in people meeting the eligibility criteria, because some regions did not upload data (for example Manchester within the Greater Manchester ICB) or uploaded incomplete data. In 2024, place of birth was missing for 8,069 test records from laboratories, leading to these being excluded as eligibility could not be determined.
Data on prophylactic treatment for LTBI is poorly completed within the LTBI portal. Of 6,097 positive tests, 5,084 were missing a treatment start date and 5,382 were missing a treatment completion date (there was no completion date for 301 that started treatment). TB services receive the programmatic incentive payment based on the number of valid test results entered, but not treatment data. Variation in local human resources has resulted in variable data entry on treatment completion over time and location.
Bacillus Calmette Guérin (BCG) vaccine coverage
BCG vaccine coverage presents data of children up to 3 months old in England. The data source is produced by the UKHSA; the data for this annual report is derived from the immunisation page on the UK Government official statistics website on vaccine coverage which also provides further information on data sources.
Immunisation data is published quarterly and gives a breakdown by local authority and regional totals for each programme. In the TB annual report we provide a summary by region and by local authority with a 3-year TB incidence rate of greater than 20 per 100,000 as per Supplementary Table 8 in the TB incidence and epidemiology chapter for this reporting year.
In previous reports, BCG data was reported by financial year as per NHS England’s reporting. As of 2024, we reported this data according to calendar year to align with the rest of the report. For this report, we combined and presented quarterly vaccination coverage statistics for children aged up to 5 years in the UK from Cover of vaccination evaluated rapidly (COVER) programme 2023 to 2024: January to March 2024 and from Cover of vaccination evaluated rapidly (COVER) programme 2024 to 2025: quarterly data April to June 2024, July to September 2024 and October to December 2024.
COVER presents BCG data on the number of children who reached 3 months in reporting quarter and the BCG percentage coverage at 3 months. The analysis further calculates the number of vaccinated children.
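The number of vaccinated children can be recovered from the two published COVER figures, as in the sketch below. The function name is hypothetical and rounding to the nearest whole child is an assumption; the report does not describe its rounding.

```python
def vaccinated_children(reached_3_months: int, coverage_percent: float) -> int:
    """Estimate the number vaccinated from the denominator and % coverage."""
    return round(reached_3_months * coverage_percent / 100)


print(vaccinated_children(1200, 75.0))
```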
Report methodology and definitions
TB notifications
Individuals with TB are reported by area of residence and by calendar year of notification.
Social risk factors
People with TB are reported as having at least one social risk factor (SRF) (‘yes’) if any of the 6 social risk factors (current alcohol misuse, current or a history of homelessness, drug misuse, imprisonment, current asylum seeker status and current mental health needs) had ‘yes’ recorded. As a result, the denominator is all notifications. This assumes that people for whom no data were recorded for individual SRFs were a ‘no’ and may result in underestimation.
Data for individual social risk factors reported is limited to those with recorded data, for example a ‘yes’ or a ‘no’. As a result, the denominators for these are smaller than all notifications due to missing data. If there is significant under reporting of SRFs in those with missing data, this should result in a better estimate of the true proportion of the people with each SRF. However, if data is more likely to be recorded if the response is a ‘yes’ this could result in an overestimate. This may be the case for the asylum seeker SRF, especially in years before 2021.
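The difference between the 2 denominators can be illustrated with a short sketch. The field names and record layout are hypothetical; only the denominator logic follows the text.

```python
# Hypothetical field names for the 6 social risk factors.
SRFS = ("alcohol", "homelessness", "drugs", "prison", "asylum", "mental_health")


def any_srf_proportion(records):
    """At least one SRF: denominator is all notifications (missing counts as no)."""
    with_any = sum(1 for r in records if any(r.get(f) == "yes" for f in SRFS))
    return with_any / len(records)


def single_srf_proportion(records, factor):
    """One SRF: denominator is only notifications with a recorded yes or no."""
    known = [r for r in records if r.get(factor) in ("yes", "no")]
    if not known:
        return None
    return sum(1 for r in known if r[factor] == "yes") / len(known)


records = [{"prison": "yes"}, {"prison": "no"}, {}, {"alcohol": "yes"}]
print(any_srf_proportion(records), single_srf_proportion(records, "prison"))
```

In this toy data set, 2 of 4 notifications have at least one SRF, while the prison proportion is 1 of the 2 notifications with a recorded value.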
Mental health is recorded by TB case managers and is based on their judgement if mental health concerns are likely to affect the person’s ability to adhere to treatment. This was added to surveillance in London UKHSA centre in 2018 and is a simple ‘yes’ or ‘no’ response. It was introduced nationally in 2021 with the introduction of NTBS. Here we report this as the person has need of support for mental health and therefore has ‘mental health needs’.
Asylum seeker status and immigration removal centre were first added to national surveillance as discrete variables in 2020. Prior to this, ‘asylum seeker’ status was extracted from free text comment fields and user entered values within occupation (LTBR). As a result, more complete data on this exposure is apparent from 2020, and after the complete roll out of NTBS in 2021, when compared with previous years. Alcohol misuse is as recorded by case managers and is based on their judgement if current alcohol misuse is likely to affect adherence to treatment.
History of drug misuse, homelessness and prison are self-reported by individuals and are first asked as a ‘yes’ or ‘no’ response and then with additional information on duration; as current, within last 5 years or more than 5 years ago. Unless indicated otherwise, analyses here present these SRFs as ‘yes’ if either history of, or a duration value, was recorded.
Pre-entry screening active TB case definitions
In both non-IOM and IOM clinics, a person with laboratory-confirmed TB met the following criteria: had an abnormal chest X-ray (CXR) consistent with TB and a positive sputum culture result or positive culture result in the absence of a valid CXR result (for example in pregnant women or young children).
In IOM clinics, in the absence of sputum test confirmation, TB was clinically confirmed if the following were documented: a clinician’s judgement that the patient’s clinical and/or radiological signs and/or symptoms are compatible with tuberculosis, and a clinician’s decision to treat the patient with a full course of antituberculosis therapy.
In non-IOM clinics, in the absence of sputum test confirmation, TB was clinically confirmed if the reason for no certificate given was referred for treatment, and/or TB was recorded as confirmed and a clearance certificate was not issued.
In non-IOM data, in the absence of reported sputum culture results where there was an abnormal CXR suggestive of TB, an applicant was considered a possible TB case if they were recorded as TB suspected, and a TB clearance certificate was not issued.
Reports published prior to 2023 used both possible and confirmed TB cases to determine the TB case detection rate, but the inclusion of possible cases is likely to result in an overestimate of the true number of cases detected. In the current report the TB case detection rate was determined by the number of people with laboratory- or clinically-confirmed TB only out of the number of people screened.
Diagnostic and laboratory tests
Data for TB isolates from the National Mycobacterial Reference Service (NMRS) are matched to TB notifications. Isolates are deduplicated and summarised to only report one isolate per TB notification per notification period.
NTBS also includes user entered fields to record whether a culture sample and other diagnostic tests, such as polymerase chain reaction (PCR), were undertaken and the results of these tests. These data fields are combined to generate a final test status variable for the different tests for all the notified cases.
Culture and other diagnostic test results are then reported as below.
Any test performed:
- Yes: any value recorded in NTBS of any test type variables (culture, PCR, microscopy, histology, or chest X-ray), test result and date of test regardless of result
- No: no recorded value of variables of test type, test result and date of test
Any test positive:
- Yes: positive test result recorded for any test type (culture, PCR, microscopy, histology or chest X-ray)
- No: no results or negative test result recorded to all test types (as above)
Culture confirmed:
- Culture confirmed: supported by NMRS laboratory result of a positive culture for MTBC
- Culture unconfirmed: negative culture, or no NMRS results for culture, surveillance system states no culture undertaken, no other supporting information
Speciation
Species defined as M. tuberculosis, M. bovis, M. microti, M. africanum or Mycobacterium tuberculosis complex (MTBC).
MTBC is assigned to those not fully speciated by previous PCR-based methods or WGS. The introduction of WGS has decreased the number of notifications in this category.
Drug resistance definitions
The resistance reported follows this classification. Most resistance proportions use the number of culture positive cases as the denominator. The only exceptions are the proportion of quinolone resistance (see TB diagnosis and microbiology in England, 2024, Supplementary Table 19) and the phenotypic testing of second line drugs for RR or MDR TB, where the denominator is tests performed (see TB diagnosis and microbiology in England, 2024, Supplementary Table 20).
Definitions:
- rifampicin-resistant or multidrug-resistant TB (RR or MDR TB) is defined as resistance to rifampicin with or without isoniazid resistance
- isoniazid mono resistance is defined as resistant to isoniazid but not reported as resistant to rifampicin
- pre-extensively drug-resistant TB (pre XDR TB) are TB strains which fulfil the definition of multidrug-resistant or rifampicin-resistant TB (MDR or RR TB) and which are also resistant to any fluoroquinolone (levofloxacin and/or moxifloxacin plus historically used levofloxacin and ofloxacin)
- extensively drug-resistant TB (XDR TB) are strains that fulfil the definition of MDR or RR TB and which are also resistant to any fluoroquinolone and at least one additional Group A drug (Group A drugs are the most potent group of drugs in the ranking of second line medicines for the treatment of drug-resistant forms of TB using longer treatment regimens and comprise levofloxacin, moxifloxacin, bedaquiline and linezolid)
Isolates may be resistant to other antibiotics in addition to those described above.
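The hierarchy of the definitions above can be sketched as a simple classifier over the set of drugs an isolate is resistant to. The function and label names are hypothetical. Reading "at least one additional Group A drug" as bedaquiline or linezolid (the Group A drugs that are not fluoroquinolones) is an assumption made for this illustration.

```python
FLUOROQUINOLONES = {"levofloxacin", "moxifloxacin", "ofloxacin"}
# Assumption: the "additional" Group A drugs are the non-fluoroquinolone ones.
OTHER_GROUP_A = {"bedaquiline", "linezolid"}


def classify_resistance(resistant: set) -> str:
    """Assign the most severe category that the resistance pattern meets."""
    rifampicin = "rifampicin" in resistant
    fluoroquinolone = bool(resistant & FLUOROQUINOLONES)
    if rifampicin and fluoroquinolone and resistant & OTHER_GROUP_A:
        return "XDR TB"
    if rifampicin and fluoroquinolone:
        return "pre XDR TB"
    if rifampicin:
        return "RR or MDR TB"
    if "isoniazid" in resistant:
        return "isoniazid mono resistance"
    return "none of the listed categories"


print(classify_resistance({"rifampicin", "isoniazid", "moxifloxacin"}))
```

The checks run from most to least severe so that an isolate meeting several definitions is reported once, under the highest category.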
Laboratory-confirmed resistance
Resistance is reported as either resistant or sensitive. Testing is by whole genome sequencing alone or in combination with phenotypic testing. Discordances between the 2 testing methods were resolved by the reference laboratory and the reported value is used for this data analysis. The denominator for all resistance proportions is culture positive notifications, for example known resistance reported as a proportion of culture-positive cases (regardless of whether the others were sensitive or unknown resistance).
M. bovis is intrinsically resistant to pyrazinamide. The designation of resistance using genomics relies on a database of known resistance. A change in the pncA gene (gene encoding pyrazinamidase) associated with lineage 1 . Isolates that are lineage 1 with changes in above gene are checked using phenotypic methods and coded as resistant if resistant by phenotype and sensitive by WGS.
Where ethambutol and pyrazinamide results are missing or unknown (and not lineage 1 for pyrazinamide) but results are known to be sensitive for isoniazid and rifampicin, these are .
In July 2024, the manufacturer of culture media for pyrazinamide pDST issued a field safety notice, mentioning the risk of false resistance results from June 2023. Therefore, in this report we report on all 4 first line drugs up to 2022 but omit pyrazinamide resistance results in 2023 and 2024. In agreement, the category resistance to any first line drug in 2023 does not include pyrazinamide.
Quinolone resistance is determined by mutations detected by WGS in the quinolone resistance determining region genes gyrA, gyrB and parC, or if resistant to moxifloxacin or levofloxacin when tested by pDST.
Currently, WGS mutations are not used to confirm resistance to the following group A drugs and novel agents: linezolid, bedaquiline and delamanid. Isolates with rifampicin resistance and others meeting particular clinical criteria are sent to a specialised laboratory for pDST for these drugs.
This year we also report on the phenotypic test results for the drug pretomanid. For this drug we use a different threshold for lineage 1 disease. Included as resistant are isolates from people with lineage 1 disease that showed a minimal inhibitory concentration (MIC) of more than 2.0 mg/L, and for people with any other lineage of M. tb a MIC of more than 0.5 mg/L is considered resistant.
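The lineage-specific threshold can be written as a small rule, sketched here with a hypothetical function name. The comparison is strict, following the wording "more than".

```python
def pretomanid_resistant(mic_mg_per_l: float, lineage: int) -> bool:
    """Apply the MIC threshold: 2.0 mg/L for lineage 1, 0.5 mg/L otherwise."""
    threshold = 2.0 if lineage == 1 else 0.5
    return mic_mg_per_l > threshold


print(pretomanid_resistant(1.0, lineage=1), pretomanid_resistant(1.0, lineage=2))
```

An MIC of 1.0 mg/L is therefore not counted as resistant in lineage 1 disease but is counted as resistant in any other lineage.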
Acquired resistance
This is resistance in a person with more than one sample over time where the first sample shows sensitivity to a given drug and second sample is resistant.
Treated as resistant
This includes notifications that have no culture result but are recorded in NTBS, or the multidrug resistance database, or comments in NTBS indicate the individual has been treated as MDR with a second line drug regimen (for example, contacts of MDR individuals with active TB treated for MDR, or those diagnosed and/or started treatment abroad, or people intolerant to rifampicin).
Total MDR or RR cohort
This includes both those with culture-confirmed MDR or RR TB and those who were treated as resistant with second line drug regimen.
Clustering isolates
WGS was implemented for all of England in 2018. Results are available only if the isolate was successfully cultured. An isolate is defined as being in a cluster if it has 12 or fewer genetic differences (known as single nucleotide polymorphisms or SNPs) between it and another isolate that has previously been sequenced.
More detail on UKHSA’s approach to WGS based typing is found in the WGS handbook.
The current database includes samples from devolved nations and research samples; therefore, we report positive clusters where there is more than one person in the cluster from England. The definition for clustered is:
- Yes: 12 SNPs or fewer from another person’s sample and there is more than 1 person resident in England in the cluster
- No: the sample is more than 12 SNPs from any other person’s sample that has been sequenced in the UKHSA database
However, the proportion of notifications clustered is reported as the number of clustered isolates (each corresponding to a single notification) as a percentage of all culture positive cases. This is also used for the risk ratio analysis of risk of a notification being in a cluster.
Note that contacts may be identified and assumed to be in clusters based on epidemiological information obtained through contact tracing. However, only those with active disease and WGS information are reported here.
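The clustering definition can be sketched as follows, given pairwise SNP distances between isolates. This is an illustration only: grouping by linking any 2 isolates within the threshold (single linkage), the input layout and all names are assumptions, and the sketch does not represent the UKHSA pipeline.

```python
def clustered_flags(distances, in_england, threshold=12):
    """Flag isolates in a cluster that has more than one person from England.

    distances: dict mapping (isolate_a, isolate_b) to a SNP distance.
    in_england: dict mapping isolate to True if the person is resident in England.
    """
    parent = {i: i for i in in_england}

    def find(x):
        # Union-find root lookup with path halving.
        while parent[x] != x:
            parent[x] = parent[parent[x]]
            x = parent[x]
        return x

    # Link isolates that are within the SNP threshold of each other.
    for (a, b), snps in distances.items():
        if snps <= threshold:
            parent[find(a)] = find(b)

    # Count England residents in each linked group.
    england_count = {}
    for isolate, resident in in_england.items():
        if resident:
            root = find(isolate)
            england_count[root] = england_count.get(root, 0) + 1

    return {i: england_count.get(find(i), 0) > 1 for i in in_england}
```

Under this sketch, an England resident whose only close match is a sample from a devolved nation is not flagged as clustered, which mirrors the requirement for more than one person from England in the cluster.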
TB treatment, diagnostic and treatment delays
Enhanced case management and directly observed treatment
Numbers and proportions of people with enhanced case management (ECM) per level, and those receiving directly observed treatment (DOT) were calculated for all of those with information on ECM and DOT available. People who had information on DOT but were missing ECM data were coded as ‘Yes’ for any ECM and coded into level 3 of ECM. Those who had missing information on any ECM but were recorded as being in level 0 of ECM (equivalent to standard treatment) were recoded as having ‘No’ in the binary variable of ECM required, thereby considerably reducing the proportion of notifications with missing information.
The percentage of any ECM was calculated as the proportion of cases that reported ‘Yes’ (1) to ECM, or (2) to DOT offered or (3) DOT received out of all cases with information. The percentage of ECM per level was calculated as the proportion of cases with a known level of ECM out of all cases with information on ‘any ECM required’ (‘Yes’ or ‘No’). The percentage missing data is calculated as the proportion of all TB notifications for each year with no information recorded in (1) ECM required, (2) ECM level required or (3) DOT offered or (4) DOT received.
Diagnostic delays
Delay to TB diagnosis is calculated as the difference in days between the self reported date of TB symptom onset and the date of TB diagnosis as recorded in NTBS. Diagnostic delays are not calculated for those who were diagnosed with TB at post mortem or those with missing data, so these are not included in the denominator for the proportion of people with delays to TB diagnosis. Negative diagnostic delays, resulting from symptoms presenting post diagnosis, were also excluded from the analysis as these are likely to indicate data errors or treatment side effects as opposed to disease symptoms.
Reporting delays
Reporting delay is calculated as the difference in days between the TB diagnosis date and the date of TB notification to NTBS/ETS. Reporting delays are not calculated for those with missing data for date of diagnosis, so these are not included in the denominator for the proportion of people with reporting delays. People whose notification date was before the diagnosis date (that is, clinical diagnosis happened before laboratory-confirmed diagnosis) were reported as having 0 days of reporting delay.
Treatment delays
Treatment delay is calculated as the difference in days between the self-reported date of TB symptom onset and the date treatment started as recorded in UKHSA’s surveillance systems (NTBS/ETS). Treatment delays are not calculated for those who have not started treatment, those who were diagnosed with TB at post mortem and those with missing data, so these are not included in the denominator for the proportion of people with treatment delays.
Treatment delays exceeding 2 years (730 days) are excluded from analysis as symptoms lasting for over 2 years are thought to relate to another episode of TB. Negative treatment delays, resulting from symptoms presenting post treatment start, were also excluded from analysis as these are likely to indicate data errors or treatment side effects as opposed to disease symptoms.
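The 3 delay definitions above (diagnostic, reporting and treatment) can be sketched in Python as below. This is a minimal illustration and not the code used for the report: the function and parameter names are hypothetical, and None is assumed to represent both a missing date and a delay that is excluded from the denominator.

```python
from datetime import date
from typing import Optional

MAX_TREATMENT_DELAY_DAYS = 730  # delays exceeding 2 years are excluded


def diagnostic_delay(onset: Optional[date], diagnosis: Optional[date],
                     post_mortem: bool = False) -> Optional[int]:
    """Days from symptom onset to diagnosis; None if excluded."""
    if post_mortem or onset is None or diagnosis is None:
        return None
    days = (diagnosis - onset).days
    return days if days >= 0 else None  # negative delays are excluded


def reporting_delay(diagnosis: Optional[date],
                    notification: Optional[date]) -> Optional[int]:
    """Days from diagnosis to notification; notification first counts as 0."""
    if diagnosis is None or notification is None:
        return None
    return max((notification - diagnosis).days, 0)


def treatment_delay(onset: Optional[date], treatment_start: Optional[date],
                    post_mortem: bool = False) -> Optional[int]:
    """Days from symptom onset to treatment start; None if excluded."""
    if post_mortem or onset is None or treatment_start is None:
        return None
    days = (treatment_start - onset).days
    if days < 0 or days > MAX_TREATMENT_DELAY_DAYS:
        return None
    return days
```

Note the different handling of negative values: a negative diagnostic or treatment delay is dropped, whereas a notification made before the diagnosis date is kept and counted as a reporting delay of 0 days.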
TB cohort definitions
For the purposes of reporting treatment outcomes for people with TB, 2 mutually exclusive cohorts are defined. They are:
- MDR/RR TB cohort: people with TB who were diagnosed with MDR or RR TB and/or were treated with a second line drug regimen for MDR or RR TB
- non-MDR/non-RR TB cohort: people who were not identified as MDR or RR TB and were treated with a first line treatment regimen for non-MDR or non-RR TB
Under this definition, people with TB resistance to isoniazid, ethambutol and/or pyrazinamide but without resistance to rifampicin are included in the non-MDR/non-RR TB cohort.
Outcomes are reported for the non-MDR/non-RR TB cohort according to the year of notification up to, and including, 2021. This is to ensure that at least one year of data is available to report treatment outcome by the expected standard treatment duration of less than 12 months. In this cohort, outcomes are reported separately for persons with CNS disease, or for those in whom CNS disease cannot be excluded, which includes those with spinal, cryptic disseminated or miliary disease. For this sub group, the last recorded treatment outcome is reported, as standard treatment lasts a minimum of 12 months.
Outcomes are reported for the MDR or RR TB cohort according to the year of notification, up to, and including, 2020. This is to ensure availability of data for the expected standard treatment duration of up to 24 months.
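The cohort assignment and the reporting cut-off years described above can be expressed as a short Python sketch. The function names and boolean inputs are hypothetical and chosen for illustration. The cut-off years are those stated in the text.

```python
# Last notification year for which outcomes are reported, as stated in the text.
LAST_OUTCOME_YEAR = {"non-MDR/non-RR": 2021, "MDR/RR": 2020}


def tb_cohort(mdr_or_rr_diagnosed: bool, second_line_regimen: bool) -> str:
    """Assign one of the 2 mutually exclusive treatment outcome cohorts.

    second_line_regimen means treated with a second line regimen for MDR or RR TB.
    """
    if mdr_or_rr_diagnosed or second_line_regimen:
        return "MDR/RR"
    return "non-MDR/non-RR"


def outcome_reported(cohort: str, notification_year: int) -> bool:
    """True if outcomes for this cohort and notification year are reported."""
    return notification_year <= LAST_OUTCOME_YEAR[cohort]
```

For example, a person with isoniazid resistance but no rifampicin resistance who is treated with a first line regimen falls into the non-MDR/non-RR cohort under this sketch.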
TB treatment outcomes were extracted from NTBS (2020 to 2022) and ETS (2001 to 2020) and cleaned and validated using comment fields, post mortem diagnoses, dates of key events and case manager follow up. TB diagnoses that were recorded at post mortem were excluded from TB treatment outcomes as these cases were not treated. These deaths are reported separately and added to TB treatment deaths to report total TB deaths. This is a change from the methodology in reports earlier than the 2021 data annual report. Therefore, please note that treatment outcome results in this report are not directly comparable with reports prior to this.
LTBI
Completion of prophylactic treatment for LTBI was defined as the presence of a treatment completion date, with or without the presence of a treatment start date.
Disclosure control methods
Only aggregate data is reported. Aggregate values of less than 5 are suppressed, except for:
- the aggregate number of notifications within a single year for England for children aged under 5 years for each sex, as the risk of disclosure is considered very low compared with the importance of monitoring changes in young children
- the aggregated number across multiple years for large geographic areas (England or UKHSA centre)
- the average notifications over multiple years for a geographical area, the smallest of which (by population) is lower local authority
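A minimal Python sketch of the suppression rule is given below. The function name, the exempt flag (standing for any of the exceptions listed above) and the ‘*’ marker are assumptions made for illustration. The sketch also assumes that a count of 0 is treated like any other value below 5, which the text does not state explicitly.

```python
from typing import Union

SUPPRESSION_THRESHOLD = 5  # aggregate values below this are suppressed
MARKER = "*"               # hypothetical placeholder shown in place of a value


def apply_disclosure_control(count: int, exempt: bool = False) -> Union[int, str]:
    """Return the count, or a marker if it must be suppressed.

    exempt=True stands for any of the listed exceptions (assumed flag).
    """
    if exempt or count >= SUPPRESSION_THRESHOLD:
        return count
    return MARKER
```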
Data analysis
TB rates
TB rates per 100,000 population are calculated using mid-year population estimates from ONS (extracted: 31 July 2025). When grouped by place of birth (‘UK born’ or ‘non-UK born’), mid-year Labour Force Survey (LFS) estimates were previously used. However, these have now been replaced with Annual Population Survey (APS) estimates (January-December) (extracted: 20 June 2025).
This change has been made as APS provides a larger sample size and more robust coverage than the LFS (extracted: 6 May 2025), resulting in more reliable population estimates, particularly for smaller subgroups. Compared with the ONS mid-year population estimates, APS and LFS differ in methodology and coverage: LFS draws from a smaller household survey, while APS combines multiple sources to improve precision. To aid clarity, the denominators used in these calculations are included in each of the supplementary tables.
Average annual rates per 100,000 for the 3-year period are calculated by dividing the numerator (the number of TB notifications in the 3-year period) by the denominator (the sum of the mid-year population estimates for the same 3-year period) and multiplying by 100,000.
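The 3-year average annual rate calculation can be illustrated with a short Python sketch. The function name and the figures in the example are invented for illustration and are not taken from the report.

```python
from typing import Sequence


def average_annual_rate(notifications: Sequence[int],
                        populations: Sequence[int]) -> float:
    """Average annual rate per 100,000 over a multi-year period.

    notifications: TB notifications for each year in the period
    populations: mid-year population estimate for each of the same years
    """
    if len(notifications) != len(populations):
        raise ValueError("one population estimate is needed per year")
    return 100_000 * sum(notifications) / sum(populations)


# Hypothetical area: 50, 60 and 70 notifications, population 1,000,000 each year.
rate = average_annual_rate([50, 60, 70], [1_000_000, 1_000_000, 1_000_000])
```

Here the numerator is 180 notifications and the denominator is 3,000,000, giving a rate of 6.0 per 100,000.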
Confidence intervals
95% confidence intervals are model derived and were calculated using assumptions of the Poisson distribution for rates and the binomial distribution for proportions.
Risk ratios
Risk ratios are model derived using the binomial distribution for proportions.
Proportions
Data cleaning and analyses were undertaken using R (version 4.3.1) and Stata 17 SE. The code is reviewed and output quality assured using a standard template. Code is held in UKHSA internal GitHub repositories.
Glossary
95% confidence interval
A confidence interval is a measure of the degree of uncertainty in an estimate based on a sample distribution. A 95% confidence interval indicates that, if we repeated the study many times, 95% of the confidence intervals would contain the true population value. Wider confidence intervals indicate more uncertainty in the estimate. Overlapping confidence intervals indicate that there may not be a true difference between estimates.
Diagnostic delay
The diagnostic delay represents the time (in days) from when a person self reported TB symptom onset to when they are diagnosed with TB.
Directly observed treatment (DOT)
DOT is a treatment strategy in which the patient takes treatment under the direct in person observation of a trained health care worker or designated individual, to ensure treatment adherence for patients requiring ECM.
Enhanced case management
ECM is defined as the increased level of patient monitoring for people with (complex) clinical or social issues or both affecting treatment. There are 3 levels of ECM depending on the complexity of the clinical or social issues or both and the intensity of patient monitoring required, ranging from fortnightly or weekly visits to necessitating DOT or VOT. ECM may be required for children with TB, those with HIV and taking antiretrovirals, people with complex side effects or single drug resistance and those with complex contact tracing or cases in which the involvement of social services is required. For more information see the .
International migrant
An international migrant is a person who moves across international borders to seek temporary or permanent residence in another country.
INH resistant
TB that is resistant to isoniazid, a first line anti-TB drug, and not other drugs.
Monoresistant to a drug other than INH
Resistance to a first line treatment drug other than INH, for example, ethambutol.
Multidrug-resistant TB (MDR TB)
Multidrug-resistant TB (MDR TB) is defined as resistance to at least isoniazid and rifampicin, with or without resistance to other drugs.
Pansensitive
Fully sensitive to all first line drugs, for example, isoniazid.
Polydrug-resistant
Polydrug resistance refers to resistance to 2 or more first line drugs but not to both isoniazid and rifampicin.
Post-mortem diagnosis
A person diagnosed at post mortem is defined as having TB which was not suspected before death, but a TB diagnosis was made at post mortem, with pathological and/or microbiological findings consistent with active TB that would have warranted anti TB treatment if discovered before death.
Pulmonary TB
A person with pulmonary TB is defined as having TB involving the lungs and/or tracheo bronchial tree, with or without extra pulmonary TB diagnosis. In this report, in line with the WHO’s recommendation and international reporting definitions, miliary TB is classified as pulmonary TB due to the presence of lesions in the lungs, and laryngeal TB is also classified as pulmonary TB.
pDST
Phenotypic Drug Sensitivity Testing.
Quinolone-resistant TB
TB bacilli that either have mutations detected by whole genome sequencing (WGS) in the quinolone resistance determining region genes (gyrA, gyrB and parC) or were found to be resistant to moxifloxacin or levofloxacin when tested with pDST.
Social risk factors for TB
These include current alcohol misuse, current or history of homelessness, current or history of imprisonment, current or history of drug misuse, current mental health needs, or current status as an asylum seeker. Please see relevant section under reporting methodology for further details of these variables.
Risk ratios (RR)
RRs quantify the relative risk of the outcome of interest between 2 different groups, for example, the relative risk of pulmonary disease in males compared with females. This is calculated as the proportion of males with pulmonary disease divided by the proportion of females with pulmonary disease, which gives an RR of 1.18 (95% CI: 1.11 to 1.25). This is interpreted as males having an 18% increased risk of pulmonary disease compared with females, and we have 95% confidence that the true increased risk lies within the range of 11% to 25%. If a 95% CI for an RR includes the value of 1.0, then we cannot infer that the true RR is different from 1.
As a result, we would say that these results do not provide any evidence that the observed magnitude of the RR is ‘statistically important’. If an RR of less than 1.0 is reported, such as RR 0.85, this is interpreted as the group of interest having a 15% reduced risk of the outcome.
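To make the calculation and its interpretation concrete, the Python sketch below computes an RR from 2 proportions and a 95% CI using the standard log-scale (Wald) approximation. This approximation is an assumption chosen for illustration and is not necessarily the model-derived method used in this report. The counts in the example are invented.

```python
import math
from typing import Tuple


def risk_ratio(events_a: int, total_a: int, events_b: int, total_b: int,
               z: float = 1.96) -> Tuple[float, float, float]:
    """Return (RR, lower, upper) for group A compared with group B.

    The CI uses the log-scale Wald approximation (illustrative only).
    """
    rr = (events_a / total_a) / (events_b / total_b)
    # Standard error of log(RR) for 2 independent binomial proportions.
    se = math.sqrt(1 / events_a - 1 / total_a + 1 / events_b - 1 / total_b)
    lower = math.exp(math.log(rr) - z * se)
    upper = math.exp(math.log(rr) + z * se)
    return rr, lower, upper


def percent_change_in_risk(rr: float) -> float:
    """Express an RR as a percentage increase (+) or reduction (-) in risk."""
    return (rr - 1) * 100


# Hypothetical counts: 590 of 1,000 males and 500 of 1,000 females.
rr, lower, upper = risk_ratio(590, 1000, 500, 1000)
```

With these invented counts the RR is 1.18, that is, an 18% increased risk in the first group. Because the lower limit of the interval is above 1.0, the difference would be regarded as statistically important under the interpretation given above.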
RR TB
Resistant to rifampicin, a first line drug, and not other drugs.
Pre-XDR TB
Resistant to rifampicin, isoniazid, and any quinolone.
XDR TB
Resistant to rifampicin, isoniazid, any quinolone and one additional group A drug (bedaquiline, linezolid).