Improvement of data quality on the underlying cause of death from external causes using Health, Public Security and Press sector database linkage in the State of Rio de Janeiro, Brazil, 2014

Aloísio Sabino Lopes1  , Valéria Maria de Azeredo Passos (orcid: 0000-0003-2829-5798)2  , Maria de Fatima Marinho de Souza3  , Angela Maria Cascão1 

1Governo do Estado do Rio de Janeiro, Secretaria de Estado de Saúde, RJ, Brasil

2Universidade Federal de Minas Gerais, Faculdade de Medicina, Belo Horizonte, MG, Brasil

3Ministério da Saúde, Secretaria de Vigilância em Saúde, Brasília, DF, Brasil



Objective: to describe improvement of the quality of data on the underlying cause of death from external causes, after performing Health, Public Security and Press sector database linkage in the State of Rio de Janeiro, Brazil, 2014.


Methods: deterministic data linkage on deaths from external causes of undetermined intent and deaths from undetermined natural causes held on the Mortality Information System (SIM), Forensic Institute, Civil Police, Urgent Mobile Care Service (SAMU) and press databases.


Results: of the 13,916 deaths from external causes, deaths from causes of undetermined intent were reduced from 5,836 (41.9%) to 958 (6.9%); while 222 (10.7%) of the 2,069 deaths from undetermined natural causes were reclassified to external causes; there was an increase in mortality due to traffic accidents (93.0%), assault (71.6%), legal intervention (744.7%), intentional self-harm (112.0%) and other accidents (29.9%).


Conclusion: there was an improvement in the quality of the information by type of underlying cause of death from external causes, using a strategy that can be reproduced by other services.

Keywords: Mortality Registries; External Causes; Underlying Cause of Death; Quality Control; Information Systems


The quality of information on mortality from accidents and violence is fundamental to good Public Health practice: quantifying the problem, evaluating trends over time, proposing violence reduction measures and evaluating their effectiveness. In Brazil there is still evidence of under-enumeration of deaths from external causes, whether because of absence of reporting (underreporting), or because of poor classification of registered causes.1

Improvement of the quality of data on mortality from external causes is of particular relevance for Brazil. These causes have high rates of mortality and also a pattern that is peculiar to them: deaths from homicide and traffic accidents are the main external causes, whereas in the rest of the world the main causes are suicide and death in armed conflict.2 Between 1990 and 2015, despite the reduction in external cause mortality rates from 105.1 to 81.2/100,000 inhabitants, deaths from violence not only revealed alarming rates but also went from seventh to second place as the cause of years of life lost - owing to premature death - in Brazil.3

In 2012 the Mortality Information System (SIM) reached a satisfactory level for analyzing mortality patterns, having 92% coverage on average and reaching 100% in some states.4 Notwithstanding, as well as coverage, accurate reporting is an indispensable quality parameter for reliable analysis. In recent years there has been an improvement in the quality of external cause statistics, mainly as a result of health department teams implementing policies for investigating and reclassifying the original causes of death issued by Forensic Medicine Institute (IML) doctors, based on active tracing of IML records. Despite this improvement, there is still a high proportion of deaths classified as unspecified external causes or causes of undetermined intent. In 2013, the state of Rio de Janeiro had the highest national percentage of deaths from external causes of undetermined intent (12.5%), followed by the states of Bahia (6.8%) and Rio Grande do Norte (6.3%).4

Since the 1990s studies have warned as to problems with the quality of the recording of underlying cause of death by medical examiners.5,6 In São Paulo it was found that the IML did not use the information it had when filling out death certificates.5 A reason put forward for this finding would be fear on the part of medical examiners of stating a violent circumstance as the underlying cause of death. These professionals tend to describe the clinical injury rather than using the World Health Organization (WHO) definition which requires the description of the circumstances of the accident or violence causing the injuries that lead to death.6

Following implementation of the SIM system in 1975, the Rio de Janeiro State Health Department (SES/RJ) turned to complementary databases, above all police records, in order to qualify information on external causes of deaths certified by medical examiners. In 2007, State Law No. 5061 restricted access to police information,7 and this resulted in an increased proportion of deaths from external causes of undetermined intent. The routines for database linkage between SIM, IML and the Scientific and Technical Police branch were re-established in 2014.8,9 The Information on Deaths from External Causes Qualification and Management Group was created as a permanent commission, maintained through technical and administrative resources provided by SES/RJ and the Institute of Public Security of the State Security Department, and responsible for monitoring and analyzing the data.9 Since 2014, the health surveillance team has had access to IML data and data on criminal cases held by the Civil Police of the State of Rio de Janeiro, as well as access to the records of the Urgent Mobile Care Service (SAMU).

Thus far, the majority of published studies report on actions to reclassify external causes of mortality based on active tracing of cases at IML or based on data reported by the press.1 Deterministic linkage of accident databases (Mortality Information System - SIM, Hospital Information System of the Brazilian National Health System - SIH/SUS and Highway Police) was used to improve information in five Brazilian state capitals in 2012 and 2013. This revealed the potential of this tool for qualifying deaths from external causes with ill-defined underlying causes.10

The purpose of our study was to describe the improvement of the quality of data on underlying cause of death from external causes by linking the databases of the Health, Public Security and Press sectors in the state of Rio de Janeiro, Brazil, in 2014.


This is a descriptive study of quality improvement, via deterministic data linkage, of data on deaths from external causes of undetermined intent and deaths from undetermined natural causes in the state of Rio de Janeiro in 2014.

Rio de Janeiro is one of Brazil’s 27 Federative Units, covering an area of 43,780.172 km². According to the 2010 demographic census, it is Brazil’s most populous state, accounting for 8.4% of the country’s population and having the highest population density of all Brazil’s Federative Units. The state has the country’s second largest gross domestic product and the third highest literacy rate.11

Rio de Janeiro has 12 IML facilities. SES/RJ’s Mortality Information System (SIM/SES/RJ) recorded 131,519 deaths in 2014, 105,943 (80.5%) of which were from natural causes with death certificates issued by health services, while 25,576 (19.5%) related to deaths referred to the state IML services. Of the 25,576 deaths referred to the IML services, external causes accounted for 13,916 (54.4%); 11,660 (45.6%) deaths from natural causes were referred to the IML services, 2,069 of which were from undetermined natural causes.

SIM was the primary data source on deaths. SIM records were linked with the Public Security databases (IML and Civil Police) and, in a complementary manner, were also linked with SAMU and press records. The press database comprises information gathered through daily reading of electronic media by a technical SES/RJ staff member. The key variables used for data linkage were the name of the deceased person, their age and the municipality in which death occurred. Around ten months elapsed between the start of the process and the correction of the qualified data on the SIM system, after which the data became available for our study’s database and for tabulation using Tabnet.

All deaths from external causes of undetermined intent (ICD: Y10-Y34) and undetermined natural causes (ICD: R99) certified by IML in 2014 were investigated.
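The selection rule above can be illustrated with a minimal Python sketch. The function name `is_undetermined` and the sample codes are hypothetical and are not taken from the study's own tooling; the sketch only assumes that underlying causes are stored as ICD-10 strings.

```python
import re

def is_undetermined(icd_code: str) -> bool:
    """True for Y10-Y34 (undetermined intent) or R99 (undetermined natural cause)."""
    code = icd_code.strip().upper().replace(".", "")
    if code.startswith("R99"):
        return True
    match = re.match(r"^Y(\d{2})", code)
    return bool(match) and 10 <= int(match.group(1)) <= 34

# Hypothetical sample of underlying-cause codes, not study data.
codes = ["Y34.9", "R99", "X95", "Y09", "y10", "V89.2"]
selected = [c for c in codes if is_undetermined(c)]
```

Only the codes inside the two target ranges are kept; everything else, including neighbouring codes such as Y09, is left out of the investigation set.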

Sociodemographic data were obtained from information recorded on the SIM system databases: sex (male, female or unknown), age range (in years: 0-9, 10-19, 20-39, 40-59, 60-69, 70 or over, unknown), race/skin color (white, black, brown, other [yellow or indigenous], unknown), place of death (health establishment, household, public thoroughfare, others) and administrative region (Baia da Ilha Grande, Baixada Litorânea, Centro-Sul, Médio Paraíba, Metropolitana I, Metropolitana II, Noroeste, Norte, Serrana, others).

Figure 1 provides a summary of the reclassification work process. Stage 1 of deterministic linkage consists of obtaining data for all deaths from external causes recorded on SIM, as well as data on deaths from undetermined natural causes held on the supplementary databases (IML and Civil Police) in the state of Rio de Janeiro in 2014 (step 1). All IML deaths are compared with SIM deaths on a new linked database (step 2), which in turn is linked with the Civil Police database (step 3). Step 4 is the process of searching on SIM for the cases found on the Civil Police database that were not held on IML records. Reviewing these linkage steps, correcting slight data variations (e.g. typing errors found in names), gives rise to the database containing all the death certificates found on the three databases (step 5).


SIM: Mortality Information System.

IML: Forensic Medicine Institute.

SAMU: Urgent Mobile Care Service.

SES/RJ: Rio de Janeiro State Health Department

Figure 1 - Procedure for linking databases on deaths from external causes of undetermined intent, state of Rio de Janeiro, 2014 

Stage 2 refers to the selection of all external causes of undetermined intent (ICD: Y10-Y34) and R99 codes for natural causes, using a new file containing the records validated in step 5 (steps 6 and 7). Step 8 consists of checking for the existence of information, in any of the other three databases, capable of qualifying underlying causes of death. Step 9 consists of consolidating all records the underlying causes of death of which were determined in step 8. Records with causes that have still not been clarified at this point are sent to the Civil Police technician who searches for new data in police death investigation records (step 10). Causes determined through police investigation are then consolidated (step 11). Finally, remaining records with undetermined causes are compared with SAMU files and with data on deaths with violent causes published by the press, whereby elements satisfactorily meeting the needs of the study are extracted (step 12).

In Stage 3 a new file is created containing all the records examined and with all circumstances of death recovered (step 13). The records contained in this file are reviewed and coded at SES/RJ (step 14) and sent to the respective Municipal Health Departments which are responsible for altering SIM records accordingly, under the supervision of the State Health Department (SES/RJ) (steps 15 to 17).
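The order in which sources are consulted in Stages 2 and 3 can be pictured as a simple cascade: a record is passed to the next source only if the previous one could not determine the circumstance of death. The sketch below is an illustration under that reading, written in Python; the function `reclassify`, the record identifiers and the lookup tables are all hypothetical.

```python
from typing import Callable, Optional

Lookup = Callable[[str], Optional[str]]

def reclassify(record_id: str, sources: list[tuple[str, Lookup]]) -> tuple[Optional[str], Optional[str]]:
    """Return (circumstance, source name) from the first source that has one."""
    for source_name, lookup in sources:
        circumstance = lookup(record_id)
        if circumstance is not None:
            return circumstance, source_name
    return None, None  # still undetermined after every source

# Hypothetical lookup tables standing in for the real databases.
linked = {"D1": "assault"}
police = {"D2": "traffic accident"}
samu_press = {"D3": "other accident"}

sources = [
    ("linked IML/Civil Police databases", linked.get),
    ("police investigation records", police.get),
    ("SAMU and press", samu_press.get),
]
results = {rid: reclassify(rid, sources) for rid in ["D1", "D2", "D3", "D4"]}
```

Keeping the source name alongside the recovered circumstance makes it possible to report afterwards how much each database contributed to reclassification.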

The key variables used for data linkage were: police report record, death certificate record and name of the deceased, as well as a combination of these variables in specific cases, in order to form entities. Mother’s name, age, date of occurrence and municipality of death and municipality of residence of the victim were also used. Data linkage was performed using the Microsoft Access application and the ArcGIS application phonetic search operator.12 The latter was loaned by Public Security Institute technical staff. ArcGIS generates scores ranging from 0 to 100% when the phonemes of the victims’ names are cross checked on the databases; linkages scoring 90% or more were validated.
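As a rough illustration of this kind of linkage, the Python sketch below pairs records first on an exact certificate number and then, failing that, on name similarity within the same municipality, accepting only scores of 90 or more. It is not the Microsoft Access and ArcGIS procedure used in the study: the similarity score here comes from the standard library `difflib.SequenceMatcher`, a string-based stand-in for a phonetic score, and all field names and records are fictitious.

```python
from difflib import SequenceMatcher
import unicodedata

def normalize(name: str) -> str:
    """Uppercase, strip accents and collapse whitespace."""
    decomposed = unicodedata.normalize("NFKD", name)
    ascii_only = "".join(ch for ch in decomposed if not unicodedata.combining(ch))
    return " ".join(ascii_only.upper().split())

def name_score(a: str, b: str) -> float:
    """Similarity from 0 to 100 (string-based, NOT a phonetic algorithm)."""
    return 100.0 * SequenceMatcher(None, normalize(a), normalize(b)).ratio()

def link(sim_records, police_records, threshold=90.0):
    """Exact match on certificate number, else best name score in the same municipality."""
    by_certificate = {r["certificate"]: r for r in police_records if r.get("certificate")}
    pairs = []
    for sim in sim_records:
        hit = by_certificate.get(sim.get("certificate"))
        if hit is None:
            candidates = [p for p in police_records if p["municipality"] == sim["municipality"]]
            scored = [(name_score(sim["name"], p["name"]), p) for p in candidates]
            scored = [s for s in scored if s[0] >= threshold]
            hit = max(scored, key=lambda s: s[0])[1] if scored else None
        pairs.append((sim["id"], hit["id"] if hit else None))
    return pairs

# Fictitious records for illustration only.
sim = [
    {"id": "S1", "certificate": "001", "name": "João da Silva", "municipality": "Niterói"},
    {"id": "S2", "certificate": None, "name": "Maria Souza Lima", "municipality": "Duque de Caxias"},
    {"id": "S3", "certificate": None, "name": "Pedro Alves", "municipality": "Niterói"},
]
police = [
    {"id": "P1", "certificate": "001", "name": "JOAO SILVA", "municipality": "Niterói"},
    {"id": "P2", "certificate": None, "name": "Maria Sousa Lima", "municipality": "Duque de Caxias"},
    {"id": "P3", "certificate": None, "name": "Carlos Pereira", "municipality": "Niterói"},
]
pairs = link(sim, police)
```

In this toy example the spelling variant "Souza"/"Sousa" clears the threshold, while the unrelated names in the same municipality do not, so the third record is left unlinked for manual review.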

The study project was approved by the Federal University of Minas Gerais Research Ethics Committee: Certificate of Submission for Ethical Appreciation (CAAE) No. 75555317.0.0000.5149 on 20/09/2017. We used information on causes of death recorded on death certificates available on the Ministry of Health SIM System as well as information held on databases the use of which is restricted to SES/RJ technical staff who, generally, have access to this type of information and are ethically committed to not disclosing the identity of the deceased person.


When compared to the 25,576 records held on SIM, 19,066 (74.5%) were found in both databases, 1,831 (7.2%) were only found on the IML database and 1,819 (7.1%) were only found on the Civil Police database; 2,860 (11.2%) were only held on SIM and were not found in the other databases.

With regard to the participation of the databases in the reclassification of the 5,836 records initially identified on SIM as having undetermined external causes, we found that the Civil Police database accounted for 2,148 (36.8%) and the IML database for 1,211 (20.8%) of the reclassified cases. Using press and SAMU data we were able to reclassify 1,454 (24.9%) and 65 (1.1%) records, respectively. At the end of the entire process, we were not able to improve data on the underlying cause of death for 958 (16.4%) records.

Table 1 is a contingency table summarizing the reclassification process between the main groups of causes before and after investigation. The number of records with undetermined causes of death reduced from 5,836 (41.9%) to 958 (6.9%) among deaths referred to IML. This reduction leads to the relative increase in the occurrence of all external causes: traffic accidents (93.0%), assaults (71.6%), legal intervention (744.7%), intentional self-harm (112.0%) and other accidents (29.9%).
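The relative increases quoted above follow from the counts before and after reclassification as 100 × (after − before) / before. The short Python sketch below recomputes them from the totals reported in Table 1; the variable and function names are illustrative, and the last digit of a recomputed value may differ slightly from the published figure because of rounding.

```python
# Counts before and after reclassification, taken from the totals in Table 1.
before = {
    "Traffic accident": 1548,
    "Assault": 3147,
    "Legal intervention": 38,
    "Intentional self-harm": 245,
    "Other accidents": 3102,
}
after = {
    "Traffic accident": 2989,
    "Assault": 5401,
    "Legal intervention": 321,
    "Intentional self-harm": 520,
    "Other accidents": 4029,
}

def relative_increase(n_before: int, n_after: int) -> float:
    """Percentage change relative to the count before reclassification."""
    return 100.0 * (n_after - n_before) / n_before

increase = {cause: round(relative_increase(before[cause], after[cause]), 1) for cause in before}
```

The very large figure for legal intervention reflects its small starting count (38 deaths), so a few hundred reclassified records produce a several-fold relative change.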

Table 1 - Comparison of death reclassification, before and after linking data on deaths from external causes, state of Rio de Janeiro, 2014 

| Types of causes | Traffic accident | Assault | Legal intervention | Intentional self-harm | Other accidents | Undetermined intent | Natural causes | Undetermined (R99) | Total (before) |
|---|---|---|---|---|---|---|---|---|---|
| Traffic accident | 1,528 | 3 | - | - | 16 | 1 | - | - | 1,548 |
| Assault | 8 | 3,103 | 19 | 2 | 9 | 6 | - | - | 3,147 |
| Legal intervention | - | - | 38 | - | - | - | - | - | 38 |
| Intentional self-harm | - | - | - | 243 | 2 | - | - | - | 245 |
| Other accidents | 96 | 51 | - | 7 | 2,896 | 33 | 19 | - | 3,102 |
| Undetermined intent | 1,323 | 2,117 | 262 | 251 | 939 | 901 | 43 | - | 5,836 |
| Natural causes | 27 | 2 | - | 3 | 113 | 11 | 9,455 | - | 9,611 |
| Undetermined (R99) | 7 | 125 | 2 | 14 | 54 | 6 | 1,841 | - | 2,049 |
| Total (after) | 2,989 | 5,401 | 321 | 520 | 4,029 | 958 | 11,358 | - | 25,576 |
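Table 1 can be read as a matrix in which each row is the cause recorded before linkage and each column the cause after linkage, so row sums give the "before" totals and column sums the "after" totals. The Python sketch below is a simple consistency check on the published cells (dashes entered as zero); the variable names are illustrative.

```python
causes = [
    "Traffic accident", "Assault", "Legal intervention", "Intentional self-harm",
    "Other accidents", "Undetermined intent", "Natural causes", "Undetermined (R99)",
]
# Rows: cause before linkage. Columns: cause after linkage (same order as `causes`).
matrix = [
    [1528,    3,   0,   0,   16,   1,    0, 0],
    [   8, 3103,  19,   2,    9,   6,    0, 0],
    [   0,    0,  38,   0,    0,   0,    0, 0],
    [   0,    0,   0, 243,    2,   0,    0, 0],
    [  96,   51,   0,   7, 2896,  33,   19, 0],
    [1323, 2117, 262, 251,  939, 901,   43, 0],
    [  27,    2,   0,   3,  113,  11, 9455, 0],
    [   7,  125,   2,  14,   54,   6, 1841, 0],
]
total_before = [1548, 3147, 38, 245, 3102, 5836, 9611, 2049]
total_after = [2989, 5401, 321, 520, 4029, 958, 11358, 0]

row_sums = [sum(row) for row in matrix]
col_sums = [sum(col) for col in zip(*matrix)]

# The first six categories are the external causes.
external_before = sum(row_sums[:6])
external_after = sum(col_sums[:6])
```

The margins reproduce the 13,916 external-cause deaths before linkage and the 14,218 after linkage, out of the 25,576 deaths referred to IML services.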

Of the 2,069 deaths from undetermined natural causes (R99), 1,841 were natural causes that should not have been referred to IML. Of the remaining 228 external causes initially recorded using code R99, we were able to qualify 222 (97.4%) causes, the majority of which (125) were due to homicide.

Table 2 describes the sociodemographic characteristics of deaths from causes of undetermined intent, before and after reclassification. With regard to sex, the greatest proportion of undetermined causes was found among males (78.4%), with a reduction in the difference between the sexes following reclassification (males = 55%). Prior to reclassification, there was a higher proportion of undetermined causes among young adults (20-39 years = 35.6%) and the elderly (≥70 years = 19.8%). Correction of ill-defined underlying cause of death was proportionately higher among children and adults aged up to 59 years. This led to the relative increase of undetermined causes among the elderly following the investigation, whereby they accounted for 52.6% of undefined cases following data linkage. There was a greater proportion of deaths from undetermined external causes among people with brown skin color before but not after reclassification. Reclassification also led to the reduction of undetermined causes of deaths in public thoroughfares, from 21% to 8.5%, and the relative increase of cases in health establishments, from 50.6% to 74.7% of cases. When comparing the state’s administrative regions, we found differences in the percentage variation of deaths from external causes of undetermined intent. The highest percentages of ill-defined external causes, 81.0% before qualification and 73.5% after, were found in the Metropolitan I administrative region which covers the state’s capital city.

Table 2 - Distribution of external causes of undetermined intent, by sex, age range and place of death, state of Rio de Janeiro, 2014 

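| Characteristics | Number of causes of undetermined intent, before (N=5,836) | Number of causes of undetermined intent, after (N=958) | Proportion of causes of undetermined intent (%), before (N=5,836) | Proportion of causes of undetermined intent (%), after (N=958) | Proportion of external causes (%), before (N=13,916) | Proportion of external causes (%), after (N=14,218) |
|---|---|---|---|---|---|---|
| Sex | | | | | | |
| Male | 4,578 | 527 | 78.4 | 55.0 | 32.9 | 3.7 |
| Female | 1,208 | 409 | 20.7 | 42.7 | 8.6 | 2.9 |
| Unknown | 50 | 22 | 0.9 | 2.3 | 0.4 | 0.2 |
| Subtotal | 5,836 | 958 | 100.0 | 100.0 | 41.9 | 6.7 |
| Age range (years) | | | | | | |
| ≤9 | 44 | 4 | 0.8 | 0.4 | 0.3 | 0.1 |
| 10-19 | 705 | 27 | 12.1 | 2.8 | 5.1 | 0.2 |
| 20-39 | 2,079 | 88 | 35.6 | 9.2 | 14.9 | 0.6 |
| 40-59 | 1,143 | 136 | 19.6 | 14.2 | 8.2 | 0.9 |
| 60-69 | 397 | 88 | 6.8 | 9.2 | 2.9 | 0.6 |
| ≥70 | 1,156 | 504 | 19.8 | 52.6 | 8.3 | 3.5 |
| Unknown | 312 | 111 | 5.3 | 11.6 | 2.2 | 0.8 |
| Subtotal | 5,836 | 958 | 100.0 | 100.0 | 41.9 | 6.7 |
| Race/skin colour | | | | | | |
| White | 2,211 | 518 | 37.9 | 54.1 | 15.9 | 3.6 |
| Black | 829 | 118 | 14.2 | 12.3 | 5.9 | 0.8 |
| Brown | 2,529 | 243 | 43.4 | 25.4 | 18.2 | 1.7 |
| Other | 14 | 3 | 0.2 | 0.3 | 0.1 | 0.1 |
| Unknown | 253 | 76 | 4.3 | 7.9 | 1.8 | 0.5 |
| Subtotal | 5,836 | 958 | 100.0 | 100.0 | 41.9 | 6.7 |
| Place of death | | | | | | |
| Health establishment | 2,953 | 716 | 50.6 | 74.7 | 21.2 | 5.0 |
| Household | 400 | 67 | 6.9 | 7.0 | 2.9 | 0.5 |
| Public thoroughfare | 1,227 | 81 | 21.0 | 8.5 | 8.8 | 0.6 |
| Others | 1,256 | 94 | 21.5 | 9.8 | 9.0 | 0.6 |
| Administrative region | | | | | | |
| Baia da Ilha Grande | 10 | 6 | 0.2 | 0.6 | 0.1 | 0.04 |
| Baixada Litorânea | 83 | 34 | 1.4 | 3.6 | 0.6 | 0.24 |
| Centro-Sul | 18 | 9 | 0.3 | 0.9 | 0.1 | 0.06 |
| Médio Paraíba | 25 | 7 | 0.4 | 0.7 | 0.2 | 0.05 |
| Metropolitana I | 4,727 | 704 | 81.0 | 73.5 | 33.9 | 4.95 |
| Metropolitana II | 739 | 118 | 12.7 | 12.4 | 5.3 | 0.83 |
| Noroeste | 25 | 11 | 0.4 | 1.1 | 0.2 | 0.08 |
| Norte | 57 | 13 | 1.0 | 1.4 | 0.4 | 0.09 |
| Serrana | 142 | 52 | 2.4 | 5.4 | 1.0 | 0.37 |
| Others | 10 | 4 | 0.2 | 0.4 | 0.1 | 0.03 |
| Total | 5,836 | 958 | 100.0 | 100.0 | 41.9 | 6.74 |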

Table 3 shows the distribution of deaths from external causes of undetermined intent before and after data linkage, and the change in the rates of proportional mortality by cause in the state of Rio de Janeiro in 2014. It can be seen that the considerable reduction in deaths from undetermined causes increased the relevance of traffic accidents and assaults in the state’s mortality profile, both in terms of overall mortality and also in terms of the proportion of deaths from external causes.

Table 3 - Distribution of deaths from external causes, before and after data linkage, state of Rio de Janeiro, 2014 

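| Deaths from external causes | Number of deaths (a), before (n) | Number of deaths (a), after (n) | Proportional mortality by cause (b), before (%) | Proportional mortality by cause (b), after (%) | Proportion of deaths per external causes (c), before (%) | Proportion of deaths per external causes (c), after (%) |
|---|---|---|---|---|---|---|
| Traffic accident | 1,548 | 2,989 | 1.17 | 2.27 | 11.12 | 21.02 |
| Assault | 3,147 | 5,401 | 2.39 | 4.11 | 22.61 | 37.99 |
| Legal intervention | 38 | 321 | 0.03 | 0.24 | 0.27 | 2.26 |
| Intentional self-harm | 245 | 520 | 0.19 | 0.40 | 1.76 | 3.66 |
| Other accidents | 3,102 | 4,029 | 2.36 | 3.06 | 22.29 | 28.34 |
| Undetermined intent | 5,836 | 958 | 4.44 | 0.73 | 41.93 | 6.74 |
| Total | 13,916 | 14,218 | 10.58 | 10.81 | 100.00 | 100.00 |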

a) Data before and after linkage.

b) Percentage of total deaths (131,519) in the state of Rio de Janeiro in 2014.

c) Percentage of deaths from external causes in relation to total external causes, before linkage (13,916) or after linkage (14,218).
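The two percentages defined in notes b and c use different denominators: all deaths in the state for proportional mortality, and all external-cause deaths (before or after linkage) for the proportion within external causes. The Python sketch below recomputes both from the counts in Table 3; the names are illustrative, and individual values may differ from the published table in the last digit because of rounding.

```python
TOTAL_DEATHS = 131_519  # all deaths in the state of Rio de Janeiro in 2014 (note b)

before = {
    "Traffic accident": 1548, "Assault": 3147, "Legal intervention": 38,
    "Intentional self-harm": 245, "Other accidents": 3102, "Undetermined intent": 5836,
}
after = {
    "Traffic accident": 2989, "Assault": 5401, "Legal intervention": 321,
    "Intentional self-harm": 520, "Other accidents": 4029, "Undetermined intent": 958,
}

def proportions(counts: dict[str, int]) -> dict[str, tuple[float, float]]:
    """For each cause: (% of all deaths, % of external-cause deaths)."""
    external_total = sum(counts.values())
    return {
        cause: (round(100 * n / TOTAL_DEATHS, 2), round(100 * n / external_total, 2))
        for cause, n in counts.items()
    }

share_before = proportions(before)
share_after = proportions(after)
```

Because the external-cause denominator itself grows from 13,916 to 14,218 after linkage, the "after" proportions are not simply the "before" proportions plus the reclassified records.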


In theory, all deaths from external causes should be held on SIM, IML and Civil Police files. However, deterministic linkage of the data found discrepancies. In the state of Rio de Janeiro in 2014, reclassification of death certificates through linkage of institutional data led to a substantial reduction in external causes of undetermined intent, thus improving the profile of overall mortality and also of death from external causes. Linkage of public databases, guaranteed by law, can contribute substantially to reducing the cost of active tracing at IML services, in addition to ensuring the sustainability of these actions.

It is noteworthy that 11.2% of records of external causes held on SIM were not found on the other databases. This finding reveals certification of external causes of death not referred to IML services. This can occur owing to reporting of external causes in municipalities where there is no IML; or owing to incorrect reporting of external causes by health establishments, probably in relation to patients who die some time after having suffered an accident or violence.

As is the case throughout Brazil, in Rio de Janeiro homicides and traffic accidents account for the majority of external causes of death.2 The reclassification of external causes of undetermined intent shows that violent causes of death would be under-enumerated on SIM, if the circumstances of death were not searched for on other databases. It is essential to maintain or even to increase reclassification of this information in Brazil, since the effectiveness of prevention depends on the correct analysis of the magnitude of this problem.

The higher proportion of undetermined causes among males, young people and people of brown skin color is a reflection of the profile of mortality from external causes most common in these sociodemographic strata. An investigation of ill-defined codes conducted between 2011 and 2013 in the municipality of Belo Horizonte also found a high proportion of external causes of undetermined intent and traffic accidents without type specification among young people aged 5-29 years old.13 The higher proportion of cases of external causes of undetermined intent found by us in health establishments can be attributed to SAMU sending serious cases of traffic accidents and violence to IML services in the event of death.

Qualification of death certificates issued by IML medical examiners has taken place for some considerable time in Brazil. In 1995 and 1996, low agreement was found between underlying causes of death due to accidents and violence among people aged under 18 in the municipality of Duque de Caxias (state of Rio de Janeiro), when comparing death certificates issued by legal examiners with those reclassified by the Municipal Health Department (SMS) based on information obtained from IML necropsy reports and police records.14 In Belo Horizonte underclassification of deaths from traffic accidents has been recorded since the 1990s, with low agreement (Kappa=0.124 - 95%CI 0.1555;0.4022) between classification of underlying cause of death from traffic accidents according to medical examiners and final coding done by SMS staff.15 Between 1998 and 2001, a random sample of 411 death certificates stating external causes as the underlying cause of death found only moderate agreement (Kappa=0.602 - 95%CI 0.563;0.641) between the underlying cause of death recorded by medical examiners and the underlying cause reported by Belo Horizonte Municipal Health Department. This means that the majority of underlying causes certified by IML services are corrected by Health Department technical staff.16

Although the gains obtained by reclassifying death certificates for external causes have been demonstrated for decades, the percentage of poorly completed death certificates remains high. As stressed in previous publications, the role of medical examiners is key to reducing the reporting of external causes of undetermined intent. Inadequate completion of death certificates by medical examiners implies higher costs for the public health sector, because the data must be corrected afterwards. One reason why IML information has not yet become consolidated health information may be organizational. Whereas data on death from natural causes comes almost exclusively from health centers, external causes certified by IML services are official mortality data provided by Public Security departments. The distinct objectives of these bodies and failure to understand the importance of the social function of this information are some of the causes of the loss of quality of data on mortality from external causes.17

In Brazil, death certificates must be issued soon after death, given that this document is mandatory for burial to take place. Many physicians claim discomfort in having to indicate the probable circumstance of death in field 18 of the death certificate (homicide, suicide or traffic accident), because of the expectations this creates among the deceased’s family members regarding circumstances which often will only be defined by examinations still underway or by subsequent police investigations. In other countries, such as the United States, the medical part of the death certificate can be filled out up to six days after death, thus facilitating the incorporation of the results of examinations and police investigations.18 The Colombian system provides for a preliminary death certificate, solely for the purposes of civil registration: the document which the family receives for registration at the registry office does not need to mention the circumstances of death.19 At a time when Brazil is discussing the implementation of electronic death certificates, it would perhaps be opportune to discuss changes that enable better filling out of the medical fields on death certificates.

The high number of natural causes referred to IML services is a factor which probably overburdens them and which may interfere with their quality. Rio de Janeiro, like many other Brazilian states, does not yet have a Death Verification Service to which deaths from natural causes can be referred if they require investigation. In 2017, the Rio de Janeiro Public Prosecutor’s Office filed a lawsuit requiring both the municipality and the state of Rio de Janeiro to implement this service at state level and in the municipality of Rio de Janeiro.20 Notwithstanding, the obligation and the challenge of qualifying physicians to correctly fill out death certificates cannot be left to one side. Medical examiners, who tend to fill out death certificates with a description of the injuries found by their examinations, should be provided with information on the epidemiological importance of recording the circumstances of death.

A limitation of this study is that it was not designed to investigate the impact of reclassification according to specific causes, given that it analyzes a routine service activity. Furthermore, the sociodemographic variables obtained from death certificates also suffer from problems with regard to incomplete information and information quality, as can be seen from the missing information in Table 2.

To conclude, improved information and the feasibility of qualifying deaths from external causes in the state of Rio de Janeiro, by linking institutional data which are also available in all of Brazil’s Federative Units, have led the authors of this study to share the success of this experience and to recommend it to Mortality Information System (SIM) managers throughout Brazil.


Dr. Elisabeth Albernaz and Dr. Mariana Rodrigues, of the Rio de Janeiro Public Security Department. Luciano Gonçalves, geographer at the Institute of Public Security (ISP) of the Rio de Janeiro Public Security Department. Military Police Colonel Marcus Ferreira, vice president of ISP/RJ in 2014.


Received: April 11, 2018; Accepted: September 16, 2018

Correspondence: Valéria Maria de Azeredo Passos - Rua General Ribeiro da Costa, No. 178/701, Rio de Janeiro, RJ, Brazil. Postcode: 22010-050. E-mail:

Authors' contributions

Lopes AS played a substantial role in the conception of the study, carried out the procedures for obtaining and analyzing the data and took part in preparing the first version of the manuscript. Passos VMA played a substantial role in the conception of the study and analysis of the results, and took part in preparing the first version of the manuscript. Souza FM played a substantial role in the conception of the study, analysis of the data and critical review of the manuscript. Cascão AM played a substantial role in the conception of the study, obtaining and analyzing the data, as well as critically reviewing the manuscript. All the authors have approved the final version and are responsible for all the aspects of this study.

