Methods Study design The cohort The Romanian COVID-19 mortality sub-study was conducted between March 22 and April 20, 2020, with the aim to explore the comorbidity patterns in the patients who died because of this new virus. A multiple-case, multiple-center design was chosen to answer this research question with data collected from the official public notifications of MAI concerning the COVID-19 deaths registered by the INSP3,4,8. Initially, 451 COVID-19 deaths were reported by INSP from the beginning of the epidemic until April 20 (Fig. 1). The list was completed afterwards with 69 additional cases due to the delay in the reporting process from some centers of COVID-19 infection. The official report included information about gender, age, county of residence, history of hospitalization and diagnostics, and/ or comorbidities. Only cases with information about gender, age and comorbidities were included in the study due to the main focus of the research42. Eventually, 432 patients who fulfilled these criteria were kept in the Romanian COVID-19 mortality sub-study. Since we aim at an analytic and not statistical generalization of the results regarding the comorbidity patterns, statistical testing was just serving exploratory purposes42. Hence, the complete-case analysis is justified in this case both by purpose and by the state of things. Case-case study Additionally, we run a case-case study utilizing data regarding the deaths due to hospital pneumonia between March 22 and April 20 in the years 2016–2018 extracted from the administrative database of the INSP. The study was carried out with the approval of the Local Bioethics Council of the INSP (number 6342/04 MAY 2020) and with the permission of the Local Bioethics Council of the “Carol Davila” University of Medicine, Bucharest Romania (9984/11 MAY 2020). Statistical analysis An exploratory data analysis was applied to identify general, spatial-, gender—or age-specific comorbidity patterns in our study. In a first step, the various conditions were sorted according to the standard diagnosis tool ICD-11 (The International Classification of the Diseases) as it was recommended for the mortality and morbidity statistics by the WHO43. The ICD-11 taxonomy includes categories of single diseases or the aggregated groups of diseases. We investigated the comorbidities at both levels of aggregation because the study included a large number of distinct conditions with low prevalence. The statistical analysis was done with the help of the statistical software R version 3.6.2. Continuous variables were expressed as mean, standard deviation (SD) and median. We focused in this paper on comparing the individual crude relative frequencies of the single and aggregated comorbidities between different gender and age groups. Since age is a significant risk factor for mortality, we investigated the frequency of the comorbidities stratified on ten-year intervals starting with the 50+ age groups. Categorical variables were tabulated in frequency (n) and relative frequency (%). A binomial and a chi-squared test for one proportion (only the last was reported) and a chi-squared test for equal proportions were applied when it was appropriate at the 5% level of significance. The post-hoc tests were not corrected for multiple comparisons since the focus of our study was on generating hypotheses for single diseases or group of diseases and not on an experiment-wise approach. The spatial distribution of the COVID-19 mortality corresponded to the country-specific administrative units– counties and was illustrated by single or aggregated disease frequency and relative frequency (%). The association between different comorbidities and the odds ratio between COVID-19 and pneumonia deaths was investigated with the help of the logistic regression, and the results were reported as coefficients β, as point estimates for odds ratio and their 95% confidence intervals. Multimorbidity was computed for each person in the sample since this is a good indicator of the burden of disease44,45 and transformed in a multimorbidity factor with levels 0, 1, 2, 3, 4, or more the 5 for single or aggregated comorbidities. A controversial issue regarding the epidemic of COVID-19 is the burden of disease in the deceased population and the hypothetical survival probability in the case that these people would not have been infected with this virus. A simplified version of the Charlson Comorbidity Index (CCI) was calculated on a scale from 0 to 8, as indicated in24 for an assessment of the individual disease severity. We introduced the CCI factor with two levels (mild for CCI < 3 and severe for CCI ≥ 3), and we tested its differences between gender and age groups by a chi-squared test. Following the same paper24, we adapted their statistical model for the individual prognosis of the death risk based on the reported hazard ratio for the age levels and for the CCI in multiple European studies as explained in the section 8 of the supplementary material. This gave us a prognosis for the mean, the standard deviation and the median of the crude 1-year death risk and survival probability. The next step in the analysis of the COVID-19 comorbidities concerns the co-occurrence of the diseases. This was explored with the help of the Pearson's phi coefficient as defined and investigated, e.g. in46 and computed for each pair of single or aggregated diseases with an incidence of at least 4. A value of 1 means a "perfect match", 0 means no co-occurrence and negative values up to − 1 means "inverse occurrence". A comorbidity network approach was considered to investigate the relationships between diseases. We computed the dissimilarity between the conditions studied in the previous step as one minus the Pearson's phi. Hence, a value of 1 corresponds to "no association" and values larger than 1 to "inverse association". The hierarchical classification was performed through an agglomerative clustering algorithm with the Ward method since this proved to have the largest agglomeration coefficient. Furthermore, the dendrogram was cut at nodes corresponding to branch length of 1 based both on a visual inspection and on predefined length of the branch. This study has several limitations. We considered only the deaths up to April 20 when the peak of the first pandemic wave was reached in Romania. Moreover, total case design was used because of the missing data in official public communications. Disclaimer This research did not receive any specific grant from funding agencies in the public, commercial, or not-for-profit sectors. Ethical compliance This research was carried out in accordance with the relevant guidelines and regulations. Informed consent We have obtained informed consent from all participants and/or their legal guardians.