Translation-invariant functional clustering on COVID-19 deaths adjusted on population risk factors

The COVID-19 pandemic has taken the world by storm with its high infection rate. Investigating its geographical disparities is of paramount interest in order to gauge its relationships with political decisions, economic indicators, or mental health. This paper focuses on clustering the daily death rates reported in several regions of Europe and the United States over seventeen months. Several methods have been developed to cluster such functional data. However, these methods are not translation-invariant and thus cannot handle the different arrival times of the disease, nor can they incorporate external covariates, so they are unable to adjust for the population risk factors of each region. We propose a novel three-step clustering method to circumvent these issues. First, feature extraction is performed by translation-invariant wavelet decomposition, which makes it possible to deal with the different onsets. Second, single-index regression is used to neutralize the disparities caused by population risk factors. Third, a nonparametric mixture is fitted on the regression residuals to cluster the regions.
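
To illustrate why translation-invariant wavelet features can absorb different onset dates, the following is a minimal sketch and not the method of the paper. It assumes an undecimated Haar transform with circular boundary handling, a synthetic bell-shaped curve standing in for a death-rate series, and a delay modeled as a circular shift; the function name `haar_scale_energies` and all parameter values are hypothetical. It covers only the feature-extraction step; the single-index regression and the nonparametric mixture are not shown.

```python
import numpy as np


def haar_scale_energies(signal, levels=3):
    """Energy of undecimated Haar detail coefficients at each scale.

    Assumption: circular (periodic) boundary handling, so a circular
    shift of the input only shifts the coefficients and leaves their
    per-scale energy unchanged.
    """
    approx = np.asarray(signal, dtype=float)
    energies = []
    for j in range(levels):
        step = 2 ** j  # filter spacing doubles at each level
        shifted = np.roll(approx, step)
        detail = (approx - shifted) / 2.0  # high-pass part at this scale
        approx = (approx + shifted) / 2.0  # low-pass part passed to next level
        energies.append(float(np.sum(detail ** 2)))
    return np.array(energies)


# Synthetic example (illustrative values only): one epidemic wave and the
# same wave arriving 15 days later.
t = np.arange(64)
curve = np.exp(-0.5 * ((t - 20) / 4.0) ** 2)
delayed = np.roll(curve, 15)

features_curve = haar_scale_energies(curve)
features_delayed = haar_scale_energies(delayed)

# The raw curves differ, but their per-scale energies coincide.
print(np.allclose(curve, delayed))
print(np.allclose(features_curve, features_delayed))
```

In this toy setting the two curves would be far apart under a pointwise distance, whereas their scale-wise energies match, which is the property that allows regions with different onsets to be compared on an equal footing.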