EBUL - Event-based Unsupervised Learning for Physiological Signals
Description
The collection and statistical analysis of physiological signals are ubiquitous in modern life, from the continuous monitoring of patients in hospitals to data obtained using cheap wearable devices. While statistical methods based on handcrafted quantities are efficient at capturing well-identified effects, they require clear hypotheses on the underlying physiological processes. Alternatively, data-driven unsupervised approaches offer an opportunity to explore and leverage such signals for population well-being. Yet, off-the-shelf generic unsupervised algorithms remain limited. Physiological signals are recorded together with surrounding events that are typically not exploited by unsupervised methods. The main objective of EBUL is to develop a new generation of unsupervised learning methods that jointly model physiological signals and events. EBUL will develop dedicated machine learning and statistical signal processing methods and favor the emergence of new challenges for these fields, focusing on five open problems: 1) end-to-end unsupervised methods to jointly model physiological signals and events, 2) models of physiological events with multivariate point processes embedded in space, 3) machine learning and statistical tools for actionable feedback from the learned representations, 4) fast algorithms that can scale to experimental data, and 5) physiological signal processing tools to impact general anesthesia and neuroscience. These challenges will be tackled through contributions to self-supervised learning methods and point processes. The methods developed in EBUL will have broad applications in fields where physical signals enriched with events are processed. Yet, the primary purpose of EBUL will be to process physiological signals, in particular in neuroscience and anesthesiology. The open source software produced in EBUL will empower practitioners with the necessary tools to uncover new findings about the signals' dynamics.
Funded Participants
Guillaume Staerman, Postdoc
Virginie Loison, PhD
Publications
FaDIn: Fast Discretized Inference for Hawkes Processes with General Parametric Kernels 2023
Guillaume Staerman, Cédric Allain, Alexandre Gramfort & Thomas Moreau
In ICML
Temporal point processes (TPP) are a natural tool for modeling event-based data. Among all TPP models, Hawkes processes have proven to be the most widely used, mainly due to their simplicity and computational ease when considering exponential or non-parametric kernels. Although non-parametric kernels are an option, such models require large datasets. While exponential kernels are more data efficient and relevant for certain applications where events immediately trigger more events, they are ill-suited for applications where latencies need to be estimated, such as in neuroscience. This work aims to offer an efficient solution to TPP inference using general parametric kernels with finite support. The developed solution consists of a fast L2 gradient-based solver leveraging a discretized version of the events. After supporting the use of discretization theoretically, the statistical and computational efficiency of the novel approach is demonstrated through various numerical experiments. Finally, the effectiveness of the method is evaluated by modeling the occurrence of stimuli-induced patterns from brain signals recorded with magnetoencephalography (MEG). Given the use of general parametric kernels, results show that the proposed approach leads to a more plausible estimation of pattern latency compared to the state-of-the-art.
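To make the idea of a discretized L2 objective concrete, here is a minimal univariate sketch. It is not the FaDIn implementation: the function names, the truncated Gaussian kernel, and the parameters `mu`, `alpha`, `m`, `sigma`, `W` and `delta` are assumptions chosen for illustration. Events are binned on a grid of step `delta`, the kernel is sampled on the same grid over its finite support `[0, W]`, and the standard least-squares contrast for point processes (integral of the squared intensity minus twice the intensity summed at the events) is approximated on the grid.

```python
import math


def discretize_events(events, T, delta):
    """Count the events falling in each bin of width delta on [0, T]."""
    n_bins = int(round(T / delta))
    z = [0] * n_bins
    for t in events:
        idx = min(int(t / delta), n_bins - 1)
        z[idx] += 1
    return z


def truncated_gaussian_kernel(m, sigma, W, delta):
    """Sample a Gaussian bump at lags delta, 2*delta, ..., W.

    phi[j] is the kernel value at lag (j + 1) * delta. The values are
    rescaled so that sum(phi) * delta == 1 (unit mass on the grid).
    """
    n_lags = int(round(W / delta))
    vals = [
        math.exp(-0.5 * (((j + 1) * delta - m) / sigma) ** 2)
        for j in range(n_lags)
    ]
    norm = sum(vals) * delta
    return [v / norm for v in vals]


def intensity(z, mu, alpha, phi):
    """Discretized intensity: lam[s] = mu + alpha * sum_k phi[k-1] * z[s-k]."""
    lam = []
    for s in range(len(z)):
        acc = 0.0
        # Only past bins within the kernel support contribute.
        for k in range(1, min(s, len(phi)) + 1):
            acc += phi[k - 1] * z[s - k]
        lam.append(mu + alpha * acc)
    return lam


def l2_loss(z, lam, delta):
    """Grid approximation of: integral(lam^2) - 2 * sum of lam at the events."""
    quad = delta * sum(l * l for l in lam)
    fit = sum(zi * li for zi, li in zip(z, lam))
    return quad - 2.0 * fit
```

Because the kernel has finite support, each intensity value depends on at most `W / delta` past bins, which is what keeps the cost of evaluating the loss bounded regardless of the kernel's parametric form. A gradient-based solver would then minimize `l2_loss` over `mu`, `alpha` and the kernel parameters.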
Unmixing Noise from Hawkes Process to Model Learned Physiological Events 2025
Virginie Loison, Guillaume Staerman, Thomas Moreau
In AISTATS
Physiological signal analysis often involves identifying events crucial to understanding biological dynamics. Traditional methods rely on handcrafted procedures or supervised learning, presenting challenges such as expert dependence, lack of robustness, and the need for extensive labeled data. Data-driven methods like Convolutional Dictionary Learning (CDL) offer an alternative but tend to produce spurious detections. This work introduces UNHaP (Unmix Noise from Hawkes Processes), a novel approach addressing the joint learning of temporal structures in events and the removal of spurious detections. Leveraging marked Hawkes processes, UNHaP distinguishes between events of interest and spurious ones. By treating the event detection output as a mixture of structured and unstructured events, UNHaP efficiently unmixes these processes and estimates their parameters. This approach significantly enhances the understanding of event distributions while minimizing false detection rates.
Convolutional Sparse Coding for Time Series Via a l0 Penalty: An Efficient Algorithm With Statistical Guarantees 2024
Charles Truong, Thomas Moreau
In Statistical Analysis and Data Mining
Identifying characteristic patterns in time series, such as heartbeats or brain responses to a stimulus, is critical to understanding the physical or physiological phenomena monitored with sensors. Convolutional sparse coding (CSC) methods, which aim to approximate signals by a sparse combination of short signal templates (also called atoms), are well-suited for this task. However, enforcing sparsity leads to non-convex and intractable optimization problems. This article proposes finding the optimal solution to the original and non-convex CSC problem when the atoms do not overlap. Specifically, we show that the reconstruction error satisfies a simple recursive relationship in this setting, which leads to an efficient detection algorithm. We prove that our method correctly estimates the number of patterns and their localization, up to a detection margin that depends on a certain measure of the signal-to-noise ratio. In a thorough empirical study, with simulated and real-world physiological data sets, our method is shown to be more accurate than existing algorithms at detecting the patterns' onsets.
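As an illustration of how a recursion over the reconstruction error can yield an exact solver when atoms do not overlap, here is a minimal dynamic-programming sketch. It is not the authors' algorithm: it assumes a single known atom, a free amplitude per occurrence, and a fixed penalty `beta` per detected occurrence, and the function name `detect_nonoverlapping` is hypothetical. The best cost for the first `t` samples is either the best cost for `t - 1` samples plus the energy of the last sample (no atom ends there), or the best cost for `t - L` samples plus the residual of fitting the atom on the last `L` samples plus `beta`.

```python
def detect_nonoverlapping(x, atom, beta):
    """Exact l0-penalized detection of one atom with non-overlapping copies.

    Minimizes ||x - sum_k a_k * atom(. - t_k)||^2 + beta * K over the
    number K of occurrences, their onsets t_k and their amplitudes a_k.
    Returns the sorted onsets and the optimal penalized cost.
    """
    n, L = len(x), len(atom)
    atom_energy = sum(a * a for a in atom)
    cost = [0.0] * (n + 1)          # cost[t]: optimal cost for x[:t]
    ends_atom = [False] * (n + 1)   # True if an atom ends exactly at t

    for t in range(1, n + 1):
        # Option 1: sample t-1 is left unexplained.
        cost[t] = cost[t - 1] + x[t - 1] ** 2
        if t >= L:
            # Option 2: an atom covers x[t-L:t] with its best amplitude.
            seg = x[t - L:t]
            corr = sum(s * a for s, a in zip(seg, atom))
            residual = sum(s * s for s in seg) - corr * corr / atom_energy
            candidate = cost[t - L] + residual + beta
            if candidate < cost[t]:
                cost[t] = candidate
                ends_atom[t] = True

    # Backtrack to recover the onsets.
    onsets = []
    t = n
    while t > 0:
        if ends_atom[t]:
            onsets.append(t - L)
            t -= L
        else:
            t -= 1
    onsets.reverse()
    return onsets, cost[n]
```

The loop visits each sample once and does work proportional to the atom length, so the cost is linear in the signal length for a fixed atom. Larger values of `beta` make the solver more conservative, since an occurrence is kept only when it reduces the squared error by more than `beta`.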
Flexible Parametric Inference for Space-Time Hawkes Processes 2025
Emilia Siviero, Guillaume Staerman, Stephan Clémençon, Thomas Moreau
In DSAA
Many modern spatio-temporal data sets, in sociology, epidemiology or seismology, for example, exhibit self-exciting characteristics, with both triggering and clustering behaviors, that a suitable space-time Hawkes process can accurately capture. This paper aims to develop a fast and flexible parametric inference technique to recover the parameters of the kernel functions involved in the intensity function of a space-time Hawkes process based on such data. Our statistical approach combines three key ingredients: 1) kernels with finite support are considered, 2) the space-time domain is appropriately discretized, and 3) (approximate) precomputations are used. The inference technique we propose then consists of an l2 gradient-based solver that is fast and statistically accurate. In addition to describing the algorithmic aspects, numerical experiments have been carried out on synthetic and real spatio-temporal data, providing solid empirical evidence of the relevance of the proposed methodology.
The largest EEG-based BCI reproducibility study for open science: the MOABB benchmark 2024
Sylvain Chevallier, Igor Carrara, Bruno Aristimunha, Pierre Guetschel, Sara Sedlar, Bruna Lopes, Sebastien Velut, Salim Khazem, Thomas Moreau
Preprint, arXiv
Motivated by the challenge of seamless cross-dataset transfer in EEG signal processing, this article presents an exploratory study on the use of Joint Embedding Predictive Architectures (JEPAs). In recent years, self-supervised learning has emerged as a promising approach for transfer learning in various domains. However, its application to EEG signals remains largely unexplored. In this article, we introduce Signal-JEPA for representing EEG recordings, which includes a novel domain-specific spatial block masking strategy and three novel architectures for downstream classification. The study is conducted on a 54-subject dataset, and the downstream performance of the models is evaluated on three different BCI paradigms: motor imagery, ERP and SSVEP. Our study provides preliminary evidence for the potential of JEPAs in EEG signal encoding. Notably, our results highlight the importance of spatial filtering for accurate downstream classification and reveal an influence of the length of the pre-training examples, but not of the mask size, on the downstream performance.