Connect via the
Ecodep Calendar: click an event to get its Zoom link. Hybrid talks also list a room for the event.
From January 2023 the seminar is held at IHP, 11 Rue Pierre et Marie Curie, 75005 Paris, where it should reach a larger audience.
Talks are recorded on the
Ecodep Youtube Channel.
Announcements of seminar sessions held at IHP are also
posted on the site of IHP.
- 6/21/2023, 14:00. Alexander Drewitz (Cologne University).
Room 01.
(Near-)critical percolation with long-range correlations on transient graphs.
Abstract.
Percolation models have played a fundamental role in statistical physics for several
decades. They were first investigated in the context of the gelation of polymers
during the 1940s by Flory, later a Nobel laureate in chemistry, and Stockmayer.
From a mathematical point of view, the birth of percolation theory was the
introduction of Bernoulli percolation by Broadbent and Hammersley in 1957,
motivated by research on gas masks for coal miners. One of the key features
of this model is the inherent stochastic independence which simplifies its
investigation, and which has led to deep mathematical results. During recent
years, the investigation of the more realistic and at the same time more complex
situation of percolation models with long-range correlations has attracted
significant attention.
We will exhibit some recent progress for the Gaussian free field
with a particular focus on the understanding of the critical parameters in the
associated percolation models. What is more, we also survey recent progress in
the understanding of the model at criticality via its critical exponents as well
as the universality in the local geometry of the underlying graph.
This talk is based on joint works with A. Prévost (U Cambridge) and
P.-F. Rodriguez (Imperial College).
- 6/20/2023, 14:00. Xiaoyin Li (St. Cloud State University).
Room 201.
Using Statistical and Computational Methods to Identify Genetic Variants in Large-scale Genomic Data.
Abstract.
Recent advances in sequencing technologies make it possible to sequence a large
number of subjects and test many genetic variants. Using statistical and
computational methods, my goal is to identify regions of the genome that
influence several disorders, which is often called “pleiotropy”. The term
“pleiotropy” describes the phenomenon of a single genetic variant influencing
multiple traits of an organism; identifying such variants can help us gain a
better understanding of disease pathology. Given the importance of these
functions, the identification and characterization of this pleiotropy are
crucial for a comprehensive biological understanding of complex traits and
disease states. Within this broad topic, I address three questions:
a) Which loci in the genome govern the co-occurrence of disorders?
b) Through what mechanism do genetic variants influence pairs of traits?
c) What statistical models are best suited to identify pleiotropic variants
from large-scale genetic data?
- 6/14/2023, 14:00. TBA (TBA).
Room 201.
- 5/31/2023, 14:00. Anne Van Delft
(Columbia University, New York). Room 421.
A statistical framework for analyzing shape in a time series of random geometric objects.
- 5/10/2023, 14:00. Zaher Khraibani (CYU Cergy Paris).
Online, Slides,
A Non Parametric test based on Extremal Process :
Dependent Space and Time Components.
Abstract.
We consider the successive occurrence times {S_{n}}_{n≥0} of some phenomenon,
such as the occurrence times of the first clinical cases of a new disease,
or the occurrence times of the breakdowns of some new machine, or the occurrence
times of earthquakes, or the occurrence times of bacteria of a new species. In order
to control the phenomenon as early as possible, we wish to detect from these
first occurrence times whether we are dealing with a sporadic phenomenon or an emergent one,
meaning that the mean occurrence rate increases as time increases. This
calls for exact and sensitive statistics.
Since the smallest values
of the interarrival times {ΔS_{k}}_{k≥1} and the interarrival
times between successive
such small values are particularly indicative of a potential emergence, we will
consider the extremal process {R(t)}_{t≥0} built from the point process
{T_{k}, X_{k}}_{k≥1},
where T_{k} := S_{k} − S_{1} is the occurrence time of the
(k + 1)th case with T_{1} = 0, and X_{k} =
(ΔS_{k})^{-1}.
By construction {R(t)} is a jump process that jumps at the successive upper record values
{R_{m}}_{m≥1} of the sequence {X_{k}}, where R_{1} :=
X_{1}.
We will derive the distributions of {R(t)} and {N(t)} (the number of records in (0,t]) and,
on a simulated trajectory of a slowly emergent phenomenon, illustrate the test of the
assumption H_{0} of a sporadic phenomenon (the {ΔS_{k}} are i.i.d.),
using either the extremal process {R(t)} or the record process {R_{m}}. We will
show that the extremal process is much more efficient than the record process.
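The construction in this abstract can be illustrated with a minimal sketch, assuming NumPy and a finite list of occurrence times S_0 < S_1 < ... < S_n. The function names (`extremal_and_records`, `extremal_process`, `record_count`) and the toy data are hypothetical and are not taken from the talk; the code only computes X_k, the record indicators, R(t) and N(t) as defined above, not the test itself.

```python
import numpy as np

def extremal_and_records(s):
    """From occurrence times S_0 < S_1 < ... < S_n, build T_k, X_k,
    the running maximum of X_k and the upper-record indicators."""
    s = np.asarray(s, dtype=float)
    ds = np.diff(s)                           # ΔS_k = S_k - S_{k-1}, k = 1..n
    x = 1.0 / ds                              # X_k = (ΔS_k)^{-1}
    t = s[1:] - s[1]                          # T_k = S_k - S_1, so T_1 = 0
    running_max = np.maximum.accumulate(x)    # max of X_1..X_k
    is_record = np.ones(len(x), dtype=bool)   # X_1 is the first record R_1
    is_record[1:] = x[1:] > running_max[:-1]  # strict new maximum = upper record
    return t, x, running_max, is_record

def extremal_process(t_query, t, running_max):
    """R(t): largest X_k among the points with T_k <= t (for t >= 0)."""
    idx = np.searchsorted(t, t_query, side="right") - 1
    return running_max[idx]

def record_count(t_query, t, is_record):
    """N(t): number of records whose jump time lies in (0, t]."""
    return int(np.sum(is_record & (t > 0) & (t <= t_query)))

# Toy occurrence times (made up for illustration only).
s = [0.0, 2.0, 3.0, 3.5, 5.5, 5.75]
t, x, running_max, is_record = extremal_and_records(s)
print(extremal_process(2.0, t, running_max), record_count(2.0, t, is_record))
```

Short interarrival times give large X_k, so a burst of closely spaced events shows up as a jump of R(t).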
- 5/3/2023, 14:00. Marie-Pierre Etienne (Agro Rennes). ENSAI, Rennes: room 8.
Non linear state space models in Ecology.
- 4/19/2023, 14:00. Davide Faranda
(LSCE-CEA-CNRS-UVSQ-Université Paris-Saclay). Room 421.
Analogue Methods for attribution of extreme events to climate change,
application to the exceptional 2022 European-Mediterranean drought.
Abstract.
A prolonged drought affected Western Europe and the Mediterranean region in the first
nine months of 2022 producing large socio-ecological impacts. The role of
anthropogenic climate change (ACC) in exacerbating this drought has been often
invoked in the public debate, but the link between atmospheric circulation and
ACC has not received much attention so far. Here we address this question by
applying the method of circulation analogs, which allows us to identify atmospheric
patterns in the period 1836-2021 very similar to those that occurred in 2022. By comparing
the circulation analogs from a period when global warming was absent (1836-1915) with those
from recent decades (1942-2021), and by excluding interannual and interdecadal
variability as possible drivers, we identify the contribution of ACC. The 2022
drought was associated with an anticyclonic anomaly over Western Europe that persisted
from December 2021 to August 2022. Circulation analogs of this atmospheric pattern in
1942-2021 feature 500 hPa geopotential height anomalies larger in both extent and
magnitude, and higher temperatures at the surface, relative to those in 1836-1915.
Both factors exacerbated the drought, by increasing the area affected and enhancing
soil drying through evapotranspiration. While the occurrence of the atmospheric
circulation associated with the 2022 drought has not become more frequent in
recent decades, its interdecadal variability has increased, and the
influence of the Atlantic Multidecadal Oscillation on this increase cannot be ruled out.
Reference. Davide Faranda, Salvatore Pascale, Burak Bulut.
Persistent anticyclonic conditions and climate change exacerbated the exceptional
2022 European-Mediterranean drought. Environmental Research Letters, In press.
⟨hal-03907855⟩.
- 3/15/2023, 9:00-20:00. A one-day conference at IHP,
Room 314. See
Conference 2023.
- 2/15/2023, 14:00. Anne Leucht (Bamberg University, Germany).
Room 01.
Testing Equality of Spectral Density Operators for Functional Processes.
Abstract.
The problem of comparing the entire second order structure of two
functional processes is considered and an L^{2}-type statistic for
testing equality of the corresponding spectral density operators is investigated.
The test statistic evaluates, over all frequencies, the Hilbert-Schmidt distance
between the two estimated spectral density operators. Under certain assumptions,
the limiting distribution under the null hypothesis is derived. A novel frequency
domain bootstrap method is introduced, which leads to a more accurate
approximation of the distribution of the test statistic under the null
than the large sample Gaussian approximation derived. Under quite general
conditions, asymptotic validity of the bootstrap procedure is established for
estimating the distribution of the test statistic under the null. Furthermore,
consistency of the bootstrap-based test under the alternative is proved.
Numerical simulations show that, even for small samples, the bootstrap-based
test has a very good size and power behavior.
Slides,
Reference,
Preprint.
- 2/1/2023, 14:00.
Azadeh Khaleghi (ENSAE Palaiseau).
Room 01.
On some Possibilities and Limitations of Restless Multi Armed Bandits.
Abstract.
The Multi-Armed Bandit (MAB) problem is one of the most central instances of sequential
decision making under uncertainty, which plays a key role in online learning and
optimization. MABs arise in a variety of modern real-world applications,
such as online advertisement, Internet routing, and sequential portfolio
selection, to name only a few. In this problem, a forecaster aims to maximize
the expected sum of the rewards actively collected from unknown processes.
Stochastic MABs are typically studied under the assumption that the rewards are
i.i.d. However, this assumption does not necessarily hold in practice.
In this talk I will discuss some possibilities and limitations of a more
challenging, yet more realistic (restless) MAB setting, where the reward
distributions may exhibit long-range dependencies. Together with Steffen
Grünewälder, we studied the problem in the case where the pay-off distributions
are stationary φ-mixing. Though a slightly more realistic model for most
real-world applications, this problem cannot be optimally solved in practice
as it is known to be PSPACE-hard. We characterize a subclass of the problem
where good approximate solutions can be found using tractable approaches.
Specifically, it is shown that if the sequence of φ-mixing coefficients
is summable, a modified version of the UCB algorithm can prove effective.
The main challenge is that, unlike in the i.i.d. setting, the distributions
of the sampled pay-offs may not have the same characteristics as those of the
original bandit arms. In particular, the φ-mixing property does not
necessarily carry over. This is overcome by carefully controlling the effect
of a sampling policy on the pay-off distributions. Some of the proof techniques
developed in this paper can be more generally used in the context of online
sampling under dependence. I will also briefly introduce some of our recent
results with Gabor Lugosi on the estimation of α and β-mixing
coefficients from stationary sample paths. I will conclude with a discussion
on the potential application of these results to the approximation of the
optimal strategy in restless MABs.
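For readers unfamiliar with UCB, here is a minimal sketch of the classical UCB1 index policy in the i.i.d. setting. It is not the modified algorithm for φ-mixing pay-offs discussed in the talk; the function name `ucb1`, the Bernoulli arms and the horizon are assumptions made for illustration.

```python
import math
import random

def ucb1(pull, n_arms, horizon):
    """Classical UCB1: play each arm once, then always pick the arm with the
    largest empirical mean plus exploration bonus sqrt(2 ln t / n_a)."""
    counts = [0] * n_arms      # number of pulls per arm
    sums = [0.0] * n_arms      # cumulated reward per arm
    for t in range(1, horizon + 1):
        if t <= n_arms:
            arm = t - 1        # initialisation round
        else:
            arm = max(
                range(n_arms),
                key=lambda a: sums[a] / counts[a]
                + math.sqrt(2.0 * math.log(t) / counts[a]),
            )
        reward = pull(arm)
        counts[arm] += 1
        sums[arm] += reward
    return counts, sums

# Two hypothetical Bernoulli arms with success probabilities 0.2 and 0.8.
rng = random.Random(0)
means = [0.2, 0.8]
counts, sums = ucb1(lambda a: 1.0 if rng.random() < means[a] else 0.0, 2, 2000)
print(counts)
```

In the restless setting of the abstract, the rewards seen by the policy depend on when each arm is sampled, which is why the plain index above needs to be modified.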
- 1/25/2023, 14:00. Mathias Bourel
(Universidad de la República, Uruguay). Room 01.
Two examples of application of Statistical
Learning to ecological problems in Uruguay.
Abstract. Databases in ecology represent a real challenge for
statistical modelling. Indeed, these databases have particularities that make
the usual techniques not necessarily suitable: few data, many
explanatory variables compared to the number of observations, non-linearity,
missing data, violation of the usual assumptions, etc. Statistical learning
has developed a great deal in this type of context, allowing a better understanding
of the data and better predictions. In this presentation, we will present two
examples of the application of machine learning to prediction problems in
ecology in Uruguay. The first one consists in predicting the presence-absence
of marine phytoplankton species on the east coast using consensus models.
These are meta-models that combine the predictions of other models.
The second example is about proposing models for the prediction of beach
quality in Montevideo to indicate whether swimming is allowed.
Slides.
- 1/18/2023, 14:00. Baptiste Alglave
(Agrocampus-Ouest, Rennes),
(EMH - Ifremer Nantes) and (Seattle, USA). Room 01.
Inferring fish spatio-temporal
distribution and identifying essential habitats: tackling the challenge
of preferential sampling and change of support to integrate heterogeneous
data sources. Abstract.
- 1/11/2023, 14:00. Lionel Truquet (ENSAI Rennes). Room 201.
Inference of Taylor's law parameters in ecology.
- 12/7/2022, 14:00. Philippe Naveau
(LSCE-CEA-CNRS-UVSQ-Université Paris-Saclay)
Room E554.
Climate models.
Abstract.
Numerical climate models are complex and combine a large number of
physical processes. They are key tools in quantifying the relative
contribution of potential anthropogenic causes (e.g., the current
increase in greenhouse gases) on high-impact atmospheric variables
like heavy rainfall or temperatures. These so-called climate extreme
event attribution problems are particularly challenging in a
multivariate context, that is, when the atmospheric variables are
measured on a possibly high-dimensional grid. In addition, global
climate models, like any in silico numerical experiments, are
affected by different types of bias.
In this talk, I will discuss how to combine two statistical
theories to assess causality in the context of extreme event attribution.
In addition, the question of uncertainty quantification, which remains a
challenge in any climate attribution analysis, will be explored from various
directions. In particular, a simple model bias correction step for records
will be described in detail. To illustrate our approach, we infer emergence
times in precipitation from the CMIP5 and CMIP6 archives.
Joint work
with Anna Kiriliouk, Paula Gonzalez, Soulivanh Thao and Julien Worms.
Slides.
- Kiriliouk, A., and P. Naveau (2020). Climate extreme event attribution using
multivariate peaks-over-thresholds modeling and counterfactual theory.
Ann. Appl. Stat., 14 (3), 1342–1358
- Naveau P. and S. Thao (2022). Multi-model errors and emergence times in
climate attribution studies. Journal of Climate.
- Worms J. and P. Naveau. (2022, in press). Record events attribution in
climate studies. Environmetrics ⟨hal-02938596⟩.
- 11/30/2022, 14:00. Vytaute Pilipauskaite
(Aalborg University, Denmark). Room E554.
Parameter estimation of discretely observed interacting particle systems.
Abstract.
In this talk we consider the problem of joint estimation of
parameters in the drift and diffusion coefficients of a system
of N interacting particles associated to a McKean-Vlasov
equation. Using a pseudo likelihood approach we propose a contrast
function based on discrete observations of the interacting particle
system over a fixed interval [0, T]. We show that the associated
estimator is consistent when the discretization step Δ
and the number of particles N satisfy Δ → 0 and N
→ ∞,
and asymptotically normal when additionally the condition
ΔN → 0 holds. The talk is based on joint work (arXiv:2208.11965)
with C. Amorino, A. Heidari and M. Podolskij.
Slides.
- 11/23/2022, 14:00. Frédéric Audard
(Marseille University). Room E554.
Mobility modeling, regional traffic generation.
Transports: 13 minutes, Marseille (Video).
- 11/9/2022, 14:00. Denys Pommeret
(Marseille University). Room E554.
Comparison of stationary processes.
Abstract.
We consider the problem of comparison of strictly
stationary processes. We first recall that the two sample case has
been studied for short and long memory processes. In both cases the
first works considered the comparison of the marginal distributions.
An extension to the whole distribution of the processes will be presented
in this talk. Then we will discuss the generalization to the K-sample
case. We will deduce a clustering procedure for time series.
Reference. Pommeret, D., Reboul, L. & Yao, AF.
Testing the equality of the laws of two strictly stationary processes.
Stat Inference Stoch Process (2022).
Doi 10.1007/s11203-022-09272-w.
Preprint.
- 10/26/2022, 10:30. Workshop.
Tolbiac PMF Room B11-12, Floor 11. Open discussion.
Global Warming: a machine learning approach.
- 10/19/2022, 15:30. Alejandra Cabana
(Autonomous University of Barcelona). Room E554.
Exploring different semimetrics in nonparametric regression for functional data.
Abstract.
We consider nonparametric kernel-based classification and regression models
for predicting a response variable using functional,
categorical and/or continuous covariates. We compare the
performance of those models using the Hausdorff, Wasserstein
and L^{2} (semi-)metrics by applying them to real-world data
sets. As expected, a remarkable difference in the performance of the models
is observed when varying the (semi-)metric,
depending on the type of data.
- 10/19/2022, 14:00. Argimiro Arratia
(Polytechnical University of Catalonia). Room E554.
Neural Ordinary Differential Equations and universal systems.
Abstract.
Neural Ordinary Differential Equations (NODE) have emerged as a novel approach
to deep learning: instead of specifying a discrete sequence of hidden
layers, one parameterizes the derivative of the hidden state using a neural
network. The solution to the underlying dynamical system is a flow, and
various works have explored the universality of flows, in the sense of being
able to approximate any analytical function. In the first part of my talk
I will explain the NODE paradigm and how it can be used as an enhanced model
of dynamical systems. Then I will present preliminary work aimed at identifying
families of systems of ordinary differential equations (SODE) that are
universal, in the sense that they encompass most of the systems of differential
equations that appear in practice. Once one of these (candidate) universal SODEs is found, we define a process that generates
a family of NODEs whose flows are precisely the solutions of the universal SODEs found above.
Our candidates for universal SODE are:
1) the generalized Lotka-Volterra (LV) families of differential equations;
2) the Riccati dynamical systems; and 3) the S-systems.
We present the NODE models built upon each one of these dynamical systems
and a description of their appropriate flows together with some preliminary
implementations of these processes and results on learning some analytical
functions.
Preprint.
Slides.
- 05/18/2022, 14:00. Marina Friedrich.
(Tinbergen Institute, Amsterdam) Room E554.
Sieve bootstrap inference for time-varying coefficient models.
Abstract. We propose a sieve bootstrap framework to conduct
pointwise and simultaneous inference for time-varying coefficient regression
models based on a nonparametric local linear estimator. The asymptotic validity
of the sieve bootstrap in the presence of autocorrelation is established. We find
that it automatically produces a consistent estimation of nuisance parameters,
both at the interior and boundary points. In addition, we develop a bootstrap
test for parameter constancy and show that it is asymptotically correctly sized.
An extensive simulation study supports our findings. The proposed methods are
applied to assess the price development of CO2 certificates
in the European Emissions Trading System (EU ETS). We find evidence of time
variation in the relationship between allowance prices and their fundamental
price drivers. Working paper.
Slides.
- 05/4/2022, 14:00. Marion Borderon
(University of Vienna).
Room E554.
Impact of climate change on human migration. Empirical Issues.
Abstract.
While there is general agreement in the scientific community that environmental
change could have a major impact on population distribution and migration patterns,
our knowledge of the nature and role of these impacts is still limited. Migration
decision processes are multifactorial, multi-scale and complex in nature, as
environmental change can be, so studying the nexus of the two raises a
significant number of challenges. Recent reviews of the literature highlight that
the current research field is mainly divided between detailed empirical case
studies on the micro level that often draw on self-reported environmental
information and with limited scope for generalization, and global and national
assessments on the macro level that do not sufficiently represent the local
situation. An interesting avenue is to harness survey data coupled with climate
or environmental data at the sub-national level in order to narrow the gap between
our theoretical knowledge and our capacity to empirically study the migration-climate
change nexus. Some examples of empirical approaches to the migration-climate
change nexus will be discussed.
- 05/2/2022, 14:00 (Monday). Imma Curato
(University of Ulm).
Room E554.
Light cones and supervised learning prediction tasks.
Abstract.
Slides.
- 04/27/2022, 14:00. Pierre Jacob
(ESSEC, Cergy-Pontoise). Room E554.
Some methods based on couplings of Markov chain Monte Carlo algorithms.
Slides.
Abstract.
Markov chain Monte Carlo algorithms are commonly used to approximate a variety
of probability distributions. I will review the idea of coupling in the context
of Markov chains, and how this idea not only leads to theoretical analyses of
Markov chains but also to new Monte Carlo methods. In particular, the talk
will describe how coupled Markov chains can be used to obtain 1) unbiased
estimators of expectations and of normalizing constants, 2) non-asymptotic
convergence diagnostics for Markov chains, and 3) unbiased estimators of
the asymptotic variance of MCMC ergodic averages. The latter is based on novel
connections between the Poisson equation and coupled Markov chains.
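As a small illustration of the coupling idea, the sketch below samples from a maximal coupling of two normal distributions by rejection. This is a standard building block for coupled chains and not necessarily the construction used in the talk; the function names and the choice of equal-variance normals are assumptions.

```python
import math
import random

def normal_pdf(x, mu, sigma):
    """Density of N(mu, sigma^2) at x."""
    z = (x - mu) / sigma
    return math.exp(-0.5 * z * z) / (sigma * math.sqrt(2.0 * math.pi))

def maximal_coupling(mu_p, mu_q, sigma, rng):
    """Return (X, Y) with X ~ N(mu_p, sigma^2), Y ~ N(mu_q, sigma^2)
    and the probability of X == Y as large as possible."""
    x = rng.gauss(mu_p, sigma)
    w = rng.uniform(0.0, normal_pdf(x, mu_p, sigma))
    if w <= normal_pdf(x, mu_q, sigma):
        return x, x                      # the two draws meet
    while True:                          # otherwise sample Y from the residual of q
        y = rng.gauss(mu_q, sigma)
        w_star = rng.uniform(0.0, normal_pdf(y, mu_q, sigma))
        if w_star > normal_pdf(y, mu_p, sigma):
            return x, y

rng = random.Random(1)
pairs = [maximal_coupling(0.0, 1.0, 1.0, rng) for _ in range(2000)]
meet_frequency = sum(x == y for x, y in pairs) / len(pairs)
print(meet_frequency)
```

The meeting probability equals one minus the total variation distance between the two distributions, so the printed frequency should be close to that value for the chosen means.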
- 04/13/2022, 14:00. Edouard Fouché (KIT Karlsruhe). Room E554.
Automated Decision Making for Sustainable Energy Systems.
Abstract.
Designing future energy systems under the ongoing green transition is one of the major challenges of the 21st century. In particular, with the increasing share of renewables and electric transportation, it has become highly challenging to coordinate operations between producers and consumers, as the grid must supply highly reliable and green power at any time. Grid operators, producers, and consumers must make optimal real-time decisions in the face of an ever-changing, distributed, and uncertain environment. Humans cannot reasonably make such decisions, i.e., decisions must be automated. While Automated Decision Making (ADM) has been of interest in Artificial Intelligence for over fifty years, existing techniques fall short regarding constraints of real-world applications, e.g., energy systems, in which uncertainties may be very diverse.
In this talk, I will present my vision to tackle this problem. Our goal is to develop new methods for data-driven decision making and respective tools to optimize the operation of large dynamic systems. We will particularly consider emerging energy systems, such as the so-called micro-grids and multi-modal networks; the interconnection between different energy carriers and networks offers new ways of coping with intermittent power generation and consumption. I will also present some of our previous contributions to ADM, such as our recent extensions of the Multi-Armed Bandit problem to a variable number of plays and non-stationary settings, as well as implications concerning knowledge discovery from massive streams of data.
Reference.
- 03/23/2022, 14:00. Federico Maddanu
(CYU Cergy-Pontoise).
Forecasting highly persistent time series with
bounded spectrum processes. Slides.
Abstract.
Long memory models can be generalised by the Fractional equal-root
Autoregressive Moving Average (FerARMA) process, which displays short
memory for a suitable set of parameters. Consequently, the spectrum is bounded,
ensuring stationarity also for values of the memory parameter d larger than 0.5.
The FerARMA generalization is proposed here to forecast highly persistent
time series, such as climate records of tree rings and paleo-temperature
reconstructions. The main advantage of a bounded spectrum is that it allows
more accurate predictions than standard long memory models,
especially over long horizons.
- 03/16/2022, 14:00. Federico Maddanu
(CYU Cergy-Pontoise)
with Tommaso Proietti
(Roma Tor Vergata).
Time trends in atmospheric ethane. Slides.
Abstract.
Understanding the dynamics underlying ethane (C2H6) trends is of utmost
importance in the context of climate change. We focus on time series of ethane
abundance in the atmosphere recorded at six ground-stations located in both
the Northern and Southern Hemispheres. The trend component is hidden by both
a strong persistent annual cycle and the large amount of missing data (about 70%). Our approach proposes a novel structural model where both the cycle and trend evolve stochastically and can be estimated via the Kalman filter methodology. The results suggest a global pattern in the dynamics of ethane trends in both the Northern and Southern Hemispheres.
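To show how the Kalman filter copes with missing observations, here is a minimal sketch for the simplest structural model, a local level (random-walk trend plus noise). The talk's model also includes a stochastic cycle, which is omitted here; the function name, the variances and the toy series are assumptions.

```python
import math

def local_level_filter(y, obs_var, state_var, m0=0.0, p0=1e6):
    """Kalman filter for y_t = mu_t + eps_t, mu_t = mu_{t-1} + eta_t.
    Missing observations (None or NaN) skip the update step."""
    m, p = m0, p0                  # mean and variance of the state
    means, variances = [], []
    for obs in y:
        p = p + state_var          # prediction step for the random-walk trend
        if obs is not None and not math.isnan(obs):
            k = p / (p + obs_var)  # Kalman gain
            m = m + k * (obs - m)  # update with the observation
            p = (1.0 - k) * p
        means.append(m)
        variances.append(p)
    return means, variances

# Toy series with a gap (values made up for illustration).
series = [1.0, float("nan"), float("nan"), 1.2, 1.1]
means, variances = local_level_filter(series, obs_var=1.0, state_var=0.1)
print(means)
```

During a gap the filtered mean stays put while the variance grows by `state_var` per step, which is how the uncertainty due to missing data is carried forward.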
- 03/9/2022, 14:00. Pierre-Olivier Goffard
(Laboratory SAF, ISFA, Lyon).
Sequential Monte Carlo samplers to fit and compare insurance loss models.
Slides.
Abstract.
Insurance loss distributions are characterized by a high frequency of small
amounts and a lower, but not insignificant, occurrence of large claim amounts.
Composite models, which link two probability distributions, one for the “belly”
and the other for the “tail” of the loss distribution, have emerged in the
actuarial literature to take this specificity into account. The parameters
of these models summarize the distribution of the losses. One of them
corresponds to the breaking point between small and large claim amounts.
The composite models are usually fitted using maximum likelihood estimation.
A Bayesian approach is considered in this work. Sequential Monte Carlo samplers
are used to sample from the posterior distribution and compute the posterior
model evidences to both fit and compare the competing models.
The method is validated via a simulation study and illustrated on insurance
loss datasets.
Reference.
- 02/23/2022, 14:00. Luca Rolla & Alessandro Giovannelli
(Roma Tor Vergata). Room E554.
The forecasting performance of the factor model with martingale difference errors.
Slides.
Abstract. We compare the forecasting performance of two
factor models for a
large set of macroeconomic and financial time series: (i) the standard
principal-component model used by Stock and Watson (2002); (ii) the factor
model with martingale difference errors recently introduced by Lee and Shao
(2018). The factor model with martingale difference errors makes it possible to find a
linear transformation of the original series that
separates the resulting variables according to whether they
are conditionally mean independent of past information or not. In terms
of prediction, this feature of the model makes it possible to achieve optimal results
in the mean squared error sense despite considering a smaller set of factors,
i.e., those factors that display some form of dependence in the conditional mean. We adopt the classical diffusion index (DI) approach proposed by Stock and Watson to compare the empirical performance of the two methods when forecasting a large dataset of macroeconomic and financial monthly time series for the U.S. economy.
- 02/16/2022, 14:00. Marie-Pierre Etienne
(AgroParistech, Rennes).
Statistical methods for understanding movement data.
Slides.
Abstract.
Movement of organisms is one of the main mechanisms that govern relations between species.
Advances in biologging open promising perspectives in the study of animal movements at numerous scales.
It is now possible to record time series of animal locations over extended areas and long durations
with a high spatial and temporal resolution. One question addressed with this sort of data is
habitat selection, which can be defined as the relationship between the environmental
covariates and the stationary distribution, known in ecology as the resource selection function.
Using some ergodicity assumptions, it is possible to estimate this relationship
by long-term monitoring of an individual's movement. We explore the use of the Langevin model for
modelling animal movement in order to estimate the resource selection function.
Reference.
T Michelot, P Gloaguen, P G. Blackwell, M-P Étienne (2019)
The Langevin diffusion as a continuous-time model of animal movement and habitat selection.
Methods in Ecology and Evolution,
https://doi.org/10.1111/2041-210X.13275.
- 02/9/2022, 14:00. Laurence Reboul
(Aix-Marseille University, I2M). Room E554.
Estimation of Pickands dependence function of bivariate extremes under
mixing conditions,
with M. Boutahar, I. Kchaou.
Slides.
Abstract.
- 06/30/2021, 14:00. Denys Pommeret
(Aix-Marseille University, I2M / laboratory SAF, ISFA).
Comparing copulas, with Yves Ngounou. Slides.
Abstract.
Copulas are still extensively studied and used to model the dependence of multivariate
observations. Many applications can be found in fields such as energy, environment or ecology.
In a one-sample case, there are many tests to compare an observed copula to a target copula.
In the two-sample case, Rémillard and Scaillet (2009) proposed a test to compare two nonparametric
copulas, that is to test H_{0}: C_{1} = C_{2},
where C_{1} and C_{2} are two copulas observed on two iid samples, which
may be paired. To our knowledge, there is no extension to the K-sample case. However, the
increasing amount of data sometimes requires more comprehensive analyses. It is in this sense that
we propose an equality test of K copulas simultaneously when K populations are observed.
We propose to test the following hypothesis: H_{0}: C_{1} = C_{2} = ⋯ = C_{K},
from K iid samples, possibly paired. It is therefore a generalization of Rémillard and
Scaillet (2009). In addition, we obtain the exact
asymptotic distribution of the test statistic and the convergence of the test. The idea of the test is to
transform the observations to uniform laws, then to use the decomposition of the density of the
copula in the Legendre polynomials orthogonal basis. Returning to the copula function we obtain
what are called copula coefficients which characterize each copula. The test then amounts to
simultaneously comparing these coefficients. We provide some illustrations of this method, in
particular we suggest a clustering algorithm to classify populations with similar forms of dependence.
Reference.
YI Ngounou B, D Pommeret (2020) Nonparametric estimation of copulas and copula densities
by orthogonal projections.
arXiv:2010.15351.
- 06/23/2021, 14:00. Jüri Lember
(University of Tartu, Estonia).
An evolution model that satisfies detailed balance.
Abstract,
Slides.
Reference. J Lember, C Watkins (2020).
An Evolutionary Model that Satisfies Detailed Balance. Methodol. Comput. Appl. Probab.
doi.org/10.1007/s11009-020-09835-5.
- 06/9/2021, 14:00. Hansjörg Albrecher
(UNIL, Lausanne)
Asymptotic Analysis of the Greenwood Statistic and Extensions.
Abstract.
We revisit and unify the asymptotic analysis of the classical Greenwood
statistic, that is, the ratio of
the sum of squares to the squared sum of independent and identically distributed
random variables with regularly varying tails. We discuss some of its application
areas and extend the analysis to the situation of arbitrary powers. Finally, we
study the robustness of the asymptotic expressions when some of the terms in
the statistic are dropped. Part of the talk is based on recent joint work with
Brandon Garcia-Flores.
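The statistic described in this abstract is simple to compute. The sketch below assumes positive observations; the `power` argument is one plausible reading of the "arbitrary powers" extension (sum of p-th powers over the p-th power of the sum) and is an assumption, not a definition taken from the talk.

```python
def greenwood(xs, power=2.0):
    """Greenwood-type statistic: sum of x_i**power divided by (sum of x_i)**power.
    With power=2 this is the classical ratio of the sum of squares to the squared sum."""
    total = sum(xs)
    return sum(x ** power for x in xs) / total ** power

print(greenwood([1.0, 1.0, 1.0, 1.0]))   # equal terms: 4 / 16
print(greenwood([10.0, 0.1, 0.1, 0.1]))  # one dominant term: close to 1
```

For n equal positive terms the classical statistic equals 1/n, and it approaches 1 when a single term dominates the sum, which is why heavy tails matter for its asymptotics.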
References.
H Albrecher, S Ladoucette, J Teugels (2010). Asymptotics of the sample coefficient of variation and the sample dispersion.
JSPI, 140-2, 358-368.
Preprint.
H Albrecher, J Teugels (2007). Asymptotic analysis of a measure of variation.
Theor. Probability and Math. Statist. 74, 1–10.
- 6/2/2021, 14:00. Antonio Cuevas (Dept. Mat.
UAM, Madrid)
On the shape restrictions used in set estimation.
Abstract.
Set estimation techniques deal with the problem of reconstructing a compact set S from a random sample of points.
Some shape restrictions on the target set S (often inspired by convexity-related notions) appear in a natural way in this context.
They are used in at least two ways: first, as a tool to simplify calculations when proving asymptotic results; second, as a guide to construct natural estimators via the "hull paradigm" (the estimator is defined as the "minimal set" including the sample and fulfilling the assumed shape restriction). We will review here some recent contributions providing examples of both situations.
This talk is based on joint work with Alejandro Cholaquidis (Universidad de la República, Uruguay) and Catherine Aaron (Université de Clermont-Ferrand, France).
References.
C Aaron, A Cholaquidis, A Cuevas (2017).
Detection of low dimensionality and data denoising via set estimation techniques.
Electronic Journal of Statistics, 11, 4596-4628.
A Cholaquidis, A Cuevas (2020). Set estimation under biconvexity restrictions.
ESAIM: Probability and Statistics, 24, 770-788.
- 5/5/2021, 14:00. Kamila Kare (SAMM, Paris 1, Panthéon-Sorbonne)
Data Driven Model Selection for Same-Realization Predictions in
Autoregressive Processes.
Slides.
Abstract. This paper is about the one-step-ahead prediction of future observations
drawn from an infinite-order autoregressive AR(∞) process. It aims to design penalties
(completely data driven)
ensuring that the selected model satisfies the efficiency property in the non-asymptotic framework.
We present an oracle inequality with a leading constant equal to one. Moreover, we also show that the
excess risk of the selected estimator enjoys the best bias-variance trade-off over the considered
collection.
To achieve these results, we needed to overcome the dependence difficulties by following a classical
approach which consists in restricting to a set where the empirical covariance matrix is equivalent
to the theoretical one. We show that this event happens with probability larger than
1-c_{0}/n^{3} with c_{0}>0.
The proposed data driven criteria are based on the minimization of a penalized criterion akin
to Mallows's C_{p}. Monte Carlo experiments are performed to highlight the obtained results.
Reference.
K Kare (2021). Data Driven Model Selection for Same-Realization Predictions in Autoregressive Processes.
Hal Preprint.
- 4/28/2021, 14:00. Alexander Kreiss (KU Leuven)
Non-Parametric Modelling of Interactions Among Vertices in Dynamic Networks.
Slides.
Abstract. We will consider dynamic networks in which the vertices (the actors) can interact
with each other along the edges of the network. We assume that over the observation period [0,T]
the number of vertices remains fixed while the edges between them may change randomly over time.
The occurrence of interactions between the actors is modelled by specifying a Cox-type model
which allows for additional, time-varying covariates. Our interest lies in non-parametrically estimating
the (possibly) time-varying effect of the covariates on the interactions. To this end, we introduce a
kernel-based local likelihood estimator and study its asymptotic (as the network grows) performance.
Moreover, we introduce two test statistics which evaluate the fit of the non-parametric model compared to parametric models. From a theoretical point of view, a particular challenge when handling this type of data is that neighboring actors in the network influence each other and cannot be treated as independent. We therefore introduce weak dependence measures on dynamic networks based on correlation, mixing and temporal m-dependence. The results are illustrated on bike sharing data.
This is partially joint work with Enno Mammen (Heidelberg) and Wolfgang Polonik (UC Davis).
References.
A Kreiß, E Mammen, W Polonik (2019) Nonparametric inference for continuous-time event counting and link-based dynamic network models.
https://doi.org/10.1214/19-EJS1588.
A Kreiß (2019) Correlation bounds, mixing and m-dependence under random time-varying network distances with an application to Cox-Processes.
https://arxiv.org/abs/1906.03179.
A Kreiß, E Mammen, W Polonik (2021) Testing For a Parametric Baseline-Intensity in Dynamic Interaction Networks.
https://arxiv.org/abs/2103.14668.
- 4/7/2021, 14:00. Diu Tran
(University of Jyväskylä, Finland)
Statistical inference for Vasicek-type model driven by Hermite processes.
Slides.
Abstract. Let Z denote a Hermite process of order q >= 1 and self-similarity parameter
H ∈ (1/2, 1).
This process is H-self-similar, has stationary increments and exhibits long-range dependence.
When q = 1, it corresponds
to the well-known fractional Brownian motion, whereas it is not Gaussian as soon as q >= 2.
In the talk, we deal with a Vasicek-type model driven by Z, of the form dX_{t} = a(b − X_{t})dt + dZ_{t}.
This model includes the fractional Vasicek model and Hermite-driven Ornstein-Uhlenbeck process.
Here, a > 0 and b ∈ R are considered as unknown drift parameters.
We provide estimators for a and b based on
continuous-time observations. For all possible values of H and q, we prove strong consistency and we analyze the asymptotic fluctuations.
This is a first step towards estimating the parameters of a stochastic equation driven by a Hermite process. Joint work with Prof.
Ivan Nourdin from University of Luxembourg.
Reference. I Nourdin, D Tran (2019). Statistical inference for Vasicek-type model driven by Hermite processes.
Stoch. Proc. Appl., 129, no. 10, pp. 3774-3791.
ArXiv.
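To make the model dX_{t} = a(b − X_{t})dt + dZ_{t} concrete, the following Python sketch simulates a path with an Euler scheme and recovers a and b from time averages. Two assumptions should be kept in mind. First, the driving noise here is a standard Brownian motion, i.e. the boundary case H = 1/2, which lies outside the range H ∈ (1/2, 1) of the talk; simulating a genuine Hermite process is not attempted. Second, the estimators shown are generic moment estimators based on the stationary variance H Γ(2H) / a^{2H} of a fractional Ornstein-Uhlenbeck process; whether they coincide with the estimators of the reference is not claimed. All function names and parameter values are hypothetical.

```python
import math
import random


def simulate_vasicek(a, b, horizon, dt, rng, x0=0.0):
    """Euler scheme for dX = a (b - X) dt + dW, with W a standard Brownian motion."""
    xs = [x0]
    for _ in range(int(round(horizon / dt))):
        x = xs[-1]
        xs.append(x + a * (b - x) * dt + rng.gauss(0.0, math.sqrt(dt)))
    return xs


def moment_estimators(xs, hurst=0.5):
    """Estimate (a, b) from the empirical mean and variance of a discretised path."""
    mean = sum(xs) / len(xs)
    var = sum((x - mean) ** 2 for x in xs) / len(xs)
    # Invert the stationary variance H * Gamma(2H) / a**(2H) to solve for a.
    a_hat = (var / (hurst * math.gamma(2 * hurst))) ** (-1.0 / (2 * hurst))
    return a_hat, mean


rng = random.Random(42)
path = simulate_vasicek(a=2.0, b=1.0, horizon=200.0, dt=0.01, rng=rng)
a_hat, b_hat = moment_estimators(path)
print(f"a_hat={a_hat:.3f}, b_hat={b_hat:.3f}")
```

The time average estimates b because the process reverts to b, while the spread of the path around that average carries the information about a: a stronger mean reversion gives a smaller stationary variance.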
- 3/31/2021, 15:00. Frederic Barraquand (IMB Bordeaux)
Inferring species interactions using Granger causality and convergent cross mapping.
Slides.
Abstract. Reliably inferring interactions between species from time series of their population densities is a long-standing goal of statistical ecology. Usually this inference is done using multivariate (linear) autoregressive models, defining interactions through Granger causality: x causes y whenever x helps predict future y values. However, the entangled nature of nonlinear ecological systems has suggested an alternative causal inference method based on attractor reconstruction, convergent cross mapping, which is increasingly popular in ecology. Here, we compare the two methods. They uncover interactions with surprisingly similar performance for predator-prey cycles, 2-species chaotic or stochastic competition, as well as 10- and 20-species networks. Thus, contrary to intuition, linear Granger causality remains useful to infer interactions in highly nonlinear ecological networks. We conclude that similarities between Granger-causal methods and convergent cross mapping are inevitable given how interactions are defined, and provide suggestions to improve many-species interaction inference.
Reference.
F Barraquand, C Picoche, M Detto, F Hartig (2019). Inferring species
interactions using Granger causality and convergent cross mapping.
https://arxiv.org/abs/1909.00731.
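The definition of Granger causality quoted in the abstract (x causes y whenever x helps predict future y values) can be sketched in a few lines of Python. The example compares an AR(1) model of y with the same model augmented by one lag of x, and reports the usual F statistic for the added regressor. The lag order of one, the simulated two-species system, and all function names are assumptions made for this illustration; the paper itself works with richer multivariate models.

```python
import random


def solve(a, b):
    """Solve a x = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    m = [row[:] + [rhs] for row, rhs in zip(a, b)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(col + 1, n):
            f = m[r][col] / m[col][col]
            for c in range(col, n + 1):
                m[r][c] -= f * m[col][c]
    x = [0.0] * n
    for r in range(n - 1, -1, -1):
        x[r] = (m[r][n] - sum(m[r][c] * x[c] for c in range(r + 1, n))) / m[r][r]
    return x


def rss(rows, targets):
    """Residual sum of squares of the least-squares fit of targets on rows."""
    k = len(rows[0])
    xtx = [[sum(r[i] * r[j] for r in rows) for j in range(k)] for i in range(k)]
    xty = [sum(r[i] * t for r, t in zip(rows, targets)) for i in range(k)]
    beta = solve(xtx, xty)
    return sum((t - sum(bi * ri for bi, ri in zip(beta, r))) ** 2
               for r, t in zip(rows, targets))


def granger_f(cause, effect):
    """F statistic for adding one lag of `cause` to an AR(1) model of `effect`."""
    targets = effect[1:]
    restricted = [[1.0, effect[t]] for t in range(len(effect) - 1)]
    full = [[1.0, effect[t], cause[t]] for t in range(len(effect) - 1)]
    rss_r, rss_f = rss(restricted, targets), rss(full, targets)
    return (rss_r - rss_f) / (rss_f / (len(targets) - 3))


# Hypothetical system in which x drives y but y does not drive x.
rng = random.Random(1)
x, y = [0.0], [0.0]
for _ in range(500):
    x_prev, y_prev = x[-1], y[-1]
    x.append(0.5 * x_prev + rng.gauss(0, 1))
    y.append(0.5 * y_prev + 0.8 * x_prev + rng.gauss(0, 1))

f_xy, f_yx = granger_f(x, y), granger_f(y, x)
print(f"F(x -> y) = {f_xy:.1f}, F(y -> x) = {f_yx:.1f}")
```

A large F statistic in one direction and a small one in the other is the signature of a one-way interaction under this definition.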
- 3/24/2021, 14:00. Benjamin Poignard
(Riken AIP, Osaka)
An introduction to sparsity: modelling, properties and applications.
Slides.
Abstract. The application domains of sparse modelling have
been substantially widened by the availability of high-dimensional data.
In particular, high-dimensional statistical modelling must contend with
a very large number of parameters to estimate. To tackle the
over-fitting issue, penalised/regularized estimation methods have been
gaining much attention. In this talk, I will introduce the concept of sparsity
together with the standard penalisation methods for sparse modelling and the
implications in terms of statistical properties. To illustrate the relevance of
sparse modelling, I will present some applications to models that typically
suffer from the so-called curse of dimensionality.
References.
B Poignard, J-D Fermanian (2021). High-dimensional penalized ARCH processes.
Econometric Reviews
Volume 40, 2021 - Issue 1.
B Poignard, M Asai (2020). A Penalised OLS Framework for High-Dimensional
Multivariate Stochastic Volatility Models.
Papers In Economics
& Business, Discussion Paper 20-02.
- 3/10/2021, 16:00. Julien Randon-Furling
(Paris 1, Panthéon Sorbonne)
Convex Hulls of Random Walks.
Slides.
Abstract.
This talk will cover a range of results on the convex hull of random walks in the plane and
in higher dimension:
expected perimeter length in the planar case, expected number of faces on the boundary,
expected d-dimensional volume, and other geometric properties of such random convex polytopes.
Applications in ecology include estimations of animals' home ranges and minimal habitat
sizes in conservation parks.
References.
J Randon-Furling, D Zaporozhets (2020).
Convex hulls of several multidimensional Gaussian random walks.
arXiv:2007.02768.
J Randon-Furling, F Wespi (2017).
Facets on the convex hull of d-dimensional Brownian and Lévy motion.
Physical Review E.
SN Majumdar, A Comtet, J Randon-Furling (2010).
Random convex hulls and extreme value statistics.
Journal of Statistical Physics.
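The expected perimeter length in the planar case, mentioned in the abstract above, lends itself to a quick Monte Carlo check. The Python sketch below builds the convex hull of a planar random walk with standard Gaussian steps (Andrew's monotone chain algorithm) and compares the average perimeter with the classical Spitzer-Widom identity E[L_n] = 2 Σ_{k=1}^{n} E|S_k| / k, which for such steps equals sqrt(2π) Σ_{k=1}^{n} k^{-1/2}. The number of steps, the number of trials, and the function names are assumptions chosen for the example.

```python
import math
import random


def convex_hull(points):
    """Andrew's monotone chain; returns hull vertices in counter-clockwise order."""
    pts = sorted(set(points))
    if len(pts) <= 2:
        return pts

    def cross(o, a, b):
        return (a[0] - o[0]) * (b[1] - o[1]) - (a[1] - o[1]) * (b[0] - o[0])

    lower, upper = [], []
    for p in pts:
        while len(lower) >= 2 and cross(lower[-2], lower[-1], p) <= 0:
            lower.pop()
        lower.append(p)
    for p in reversed(pts):
        while len(upper) >= 2 and cross(upper[-2], upper[-1], p) <= 0:
            upper.pop()
        upper.append(p)
    return lower[:-1] + upper[:-1]


def perimeter(hull):
    """Length of the closed polygon through the hull vertices."""
    return sum(math.dist(hull[i], hull[(i + 1) % len(hull)]) for i in range(len(hull)))


def gaussian_walk(n_steps, rng):
    """Planar walk started at the origin with independent N(0, 1) coordinates per step."""
    walk = [(0.0, 0.0)]
    for _ in range(n_steps):
        px, py = walk[-1]
        walk.append((px + rng.gauss(0, 1), py + rng.gauss(0, 1)))
    return walk


rng = random.Random(7)
n_steps, n_trials = 50, 400
estimate = sum(perimeter(convex_hull(gaussian_walk(n_steps, rng)))
               for _ in range(n_trials)) / n_trials
exact = math.sqrt(2 * math.pi) * sum(k ** -0.5 for k in range(1, n_steps + 1))
print(f"Monte Carlo: {estimate:.2f}, Spitzer-Widom: {exact:.2f}")
```

In the home-range application mentioned in the abstract, the walk plays the role of an animal's recorded positions and the hull is the minimum convex polygon enclosing them.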
- 1/27/2021, 15:00. Benjamin Bobbia
(CYU)
Extreme quantile regression: A coupling approach and Wasserstein distance.
Slides.
Abstract.
In this work, we develop two coupling approaches for extreme quantile regression. We
consider i.i.d. copies of Y in R and X in R^d and we want an estimation of the conditional
quantile of Y given X = x of order 1 - alpha for a very small alpha > 0.
We introduce the proportional tail model, strongly inspired by the heteroscedastic
extremes developed by Einmahl, de Haan and Zhou. The main assumption is that the
tail distribution of Y is asymptotically proportional to the conditional tail of Y given
X = x. We propose estimators of both the model parameters and the conditional
quantile, and study them by coupling methods.
References.
B Bobbia, C Dombry, D Varron (2020). The coupling method in extreme value theory.
https://arxiv.org/pdf/1912.03155
B Bobbia, C Dombry, D Varron (2020).
Extreme quantile regression in a proportional tail framework.
https://arxiv.org/pdf/2002.01740
- 12/2/2020, 16:00. Rolando Rebolledo
(University of Valparaiso)
Open-system approach to ecological networks.
Abstract,
Talk, and
Slides.
Reference.
R Rebolledo, SA Navarrete, S Kéfi, S Rojas, PA Marquet.
An Open-System Approach to Complex Biological Networks.
SIAM Journal on Applied Mathematics, 79(2):619–640, 2019.
- 11/25/2020, 16:00. Félix Cheysson
(AgroParisTech, Paris)
Properties of Hawkes processes. Talk,
Slides.
Abstract. Hawkes processes are a family of stochastic processes for which the
occurrence of any event increases the probability of further events occurring. When count
data are only observed in discrete time, we propose a spectral approach for the estimation
of Hawkes processes, by means of Whittle's parameter estimation method. To get asymptotic
properties for the estimator, we prove alpha-mixing properties for the series of counts,
using the Galton-Watson properties of the cluster representation of Hawkes processes.
Simulated datasets and an application to the incidence of measles in France illustrate
the performance of the estimator, notably for the Hawkes excitation function, even when
the time between observations is large.
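The cluster representation mentioned in the abstract above can be sketched as follows in Python: immigrant events arrive as a homogeneous Poisson process, and every event independently produces a Poisson number of offspring, which is the Galton-Watson structure. The events are then aggregated into counts over bins, mimicking data observed in discrete time. The exponential excitation kernel, the parameter values, and the function names are assumptions for this illustration; the Whittle estimation step of the talk is not reproduced.

```python
import math
import random


def poisson(lam, rng):
    """Knuth's multiplication method; adequate for small lam."""
    limit, k, p = math.exp(-lam), 0, rng.random()
    while p > limit:
        k += 1
        p *= rng.random()
    return k


def simulate_hawkes(mu, eta, beta, horizon, rng):
    """Hawkes process on [0, horizon) built from its cluster representation.

    mu: immigrant rate, eta: mean number of offspring per event (must be < 1),
    beta: rate of the exponential delay between a parent and its offspring.
    """
    events, queue = [], []
    t = rng.expovariate(mu)
    while t < horizon:  # immigrants: homogeneous Poisson process
        queue.append(t)
        t += rng.expovariate(mu)
    while queue:  # each event spawns Poisson(eta) offspring
        parent = queue.pop()
        events.append(parent)
        for _ in range(poisson(eta, rng)):
            child = parent + rng.expovariate(beta)
            if child < horizon:
                queue.append(child)
    return sorted(events)


def bin_counts(events, horizon, width):
    """Number of events in consecutive bins of the given width."""
    counts = [0] * int(math.ceil(horizon / width))
    for t in events:
        counts[min(int(t // width), len(counts) - 1)] += 1
    return counts


rng = random.Random(3)
events = simulate_hawkes(mu=1.0, eta=0.5, beta=2.0, horizon=2000.0, rng=rng)
counts = bin_counts(events, horizon=2000.0, width=5.0)
print(f"empirical rate: {len(events) / 2000.0:.3f}, theory: {1.0 / (1 - 0.5):.3f}")
```

In the stationary regime the mean event rate is mu / (1 - eta), so the empirical rate gives a first sanity check on the simulated counts before any spectral estimation is attempted.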
- 11/18/2020, 16:00. Marc Lavielle
(INRIA & CMAP, Polytechnique)
Modelling the COVID 19 pandemic requires a model...
but also data! Talk,
Slides.
Abstract.
I will present in this talk some models for different Covid-19 data. I will first propose a
SIR-type model for the data provided by Johns Hopkins University for several countries:
these are the daily numbers of confirmed cases and deaths. The same model is used for all
countries but the parameters of the model change from one country to another to reflect
differences in dynamics. In particular, the model incorporates a time-dependent transmission
rate, whose variations are thought to be related to the public health measures taken by the
country in question.
I will then present a model for French hospital data provided by Santé Publique France: daily
numbers of hospitalizations, admissions to intensive care units, deaths and hospital
discharges.
The proposed models may seem relatively simple, but it must be understood that they
do not pretend to describe the spread of the pandemic in a precise and detailed way.
Their role is to fit the available data and provide reliable forecasts: their
complexity is therefore adjusted to the amount of information available in the data.
Indeed, very few parameters are needed to properly describe the outcome of interest
and the prediction proves to be stable over time.
Two interactive web applications are available to visualize the data and the fitted
models:
http://shiny.webpopix.org/covidix/app1/
for JHU data,
http://shiny.webpopix.org/covidix/app3/
for SPF data.
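To illustrate what an SIR-type model with a time-dependent transmission rate looks like, here is a minimal Python sketch. It integrates the textbook SIR equations with an Euler scheme and compares a constant transmission rate with one that drops after a hypothetical public health measure. This is a generic illustration, not the model of the talk: the equations, the parameter values, the date of the measure, and the function names are all assumptions.

```python
def simulate_sir(beta, gamma, s0, i0, days, dt=0.1):
    """Euler scheme for S' = -beta(t) S I / N, I' = beta(t) S I / N - gamma I, R' = gamma I.

    beta: function of time (in days) returning the transmission rate,
    gamma: recovery rate, s0 / i0: initial susceptible / infected counts.
    """
    n = s0 + i0
    s, i, r = float(s0), float(i0), 0.0
    path = [(0.0, s, i, r)]
    for k in range(int(round(days / dt))):
        t = k * dt
        new_infections = beta(t) * s * i / n * dt
        new_recoveries = gamma * i * dt
        s, i, r = s - new_infections, i + new_infections - new_recoveries, r + new_recoveries
        path.append((t + dt, s, i, r))
    return path


# Hypothetical scenarios: no measure versus a measure taken on day 30.
free = simulate_sir(lambda t: 0.3, gamma=0.1, s0=1_000_000, i0=10, days=200)
controlled = simulate_sir(lambda t: 0.3 if t < 30 else 0.1,
                          gamma=0.1, s0=1_000_000, i0=10, days=200)

peak_free = max(i for _, _, i, _ in free)
peak_controlled = max(i for _, _, i, _ in controlled)
print(f"peak infected without measure: {peak_free:.0f}")
print(f"peak infected with measure:    {peak_controlled:.0f}")
```

Letting beta be a function of time is what allows a single model structure to be reused across countries, with only the parameters and the timing of the changes differing from one country to another.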