Package 'DeCAFS' reference manual

Title:	Detecting Changes in Autocorrelated and Fluctuating Signals
Description:	Detect abrupt changes in time series with local fluctuations as a random walk process and autocorrelated noise as an AR(1) process. See Romano, G., Rigaill, G., Runge, V., Fearnhead, P. (2021) <doi:10.1080/01621459.2021.1909598>.
Authors:	Gaetano Romano [aut, cre], Guillem Rigaill [aut], Vincent Runge [aut], Paul Fearnhead [aut]
Maintainer:	Gaetano Romano <[email protected]>
License:	GPL (>= 2)
Version:	3.3.3
Built:	2025-03-26 05:02:59 UTC
Source:	https://github.com/gtromano/decafs

bestParameters

Description

iteration of the least square criterion for a grid of the phi parameter

Usage

bestParameters(y, nbK = 10, type = "MAD", sdEta = TRUE)
bestParameters(y, nbK = 10, type = "MAD", sdEta = TRUE)

Arguments

`y`	A time-series obtained by the dataRWAR function
`nbK`	number of diff k elements to consider
`type`	type of robust variance estimator (MAD, S or Q)
`sdEta`	if sdEta = FALSE there is no random walk

Value

a list with an estimation of the best parameters for Eta2, Nu2 and phi

Examples

bestParameters(dataRWAR(10000, sdEta = 0.2, sdNu = 0.1, phi = 0.3,
type = "rand1", nbSeg = 10)$y)
bestParameters(dataRWAR(10000, sdEta = 0.2, sdNu = 0.1, phi = 0.3,
type = "rand1", nbSeg = 10)$y)

Generate a Random Walk + AR realization

Description

Generate a Realization from the RWAR model (check the references for further details).

$y_t = \mu_t + \epsilon_t$

where

$\mu_t = \mu_{t-1} + \eta_t + \delta_t, \quad \eta_t \sim N(0, \sigma_\eta^2), \ \delta_t \ \in R$

and

$\epsilon_t = \phi \epsilon_{t-1} + \nu_t \quad \nu_t \sim N(0, \sigma_\nu^2)$

Usage

dataRWAR(
  n = 1000,
  sdEta = 0,
  sdNu = 1,
  phi = 0,
  type = c("none", "up", "updown", "rand1"),
  nbSeg = 20,
  jumpSize = 1
)
dataRWAR(
  n = 1000,
  sdEta = 0,
  sdNu = 1,
  phi = 0,
  type = c("none", "up", "updown", "rand1"),
  nbSeg = 20,
  jumpSize = 1
)

Arguments

`n`	The length of the sequence of observations.
`sdEta`	The standard deviation of the Random Walk Component on the signal drift
`sdNu`	The standard deviation of the Autocorrelated noise
`phi`	The autocorrelation parameter $\phi$
`type`	Possible change scenarios for the jump structure (default: `none`)
`nbSeg`	Number of segments
`jumpSize`	Maximum magnitude of a change

Value

A list containing:

y: the data sequence,
signal: the underlying signal without the superimposed AR(1) noise,
changepoints: the changepoint locations

References

Romano, G., Rigaill, G., Runge, V., Fearnhead, P. Detecting Abrupt Changes in the Presence of Local Fluctuations and Autocorrelated Noise. arXiv preprint https://arxiv.org/abs/2005.01379 (2020).

Examples

library(ggplot2)
set.seed(42)
Y = dataRWAR(n = 1e3, phi = .5, sdEta = 3, sdNu = 1, jumpSize = 15, type = "updown", nbSeg = 5)
y = Y$y
ggplot(data.frame(t = 1:length(y), y), aes(x = t, y = y)) +
  geom_point() +
  geom_vline(xintercept = Y$changepoints, col = 4,  lty = 3)
library(ggplot2)
set.seed(42)
Y = dataRWAR(n = 1e3, phi = .5, sdEta = 3, sdNu = 1, jumpSize = 15, type = "updown", nbSeg = 5)
y = Y$y
ggplot(data.frame(t = 1:length(y), y), aes(x = t, y = y)) +
  geom_point() +
  geom_vline(xintercept = Y$changepoints, col = 4,  lty = 3)

Generating data from a sinusoidal model with changes

Description

This function generates a sequence of observation from a sinusoidal model with changes. This can be used as an example for model misspecification.

Usage

dataSinusoidal(
  n,
  amplitude = 1,
  frequency = 0.001,
  phase = 0,
  sd = 1,
  type = c("none", "up", "updown", "rand1"),
  nbSeg = 20,
  jumpSize = 1
)
dataSinusoidal(
  n,
  amplitude = 1,
  frequency = 0.001,
  phase = 0,
  sd = 1,
  type = c("none", "up", "updown", "rand1"),
  nbSeg = 20,
  jumpSize = 1
)

Arguments

`n`	The length of the sequence of observations.
`amplitude`	The amplitude of the sinusoid
`frequency`	The angular frequency of the sinusoid
`phase`	where the signal starts at time t = 0
`sd`	standard deviation of the noise added on top of the signal
`type`	Possible change scenarios for the jump structure (default: `none`)
`nbSeg`	Number of segments
`jumpSize`	Maximum magnitude of a change

Value

A list containing:

y: the data sequence,
signal: the underlying signal without the noise,
changepoints: the changepoint locations

Examples

Y <- dataSinusoidal(
  1e4,
  frequency = 1 / 1e3,
  amplitude = 10,
  type = "updown",
  jumpSize = 4,
  nbSeg = 4
)
res <- DeCAFS(Y$y)
plot(res, col = "grey")
lines(Y$signal, col = "blue", lwd = 2, lty = 2)
abline(v = res$changepoints, col = 2)
abline(v = Y$changepoints, col = 4, lty = 2)
Y <- dataSinusoidal(
  1e4,
  frequency = 1 / 1e3,
  amplitude = 10,
  type = "updown",
  jumpSize = 4,
  nbSeg = 4
)
res <- DeCAFS(Y$y)
plot(res, col = "grey")
lines(Y$signal, col = "blue", lwd = 2, lty = 2)
abline(v = res$changepoints, col = 2)
abline(v = Y$changepoints, col = 4, lty = 2)

Main DeCAFS algorithm for detecting abrupt changes

Description

This function implements the DeCAFS algorithm to detect abrupt changes in mean of a univariate data stream in the presence of local fluctuations and auto-correlated noise. It detects the changes under a penalised likelihood model where the data, $y_1, ..., y_n$ , is

$y_t = \mu_t + \epsilon_t$

with $\epsilon_t$ an AR(1) process, and for $t = 2, ..., N$

$\mu_t = \mu_{t-1} + \eta_t + \delta_t$

where at time $t$ if we do not have a change then $\delta_t = 0$ and $\eta_t \sim N(0, \sigma_\eta^2)$ ; whereas if we have a change then $\delta_t \neq 0$ and $\eta_t=0$ . DeCAFS estimates the change by minimising a cost equal to twice the negative log-likelihood of this model, with a penalty $\beta$ for adding a change. Note that the default DeCAFS behavior will assume the RWAR model, but fit on edge cases is still possible. For instance, should the user wish for DeCAFS to fit an AR model only with a piecewise constant signal, or similarly a model that just assumes random fluctuations in the signal, this can be specified within the initial parameter estimation, by setting the argument: modelParam = estimateParameters(y, model = "AR"). Similarly, to allow for negative autocorrelation estimation, set modelParam = estimateParameters(Y$y, phiLower = -1).

Usage

DeCAFS(
  data,
  beta = 2 * log(length(data)),
  modelParam = estimateParameters(data, warningMessage = warningMessage),
  penalties = NULL,
  warningMessage = TRUE
)
DeCAFS(
  data,
  beta = 2 * log(length(data)),
  modelParam = estimateParameters(data, warningMessage = warningMessage),
  penalties = NULL,
  warningMessage = TRUE
)

Arguments

`data`	A vector of observations y
`beta`	The l0 penalty. The default one is `2 * log(N)` where `N` is the length of the data.
`modelParam`	A list of 3 initial model parameters: `sdEta`, the SD of the drift (random fluctuations) in the signal, `sdNu`, the SD of the AR(1) noise process, and `phi`, the autocorrelation parameter of the noise process (so the stationary variance of the AR(1) noise process is `sdnu^2` / (1 - `phi^2`). Defaulted to `estimateParameters(data, K = 15)`, to perform automatically estimation of the three. See `estimateParameters()` for more details.
`penalties`	Can be used as an alternative to the model parameters, a list of 3 initial penalties: `lambda`, the l2-penalty penalising over the lag-1 of the signal, `gamma`, penalising over the lag-1 of the AR(1) noise process, `phi`, the autocorrelation parameter. These are related to the `modelParam` list by `list(lambda = 1 / sdEta ^ 2, gamma = 1 / sdNu ^ 2, phi = phi)`. Only one argument between `penalties` and `modelParam` should be specified. Defaulted to NULL.
`warningMessage`	When `TRUE` prints a message to warn the user that the automatic parameter estimation is employed. Defaults to `TRUE`.

Value

Returns an s3 object of class DeCAFSout where:

$changepoints: is the vector of change-point locations,
$signal: is the estimated signal without the auto-correlated noise,
$costFunction: is the optimal cost in form of piecewise quadratics at the end of the sequence,
$estimatedParameters: is a list of parameters estimates (if estimated, otherwise simply the initial modelParam input),
$data: is the sequence of observations.

References

Romano, G., Rigaill, G., Runge, V., Fearnhead, P. (2021). Detecting Abrupt Changes in the Presence of Local Fluctuations and Autocorrelated Noise. Journal of the American Statistical Association. doi:10.1080/01621459.2021.1909598.

Examples

library(ggplot2)
set.seed(42)
Y <- dataRWAR(n = 1e3, phi = .5, sdEta = 1, sdNu = 3, jumpSize = 15, type = "updown", nbSeg = 5)
y <- Y$y
res = DeCAFS(y)
ggplot(data.frame(t = 1:length(y), y), aes(x = t, y = y)) +
  geom_point() +
  geom_vline(xintercept = res$changepoints, color = "red") +
  geom_vline(xintercept = Y$changepoints, col = "blue",  lty = 3)
library(ggplot2)
set.seed(42)
Y <- dataRWAR(n = 1e3, phi = .5, sdEta = 1, sdNu = 3, jumpSize = 15, type = "updown", nbSeg = 5)
y <- Y$y
res = DeCAFS(y)
ggplot(data.frame(t = 1:length(y), y), aes(x = t, y = y)) +
  geom_point() +
  geom_vline(xintercept = res$changepoints, color = "red") +
  geom_vline(xintercept = Y$changepoints, col = "blue",  lty = 3)

Estimate parameter in the Random Walk Autoregressive model

Description

This function perform robust estimation of parameters in the Random Walk plus Autoregressive model using a method of moments estimator. To model the time-dependency DeCAFS relies on three parameters. These are sdEta, the standard deviation of the drift (random fluctuations) in the signal, modeled as a Random Walk process, sdNu, the standard deviation of the AR(1) noise process, and phi, the autocorrelation parameter of the noise process. The final estimation of the change locations is affected by the l0 penalty beta and the estimation of the process by those three initial parameters. Therefore, the choice of penalties for DeCAFS is important: where possible investigate resulting segmentations. Should the algorithm return a misspecified estimation of the signal, it might be good to constrain the estimation of the parameters to an edge case. This can be done through the argument model. Alternatively, one could employ a range of penalties or tune these on training data. To manually specify different penalties, see DeCAFS() documentation. If unsure of which model is the most suited for a given sequence, see guidedModelSelection() for guided model selection.

Usage

estimateParameters(
  y,
  model = c("RWAR", "AR", "RW"),
  K = 15,
  phiLower = 0,
  phiUpper = 0.999,
  sdEtaUpper = Inf,
  sdNuUpper = Inf,
  warningMessage = FALSE
)
estimateParameters(
  y,
  model = c("RWAR", "AR", "RW"),
  K = 15,
  phiLower = 0,
  phiUpper = 0.999,
  sdEtaUpper = Inf,
  sdNuUpper = Inf,
  warningMessage = FALSE
)

Arguments

`y`	A vector of observations
`model`	Constrain estimation to an edge case of the RWAR model. Defaults to `"RWAR"`. To fit an AR model only with a piece-wise constant signal, specify `"AR"`. To fit a a random walk plus noise, specify `"RW"`.
`K`	The number of K-lags differences of the data to run the robust estimation over. Default set at 15.
`phiLower`	Smallest value of the autocorrelation parameter. Default set at 0.
`phiUpper`	Highest value of the autocorrelation parameter. Default set at 0.99.
`sdEtaUpper`	Highest value of the RW standard deviation. Default set at Inf
`sdNuUpper`	Highest value of the AR(1) noise standard deviation. Default set at Inf
`warningMessage`	A message to warn the user when the automatic parameter estimation is employed.

Value

Returns a list of estimates that can be employed as an argument for parameter modelParam to run DeCAFS(). Those are:

sdEta: the SD of the drift (random fluctuations) in the signal,
sdNu: the SD of the AR(1) noise process,
phi: the autocorrelation parameter of the noise process.

Examples

set.seed(42)
y <- dataRWAR(n = 1e3, phi = .5, sdEta = 1, sdNu = 3,  jumpSize = 15, type = "updown", nbSeg = 5)$y
estimateParameters(y)
set.seed(42)
y <- dataRWAR(n = 1e3, phi = .5, sdEta = 1, sdNu = 3,  jumpSize = 15, type = "updown", nbSeg = 5)$y
estimateParameters(y)

Variance estimation for diff k operators

Description

Estimation of the variances for the diff k operator k = 1 to nbK

Usage

estimVar(y, nbK = 10, type = "MAD")
estimVar(y, nbK = 10, type = "MAD")

Arguments

`y`	A time-series obtained by the dataRWAR function
`nbK`	number of diff k elements to consider
`type`	type of robust variance estimator (MAD, S or Q)

Value

the vector varEst of estimated variances

Examples

estimVar(dataRWAR(1000, sdEta = 0.1, sdNu = 0.1, phi = 0.3, type = "rand1",  nbSeg = 10)$y)
estimVar(dataRWAR(1000, sdEta = 0.1, sdNu = 0.1, phi = 0.3, type = "rand1",  nbSeg = 10)$y)

RW and AR(1) variance estimations with fixed AR(1) parameter

Description

Evaluation of the variances Eta2 and Nu2

Usage

evalEtaNu(v, phi, sdEta = TRUE)
evalEtaNu(v, phi, sdEta = TRUE)

Arguments

`v`	the estimated variances of the diff k operator
`phi`	the autocorrelative AR(1) parameter
`sdEta`	if sdEta = FALSE there is no random walk

Value

a list with an estimation of the variances Eta2 and Nu2

Guided Model Selection

Description

This function aids the user in selecting an appropriate model for a given sequence of observations. The function goes an interactive visualization of different model fits for different choices of initial parameter estimators and l0 penalties (beta). At the end, a call to the DeCAFS function is printed, while a DeCAFS wrapper is provvided.

Usage

guidedModelSelection(data)
guidedModelSelection(data)

Arguments

data

A vector of observations y

Value

A function, being a wrapper of DeCAFS with the selected parameter estimators.

Examples

## Not run: 
y <- dataRWAR(1000, sdEta = 1, sdNu = 4, phi = .4, nbSeg = 4, jumpSize = 20, type = "updown")$y
DeCAFSWrapper <- guidedModelSelection(y)

## End(Not run)
## Not run: 
y <- dataRWAR(1000, sdEta = 1, sdNu = 4, phi = .4, nbSeg = 4, jumpSize = 20, type = "updown")$y
DeCAFSWrapper <- guidedModelSelection(y)

## End(Not run)

Rock structure data from an oil well

Description

This data comes from lowering a probe into a bore-hole, and taking measurements of the rock structure as the probe is lowered. As the probe moves from one rock strata to another we expect to see an abrupt change in the signal from the measurements.

Usage

oilWell
oilWell

Format

A numeric vector of 4050 obervations

Source

Ruanaidh, Joseph JK O., and William J. Fitzgerald. Numerical Bayesian methods applied to signal processing. Springer Science & Business Media, 2012. doi:10.1007/978-1-4612-0717-7

Examples

 
 # removing outliers
 n = length(oilWell)
 h = 32
 med = rep(NA, n)
 for (i in 1:n) {
   index = max(1, i - h):min(n, i + h)
   med[i] = median(oilWell[index])
 }
 residual = (oilWell - med)
 
 y = oilWell[abs(residual) < 8000]
 sigma = sqrt(var(residual[abs(residual) < 8000]))
 
 # running DeCAFS
 res <- DeCAFS(y/sigma)
 plot(res, xlab = "time", ylab = "y", type = "l")
 abline(v = res$changepoints, col = 4, lty = 3)
# removing outliers
 n = length(oilWell)
 h = 32
 med = rep(NA, n)
 for (i in 1:n) {
   index = max(1, i - h):min(n, i + h)
   med[i] = median(oilWell[index])
 }
 residual = (oilWell - med)
 
 y = oilWell[abs(residual) < 8000]
 sigma = sqrt(var(residual[abs(residual) < 8000]))
 
 # running DeCAFS
 res <- DeCAFS(y/sigma)
 plot(res, xlab = "time", ylab = "y", type = "l")
 abline(v = res$changepoints, col = 4, lty = 3)

DeCAFS Plotting

Description

DeCAFS output plotting method.

Usage

## S3 method for class 'DeCAFSout'
plot(x, ...)
## S3 method for class 'DeCAFSout'
plot(x, ...)

Arguments

`x`	the output object from a DeCAFS call
`...`	Additional graphical parameters to be passed down to the plot function

Value

An R plot

Examples

set.seed(42)
Y <- dataRWAR(n = 1e3, phi = .5, sdEta = 1, sdNu = 3,  jumpSize = 15, type = "updown", nbSeg = 5)
res = DeCAFS(Y$y)
plot(res, type = "l")

set.seed(42)
Y <- dataRWAR(n = 1e3, phi = .5, sdEta = 1, sdNu = 3,  jumpSize = 15, type = "updown", nbSeg = 5)
res = DeCAFS(Y$y)
plot(res, type = "l")

Generate a piecewise constant signal of a given length

Description

Generate a piecewise constant signal of a given length

Usage

scenarioGenerator(
  n,
  type = c("none", "up", "updown", "rand1"),
  nbSeg = 20,
  jumpSize = 1
)
scenarioGenerator(
  n,
  type = c("none", "up", "updown", "rand1"),
  nbSeg = 20,
  jumpSize = 1
)

Arguments

`n`	The length of the sequence of observations.
`type`	Possible change scenarios for the jump structure
`nbSeg`	Number of segments
`jumpSize`	Maximum magnitude of a change

Value

a sequence of N values for the piecewise constant signal

Examples

scenarioGenerator(1e3, "rand1")

scenarioGenerator(1e3, "rand1")

`v`	the estimated variances of the diff k operator
`sdEta`	standard deviation in Random Walk
`sdNu`	standard deviation in AR(1)
`phi`	the autocorrelative AR(1) parameter

Package 'DeCAFS'

Help Index

bestParameters

Description

Usage

Arguments

Value

Examples

L2 error estimation

Description

Usage

Arguments

Value

Generate a Random Walk + AR realization

Description

Usage

Arguments

Value

References

Examples

Generating data from a sinusoidal model with changes

Description

Usage

Arguments

Value

Examples

Main DeCAFS algorithm for detecting abrupt changes

Description

Usage

Arguments

Value

References

Examples

Estimate parameter in the Random Walk Autoregressive model

Description

Usage

Arguments

Value

Examples

Variance estimation for diff k operators

Description

Usage

Arguments

Value

Examples

RW and AR(1) variance estimations with fixed AR(1) parameter

Description

Usage

Arguments

Value

Guided Model Selection

Description

Usage

Arguments

Value

Examples

Rock structure data from an oil well

Description

Usage

Format

Source

Examples

DeCAFS Plotting

Description

Usage

Arguments

Value

Examples

Generate a piecewise constant signal of a given length

Description

Usage

Arguments

Value

Examples