Package 'deltatest' reference manual

Title:	Statistical Hypothesis Testing Using the Delta Method
Description:	Statistical hypothesis testing using the Delta method as proposed by Deng et al. (2018) <doi:10.1145/3219819.3219919>. This method replaces the standard variance estimation formula in the Z-test with an approximate formula derived via the Delta method, which can account for within-user correlation.
Authors:	Koji Makiyama [aut, cre, cph], Shinichi Takayanagi [med]
Maintainer:	Koji Makiyama <[email protected]>
License:	MIT + file LICENSE
Version:	0.1.0.9000
Built:	2025-03-15 11:13:37 UTC
Source:	https://github.com/hoxo-m/deltatest

The Delta Method for Ratio

Description

Applies the Delta method to the ratio of two random variables, $f(X,Y)=X/Y$, to estimate the expected value, variance, standard error, and confidence interval.
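For orientation, the textbook Taylor-expansion approximations for a ratio are sketched below. The symbols $\mu_X$, $\mu_Y$, $\sigma_X^2$, $\sigma_Y^2$, and $\sigma_{XY}$ (means, variances, and covariance of $X$ and $Y$) are notation assumed here, not taken from the package. The exact expressions the package evaluates may differ, for example in how sample sizes enter.

```latex
\begin{aligned}
\mathrm{E}\!\left[\frac{X}{Y}\right] &\approx \frac{\mu_X}{\mu_Y} - \frac{\sigma_{XY}}{\mu_Y^2} + \frac{\mu_X \sigma_Y^2}{\mu_Y^3}, \\
\mathrm{Var}\!\left[\frac{X}{Y}\right] &\approx \frac{\sigma_X^2}{\mu_Y^2} - \frac{2 \mu_X \sigma_{XY}}{\mu_Y^3} + \frac{\mu_X^2 \sigma_Y^2}{\mu_Y^4}.
\end{aligned}
```

The first term of the expected value is the first-order estimate. The remaining two terms come from the second-order part of the expansion, which is presumably what the `bias_correction` argument refers to.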

Methods

Public methods

DeltaMethodForRatio$new()
DeltaMethodForRatio$get_expected_value()
DeltaMethodForRatio$get_variance()
DeltaMethodForRatio$get_squared_standard_error()
DeltaMethodForRatio$get_standard_error()
DeltaMethodForRatio$get_confidence_interval()
DeltaMethodForRatio$get_info()
DeltaMethodForRatio$compute_expected_value()
DeltaMethodForRatio$compute_variance()
DeltaMethodForRatio$compute_confidence_interval()
DeltaMethodForRatio$clone()

Method `new()`

Initialize a new DeltaMethodForRatio object.

Usage

DeltaMethodForRatio$new(numerator, denominator, bias_correction = FALSE)

Arguments

numerator, denominator: numeric vectors sampled from the distributions of the random variables in the numerator and denominator of the ratio.
bias_correction: logical value indicating whether correction to the mean of the metric is performed using the second-order term of the Taylor expansion. The default is FALSE.
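A minimal usage sketch follows, assuming the class is exported from the package and that the two vectors hold per-user aggregates. The simulated `click` and `pageview` vectors are illustrative assumptions. The package calls are guarded so the snippet still runs when deltatest is not installed.

```r
set.seed(314)
# Simulated per-user aggregates (assumed data, for illustration only)
pageview <- rpois(1000, 5) + 1
click <- rbinom(1000, pageview, 0.1)

if (requireNamespace("deltatest", quietly = TRUE)) {
  # Constructor arguments as documented above
  dm <- deltatest::DeltaMethodForRatio$new(click, pageview, bias_correction = FALSE)
  print(dm$get_expected_value())
  print(dm$get_standard_error())
  print(dm$get_confidence_interval(conf_level = 0.9))
}
```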

Method `get_expected_value()`

Get the expected value.

Usage

DeltaMethodForRatio$get_expected_value()

Returns

numeric estimate of the expected value of the ratio.

Method `get_variance()`

Get the variance.

Usage

DeltaMethodForRatio$get_variance()

Returns

numeric estimate of the variance of the ratio.

Method `get_squared_standard_error()`

Get the squared standard error.

Usage

DeltaMethodForRatio$get_squared_standard_error()

Returns

numeric estimate of the squared standard error of the ratio.

Method `get_standard_error()`

Get the standard error.

Usage

DeltaMethodForRatio$get_standard_error()

Returns

numeric estimate of the standard error of the ratio.

Method `get_confidence_interval()`

Get the confidence interval.

Usage

DeltaMethodForRatio$get_confidence_interval(
  alternative = c("two.sided", "less", "greater"),
  conf_level = 0.95
)

Arguments

alternative: character string specifying the alternative hypothesis, must be one of "two.sided" (default), "greater", or "less". You can specify just the initial letter.
conf_level: numeric value specifying the confidence level of the interval. The default is 0.95.

Returns

numeric estimates of the lower and upper bounds of the confidence interval of the ratio.

Method `get_info()`

Get statistical information.

Usage

DeltaMethodForRatio$get_info(
  alternative = c("two.sided", "less", "greater"),
  conf_level = 0.95
)

Arguments

alternative: character string specifying the alternative hypothesis, must be one of "two.sided" (default), "greater", or "less". You can specify just the initial letter.
conf_level: numeric value specifying the confidence level of the interval. The default is 0.95.

Returns

numeric estimates, including the expected value, variance, standard error, and confidence interval.

Method `compute_expected_value()`

Class method to compute the expected value of the ratio using the Delta method.

Usage

DeltaMethodForRatio$compute_expected_value(
  mean1,
  mean2,
  var2,
  cov = 0,
  bias_correction = FALSE
)

Arguments

mean1: numeric value of the mean of the numerator of the ratio.
mean2: numeric value of the mean of the denominator of the ratio.
var2: numeric value of the variance of the denominator of the ratio.
cov: numeric value of the covariance between the numerator and denominator of the ratio. The default is 0.
bias_correction: logical value indicating whether correction to the mean of the metric is performed using the second-order term of the Taylor expansion. The default is FALSE.

Returns

numeric estimate of the expected value of the ratio.
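The calculation can be sketched in base R as follows. `ratio_expected_value()` is a hypothetical helper written for illustration, not the package's implementation. It assumes `var2` and `cov` are the variance and covariance of the two quantities being divided (for a ratio of sample means, those would already be divided by the sample size) and applies the standard second-order Taylor terms of $x/y$.

```r
# Hypothetical helper, not part of deltatest.
# mean1, mean2: means of the numerator and denominator
# var2: variance of the denominator; cov: covariance of numerator and denominator
ratio_expected_value <- function(mean1, mean2, var2, cov = 0,
                                 bias_correction = FALSE) {
  first_order <- mean1 / mean2
  if (!bias_correction) {
    return(first_order)
  }
  # Second-order Taylor terms of f(x, y) = x / y around (mean1, mean2)
  first_order - cov / mean2^2 + mean1 * var2 / mean2^3
}

# Without and with the second-order correction
ratio_expected_value(2, 4, var2 = 0.5, cov = 0.1)
ratio_expected_value(2, 4, var2 = 0.5, cov = 0.1, bias_correction = TRUE)
```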

Method `compute_variance()`

Class method to compute the variance of the ratio using the Delta method.

Usage

DeltaMethodForRatio$compute_variance(mean1, mean2, var1, var2, cov = 0)

Arguments

mean1: numeric value of the mean of the numerator of the ratio.
mean2: numeric value of the mean of the denominator of the ratio.
var1: numeric value of the variance of the numerator of the ratio.
var2: numeric value of the variance of the denominator of the ratio.
cov: numeric value of the covariance between the numerator and denominator of the ratio. The default is 0.

Returns

numeric estimate of the variance of the ratio.
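A base R sketch of the first-order Delta-method variance is shown below. `ratio_variance()` is a hypothetical helper, not the package's code, and the small `x` and `y` vectors are made-up data. It assumes the textbook formula obtained from the gradient of $x/y$, namely $(1/\mu_Y, -\mu_X/\mu_Y^2)$.

```r
# Hypothetical helper, not part of deltatest.
# First-order Delta-method variance of x / y from means, variances, and covariance.
ratio_variance <- function(mean1, mean2, var1, var2, cov = 0) {
  var1 / mean2^2 - 2 * mean1 * cov / mean2^3 + mean1^2 * var2 / mean2^4
}

# Direct call with known moments
ratio_variance(2, 4, var1 = 1, var2 = 0.5, cov = 0.1)

# For a ratio of sample means, pass the moments of the means (divided by n);
# the result is then a squared standard error.
x <- c(1, 0, 2, 1, 3)  # e.g. clicks per user (made-up data)
y <- c(5, 3, 8, 4, 9)  # e.g. page views per user (made-up data)
n <- length(x)
se2 <- ratio_variance(mean(x), mean(y), var(x) / n, var(y) / n, cov(x, y) / n)
sqrt(se2)
```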

Method `compute_confidence_interval()`

Class method to compute the confidence interval of the ratio using the Delta method.

Usage

DeltaMethodForRatio$compute_confidence_interval(
  mean,
  standard_error,
  alternative = c("two.sided", "less", "greater"),
  conf_level = 0.95
)

Arguments

mean: numeric value of the estimated mean of the ratio.
standard_error: numeric value of the estimated standard error of the mean of the ratio.
alternative: character string specifying the alternative hypothesis, must be one of "two.sided" (default), "greater", or "less". You can specify just the initial letter.
conf_level: numeric value specifying the confidence level of the interval. The default is 0.95.

Returns

numeric estimates of the lower and upper bounds of the confidence interval of the ratio.
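One plausible way to build such an interval is the normal approximation sketched below. `ratio_confidence_interval()` and its `estimate` argument are hypothetical names, and the one-sided conventions (an infinite bound on one side, as in `stats::t.test()`) are an assumption rather than confirmed package behaviour.

```r
# Hypothetical helper, not part of deltatest.
ratio_confidence_interval <- function(estimate, standard_error,
                                      alternative = c("two.sided", "less", "greater"),
                                      conf_level = 0.95) {
  # match.arg() allows partial matching, so "t", "l", or "g" also work
  alternative <- match.arg(alternative)
  switch(alternative,
    two.sided = estimate + c(-1, 1) * qnorm(1 - (1 - conf_level) / 2) * standard_error,
    less      = c(-Inf, estimate + qnorm(conf_level) * standard_error),
    greater   = c(estimate - qnorm(conf_level) * standard_error, Inf)
  )
}

ratio_confidence_interval(0.5, 0.1)
ratio_confidence_interval(0.5, 0.1, alternative = "g", conf_level = 0.9)
```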

Method `clone()`

The objects of this class are cloneable with this method.

Usage

DeltaMethodForRatio$clone(deep = FALSE)

Arguments

deep: Whether to make a deep clone.

References

id:sz_dr (2018). Calculating the mean and variance of the ratio of random variables using the Delta method [in Japanese]. If you are human, think more now. https://www.szdrblog.info/entry/2018/11/18/154952

Two Sample Z-Test for Ratio Metrics Using the Delta Method

Description

Performs a two-sample Z-test to compare a ratio metric between two groups using the Delta method. The Delta method is used to estimate the variance while accounting for the correlation between the numerator and denominator of the ratio metric.
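The idea can be sketched in base R as follows, assuming independent groups, a two-sided alternative, and a test on the difference. `ratio_stats()` and the simulated click and page-view vectors are hypothetical and only illustrate the calculation. They are not the package's implementation.

```r
# Hypothetical helper: ratio-of-means estimate and Delta-method squared
# standard error for one group, from per-unit numerator x and denominator y.
ratio_stats <- function(x, y) {
  n <- length(x)
  mx <- mean(x)
  my <- mean(y)
  # Variances and covariance of the sample means
  vx <- var(x) / n
  vy <- var(y) / n
  cxy <- cov(x, y) / n
  list(
    estimate = mx / my,
    se2 = vx / my^2 - 2 * mx * cxy / my^3 + mx^2 * vy / my^4
  )
}

set.seed(1)
# Simulated per-user aggregates for control and treatment (assumed data)
pv_c <- rpois(500, 5) + 1
click_c <- rbinom(500, pv_c, 0.10)
pv_t <- rpois(500, 5) + 1
click_t <- rbinom(500, pv_t, 0.12)

ctrl <- ratio_stats(click_c, pv_c)
trt <- ratio_stats(click_t, pv_t)

# Z-statistic for the difference and its two-sided p-value
delta <- trt$estimate - ctrl$estimate
z <- delta / sqrt(ctrl$se2 + trt$se2)
p_value <- 2 * pnorm(-abs(z))
c(difference = delta, z = z, p.value = p_value)
```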

Usage

deltatest(
  data,
  formula,
  by,
  group_names = "auto",
  type = c("difference", "relative_change"),
  bias_correction = FALSE,
  alternative = c("two.sided", "less", "greater"),
  conf.level = 0.95,
  na.rm = FALSE,
  quiet = FALSE
)

Arguments

`data`	data.frame containing the numerator and denominator columns of the ratio metric, aggregated by randomization unit. It also includes a column indicating the assigned group (control or treatment). For example, if randomizing by user while the metric is click-through rate (CTR) per page-view, the numerator is the number of clicks per user, and the denominator is the number of page views per user.
`formula`	expression representing the ratio metric. It can be written in three styles: standard formula `x/y ~ group`, lambda formula `~ x/y`, or NSE expression `x/y`.
`by`	character string or symbol that indicates the group column. If the group column is specified in the `formula` argument, it is not required.
`group_names`	character vector of length 2 or `"auto"`. It specifies which of the two strings in the group column denotes the control group and which denotes the treatment group: the first element is the control group and the second is the treatment group. If `"auto"` is specified, the two strings found in the group column are sorted in lexicographical order and used in that order. The default is `"auto"`.
`type`	character string specifying the test type. If `"difference"` (default), the hypothesis test evaluates the difference in means of the ratio metric between two groups. If `"relative_change"`, it evaluates the relative change $(\mu_2 - \mu_1) / \mu_1$ instead. You can specify just the initial letter.
`bias_correction`	logical value indicating whether correction to the mean of the metric is performed using the second-order term of the Taylor expansion. The default is `FALSE`.
`alternative`	character string specifying the alternative hypothesis, must be one of `"two.sided"` (default), `"greater"`, or `"less"`. You can specify just the initial letter.
`conf.level`	numeric value specifying the confidence level of the interval. The default is 0.95.
`na.rm`	logical value. If `TRUE`, rows containing NA values in the data will be excluded from the analysis. The default is `FALSE`.
`quiet`	logical value indicating whether messages should be displayed during the execution of the function. The default is `FALSE`.

Value

A list with class "htest" containing the following components:

`statistic`	the value of the Z-statistic.
`p.value`	the p-value for the test.
`conf.int`	a confidence interval for the difference or relative change appropriate to the specified alternative hypothesis.
`estimate`	the estimated means of the two groups, and the difference or relative change.
`null.value`	the hypothesized value of the difference or relative change in means under the null hypothesis.
`stderr`	the standard error of the difference or relative change.
`alternative`	a character string describing the alternative hypothesis.
`method`	a character string describing the method used.
`data.name`	the name of the data.

References

Deng, A., Knoblich, U., & Lu, J. (2018). Applying the Delta Method in Metric Analytics: A Practical Guide with Novel Ideas. Proceedings of the 24th ACM SIGKDD International Conference on Knowledge Discovery & Data Mining. doi:10.1145/3219819.3219919

Examples

library(dplyr)
library(deltatest)

n_user <- 2000

set.seed(314)
df <- deltatest::generate_dummy_data(n_user) |>
  group_by(user_id, group) |>
  summarise(click = sum(metric), pageview = n(), .groups = "drop")

deltatest(df, click / pageview, by = group)
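The three formula styles described under Arguments could be exercised as in the sketch below. The small simulated data frame is an assumption made so the snippet does not depend on dplyr, and the package calls are guarded so it still runs when deltatest is not installed.

```r
set.seed(314)
# Small simulated per-user data frame (assumed data, for illustration only)
df <- data.frame(
  group = rep(c("control", "treatment"), each = 500),
  pageview = rpois(1000, 5) + 1
)
df$click <- rbinom(1000, df$pageview, 0.1)

if (requireNamespace("deltatest", quietly = TRUE)) {
  library(deltatest)
  # Standard formula: the group column is given on the right-hand side
  print(deltatest(df, click / pageview ~ group))
  # Lambda formula: the group column is passed via `by`
  print(deltatest(df, ~ click / pageview, by = group))
  # NSE expression, here testing the relative change instead of the difference
  print(deltatest(df, click / pageview, by = group, type = "relative_change"))
}
```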

Generate Dummy Data

Description

Generate random dummy data for simulation studies. For details, see Section 4.3 in Deng et al. (2017).

Usage

generate_dummy_data(
  n_user,
  model = c("Bernoulli", "normal"),
  xi = 0,
  sigma = 0,
  random_unit = c("user", "session", "pageview"),
  treatment_ratio = 0.5
)

Arguments

`n_user`	integer value specifying the number of users included in the generated data. Since multiple rows are generated for each user, the number of rows in the data exceeds the number of users.
`model`	character string specifying the model that generates the potential outcomes. It must be one of `"Bernoulli"` (default) or `"normal"`. You can specify just the initial letter.
`xi`	numeric value specifying the treatment effect variation (TEV) under the Bernoulli model, where $TEV = 2\xi$. This argument is ignored if the `model` argument is set to `"normal"`. The default is 0.
`sigma`	numeric value specifying the treatment effect variation (TEV) under the normal model, where $TEV = \sigma$. This argument is ignored if the `model` argument is set to `"Bernoulli"`. The default is 0.
`random_unit`	character string specifying the randomization unit. It must be one of `"user"` (default), `"session"`, or `"pageview"`. You can specify just the initial letter.
`treatment_ratio`	numeric value specifying the ratio assigned to treatment. The default value is 0.5.

Value

data.frame with the columns user_id, group, and metric, where each row represents a metric value for a page-view.

References

Deng, A., Lu, J., & Litz, J. (2017). Trustworthy Analysis of Online A/B Tests: Pitfalls, challenges and solutions. Proceedings of the Tenth ACM International Conference on Web Search and Data Mining. doi:10.1145/3018661.3018677

Examples

library(deltatest)

set.seed(314)
generate_dummy_data(n_user = 2000)

