Title: | Acceptance-Rejection Method for Generating Pseudo-Random Observations |
---|---|
Description: | Provides a function that implements the acceptance-rejection method in an optimized manner to generate pseudo-random observations for discrete or continuous random variables. Proposed by von Neumann J. (1951), <https://mcnp.lanl.gov/pdf_files/>, the function is optimized to work in parallel on Unix-based operating systems and performs well on Windows systems. The acceptance-rejection method implemented optimizes the probability of generating observations from the desired random variable, by simply providing the probability function or probability density function, in the discrete and continuous cases, respectively. Implementation is based on references CASELLA, George at al. (2004) <https://www.jstor.org/stable/4356322>, NEAL, Radford M. (2003) <https://www.jstor.org/stable/3448413> and Bishop, Christopher M. (2006, ISBN: 978-0387310732). |
Authors: | Pedro Rafael D. Marinho [aut, cre] , Vera Lucia Damasceno Tomazella [ctb] |
Maintainer: | Pedro Rafael D. Marinho <[email protected]> |
License: | GPL (>= 3) |
Version: | 0.1.2 |
Built: | 2024-11-21 06:07:01 UTC |
Source: | https://github.com/prdm0/acceptreject |
This function implements the acceptance-rejection method for generating random numbers from a given probability density function (pdf).
accept_reject( n = 1L, continuous = TRUE, f = NULL, args_f = NULL, f_base = NULL, random_base = NULL, args_f_base = NULL, xlim = NULL, c = NULL, parallel = FALSE, cores = NULL, warning = TRUE, ... )
accept_reject( n = 1L, continuous = TRUE, f = NULL, args_f = NULL, f_base = NULL, random_base = NULL, args_f_base = NULL, xlim = NULL, c = NULL, parallel = FALSE, cores = NULL, warning = TRUE, ... )
n |
The number of random numbers to generate. |
continuous |
A logical value indicating whether the pdf is continuous or
discrete. Default is |
f |
The probability density function ( |
args_f |
A list of arguments to be passed to the |
f_base |
Base probability density function (for continuous case).If
|
random_base |
Random number generation function for the base
distribution passed as an argument to |
args_f_base |
A list of arguments for the base distribution. This refers
to the list of arguments that will be passed to the function |
xlim |
A vector specifying the range of values for the random numbers in
the form |
c |
A constant value used in the acceptance-rejection method. If |
parallel |
A logical value indicating whether to use parallel processing
for generating random numbers. Default is |
cores |
The number of cores to be used in parallel processing. Default
is |
warning |
A logical value indicating whether to show warnings. Default
is |
... |
Additional arguments to be passed to the |
In situations where we cannot use the inversion method (situations where it
is not possible to obtain the quantile function) and we do not know a
transformation that involves a random variable from which we can generate
observations, we can use the acceptance and rejection method.
Suppose that and
are random variables with probability
density function (pdf) or probability function (pf)
and
,
respectively. In addition, suppose that there is a constant
such that
for all values of , with
. To use the acceptance and
rejection method to generate observations from the random variable
,
using the algorithm below, first find a random variable
with pdf or
pf
, that satisfies the above condition.
Algorithm of the Acceptance and Rejection Method:
1 - Generate an observation from a random variable
with
pdf/pf
;
2 - Generate an observation from a random variable
;
3 - If accept
; otherwise reject
as an observation of the random variable
and return to step 1.
Proof: Let's consider the discrete case, that is, and
are
random variables with pf's
and
, respectively. By step 3 of
the above algorithm, we have that
.
That is,
Hence, by the Total Probability Theorem, we have that:
Therefore, by the acceptance and rejection method we accept the occurrence of $Y$ as being an occurrence of with probability
. In addition, by Bayes' Theorem, we have that
The result above shows that accepting by the procedure of the algorithm is equivalent to accepting a value from
that has pf
.
The argument c = NULL
is the default. Thus, the function accept_reject()
estimates the value of c
using the optimization algorithm optimize()
using the Brent method. For more details, see optimize()
function.
If a value of c
is provided, the function accept_reject()
will use this value to generate the random observations. An inappropriate choice of c can lead to low efficiency of the acceptance and rejection method.
In Unix-based operating systems, the function accept_reject()
can be executed in parallel. To do this, simply set the argument parallel = TRUE
.
The function accept_reject()
utilizes the parallel::mclapply()
function to execute the acceptance and rejection method in parallel.
On Windows operating systems, the code will not be parallelized even if parallel = TRUE
is set.
For the continuous case, a base density function can be used, where the arguments
f_base
, random_base
and args_f_base
need to be passed. If at least one of
them is NULL
, the function will assume a uniform density function over the
interval xlim
.
For the discrete case, the arguments f_base
, random_base
and args_f_base
should be NULL
, and if they are passed, they will be disregarded, as for
the discrete case, the discrete uniform distribution will always be
considered as the base. Sampling from the discrete uniform distribution
has shown good performance for the discrete case.
The advantage of using parallelism in Unix-based systems is relative and should be tested for each case. Significant improvement is observed when considering parallelism for very large values of n. It is advisable to conduct benchmarking studies to evaluate the efficiency of parallelism in a practical situation.
A vector of random numbers generated using the acceptance-rejection
method. The return is an object of class accept_reject
, but it can be
treated as an atomic vector.
BISHOP, Christopher. 11.4: Slice sampling. Pattern Recognition and Machine Learning. Springer, 2006.
Brent, R. (1973) Algorithms for Minimization without Derivatives. Englewood Cliffs N.J.: Prentice-Hall.
CASELLA, George; ROBERT, Christian P.; WELLS, Martin T. Generalized accept-reject sampling schemes. Lecture Notes-Monograph Series, p. 342-347, 2004.
NEUMANN V (1951). “Various techniques used in connection with random digits.” Notes by GE Forsythe, pp. 36–38.
NEAL, Radford M. Slice sampling. The Annals of Statistics, v. 31, n. 3, p. 705-767, 2003.
inspect()
, plot.accept_reject()
, qqplot.accept_reject()
,
parallel::mclapply()
and optimize()
.
set.seed(0) # setting a seed for reproducibility x <- accept_reject( n = 2000L, f = dbinom, continuous = FALSE, args_f = list(size = 5, prob = 0.5), xlim = c(0, 5) ) plot(x) y <- accept_reject( n = 1000L, f = dnorm, continuous = TRUE, args_f = list(mean = 0, sd = 1), xlim = c(-4, 4) ) plot(y)
set.seed(0) # setting a seed for reproducibility x <- accept_reject( n = 2000L, f = dbinom, continuous = FALSE, args_f = list(size = 5, prob = 0.5), xlim = c(0, 5) ) plot(x) y <- accept_reject( n = 1000L, f = dnorm, continuous = TRUE, args_f = list(mean = 0, sd = 1), xlim = c(-4, 4) ) plot(y)
Inspect the probability density function used as the base with the theoretical density function from which observations are desired.
inspect( f, args_f, f_base, args_f_base, xlim, c = 1, alpha = 0.4, color_intersection = "#BB9FC9", color_f = "#F890C2", color_f_base = "#7BBDB3" )
inspect( f, args_f, f_base, args_f_base, xlim, c = 1, alpha = 0.4, color_intersection = "#BB9FC9", color_f = "#F890C2", color_f_base = "#7BBDB3" )
f |
Theoretical density function. |
args_f |
List of arguments for the theoretical density function. |
f_base |
Base density function. |
args_f_base |
List of arguments for the base density function. |
xlim |
The range of the x-axis. |
c |
A constant that covers the base density function, with |
alpha |
The transparency of the base density function. The default value is 0.4 |
color_intersection |
Color of the intersection between the base density function and theoretical density functions. |
color_f |
Color of the base density function. |
color_f_base |
Color of the theoretical density function. |
The function inspect()
returns an object of the gg
and ggplot
class that
compares the probability density of two functions and is not useful for the
discrete case, only for the continuous one. Finding the parameters of the
base distribution that best approximate the theoretical distribution and the
smallest value of c
that can cover the base distribution is a great strategy.
Something important to note is that the plot provides the value of the area of
intersection between the theoretical probability density function we want to
generate observations from and the probability density function used as the
base. It's desirable for this value to be as close to 1 as possible, ideally
When the intersection area between the probability density functions is 1,
it means that the base probability density function passed to the f_base
argument overlaps the theoretical density function passed to the f
argument.
This is crucial in the acceptance-rejection method.
However, even if you don't use the inspect()
function to find a suitable
distribution, by finding viable args_base (list of arguments passed to f_base)
and the value of c
so that the intersection area is 1, the accept_reject()
function already does this for you.
The inspect()
function is helpful for finding a suitable base distribution,
which increases the probability of acceptance, further reducing computational
cost. Therefore, inspecting is a good practice.
If you use the accept_reject()
function, even with parallelism enabled by
specifying parallel = TRUE
in accept_reject()
and find that the generation
time is high for your needs, consider inspecting the base distribution.
An object of the gg
and ggplot
class comparing the theoretical
density function with the base density function. The object shows the
compared density functions, the intersection area between them, and the
value of the area.
accept_reject()
, print.accept_reject()
and plot.accept_reject()
.
# Considering c = 1 (default) inspect( f = dweibull, f_base = dgamma, xlim = c(0,5), args_f = list(shape = 2, scale = 1), args_f_base = list(shape = 2.1, rate = 2), c = 1 ) # Considering c = 1.35. inspect( f = dweibull, f_base = dgamma, xlim = c(0,5), args_f = list(shape = 2, scale = 1), args_f_base = list(shape = 2.1, rate = 2), c = 1.35 ) # Plotting f equal to f_base. This would be the best-case scenario, which, # in practice, is unlikely. inspect( f = dgamma, f_base = dgamma, xlim = c(0,5), args_f = list(shape = 2.1, rate = 2), args_f_base = list(shape = 2.1, rate = 2), c = 1 )
# Considering c = 1 (default) inspect( f = dweibull, f_base = dgamma, xlim = c(0,5), args_f = list(shape = 2, scale = 1), args_f_base = list(shape = 2.1, rate = 2), c = 1 ) # Considering c = 1.35. inspect( f = dweibull, f_base = dgamma, xlim = c(0,5), args_f = list(shape = 2, scale = 1), args_f_base = list(shape = 2.1, rate = 2), c = 1.35 ) # Plotting f equal to f_base. This would be the best-case scenario, which, # in practice, is unlikely. inspect( f = dgamma, f_base = dgamma, xlim = c(0,5), args_f = list(shape = 2.1, rate = 2), args_f_base = list(shape = 2.1, rate = 2), c = 1 )
Inspects the probability function (discrete case) or probability density (continuous case) by comparing the theoretical case with the observed one.
## S3 method for class 'accept_reject' plot( x, color_observed_density = "#BB9FC9", color_true_density = "#F890C2", color_bar = "#BB9FC9", color_observable_point = "#7BBDB3", color_real_point = "#F890C2", alpha = 0.3, hist = TRUE, ... )
## S3 method for class 'accept_reject' plot( x, color_observed_density = "#BB9FC9", color_true_density = "#F890C2", color_bar = "#BB9FC9", color_observable_point = "#7BBDB3", color_real_point = "#F890C2", alpha = 0.3, hist = TRUE, ... )
x |
An object of class |
color_observed_density |
Observed density color (continuous case). |
color_true_density |
True histogram density color (continuous case) |
color_bar |
Bar chart fill color (discrete case) |
color_observable_point |
Color of generated points (discrete case) |
color_real_point |
Color of real probability points (discrete case) |
alpha |
Bar chart transparency (discrete case) and observed density (continuous case) |
hist |
If |
... |
Additional arguments. |
The function plot.accept_reject()
is responsible for plotting the
probability function (in the discrete case) or the probability density (in
the continuous case), comparing the theoretical case with the observed one.
It is useful, therefore, for inspecting the quality of the samples generated
by the acceptance-rejection method. The returned plot is an object of classes
gg
and ggplot
. Easily, you can further customize the plot.
The function plot.accept_reject()
, or simply plot()
, constructs the plot
for inspection and expects an object of class accept_reject
as an argument.
An object of class gg
and ggplot
from the package ggplot2.
The function plot.accept_reject()
expects an object of class
accept_reject
as an argument.
accept_reject()
and print.accept_reject()
.
x <- accept_reject( n = 1000L, f = dbinom, continuous = FALSE, args_f = list(size = 10, prob = 0.5), xlim = c(0, 10) ) plot(x) y <- accept_reject( n = 500L, f = dnorm, continuous = TRUE, args_f = list(mean = 0, sd = 1), xlim = c(-4, 4) ) plot(y)
x <- accept_reject( n = 1000L, f = dbinom, continuous = FALSE, args_f = list(size = 10, prob = 0.5), xlim = c(0, 10) ) plot(x) y <- accept_reject( n = 500L, f = dnorm, continuous = TRUE, args_f = list(mean = 0, sd = 1), xlim = c(-4, 4) ) plot(y)
Print method for accept_reject objects.
## S3 method for class 'accept_reject' print(x, n_min = 10L, ...)
## S3 method for class 'accept_reject' print(x, n_min = 10L, ...)
x |
An accept_reject object. |
n_min |
Minimum number of observations to print. |
... |
Additional arguments. |
The function print.accept_reject()
is responsible for printing an object of
class accept_reject
in a formatted manner, providing some information
about the accept_reject
object, including the number of observations, the
value of the constant that maximizes acceptance, and the acceptance
probability
. Additionally, it prints the first generated
observations. The function
print.accept_reject()
delivers formatted output
when executing an object of class accept_reject
in the console or when
executing the function print()
on an object of class accept_reject
,
returned by the function accept_reject()
.
An object of class character
, providing a formatted output with some
information about the accept_reject
object, including the number of
observations, the value of the constant that maximizes acceptance, and
the acceptance probability
. Additionally, it prints the first
generated observations. The function
print.accept_reject()
enables
formatting when executing an object of class 'accept_reject' in the console
or when executing the function print()
on an object of class
accept_reject
, returned by the function accept_reject()
.
accept_reject()
and plot.accept_reject()
.
set.seed(0) # setting a seed for reproducibility x = accept_reject( n = 2000L, f = dbinom, continuous = FALSE, args_f = list(size = 10, prob = 0.5), xlim = c(0, 10) ) print(x)
set.seed(0) # setting a seed for reproducibility x = accept_reject( n = 2000L, f = dbinom, continuous = FALSE, args_f = list(size = 10, prob = 0.5), xlim = c(0, 10) ) print(x)
QQ-Plot QQ-Plot between observed quantiles and theoretical quantiles.
qqplot(x, ...)
qqplot(x, ...)
x |
Object of the class |
... |
Additional arguments to be passed to methods. |
Generic method to plot the QQ-Plot between observed quantiles and theoretical
quantiles. The generic method will call the specific method
qqplot.accept_reject()
, which operates on objects of class accept_reject
returned by the function accept_reject()
.
An object of classes gg
and ggplot
with the QQ-Plot of
theoretical quantiles versus observed quantiles.
accept_reject()
, print.accept_reject()
, plot.accept_reject()
and inspect()
.
QQ-Plot Plot the QQ-Plot between observed quantiles and theoretical quantiles.
## S3 method for class 'accept_reject' qqplot( x, alpha = 0.5, color_points = "#F890C2", color_line = "#BB9FC9", size_points = 1, size_line = 1, ... )
## S3 method for class 'accept_reject' qqplot( x, alpha = 0.5, color_points = "#F890C2", color_line = "#BB9FC9", size_points = 1, size_line = 1, ... )
x |
Object of the class accept_reject returned by the function |
alpha |
Transparency of the points and reference line representing where the quantiles should be (theoretical quantiles). |
color_points |
Color of the points (default is |
color_line |
Color of the reference line (detault is |
size_points |
Size of the points (default is |
size_line |
Thickness of the reference line (default is |
... |
Additional arguments for the |
The function qqplot.accept_reject()
for samples larger than ten thousand,
the geom_scattermost()
function from the
scattermore library
is used to plot the points, as it is more efficient than geom_point()
from
the ggplot2 library.
An object of classes gg
and ggplot
with the QQ-Plot between the
observed quantiles generated by the return of the function accept_reject()
and the theoretical quantiles of the true distribution.
qqplot.accept_reject()
, accept_reject()
, plot.accept_reject()
, inspect()
and
qqplot()
.
set.seed(0) # setting a seed for reproducibility x <- accept_reject( n = 2000L, f = dbinom, continuous = FALSE, args_f = list(size = 5, prob = 0.5), xlim = c(0, 5) ) qqplot(x) y <- accept_reject( n = 1000L, f = dnorm, continuous = TRUE, args_f = list(mean = 0, sd = 1), xlim = c(-4, 4) ) qqplot(y)
set.seed(0) # setting a seed for reproducibility x <- accept_reject( n = 2000L, f = dbinom, continuous = FALSE, args_f = list(size = 5, prob = 0.5), xlim = c(0, 5) ) qqplot(x) y <- accept_reject( n = 1000L, f = dnorm, continuous = TRUE, args_f = list(mean = 0, sd = 1), xlim = c(-4, 4) ) qqplot(y)