| Title: | SAE using HB Twofold Subarea Model under Beta Distribution |
|---|---|
| Description: | Estimates area and subarea level proportions using the Small Area Estimation (SAE) Twofold Subarea Model with a hierarchical Bayesian (HB) approach under a Beta distribution. A number of simulated datasets generated for illustration purposes are also included. The 'rstan' package is employed to estimate parameters via the Hamiltonian Monte Carlo and No-U-Turn Sampler algorithms. The model-based estimators include the HB mean, the variation of the mean, and quantiles. For references, see Rao and Molina (2015) <doi:10.1002/9781118735855>, Torabi and Rao (2014) <doi:10.1016/j.jmva.2014.02.001>, Leyla Mohadjer et al. (2007) <http://www.asasrms.org/Proceedings/y2007/Files/JSM2007-000559.pdf>, Erciulescu et al. (2019) <doi:10.1111/rssa.12390>, and Yudasena (2024). |
| Authors: | Nasya Zahira Putri [aut, cre], Cucu Sumarni [aut], Rizal Fauziarochman Yudasena [aut] |
| Maintainer: | Nasya Zahira Putri <[email protected]> |
| License: | GPL (>= 3) |
| Version: | 0.2.0 |
| Built: | 2026-05-14 10:00:57 UTC |
| Source: | https://github.com/nasyazahira/saehb.tf.beta |
Small Area Estimation using Hierarchical Bayes Twofold Subarea Level Model under Beta Distribution
Stan Development Team (NA). RStan: the R interface to Stan. R package version 2.36.0.9000. https://mc-stan.org
The function betaTF estimates subarea and area means simultaneously under the Twofold Subarea Level Small Area Estimation model, using the hierarchical Bayesian method with a Beta distribution.
The response values must lie strictly between 0 and 1 (0 < y < 1), the support of the Beta distribution.
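Because a Beta-distributed response cannot equal 0 or 1, it can help to check the data before fitting. The sketch below is a minimal illustration only: check_unit_interval is a hypothetical helper, not a function of this package, and it assumes that missing values are acceptable because they mark non-sampled subareas.

```r
# Hypothetical helper (not part of the package): TRUE when every
# non-missing response value lies strictly inside (0, 1).
check_unit_interval <- function(y) {
  observed <- y[!is.na(y)]          # NA is assumed to mark non-sampled subareas
  all(observed > 0 & observed < 1)
}

ok_values  <- check_unit_interval(c(0.12, 0.58, NA, 0.91))
bad_values <- check_unit_interval(c(0, 0.40, 1))
print(c(ok = ok_values, bad = bad_values))
```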
betaTF(formula, area, weight, iter.update = 3, iter.mcmc = 1000, coef = NULL, var.coef = NULL, thin = 1, burn.in = floor(iter.mcmc/2), sigma2.u = 1, sigma2.v = 1, data)
| Argument | Description |
|---|---|
| formula | Formula that describes the fitted model. |
| area | Index giving the area code of each subarea. It must be defined so that subarea estimates can be aggregated into area estimates. The index runs from 1 to m. |
| weight | Vector containing the proportion of units (proportion of the population) in each subarea. |
| iter.update | Number of updates to perform (default = 3). |
| iter.mcmc | Total number of iterations per chain (default = 1000). |
| coef | Vector of prior initial values for the fixed-effect regression coefficients (default = NULL). |
| var.coef | Vector of prior initial values for the variances of the fixed-effect regression coefficients (default = NULL). |
| thin | Thinning rate; must be a positive integer (default = 1). |
| burn.in | Number of iterations to discard at the beginning (default = floor(iter.mcmc/2)). |
| sigma2.u | Prior initial value of the variance of the subarea random effect (default = 1). |
| sigma2.v | Prior initial value of the variance of the area random effect (default = 1). |
| data | The data frame. |
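To see how iter.mcmc, burn.in, and thin interact, the following sketch counts the draws kept per chain. It assumes the common convention that the first burn.in iterations are discarded and every thin-th iteration after that is retained; whether betaTF forwards these arguments to 'rstan' in exactly this way is an assumption, and retained_draws is a hypothetical helper.

```r
# Hypothetical helper: number of draws kept per chain, assuming iterations
# burn.in + 1, burn.in + 1 + thin, ... up to iter.mcmc are retained.
retained_draws <- function(iter.mcmc = 1000,
                           burn.in = floor(iter.mcmc / 2),
                           thin = 1) {
  stopifnot(thin >= 1, burn.in >= 0, burn.in < iter.mcmc)
  length(seq(burn.in + 1, iter.mcmc, by = thin))
}

draws_default <- retained_draws()                  # defaults from the usage above
draws_short   <- retained_draws(iter.mcmc = 500)   # burn.in becomes 250
draws_thinned <- retained_draws(thin = 5)
print(c(draws_default, draws_short, draws_thinned))
```

Under these assumptions, a larger thin reduces the number of retained draws proportionally, so iter.mcmc may need to be raised to keep the posterior summaries stable.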
This function returns a list with the following objects:
| Object | Description |
|---|---|
| Est_sub | A data frame containing the estimated values, standard deviations, and quantiles of the subarea means under the Twofold Subarea Level model with the hierarchical Bayesian method. |
| Est_area | A data frame containing the estimated values, standard deviations, and quantiles of the area means under the Twofold Subarea Level model with the hierarchical Bayesian method. |
| area_randeff | A data frame containing the area random effects. |
| sub_randeff | A data frame containing the subarea random effects. |
| refVar | A data frame containing the estimated variances of the subarea and area random effects. |
| coefficient | A data frame containing the estimated model coefficients. |
| plot | Trace, density, and autocorrelation function plots of the coefficients. |
fit <- betaTF(y ~ X1 + X2, area = "codearea", weight = "w", data = dataBeta, iter.mcmc = 500)
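The area and weight arguments exist so that subarea estimates can be aggregated to the area level. The sketch below shows one plausible form of that aggregation, a weighted mean of subarea estimates within each area. It uses made-up numbers rather than package output, the column name est is hypothetical, and the exact aggregation performed inside betaTF is not confirmed here.

```r
# Toy subarea estimates (made-up values, not output of betaTF).
sub <- data.frame(
  codearea = c(1, 1, 2, 2, 2),                  # area index of each subarea
  w        = c(0.6, 0.4, 0.5, 0.3, 0.2),        # unit proportion per subarea
  est      = c(0.20, 0.30, 0.50, 0.40, 0.10)    # hypothetical subarea estimates
)

# Assumed aggregation: weighted mean of subarea estimates within each area.
# Dividing by sum(w) keeps the result valid even if weights do not sum to 1.
area_est <- sapply(split(sub, sub$codearea), function(d) {
  sum(d$w * d$est) / sum(d$w)
})
print(area_est)
```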
A dataset for simulating Small Area Estimation using the hierarchical Bayesian method under the Twofold Subarea Level model with a Beta distribution on the variable of interest.
This data is generated by the following steps:
1. Generate the auxiliary variables, sampling error, subarea random effect, area random effect, and weight (proportion of units):
   - Generate the auxiliary variable X1 at the subarea level.
   - Generate the auxiliary variable X2 at the subarea level.
   - Set the regression coefficients.
   - Generate the area random effect.
   - Generate the subarea random effect.
   - Calculate the target parameter.
   - Generate the constant for the Beta parameters.
   - Calculate the two Beta parameters.
   - Generate the direct estimator y.
   - Generate the weight w for each subarea.
2. Combine the direct estimates (y), auxiliary variables (X1, X2), vardir, codearea, and weight (w) in a data frame called dataBeta.
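The generation steps above can be sketched in base R as follows. Every numeric choice here is an assumption made for illustration: the number of areas and subareas, the distributions and their parameters, the coefficient values, the logit link, and the use of the Beta variance as vardir are not taken from the package, so the result will not reproduce dataBeta.

```r
set.seed(123)

# Assumed layout: 30 areas with 3 subareas each (90 rows in total).
m <- 30
n_sub <- 3
n <- m * n_sub
codearea <- rep(seq_len(m), each = n_sub)

# Auxiliary variables at the subarea level (assumed uniform).
X1 <- runif(n, 0, 1)
X2 <- runif(n, 0, 1)

# Assumed regression coefficients: intercept, X1, X2.
beta <- c(0, 0.5, -0.5)

# Area and subarea random effects (assumed normal, sd = 0.3).
v <- rnorm(m, mean = 0, sd = 0.3)
u <- rnorm(n, mean = 0, sd = 0.3)

# Target parameter on the (0, 1) scale through an assumed logit link.
mu <- plogis(beta[1] + beta[2] * X1 + beta[3] * X2 + u + v[codearea])

# Constant for the Beta parameters (assumed gamma), then the two parameters.
phi <- rgamma(n, shape = 50, rate = 1)
A <- mu * phi
B <- (1 - mu) * phi

# Direct estimator and its variance (variance of a Beta(A, B) variable).
y <- rbeta(n, A, B)
vardir <- A * B / ((A + B)^2 * (A + B + 1))

# Weights: assumed uniform draws rescaled to sum to 1 within each area.
w_raw <- runif(n, 10, 20)
w <- w_raw / ave(w_raw, codearea, FUN = sum)

sim_data <- data.frame(y, X1, X2, codearea, w, vardir)
head(sim_data)
```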
dataBeta
A data frame with 90 rows and 6 columns:
| Column | Description |
|---|---|
| y | Direct estimate of the subarea mean |
| X1 | Auxiliary variable X1 |
| X2 | Auxiliary variable X2 |
| codearea | Index giving the area code of each subarea |
| w | Unit proportion of each subarea (weight) |
| vardir | Sampling variance of the direct estimator |
A dataset for simulating Small Area Estimation using the hierarchical Bayesian method under the Twofold Subarea Level model with a Beta distribution and non-sampled subareas.
This data contains NA values, which indicate that one or more subareas were not sampled. It is the dataBeta dataset with the direct estimates and their variances set to missing in 5 subareas.
dataBetaNS
A data frame with 90 rows and 6 columns:
| Column | Description |
|---|---|
| y | Direct estimate of the subarea mean |
| X1 | Auxiliary variable X1 |
| X2 | Auxiliary variable X2 |
| codearea | Index giving the area code of each subarea |
| w | Unit proportion of each subarea (weight) |
| vardir | Sampling variance of the direct estimator |
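A dataset with non-sampled subareas can be mimicked by blanking out the direct estimate and its variance in a few rows while keeping the auxiliary information. The sketch below uses a small made-up data frame; which 5 subareas are missing in dataBetaNS is not stated, so the random selection here is an assumption.

```r
set.seed(1)

# Toy data standing in for a fully sampled dataset (made-up values).
toy <- data.frame(
  y      = runif(10, 0.1, 0.9),
  X1     = runif(10),
  vardir = runif(10, 0.001, 0.01)
)

# Assumed mechanism: pick 5 rows at random and treat them as non-sampled.
ns_rows <- sample(nrow(toy), 5)
toy_ns <- toy
toy_ns[ns_rows, c("y", "vardir")] <- NA   # auxiliary variable X1 is kept

colSums(is.na(toy_ns))
```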
The function explore provides an initial exploration of a dataset. It calculates summary statistics for all variables in the provided formula or dataset, visualizes the distribution of the response variable as a histogram with a density curve, and draws a boxplot of the Coefficient of Variation (CV) / Relative Standard Error (RSE).
explore(formula, CV = NULL, data, normality = FALSE)
| Argument | Description |
|---|---|
| formula | Optional formula to specify a response variable (e.g., y ~ x1 + x2). |
| CV | Coefficient of Variation (CV) or Relative Standard Error (RSE) of the response variable (default = NULL). |
| data | The data frame to be explored. |
| normality | Logical (default = FALSE). |
Prints a data frame of summary statistics for the selected variables, including minimum, 1st quartile, median, mean, 3rd quartile, maximum, and number of missing values (NA). Plots are drawn to the current graphics device.
dataBeta$CV <- sqrt(dataBeta$vardir)/dataBeta$y
explore(y ~ X1 + X2, CV = "CV", data = dataBeta)
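The summary table printed by explore can be approximated with base R. This is a rough sketch on made-up data, not the package's implementation: summarise_var is a hypothetical helper, and the column labels and quantile method used by explore may differ.

```r
# Toy data (made-up values) with one missing response.
toy <- data.frame(
  y  = c(0.10, 0.25, NA, 0.40, 0.55),
  X1 = c(1, 2, 3, 4, 5)
)

# Hypothetical helper: minimum, quartiles, mean, maximum, and NA count.
summarise_var <- function(x) {
  q <- quantile(x, probs = c(0, 0.25, 0.5, 0.75, 1),
                na.rm = TRUE, names = FALSE)
  c(Min = q[1], Q1 = q[2], Median = q[3],
    Mean = mean(x, na.rm = TRUE),
    Q3 = q[4], Max = q[5], NAs = sum(is.na(x)))
}

# One row per variable, one column per statistic.
summary_table <- t(sapply(toy, summarise_var))
print(summary_table)
```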