Introduction
Latent variable models (LVMs) are statistical models for correlated data that are used to assess the relationships between manifest and latent variables (Bollen and Paxton, 1998; Lee, 2007).
The Bayesian method starts by defining a prior distribution for the parameters to be estimated. The prior represents the researcher's knowledge about these parameters before the dataset used to estimate the model is taken into account (Lee, 2007).
The rapid growth of LVMs reflects the demand for complex models, and for the accompanying statistical methods, in tackling difficult research problems in a range of fields. The Bayesian approach is implemented with the Gibbs sampler (Geman and Geman, 1984), in which the latent variables in multiple populations and the underlying continuous normal measurements are treated as hypothetical missing data. Conjugate priors are used for the structural parameters, whereas non-informative priors are used for the thresholds (cut points at equally and unequally spaced distances). The primary objective of this study is to present a Bayesian approach for the analysis of two-population nonlinear LVMs with dichotomous variables and covariates.
In recent years, many researchers have proposed models that incorporate nonlinear relationships between the manifest and latent variables. Examples include Lee and Song (2003), Lee and Song (2005), Lee (2006), Lee and Tang (2006), Cai et al. (2008), Lee et al. (2009), and Lee et al. (2010).
Song and Lee (2002) provide a specific methodology for using the Bayesian approach in factor analysis. They developed a procedure that produces joint Bayesian estimates of the component scores and the structural parameters subject to the specified constraints, so that several results are obtained simultaneously. The procedure has been shown to compute these estimates effectively because it combines the Gibbs sampler with the Metropolis-Hastings algorithm.
Song and Lee (2006) applied the maximum likelihood method to multi-sample nonlinear structural equation models with missing continuous and dichotomous data.
Song and Lee (2006) developed a Bayesian nonlinear structural equation model with linear fixed covariates and latent variables in the measurement model and nonlinear fixed covariates and latent variables in the structural model. Their study uses mixed continuous and dichotomous data and introduces an underlying continuous normal distribution (a truncated normal with unknown parameters) to handle the dichotomous data.
To address the same issue, Lee (2007) used an underlying latent continuous normal distribution (a truncated normal distribution with unknown parameters) in Bayesian multi-sample nonlinear structural equation models with dichotomous variables, and estimated the parameters with the Gibbs sampler. In the Bayesian study by Lu et al. (2012) of multi-sample nonlinear structural equation models with application to behavioural finance, the ordered categorical variables were treated as continuous normal variables. Multi-sample analysis is essential in many applications, such as cross-cultural research. Nonlinear effects, such as quadratic and interaction effects among the covariates and latent variables, are often essential for building the underlying theory.
The paper is organized as follows. Section 2 describes the model. Section 3 presents the Bayesian analysis. Section 4 covers model comparison using DIC. Section 5 presents a case study. Section 6 reports and discusses the results, and Section 7 offers conclusions and recommendations.
- Model Description
The proposed latent variable model includes both latent variables and linear covariates in the measurement equation. The structural model also includes nonlinear latent variables and nonlinear covariates. The following LVM is considered:
where is a manifest variable with dichotomous data that has been established as a random vector, is a vector of linear covariates, and is a vector of dichotomous covariates, since is a matrix with unknown parameters, it is frequently referred to as the factor loading matrix. A random vector of latent variables is shown in , while a random vector of residuals is shown in .
This leads to the assumption that the outcome of is independent, and that is then distributed independently as . Additionally, has an independent distribution as , where is a diagonal matrix with as its diagonal components.
Furthermore, it has been found that and in this instance are both independent. A latent vector, , is separated into , where vectors and are both present, in order to implement more complex mathematical situations.
The vectors of the exogenous latent variables are and the endogenous latent variables are , respectively.
The vector of dichotomous variables on is used to estimate the probable significant causal impact of . But if is non-normal, then must likewise be non-normal.
The following latent variable model gives the definition of the structural equation:
A matrix of unknown parameters is represented by , a vector-valued function with differentiable functions is represented by , and an unknown parameter matrix is represented by , and . For a simple expression, (2) can be written as:
where , , and are error measurement vectors.
It is necessary to first suppose that is distributed as , then that is distributed as , and that is a representation of a diagonal matrix with the entries and for which and are independent of each other.
An illustration of two populations LVM defined in (2) that are connected to
and is:
Here,
and
where g = 1, 2. Further, and are both quadratic terms of elements. As , may be drawn from the arbitrary distributions for covariates that are dichotomous data.
Furthermore, let and to stand for the kth row for each A and .
So, let to be a partition of which corresponds with , which is also a partition. If follows that and , it follows from (1) that
But when employed in a practical application scenario, is typically not complex, and as a result, it can be anticipated that will likewise be very straightforward, making it easy to calculate .
It is also beneficial to investigate this indirect method for modeling covariates, similar to those illustrated above, by first adding with , and then by managing each component of the latter as if it were an exogenous latent variable that could be measured accurately using a single indicator.
At the most basic level, a dichotomous variable can be defined in terms of its underlying latent continuous random variable by:
Here, the threshold values define the categories, and the number of categories for the dichotomous variable is represented by .
We use an underlying continuous normal distribution (a truncated normal distribution with known parameters) to handle the dichotomous data in the covariates . Thus, it follows:
Note that the number of thresholds (cut points) in each group is the same for every dichotomous variable. We consider thresholds at both equally and unequally spaced distances.
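The threshold idea can be illustrated with a minimal sketch. It assumes a single cut point and the convention that the observed category is 1 when the latent value exceeds the cut point; the function name `dichotomize` and the value 0.5 are hypothetical choices for illustration, not the paper's specification.

```python
import random
from statistics import NormalDist


def dichotomize(y, threshold=0.0):
    """Map a latent continuous value to a 0/1 category using one cut point."""
    return 1 if y > threshold else 0


rng = random.Random(0)
threshold = 0.5  # hypothetical cut point, chosen only for illustration

# Latent continuous measurements from a standard normal distribution.
latent = [rng.gauss(0.0, 1.0) for _ in range(20000)]
# Observed dichotomous data obtained by cutting the latent values.
observed = [dichotomize(y, threshold) for y in latent]

share_of_ones = sum(observed) / len(observed)
# Under the assumed convention, P(z = 1) = 1 - Phi(threshold).
expected_share = 1.0 - NormalDist().cdf(threshold)
print(round(share_of_ones, 3), round(expected_share, 3))
```

In this sketch the observed share of ones should be close to the normal tail probability above the cut point, which is the link between the dichotomous data and the underlying normal distribution that the data augmentation step relies on.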
- The Bayesian Analysis
Let serve as an unknown parameter vector in the previously mentioned model, and let serve as an unknown threshold vector for the dichotomous variables that belong to the gth group.
In an analysis of several populations, certain types of parameters in are typically constrained to be invariant across the group models. For instance, the following constraints apply to the cut points:
The thresholds on the model are typically implemented as and/or . Consequently, we may accept some common parameters while evaluating the data, . Allow to be a vector that contains all of the unknown separate parameters , and allow to be a vector that contains all of the unknown thresholds.
The Gibbs sampler is used to create the Bayesian estimate of and .
Let the dichotomous data that were observed be and . Let and Z, be the latent continuous measurements in and , respectively.
After that, add Y to the observed data in the posterior analysis. The problem will be easier to handle after Y has been defined since all the data is taken into consideration and is deemed continuous. Furthermore, assume that and represent the latent variable matrix.
Augmenting the data in this way greatly reduces the complications that arise from the nonlinear relationships among the latent variables, so the difficulties caused by the more intricate components of the model can be resolved.
Through posterior analysis, may be added to (Z), which stands for the collection of observed data.
We will also show how the joint posterior distribution looks. The Geman and Geman (1984) Gibbs sampler may be used to generate a set of observations from the associated joint posterior distribution.
As a consequence, using the created sample of data, a number of conventional inferences may be used to determine the Bayesian answer. Additionally, we may build a collection of sample observations from these conditional distributions , and using the Gibbs sampler and the iteration technique.
We may establish the non-informative prior to calculating in a method similar to how previous cut point issues were solved, so that the corresponding prior distribution is comparable to the constant.
According to the different group models, the conditional distribution can also be divided into several parts that each comprise a variety of structural parameters.
Some examples of competing models are:
No constraints
These components change with the hypotheses of interest or the competing models considered. Under the different definitions of Mk described above, the components of the conditional distribution, known as , and the specification of the prior distributions differ considerably.
The basic assumption is that the prior distributions of the unconstrained parameters in different groups are independent. To estimate the unconstrained parameters, the data belonging to each group must be identified and suitable values for the prior distribution must be supplied.
This section covers Bayesian estimation and model comparison for two-population nonlinear LVMs with dichotomous variables. The approach combines the idea of data augmentation with MCMC tools. In theory, two-population nonlinear LVMs are a special case of the two-level SEM, and results for that model can be used to derive the conditional distributions required by the Gibbs sampler.
Because constraints on the parameters across groups must be satisfied, more care is needed in specifying the corresponding prior distributions. Similar care is required when applying the path sampling method to model comparison in two-level SEMs (Lee and Song, 2012).
This section explains how to use the Bayesian approach to analyze the nonlinear LVMs described above in the setting of dichotomous variables. The approach has several advantages: (1) prior information can be incorporated directly into the analysis, which can yield more precise parameter estimates; (2) sampling-based Bayesian methods do not rely on asymptotic theory (Lee, 2006; Lee and Shi, 2000; Shi and Lee, 2000; Lee et al., 2010; Lee et al., 2007; Lee and Song, 2002; Song et al., 2011; Yang and Dunson, 2010); and (3) Bayesian and ML estimates share similar optimal asymptotic properties. In the posterior analysis, the observed data [Z] are augmented with the latent data [Y, ]. To construct the Bayesian method for the proposed LVMs, let represent the observed data set of dichotomous variables and let be the vector of unknown parameters.
In the Bayesian approach, is treated as a random vector with a prior distribution and a prior probability density function. Inference is then based on the observed data Z and . Let represent the joint probability density function of both under the model Mk.
Based on a well-known identity in probability, , where and are conditional probability density functions. It follows that:
The function is called the posterior density function of the unknown parameters. It combines the sample information, through the likelihood function, with the prior knowledge, through the prior density function.
Note that the likelihood function depends on the sample size, whereas the prior density function does not. In large samples the posterior density function is therefore dominated by the likelihood function, and the prior density function plays a less important role.
The prior density function matters most in the Bayesian approach when the sample size is small or when the data in Z are dichotomous.
MCMC techniques are applied by treating yi as the unobserved continuous variable underlying the manifest dichotomous variables in zi.
If a sufficiently large number of observations (represented by ) can be drawn from the joint posterior distribution , the Bayesian estimate of and its standard error estimates can be obtained from the sample mean and the sample covariance matrix of these observations, respectively.
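As a minimal sketch of how such summaries can be computed from simulated observations, the following takes a list of posterior draws for a single parameter and returns the sample mean (the Bayesian estimate), the sample standard deviation (the standard error estimate), and an empirical HPD interval. The function name `posterior_summary`, the 95% level, and the simulated stand-in draws are assumptions made for illustration.

```python
import math
import random


def posterior_summary(draws, mass=0.95):
    """Return (mean, sd, (lo, hi)); (lo, hi) is the shortest interval that
    contains roughly `mass` of the draws (an empirical HPD interval)."""
    n = len(draws)
    mean = sum(draws) / n
    sd = math.sqrt(sum((d - mean) ** 2 for d in draws) / (n - 1))
    ordered = sorted(draws)
    k = max(1, math.ceil(mass * n))  # number of draws inside the interval
    # Slide a window of k consecutive ordered draws and keep the narrowest.
    start = min(range(n - k + 1), key=lambda i: ordered[i + k - 1] - ordered[i])
    return mean, sd, (ordered[start], ordered[start + k - 1])


rng = random.Random(1)
# Stand-in for MCMC output: draws that would normally come from the sampler.
draws = [rng.gauss(1.0, 0.25) for _ in range(10000)]
mean, sd, (lo, hi) = posterior_summary(draws)
print(f"Est. {mean:.3f}  SE {sd:.3f}  HPD [{lo:.3f}, {hi:.3f}]")
```

The three quantities correspond to the Est., SE, and HPD Interval columns reported for each parameter in the results tables.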
To derive the conditional distribution in Step (1), the prior distribution of the corresponding components in must be specified explicitly. In Bayesian analysis, conjugate prior distributions have generally proved flexible and convenient (Broemeling, 1985).
Many Bayesian analyses of structural equation models have used this type of prior distribution (see Lee and Song, 2004; Song and Lee, 2007). Accordingly, the following commonly used conjugate prior distributions are employed:
Given the definition that is distributed according to, , which is the kth diagonal element of and are the kth rows of and , respectively. and and are assumed to be known, as prior information.
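To make the role of conjugate priors concrete, the sketch below works through the textbook normal and inverse-gamma update for a single loading `lam` and residual variance `psi` in the scalar regression y_i = lam * w_i + e_i, with the latent scores w_i treated as known. All names and hyper-parameter values (`lam0`, `h0`, `alpha0`, `beta0`) are hypothetical stand-ins, and the scalar model is a simplification rather than the paper's full specification.

```python
import math
import random


def conjugate_update(y, w, lam0, h0, alpha0, beta0):
    """Posterior for y_i = lam * w_i + e_i, e_i ~ N(0, psi), under the prior
    lam | psi ~ N(lam0, psi * h0) and 1/psi ~ Gamma(alpha0, rate=beta0)."""
    n = len(y)
    a_var = 1.0 / (1.0 / h0 + sum(wi * wi for wi in w))  # scaled variance
    a_mean = a_var * (lam0 / h0 + sum(wi * yi for wi, yi in zip(w, y)))
    alpha_n = alpha0 + n / 2.0
    beta_n = beta0 + 0.5 * (sum(yi * yi for yi in y)
                            - a_mean ** 2 / a_var + lam0 ** 2 / h0)
    return a_mean, a_var, alpha_n, beta_n


def draw_lam_psi(rng, a_mean, a_var, alpha_n, beta_n):
    """One joint draw: psi from its inverse-gamma, then lam given psi."""
    # random.gammavariate takes (shape, scale), so the rate is inverted.
    psi = 1.0 / rng.gammavariate(alpha_n, 1.0 / beta_n)
    lam = rng.gauss(a_mean, math.sqrt(psi * a_var))
    return lam, psi


rng = random.Random(2)
w = [rng.gauss(0.0, 1.0) for _ in range(500)]  # treated as known here
y = [0.8 * wi + rng.gauss(0.0, math.sqrt(0.5)) for wi in w]
post = conjugate_update(y, w, lam0=0.0, h0=0.25, alpha0=9.0, beta0=4.0)
draws = [draw_lam_psi(rng, *post) for _ in range(2000)]
lam_hat = sum(d[0] for d in draws) / len(draws)
psi_hat = sum(d[1] for d in draws) / len(draws)
print(round(lam_hat, 2), round(psi_hat, 2))
```

Because the prior is conjugate, both conditional distributions are standard and can be sampled directly, which is what makes this type of prior convenient inside a Gibbs sampler.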
The dichotomous variables and covariates in this situation, however, can make the linked conditional distributions too complicated to readily extract or simulate data from them.
This motivates further augmenting the observed data with the latent matrices Y, x in the posterior analysis and working with the joint posterior distribution . To draw observations from this posterior distribution with the Gibbs sampler, we begin with starting values . The following procedure is then used to simulate , and so on. More specifically, at the mth iteration with current values :
- Generate from
-
- Generate from The cycle will only produce after the mth repeat, according to the earlier definition.
It can be shown that the joint distribution of converges to the joint posterior distribution as m tends to infinity (see Geman and Geman, 1984).
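The alternation between imputing latent continuous measurements and updating parameters can be sketched on a deliberately small problem. The example below assumes a single dichotomous variable z_i = 1 when y_i > 0, with y_i ~ N(mu, 1) and a normal prior on mu; it is a toy stand-in for the data augmentation scheme, not the paper's full sampler, and the names `draw_truncated_normal`, `gibbs_dichotomous_mean`, and `true_mu` are hypothetical.

```python
import math
import random
from statistics import NormalDist

STD_NORMAL = NormalDist()


def draw_truncated_normal(rng, mean, positive):
    """Draw from N(mean, 1) restricted to (0, inf) if positive else (-inf, 0]."""
    p_below_zero = STD_NORMAL.cdf(-mean)  # P(y <= 0)
    lo, hi = (p_below_zero, 1.0) if positive else (0.0, p_below_zero)
    u = rng.uniform(lo, hi)
    u = min(max(u, 1e-12), 1.0 - 1e-12)  # keep inv_cdf well defined
    return mean + STD_NORMAL.inv_cdf(u)


def gibbs_dichotomous_mean(z, n_iter=1500, burn_in=300, prior_var=4.0, seed=0):
    """Gibbs sampler for mu when z_i = 1(y_i > 0), y_i ~ N(mu, 1),
    and mu ~ N(0, prior_var)."""
    rng = random.Random(seed)
    n = len(z)
    mu = 0.0  # starting value
    kept = []
    for it in range(n_iter):
        # Step 1: impute the latent continuous y_i given mu and observed z_i.
        y = [draw_truncated_normal(rng, mu, zi == 1) for zi in z]
        # Step 2: draw mu from its normal full conditional given y.
        post_var = 1.0 / (n + 1.0 / prior_var)
        mu = rng.gauss(post_var * sum(y), math.sqrt(post_var))
        if it >= burn_in:
            kept.append(mu)
    return kept


data_rng = random.Random(42)
true_mu = -0.5  # hypothetical value used to simulate the data
z = [1 if data_rng.gauss(true_mu, 1.0) > 0 else 0 for _ in range(200)]
draws = gibbs_dichotomous_mean(z)
print(round(sum(draws) / len(draws), 2))
```

Once the latent values are imputed, the parameter update is an ordinary normal draw, which mirrors the point that treating the latent continuous measurements as missing data makes the remaining steps tractable.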
- Model Comparison
As a measure for model comparison, the deviance information criterion (DIC; see Spiegelhalter et al., 2002) is a generalization of the Akaike Information Criterion (AIC; Akaike, 1973). For a competing model with a vector of unknown parameters , the DIC is computed as follows:
where measures the goodness of fit of the model, and is defined as
Here, is the effective number of parameters in , and is defined as
in which is the Bayesian estimate of . Let be a sample of observations simulated from the posterior distribution. The expectations in Equations (19) and (20) can be estimated as follows:
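For reference, the standard form of the DIC given by Spiegelhalter et al. (2002) can be written as below. The symbols are assumptions adopted here for readability: $\theta_k$ is the parameter vector of the competing model $M_k$, $Z$ is the observed data, $\tilde{\theta}_k$ is the Bayesian estimate, and $\theta_k^{(t)}$, $t = 1, \dots, T$, are the simulated observations.

```latex
\begin{align*}
\mathrm{DIC}_k &= \overline{D(\theta_k)} + d_k, \\
\overline{D(\theta_k)} &= E_{\theta_k}\{-2\log p(Z \mid \theta_k, M_k) \mid Z\}, \\
d_k &= \overline{D(\theta_k)} + 2\log p(Z \mid \tilde{\theta}_k, M_k), \\
\overline{D(\theta_k)} &\approx -\frac{2}{T}\sum_{t=1}^{T}\log p(Z \mid \theta_k^{(t)}, M_k).
\end{align*}
```

The first term measures goodness of fit, the second is the effective number of parameters, and the last line is the Monte Carlo estimate of the posterior expectation.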
In the comparison of Bayesian LVMs, the model with the lower DIC value is selected. To demonstrate the use of DIC for model comparison, we analyzed the same data with two-population NLVMs that share the same measurement model. The DIC values for the two-population NLVMs fitted to the real data were produced by OpenBUGS.
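A minimal sketch of the DIC computation from posterior draws is given below, assuming a toy normal model with known unit variance rather than the NLVMs analysed here. The helper names `normal_loglik` and `dic_from_draws` are hypothetical, and the calculation follows the usual definition: mean posterior deviance plus the effective number of parameters.

```python
import math
import random


def normal_loglik(data, mu):
    """Log-likelihood of the data under N(mu, 1)."""
    return sum(-0.5 * math.log(2 * math.pi) - 0.5 * (x - mu) ** 2 for x in data)


def dic_from_draws(data, draws, loglik):
    """Return (DIC, p_d): mean deviance plus effective number of parameters."""
    mean_deviance = sum(-2.0 * loglik(data, th) for th in draws) / len(draws)
    theta_hat = sum(draws) / len(draws)  # Bayesian estimate (posterior mean)
    p_d = mean_deviance - (-2.0 * loglik(data, theta_hat))
    return mean_deviance + p_d, p_d


rng = random.Random(3)
data = [rng.gauss(1.0, 1.0) for _ in range(100)]
n, xbar = len(data), sum(data) / len(data)

# Model A: mu is free; exact posterior draws under a flat prior are N(xbar, 1/n).
free_draws = [rng.gauss(xbar, 1 / math.sqrt(n)) for _ in range(2000)]
dic_free, p_d_free = dic_from_draws(data, free_draws, normal_loglik)

# Model B: mu is fixed at 0, so there is no free parameter.
dic_fixed, p_d_fixed = dic_from_draws(data, [0.0] * 10, normal_loglik)

print(round(p_d_free, 2), dic_free < dic_fixed)
```

In this toy comparison the free-mean model has an effective number of parameters close to one and the lower DIC, so it would be the model selected.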
- A Case Study
The data come from independent samples drawn from a natural history study of rural drug use conducted in Ohio (n=200) and Kentucky (n=200) in the USA between 2003 and 2005 (Booth et al., 2006).
The BSI-18 scale, which examines three categories of mental illness, namely somatization (SOM), depression (DEP), and anxiety (ANX), was used with several modifications.
The data comprise sixteen variables and two covariates in each group. All items were originally measured on a five-point ordered categorical scale (1, not at all; 2, a little bit; 3, moderately; 4, quite a bit; 5, extremely) and were recoded into two categories to obtain dichotomous data (Wang & Wang, 2012).
This real data analysis, in which 16 manifest variables are related to two underlying latent variables of the two-population nonlinear LVMs defined in Equations (17) and (18), gives some insight into the empirical performance of the proposed Bayesian approach.
Accordingly, a few quadratic and interaction effects of the latent variables are included. To demonstrate the Bayesian approach to analyzing nonlinear LVMs with dichotomous variables, we use a real data set of random vectors with G = 1, 2.
let be the latent continuous random vector, which corresponds to the dichotomous variables where are dichotomous variables that are related to three latent variables , , with the following values of the parameters in and
where parameters with an asterisk are treated as fixed for identifying the model.
The true values of elements in and are given by: ; . The relationships of the latent variables in are assessed by the nonlinear structural equation, which is described in the following equations.
Here,
and
and . The true values for . The true values for . The covariates come from arbitrary distributions that give dichotomous data.
In the conjugate prior distributions of the parameters, the following precise prior inputs of the hyper-parameter values are taken into account:
Prior I: The elements in , and in Equation (10) are set to the following values, with initial values equal to 1 for both groups of data;
and are taken to be 0.25 times the identity matrices; , , .
Prior II: The elements in , and in Equation (10) are set to the following values, with initial values equal to 0.5 for both groups of data;
and are taken to be 0.25 times the identity matrices; , , .
With a small sample size, an informative prior can have a substantial effect on the parameter estimates.
The data set (n1=200, n2=200) was analysed with OpenBUGS. The MCMC algorithm required more iterations to converge than in Bayesian analyses of LVMs using data. Bayesian estimates under the truncated normal and censored normal distributions in the two-population nonlinear LVMs were obtained from T=10000 iterations after discarding 1000 burn-in iterations. The OpenBUGS software (Spiegelhalter et al., 2007) can produce Bayesian estimates of the parameters in nonlinear LVMs. To demonstrate this, we apply OpenBUGS to the current data based on Equations (17) and (18) with different prior inputs.
- Results and Discussion
This section presents the findings of a simulation study for NLVMs in order to show how well the DIC and the Bayesian estimates perform empirically across competing models. We consider the following four proposed models for g = 1, 2:
This paper introduces a Bayesian approach for analysing two-population nonlinear LVMs with dichotomous variables and covariates. Both the Bayesian estimation of the unknown parameters and the model selection statistic (DIC) are obtained with recently developed, powerful tools and the freely available statistical program OpenBUGS. As a result, the proposed approach can readily be applied to real data. The analysis of dichotomous data in LVMs is subject to several limitations. First, because of the discrete nature of data in the behavioural, medical, and social sciences, data often arise from dichotomous variables and covariates. When dichotomous data are analysed, the fundamental assumption in LVMs that the data come from a continuous normal distribution is clearly violated, so an alternative way of handling dichotomous variables and covariates is needed. Routinely treating dichotomous variables as normal can lead to incorrect inferences (see Lee et al., 1990; Olsson, 1979). A better approach is to treat such data as observations from an underlying continuous normal distribution with a specific threshold specification.
Figure 2. Two chains of observation corresponding to (a) ; (b) (c) ; and (d) for two populations NLVMs with Dichotomous Variables using Truncated Normal Distribution
Figure 3. Two chains of observation corresponding to (a) ; (b) (c) ; and (d) for two populations NLVMs with Dichotomous Variables using Continuous Normal Distribution
TABLE 1. Bayesian Estimation of two populations NLVMs with Dichotomous Variables of First Group using Censored Normal Distribution

| Parameter | Est. | SE | HPD Interval | Parameter | Est. | SE | HPD Interval |
|---|---|---|---|---|---|---|---|
| μ1(1) | -0.985 | 0.222 | [-1.433, -0.572] | λ82(1) | 1.528 | 0.281 | [1.022, 2.121] |
| μ2(1) | -0.308 | 0.231 | [-0.762, 0.148] | λ92(1) | 0.970 | 0.223 | [0.590, 1.476] |
| μ3(1) | -0.334 | 0.201 | [-0.741, 0.045] | λ102(1) | 0.831 | 0.202 | [0.472, 1.269] |
| μ4(1) | -0.489 | 0.200 | [-0.894, -0.112] | λ112(1) | 1.121 | 0.234 | [0.714, 1.622] |
| μ5(1) | -0.123 | 0.215 | [-0.549, 0.297] | λ122(1) | 0.960 | 0.217 | [0.581, 1.437] |
| μ6(1) | 0.121 | 0.214 | [-0.284, 0.556] | λ143(1) | 0.666 | 0.181 | [0.378, 1.083] |
| μ7(1) | -0.273 | 0.196 | [-0.640, 0.119] | λ153(1) | 0.841 | 0.230 | [0.454, 1.364] |
| μ8(1) | -0.066 | 0.227 | [-0.514, 0.368] | λ163(1) | 0.909 | 0.248 | [0.512, 1.472] |
| μ9(1) | -1.150 | 0.237 | [-1.639, -0.723] | λ173(1) | 0.404 | 0.159 | [0.152, 0.775] |
| μ10(1) | -0.860 | 0.224 | [-1.334, -0.458] | λ183(1) | 0.824 | 0.201 | [0.502, 1.304] |
| μ11(1) | -0.491 | 0.219 | [-0.917, -0.066] | φ11(1) | 0.934 | 0.214 | [0.605, 1.458] |
| μ12(1) | -1.224 | 0.233 | [-1.727, -0.810] | φ12(1) | 0.770 | 0.176 | [0.487, 1.193] |
| μ13(1) | -0.367 | 0.278 | [-0.909, 0.168] | φ22(1) | 0.894 | 0.225 | [0.531, 1.437] |
| μ14(1) | -0.153 | 0.208 | [-0.573, 0.247] | γ1(1) | 0.771 | 0.277 | [0.264, 1.375] |
| μ15(1) | 0.033 | 0.228 | [-0.409, 0.483] | γ2(1) | 0.770 | 0.293 | [0.153, 1.333] |
| μ16(1) | -0.152 | 0.233 | [-0.629, 0.297] | γ3(1) | -0.032 | 0.232 | [-0.476, 0.410] |
| μ17(1) | -1.373 | 0.266 | [-1.934, -0.901] | γ4(1) | -0.203 | 0.213 | [-0.632, 0.217] |
| μ18(1) | -0.677 | 0.246 | [-1.157, -0.223] | β1(1) | -0.168 | 0.221 | [-0.605, 0.278] |
| λ21(1) | 1.633 | 0.297 | [1.076, 2.251] | β2(1) | -0.173 | 0.266 | [-0.679, 0.353] |
| λ31(1) | 0.935 | 0.216 | [0.566, 1.412] | β3(1) | -0.281 | 0.336 | [-0.932, 0.362] |
| λ41(1) | 0.771 | 0.190 | [0.443, 1.177] | β4(1) | 0.390 | 0.332 | [-0.239, 1.070] |
| λ51(1) | 1.154 | 0.252 | [0.713, 1.691] | β5(1) | 0.138 | 0.392 | [-0.671, 0.855] |
| λ61(1) | 1.238 | 0.251 | [0.806, 1.790] | ψεδ(1) | 0.512 | 0.128 | [0.318, 0.813] |

TABLE 2. Bayesian Estimation of two populations NLVMs with Dichotomous Variables of Second Group using Censored Normal Distribution

| Parameter | Est. | SE | HPD Interval | Parameter | Est. | SE | HPD Interval |
|---|---|---|---|---|---|---|---|
| μ1(2) | -1.157 | 0.250 | [-1.659, -0.674] | λ82(2) | 1.654 | 0.314 | [1.068, 2.275] |
| μ2(2) | -0.124 | 0.223 | [-0.583, 0.304] | λ92(2) | 1.225 | 0.278 | [0.768, 1.869] |
| μ3(2) | -0.214 | 0.240 | [-0.697, 0.261] | λ102(2) | 0.043 | 0.133 | [-0.212, 0.314] |
| μ4(2) | -0.647 | 0.240 | [-1.158, -0.219] | λ112(2) | -0.020 | 0.118 | [-0.252, 0.219] |
| μ5(2) | -0.430 | 0.223 | [-0.884, -0.012] | λ122(2) | -0.065 | 0.135 | [-0.326, 0.210] |
| μ6(2) | -0.182 | 0.221 | [-0.621, 0.240] | λ143(2) | 0.597 | 0.180 | [0.313, 1.002] |
| μ7(2) | -0.666 | 0.225 | [-1.121, -0.228] | λ153(2) | 0.640 | 0.178 | [0.349, 1.052] |
| μ8(2) | -0.138 | 0.248 | [-0.637, 0.350] | λ163(2) | 0.726 | 0.184 | [0.413, 1.149] |
| μ9(2) | -0.942 | 0.256 | [-1.462, -0.417] | λ173(2) | 0.277 | 0.117 | [0.086, 0.543] |
| μ10(2) | -0.793 | 0.207 | [-1.219, -0.420] | λ183(2) | 0.674 | 0.180 | [0.371, 1.043] |
| μ11(2) | -0.674 | 0.190 | [-1.073, -0.334] | φ11(2) | 1.081 | 0.253 | [0.640, 1.619] |
| μ12(2) | -0.927 | 0.207 | [-1.356, -0.542] | φ12(2) | 0.894 | 0.192 | [0.587, 1.306] |
| μ13(2) | -0.729 | 0.306 | [-1.341, -0.158] | φ22(2) | 1.047 | 0.250 | [0.639, 1.602] |
| μ14(2) | -0.475 | 0.240 | [-0.971, -0.031] | γ1(2) | 0.924 | 0.290 | [0.372, 1.477] |
| μ15(2) | -0.365 | 0.240 | [-0.853, 0.090] | γ2(2) | 0.823 | 0.322 | [0.251, 1.550] |
| μ16(2) | -0.317 | 0.247 | [-0.806, 0.157] | γ3(2) | -0.306 | 0.232 | [-0.807, 0.151] |
| μ17(2) | -1.252 | 0.268 | [-1.803, -0.738] | γ4(2) | 0.011 | 0.235 | [-0.441, 0.476] |
| μ18(2) | -0.936 | 0.271 | [-1.501, -0.440] | β1(2) | -0.063 | 0.232 | [-0.514, 0.399] |
| λ21(2) | 1.326 | 0.270 | [0.863, 1.878] | β2(2) | 0.260 | 0.329 | [-0.381, 0.970] |
| λ31(2) | 1.361 | 0.306 | [0.855, 2.040] | β3(2) | 0.284 | 0.242 | [-0.153, 0.796] |
| λ41(2) | 0.950 | 0.219 | [0.564, 1.401] | β4(2) | 0.006 | 0.340 | [-0.662, 0.695] |
| λ51(2) | 1.172 | 0.257 | [0.743, 1.755] | β5(2) | 0.226 | 0.337 | [-0.425, 0.891] |
| λ61(2) | 1.203 | 0.263 | [0.770, 1.782] | ψεδ(2) | 0.541 | 0.132 | [0.344, 0.854] |

TABLE 3. Bayesian Estimation of two populations NLVMs with Dichotomous Variables of First Group using Truncated Normal Distribution

| Parameter | Est. | SE | HPD Interval | Parameter | Est. | SE | HPD Interval |
|---|---|---|---|---|---|---|---|
| μ1(1) | -1.013 | 0.227 | [-1.462, -0.574] | λ82(1) | 1.597 | 0.303 | [1.062, 2.264] |
| μ2(1) | -0.342 | 0.251 | [-0.834, 0.146] | λ92(1) | 1.014 | 0.231 | [0.611, 1.509] |
| μ3(1) | -0.348 | 0.203 | [-0.768, 0.032] | λ102(1) | 0.821 | 0.207 | [0.476, 1.298] |
| μ4(1) | -0.521 | 0.206 | [-0.950, -0.134] | λ112(1) | 1.120 | 0.240 | [0.694, 1.633] |
| μ5(1) | -0.127 | 0.213 | [-0.544, 0.300] | λ122(1) | 0.995 | 0.225 | [0.613, 1.490] |
| μ6(1) | 0.110 | 0.216 | [-0.301, 0.539] | λ143(1) | 0.684 | 0.178 | [0.401, 1.099] |
| μ7(1) | -0.279 | 0.203 | [-0.683, 0.123] | λ153(1) | 0.856 | 0.207 | [0.519, 1.307] |
| μ8(1) | -0.076 | 0.235 | [-0.533, 0.382] | λ163(1) | 0.935 | 0.218 | [0.563, 1.421] |
| μ9(1) | -1.192 | 0.242 | [-1.677, -0.718] | λ173(1) | 0.415 | 0.152 | [0.163, 0.759] |
| μ10(1) | -0.872 | 0.224 | [-1.333, -0.461] | λ183(1) | 0.881 | 0.233 | [0.526, 1.414] |
| μ11(1) | -0.499 | 0.218 | [-0.926, -0.072] | φ11(1) | 0.953 | 0.224 | [0.583, 1.476] |
| μ12(1) | -1.288 | 0.249 | [-1.801, -0.817] | φ12(1) | 0.806 | 0.199 | [0.492, 1.275] |
| μ13(1) | -0.379 | 0.282 | [-0.935, 0.173] | φ22(1) | 0.945 | 0.247 | [0.570, 1.520] |
| μ14(1) | -0.158 | 0.216 | [-0.595, 0.258] | γ1(1) | 0.710 | 0.254 | [0.197, 1.211] |
| μ15(1) | -1.013 | 0.227 | [-1.462, -0.574] | γ2(1) | 0.790 | 0.289 | [0.217, 1.398] |
| μ16(1) | -0.342 | 0.251 | [-0.834, 0.146] | γ3(1) | -0.042 | 0.236 | [-0.481, 0.439] |
| μ17(1) | -0.348 | 0.203 | [-0.768, 0.032] | γ4(1) | -0.180 | 0.254 | [-0.679, 0.342] |
| μ18(1) | -0.521 | 0.206 | [-0.950, -0.134] | β1(1) | -0.161 | 0.206 | [-0.577, 0.241] |
| λ21(1) | 1.714 | 0.309 | [1.189, 2.391] | β2(1) | -0.176 | 0.231 | [-0.620, 0.298] |
| λ31(1) | 0.943 | 0.227 | [0.561, 1.463] | β3(1) | -0.169 | 0.309 | [-0.748, 0.435] |
| λ41(1) | 0.795 | 0.193 | [0.459, 1.210] | β4(1) | 0.337 | 0.350 | [-0.360, 0.995] |
| λ51(1) | 1.180 | 0.230 | [0.773, 1.665] | β5(1) | 0.057 | 0.403 | [-0.720, 0.888] |
| λ61(1) | 1.264 | 0.258 | [0.832, 1.854] | ψεδ(1) | 0.505 | 0.110 | [0.330, 0.761] |

TABLE 4. Bayesian Estimation of two populations NLVMs with Dichotomous Variables of Second Group using Truncated Normal Distribution

| Parameter | Est. | SE | HPD Interval | Parameter | Est. | SE | HPD Interval |
|---|---|---|---|---|---|---|---|
| μ1(2) | -1.166 | 0.251 | [-1.675, -0.679] | λ82(2) | 1.671 | 0.314 | [1.141, 2.353] |
| μ2(2) | -0.155 | 0.238 | [-0.641, 0.306] | λ92(2) | 1.264 | 0.270 | [0.787, 1.862] |
| μ3(2) | -0.236 | 0.237 | [-0.713, 0.238] | λ102(2) | 0.039 | 0.133 | [-0.217, 0.311] |
| μ4(2) | -0.679 | 0.226 | [-1.146, -0.258] | λ112(2) | -0.022 | 0.118 | [-0.247, 0.217] |
| μ5(2) | -0.442 | 0.227 | [-0.899, 0.004] | λ122(2) | -0.067 | 0.133 | [-0.330, 0.204] |
| μ6(2) | -0.185 | 0.228 | [-0.620, 0.275] | λ143(2) | 0.583 | 0.148 | [0.329, 0.906] |
| μ7(2) | -0.662 | 0.215 | [-1.097, -0.250] | λ153(2) | 0.629 | 0.172 | [0.364, 1.044] |
| μ8(2) | -0.115 | 0.260 | [-0.619, 0.409] | λ163(2) | 0.740 | 0.186 | [0.422, 1.154] |
| μ9(2) | -0.953 | 0.251 | [-1.465, -0.463] | λ173(2) | 0.297 | 0.125 | [0.087, 0.589] |
| μ10(2) | -0.805 | 0.207 | [-1.236, -0.425] | λ183(2) | 0.670 | 0.171 | [0.381, 1.068] |
| μ11(2) | -0.687 | 0.198 | [-1.103, -0.327] | φ11(2) | 1.121 | 0.276 | [0.660, 1.731] |
| μ12(2) | -0.970 | 0.209 | [-1.419, -0.591] | φ12(2) | 0.921 | 0.206 | [0.569, 1.385] |
| μ13(2) | -0.712 | 0.316 | [-1.357, -0.115] | φ22(2) | 1.063 | 0.287 | [0.630, 1.725] |
| μ14(2) | -0.466 | 0.242 | [-0.945, -0.003] | γ1(2) | 0.939 | 0.272 | [0.393, 1.477] |
| μ15(2) | -0.368 | 0.237 | [-0.837, 0.102] | γ2(2) | 0.870 | 0.320 | [0.293, 1.572] |
| μ16(2) | -0.314 | 0.263 | [-0.838, 0.221] | γ3(2) | -0.293 | 0.218 | [-0.747, 0.112] |
| μ17(2) | -1.299 | 0.268 | [-1.852, -0.805] | γ4(2) | -0.057 | 0.256 | [-0.589, 0.426] |
| μ18(2) | -0.927 | 0.257 | [-1.460, -0.461] | β1(2) | -0.085 | 0.240 | [-0.558, 0.405] |
| λ21(2) | 1.395 | 0.270 | [0.924, 1.966] | β2(2) | 0.264 | 0.324 | [-0.334, 0.931] |
| λ31(2) | 1.377 | 0.260 | [0.935, 1.970] | β3(2) | 0.294 | 0.237 | [-0.144, 0.784] |
| λ41(2) | 0.977 | 0.217 | [0.613, 1.469] | β4(2) | 0.107 | 0.331 | [-0.487, 0.762] |
| λ51(2) | 1.195 | 0.247 | [0.782, 1.730] | β5(2) | 0.142 | 0.316 | [-0.459, 0.715] |
| λ61(2) | 1.219 | 0.245 | [0.809, 1.791] | ψεδ(2) | 0.568 | 0.133 | [0.362, 0.878] |

TABLE 5. Bayesian Estimation of two populations NLVMs with Dichotomous Variables of First Group using Continuous Normal Distribution

| Parameter | Est. | SE | HPD Interval | Parameter | Est. | SE | HPD Interval |
|---|---|---|---|---|---|---|---|
| μ1(1) | 0.098 | 0.079 | [-0.056, 0.251] | λ82(1) | 0.830 | 0.110 | [0.619, 1.048] |
| μ2(1) | 0.302 | 0.079 | [0.146, 0.459] | λ92(1) | 0.517 | 0.098 | [0.330, 0.711] |
| μ3(1) | 0.301 | 0.078 | [0.149, 0.459] | λ102(1) | 0.462 | 0.101 | [0.265, 0.669] |
| μ4(1) | 0.256 | 0.070 | [0.119, 0.392] | λ112(1) | 0.658 | 0.105 | [0.459, 0.869] |
| μ5(1) | 0.369 | 0.076 | [0.220, 0.515] | λ122(1) | 0.507 | 0.097 | [0.323, 0.699] |
| μ6(1) | 0.438 | 0.079 | [0.283, 0.594] | λ143(1) | 0.602 | 0.108 | [0.394, 0.818] |
| μ7(1) | 0.295 | 0.079 | [0.138, 0.454] | λ153(1) | 0.687 | 0.108 | [0.484, 0.904] |
| μ8(1) | 0.374 | 0.077 | [0.221, 0.521] | λ163(1) | 0.662 | 0.105 | [0.460, 0.872] |
| μ9(1) | 0.071 | 0.063 | [-0.054, 0.196] | λ173(1) | 0.104 | 0.063 | [-0.021, 0.229] |
| μ10(1) | 0.142 | 0.064 | [0.019, 0.268] | λ183(1) | 0.487 | 0.093 | [0.306, 0.672] |
| μ11(1) | 0.253 | 0.073 | [0.111, 0.396] | φ11(1) | 0.134 | 0.018 | [0.104, 0.173] |
| μ12(1) | 0.035 | 0.063 | [-0.088, 0.159] | φ12(1) | 0.078 | 0.014 | [0.053, 0.109] |
| μ13(1) | 0.257 | 0.098 | [0.058, 0.445] | φ22(1) | 0.152 | 0.021 | [0.116, 0.199] |
| μ14(1) | 0.335 | 0.081 | [0.172, 0.488] | γ1(1) | 0.326 | 0.135 | [0.067, 0.589] |
| μ15(1) | 0.373 | 0.084 | [0.206, 0.537] | γ2(1) | 0.211 | 0.151 | [-0.084, 0.505] |
| μ16(1) | 0.329 | 0.080 | [0.167, 0.481] | γ3(1) | 0.208 | 0.289 | [-0.358, 0.795] |
| μ17(1) | 0.015 | 0.048 | [-0.077, 0.109] | γ4(1) | 0.015 | 0.245 | [-0.457, 0.501] |
| μ18(1) | 0.188 | 0.071 | [0.049, 0.328] | β1(1) | 0.015 | 0.075 | [-0.132, 0.161] |
| λ21(1) | 0.993 | 0.118 | [0.766, 1.233] | β2(1) | -0.040 | 0.170 | [-0.370, 0.299] |
| λ31(1) | 0.771 | 0.125 | [0.532, 1.020] | β3(1) | -0.106 | 0.365 | [-0.825, 0.625] |
| λ41(1) | 0.589 | 0.112 | [0.375, 0.812] | β4(1) | 0.081 | 0.352 | [-0.602, 0.797] |
| λ51(1) | 0.854 | 0.120 | [0.625, 1.091] | β5(1) | -0.010 | 0.404 | [-0.807, 0.773] |
| λ61(1) | 0.859 | 0.127 | [0.616, 1.115] | ψεδ(1) | 0.173 | 0.022 | [0.134, 0.221] |

TABLE 6. Bayesian Estimation of two populations NLVMs with Dichotomous Variables of Second Group using Continuous Normal Distribution

| Para | Est. | SE | HPD Interval | Para | Est. | SE | HPD Interval |
|---|---|---|---|---|---|---|---|
| μ1(2) | 0.085 | 0.073 | [-0.057,0.229] | λ82(2) | 0.823 | 0.115 | [0.602,1.053] |
| μ2(2) | 0.332 | 0.076 | [0.184,0.482] | λ92(2) | 0.579 | 0.091 | [0.405,0.765] |
| μ3(2) | 0.309 | 0.076 | [0.161,0.460] | λ102(2) | 0.390 | 0.088 | [0.217,0.567] |
| μ4(2) | 0.207 | 0.065 | [0.079,0.334] | λ112(2) | 0.659 | 0.099 | [0.471,0.859] |
| μ5(2) | 0.252 | 0.077 | [0.104,0.405] | λ122(2) | 0.514 | 0.084 | [0.352,0.680] |
| μ6(2) | 0.324 | 0.076 | [0.179,0.476] | λ143(2) | 0.483 | 0.103 | [0.288,0.689] |
| μ7(2) | 0.193 | 0.080 | [0.038,0.350] | λ153(2) | 0.588 | 0.102 | [0.390,0.793] |
| μ8(2) | 0.333 | 0.077 | [0.182,0.481] | λ163(2) | 0.640 | 0.097 | [0.448,0.836] |
| μ9(2) | 0.153 | 0.063 | [0.027,0.279] | λ173(2) | 0.113 | 0.060 | [-0.003,0.233] |
| μ10(2) | 0.165 | 0.058 | [0.052,0.280] | λ183(2) | 0.347 | 0.085 | [0.185,0.518] |
| μ11(2) | 0.204 | 0.066 | [0.069,0.335] | ɸ11(2) | 0.122 | 0.016 | [0.094,0.157] |
| μ12(2) | 0.113 | 0.058 | [0.003,0.226] | ɸ12(2) | 0.072 | 0.013 | [0.048,0.100] |
| μ13(2) | 0.141 | 0.100 | [-0.054,0.339] | ɸ22(2) | 0.150 | 0.020 | [0.115,0.194] |
| μ14(2) | 0.202 | 0.078 | [0.046,0.351] | γ1(2) | 0.588 | 0.497 | [-0.379,1.569] |
| μ15(2) | 0.232 | 0.080 | [0.078,0.393] | γ2(2) | 0.599 | 0.498 | [-0.372,1.571] |
| μ16(2) | 0.247 | 0.082 | [0.081,0.399] | γ3(2) | 0.597 | 0.501 | [-0.393,1.560] |
| μ17(2) | 0.035 | 0.046 | [-0.055,0.126] | γ4(2) | 0.601 | 0.503 | [-0.397,1.576] |
| μ18(2) | 0.124 | 0.066 | [-0.007,0.254] | β1(2) | 0.608 | 0.497 | [-0.351,1.577] |
| λ21(2) | 0.971 | 0.128 | [0.729,1.230] | β2(2) | 0.603 | 0.505 | [-0.366,1.591] |
| λ31(2) | 1.011 | 0.124 | [0.780,1.264] | β3(2) | 0.602 | 0.498 | [-0.360,1.604] |
| λ41(2) | 0.664 | 0.108 | [0.457,0.879] | β4(2) | 0.600 | 0.497 | [-0.362,1.592] |
| λ51(2) | 0.892 | 0.128 | [0.650,1.151] | β5(2) | 0.600 | 0.496 | [-0.377,1.583] |
| λ61(2) | 0.926 | 0.127 | [0.684,1.191] | ψεδ(2) | 0.236 | 0.030 | [0.182,0.301] |

TABLE 7. Performance of Deviance Information Criterion (DIC) for two populations NLVMs with Dichotomous Variables Using Censoring, Truncation and Continuous Normal Distribution

| | Interval Censored Normal | Interval Truncated Normal | Continuous Normal |
|---|---|---|---|
| DIC | 5875.0 | 5887.0 | 6903.0 |

Tables 1 and 2 report the results for the first and second groups under Type I and Type II inputs, with dichotomous variables, covariates, hidden continuous normal measurements (censored normal distribution; truncated normal distribution with known parameters), and two types of thresholds (equally and unequally spaced). In both groups, the SD values are noticeably low.
Tables 3 and 4 report the corresponding results for the first and second groups under Type I and Type II inputs, with dichotomous variables, covariates, hidden continuous normal measurements (truncated normal distribution with known parameters), and the same two types of thresholds (equally and unequally spaced). Here too, the SD values in both groups are rather low.
Highest posterior density (HPD) intervals were obtained for each parameter. Under both the censored normal and the truncated normal distributions, we found that the HPD intervals work well for dichotomous variables.
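The text does not say how the HPD intervals were computed. One common empirical approach, sketched here under the assumption of a unimodal posterior, takes the shortest window of sorted MCMC draws that contains the requested share of the sample. The function name `hpd_interval` and the simulated exponential draws are hypothetical and stand in for the Gibbs output of a single parameter.

```python
import math
import random


def hpd_interval(draws, prob=0.95):
    """Shortest interval holding at least `prob` of the draws (unimodal case)."""
    if not 0.0 < prob < 1.0:
        raise ValueError("prob must lie strictly between 0 and 1")
    xs = sorted(draws)
    n = len(xs)
    k = math.ceil(prob * n)  # number of draws the interval must contain
    if k < 2:
        raise ValueError("too few draws for an interval")
    # Slide a window of k consecutive sorted draws and keep the narrowest one.
    best = min(range(n - k + 1), key=lambda i: xs[i + k - 1] - xs[i])
    return xs[best], xs[best + k - 1]


# Hypothetical skewed posterior sample (not data from the study).
rng = random.Random(0)
draws = [rng.expovariate(1.0) for _ in range(20000)]

hpd_lo, hpd_hi = hpd_interval(draws, prob=0.95)

# Equal-tailed interval from the same draws, for comparison.
xs = sorted(draws)
eq_lo, eq_hi = xs[int(0.025 * len(xs))], xs[int(0.975 * len(xs)) - 1]

print(f"HPD interval:          [{hpd_lo:.3f}, {hpd_hi:.3f}]")
print(f"Equal-tailed interval: [{eq_lo:.3f}, {eq_hi:.3f}]")
```

For a symmetric posterior the two intervals nearly coincide. For a skewed one, such as this illustrative sample, the HPD interval is shorter, which is the usual reason for reporting it.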
To show the efficacy of the DIC for model comparison, we re-analysed the data sets using a nonlinear latent variable model (M4) with an interaction term and compared the resulting DIC values with those obtained under the appropriate model. Tables 5 and 6 present the findings.
With equally spaced thresholds, the DIC values for the censored normal and truncated normal distributions are 18070.0 and 19310.0, respectively. Since a lower DIC indicates a better fit, the censored normal model fits the dichotomous data better than the truncated normal model and performs very well for dichotomous variables.
With unequally spaced thresholds, the DIC values for the censored normal and truncated normal distributions are 17580.0 and 19350.0, respectively. Again the censored normal model has the lower DIC, so it fits the dichotomous data better than the truncated normal model with unequally spaced thresholds.
Overall, the censored normal distribution with unequally spaced thresholds is the best-fitting model, with the lowest DIC value (17580.0). The truncated normal distribution yields a DIC of 19310.0 with equally spaced thresholds and a still higher value with unequally spaced thresholds, so under the DIC it performs poorly for dichotomous data, and worst when the threshold distances are unequal.
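The comparison above can be sketched in code. The standard definition is DIC = D̄ + p_D with p_D = D̄ − D(θ̄), where D̄ is the posterior mean deviance and D(θ̄) is the deviance at the posterior mean of the parameters. Whether the authors computed the DIC in exactly this form is not stated, so the function below is an assumption. The four DIC values are those quoted in the text, and pairing them with equal and unequal threshold spacing follows one reading of that text.

```python
from statistics import fmean


def dic(deviance_draws, deviance_at_posterior_mean):
    """DIC = D_bar + p_D, with p_D = D_bar - D(theta_bar)."""
    d_bar = fmean(deviance_draws)             # posterior mean deviance
    p_d = d_bar - deviance_at_posterior_mean  # effective number of parameters
    return d_bar + p_d


# Toy deviance draws (hypothetical), only to show the arithmetic.
toy_dic = dic([100.0, 102.0, 104.0], deviance_at_posterior_mean=99.0)

# DIC values quoted in the text; the (distribution, threshold spacing)
# labels are an interpretation of that text.
reported = {
    ("censored normal", "equal"): 18070.0,
    ("truncated normal", "equal"): 19310.0,
    ("censored normal", "unequal"): 17580.0,
    ("truncated normal", "unequal"): 19350.0,
}

# The preferred model is the one with the smallest DIC.
best_model = min(reported, key=reported.get)

for model, value in sorted(reported.items(), key=lambda kv: kv[1]):
    print(f"{model[0]:<17} {model[1]:<8} DIC = {value:.1f}")
print("Lowest DIC:", best_model)
```

Only differences in DIC between models fitted to the same data are meaningful, so the ranking matters more than the absolute values.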
Convergence of the Gibbs sampler was monitored with plots of several simulated sequences of the individual parameters started from different initial values; these are shown in Figures 4 and 5. For the two-population NLVMs under the censored and truncated normal distributions, Bayesian estimates for the two groups were obtained from T = 10000 iterations after discarding 1000 burn-in iterations.
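The burn-in step can be illustrated with a small sketch, which is not the authors' code. A toy autoregressive chain (hypothetical, with made-up settings) stands in for the Gibbs draws of one parameter, started from three different values. The Gelman-Rubin statistic is not mentioned in the text and is added here only as a numerical companion to the trace plots. Keeping 10000 draws after dropping 1000 is one reading of "T = 10000".

```python
import random
import statistics

BURN_IN = 1000   # discarded iterations, as stated in the text
N_KEPT = 10000   # retained iterations (one reading of T = 10000)


def toy_chain(start, n_iter, rng, target=0.3, rho=0.9, noise=0.05):
    """Hypothetical AR(1) sequence standing in for one sampled parameter."""
    x, draws = start, []
    for _ in range(n_iter):
        x = target + rho * (x - target) + rng.gauss(0.0, noise)
        draws.append(x)
    return draws


def gelman_rubin(chains):
    """Potential scale reduction factor for equal-length chains."""
    n = len(chains[0])
    within = statistics.fmean([statistics.variance(c) for c in chains])
    between = n * statistics.variance([statistics.fmean(c) for c in chains])
    pooled = (n - 1) / n * within + between / n
    return (pooled / within) ** 0.5


rng = random.Random(2024)
starts = [-5.0, 0.0, 5.0]  # deliberately dispersed starting values
chains = [toy_chain(s, BURN_IN + N_KEPT, rng) for s in starts]

# Drop the burn-in draws from every chain before summarising.
kept = [c[BURN_IN:] for c in chains]
pooled_draws = [x for c in kept for x in c]

post_mean = statistics.fmean(pooled_draws)  # Bayesian point estimate
post_sd = statistics.stdev(pooled_draws)    # posterior standard deviation
r_hat = gelman_rubin(kept)

print(f"estimate = {post_mean:.3f}, SD = {post_sd:.3f}, R-hat = {r_hat:.3f}")
```

Values of R-hat close to 1 indicate that the chains have mixed after burn-in, which corresponds to the trace plots overlapping regardless of starting value.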