1. Introduction
In some cases, the study variable cannot be easily measured or is too expensive to measure, yet its units can be ranked at little or no cost. The literature on ranked set sampling (RSS) discusses a wide range of strategies for obtaining more efficient estimators of the study variable by including auxiliary information. RSS is a structured approach to data collection that improves estimation. Units are ranked on the values of an auxiliary variable that is correlated with the study variable. Ranking the units within a group from smallest to largest by visual inspection of the study variable is often difficult when the group size is large, and even when it can be done the ranking is prone to errors, which reduce the effectiveness of RSS. Alternative methods for ordering the units within a group have therefore been proposed to avoid ranking errors, including median ranked set sampling (MRSS).
McIntyre [6] first introduced the concept of RSS in his effort to develop a more efficient estimator of the yield of Australia's vast grazing regions. The concept gained traction after Halls and Dell [4] used RSS to estimate the output of animal fodder in pine woodlands; they were the first to use the term ranked set sampling for their method of estimation. Takahasi and Wakimoto [11], who provided the first mathematical proofs for this type of sampling, proved that the RSS sample mean is an unbiased estimator of the population mean and that its variance is less than that of the mean of a simple random sample (SRS), assuming perfect ranking of the elements.
Dell and Clutter [3] arrived at the same result without requiring perfect ranking, so the result holds whether or not ranking errors are present. Stokes [9] proposed using an auxiliary variable to estimate the ranks of the main variable, because it is difficult to rank units by eye when the number of units is large. Al-Saleh and Samawi [2] proposed estimators based on a bivariate ranked set sampling technique and compared them with existing estimators based on a bivariate simple random sample, with an application to the bivariate normal distribution. Zamanzade and Al-Omari [12] used Monte Carlo simulation to compare empirical mean and variance estimators based on new ranked set sampling with their counterparts in ranked set sampling and simple random sampling. Muttlak [7] suggested median ranked set sampling (MRSS) as a strategy to reduce errors in ranking units within groups. Syam et al. [10] investigated the population mean using a double median ranked set sampling method, demonstrating that the resulting estimators were more efficient than their simple random sampling, stratified random sampling, ranked set sampling, and stratified ranked set sampling counterparts. This method produces reliable estimates of a population mean regardless of whether the distribution is symmetric or asymmetric. Al-Omari and Al-Nasir [1] used the multistage median ranked set sampling (MMRSS) approach to estimate the ratio of a finite population. Their results show that the proposed estimators are unbiased and have the lowest variance when compared with simple, stratified, ranked, and median ranked sampling procedures, and that the efficiency of the MMRSS estimators grows as the number of cycles increases.
Using auxiliary variables, we present a highly generalized approach for estimating the population mean under MRSS schemes, which is discussed in detail in this study. It is shown that a large number of existing estimators belong to the proposed class of estimators, and that the proposed estimator is more efficient in estimating the population mean than the corresponding existing estimators under MRSS and SRS.
2. Methodology for MRSS:
Muttlak [7] suggested median ranked set sampling as a strategy to reduce errors in ranking units within groups. The procedure for drawing a sample of size $nr$ is as follows. We randomly select $n^2$ units from the population, divide them into $n$ groups of $n$ units each, and rank the units within each group. If the group size $n$ is odd, we measure the median of each group, i.e. the unit of rank $(n+1)/2$; if $n$ is even, we measure the unit of rank $n/2$ from half of the groups and the unit of rank $(n+2)/2$ from the remaining half. In both cases, one cycle produces a sample of $n$ units. To obtain the required sample size $nr$, the cycle is repeated $r$ times. The process is summarized as follows:
1- Choose $n^2$ items at random from the target population.
2- Divide the items into $n$ groups of size $n$ each, and then rank the items inside each group.
3- If the group size $n$ is odd, choose from each group in step 2 the item with the $((n+1)/2)$-th smallest rank, which is the median of the group. If $n$ is even, choose the item with the $(n/2)$-th smallest rank from each of the first $n/2$ groups and the item with the $((n+2)/2)$-th smallest rank from each of the remaining $n/2$ groups.
4- Repeat steps 1-3 $r$ times to obtain a sample of size $nr$.
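The selection steps above can be sketched in Python. This is only an illustrative sketch under stated assumptions: the names `median_position` and `mrss_indices` are hypothetical, `n` denotes the group size and `r` the number of cycles, ranking is done on an auxiliary variable `x`, and units are drawn without replacement within a cycle (a unit may reappear in a later cycle).

```python
import numpy as np

def median_position(n, set_index):
    """0-based rank to measure in group `set_index` (0-based) of size n."""
    if n % 2 == 1:
        return (n + 1) // 2 - 1      # odd n: the median, rank (n+1)/2
    if set_index < n // 2:
        return n // 2 - 1            # even n, first half of groups: rank n/2
    return (n + 2) // 2 - 1          # even n, second half: rank (n+2)/2

def mrss_indices(x, n, r, rng):
    """Return population indices of an MRSS sample of size n*r.

    Assumption: groups are ranked on the auxiliary variable x only.
    """
    x = np.asarray(x, dtype=float)
    chosen = []
    for _ in range(r):
        # Step 1: draw n*n units for this cycle.
        units = rng.choice(len(x), size=n * n, replace=False)
        # Step 2: split into n groups of size n and rank each group on x.
        for i, group in enumerate(units.reshape(n, n)):
            ranked = group[np.argsort(x[group], kind="stable")]
            # Step 3: keep the unit at the median position.
            chosen.append(int(ranked[median_position(n, i)]))
    return np.array(chosen)

rng = np.random.default_rng(1)
x = rng.normal(size=252)             # synthetic ranking variable
idx = mrss_indices(x, n=4, r=3, rng=rng)
print(len(idx))                      # 12 selected units (n * r)
```

The returned indices can then be used to read off the measured values of the study variable and of the auxiliary variables for the selected units.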
Now assume that the set size is odd. Let represent the median ranked set sample, consisting of the items of for the main variable and the two auxiliary variables and , and suppose that the ranking is based on the auxiliary variable ; the sample is described as follows.
where denotes the judgment ordering in the set in the cycle for the study variable and auxiliary variable , respectively. Also, denotes the ranking in the set in the cycle for the auxiliary variable , where . Finally, if the set size is even, let represent the median ranked set sample, where , with the items of as follows.
Let denote the natural unbiased estimators of the finite population mean , and variance , of the main variable and the two auxiliary variables in , respectively. Muttlak [7] estimated the mean of a finite population using median ranked set sampling and demonstrated that the estimator is unbiased for the population mean and has a lower variance than the mean of a simple random sample, as shown below. In the odd case, the estimators of the population mean under median ranked set sampling are defined by the following relationship.
These estimators are unbiased for the population mean, that is:
;
;
;
Now the estimators are defined as follows for the even case.
These estimators are also unbiased for the population mean.
The variance of the estimated mean under median ranked set sampling in the even case is given by the following formula:
Where
The covariances between the means of the main and auxiliary variables under median ranked set sampling are defined as follows in both cases:
,
,
Where
This quantity expresses the covariance between the main and auxiliary variables under simple random sampling; see [7] and [8] for further information.
3. Proposed generalized estimator:
The population mean is one of the essential parameters that researchers investigate because of its importance in characterizing a population. Consequently, samples are used in various ways to construct estimators of this unknown parameter. Samawi, Al-Omari, and Khan are among the few researchers who have estimated this parameter under MRSS. In this paper, we propose an estimator of the population mean based on median ranked set sampling. Because the proposed estimator is a generalized one, any required estimator can be obtained from it with a few simple modifications. The general form of the proposed estimator is as follows.
Where are unknown constants chosen to minimize the mean squared error of the estimator, is a known scalar constant that takes the value one or zero, and , are design constants that can take the values ; each choice of these values determines the form of the estimator that is generated from the estimator defined above.
Note that in the definition of the estimator above, the index takes one of the letters , where indicates that the estimators are defined for the odd case of MRSS, and indicates that the estimators are defined for the even case of MRSS.
Defining the following error terms makes it easier to study the properties of the suggested estimator. To reformulate the estimator, we assume the following.
Let
As demonstrated in [7], the mean estimated under MRSS is an unbiased estimator of the population mean. It follows that
,
,
,
To evaluate the properties of the estimator in both the odd and even cases, it is rewritten in terms of the error terms, so that, up to the first degree of approximation, the estimator , becomes as follows.
Adding and subtracting the value in equation (3-2) gives the following form, which serves as the basis for determining the properties of the estimator .
(3-3)
Taking the expectation of both sides of equation (3-3) gives the bias of the estimator , which is defined for the odd and even cases.
The mean squared error of the estimator can also be obtained by squaring equation (3-3) and then taking the expectation, retaining terms up to order , as follows.
(3-5)
It is worth noting that equation (3-5) gives the mean squared error of the estimator in both the odd and even cases; either case is obtained by letting the index take the symbol for the odd case or the symbol for the even case. The mean squared error of the estimator under median ranked set sampling is also related to that of the same estimator under simple random sampling (SRS), as can be seen by rewriting equation (3-5) in another way, as shown below.
(3-6)
Where
(3-7)
is the mean squared error, under SRS, of the estimator that corresponds to the estimator , and
(3-8)
Examining the second term of equation (3-6) shows that the mean squared error of the estimator under MRSS is less than the mean squared error of the estimator under SRS in both the odd and even cases, as shown in the following steps.
In the odd case: Let
Note that the term is a perfect square, which allows the mean squared error of the estimator to be written in the form of equation (3-9) and yields the result .
(3-9)
In the even case: Let
In this case, we also note that is the sum of two perfect squares. The mean squared error of the estimator in the even case therefore takes the form of equation (3-10), in which the second term on the right-hand side is the amount instead of , and the result is again that .
(3-10)
As for the bias formula defined by equation (3-4), it can be written in the following form, which shows that the bias of the estimator under MRSS is the bias of the estimator under simple random sampling reduced by a positive quantity. This indicates that the bias of the estimator under MRSS is less than its bias under SRS.
(3-11)
Where , is the bias of the estimator calculated using a simple random sample, and its formula is as follows:
(3-12)
To obtain the best form of the estimator , the optimal values of the unknown constants are needed. They are found by taking the partial derivatives of equation (3-5) with respect to these constants, setting the derivatives to zero, and solving for the optimal values, as shown below.
(3-13)
(3-14)
For the odd and even cases, the mean squared error of the optimal estimator is obtained by substituting equations (3-13) and (3-14) into equation (3-5), as follows:
(3-15)
Additionally, substituting the optimal values of into equations (3-1) and (3-4) gives the optimal estimator of the finite population mean and its bias under MRSS.
4. Some of the estimators derived from :
We obtain several exponential and non-exponential ratio, product, and ratio-cum-product estimators from by replacing the constants in Eq. (3-1) with specific values. Each estimator is denoted by its case number, which is entered as the index in . The following table shows the forms of some of these estimators.
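Table (1) Some estimators generated from
| Case | Values | Estimator |
|---|---|---|
| 1 | 0 0 0 0 0 | Traditional unbiased estimator |
| 2 | 1 0 0 0 0 | Ratio estimator |
| 3 | 0 -1 0 0 0 | Product estimator |
| 4 | 1 1 0 0 0 | Multiple ratio estimator |
| 5 | -1 -1 0 0 0 | Multiple product estimator |
| 6 | 1 0 1 1 0 | Ratio-type exponential estimator |
| 7 | -1 0 1 1 0 | Product-type exponential estimator |
| 8 | 1 -1 1 1 0 | Ratio-cum-product-type exponential estimator |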
In case 1, the estimator reduces to the traditional unbiased estimator of the population mean under MRSS, as suggested by [7]. Case 2 is the ratio estimator under MRSS, suggested by [1]; case 3 is the product estimator; case 4 is the multiple ratio estimator; case 5 is the multiple product estimator; case 6 is the ratio-type exponential estimator; case 7 is the product-type exponential estimator; and case 8 is the ratio-cum-product-type exponential estimator, all under MRSS. A large number of additional estimators can be derived from the general estimator in the same way. The bias and the mean squared error of these estimators can be determined from equations (3-4) and (3-5). The estimators in Table (1) can also be computed under SRS, and equations (3-7) and (3-12), together with the quantity defined in equation (3-8), give their properties, so that they can be compared with the MRSS estimators through the relative efficiency.
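For orientation, the single-auxiliary members of this family can be illustrated with their standard textbook forms. The sketch below is an assumption rather than the paper's own definition, since the defining equation (3-1) is not reproduced here: it uses the classical ratio and product estimators and the usual exponential ratio-type and product-type forms, with hypothetical function names, where `y_bar` and `x_bar` are sample means and `mu_x` is the known population mean of the auxiliary variable.

```python
import math

def ratio_estimator(y_bar, x_bar, mu_x):
    # Classical ratio estimator: scales y_bar up when x_bar falls below mu_x.
    return y_bar * mu_x / x_bar

def product_estimator(y_bar, x_bar, mu_x):
    # Classical product estimator: suited to negatively correlated y and x.
    return y_bar * x_bar / mu_x

def exp_ratio_estimator(y_bar, x_bar, mu_x):
    # Exponential ratio-type form (assumed textbook version).
    return y_bar * math.exp((mu_x - x_bar) / (mu_x + x_bar))

def exp_product_estimator(y_bar, x_bar, mu_x):
    # Exponential product-type form (assumed textbook version).
    return y_bar * math.exp((x_bar - mu_x) / (x_bar + mu_x))

# Illustrative numbers only, not taken from the paper's data.
print(ratio_estimator(10.0, 4.0, 5.0))    # 12.5
print(product_estimator(10.0, 4.0, 5.0))  # 8.0
```

When the sample mean of the auxiliary variable equals its population mean, all four forms return `y_bar` unchanged, which is the behaviour expected of the traditional estimator in case 1.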
5. Comparing estimators' efficacy:
To assess the accuracy of the estimator , it is compared with the other estimators defined in the previous section by calculating the efficiency criterion between those estimators according to the following relationship:
(3-16)
The following table gives the conditions under which the suggested estimator is more efficient than the other estimators under MRSS, based on equation (3-16).
Table (2) Accuracy of the proposed estimator with in
| Compared estimators | If the criterion is met |
|---|---|
6. Working simulation:
A real data set is used to compare the proposed estimators with one another. The data set contains the body fat percentages of 252 men, determined by underwater weighing, together with various body circumference measurements; for more details, see http://lib.stat.cmu.edu/datasets/bodyfat/. The study variable is body fat percentage, the first auxiliary variable is belly circumference, and the second auxiliary variable is thigh circumference. The population has the following characteristics:
Using the median ranked set sampling technique described in Section 2, a simulation study is carried out to compare the estimators. The ranking is performed on the auxiliary variable . The estimators are assessed with two empirical measures computed from 25,000 simulation runs: the percentage relative bias, which assesses the empirical bias of each estimator, and the percentage relative efficiency, which shows which estimator is the most efficient from an empirical standpoint. The results are shown in Tables (3)-(6), and the two measures are obtained using the formulas given below.
From Tables (3) and (4), the estimator has the lowest mean squared error in both the odd and even cases, which suggests that it describes the population mean most accurately under MRSS, while the estimator has the second-lowest mean squared error. Regarding the proposed estimator in Tables (5) and (6), it has the highest relative efficiency with respect to the mean under median ranked set sampling, and that efficiency increases with the sample size. The estimators ( ) rank second in efficiency after the estimator because their definition relies on the exponential form, and their efficiency also increases with the sample size . The estimators ( ) rank third and fourth in efficiency, respectively, when compared with . The estimators ( ) have the lowest efficiency because the data are positively correlated whereas those estimators are built for a negative relationship, which explains their poor performance. As for relative bias, Tables (5) and (6) show that the generalized estimator , has the lowest bias among the estimators, with the lowest bias being 2% in the odd and even cases, and the bias fades toward zero as the sample size increases.
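A common way to compute these two measures from simulated estimates can be sketched as follows. The definitions used here are assumptions, not taken from the paper: percentage relative bias as 100 times the absolute difference between the average estimate and the true mean, divided by the true mean, and percentage relative efficiency as 100 times the ratio of the empirical mean squared error of a baseline estimator (for example the MRSS sample mean) to that of the estimator being assessed. All function names are hypothetical.

```python
import numpy as np

def empirical_mse(estimates, true_mean):
    # Average squared deviation of the simulated estimates from the true mean.
    estimates = np.asarray(estimates, dtype=float)
    return float(np.mean((estimates - true_mean) ** 2))

def percent_relative_bias(estimates, true_mean):
    # Assumed definition: 100 * |mean(estimates) - true_mean| / |true_mean|.
    estimates = np.asarray(estimates, dtype=float)
    return 100.0 * abs(float(estimates.mean()) - true_mean) / abs(true_mean)

def percent_relative_efficiency(baseline_estimates, estimates, true_mean):
    # Assumed definition: 100 * MSE(baseline) / MSE(estimator).
    # Values above 100 mean the estimator beats the baseline.
    return 100.0 * empirical_mse(baseline_estimates, true_mean) / empirical_mse(
        estimates, true_mean
    )

# Toy numbers only, standing in for the simulation replications.
baseline = [9.0, 11.0]     # e.g. simulated MRSS sample means
candidate = [9.5, 10.5]    # e.g. simulated values of another estimator
print(percent_relative_bias(candidate, 10.0))                  # 0.0
print(percent_relative_efficiency(baseline, candidate, 10.0))  # 400.0
```

In a full study, each list would hold one estimate per simulation run, with every run drawing a fresh MRSS sample from the population.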
7. Final remarks:
Comparing the results of the simulation study with the theoretical results in Table (2) shows that the proposed estimator under MRSS exhibits a high relative efficiency when estimating the population mean and is not affected by the type of relationship between the auxiliary and main variables, in contrast to the other estimators, which are affected by it. In terms of relative bias, the estimator has the lowest bias, and that bias decreases as the size of the ranked sample increases. Equations (3-9) and (3-10) also demonstrate that MRSS outperforms SRS in accuracy when estimating the population mean. As a result, the estimator outperforms all of the estimators described in Table (1) and the other types of estimators that can be derived from it.