Introduction
The Weibull distribution is one of the most commonly used distributions for modeling data sets from experiments in many fields, including lifetime testing, engineering, reliability, medical, biological and hydrological studies, weather forecasting, insurance, extreme events, economics, finance and banking. However, the Weibull distribution does not provide a good fit when the empirical probability density function (p.d.f.) is bimodal, positively skewed, heavy tailed, long tailed or highly peaked, or when the empirical hazard rate (hr) curve is non-monotonic. To overcome these weaknesses, many modifications have been made by adding one or more shape parameters to a baseline Weibull distribution and to many other lifetime distributions. These additional parameters increase the flexibility of the modified distribution in fitting samples with high positive or negative skewness or high kurtosis, and they also give a good fit to increasing, decreasing, constant, monotonic and non-monotonic (hr) curves.
Because of these weaknesses in classical lifetime distributions, many authors have proposed generated families. The first was the exponentiated family of distributions (Mudholkar and Srivastava (1993)), followed by the Marshall-Olkin generated family (Marshall and Olkin (1997)), the Beta generator (Eugene et al. (2002)), the Kumaraswamy generated family (Jones (2009)), which is based on the Kumaraswamy distribution as an alternative to the Beta family, the quadratic transmuted family (Shaw and Buckley (2009)), the T-X family (Alzaatreh et al. (2013)) and the Topp-Leone family (Al-Shomrani et al. (2016)). Over the past six years, a large body of literature has introduced different forms of generators called alpha power transformed (APT) families. These families require a closed form of the c.d.f. of the baseline distribution.
The first APT family was suggested by Mahdavi and Kundu (2017). Many modifications of this first APT family have been made by changing some of its terms such that the resulting APT functions still satisfy the properties of a c.d.f. Many other APT families of distributions have been presented that differ from the Mahdavi and Kundu family and all of its modifications. Ijaz et al. (2020) produced a new family named the Gull APT family, denoted GAPT.
A continuous random variable (r.v.) X follows the GAPT family of distributions if its c.d.f. is (Ijaz et al. (2020)):
(1)
where G(x) is the c.d.f. of the baseline distribution and a is a shape parameter. Since G(x) is a c.d.f., F(x) is a right-continuous, increasing and differentiable function. These results are proved as follows:
If
Therefore, F is right continuous
And if
Then and
Therefore the increasing property is proved. For the differentiability of F
Therefore F(x) is a differentiable function. The p-th quantile of the GAPT distribution with condition is a solution of the following equation with respect to :
(2)
It is well known that so that eq (2) can be written as : (3)
The solution of either eq (2) or eq (3) with respect to represents the p-th quantile.
Taking the derivative of both sides of eq (1) with respect to x, the p.d.f. of the GAPT family is:
(4)
f(x) must satisfy the conditions f(x) > 0 and
For f(x) to be greater than zero, the expression must be greater than zero, so that
The survival and hazard rate functions of the GAPT family are, respectively:
(5)
(6)
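As a reading aid, the GAPT family can be sketched in the form published by Ijaz et al. (2020), written with the symbols used above. The expressions below are assumed rather than quoted from eqs (1) and (4) to (6); with a Weibull baseline of unit scale they reproduce the median and mode values reported in table (1).

```latex
\begin{align*}
F(x) &= \frac{a\,G(x)}{a^{G(x)}} = G(x)\,a^{\,1-G(x)}, \\
f(x) &= g(x)\,a^{\,1-G(x)}\,\bigl(1 - G(x)\ln a\bigr), \\
S(x) &= 1 - F(x) = 1 - G(x)\,a^{\,1-G(x)}, \\
h(x) &= \frac{f(x)}{S(x)}
      = \frac{g(x)\,a^{\,1-G(x)}\,\bigl(1 - G(x)\ln a\bigr)}{1 - G(x)\,a^{\,1-G(x)}} .
\end{align*}
```

Under this assumed form, f(x) > 0 for every x requires 1 - G(x) ln a > 0 for all values of G(x) in (0, 1), which holds whenever 0 < a < e.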
The aim of this paper is to generalize the standard Weibull distribution using the GAPT family. Some statistical properties are discussed, and the parameters of the new distribution are estimated using the maximum product of spacings (MPS) and Cramér-von Mises (CVM) methods.
This paper is organized as follows. Section (2) deals with the Gull alpha power transformed standard Weibull (GAPTW) distribution. Section (3) discusses the different shapes of the (hr) function. The tail of the GAPTW distribution is discussed in section (4), and some important statistical properties are given in section (5). Section (6) contains the two methods used to estimate the parameters of the distribution. The application to two real data sets and the conclusions are given in sections (7) and (8), respectively.
2- Gull Alpha Power Transform Weibull distribution
Let and be the c.d.f. and p.d.f. of the baseline Weibull distribution, given by (Ashraf & Khan (2021)):
(7)
(8)
Substituting eqs (7) and (8) into eqs (1) and (4), where G(x) and g(x) are abbreviations for and , respectively, the c.d.f. and p.d.f. of the GAPTW distribution are:
e=2.718282 (9)
(10)
This distribution is denoted by , where ( ) are shape parameters and ( ) is a scale parameter.
The survival and hazard rate functions of the GAPTW distribution are:
(11)
(12)
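The four functions of the GAPTW distribution can be evaluated numerically with a short sketch such as the one below. It assumes the GAPT form of Ijaz et al. (2020), F(x) = a G(x) / a^G(x), and a baseline Weibull c.d.f. G(x) = 1 - exp(-(x/scale)^shape). The argument names `scale`, `shape` and `a` are illustrative and are not the symbols used in eqs (7) to (12).

```python
import math


def weibull_cdf(x, scale, shape):
    """Assumed baseline c.d.f. G(x) = 1 - exp(-(x / scale) ** shape) for x > 0."""
    if x <= 0.0:
        return 0.0
    return -math.expm1(-((x / scale) ** shape))


def weibull_pdf(x, scale, shape):
    """Assumed baseline p.d.f. g(x), the derivative of weibull_cdf."""
    if x <= 0.0:
        return 0.0
    z = (x / scale) ** shape
    return (shape / x) * z * math.exp(-z)


def gaptw_cdf(x, scale, shape, a):
    """Assumed GAPT form F(x) = a * G(x) / a ** G(x), for 0 < a < e."""
    G = weibull_cdf(x, scale, shape)
    return G * a ** (1.0 - G)


def gaptw_pdf(x, scale, shape, a):
    """f(x) = g(x) * a ** (1 - G(x)) * (1 - G(x) * ln a)."""
    G = weibull_cdf(x, scale, shape)
    g = weibull_pdf(x, scale, shape)
    return g * a ** (1.0 - G) * (1.0 - G * math.log(a))


def gaptw_sf(x, scale, shape, a):
    """Survival function S(x) = 1 - F(x)."""
    return 1.0 - gaptw_cdf(x, scale, shape, a)


def gaptw_hazard(x, scale, shape, a):
    """Hazard rate h(x) = f(x) / S(x)."""
    return gaptw_pdf(x, scale, shape, a) / gaptw_sf(x, scale, shape, a)


if __name__ == "__main__":
    # The c.d.f. at the median tabulated for scale 1, shape 0.5, a 0.25
    # should be close to 0.5 if the assumed forms match the paper.
    print(gaptw_cdf(1.699874, 1.0, 0.5, 0.25))
    print(gaptw_hazard(1.0, 1.0, 1.5, 0.75))
```

For a = 1 the factor a^(1 - G) equals one and the sketch reduces to the baseline Weibull distribution, which gives a quick sanity check of any implementation.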
3- Shapes of the density and hazard rate functions of the GAPTW distribution
The shape of a distribution reflects its symmetry or asymmetry, its skewness and its tails. All of these can be checked mathematically using the theorems given by Glaser (1980). Define:
(13)
where and are defined in eqs (11) and (10), respectively. V(x) must be differentiable and continuous, which can be proved as follows:
Consider so that
And V(x) is continuous at x=c if
Taking the derivative of both sides of eq (13) with respect to x, we have
(14)
(15)
The importance of the Glaser (1980) theorem as a tool for classifying the hazard rate function lies in the possibility of tracking the comparisons of with the hazard rate function or its inverse. The classification theorem is especially useful when f(x) belongs to the exponential family. The p.d.f. f(x) defined in eq (10) can be written in exponential-family form as:
(16)
.The function is :-
(17)
Where
, . Glaser (1980)
So that:
Taking the first derivative of both sides of with respect to x, we get:
(19)
After some mathematical simplification, becomes:
(20)
The following cases are considered:
Case(1)
For
- If
- If
In this case the (hr) function has a bathtub or increasing curve. The solution of can be found by a nonlinear numerical method. If the solution of is which satisfies and for
Case(2)
When
- For
Therefore the (hr) function is increasing.
- For
for all so that the (hr) curve is increasing.
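The analytical case analysis can be cross-checked numerically by evaluating the hazard rate on a grid and recording how often it changes direction. The sketch below does this under the assumed GAPT form F(x) = a G(x) / a^G(x) with a Weibull baseline of c.d.f. 1 - exp(-(x/scale)^shape). The function names, the grid limits and the tolerance are illustrative choices, and a grid check only describes the interval that was scanned.

```python
import math


def gaptw_hazard(x, scale, shape, a):
    """Hazard rate f(x) / S(x) under the assumed GAPTW form, for x > 0."""
    z = (x / scale) ** shape
    G = -math.expm1(-z)                    # baseline Weibull c.d.f.
    g = (shape / x) * z * math.exp(-z)     # baseline Weibull p.d.f.
    pdf = g * a ** (1.0 - G) * (1.0 - G * math.log(a))
    sf = 1.0 - G * a ** (1.0 - G)
    return pdf / sf


def classify_hazard(scale, shape, a, x_min=0.05, x_max=3.0, n=60, tol=1e-9):
    """Label the hazard shape on [x_min, x_max] from the signs of its increments."""
    xs = [x_min + i * (x_max - x_min) / (n - 1) for i in range(n)]
    hs = [gaptw_hazard(x, scale, shape, a) for x in xs]
    signs = []
    for h0, h1 in zip(hs, hs[1:]):
        d = h1 - h0
        if abs(d) <= tol:
            continue                       # treat tiny changes as flat
        s = 1 if d > 0 else -1
        if not signs or signs[-1] != s:
            signs.append(s)                # record each change of direction once
    labels = {
        (): "constant",
        (1,): "increasing",
        (-1,): "decreasing",
        (-1, 1): "bathtub",
        (1, -1): "upside-down bathtub",
    }
    return labels.get(tuple(signs), "other")


if __name__ == "__main__":
    for shape in (0.5, 1.5, 2.0):
        for a in (0.25, 0.75, 1.5):
            print(shape, a, classify_hazard(1.0, shape, a))
```

With a = 1 the hazard is that of the baseline Weibull distribution, so the labels for shape values below, equal to and above one are known in advance and can be used to validate the grid settings.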
The p.d.f. f(x), the c.d.f. F(x) and the (hr) function h(x) have been plotted for different values of the parameters.
Graph (1): The p.d.f. in eq (10), c.d.f. in eq (9), survival function in eq (11) and hazard rate function in eq (12) of the GAPTW distribution at different values of and .
We notice from the figure that the distribution has a heavy tail.
4- Heavy tailed distribution
A heavy-tailed distribution is a probability distribution whose tails are not exponentially bounded, that is, its tails are heavier than those of the exponential distribution. The distribution of a continuous random variable has a light tail if it does not have a heavy tail. In many applications it is the right tail that is of interest.
A random variable x with (c.d.f) F(x) has a heavy tail if moment generating function (m.g.f) of x is infinite or the limit as for ratio with the .
Where is the (C.D.F) of ie
We now consider the survival function of the GAPTW distribution defined in eq (11):
Therefore the GAPTW distribution is heavy tailed. Another method of testing for a heavy tail is based on the limit of the (hr) function as ; if this limit equals zero, the distribution has a heavy tail.
It is possible to compare the right tails of two distributions by the limit as of the ratio of the two survival functions; if the result approaches , the distribution in the numerator has a heavier tail than the other. This ratio is defined as
(21)
where , are the survival functions of the first and second distributions. For a comparison between the GAPTW and Weibull distributions, the ratio in (21) becomes:
Thus the GAPTW distribution has a heavier tail than the Weibull distribution.
5- Statistical properties of the GAPTW distribution
In this section some important statistical properties of the distribution are presented.
The q-th quantile is a solution of the following nonlinear equation with respect to
(22)
so that eq (22) becomes:
(23)
The above nonlinear equation is solved with respect to ( ) using a nonlinear numerical method. The root ( ) which satisfies eq (23) must be ; the following equation is then solved with respect to .
(24)
Eq (23) is very important in simulating samples from the GAPTW distribution by replacing ( ) with for i=1,2,…,n, where n is the sample size and is a random observation generated from U(0,1). It is also used to obtain the median of X, where
The median and quantiles were computed at different parameter values of the GAPTW distribution. The results are shown in table (1).
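The two-step quantile procedure and the simulation scheme can be sketched as follows. The sketch assumes the GAPT form F(x) = a G(x) / a^G(x), so that the first step solves G a^(1-G) = q for the baseline probability G by bisection, and the second step inverts an assumed baseline Weibull c.d.f. 1 - exp(-(x/scale)^shape). The function and argument names are illustrative.

```python
import math
import random


def solve_baseline_prob(q, a, tol=1e-13):
    """Solve G * a ** (1 - G) = q for G in [0, 1] by bisection (0 < a < e)."""
    lo, hi = 0.0, 1.0
    for _ in range(200):
        mid = 0.5 * (lo + hi)
        # The left-hand side is increasing in G when 0 < a < e.
        if mid * a ** (1.0 - mid) < q:
            lo = mid
        else:
            hi = mid
        if hi - lo < tol:
            break
    return 0.5 * (lo + hi)


def gaptw_quantile(q, scale, shape, a):
    """q-th quantile: solve for G, then invert the baseline Weibull c.d.f."""
    G = solve_baseline_prob(q, a)
    return scale * (-math.log1p(-G)) ** (1.0 / shape)


def gaptw_sample(n, scale, shape, a, seed=None):
    """Inverse-transform sampling: feed U(0,1) draws through the quantile function."""
    rng = random.Random(seed)
    return [gaptw_quantile(rng.random(), scale, shape, a) for _ in range(n)]


if __name__ == "__main__":
    # Median for scale 1, shape 0.5, a 0.25; table (1) reports 1.699874.
    print(gaptw_quantile(0.5, 1.0, 0.5, 0.25))
    print(gaptw_sample(5, 1.0, 1.5, 0.75, seed=1))
```

Bisection is used here only because it needs no derivative and cannot leave the interval [0, 1]; any other bracketing root finder would serve equally well.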
It is seen from table (1) that the additional parameter ( ) has an inverse effect on the values of the quantiles and the mode. The mode is also smaller than the median at all parameter values of the distribution, which indicates a positive skew.
The mode is a numerical solution of the following equation with respect to x:
(25)
Table (1): Quantiles and mode of the GAPTW distribution at different parameter values:
| | | a | Q1 | Q3 | Me | Mo |
| 1 | 0.5 | 0.25 | 0.4804671 | 4.588197 | 1.699874 | 0 |
| | | 0.75 | 0.1326779 | 2.518494 | 0.698367 | 0 |
| | | 1.5 | 0.03901163 | 1.108443 | 0.03901163 | 0 |
| 1 | 1.5 | 0.25 | 0.7832147 | 1.661671 | 1.193454 | 1.068268 |
| | | 0.75 | 0.5100646 | 1.360551 | 0.8872139 | 0.6289186 |
| | | 1.5 | 0.3391544 | 1.034917 | 0.6258829 | 0.3386085 |
| 1 | 2 | 0.25 | 0.8325546 | 1.463565 | 1.141837 | 1.116567 |
| | | 0.75 | 0.6035548 | 1.259752 | 0.9141584 | 0.828462 |
| | | 1.5 | 0.4444399 | 1.026083 | 0.703667 | 0.5533789 |
The second derivative of f(x) with respect to x is less than zero, so the solution represents the mode.
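A numerical mode can also be obtained by maximizing the log-density directly, which avoids differentiating f(x) by hand. The sketch below uses a golden-section search and assumes the GAPT form of Ijaz et al. (2020) with a Weibull baseline of c.d.f. 1 - exp(-(x/scale)^shape). It also assumes a single interior maximum inside the search interval, which is not the case when the mode is at zero. All names and the default interval are illustrative.

```python
import math


def gaptw_log_pdf(x, scale, shape, a):
    """Log of the assumed GAPTW p.d.f. at x > 0 (requires 0 < a < e)."""
    z = (x / scale) ** shape
    G = -math.expm1(-z)                                  # baseline c.d.f.
    log_g = math.log(shape / scale) + (shape - 1.0) * math.log(x / scale) - z
    return log_g + (1.0 - G) * math.log(a) + math.log(1.0 - G * math.log(a))


def gaptw_mode(scale, shape, a, lower=None, upper=None, tol=1e-10):
    """Golden-section search for the maximizer of the log-density on (lower, upper)."""
    lo = 1e-9 * scale if lower is None else lower
    hi = 10.0 * scale if upper is None else upper
    inv_phi = (math.sqrt(5.0) - 1.0) / 2.0
    c = hi - inv_phi * (hi - lo)
    d = lo + inv_phi * (hi - lo)
    fc = gaptw_log_pdf(c, scale, shape, a)
    fd = gaptw_log_pdf(d, scale, shape, a)
    while hi - lo > tol:
        if fc > fd:
            # The maximum lies in [lo, d]; reuse c as the new right probe.
            hi, d, fd = d, c, fc
            c = hi - inv_phi * (hi - lo)
            fc = gaptw_log_pdf(c, scale, shape, a)
        else:
            # The maximum lies in [c, hi]; reuse d as the new left probe.
            lo, c, fc = c, d, fd
            d = lo + inv_phi * (hi - lo)
            fd = gaptw_log_pdf(d, scale, shape, a)
    return 0.5 * (lo + hi)


if __name__ == "__main__":
    # Table (1) reports a mode of 1.068268 for scale 1, shape 1.5, a 0.25.
    print(gaptw_mode(1.0, 1.5, 0.25))
```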
The r-th moment about zero is
By using the relation and making mathematical simplifications, the r-th moment about zero of X is
The r-th moment above is infinite for all values of r, so the distribution is heavy tailed.
6- Estimation methods
In this section the parameters of the GAPTW distribution are estimated by the maximum product of spacings (MPS) and Cramér-von Mises (CVM) methods. These two methods of estimation are discussed below.
6.1 Maximum product of spacings (MPS) method
This method was introduced by Cheng and Amin (1979) [2]. It is based on the maximization of the following function:
(26)
Here the sample observations, for i=1,2,…,n, are arranged in ascending order.
After arranging the sample observations in ascending order and substituting the c.d.f. of the GAPTW distribution defined in eq (9) into eq (26), the function becomes:
(27)
The function (27) can be maximized by constrained nonlinear optimization methods or by solving the following nonlinear equations:
(28)
where
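The MPS criterion can be coded directly as an objective function and handed to a numerical optimizer. The sketch below assumes the usual Cheng and Amin form, the mean of the log-spacings F(x_(i)) - F(x_(i-1)) with F(x_(0)) = 0 and F(x_(n+1)) = 1, together with the assumed GAPT c.d.f. F(x) = a G(x) / a^G(x) and a Weibull baseline of c.d.f. 1 - exp(-(x/scale)^shape). The function names and the parameter order `(scale, shape, a)` are illustrative.

```python
import math


def gaptw_cdf(x, scale, shape, a):
    """Assumed GAPTW c.d.f. with a Weibull baseline."""
    if x <= 0.0:
        return 0.0
    G = -math.expm1(-((x / scale) ** shape))
    return G * a ** (1.0 - G)


def mps_objective(params, data):
    """Mean log-spacing to be maximized; -inf outside the parameter space."""
    scale, shape, a = params
    if scale <= 0.0 or shape <= 0.0 or not (0.0 < a < math.e):
        return -math.inf
    xs = sorted(data)
    # Pad with F = 0 below the smallest and F = 1 above the largest observation.
    cdf_values = [0.0] + [gaptw_cdf(x, scale, shape, a) for x in xs] + [1.0]
    spacings = [hi - lo for lo, hi in zip(cdf_values, cdf_values[1:])]
    if min(spacings) <= 0.0:
        return -math.inf                   # ties or numerical collapse of a spacing
    return sum(math.log(d) for d in spacings) / (len(xs) + 1)


if __name__ == "__main__":
    sample = [0.21, 0.48, 0.77, 0.93, 1.10, 1.42, 1.95]   # made-up illustrative data
    print(mps_objective((1.0, 1.5, 0.75), sample))
```

Because general-purpose optimizers such as `scipy.optimize.minimize` minimize by default, the negative of `mps_objective` would be passed to them, with the starting values chosen by the user.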
6.2 Cramér-von Mises (CVM) method
This method was introduced by MacDonald (1971) [1]. It is based on the minimization of the following function:
(29)
The function defined in eq (29) can be minimized either by a nonlinear optimization method or by solving the following nonlinear equations with respect to the parameters:
where
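The CVM criterion can be written as an objective function in the same way. The sketch below assumes the usual form 1/(12n) plus the sum over i of (F(x_(i)) - (2i - 1)/(2n))^2, with the assumed GAPT c.d.f. F(x) = a G(x) / a^G(x) and a Weibull baseline of c.d.f. 1 - exp(-(x/scale)^shape). The function names and the parameter order `(scale, shape, a)` are illustrative.

```python
import math


def gaptw_cdf(x, scale, shape, a):
    """Assumed GAPTW c.d.f. with a Weibull baseline."""
    if x <= 0.0:
        return 0.0
    G = -math.expm1(-((x / scale) ** shape))
    return G * a ** (1.0 - G)


def cvm_objective(params, data):
    """Cramér-von Mises distance to be minimized; +inf outside the parameter space."""
    scale, shape, a = params
    if scale <= 0.0 or shape <= 0.0 or not (0.0 < a < math.e):
        return math.inf
    xs = sorted(data)
    n = len(xs)
    total = 1.0 / (12.0 * n)
    for i, x in enumerate(xs, start=1):
        # Compare the fitted c.d.f. with the plotting position (2i - 1) / (2n).
        total += (gaptw_cdf(x, scale, shape, a) - (2.0 * i - 1.0) / (2.0 * n)) ** 2
    return total


if __name__ == "__main__":
    sample = [0.21, 0.48, 0.77, 0.93, 1.10, 1.42, 1.95]   # made-up illustrative data
    print(cvm_objective((1.0, 1.5, 0.75), sample))
```

The smallest value the function can take is 1/(12n), reached only when every fitted c.d.f. value coincides with its plotting position.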
7- Application
This section illustrates the flexibility and efficiency of the GAPTW distribution on two real data sets. The first contains the remission times of 128 bladder cancer patients, and the second contains 40 times to failure of a turbocharger.
Both data sets are taken from Al Sobhi (2022). They were fitted by the GAPTW distribution, the Weibull distribution and another member of the alpha power transformed family (Elbatal et al. (2019)). The comparison was made using the Akaike information criterion (AIC), the negative log-likelihood (-lnL) and the Kolmogorov-Smirnov (K.S) statistic (Ijaz (2020), Ahmad (2021)).
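The comparison criteria are simple to compute once a model has been fitted. The sketch below uses the standard definitions AIC = 2k + 2(-lnL), where k is the number of estimated parameters, and the one-sample Kolmogorov-Smirnov distance between the empirical and fitted c.d.f. The function names are illustrative, and the parameter counts used in the example (2 for the Weibull distribution, 3 for the GAPTW distribution) are an assumption.

```python
def aic(neg_log_lik, k):
    """Akaike information criterion from the negative log-likelihood and k parameters."""
    return 2.0 * k + 2.0 * neg_log_lik


def ks_statistic(data, cdf):
    """Largest gap between the empirical c.d.f. and a fitted c.d.f. (a callable)."""
    xs = sorted(data)
    n = len(xs)
    d = 0.0
    for i, x in enumerate(xs, start=1):
        f = cdf(x)
        # The empirical c.d.f. jumps from (i - 1) / n to i / n at x.
        d = max(d, i / n - f, f - (i - 1) / n)
    return d


if __name__ == "__main__":
    # -lnL values reported for the cancer data, with assumed parameter counts.
    print(aic(414.0128, 2))   # Weibull
    print(aic(389.5391, 3))   # GAPTW
```

Smaller values of all three criteria indicate a better fit.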
Table (1): Goodness-of-fit criteria for the cancer data
| Distribution | | AIC | -lnL | K.S |
| Weibull | | 832.0256 | 414.0128 | 0.897625 |
| ELAPTW | | 834.7151 | 414.3576 | 0.896625 |
| GAPTW | | 785.0783 | 389.5391 | 0.8936625 |
Table (2): Goodness-of-fit criteria for the time to failure data
| Distribution | | AIC | -lnL | K.S |
| Weibull | | 170.1854 | 83.09271 | 0.7035 |
| ELAPTW | | 183.8567 | 88.92833 | 0.6591 |
| GAPTW | | 157.9303 | 75.96514 | 0.6559 |
It is seen from tables (1) and (2) that the GAPTW distribution gives the best fit to the two real data sets. It also gives a better fit than the odd-logistic-Lindley Weibull distribution proposed by Al Sobhi (2022).
Graph (1): The fitted p.d.f., c.d.f. and survival function for the cancer data
Graph (2): The fitted p.d.f., c.d.f. and survival function for the time to failure data
It is seen from the two graphs above that all plots support the results in tables (1) and (2). The parameters of the best-fitting distribution for the two real data sets were estimated by the MPS and CVM methods, and the two methods were compared using the mean square error (MSE) criterion.
Table (2): MPS and CVM estimators of the GAPTW distribution
| Data | MPS | | | CVM | | |
| | | | a | | | a |
| Cancer | 1.22055 | 0.50224 | 0.00298 | 1.00000 | 1.00000 | 1.00000 |
| MSE | 0.53274 | | | 0.02700 | | |
| Time to failure | 5.9352103 | 2.7811926 | 0.3546080 | 1.00000 | 1.00000 | 1.10000 |