Wald type and Phi-divergence based test-statistics for isotonic binomial proportions

Martin, N.1, Mata, R.2 and Pardo, L.2
1 Dep. Statistics, Carlos III University of Madrid, 28903 Getafe (Madrid), Spain
2 Dep. Statistics and O.R., Complutense University of Madrid, 28040 Madrid, Spain

February 28, 2014

Abstract

In this paper new test-statistics are introduced and studied for the important problem of testing hypotheses that involve inequality constraints on proportions when the sample comes from independent binomial random variables: Wald type and phi-divergence based test-statistics. The classical likelihood ratio test is obtained as a particular case of the phi-divergence based test-statistics. An illustrative example is given, and the performance of all of them for small and moderate sample sizes is analyzed in an extensive simulation study.

Keywords and phrases: Wald-type statistics, Phi-divergence statistics, Inequality constraints, Log-linear model, Logistic regression.

1 Introduction

Ordinal categorical data appear frequently in the biomedical research literature, for example in the analysis of I independent binary random variables related to an increasing ordered categorical variable. It is well known that for such data it is not possible to use the classical test-statistics, such as the chi-square or the likelihood ratio test-statistic, with a chi-squared asymptotic distribution, but there exist appropriate order-restricted test-statistics with a chi-bar-squared asymptotic distribution. To illustrate this problem, a modification of an example given in Silvapulle and Sen (2005) is considered in this introductory section.

Table 1 contains a subset of data from a prospective study of maternal drinking and congenital malformations. Women completed a questionnaire, early in their pregnancy, concerning alcohol use in the first trimester; complete data and details are available elsewhere (Graubard and Korn, 1987).

i    ni       ni1    ni2
1    17114    48     17066
2    14502    38     14464
3    793      5      788
4    165      2      163

Table 1: Number of individuals with (j = 1) and without (j = 2) congenital sex-organ malformation, cross-classified according to the maternal alcohol consumption level (i = 1, 2, 3, 4).

Specifically, women were asked what was the amount of alcohol taken during the first three months of their pregnancy, and four categories of drink doses are considered (I = 4): no alcohol consumption (i = 1), an average number of alcoholic drinks per day less than one but greater than zero (i = 2), one or more and less than three alcoholic drinks per day (i = 3), and three or more alcoholic drinks per day (i = 4). In terms of a binary random variable with ni individuals in total (see the second column in Table 1), behaving independently with respect to having congenital malformations, the individuals not having congenital malformations are considered to be unsuccessful (j = 2, see the last column in Table 1) and successful otherwise (j = 1, see the third column in Table 1). Let πi be the probability of a success associated with the i-th alcohol dose. Let us consider some statistical inference questions that may arise in this example and in similar ones with binomial probabilities.

1. Is there any evidence of maternal alcohol consumption being related to malformation of the sex organs? To answer this question, the null and alternative hypotheses may be formulated as
H0 : π1 = π2 = π3 = π4   vs.   H1 : π1, π2, π3, π4 are not all equal,
respectively.
However, this formulation is unlikely to be appropriate, because the main issue of interest is the possible increase in the probability of malformation as alcohol consumption increases.

2. Is there any evidence that an increase in maternal alcohol consumption is associated with an increase in the probability of malformation? This question, as it stands, is too broad to allow a precise formulation of the null and the alternative hypotheses. One possibility is to formulate the problem in the following way,
H0 : π1 = π2 = π3 = π4   vs.   H1 : π1 ≤ π2 ≤ π3 ≤ π4 with at least one inequality being strict. (1)

Consider an experiment with I increasing ordinal categories for a variable X. Suppose that ni prefixed individuals are assigned to the i-th category and n = Σ_{i=1}^{I} ni. The individuals are followed over time for the development of an event of interest Y, and the events related to different individuals are independent. Let Ni1 be the random variable that represents the number of individuals related to successful events (Y = 1) out of the total assigned to the i-th category, ni, i = 1, ..., I. If we denote by πi = Pr(Y = 1 | X = i) the probability of a success associated with the i-th category, we have that Ni1 is a binomial random variable with parameters ni and πi, i = 1, ..., I. Let Ni2 denote the number of unsuccessful events associated with the i-th category, i.e. Ni2 = ni − Ni1; then the contingency table of a realization of (Ni1, Ni2), i = 1, ..., I, is given in the last two columns of the following table

n1   n11   n12 = n1 − n11
...  ...   ...
ni   ni1   ni2 = ni − ni1
...  ...   ...
nI   nI1   nI2 = nI − nI1 .

Our purpose in this paper is to propose new order-restricted test-statistics, Wald type and phi-divergence based test-statistics, for testing
H0 : π1 = π2 = · · · = πI, (2)
H1 : π1 ≤ π2 ≤ · · · ≤ πI with at least one inequality being strict.
The classical likelihood ratio test-statistic will appear as a particular case of the phi-divergence based test-statistics. A log-linear formulation of (1) is proposed in Section 2, which is fundamental for defining the Wald type test-statistics. In Section 3 the families of phi-divergence test-statistics are presented. Section 4 is devoted to solving, for an illustrative example, the problem presented in this Section 1. An extensive simulation study is carried out in Section 5.

2 Formulation for isotonic binomial proportions in terms of log-linear and logistic regression modeling: Wald type test-statistics

Reparametrizing the initial problem through log-linear modeling, the formulation of the null hypothesis is strongly simplified, since all the interaction parameters are zero under the null hypothesis; this is appealing, in particular, for creating Wald type test-statistics. Let
p = p(θ) = (p11(θ), p12(θ), p21(θ), p22(θ), ..., pI1(θ), pI2(θ))^T
= ((n1/n)π1, (n1/n)(1 − π1), (n2/n)π2, (n2/n)(1 − π2), ..., (nI/n)πI, (nI/n)(1 − πI))^T (3)
be the probability vector of the following saturated log-linear model
log pij(θ) = u + u1(i) + θ2(j) + θ12(ij), (4)
with
u1(I) = 0,  θ2(2) = 0,  θ12(i2) = 0, i = 1, ..., I − 1,  θ12(Ij) = 0, j = 1, 2, (5)
being the identifiability constraints,
θ = (θ2(1), θ12(11), ..., θ12(I−1,1))^T (6)
the vector of unknown parameters, and u = (u, u1(1), ..., u1(I−1))^T, with
u = u(θ) = log[ (nI/n) / (1 + exp{θ2(1)}) ], (7)
u1(i) = u1(i)(θ) = log[ (ni/nI) (1 + exp{θ2(1)}) / (1 + exp{θ2(1) + θ12(i1)}) ], i = 1, ..., I − 1, (8)
the redundant parameters, obtained from θ by taking into account that pi1(θ) + pi2(θ) = ni/n, i = 1, ..., I.
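The reparametrization above can be checked numerically. The following Python sketch (ours, not part of the original paper; the function name cell_probabilities and the numerical values are illustrative assumptions) computes the cell probabilities pij(θ) of (3)-(4) from a given θ through the redundant parameters (7)-(8), and verifies that they coincide with (ni/n)πi and (ni/n)(1 − πi).

import numpy as np

def cell_probabilities(theta, n_i):
    """Cell probabilities p_ij(theta) of the saturated log-linear model (4),
    built through the redundant parameters u and u_1(i) of (7)-(8)."""
    n_i = np.asarray(n_i, dtype=float)
    n = n_i.sum()
    theta21 = theta[0]                                  # theta_2(1)
    theta12 = np.append(theta[1:], 0.0)                 # theta_12(i1), with theta_12(I1) = 0
    u = np.log((n_i[-1] / n) / (1.0 + np.exp(theta21)))                     # (7)
    u1 = np.log((n_i / n_i[-1]) * (1.0 + np.exp(theta21))
                / (1.0 + np.exp(theta21 + theta12)))                        # (8), with u_1(I) = 0
    p_i1 = np.exp(u + u1 + theta21 + theta12)           # cells with j = 1
    p_i2 = np.exp(u + u1)                               # cells with j = 2
    return p_i1, p_i2

# illustrative values: I = 4, pi = (0.05, 0.10, 0.125, 0.135), n_i = (40, 30, 20, 10)
pi = np.array([0.05, 0.10, 0.125, 0.135])
n_i = np.array([40.0, 30.0, 20.0, 10.0])
logit_pi = np.log(pi / (1.0 - pi))
# theta_2(1) = logit(pi_I); theta_12(i1) = logit(pi_i) - logit(pi_I)
theta = np.append(logit_pi[-1], logit_pi[:-1] - logit_pi[-1])
p1, p2 = cell_probabilities(theta, n_i)
print(np.allclose(p1, n_i / n_i.sum() * pi))            # True: p_i1(theta) = (n_i/n) pi_i
print(np.allclose(p2, n_i / n_i.sum() * (1.0 - pi)))    # True: p_i2(theta) = (n_i/n)(1 - pi_i)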
In terms of the log-linear formulation, (2) is equivalent to
H0 : θ12(11) = θ12(21) = · · · = θ12(I−1,1) = 0, (9)
H1 : θ12(11) ≤ θ12(21) ≤ · · · ≤ θ12(I−1,1) ≤ 0 with at least one inequality being strict.
Notice that θ2(1) is a nuisance parameter, since it does not interfere in (9). In particular, under the null hypothesis π1 = π2 = · · · = πI = π0, the value of the nuisance parameter is θ2(1) = logit(π0) = log[π0/(1 − π0)], and thus it contains all the information about the homogeneous probability vector.

In matrix notation, we can express the vector of parameters of the log-linear model in terms of the following logistic regression,
logit(π) = Xθ, (10)
where π = (π1, π2, ..., πI)^T,

X = [ 1_{I−1}   I_{I−1}
      1         0_{I−1}^T ],

I_a is the identity matrix of order a, 1_a is the a-vector of ones and 0_a is the a-vector of zeros. Since a saturated model has been considered, X is a full rank matrix and thus we can consider
θ = X^{-1} logit(π), (11)
and, on the other hand, (9) in matrix notation is given by
H0 : Rθ = 0_{I−1}, (12)
H1 : Rθ ≤ 0_{I−1} and Rθ ≠ 0_{I−1},
with R = (0_{I−1}, G_{I−1}), where G_{I−1} is a square matrix of order I − 1 with 1's on the main diagonal and −1's on the upper superdiagonal. We shall consider three parameter spaces for θ,
Θ0 = {θ ∈ R^I : Rθ = 0_{I−1}} ⊂ Θ̃ = {θ ∈ R^I : Rθ ≤ 0_{I−1}} ⊂ Θ = R^I,
i.e. Θ̃ is the restricted parameter space, Θ is the unrestricted parameter space, and Θ0 is the parameter space under the null hypothesis.

It is well known that the Fisher information matrix for θ ∈ Θ in the logistic regression is given by
I_F(θ) = X^T diag{νi πi(1 − πi)}_{i=1}^{I} X, (13)
where νi = lim_{n→∞} ni/n, i = 1, ..., I. The following result provides the explicit expression of the Fisher information matrix under the null hypothesis given in (2) or (9).

Theorem 1 For θ0 ∈ Θ0 in the model (4) or (10), the Fisher information matrix is given by

I_F(θ0) = π0(1 − π0) [ 1       ν1    ν2    · · ·   νI−1
                       ν1      ν1    0     · · ·   0
                       ν2      0     ν2    · · ·   0
                       ...                         ...
                       νI−1    0     0     · · ·   νI−1 ]. (14)

Proof. It is immediate by plugging πi = π0, i = 1, ..., I, into (13).

If θ̂, θ̃ and θ̄ represent the maximum likelihood estimators (MLEs) of θ in the parameter spaces Θ0, Θ̃ and Θ, respectively, according to Silvapulle and Sen (2005, pages 154 and 166) we can consider three Wald type test-statistics,
W(θ̃, θ̂) = n θ̃^T R^T (R I_F^{-1}(θ̂) R^T)^{-1} R θ̃, (15)
H(θ̃, θ̂) = n (θ̃ − θ̂)^T I_F(θ̂) (θ̃ − θ̂), (16)
D(θ̄, θ̃, θ̂) = n (θ̄ − θ̂)^T I_F(θ̂) (θ̄ − θ̂) − n (θ̄ − θ̃)^T I_F(θ̃) (θ̄ − θ̃), (17)
where

I_F(θ̂) = π̂0(1 − π̂0) [ 1     (ν̂*)^T
                        ν̂*    diag(ν̂*) ],

I_F(θ̃) = X^T diag{ν̂i π̃i(1 − π̃i)}_{i=1}^{I} X,
ν̂* = (ν̂1, ..., ν̂I−1)^T = (n1/n, ..., nI−1/n)^T.
These test-statistics have, according to Proposition 4.4.1 in Silvapulle and Sen (2005), the same asymptotic distribution as the likelihood ratio test-statistic.

Proposition 2 Under the null hypothesis given in (2) or (9), the expression of W(θ̃, θ̂) given in (15) is as follows,
W(θ̃, θ̂) = n π̂0(1 − π̂0) (θ̃*)^T Σν̂* θ̃*,
where θ* = (θ12(11), ..., θ12(I−1,1))^T, Σν̂* = diag(ν̂*) − ν̂*(ν̂*)^T, π̂0 = N•1/n and N•1 = N11 + N21 + · · · + NI1.

Proof. From Theorem 1, the 2 × 2 block structure of the estimated Fisher information matrix and of its inverse is

I_F(θ̂) = π̂0(1 − π̂0) [ 1     (ν̂*)^T
                        ν̂*    diag(ν̂*) ]
and
I_F^{-1}(θ̂) = (1/(π̂0(1 − π̂0))) [ a    b^T
                                   c    D ],   with D = Σν̂*^{-1}, Σν̂* = diag(ν̂*) − ν̂*(ν̂*)^T.

But
R I_F^{-1}(θ̂) R^T = (1/(π̂0(1 − π̂0))) G_{I−1} Σν̂*^{-1} G_{I−1}^T
and Rθ̃ = G_{I−1} θ̃*. Therefore,
W(θ̃, θ̂) = n θ̃^T R^T (R I_F^{-1}(θ̂) R^T)^{-1} R θ̃ = n π̂0(1 − π̂0) (θ̃*)^T Σν̂* θ̃*.
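To make Proposition 2 concrete, the following Python sketch (ours; the function name wald_W is an illustrative assumption) evaluates W(θ̃, θ̂) through the closed-form expression n π̂0(1 − π̂0)(θ̃*)^T Σν̂* θ̃*, taking the order-restricted estimate π̃ as given (it can be obtained with Algorithm 3 below).

import numpy as np

def wald_W(N_i1, n_i, pi_tilde):
    """Wald type statistic of Proposition 2:
    W = n * pi0_hat * (1 - pi0_hat) * theta_tilde*^T Sigma_{nu_hat*} theta_tilde*."""
    N_i1 = np.asarray(N_i1, dtype=float)
    n_i = np.asarray(n_i, dtype=float)
    pi_tilde = np.asarray(pi_tilde, dtype=float)
    n = n_i.sum()
    pi0_hat = N_i1.sum() / n                        # MLE of the common proportion under H0
    nu_hat = n_i[:-1] / n                           # nu_hat* = (n_1/n, ..., n_{I-1}/n)
    logit = lambda p: np.log(p / (1.0 - p))
    theta_star = logit(pi_tilde[:-1]) - logit(pi_tilde[-1])   # theta~_12(i1), i = 1, ..., I-1
    Sigma = np.diag(nu_hat) - np.outer(nu_hat, nu_hat)        # Sigma_{nu_hat*}
    return n * pi0_hat * (1.0 - pi0_hat) * (theta_star @ Sigma @ theta_star)

# data of Table 1; pi_tilde pools the first two categories (see Section 4)
N_i1 = [48, 38, 5, 2]
n_i = [17114, 14502, 793, 165]
pi_tilde = [43 / 15808, 43 / 15808, 5 / 793, 2 / 165]
print(wald_W(N_i1, n_i, pi_tilde))    # approximately 2.598, cf. W in Table 2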
There is an explicit formula for the MLE of θ under the null hypothesis,
θ̂ = (θ̂2(1), θ̂12(11), ..., θ̂12(I−1,1))^T = (logit(π̂0), 0, ..., 0)^T. (18)
For the calculation of θ̃ = (θ̃2(1), θ̃12(11), ..., θ̃12(I−1,1))^T or θ̄ = (θ̄2(1), θ̄12(11), ..., θ̄12(I−1,1))^T, it is much easier to calculate first the corresponding MLE of the probability vector, π̃ = (π̃1, π̃2, ..., π̃I)^T or π̄ = (π̄1, π̄2, ..., π̄I)^T, and to plug it into (11). There is an explicit formula for the unrestricted MLE,
π̄ = (N11/n1, N21/n2, ..., NI1/nI)^T, (19)
and for calculating π̃ the following pool-adjacent-violators algorithm (PAVA) can be used.

Algorithm 3 (Order-restricted estimation of the probabilities) The MLE of π = (π1, π2, ..., πI)^T under the restriction π1 ≤ π2 ≤ · · · ≤ πI, π̃ = (π̃1, π̃2, ..., π̃I)^T, is calculated in the following way:
STEP 1: Set π̃ := π̄, where π̄ is given in (19).
STEP 2: While π̃i ≤ π̃i+1 does not hold for every i = 1, ..., I − 1, do:
  for i = 1, ..., I − 1, if π̃i > π̃i+1, set π̃i := (ni π̃i + ni+1 π̃i+1)/(ni + ni+1) and π̃i+1 := π̃i.

3 Phi-divergence test statistics

The classical order-restricted likelihood ratio test-statistic for testing (2) is given by
G² = 2 Σ_{i=1}^{I} ( Ni1 log(π̃i/π̂0) + (ni − Ni1) log((1 − π̃i)/(1 − π̂0)) )
(see, for instance, Mancuso et al. (2001)). The Kullback-Leibler divergence measure between two 2I-dimensional probability vectors p = (p11, p12, ..., pI1, pI2)^T and q = (q11, q12, ..., qI1, qI2)^T is given by
dKull(p, q) = Σ_{i=1}^{I} ( pi1 log(pi1/qi1) + pi2 log(pi2/qi2) ).
It is an easy exercise to verify that
G² = 2n ( dKull(p, p(θ̂)) − dKull(p, p(θ̃)) ), (20)
where p denotes the empirical probability vector,
p = p(θ̄) = ((n1/n)π̄1, (n1/n)(1 − π̄1), (n2/n)π̄2, (n2/n)(1 − π̄2), ..., (nI/n)π̄I, (nI/n)(1 − π̄I))^T = (N11/n, N12/n, N21/n, N22/n, ..., NI1/n, NI2/n)^T,
p̃ = p(θ̃) = ((n1/n)π̃1, (n1/n)(1 − π̃1), (n2/n)π̃2, (n2/n)(1 − π̃2), ..., (nI/n)π̃I, (nI/n)(1 − π̃I))^T,
p̂ = p(θ̂) = ((n1/n)π̂0, (n1/n)(1 − π̂0), (n2/n)π̂0, (n2/n)(1 − π̂0), ..., (nI/n)π̂0, (nI/n)(1 − π̂0))^T.
The classical order-restricted chi-square test-statistic for testing (2), known as Bartholomew's test-statistic, is given by
X² = (1/(π̂0(1 − π̂0))) Σ_{i=1}^{I} ni (π̃i − π̂0)², (21)
which can be written as
X² = 2n dPearson(p(θ̃), p(θ̂)), (22)
where dPearson(p, q) is the Pearson divergence measure, defined by
dPearson(p, q) = (1/2) Σ_{i=1}^{I} ( (pi1 − qi1)²/qi1 + (pi2 − qi2)²/qi2 ).
Details about this test-statistic can be found in Fleiss et al. (2003, Section 9.3). More general than the Kullback-Leibler and Pearson divergence measures are the φ-divergence measures, defined as
dφ(p, q) = Σ_{i=1}^{I} ( qi1 φ(pi1/qi1) + qi2 φ(pi2/qi2) ),
where φ : R+ → R is a convex function such that φ(1) = φ′(1) = 0, φ′′(1) > 0, 0φ(0/0) = 0 and 0φ(p/0) = p lim_{u→∞} φ(u)/u for p ≠ 0. For more details about φ-divergence measures see Pardo (2006).

Based on φ-divergence measures, we shall consider in this paper two families of order-restricted φ-divergence test-statistics valid for testing (2) or (9). The first one generalizes the order-restricted likelihood ratio test given in (20), in the sense that the Kullback-Leibler divergence measure is replaced by a φ-divergence measure, and its expression is
Tφ(p, p(θ̃), p(θ̂)) = (2n/φ′′(1)) ( dφ(p, p(θ̂)) − dφ(p, p(θ̃)) ) (23)
= (2/φ′′(1)) Σ_{i=1}^{I} ni ( π̂0 φ(Ni1/(ni π̂0)) − π̃i φ(Ni1/(ni π̃i)) + (1 − π̂0) φ((ni − Ni1)/(ni(1 − π̂0))) − (1 − π̃i) φ((ni − Ni1)/(ni(1 − π̃i))) ).
For φ(x) = x log x − x + 1, the likelihood ratio test-statistic G² is obtained.
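As a computational companion to Algorithm 3 and to the family (23), the following Python sketch (ours; the names pava and T_phi are illustrative) obtains π̃ with a standard weighted pool-adjacent-violators pass that merges violating blocks, and then evaluates Tφ(p, p(θ̃), p(θ̂)); with φ(x) = x log x − x + 1 it reproduces the likelihood ratio test-statistic G².

import numpy as np

def pava(pi_bar, n_i):
    """Weighted pool-adjacent-violators: non-decreasing estimate of pi
    (a standard block-merging implementation of the pooling step of Algorithm 3)."""
    vals = [float(v) for v in pi_bar]
    wts = [float(w) for w in n_i]
    counts = [1] * len(vals)
    i = 0
    while i < len(vals) - 1:
        if vals[i] > vals[i + 1]:                       # adjacent violators: pool them
            tot = wts[i] + wts[i + 1]
            vals[i] = (wts[i] * vals[i] + wts[i + 1] * vals[i + 1]) / tot
            wts[i] = tot
            counts[i] += counts[i + 1]
            del vals[i + 1], wts[i + 1], counts[i + 1]
            i = max(i - 1, 0)                           # pooling may create a new violation
        else:
            i += 1
    return np.repeat(vals, counts)

def T_phi(N_i1, n_i, phi, ddphi1=1.0):
    """Order-restricted phi-divergence test-statistic (23); ddphi1 = phi''(1)."""
    N_i1 = np.asarray(N_i1, dtype=float)
    n_i = np.asarray(n_i, dtype=float)
    pi_tilde = pava(N_i1 / n_i, n_i)                    # order-restricted MLE (Algorithm 3)
    pi0 = N_i1.sum() / n_i.sum()                        # MLE under H0
    s1, s2 = N_i1 / n_i, (n_i - N_i1) / n_i             # observed success/failure proportions
    term = (pi0 * phi(s1 / pi0) - pi_tilde * phi(s1 / pi_tilde)
            + (1 - pi0) * phi(s2 / (1 - pi0)) - (1 - pi_tilde) * phi(s2 / (1 - pi_tilde)))
    return 2.0 / ddphi1 * np.sum(n_i * term)

# phi of the Kullback-Leibler divergence, phi''(1) = 1: T_phi equals G^2
phi_kl = lambda x: x * np.log(x) - x + 1.0
print(T_phi([48, 38, 5, 2], [17114, 14502, 793, 165], phi_kl))   # approx. 5.406, cf. Table 2 (lambda = 0)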
The second one generalizes the order-restricted Pearson test-statistic given in (22), in the sense that the Pearson divergence measure is replaced by a φ-divergence measure, and its expression is
Sφ(p(θ̃), p(θ̂)) = (2n/φ′′(1)) dφ(p(θ̃), p(θ̂)) (24)
= (2/φ′′(1)) Σ_{i=1}^{I} ni ( π̂0 φ(π̃i/π̂0) + (1 − π̂0) φ((1 − π̃i)/(1 − π̂0)) ).
For φ(x) = (1/2)(x − 1)², the Pearson test-statistic X² is obtained.

The following theorem provides the link between both families of test-statistics, Tφ(p, p(θ̃), p(θ̂)) and Sφ(p(θ̃), p(θ̂)), and the Wald type test-statistics.

Theorem 4 For testing (2) or (9), the asymptotic distribution of
T ∈ {Tφ(p, p(θ̃), p(θ̂)), Sφ(p(θ̃), p(θ̂)), W(θ̃, θ̂), H(θ̃, θ̂), D(θ̄, θ̃, θ̂)}
is common and is given by
lim_{n→∞} Pr(T ≤ x) = Σ_{i=0}^{I−1} wi(I − 1, V) Pr(χ²_i ≤ x),
where
V = G_{I−1} diag^{-1}(ν*) G_{I−1}^T + (1/νI) e_{I−1} e_{I−1}^T, (25)
e_{I−1} is the (I − 1)-th unit vector of R^{I−1}, and {wi(I − 1, V)}_{i=0}^{I−1} is the set of weights such that Σ_{i=0}^{I−1} wi(I − 1, V) = 1, whose values are given in Theorem 5.

Proof. Let θ̂ be the I-dimensional vector given in (18). The second order Taylor expansion of the function dφ(θ) = dφ(p(θ), p(θ̂)) about θ̂ is
dφ(θ) = dφ(θ̂) + (θ − θ̂)^T ∂dφ(θ)/∂θ |_{θ=θ̂} + (1/2)(θ − θ̂)^T ∂²dφ(θ)/∂θ∂θ^T |_{θ=θ̂} (θ − θ̂) + o(‖θ − θ̂‖²), (26)
where
∂dφ(θ)/∂θ |_{θ=θ̂} = 0_I,   ∂²dφ(θ)/∂θ∂θ^T |_{θ=θ̂} = φ′′(1) I_F^{(n1,...,nI)}(θ̂),
and I_F^{(n1,...,nI)}(θ) = X^T diag{(ni/n) πi(1 − πi)}_{i=1}^{I} X. Let θ̄ = X^{-1} logit(π̄), where π̄ is given in (19). In particular, for θ = θ̄ we have
dφ(p(θ̄), p(θ̂)) = (φ′′(1)/2) (θ̄ − θ̂)^T I_F^{(n1,...,nI)}(θ̂) (θ̄ − θ̂) + o(‖θ̄ − θ̂‖²),
and for θ = θ̃,
dφ(p(θ̃), p(θ̂)) = (φ′′(1)/2) (θ̃ − θ̂)^T I_F^{(n1,...,nI)}(θ̂) (θ̃ − θ̂) + o(‖θ̃ − θ̂‖²).
In a similar way, expanding dφ(p(θ), p(θ̃)) about θ̃ and evaluating it at θ = θ̄, it is obtained that
dφ(p(θ̄), p(θ̃)) = (φ′′(1)/2) (θ̄ − θ̃)^T I_F^{(n1,...,nI)}(θ̃) (θ̄ − θ̃) + o(‖θ̄ − θ̃‖²).
Then, taking into account that p = p(θ̄) and that lim_{n→∞} I_F^{(n1,...,nI)}(θ) = I_F(θ),
Tφ(p, p(θ̃), p(θ̂)) = (2n/φ′′(1)) ( dφ(p, p(θ̂)) − dφ(p, p(θ̃)) )
= n (θ̄ − θ̂)^T I_F(θ̂) (θ̄ − θ̂) − n (θ̄ − θ̃)^T I_F(θ̃) (θ̄ − θ̃) + oP(1)
= D(θ̄, θ̃, θ̂) + oP(1),
and
Sφ(p(θ̃), p(θ̂)) = (2n/φ′′(1)) dφ(p(θ̃), p(θ̂)) = n (θ̃ − θ̂)^T I_F(θ̂) (θ̃ − θ̂) + oP(1) = H(θ̃, θ̂) + oP(1).
According to Proposition 4.4.1 in Silvapulle and Sen (2005),
G² = D(θ̄, θ̃, θ̂) + oP(1) = H(θ̃, θ̂) + oP(1) = W(θ̃, θ̂) + oP(1),
which means that the asymptotic distribution of T ∈ {Tφ(p, p(θ̃), p(θ̂)), Sφ(p(θ̃), p(θ̂)), W(θ̃, θ̂), H(θ̃, θ̂), D(θ̄, θ̃, θ̂)} is common. Such a distribution can be established from the likelihood ratio test-statistic used for the problem formulated in (6.13) of Silvapulle and Sen (2005),
lim_{n→∞} Pr(G² ≤ x) = Σ_{i=0}^{I−1} wi(I − 1, V) Pr(χ²_i ≤ x),
where V = B diag^{-1}(ν) B^T, with ν = (ν*^T, νI)^T = (ν1, ..., νI)^T and B = (G_{I−1}, −e_{I−1}). Using the partitioned structure of B and ν, (25) is obtained.

The following result is based on the third way for the computation of the weights given on page 79 of Silvapulle and Sen (2005).

Theorem 5 The set of weights {wi(I − 1, V)}_{i=0}^{I−1} of the asymptotic distribution given in Theorem 4 is computed as follows,
wi(I − 1, V) = Pr( arg min_{ζ ∈ R^{I−1}_+} (Z − ζ)^T V^{-1} (Z − ζ) ∈ R^{I−1}_+(i) ), i = 0, ..., I − 1, (27)
where Z ∼ N_{I−1}(0_{I−1}, V), R^{I−1}_+ = {ζ ∈ R^{I−1} : ζ ≥ 0_{I−1}} and R^{I−1}_+(i) ⊂ R^{I−1}_+ is the subset of vectors with i strictly positive components and I − 1 − i null components. In particular, for I ∈ {2, 3, 4}, (27) has the explicit expressions given in (3.24), (3.25) and (3.26) of Silvapulle and Sen (2005).
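Once the weights of Theorem 5 are available (for instance from the explicit expressions for I ≤ 4, or from Algorithm 6 below), the asymptotic p-value implied by Theorem 4 is a chi-bar-squared tail probability. The following Python sketch (ours) evaluates Pr(T > t) = Σ_{i=0}^{I−1} wi Pr(χ²_i > t), where χ²_0 is a point mass at zero; the weights used in the illustration are those obtained for the data of Section 4.

import numpy as np
from scipy.stats import chi2

def chi_bar_sq_pvalue(t, weights):
    """Asymptotic p-value of Theorem 4: sum_i w_i Pr(chi^2_i > t),
    with chi^2_0 interpreted as a point mass at zero."""
    t = float(t)
    weights = np.asarray(weights, dtype=float)
    pval = weights[0] * (1.0 if t <= 0 else 0.0)        # chi^2_0 component
    for i, w in enumerate(weights[1:], start=1):
        pval += w * chi2.sf(t, df=i)                    # upper tail of chi^2 with i d.f.
    return pval

# weights (w_0, w_1, w_2, w_3) obtained in the example of Section 4 (I = 4)
w = [0.17925, 0.4215, 0.32075, 0.07850]
print(chi_bar_sq_pvalue(2.5979, w))    # approximately 0.169, cf. the p-value of W in Table 2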
For computing (27) it is useful to know the following explicit expression,
V^{-1} = T_{I−1}^T Σν* T_{I−1},
where T_{I−1} = G_{I−1}^{-1} is an upper triangular matrix of 1's and Σν* = diag(ν*) − ν*(ν*)^T, and to simulate the probability according to the following algorithm. For the simulation,
V̂ = G_{I−1} diag^{-1}(ν̂*) G_{I−1}^T + (1/ν̂I) e_{I−1} e_{I−1}^T, (28)
V̂^{-1} = T_{I−1}^T Σν̂* T_{I−1}, (29)
are needed rather than V and V^{-1}, respectively.

Algorithm 6 (Estimation of the weights) The estimators of the weights given in Theorem 5, ŵi = wi(I − 1, V̂), are obtained by Monte Carlo in the following way:
STEP 1: For i = 0, ..., I − 1, set N(i) := 0.
STEP 2: Repeat the following steps R times (say, R = 1,000,000):
STEP 2.1: Generate an observation, z, from Z ∼ N_{I−1}(0_{I−1}, V̂). E.g., the NAG Fortran library subroutines G05CBF, G05EAF and G05EZF can be useful.
STEP 2.2: Compute ζ̂(z) = arg min_{ζ ∈ R^{I−1}_+} (1/2) ζ^T V̂^{-1} ζ − (V̂^{-1} z)^T ζ. E.g., the IMSL Fortran library subroutine DQPROG can be useful.
STEP 2.3: Count i*, the number of strictly positive components of ζ̂(z), and set N(i*) := N(i*) + 1.
STEP 3: Set ŵi := N(i)/R for i = 0, ..., I − 1.

4 Example

In this section the data set of the introduction (Table 1) is analyzed. The sample, a realization of N, is summarized in the following vector,
n = (n11, n12, n21, n22, n31, n32, n41, n42)^T = (48, 17066, 38, 14464, 5, 788, 2, 163)^T.
The estimated vectors of interest are
π̄ = (48/17114, 38/14502, 5/793, 2/165)^T = (0.0028, 0.0026, 0.0063, 0.0121)^T,
π̃ = (43/15808, 43/15808, 5/793, 2/165)^T = (0.0027, 0.0027, 0.0063, 0.0121)^T,
π̂0 = 93/32574 = 0.0029,
and
θ̄ = X^{-1} logit(π̄) = (−4.4006, −1.4730, −1.5412, −0.65946)^T,
θ̃ = X^{-1} logit(π̃) = (−4.4006, −1.5037, −1.5037, −0.65946)^T,
θ̂ = (logit(π̂0), 0, 0, 0)^T = (−5.8558, 0, 0, 0)^T.
For the asymptotic distribution, the weights can be calculated through

V̂ = [  83774.0156250   −14428.7177734         0
       −14428.7177734    15217.7109375     −788.9927979
            0             −788.9927979     1457.5666504 ],

calculating the correlation coefficients
ρ̂12 = −0.40411, ρ̂13 = 0, ρ̂23 = −0.16753,
the partial correlation coefficients
ρ̂12•3 = −0.4099, ρ̂13•2 = −0.075072, ρ̂23•1 = −0.18315,
and evaluating the following expressions,
ŵ3 = (1/(4π)) (2π − arccos(ρ̂12) − arccos(ρ̂13) − arccos(ρ̂23)) = 0.07850,
ŵ2 = (1/(4π)) (3π − arccos(ρ̂12•3) − arccos(ρ̂13•2) − arccos(ρ̂23•1)) = 0.32075,
ŵ1 = 0.5 − ŵ3 = 0.5 − 0.07850 = 0.4215,
ŵ0 = 0.5 − ŵ2 = 0.5 − 0.32075 = 0.17925.
If we take φλ(x) = (1/(λ(1 + λ))) (x^{λ+1} − x − λ(x − 1)), for λ ∈ R − {−1, 0}, the "power divergence family" is obtained,
dλ(p, q) = (1/(λ(λ + 1))) ( Σ_{i=1}^{I} Σ_{j=1}^{2} p_{ij}^{λ+1}/q_{ij}^{λ} − 1 ), for each λ ∈ R − {−1, 0}. (30)
It is also possible to cover the whole real line for λ by defining dλ(p, q) = lim_{t→λ} dt(p, q) for λ ∈ {−1, 0}. It is well known that d0(p, q) = dKull(p, q) and d1(p, q) = dPearson(p, q). This is very interesting, since it means that the power divergence based family of test-statistics contains G² and X² as special cases. Finally, the values of the test-statistics are summarized in Table 2. It can be seen that the null hypothesis cannot be rejected for W(θ̃, θ̂), H(θ̃, θ̂), D(θ̄, θ̃, θ̂), T−1.5(p, p(θ̃), p(θ̂)), T−1(p, p(θ̃), p(θ̂)), T−0.5(p, p(θ̃), p(θ̂)), S−1.5(p(θ̃), p(θ̂)), S−1(p(θ̃), p(θ̂)), S−0.5(p(θ̃), p(θ̂)), and should be rejected for T0(p, p(θ̃), p(θ̂)), T2/3(p, p(θ̃), p(θ̂)), T1(p, p(θ̃), p(θ̂)), S0(p(θ̃), p(θ̂)), S2/3(p(θ̃), p(θ̂)), S1(p(θ̃), p(θ̂)).
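For readers without access to the NAG and IMSL routines mentioned in Algorithm 6, the following Python sketch (ours) performs the same Monte Carlo estimation of the weights; the quadratic program of STEP 2.2 is solved as a non-negative least squares problem after a Cholesky factorization of V̂^{-1}. Since the weights depend on V̂ only through its correlation structure, any positive rescaling of V̂ leads to the same result, and with the data of this example the sketch reproduces, up to Monte Carlo error, the weights (ŵ0, ŵ1, ŵ2, ŵ3) ≈ (0.179, 0.421, 0.321, 0.079) obtained above.

import numpy as np
from scipy.optimize import nnls

def estimate_weights(V, R=100_000, seed=1, tol=1e-10):
    """Monte Carlo estimation of the chi-bar-squared weights (Algorithm 6).
    STEP 2.2: with V^{-1} = L L^T (Cholesky), the quadratic program becomes the
    non-negative least squares problem min_{zeta >= 0} ||L^T z - L^T zeta||^2."""
    V = np.asarray(V, dtype=float)
    d = V.shape[0]                                       # d = I - 1
    A = np.linalg.cholesky(np.linalg.inv(V)).T
    rng = np.random.default_rng(seed)
    Z = rng.multivariate_normal(np.zeros(d), V, size=R)  # STEP 2.1
    counts = np.zeros(d + 1)
    for z in Z:
        zeta, _ = nnls(A, A @ z)                         # STEP 2.2
        counts[np.sum(zeta > tol)] += 1                  # STEP 2.3
    return counts / R                                    # STEP 3: (w_0, ..., w_{I-1})

# V_hat of (28) for the data of Table 1, n_i = (17114, 14502, 793, 165)
n_i = np.array([17114.0, 14502.0, 793.0, 165.0])
nu = n_i / n_i.sum()
G = np.eye(3) - np.diag(np.ones(2), k=1)                 # G_{I-1}: 1's on the diagonal, -1's above
e = np.array([0.0, 0.0, 1.0])                            # e_{I-1}
V_hat = G @ np.diag(1.0 / nu[:-1]) @ G.T + np.outer(e, e) / nu[-1]
print(estimate_weights(V_hat))    # approximately (0.179, 0.421, 0.321, 0.079)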
Even though the sample size seems to be large enough, this is a case with small values of π1, π2, π3, π4, for which the asymptotic p-values used to make decisions are known not to be very reliable. In the simulation study we shall study such a case and, according to its results, the rejection of the null hypothesis is supported, since {Tλ(p, p(θ̃), p(θ̂))}λ∈{0, 2/3, 1} turn out to be the most reliable test-statistics. In conclusion, an increase in maternal alcohol consumption is associated with an increase in the probability of malformation.

test-statistic         λ = −1.5   λ = −1    λ = −0.5   λ = 0     λ = 2/3   λ = 1
Tλ(p, p(θ̃), p(θ̂))     3.3068     3.8173    4.4920     5.4057    7.2076    8.4895
p-value(Tλ)            0.1177     0.0911    0.0650     0.0413    0.0169    0.0090
Sλ(p(θ̃), p(θ̂))         3.2993     3.8124    4.4896     5.4057    7.2107    8.4942
p-value(Sλ)            0.1181     0.0913    0.0651     0.0413    0.0169    0.0090

W(θ̃, θ̂) = 2.5979            H(θ̃, θ̂) = 2.6363            D(θ̄, θ̃, θ̂) = 2.6462
p-value(W(θ̃, θ̂)) = 0.1686    p-value(H(θ̃, θ̂)) = 0.1653   p-value(D(θ̄, θ̃, θ̂)) = 0.1645

Table 2: Power divergence based test-statistics, Wald type test-statistics and their corresponding asymptotic p-values.

5 Simulation study

For testing (1), by considering I = 4 binomial random variables, the following scenarios will be considered:

• scenario A (small/big proportions): n1 = 40, n2 = 30, n3 = 20, n4 = 10.
  – scenario A-0: π1 = π2 = π3 = π4 = 0.05.
  – scenario A-1: π1 = 0.05, π2 = π3 = π4 = 0.1.
  – scenario A-2: π1 = 0.05, π2 = 0.1, π3 = π4 = 0.125.
  – scenario A-3: π1 = 0.05, π2 = 0.1, π3 = 0.125, π4 = 0.135.
• scenario B (small/big proportions): n1 = 60, n2 = 45, n3 = 30, n4 = 15.
  – scenario B-0: π1 = π2 = π3 = π4 = 0.05.
  – scenario B-1: π1 = 0.05, π2 = π3 = π4 = 0.1.
  – scenario B-2: π1 = 0.05, π2 = 0.1, π3 = π4 = 0.125.
  – scenario B-3: π1 = 0.05, π2 = 0.1, π3 = 0.125, π4 = 0.135.
• scenario C (small/big proportions): n1 = 100, n2 = 75, n3 = 50, n4 = 25.
  – scenario C-0: π1 = π2 = π3 = π4 = 0.05.
  – scenario C-1: π1 = 0.05, π2 = π3 = π4 = 0.1.
  – scenario C-2: π1 = 0.05, π2 = 0.1, π3 = π4 = 0.125.
  – scenario C-3: π1 = 0.05, π2 = 0.1, π3 = 0.125, π4 = 0.135.
• scenario D (intermediate proportions): n1 = 40, n2 = 30, n3 = 20, n4 = 10.
  – scenario D-0: π1 = π2 = π3 = π4 = 0.35.
  – scenario D-1: π1 = 0.35, π2 = π3 = π4 = 0.45.
  – scenario D-2: π1 = 0.35, π2 = 0.45, π3 = π4 = 0.475.
  – scenario D-3: π1 = 0.35, π2 = 0.45, π3 = 0.475, π4 = 0.485.
• scenario E (intermediate proportions): n1 = 60, n2 = 45, n3 = 30, n4 = 15.
  – scenario E-0: π1 = π2 = π3 = π4 = 0.35.
  – scenario E-1: π1 = 0.35, π2 = π3 = π4 = 0.45.
  – scenario E-2: π1 = 0.35, π2 = 0.45, π3 = π4 = 0.475.
  – scenario E-3: π1 = 0.35, π2 = 0.45, π3 = 0.475, π4 = 0.485.
• scenario F (intermediate proportions): n1 = 100, n2 = 75, n3 = 50, n4 = 25.
  – scenario F-0: π1 = π2 = π3 = π4 = 0.35.
  – scenario F-1: π1 = 0.35, π2 = π3 = π4 = 0.45.
  – scenario F-2: π1 = 0.35, π2 = 0.45, π3 = π4 = 0.475.
  – scenario F-3: π1 = 0.35, π2 = 0.45, π3 = 0.475, π4 = 0.485.

The simulation experiment is performed with R = 50,000 replications and, in each of them, apart from the Wald type test-statistics T^(h) ∈ {W^(h)(θ̃, θ̂), H^(h)(θ̃, θ̂), D^(h)(θ̄, θ̃, θ̂)}, h = 1, ..., R, all the power divergence test-statistics T^(h) ∈ {T^(h)_λ(p, p(θ̃), p(θ̂)), S^(h)_λ(p(θ̃), p(θ̂))}, h = 1, ..., R, with λ in the interval (−1.5, 3), are considered. From the p-values it is possible to calculate the proportion of replications rejected according to the nominal size α = 0.05, i.e.
(1/R) Σ_{h=1}^{R} I{p-value(T^(h)) ≤ α}, (31)
where I{•} represents the indicator function.
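A minimal sketch of how (31) can be approximated for one of the test-statistics is given below (ours; it covers only T0 = G², uses fewer replications than the 50,000 of the study, and reuses the functions pava, chi_bar_sq_pvalue and estimate_weights introduced in the sketches of the previous sections). The estimated size or power is the proportion of replications whose asymptotic p-value does not exceed α = 0.05.

import numpy as np
# reuses pava(...), chi_bar_sq_pvalue(...) and estimate_weights(...) from the previous sketches

def g2_statistic(N_i1, n_i):
    """Order-restricted likelihood ratio statistic G^2 of Section 3."""
    N_i1 = np.asarray(N_i1, dtype=float)
    n_i = np.asarray(n_i, dtype=float)
    pi_tilde = pava(N_i1 / n_i, n_i)
    pi0 = N_i1.sum() / n_i.sum()
    with np.errstate(divide="ignore", invalid="ignore"):
        t1 = np.where(N_i1 > 0, N_i1 * np.log(pi_tilde / pi0), 0.0)
        t2 = np.where(n_i - N_i1 > 0, (n_i - N_i1) * np.log((1.0 - pi_tilde) / (1.0 - pi0)), 0.0)
    return 2.0 * np.sum(t1 + t2)

def simulated_rejection_rate(pi, n_i, R=10_000, alpha=0.05, seed=1):
    """Proportion (31) of replications with asymptotic p-value <= alpha (the paper uses R = 50,000)."""
    pi = np.asarray(pi, dtype=float)
    n_i = np.asarray(n_i, dtype=int)
    d = len(n_i) - 1
    nu = n_i / n_i.sum()
    G = np.eye(d) - np.diag(np.ones(d - 1), k=1)        # G_{I-1}
    e = np.zeros(d)
    e[-1] = 1.0                                         # e_{I-1}
    V_hat = G @ np.diag(1.0 / nu[:-1]) @ G.T + np.outer(e, e) / nu[-1]
    w = estimate_weights(V_hat)                         # weights of Theorems 4 and 5
    rng = np.random.default_rng(seed)
    reject = 0
    for _ in range(R):
        N_i1 = rng.binomial(n_i, pi)                    # independent binomial observations
        if chi_bar_sq_pvalue(g2_statistic(N_i1, n_i), w) <= alpha:
            reject += 1
    return reject / R

# scenario A-0 (simulated size) and scenario A-1 (simulated power) for T_0 = G^2
print(simulated_rejection_rate([0.05, 0.05, 0.05, 0.05], [40, 30, 20, 10]))
print(simulated_rejection_rate([0.05, 0.10, 0.10, 0.10], [40, 30, 20, 10]))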
The scenarios ending in 0 correspond to the null hypothesis being true, and are useful to obtain the simulated significance levels, α̂T, T ∈ {W(θ̃, θ̂), H(θ̃, θ̂), D(θ̄, θ̃, θ̂), Tλ(p, p(θ̃), p(θ̂)), Sλ(p(θ̃), p(θ̂))}, with different sample sizes and kinds of test-statistics. The scenarios ending in either 1, 2 or 3 correspond to the null hypothesis being false, and are useful to obtain the simulated powers, β̂T, with different alternatives, sample sizes and types of test-statistics. For calculating both α̂T and β̂T, (31) is applied, each one in the corresponding scenario.

In Figures 1 and 2, α̂T and β̂T for all the aforementioned test-statistics are plotted in the different scenarios. The curves represent either the Tλ(p, p(θ̃), p(θ̂)) or the Sλ(p(θ̃), p(θ̂)) power divergence test-statistics, located respectively on the left or on the right of the panel of plots. The asterisk, square and circle symbols represent W(θ̃, θ̂), H(θ̃, θ̂) and D(θ̄, θ̃, θ̂), respectively, and all of them are repeated on the left as well as on the right in order to make their comparison with Tλ(p, p(θ̃), p(θ̂)) or Sλ(p(θ̃), p(θ̂)) easier. The black lines and symbols, representing α̂T, are useful to select the test-statistics close to the nominal level. According to the criterion given by Dale (1986), a reasonable exact significance level should verify
|logit(1 − α̂T) − logit(1 − α)| ≤ ε
for ε ∈ {0.35, 0.7}, the exact significance levels verifying the inequality with ε = 0.35 being "close to the nominal level" and those verifying it with ε = 0.7 being "fairly close to the nominal level". In this study only the test-statistics satisfying the condition with ε = 0.35 are considered, and the corresponding upper and lower bounds are plotted as two horizontal lines, with the line associated with the nominal level, α = 0.05, in the middle. Among the test-statistics with simulated significance levels close to the nominal level, the test-statistics with higher powers should be selected; but since in general high powers correspond to high significance levels, this choice is not straightforward. For this reason, taking β̂T0 − α̂T0 or β̂S1 − α̂S1 as baseline, the efficiencies relative to the likelihood ratio test (T0(p, p(θ̃), p(θ̂)) = G²),
ρT = ((β̂T − α̂T) − (β̂T0 − α̂T0)) / (β̂T0 − α̂T0), (32)
or to Bartholomew's test (S1(p(θ̃), p(θ̂)) = X²),
ρ*T = ((β̂T − α̂T) − (β̂S1 − α̂S1)) / (β̂S1 − α̂S1), (33)
are considered. In Table 3, the efficiency of S1 = X² is compared with respect to T0 = G²: if ρS1 < 0, T0 = G² is better than S1 = X² and the plot of the efficiencies in Figures 3 and 4 focuses only on (32); similarly, if ρS1 > 0, T0 = G² is worse than S1 = X² and the plot of the efficiencies in Figures 3 and 4 focuses only on (33).

        sc A       sc B       sc C       sc D       sc E       sc F
1       ρS1 < 0    ρS1 < 0    ρS1 < 0    ρS1 < 0    ρS1 < 0    ρS1 < 0
2       ρS1 < 0    ρS1 > 0    ρS1 < 0    ρS1 < 0    ρS1 < 0    ρS1 < 0
3       ρS1 < 0    ρS1 > 0    ρS1 < 0    ρS1 < 0    ρS1 < 0    ρS1 < 0

Table 3: Efficiency of Bartholomew's test with respect to the likelihood ratio test.

In view of the plots, it is possible to propose test-statistics with better performance in comparison with G² and X². From Figures 1 and 3, the so-called Cressie-Read test-statistic, T2/3(p, p(θ̃), p(θ̂)), can be recommended for small/big proportions, either for small or moderate sample sizes.
On the other hand, W(θ̃, θ̂), H(θ̃, θ̂) and the test-statistic based on the Hellinger distance, T−0.5(p, p(θ̃), p(θ̂)), can be recommended for intermediate proportions and moderate sample sizes; however, for small sample sizes the likelihood ratio test-statistic still remains the best one.

Figure 1: Simulated sizes (black) and powers (red, green, blue) for scenarios A, B, C (small/big proportions). Left panels: {Tλ(p, p(θ̃), p(θ̂))}λ∈(−1.5,3); right panels: {Sλ(p(θ̃), p(θ̂))}λ∈(−1.5,3); the symbols W, H, D correspond to W(θ̃, θ̂), H(θ̃, θ̂) and D(θ̄, θ̃, θ̂).

Figure 2: Simulated sizes (black) and powers (red, green, blue) for scenarios D, E, F (intermediate proportions). Same layout as Figure 1.
Figure 3: Efficiencies for scenarios A, B, C (small/big proportions). Left panels: {Tλ(p, p(θ̃), p(θ̂))}λ∈(−1.5,3); right panels: {Sλ(p(θ̃), p(θ̂))}λ∈(−1.5,3); the symbols W, H, D correspond to W(θ̃, θ̂), H(θ̃, θ̂) and D(θ̄, θ̃, θ̂).

Figure 4: Efficiencies for scenarios C, D, E (intermediate proportions). Same layout as Figure 3.

References

[1] Barlow, R. E., Bartholomew, D. J. and Brunk, H. D. (1972). Statistical Inference under Order Restrictions. Wiley.
[2] Dale, J. R. (1986). Asymptotic normality of goodness-of-fit statistics for sparse product multinomials. Journal of the Royal Statistical Society, Series B, 48, 48-59.
[3] Dardanoni, V. and Forcina, A. (1998). A unified approach to likelihood inference on stochastic orderings in a nonparametric context. Journal of the American Statistical Association, 93, 1112-1122.
[4] Fleiss, J. L., Levin, B. and Paik, M. C. (2003). Statistical Methods for Rates and Proportions. Wiley-Interscience.
[5] Graubard, B. I. and Korn, E. L. (1987). Choice of column scores for testing independence in ordered 2 × I contingency tables. Biometrics, 43, 471-476.
[6] Martin, N. and Pardo, L. (2008). New families of estimators and test statistics in log-linear models. Journal of Multivariate Analysis, 99(8), 1590-1609.
[7] Martin, N., Mata, R. and Pardo, L. (2014). Phi-divergence statistics for the likelihood ratio order: an approach based on log-linear models. http://arxiv.org/pdf/1402.5384v1.pdf.
[8] Mancuso, J. Y., Ahn, H. and Chen, J. J. (2001). Order-restricted dose-related trend tests. Statistics in Medicine, 20, 2305-2318.
[9] Pardo, L. (2006). Statistical Inference Based on Divergence Measures. Statistics: Textbooks and Monographs. Chapman & Hall/CRC.
[10] Silvapulle, M. J. and Sen, P. K. (2005). Constrained Statistical Inference: Inequality, Order, and Shape Restrictions. Wiley Series in Probability and Statistics. Wiley-Interscience (John Wiley & Sons).