OrdinalMultinomialModels.jl

OrdinalMultinomialModels.jl provides Julia utilities to fit ordered multinomial models, including proportional odds model and ordered Probit model as special cases.

Installation

This package requires Julia v0.7 or later. The package has not yet been registered and must be installed using the repository location. Start julia and use the ] key to switch to the package manager REPL

(v1.1) pkg> add https://github.com/OpenMendel/OrdinalMultinomialModels.jl
# Machine info for results in this tutorial
versioninfo()
Julia Version 1.1.0
Commit 80516ca202 (2019-01-21 21:24 UTC)
Platform Info:
  OS: macOS (x86_64-apple-darwin14.5.0)
  CPU: Intel(R) Core(TM) i7-6920HQ CPU @ 2.90GHz
  WORD_SIZE: 64
  LIBM: libopenlibm
  LLVM: libLLVM-6.0.1 (ORCJIT, skylake)
Environment:
  JULIA_EDITOR = code
# for use in this tutorial
using OrdinalMultinomialModels, BenchmarkTools, RDatasets

Example data

housing is a data set from R package MASS. The outcome of interest is Sat (satisfication) that takes values Low, Medium, or High. Predictors include Infl (inflation, categorical), Type (housing type, categorical), and Cont (categorical). Freq codes number of observation for each combination of levels.

housing = dataset("MASS", "housing")

<table class="data-frame"><thead><tr><th></th><th>Sat</th><th>Infl</th><th>Type</th><th>Cont</th><th>Freq</th></tr><tr><th></th><th>Categorical…</th><th>Categorical…</th><th>Categorical…</th><th>Categorical…</th><th>Int32</th></tr></thead><tbody><p>72 rows × 5 columns</p><tr><th>1</th><td>Low</td><td>Low</td><td>Tower</td><td>Low</td><td>21</td></tr><tr><th>2</th><td>Medium</td><td>Low</td><td>Tower</td><td>Low</td><td>21</td></tr><tr><th>3</th><td>High</td><td>Low</td><td>Tower</td><td>Low</td><td>28</td></tr><tr><th>4</th><td>Low</td><td>Medium</td><td>Tower</td><td>Low</td><td>34</td></tr><tr><th>5</th><td>Medium</td><td>Medium</td><td>Tower</td><td>Low</td><td>22</td></tr><tr><th>6</th><td>High</td><td>Medium</td><td>Tower</td><td>Low</td><td>36</td></tr><tr><th>7</th><td>Low</td><td>High</td><td>Tower</td><td>Low</td><td>10</td></tr><tr><th>8</th><td>Medium</td><td>High</td><td>Tower</td><td>Low</td><td>11</td></tr><tr><th>9</th><td>High</td><td>High</td><td>Tower</td><td>Low</td><td>36</td></tr><tr><th>10</th><td>Low</td><td>Low</td><td>Apartment</td><td>Low</td><td>61</td></tr><tr><th>11</th><td>Medium</td><td>Low</td><td>Apartment</td><td>Low</td><td>23</td></tr><tr><th>12</th><td>High</td><td>Low</td><td>Apartment</td><td>Low</td><td>17</td></tr><tr><th>13</th><td>Low</td><td>Medium</td><td>Apartment</td><td>Low</td><td>43</td></tr><tr><th>14</th><td>Medium</td><td>Medium</td><td>Apartment</td><td>Low</td><td>35</td></tr><tr><th>15</th><td>High</td><td>Medium</td><td>Apartment</td><td>Low</td><td>40</td></tr><tr><th>16</th><td>Low</td><td>High</td><td>Apartment</td><td>Low</td><td>26</td></tr><tr><th>17</th><td>Medium</td><td>High</td><td>Apartment</td><td>Low</td><td>18</td></tr><tr><th>18</th><td>High</td><td>High</td><td>Apartment</td><td>Low</td><td>54</td></tr><tr><th>19</th><td>Low</td><td>Low</td><td>Atrium</td><td>Low</td><td>13</td></tr><tr><th>20</th><td>Medium</td><td>Low</td><td>Atrium</td><td>Low</td><td>9</td></tr><tr><th>21</th><td>High</td><td>Low</td><td>Atrium</td><td>Low</td><td>10</td></tr><tr><th>22</th><td>Low</td><td>Medium</td><td>Atrium</td><td>Low</td><td>8</td></tr><tr><th>23</th><td>Medium</td><td>Medium</td><td>Atrium</td><td>Low</td><td>8</td></tr><tr><th>24</th><td>High</td><td>Medium</td><td>Atrium</td><td>Low</td><td>12</td></tr><tr><th>25</th><td>Low</td><td>High</td><td>Atrium</td><td>Low</td><td>6</td></tr><tr><th>26</th><td>Medium</td><td>High</td><td>Atrium</td><td>Low</td><td>7</td></tr><tr><th>27</th><td>High</td><td>High</td><td>Atrium</td><td>Low</td><td>9</td></tr><tr><th>28</th><td>Low</td><td>Low</td><td>Terrace</td><td>Low</td><td>18</td></tr><tr><th>29</th><td>Medium</td><td>Low</td><td>Terrace</td><td>Low</td><td>6</td></tr><tr><th>30</th><td>High</td><td>Low</td><td>Terrace</td><td>Low</td><td>7</td></tr><tr><th>&vellip;</th><td>&vellip;</td><td>&vellip;</td><td>&vellip;</td><td>&vellip;</td><td>&vellip;</td></tr></tbody></table>

There are 72 unique combination of levels and the total number of observations is 1,681.

size(housing, 1), sum(housing[:Freq])
(72, 1681)

Syntax

polr is the main function of fitting ordered multinomial model. For documentation, type ?polr at Julia REPL.

OrdinalMultinomialModels.polrFunction
polr(formula, df, link, solver=NLoptSolver(algorithm=:LD_SLSQP, maxeval=4000))
polr(X, y, link, solver=NLoptSolver(algorithm=:LD_SLSQP, maxeval=4000))

Fit ordered multinomial model by maximum likelihood estimation.

Positional arguments

  • formula::Formula: a model formula specifying responses and regressors.
  • df::DataFrame: a dataframe. Response variable should take integer values starting from 1.
  • y::Vector: integer vector taking values in 1,...,J.
  • X::Matrix: n x p covariate matrix excluding intercept.
  • link::GLM.Link: LogitLink() (default), ProbitLink(), CauchitLink(), or CloglogLink().
  • solver: An instance of Ipopt.Optimizer() or NLopt.Optimizer() (default).

Keyword arguments

  • wts::AbstractVector=similar(X, 0): observation weights.

Output

  • dd:PolrModel: a PolrModel type.
source

Fit ordered multinomial models

Proportional odds model

To fit an ordered multinomial model using default link link=LogitLink(), i.e., proportional odds model

house_po = polr(@formula(Sat ~ Infl + Type + Cont), housing, wts = housing[:Freq])
StatsModels.DataFrameRegressionModel{OrdinalMultinomialModel{Int64,Float64,LogitLink},Array{Float64,2}}

Formula: Sat ~ Infl + Type + Cont

Coefficients:
                  Estimate Std.Error  t value Pr(>|t|)
intercept1|2     -0.496141  0.124541 -3.98376   0.0002
intercept2|3      0.690706  0.125212  5.51628    <1e-6
Infl: Medium      0.566392  0.104963  5.39611    <1e-5
Infl: High         1.28881  0.126705  10.1718   <1e-14
Type: Apartment  -0.572352  0.118747 -4.81991    <1e-5
Type: Atrium     -0.366182  0.156766 -2.33586   0.0226
Type: Terrace     -1.09101  0.151514 -7.20075    <1e-9
Cont: High        0.360284 0.0953574  3.77825   0.0003

Since there are $J=3$ categories in Sat, the fitted model has 2 intercept parameters $\theta_1$ and $\theta_2$ that satisfy $\theta_1 \le \theta_2$. $\beta_1, \beta_2$ are regression coefficients for Infl (3 levels), $\beta_3, \beta_4, \beta_5$ for Type (4 levels), and $\beta_6$ for Cont (2 levels).

Deviance (-2 loglikelihood) of the fitted model is

deviance(house_po)
3479.149299072586

Estimated regression coefficients are

coef(house_po)
8-element Array{Float64,1}:
 -0.4961406134145464 
  0.6907056439400198 
  0.5663915864515743 
  1.288810825003427  
 -0.5723518911944654 
 -0.36618227185604263
 -1.0910115747375622 
  0.36028444175284197

with standard errors

stderror(house_po)
8-element Array{Float64,1}:
 0.12454076603722913
 0.1252121074560087 
 0.10496298447216806
 0.1267047666889804 
 0.11874734570694132
 0.1567658347801272 
 0.15151363694373737
 0.09535742939732537

Ordered probit model

To fit an ordered probit model, we use link ProbitLink()

house_op = polr(@formula(Sat ~ Infl + Type + Cont), housing, ProbitLink(), wts = housing[:Freq])
StatsModels.DataFrameRegressionModel{OrdinalMultinomialModel{Int64,Float64,ProbitLink},Array{Float64,2}}

Formula: Sat ~ Infl + Type + Cont

Coefficients:
                  Estimate Std.Error  t value Pr(>|t|)
intercept1|2     -0.299829 0.0761614 -3.93676   0.0002
intercept2|3       0.42672 0.0763991   5.5854    <1e-6
Infl: Medium      0.346423 0.0641796  5.39771    <1e-5
Infl: High        0.782914 0.0762645  10.2658   <1e-14
Type: Apartment  -0.347538 0.0722116 -4.81278    <1e-5
Type: Atrium     -0.217889 0.0955741 -2.27979   0.0260
Type: Terrace    -0.664175 0.0919294 -7.22484    <1e-9
Cont: High        0.222386 0.0581214  3.82624   0.0003
deviance(house_op)
3479.6888425652414

Proportional hazards model

To fit a proportional hazards model, we use CloglogLink()

house_ph = polr(@formula(Sat ~ Infl + Type + Cont), housing, CloglogLink(), wts = housing[:Freq])
StatsModels.DataFrameRegressionModel{OrdinalMultinomialModel{Int64,Float64,CloglogLink},Array{Float64,2}}

Formula: Sat ~ Infl + Type + Cont

Coefficients:
                  Estimate Std.Error  t value Pr(>|t|)
intercept1|2     -0.796225 0.0904739  -8.8006   <1e-11
intercept2|3     0.0553535 0.0866662 0.638697   0.5253
Infl: Medium      0.382045 0.0701219  5.44829    <1e-6
Infl: High        0.915384 0.0924967  9.89639   <1e-13
Type: Apartment  -0.407228 0.0861331 -4.72789    <1e-4
Type: Atrium      -0.28056  0.112949 -2.48396   0.0156
Type: Terrace    -0.742481  0.102084 -7.27321    <1e-9
Cont: High        0.209235 0.0653756   3.2005   0.0021
deviance(house_ph)
3484.0531705626904

From the deviances, we see that the proportional odds model (logit link) has the best fit among all three models.

deviance(house_po), deviance(house_op), deviance(house_ph)
(3479.149299072586, 3479.6888425652414, 3484.0531705626904)

Alternative syntax without using DataFrame

An alternative syntax is useful when it is inconvenient to use DataFrame

polr(X, y, link, solver; wts)

where y is the response vector and X is the n x p predictor matrix excluding intercept.

Optimization algorithms

PolrModels.jl relies on nonlinear programming (NLP) optimization algorithms to find the maximum likelihood estimate (MLE). User can input any solver supported by the MathProgBase.jl package (see http://www.juliaopt.org) as the 4th argument of polr function. Common choices are:

  • Ipopt solver: IpoptSolver(print_level=0). See Ipopt.jl for numerous arugments to IpoptSolver. For example, setting print_level=5 is useful for diagnosis purpose.
  • NLopt package: NLoptSolver(algorithm=:LD_SLSQP), NLoptSolver(algorithm=:LD_LBFGS). See NLopt algorithms for all algorithms in NLopt.jl.

When optimization fails, user can always try another algorithm.

Use Ipopt (interior-point) solver

polr(@formula(Sat ~ Infl + Type + Cont), housing, LogitLink(), 
    IpoptSolver(print_level=3); wts = housing[:Freq])
******************************************************************************
This program contains Ipopt, a library for large-scale nonlinear optimization.
 Ipopt is released as open source code under the Eclipse Public License (EPL).
         For more information visit http://projects.coin-or.org/Ipopt
******************************************************************************

Total number of variables............................:        8
                     variables with only lower bounds:        0
                variables with lower and upper bounds:        0
                     variables with only upper bounds:        0
Total number of equality constraints.................:        0
Total number of inequality constraints...............:        0
        inequality constraints with only lower bounds:        0
   inequality constraints with lower and upper bounds:        0
        inequality constraints with only upper bounds:        0


Number of Iterations....: 38

                                   (scaled)                 (unscaled)
Objective...............:   2.0594068522767617e+02    1.7395746495294738e+03
Dual infeasibility......:   8.1478360269928591e-09    6.8824520931403241e-08
Constraint violation....:   0.0000000000000000e+00    0.0000000000000000e+00
Complementarity.........:   0.0000000000000000e+00    0.0000000000000000e+00
Overall NLP error.......:   8.1478360269928591e-09    6.8824520931403241e-08


Number of objective function evaluations             = 91
Number of objective gradient evaluations             = 39
Number of equality constraint evaluations            = 0
Number of inequality constraint evaluations          = 0
Number of equality constraint Jacobian evaluations   = 0
Number of inequality constraint Jacobian evaluations = 0
Number of Lagrangian Hessian evaluations             = 0
Total CPU secs in IPOPT (w/o function evaluations)   =      0.050
Total CPU secs in NLP function evaluations           =      0.006

EXIT: Optimal Solution Found.





StatsModels.DataFrameRegressionModel{OrdinalMultinomialModel{Int64,Float64,LogitLink},Array{Float64,2}}

Formula: Sat ~ Infl + Type + Cont

Coefficients:
                  Estimate Std.Error  t value Pr(>|t|)
intercept1|2     -0.496135  0.124541 -3.98372   0.0002
intercept2|3      0.690708  0.125212   5.5163    <1e-6
Infl: Medium      0.566394  0.104963  5.39613    <1e-5
Infl: High         1.28882  0.126705  10.1718   <1e-14
Type: Apartment   -0.57235  0.118747  -4.8199    <1e-5
Type: Atrium     -0.366186  0.156766 -2.33588   0.0226
Type: Terrace     -1.09101  0.151514 -7.20077    <1e-9
Cont: High        0.360284 0.0953575  3.77825   0.0003

Use SLSQP (sequential quadratic programming) in NLopt.jl package

polr(@formula(Sat ~ Infl + Type + Cont), housing, LogitLink(), 
    NLoptSolver(algorithm=:LD_SLSQP); wts = housing[:Freq])
StatsModels.DataFrameRegressionModel{OrdinalMultinomialModel{Int64,Float64,LogitLink},Array{Float64,2}}

Formula: Sat ~ Infl + Type + Cont

Coefficients:
                  Estimate Std.Error  t value Pr(>|t|)
intercept1|2     -0.496141  0.124541 -3.98376   0.0002
intercept2|3      0.690706  0.125212  5.51628    <1e-6
Infl: Medium      0.566392  0.104963  5.39611    <1e-5
Infl: High         1.28881  0.126705  10.1718   <1e-14
Type: Apartment  -0.572352  0.118747 -4.81991    <1e-5
Type: Atrium     -0.366182  0.156766 -2.33586   0.0226
Type: Terrace     -1.09101  0.151514 -7.20075    <1e-9
Cont: High        0.360284 0.0953574  3.77825   0.0003

Use LBFGS (quasi-Newton algorithm) in NLopt.jl package

polr(@formula(Sat ~ 0 + Infl + Type + Cont), housing, LogitLink(), 
    NLoptSolver(algorithm=:LD_LBFGS); wts = housing[:Freq])
StatsModels.DataFrameRegressionModel{OrdinalMultinomialModel{Int64,Float64,LogitLink},Array{Float64,2}}

Formula: Sat ~ Infl + Type + Cont

Coefficients:
                  Estimate Std.Error  t value Pr(>|t|)
intercept1|2     -0.496111  0.124541 -3.98353   0.0002
intercept2|3      0.690732  0.125212  5.51649    <1e-6
Infl: Medium      0.566394  0.104963  5.39613    <1e-5
Infl: High         1.28882  0.126705  10.1718   <1e-14
Type: Apartment  -0.572352  0.118747 -4.81991    <1e-5
Type: Atrium      -0.36616  0.156766 -2.33571   0.0227
Type: Terrace     -1.09102  0.151514  -7.2008    <1e-9
Cont: High        0.360319 0.0953575  3.77861   0.0003

Likelihood ratio test (LRT)

polr function calculates the Wald test (or t-test) p-value for each predictor in the model. To carry out the potentially more powerful likelihood ratio test (LRT), we need to fill the null and alternative models separately.

Step 1: Fit the null model with only Infl and Type factors.

house_null = polr(@formula(Sat ~ Infl + Type), housing; wts = housing[:Freq])
StatsModels.DataFrameRegressionModel{OrdinalMultinomialModel{Int64,Float64,LogitLink},Array{Float64,2}}

Formula: Sat ~ Infl + Type

Coefficients:
                  Estimate Std.Error  t value Pr(>|t|)
intercept1|2     -0.672949  0.115559 -5.82341    <1e-6
intercept2|3      0.505629  0.115147  4.39116    <1e-4
Infl: Medium      0.548392  0.104613  5.24213    <1e-5
Infl: High          1.2373  0.125448  9.86306   <1e-13
Type: Apartment  -0.521441  0.117616 -4.43341    <1e-4
Type: Atrium     -0.289347  0.155074 -1.86587   0.0666
Type: Terrace     -1.01404   0.14976  -6.7711    <1e-8

Step 2: To test significance of the Cont variable, we use polrtest function. The first argument is the fitted null model, the second argument is the predictor vector to be tested

# last column of model matrix is coding for Cont (2-level factor)
cont = modelmatrix(house_po.model)[:, end]
# calculate p-value
polrtest(house_null, cont; test=:LRT)
0.000155351855453278

Score test

User can perform score test using the polrtest function too. Score test has the advantage that, when testing a huge number of predictors such as in genomewide association studies (GWAS), one only needs to fit the null model once and then testing each predictor is cheap. Both Wald and likelihood ratio test (LRT) need to fit a separate alternative model for each predictor being tested.

Step 1: Fit the null model with only Infl and Type factors.

house_null = polr(@formula(Sat ~ Infl + Type), housing; wts = housing[:Freq])
StatsModels.DataFrameRegressionModel{OrdinalMultinomialModel{Int64,Float64,LogitLink},Array{Float64,2}}

Formula: Sat ~ Infl + Type

Coefficients:
                  Estimate Std.Error  t value Pr(>|t|)
intercept1|2     -0.672949  0.115559 -5.82341    <1e-6
intercept2|3      0.505629  0.115147  4.39116    <1e-4
Infl: Medium      0.548392  0.104613  5.24213    <1e-5
Infl: High          1.2373  0.125448  9.86306   <1e-13
Type: Apartment  -0.521441  0.117616 -4.43341    <1e-4
Type: Atrium     -0.289347  0.155074 -1.86587   0.0666
Type: Terrace     -1.01404   0.14976  -6.7711    <1e-8

Step 2: To test significance of the Cont variable, we use polrtest function. The first argument is the fitted null model, the second argument is the predictor vector to be tested

# last column of model matrix is coding for Cont (2-level factor)
cont = modelmatrix(house_po.model)[:, end]
# calculate p-value
polrtest(house_null, cont; test=:score)
0.0001648743597587817

Step 3: Now suppose we want to test significance of another predictor, z1. We just need to call polrtest with z1 and the same fiited null model. No model fitting is needed.

For demonstration purpose, we generate z1 randomly. The score test p-value of z1 is, not suprisingly, large.

z1 = randn(nobs(house_null))
polrtest(house_null, z1)
0.1673512522966108

Step 4: We can also test a set of precitors or a factor.

z3 = randn(nobs(house_null), 3)
polrtest(house_null, z3)
6.709335149358069e-10