flagr-package for FLaG models R
Before installing flagr package, make sure you have installed the following
Dependent packages: Rcpp
, RcppProgress
and RcppArmadillo
For installing RcppArmadillo
, see the instructions below
After you login to Yeti, to get the newest RcppArmadillo package: $ module load gcc/4.9.1
then enter R, install the package by typing: > install.packages(“RcppArmadillo”)
quit R environment by typing : > quit()
copy the flagr source package “flagr_0.3.2.tar.gz” to your home directory, and type command: $ R CMD INSTALL flagr_0.3.2.tar.gz
In this section we illustrate severals model our package will use, all the model is under 0/1 condition by default. We offer functions to transform between 0/1 and +1/-1 condition.
We use the mirt function in the mirt
package to fit a simulated data.
rm(list = ls())
library(flagr)
library(mirt)
# This data is generated from mirt model with multinormal prior(J=30,K=3,N=10000)
load("~/Github/flagr-github/flagr/Medium-scale-poscor.RData")
cmodel <- mirt.model('F1=1-10,
F2=11-20,
F3=21-30,
COV = F1*F2*F3')
system.time(result_bi <- mirt(response, cmodel,method = "EM"))
theta_est <- fscores(result_bi)
In this section, we show how to use
cMIRT(Y, Q, A0 = NA, b0 = NA, Sigma0 = NA, print_proc = T, accuracy = NA, min_step = 5, max_step = 1000)
function in our package to model a simulated data, be noted that our model uses a different prior as usual mirt model uses multivariate normal prior, we’ll show a example code and give a comparison between our function and mirt function.
# We use the previous data generated from mirt model
Q <- A>0
K <- ncol(A)
system.time(result22 <- confirmIRT22(A0, b0, Sigma0, response, Q, accuracy = .1))
# calculate the posterior mean of theta
theta_est22 <- t(result22$Sigma%*%t(result22$A)%*%t(response))
#calculate Kendall's tau correlation between two estimated posterior mean of theta
for(i in 1:K){
cat(cor(theta_est[,i],theta_est22[,i],method = "kendall"),"\n")
}
# Another example, in this example, response data is simulated multivariate IRT model with our spcial prior
# J=60, K=3, N=10000
rm(list = ls())
load("~/Github/flagr-github/flagr/confirmIsing_exam.RData")
system.time(result22 <- confirmIRT22(A0, b0, Sigma0, response, Q, accuracy = 3))
# calculate the posterior mean of theta
theta_est22 <- t(result22$Sigma%*%t(result22$A)%*%t(response))
cmodel <- mirt.model('F1=1-10,
F2=11-20,
F3=21-30,
COV = F1*F2*F3')
system.time(result_bi <- mirt(response, cmodel,method = "EM"))
system.time(result_bi <- mirt(response, cmodel,method = "MHRM"))
# Third example, data is simulated from mirt model with multivariate normal prior(J=30,K=3,)
K=2,J=20 | K=3,J=30 | K=3, J=60 | K=4,J=80 | |
---|---|---|---|---|
confirmIRT22 | 1.829s | 4.923s | 31.007s | 37.331s |
mirt(method=“EM”) | 7.132s | 41.600s | 297.545s * | 555.753s * |
mirt(method=“MHRM”) | 193.078s | * | * |
*: the mirt function stopped with the warning message: Log-likelihood was decreasing near the ML solution. EM method may be unstable.
As we can see, when scale go large, the mirt package can not handle very well, it’s result is unreliable anymore.
In out flagr
package, we use function
confirmIRT_cov <- function(A0, b0, beta0, response, Z, Q, print_proc = T, accuracy = 1, min_step = 5, max_step = 1000)
to estimate confrimatory low-rank Ising model with covariates.
beta0 is the initial value of coefficient, Z is a matrix for covariates.
In our flagr
package, we use function
explorIRT(A0, b0, response, print_proc = T, accuracy = 0.01, min_step = 5, max_step = 1000)
to estimate parameters, the following is an example:
rm(list = ls())
# This data is generated from mirt model with our special prior(J=20,K=2,N=1000)
load("~/Github/flagr-github/flagr/explorIsing_exam.RData")
system.time(result <- explorIRT(A0, rep(0,J), response, accuracy = 0.01))
One advantage of our package is that we can handle large scale parameters estimating problem in a fast speed like the following one:
rm(list = ls())
#This data is generated from mirt model with our prior(J=100,K=5,N=1000)
load("~/Github/flagr-github/flagr/explorIsing_exam2.RData")
system.time(result <- explorIRT(A0, b0, response, accuracy = 1))
# Even we start from a far initial value
system.time(result <- explorIRT(A0, rep(0,J), response, accuracy = 1))
Our package can also handle the exploratory low-rank Ising model with covariates information, we illustrate the usage of
explorIRT_cov <- function(A0, b0, beta0, response, Z, print_proc = T, accuracy = 1, min_step = 5, max_step = 1000)
by the following example:
J = 20;
N = 1000;
K = 2;
x <- 0.5;
A <- matrix(0, J, K);
A[1:10, 1] <- x;
A[11:20,2] <- x;
S <- matrix(0, J, J);
diag(S) <- -2.5;
Sigma <- matrix(c(1,0,0,1),2)
Z <- matrix(rnorm(N), N, 1)
beta <- matrix(c(rep(0,J/2),rep(1,J/2)),1,J)
system.time(response1 <- sample_flag_cpp1(A, S, beta, Z, N, burn = 200))
beta_n <- nrow(beta)
A0 <- matrix(0, J, K);
A0[1:10, 1] <- x + rnorm(10);
A0[11:20,2] <- x + rnorm(10);
b0 <- rep(0, J)
beta0 <- matrix(0,beta_n,J)
system.time(result1 <- explorIRT_cov(A0, b0, beta0, response1, Z, accuracy = 0.01))
We can give variance estimation by the following command:
result.var <- IRT.var(result1, response1, Z)
result.sd <- sqrt(diag(result.var))
We can use function
confirmFLaG <- function(A0, S0, Sigma0, Q, gamma, response, stepA = 5, stepS = 5, stepB = 5, discA = 0.5, discS = 0.5, discB = 0.5, tol = 1e-7, print_proc = T)
to model data using confirmatory FLaG model
rm(list = ls())
# This data is generated from FLaG model(J=20,K=2,N=10000)
load("~/Github/flagr-github/flagr/confirmFLaG_exam1.RData")
result22 <- confirmFLaG(A0, S0, diag(2), Q, gamma, response)
Specificly, if the support of graph is known, we can add the information supp
as a parameter of the function
confirmFLaG_supp <- function(A0, S0, Sigma0, Q, supp, response, stepA = 1, stepS = 1, stepB = 1, discA = 0.5, discS = 0.5, discB = 0.5, tol = 1e-7, print_proc = T)
to make the estimation more accurate.
result_supp22<- confirmFLaG_supp(A0, S0, diag(2), Q, supp, response,tol = 1e-7)
We can use function
ADMM5(L0, S0, response, delta, gamma, lambda = 5, tol = 1e-5)
to model data using exploratory FLaG model, see the following example:
J = 20;
N = 1000;
K = 2;
x <- 0.5;
A <- matrix(0, J, K);
A[1:10, 1] <- x;
A[11:20,2] <- x;
L <- A%*% t(A);
S <- matrix(0, J, J);
for(i in seq(1, J-1, by = 2)){
S[i,i+1] <- 1;
S[i+1,i] <- 1;
}
diag(S) <- -3.5;
system.time(response <- sample_flag_cpp(A, S, N, burn = 200))
colMeans(response)
L0 = matrix(0, J, J);
S0 = matrix(0, J, J);
delta = 0.06;
gamma = 0.03;
system.time(result <- ADMM5(L0, S0, response, delta, gamma))
result_A <- loadings(result)
varimax(result_A)[["loadings"]][[1]]
# Let's start with small scale problem(J=300,K=1, N=1000000)
J = 300;
N = 1e6;
K = 1;
x <- 0.5;
A <- matrix(x, J, K)
S <- matrix(0, J, J);
diag(S) <- -3;
system.time(response <- sample_flag_cpp(A, S, N, burn = 200))
colMeans(response)
###########
J=300
N=1e6
K=1
A <- matrix(1,J,K)
response <- matrix(0,N,J)
theta <- matrix(rnorm(N,0,1),N,1)
b <- 0
for(i in 1:N){
for(j in 1:J){
p <- sum(A[j,] * theta[i,]) + b
response[i,j] <- rbinom(1, 1, 1/(1+exp(-p)))
}
}
Note the various macros within the vignette
section of the metadata block above. These are required in order to instruct R how to build the vignette. Note that you should change the title
field and the \VignetteIndexEntry
to match the title of your vignette.
The html_vignette
template includes a basic CSS theme. To override this theme you can specify your own CSS in the document metadata as follows:
output:
rmarkdown::html_vignette:
css: mystyles.css
The figure sizes have been customised so that you can easily put two images side-by-side.
plot(1:10)
plot(10:1)
You can enable figure captions by fig_caption: yes
in YAML:
output:
rmarkdown::html_vignette:
fig_caption: yes
Then you can use the chunk option fig.cap = "Your figure caption."
in knitr.
You can write math expressions, e.g. \(Y = X\beta + \epsilon\), footnotes1, and tables, e.g. using knitr::kable()
.
mpg | cyl | disp | hp | drat | wt | qsec | vs | am | gear | carb | |
---|---|---|---|---|---|---|---|---|---|---|---|
Mazda RX4 | 21.0 | 6 | 160.0 | 110 | 3.90 | 2.620 | 16.46 | 0 | 1 | 4 | 4 |
Mazda RX4 Wag | 21.0 | 6 | 160.0 | 110 | 3.90 | 2.875 | 17.02 | 0 | 1 | 4 | 4 |
Datsun 710 | 22.8 | 4 | 108.0 | 93 | 3.85 | 2.320 | 18.61 | 1 | 1 | 4 | 1 |
Hornet 4 Drive | 21.4 | 6 | 258.0 | 110 | 3.08 | 3.215 | 19.44 | 1 | 0 | 3 | 1 |
Hornet Sportabout | 18.7 | 8 | 360.0 | 175 | 3.15 | 3.440 | 17.02 | 0 | 0 | 3 | 2 |
Valiant | 18.1 | 6 | 225.0 | 105 | 2.76 | 3.460 | 20.22 | 1 | 0 | 3 | 1 |
Duster 360 | 14.3 | 8 | 360.0 | 245 | 3.21 | 3.570 | 15.84 | 0 | 0 | 3 | 4 |
Merc 240D | 24.4 | 4 | 146.7 | 62 | 3.69 | 3.190 | 20.00 | 1 | 0 | 4 | 2 |
Merc 230 | 22.8 | 4 | 140.8 | 95 | 3.92 | 3.150 | 22.90 | 1 | 0 | 4 | 2 |
Merc 280 | 19.2 | 6 | 167.6 | 123 | 3.92 | 3.440 | 18.30 | 1 | 0 | 4 | 4 |
Also a quote using >
:
“He who gives up [code] safety for [code] speed deserves neither.” (via)
Using the sandwich estimator to estimate variance of loadings A
and descrimination b
:
J = 20;
N = 1000;
K = 2;
x <- 1.5;
A <- matrix(0, J, K);
A[1:10, 1] <- x;
A[11:20,2] <- x;
b <- rep(0, J)
Q <- A>0
Sigma <- matrix(c(1,0,0,1),2)
A
## [,1] [,2]
## [1,] 1.5 0.0
## [2,] 1.5 0.0
## [3,] 1.5 0.0
## [4,] 1.5 0.0
## [5,] 1.5 0.0
## [6,] 1.5 0.0
## [7,] 1.5 0.0
## [8,] 1.5 0.0
## [9,] 1.5 0.0
## [10,] 1.5 0.0
## [11,] 0.0 1.5
## [12,] 0.0 1.5
## [13,] 0.0 1.5
## [14,] 0.0 1.5
## [15,] 0.0 1.5
## [16,] 0.0 1.5
## [17,] 0.0 1.5
## [18,] 0.0 1.5
## [19,] 0.0 1.5
## [20,] 0.0 1.5
b
## [1] 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0
Sigma
## [,1] [,2]
## [1,] 1 0
## [2,] 0 1
results under 100 simulations:
load("variance_conf_ind.RData")
average_res = colMeans(res_point)
A=matrix(0,20,2)
A[1:10,1]=average_res[1:10]
A[11:20,2]=average_res[11:20]
A
## [,1] [,2]
## [1,] 0.3197856 0.0000000
## [2,] 0.3215668 0.0000000
## [3,] 0.3190496 0.0000000
## [4,] 0.3210873 0.0000000
## [5,] 0.3223154 0.0000000
## [6,] 0.3251583 0.0000000
## [7,] 0.3264015 0.0000000
## [8,] 0.3203971 0.0000000
## [9,] 0.3234195 0.0000000
## [10,] 0.3272770 0.0000000
## [11,] 0.0000000 0.3256401
## [12,] 0.0000000 0.3258470
## [13,] 0.0000000 0.3212222
## [14,] 0.0000000 0.3210236
## [15,] 0.0000000 0.3209790
## [16,] 0.0000000 0.3198053
## [17,] 0.0000000 0.3201307
## [18,] 0.0000000 0.3213765
## [19,] 0.0000000 0.3218453
## [20,] 0.0000000 0.3236067
b=average_res[21:40]
b
## [1] 0.0060823391 0.0034769598 0.0011617287 0.0027049749 0.0037400256
## [6] -0.0082606372 -0.0001975696 0.0004260730 -0.0043843911 0.0067771553
## [11] -0.0043611927 -0.0042773736 0.0039032161 0.0120867333 0.0082584887
## [16] -0.0035608326 -0.0023652364 -0.0073793718 0.0065301187 -0.0100547650
SE | SEE |
---|---|
0.0234219 | 0.0246073 |
0.0252923 | 0.0245810 |
0.0245263 | 0.0244840 |
0.0250991 | 0.0245905 |
0.0240166 | 0.0247542 |
0.0244815 | 0.0249228 |
0.0253028 | 0.0249089 |
0.0251527 | 0.0245719 |
0.0249885 | 0.0248041 |
0.0259739 | 0.0249775 |
0.0288664 | 0.0249329 |
0.0246151 | 0.0248858 |
0.0226919 | 0.0247120 |
0.0235724 | 0.0246705 |
0.0257544 | 0.0246624 |
0.0291460 | 0.0245736 |
0.0237269 | 0.0247957 |
0.0222439 | 0.0247639 |
0.0288633 | 0.0248576 |
0.0242742 | 0.0248165 |
0.0656738 | 0.0724263 |
0.0708166 | 0.0725486 |
0.0765891 | 0.0724397 |
0.0687344 | 0.0725231 |
0.0682073 | 0.0725347 |
0.0681675 | 0.0726526 |
0.0759917 | 0.0727231 |
0.0744917 | 0.0725017 |
0.0698336 | 0.0726103 |
0.0688569 | 0.0727634 |
0.0757150 | 0.0726038 |
0.0676043 | 0.0726280 |
0.0735290 | 0.0724435 |
0.0756669 | 0.0724471 |
0.0797230 | 0.0724501 |
0.0725141 | 0.0723966 |
0.0618378 | 0.0723328 |
0.0781103 | 0.0724510 |
0.0682416 | 0.0724257 |
0.0749746 | 0.0725300 |
J = 20;
N = 1000;
K = 2;
x <- 1.5;
A <- matrix(0, J, K);
A[1:10, 1] <- x;
A[11:20,2] <- x;
b <- rep(0, J)
Q <- A>0
Sigma <- matrix(c(1,0.3,0.3,1),2)
A
## [,1] [,2]
## [1,] 1.5 0.0
## [2,] 1.5 0.0
## [3,] 1.5 0.0
## [4,] 1.5 0.0
## [5,] 1.5 0.0
## [6,] 1.5 0.0
## [7,] 1.5 0.0
## [8,] 1.5 0.0
## [9,] 1.5 0.0
## [10,] 1.5 0.0
## [11,] 0.0 1.5
## [12,] 0.0 1.5
## [13,] 0.0 1.5
## [14,] 0.0 1.5
## [15,] 0.0 1.5
## [16,] 0.0 1.5
## [17,] 0.0 1.5
## [18,] 0.0 1.5
## [19,] 0.0 1.5
## [20,] 0.0 1.5
b
## [1] 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0
Sigma
## [,1] [,2]
## [1,] 1.0 0.3
## [2,] 0.3 1.0
results under 100 simulations:
load("variance_conf_cor.RData")
average_res = colMeans(res_point)
A=matrix(0,20,2)
A[1:10,1]=average_res[1:10]
A[11:20,2]=average_res[11:20]
A
## [,1] [,2]
## [1,] 0.3231537 0.0000000
## [2,] 0.3188977 0.0000000
## [3,] 0.3180814 0.0000000
## [4,] 0.3174558 0.0000000
## [5,] 0.3181322 0.0000000
## [6,] 0.3195720 0.0000000
## [7,] 0.3216282 0.0000000
## [8,] 0.3181760 0.0000000
## [9,] 0.3199462 0.0000000
## [10,] 0.3207602 0.0000000
## [11,] 0.0000000 0.3196714
## [12,] 0.0000000 0.3177176
## [13,] 0.0000000 0.3221163
## [14,] 0.0000000 0.3163536
## [15,] 0.0000000 0.3232995
## [16,] 0.0000000 0.3230148
## [17,] 0.0000000 0.3214805
## [18,] 0.0000000 0.3188889
## [19,] 0.0000000 0.3153299
## [20,] 0.0000000 0.3185675
b=average_res[21:40]
b
## [1] -0.009009077 0.003921514 0.007357218 0.004927920 0.001050645
## [6] -0.006365575 -0.002099554 -0.001802810 0.001247737 0.010514934
## [11] 0.003393165 -0.006382762 0.000094245 -0.005147718 0.001771069
## [16] -0.001534055 -0.007747437 -0.004909551 0.002292500 -0.006232389
SE | SEE |
---|---|
0.0256555 | 0.0256911 |
0.0255997 | 0.0254838 |
0.0245582 | 0.0253566 |
0.0244937 | 0.0254540 |
0.0268723 | 0.0254660 |
0.0252039 | 0.0256531 |
0.0245700 | 0.0256068 |
0.0260349 | 0.0255406 |
0.0254411 | 0.0255923 |
0.0248871 | 0.0255178 |
0.0221761 | 0.0239484 |
0.0244666 | 0.0238029 |
0.0277688 | 0.0240292 |
0.0233263 | 0.0237430 |
0.0270260 | 0.0240584 |
0.0236524 | 0.0240483 |
0.0230023 | 0.0239832 |
0.0224501 | 0.0238005 |
0.0236259 | 0.0236726 |
0.0230402 | 0.0238611 |
0.0692641 | 0.0733547 |
0.0701528 | 0.0731796 |
0.0628415 | 0.0731442 |
0.0754760 | 0.0730990 |
0.0744631 | 0.0731637 |
0.0718771 | 0.0732168 |
0.0670819 | 0.0733380 |
0.0800385 | 0.0731766 |
0.0729246 | 0.0732374 |
0.0755642 | 0.0733428 |
0.0793707 | 0.0719600 |
0.0730091 | 0.0718903 |
0.0719055 | 0.0720433 |
0.0694591 | 0.0718479 |
0.0766842 | 0.0721292 |
0.0765639 | 0.0720976 |
0.0721641 | 0.0720169 |
0.0771268 | 0.0719439 |
0.0811696 | 0.0717857 |
0.0683278 | 0.0719017 |
A footnote here.↩