Regression: Models, Methods and Applications
Stefan Lang is Professor of Applied Statistics at the University of Innsbruck, Austria. He received his PhD from Ludwig-Maximilians-University Munich. From 2005 to 2006 he was Professor of Statistics at the University of Leipzig. He is currently editor of Advances in Statistical Analysis and Associate Editor of Statistical Modelling. His main research interests include semiparametric and spatial regression, multilevel modelling and complex Bayesian models, with applications, among others, in environmetrics, marketing science, real estate and actuarial science.
Stefan Lang is a Professor of Applied Statistics at the University of Innsbruck, Austria. He received his PhD at LMU Munich. From 2005 to 2006 he was Professor of Statistics at the University of Leipzig. He is currently Associate Editor of the journal Statistical Modelling. His main research interests include semiparametric and spatial regression, multilevel modelling and complex Bayesian models, with applications, among others, in development economics, environmetrics, marketing science, real estate and actuarial science.
We comprehensively review the currently proposed methodologies for time-dependent ROC curves that use single or longitudinal marker measurements. Our aims are to clarify each methodology, identify software tools for carrying out such analyses in practice, and illustrate several applications. We have also extended some methods to incorporate a longitudinal marker and illustrated the methodologies using a sequential dataset from the Mayo Clinic trial in primary biliary cirrhosis (PBC) of the liver.
From our methodological review, we identified 18 estimation methods for time-dependent ROC curve analysis with censored event times, and three further methods that can only handle non-censored event times. Despite the considerable number of estimation methods, applications of the methodology in clinical studies are still lacking.
QMETH 528 Survey Sampling Applications (4) Introduction to design and implementation of sample surveys with emphasis on business applications. Simple random, stratified, cluster, multistage sample methods. Probability sampling, optimal allocation of sampling units. Mail, telephone, interview methods. Estimation methods, questionnaire design. Non-response. Prerequisite: QMETH 500 or B A 500 or equivalent or permission of instructor.
QMETH 579 Special Topics in Quantitative Methods (2-4, max. 12) Presentation of topics of current concern to students and faculty in operations research and applied business statistics. Potential topics include applications and extensions of mathematical programming, stochastic processes, discrete programming, network models, and the application of statistical techniques.
The Graduate Certificate in Theory and Applications of Regression Models (GCTARM) is designed for professionals and graduate students in diverse fields with a strong math background and a desire to improve their computational and statistical skills. The GCTARM covers applications of multiple regression and generalized regression models, as well as providing some theoretical background for these topics. The focus of this certificate is on understanding patterns and structure in data, and explanation/presentation of findings. It includes core coursework in probability, statistical inference and categorical data analysis, in addition to regression analysis. The GCTARM is available both online and on-campus.
Linear and Longitudinal Regression (BST215) is intended for students who are already very comfortable with fundamental techniques in statistics. The course will cover methods for building and interpreting linear regression models, including statistical assumptions and diagnostics, estimation and testing, and model building techniques. These models will be extended to handle data arising from longitudinal studies employing repeated measurement of subjects over time. Lectures will be accompanied by computing exercises using the Stata statistical package.
Effectiveness Research with Longitudinal Healthcare Databases (EPI253) Large longitudinal healthcare databases (e.g. claims, electronic health records) have become important tools for studying the utilization patterns and clinical effectiveness of medical products and interventions in a wide variety of care settings and for evaluating the impact of clinical programs or policy changes. This course will prepare students to identify and use longitudinal databases in their own research.
Methods for Decision Making in Medicine (RDS288) deals with intermediate-level topics in the field of medical decision making. Topics that will be addressed include building decision models, evaluation of diagnostic tests, utility assessment, multi-attribute utility theory, Markov cohort models, microsimulation state-transition models, calibration and validation of models, probabilistic sensitivity analysis, value of information analysis, and behavioral decision making. The course will focus on the practical application of techniques and will include published examples and a computer practicum. Students will learn to apply state-of-the-art modeling methods (using software packages) to evaluate the comparative effectiveness and cost-effectiveness of health interventions. While the primary emphasis is on application, essential underlying theoretical concepts will also be discussed. During the course you will have the opportunity to work on a decision problem which you select yourself. This is not an introductory course. Prerequisites are an introductory course in Decision Analysis (RDS280 or RDS286s or faculty approval of equivalent course) and knowledge of probability and statistics. The course has limited enrollment.
Survival Methods in Clinical Research (BST224) will cover statistical methods of survival analysis used in clinical research, including study design and power analysis, Kaplan-Meier product-limit estimation, Cox proportional hazards models, models with time-dependent covariates and repeated events, and models with competing risks. We will use SAS software in the course; however, students can use Stata, R, SPSS or other software. Students are encouraged to bring in their own project data for consultation. Course evaluation will be based on 13 daily quizzes.
Analytic Issues of Clinical Epidemiology (EPI236) examines some features of study design, but is primarily focused on analytic issues encountered in clinical research. These include techniques for stratified analysis, regression modeling, propensity scores, and matching. Emphasis is placed on the use of these techniques for the control of confounding and for the development of clinical prediction rules. The focus of this course is on applications and interpretations of results with limited introduction to the theory that underlies these techniques. Course activities include computer lab workshops that are scheduled during regular class time. Students must develop written summaries of the analyses of an assigned clinical data set from the results of daily computer workshops.
Our studies with the synthetic datasets show that LASSO without variable screening predicts remarkably well for zero-inflated data. The screening+GLM and screening+LASSO methods also work reasonably well if the power of screening is high. The analysis also shows that random effects can help to improve the predictive accuracy. Applying the three predictive analysis strategies to the PD microbiome dataset shows that LASSO can predict PD accurately from microbial composition and two covariates (age and sex), with error rates near 0.2 and AUCs higher than 0.8. The best predictive accuracy for PD with this dataset is obtained with a screening+LASSO method, which gives the following predictive metrics: ER = 0.199, AUC = 0.872, AUPRC = 0.912. Our predictive analysis results provide strong evidence of a relationship between PD and the gut microbiome.
Table 4 shows the predictive performance of each approach based on variance-stability transformed OTU counts for dataset 1. We calculate the mean and standard deviation (SD) of the predictive metrics over 100 iterations. For comparison, the table also shows the predictive performance of the oracle method, which fits a GLMM given the 20 truly related OTUs, 2 fixed factors X(1), X(2), and random effect W as predictors. The ER of the oracle case is 0.04. To put the predictive metrics in context, the naive prediction that ignores all predictors gives an error rate of 0.4 for dataset 1. LASSO without variable screening performs remarkably well on dataset 1 under the variance-stability transformation. The ER of LASSO is 0.21, and both the AUC and AUPRC are 0.87. The features selected by LASSO include the 20 true signals, which shows that LASSO is able to identify the truly related features when signal predictors for the response exist. The screening+GLM and screening+LASSO methods also work well for these datasets when the screening is based on TPNB models, with predictive metrics close to those of LASSO. This is due to the high power of TPNB. The SDs of the AUC and AUPRC of LASSO are smaller than those of TPNB+GLM and TPNB+LASSO, which shows that LASSO is more stable over the 100 iterations. Interestingly, when we exclude the random effect from each model, the ERs of all methods increase by 0.01, and the AUC and AUPRC decrease by 0.01. However, Table 3 shows that the power increases when the random effect is not considered. This result indicates that including a random effect can help to increase the predictive performance when a random effect really exists. Table 5 shows the performance of each approach based on binary transformed OTUs for dataset 1. In this case, LASSO is not able to improve the predictive performance over the baseline. The ERs of screening+GLM and screening+LASSO increase by 0.02 or 0.03 from the baseline ER of 0.4.
This shows that binary transformation is not an appropriate transformation for this dataset.
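To make the predictive metrics quoted above concrete, the following is a minimal Python sketch of how an error rate (ER), a naive baseline ER, an AUC and an AUPRC could be computed from binary labels and predicted probabilities. It is not the code used in the study: the function names, the 0.5 classification threshold, the average-precision form of the AUPRC and the toy data are all assumptions made purely for illustration.

```python
def error_rate(y_true, y_prob, threshold=0.5):
    """Share of cases whose thresholded prediction differs from the label.
    The 0.5 threshold is an assumption, not taken from the source."""
    wrong = sum((p >= threshold) != bool(y) for y, p in zip(y_true, y_prob))
    return wrong / len(y_true)


def naive_error_rate(y_true):
    """ER of always predicting the majority class, ignoring all predictors."""
    pos_share = sum(y_true) / len(y_true)
    return min(pos_share, 1 - pos_share)


def auc(y_true, y_prob):
    """AUC as the probability that a random positive case is scored above
    a random negative case (ties count one half)."""
    pos = [p for y, p in zip(y_true, y_prob) if y == 1]
    neg = [p for y, p in zip(y_true, y_prob) if y == 0]
    wins = sum((p > n) + 0.5 * (p == n) for p in pos for n in neg)
    return wins / (len(pos) * len(neg))


def average_precision(y_true, y_prob):
    """One common estimate of the AUPRC: the mean of the precision values
    at the rank of each positive case. Tied scores are not treated specially."""
    ranked = sorted(zip(y_prob, y_true), key=lambda t: -t[0])
    hits, total = 0, 0.0
    for rank, (_, y) in enumerate(ranked, start=1):
        if y == 1:
            hits += 1
            total += hits / rank
    return total / hits


# Hypothetical toy data: 6 cases (1) and 4 controls (0) with made-up scores.
y_true = [1, 1, 1, 0, 0, 1, 0, 1, 1, 0]
y_prob = [0.9, 0.8, 0.7, 0.6, 0.2, 0.4, 0.3, 0.85, 0.55, 0.1]

print("ER:", error_rate(y_true, y_prob))
print("naive ER:", naive_error_rate(y_true))
print("AUC:", auc(y_true, y_prob))
print("AUPRC (average precision):", average_precision(y_true, y_prob))
```

Comparing the ER against the naive baseline, as the text does with the baseline of 0.4, shows how much a model gains over ignoring the predictors. The AUC and AUPRC summarize ranking quality without fixing a threshold.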