Monday, 24 September 2012

Analysis of Survey Data from Complex Sample Designs

When you first took statistics, surveys such as presidential opinion polls were probably prominent in learning inference for proportions.  Unfortunately, that "simple random sample" from your textbook is more myth than reality.  Most surveys nowadays are complex, with stratification, multi-stage sampling, cluster sampling, etc.  Analysis via a simple "confidence interval for a proportion" is rarely suitable.

In "Analysis of Survey Data from Complex Sample Designs," taught by Dr. Brady T. West and Ms. Patricia Berglund, you'll learn how to estimate variances for complex surveys and how to model the results using linear and logistic regression and other generalized linear models.

Participants can use R, WesVar, or IVEware (free packages) or SAS, Stata, SUDAAN, or SPSS (commercial packages; SPSS users must purchase the Complex Samples Module).

Aim of Course:
In order to extract maximum information at minimum cost, sample designs are typically more complex than simple random samples. Cluster sampling and stratified designs are common. But how do you analyze the resulting data - in particular, how do you determine margins of error? This course teaches you how to estimate variances when analyzing survey data from complex samples, and also how to fit linear and logistic regression models to complex sample survey data.
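To make the margin-of-error question concrete, here is a minimal sketch of one common approach, Taylor series linearization with the "ultimate cluster" approximation, applied to a weighted mean. It is not taken from the course materials: the function name, the toy data, and the with-replacement assumption for first-stage sampling are all illustrative assumptions.

```python
from collections import defaultdict
from math import sqrt


def weighted_mean_with_se(y, weights, strata, clusters):
    """Weighted mean and its linearized standard error.

    Hypothetical helper for illustration. Assumes primary sampling units
    (clusters) were drawn with replacement within each stratum, so no
    finite population correction is applied.
    """
    total_w = sum(weights)
    mean = sum(w * v for w, v in zip(weights, y)) / total_w

    # Linearized score of the ratio estimator, totalled within each cluster.
    psu_totals = defaultdict(lambda: defaultdict(float))
    for v, w, h, c in zip(y, weights, strata, clusters):
        psu_totals[h][c] += w * (v - mean) / total_w

    # Between-cluster variability within each stratum, summed over strata.
    variance = 0.0
    for h, totals in psu_totals.items():
        n_h = len(totals)
        if n_h < 2:
            raise ValueError(f"stratum {h!r} needs at least 2 clusters")
        mean_h = sum(totals.values()) / n_h
        variance += n_h / (n_h - 1) * sum((t - mean_h) ** 2 for t in totals.values())
    return mean, sqrt(variance)


# Toy data (made up): 2 strata, 2 clusters each, 2 respondents per cluster.
y = [1, 1, 0, 0, 1, 0, 1, 0]
weights = [1.0] * 8
strata = ["A", "A", "A", "A", "B", "B", "B", "B"]
clusters = [1, 1, 2, 2, 3, 3, 4, 4]

mean, se = weighted_mean_with_se(y, weights, strata, clusters)

# Textbook simple-random-sample standard error for the same proportion.
naive_se = sqrt(mean * (1 - mean) / len(y))

# Design effect: ratio of the design-based variance to the SRS variance.
design_effect = (se / naive_se) ** 2

print(f"mean={mean:.3f}  design SE={se:.3f}  SRS SE={naive_se:.3f}  deff={design_effect:.2f}")
```

In this toy example the respondents in stratum A are perfectly homogeneous within clusters, so the design-based standard error (0.25) is larger than the simple-random-sample formula suggests (about 0.18), giving a design effect of 2. The packages listed above implement this and other variance estimation methods for real designs.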

Who Should Take This Course:
Anyone designing surveys or analyzing survey data.

Course Program:

SESSION 1: Overview

§   Applied Survey Data Analysis: An Overview
  • Important terms, concepts, and notation
  • Software Overview

§   Getting to Know the Complex Sample Design
  • Classification of Sample Designs
  • Target Populations and Survey Populations
  • Simple Random Sampling
  • Complex Sample Design Effects
  • Complex Samples: Clustering and Stratification
  • Weighting in Analysis of Survey Data
  • Multi-stage Area Probability Sample Designs

SESSION 2: Overview continued

§   Foundations and Techniques for Design-based Estimation and Inference
  • Finite Populations and Superpopulation Models
  • Confidence Intervals for Population Parameters
  • Weighted Estimation of Population Parameters
  • Probability Distributions and Design-based Inference
  • Variance Estimation
  • Hypothesis Testing in Survey Data Analysis
  • Total Survey Error

§   Preparation for Complex Sample Survey Data Analysis
  • Analysis Weights: Review by the Data User
  • Understanding and Checking the Sampling Error Calculation Model
  • Addressing Item Missing Data in Analysis Variables
  • Preparing to Analyze Data from Sample Subclasses
  • A Final Checklist for Data Users

SESSION 3: Descriptive Statistics

§   Descriptive Analysis for Continuous Variables
  • Special Considerations in Descriptive Analysis of Complex Sample Survey Data
  • Simple Statistics for Univariate Continuous Distributions
  • Bivariate Relationships between Two Continuous Variables
  • Descriptive Statistics for Subpopulations
  • Linear Functions of Descriptive Estimates and Differences of Means

§   Categorical Data Analysis
  • A Framework for Analysis of Categorical Survey Data
  • Univariate Analysis of Categorical Data
  • Bivariate Analysis of Categorical Data
  • Analysis of Multivariate Categorical Data

SESSION 4: Regression Models

§   Linear Regression Models
  • The Linear Regression Model
    • Fitting linear regression models to survey data
  • Four Steps in Linear Regression Analysis
  • Some Practical Considerations and Tools
  • Application: Modeling Diastolic Blood Pressure with the NHANES Data

§   Logistic Regression and Generalized Linear Models for Binary Survey Variables
  • Generalized Linear Models (GLMs) for Binary Survey Responses
  • Building the Logistic Regression Model: Stage 1-Model Specification
  • Building the Logistic Regression Model: Stage 2-Estimation of Model Parameters and Standard Errors
  • Building the Logistic Regression Model: Stage 3-Evaluation of the Fitted Model
  • Building the Logistic Regression Model: Stage 4-Interpretation and Inference
  • Analysis Application
  • Comparing the Logistic, Probit, and Complementary-Log-Log (C-L-L) GLMs for Binary Dependent Variables

The instructors are Dr. Brady West (Lead Statistician, Center for Statistical Consultation and Research, Univ. of Michigan) and Ms. Patricia Berglund (Senior Research Associate in the Youth and Social Indicators Program and the Survey Methodology Program at the University of Michigan-Institute for Social Research).  Brady West is the lead author of "Linear Mixed Models: A Practical Guide using Statistical Software" (Chapman & Hall/CRC) and a co-author of "Applied Survey Data Analysis" (Chapman & Hall/CRC).

You will be able to ask questions and exchange comments with Dr. Brady West and Ms. Patricia Berglund via a private discussion board throughout the course.  The course takes place online in a series of 4 weekly lessons and assignments, and requires about 15 hours/week.  Participate at your own convenience; there are no set times when you must be online. You have the flexibility to work a bit every day, if that is your preference, or concentrate your work in just a couple of days.

Indian participants can register for these courses at special prices in Indian Rupees through the partner organization, the Center for eLearning and Training (C-eLT), Pune.

For India registration and pricing, call: 020 66009116

