Wednesday, 6 June 2012

Smoothing with P-splines (Using R)

The linear model is ubiquitous in classical statistics, yet real-life data rarely follow a purely linear pattern. Smoothing is a commonly used technique in such cases. The course "Smoothing with P-splines (Using R)" will be offered online by Dr. Brian Marx and Dr. Paul Eilers.

Real-life data often do not follow a pattern that is well described by a single simple function of any sort. Splines are smooth combinations of simple functions, each of which describes the data over a different range. P-splines are especially popular (over 500 citations for the instructors' original article in Statistical Science introducing P-splines) because they are widely applicable and effective. In this course, you will learn how to use R software to develop P-splines for data smoothing. You will be introduced to P-splines via B-splines (basis splines), and learn about the role of difference penalties. You will learn how to balance the competing demands of fidelity to the data and smoothness, and how to optimize the smoothing. The final session of the course covers multidimensional smoothing.
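To give a flavour of the idea, here is a minimal sketch in R of a P-spline fit: a B-spline basis combined with a difference penalty on the coefficients. It is not taken from the course materials. The simulated data, the number of segments (nseg), the penalty weight (lambda) and the small padding of the domain are all assumptions chosen for illustration.

```r
# Minimal P-spline sketch: B-spline basis plus a second-order difference penalty.
# All numeric settings below are illustrative assumptions, not course defaults.
library(splines)  # ships with base R; provides splineDesign()

set.seed(1)
x <- seq(0, 1, length.out = 100)
y <- sin(2 * pi * x) + rnorm(100, sd = 0.3)  # simulated "messy" data

# Cubic B-spline basis on equally spaced knots.
nseg <- 20                                   # assumed number of segments
deg  <- 3                                    # cubic B-splines
pad  <- 0.01 * diff(range(x))                # pad the domain so every x lies strictly inside
xl   <- min(x) - pad
xr   <- max(x) + pad
dx   <- (xr - xl) / nseg
knots <- xl + dx * ((-deg):(nseg + deg))     # nseg + 2 * deg + 1 knots
B <- splineDesign(knots, x, ord = deg + 1)   # 100 x (nseg + deg) basis matrix

# Second-order differences of adjacent coefficients form the penalty.
D <- diff(diag(ncol(B)), differences = 2)
lambda <- 1                                  # assumed smoothing parameter

# Penalized least squares: minimize |y - B a|^2 + lambda * |D a|^2.
a   <- solve(crossprod(B) + lambda * crossprod(D), crossprod(B, y))
fit <- as.vector(B %*% a)

# plot(x, y); lines(x, fit)                  # uncomment to view the smooth
```

A larger lambda pulls the fit towards a smoother curve, while a smaller one follows the data more closely; this is the fidelity-versus-smoothness trade-off mentioned above.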

Dr. Brian Marx is Professor of Statistics at Louisiana State University, and has taught Categorical Data Analysis for over ten years. He is currently serving as Chair of the Statistical Modelling Society and is the Coordinating Editor of Statistical Modelling: An International Journal. Dr. Marx has numerous publications in peer-reviewed journals.

Dr. Paul Eilers is Professor of Genetical Statistics at the Erasmus University Medical Center (Netherlands). Dr. Eilers' research interests include genomic data analysis (especially high-throughput genomic data), chemometrics, smoothing, longitudinal data analysis, survival analysis, and statistical computing. Dr. Eilers' statistical hobby is filtering and smoothing of time series and signals from chemical instruments.

Who can take this course:
Medical and social science researchers, data miners, environmental analysts;  any researcher who must develop statistical models with "messy" data.

Course Program:

The course is structured as follows:
SESSION 1:  Smoothing via Regression - Local vs Global Bases
  • Global bases can be ineffective
  • Local bases are attractive
  • B-splines
  • Difference penalties

SESSION 2: Introducing P-splines
  • Dealing with non-normal data
  • Moving from GLM to P-spline
  • Density estimation
  • Variance smoothing

SESSION 3: Optimizing the Smoothing
  • Fidelity to the data vs smooth curve
  • Cross-validation, AIC
  • Error bands

SESSION 4: Multidimensional Smoothing
  • Generalized Additive Models
  • Varying coefficient models
  • Tensor products

You will be able to ask questions and exchange comments with the instructors via a private discussion board throughout the course. The course takes place online in a series of 4 weekly lessons and assignments, and requires about 15 hours/week. Participate at your own convenience; there are no set times when you must be online. You have the flexibility to work a bit every day, if that is your preference, or concentrate your work in just a couple of days.

Indian participants can register for the courses at special prices in Indian Rupees through the course provider's partner, the Center for eLearning and Training (C-eLT), Pune.

For India Registration and pricing, please visit us at

Call: 020 66009116

