**Statistics (II)**

**Spring Semester, 2012**

“ST”, Tuesdays, 9:00–10:45, Snellius lecture room t.b.a., Niels Bohrweg 1, Leiden. Master's or advanced Bachelor's level.

**Slides (pdf), notes, and R scripts**

Note: the slides and the Sweave documents also contain a few exercises.

Slides for the lectures in 2009, as far as they were completed

Solution to theoretical exercises on the multivariate normal distribution.

R script for lecture on maximum likelihood estimation

sequel to same

Sweave input for likelihood scripts, LaTeX pdf output for same

R histograms

Sweave input for histogram script, LaTeX pdf output for same

Sweave input for frequency polygon script, LaTeX pdf output for same, after completion of LaTeX by RDG

First student project!!!

Second student project!!!

Third student project!!!

You may be interested in the R graphical user interface R Commander

http://socserv.mcmaster.ca/jfox/Misc/Rcmdr

and you should be interested in the tools for combining R and LaTeX, Sweave

http://www.stat.umn.edu/~charlie/Sweave

http://www.statistik.lmu.de/~leisch/Sweave

Sweave manual (pdf)

This course is conceived as the sequel to the first introduction to mathematical statistics in Leiden. That course covers the basic material on statistical testing, estimation and confidence intervals, usually taught from chapters 7, 8, 9 and 10 of J.A. Rice's book “Mathematical Statistics and Data Analysis” (pub. Duxbury Press), usually together with a quick introduction to regression analysis (chapter 14).

I plan to start by reviewing regression analysis and also looking at “analysis of variance”, chapter 12 from Rice. We will see that analysis of variance can also be cast in the form of the general linear model studied in regression analysis: the rather special-looking analysis methods described in Rice are nothing other than further applications of the least squares method, which itself is nothing other than maximum likelihood estimation under the assumption of normal errors.
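As a brief sketch of that last claim, here is the standard argument in notation that I am assuming for illustration (it is not taken from Rice): Y is the vector of n observations, X the design matrix, β the vector of regression coefficients, and σ² the error variance.

```latex
% General linear model with independent normal errors (assumed notation)
Y = X\beta + \varepsilon, \qquad \varepsilon \sim N_n(0, \sigma^2 I_n)

% Log-likelihood of (\beta, \sigma^2)
\ell(\beta, \sigma^2)
  = -\frac{n}{2}\log(2\pi\sigma^2)
    - \frac{1}{2\sigma^2}\,\lVert Y - X\beta \rVert^2

% For every fixed \sigma^2 > 0, maximising \ell over \beta is the same as
% minimising the residual sum of squares, so the MLE is the least squares
% estimator; when X has full column rank,
\hat\beta = \arg\min_{\beta}\, \lVert Y - X\beta \rVert^2 = (X^\top X)^{-1} X^\top Y

% One-way analysis of variance as a special case: group i has J_i observations
Y_{ij} = \mu_i + \varepsilon_{ij}, \qquad i = 1,\dots,I, \quad j = 1,\dots,J_i

% Take \beta = (\mu_1,\dots,\mu_I)^\top and let the columns of X be the
% indicators of group membership; then X^\top X = \mathrm{diag}(J_1,\dots,J_I) and
\hat\mu_i = \frac{1}{J_i}\sum_{j=1}^{J_i} Y_{ij} = \bar Y_{i\cdot}
```

So, under these assumptions, the group means that appear in the usual analysis of variance formulas are exactly the least squares (and maximum likelihood) estimates in a linear model with an indicator-coded design matrix.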

After that I would like to treat some more advanced topics related to linear models. Which topics will depend very much on the interests of the students. I see two main options:

1) If there are many students with a general and practical inclination, I would like to steer the course in the direction of a variety of modern applied statistical methods for dealing with multivariate data, especially regression-type models, but no longer necessarily linear or even parametric. I would use the book “Modern Applied Statistics with S” by W.N. Venables and B.D. Ripley (pub. Springer), and the students would do some applied data analysis themselves.

By the way, the S language for statistical analysis has been implemented under the name “R”, which is freely available and widely used wherever experimental statistical methodology is developed. A huge user community has contributed many specialist packages to the system.

2) If there are many students of a more theoretical inclination, we could study C.R. Rao’s classic book “Linear Statistical Inference and its Applications”, which goes much more deeply into the mathematics of the standard linear model.

I am supposing that the number of students following the course will be small enough that it can be tailor-made to their interests and need not follow the traditional format.

**Literature**:

J.A. Rice (2006), Mathematical Statistics and Data Analysis (3rd edition), Duxbury Press.

W.N. Venables and B.D. Ripley (2002), Modern Applied Statistics with S (4th edition), Springer.

gill@math.leidenuniv.nl