# Introductory Statistics with R

## by Peter Dalgaard

### Paperback, 229mm × 155mm, xv+267 pages. Published in August 2002.

### Description:

R is an Open Source implementation of the well-known S language. It works on multiple computing platforms and can be freely downloaded. R is thus ideally suited for teaching at many levels as well as for practical data analysis and methodological development. This book provides an elementary-level introduction to R, targeting both non-statistician scientists in various fields and students of statistics.

The main mode of presentation is via code examples with liberal commenting of the code and the output, from the computational as well as the statistical viewpoint. Brief sections introduce the statistical methods before they are used. A supplementary R package can be downloaded and contains the data sets. All examples are directly runnable and all graphics in the text are generated from the examples.

The statistical methodology covered includes standard statistical distributions, one- and two-sample tests with continuous data, regression analysis, one- and two-way analysis of variance, analysis of tabular data, and sample size calculations. In addition, the last four chapters contain introductions to multiple linear regression analysis, linear models in general, logistic regression, and survival analysis.

### Contents

1. Basics
2. Probability and Distributions
3. Descriptive Statistics and Graphics
4. One- and two-sample tests
5. Regression and Correlation
6. ANOVA and Kruskal-Wallis
7. Tabular Data
8. Power and the Computation of Sample Size
9. Multiple Regression
10. Linear Models
11. Logistic Regression
12. Survival Analysis

#### Appendices:

1. Obtaining and Installing R
2. Data Sets in the ISwR Package
3. Compendium

### Errata and Notes

(Most of these were fixed in the corrected 3rd printing.)
• p. iv, l. 2: "Biostastics" should be "Biostatistics"
• p. 2: The screen dump wasn't updated for R 1.5.0. Also, it shows rnorm(1000) where the main text has rnorm(500)
• p. 4, l. 4: "R" should be in Sans Serif font
• p. 5, l. 13: 65 should be 60
• p. 26, l. 4: first "useful" should be omitted
• p. 27, l. -9: "wants" should be "want"
• p. 28, 2nd code snippet: line=-1:5 should be line=-1:4 (this has no visible impact since line 5 is outside the plotting area, though)
• p. 31, l. 7: Closing parenthesis missing.
• p. 33, l. 26: "any" should be "all"
• p. 36, 1st paragraph: I should have said that ESS also works fine on Windows.
• p. 36, l. 7: "instruction" should be "instructions"
• p. 43, l. 22: Epi-Info .rec files can also be read (now).
• p. 43, l. -12: "foreign library" should be "package"
• p. 43, penultimate paragraph: Some Unix databases, e.g. PostgreSQL, also allow ODBC connections.
• p. 71, l. -1: "isdone" should be "is done"
• p. 78, l. -6: "scorned upon" should be "frowned upon"
• p. 82, after second displayed formula: Spurious indentation.
• p. 86, midpage: Test statistic V should be math italic.
• p. 108, l. 16: use='complete.obs' should use double quotes
• p. 114, l. 13: 2nd appearance of "SSD_B and MS_B" should have index "W" instead.
• p. 120, midpage: "thar" should be "that".
• p. 121, last equation: "xi." is missing the bar.
• p. 126, 1st paragraph needs rephrasing: "We have seen the use of analysis of variance tables in..."
• p. 173, l. 16: "19" should be "(p.19)"
• p. 183, last two lines: "...but NOT quite as conspicuous..."
• p. 183, Fig. 10.6, and p. 184, Fig. 10.7: Some y-labels are partly clipped (this is because margins have been reduced - shouldn't happen if you type in the commands).
• p. 191, l. -6: "log[p(1-p)]" should be "log[p/(1-p)]"
• p. 198, l. 9: "are are" should be "are"
• p. 200, l. -8: "9.24" should be "9.29"
• p. 221: The install procedure for Debian Linux is elaborated in the FAQ,