Graphical descriptions of data are important. However, we often want a number to help describe a data set. For example, in baseball a pitcher is considered good if he has a low number of earned runs per nine innings, and a hitter is considered good if he has a high batting average. These numbers tell us a great deal about a player. Other sports have similar numbers, such as the percentage of field goals made in basketball, and so do other aspects of life. If you want to know how much money you will make when you graduate from college and are employed in your chosen field, you could look at the average salary that someone with your degree earns. If you want to know whether you can afford to purchase a home, you could look at the median price of homes in the area. To understand how to find this information, we need to look at the different numerical descriptive statistics.
Numerical Descriptive Statistics: These are numbers that are calculated from a sample and are used to describe or estimate a population parameter.
The statistics that we can calculate include proportions, measures of center (averages), measures of spread (variability), and percentiles. There are other numbers, but these are the ones that we will concentrate on in this book.
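As a rough illustration of the statistics named above, the following sketch computes a proportion, two measures of center, a measure of spread, and quartiles using Python's standard `statistics` module. The data values and the variable names (`prices`, `degrees`) are made up for this example and do not come from the text.

```python
import statistics

# Hypothetical sample of home prices, in thousands of dollars (made-up values).
prices = [210, 225, 240, 250, 265, 280, 300, 320, 350, 900]

# Hypothetical qualitative data: degree type for eight graduates (made-up values).
degrees = ["BS", "BA", "BS", "AA", "BS", "BA", "BS", "AA"]

# Proportion: the fraction of the sample that falls in one category.
proportion_bs = degrees.count("BS") / len(degrees)

# Location of center: the mean is pulled up by the one very expensive home,
# while the median is not.
mean_price = statistics.mean(prices)
median_price = statistics.median(prices)

# Measure of spread: the sample standard deviation.
stdev_price = statistics.stdev(prices)

# Percentiles: the three cut points that split the data into quarters.
quartiles = statistics.quantiles(prices, n=4)

print("proportion with a BS:", proportion_bs)
print("mean price:", mean_price)
print("median price:", median_price)
print("standard deviation:", round(stdev_price, 1))
print("quartiles:", quartiles)
```

In this made-up sample the mean (334) is well above the median (272.5) because of the single 900 value, which is one reason a median is often quoted for home prices.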
- 2.1: Proportion
- Proportions are usually calculated when dealing with qualitative variables.
- 2.3: Measures of Spread
- The location of the center of a data set is important, but also important is how much variability or spread there is in the data.
- 2.4: The Normal Distribution
- There is a special symmetric distribution called the normal distribution. It is high in the middle and then goes down quickly and equally on both ends. It looks like a bell, so it is sometimes called a bell curve. One property of the normal distribution is that it is symmetric about the mean. Another property has to do with what percentage of the data falls within a given number of standard deviations of the mean.
- 2.5: Correlation and Causation, Scatter Plots
- Many studies show that two variables are related to one another. The strength of the relationship between two variables is called correlation. Variables that are strongly related to each other have a strong correlation. However, if two variables are correlated, it does not mean that one variable caused the other.
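The last two topics in the list can be previewed with a short sketch. The first part uses `statistics.NormalDist` to find what share of a normal distribution lies within 1, 2, and 3 standard deviations of the mean. The second part computes a Pearson correlation coefficient, one common way to measure the strength of a linear relationship; the text above does not name a specific measure, so that choice is an assumption. The helper names (`share_within`, `pearson_r`) and the data are hypothetical.

```python
from math import sqrt
from statistics import NormalDist, mean

# Standard normal distribution: mean 0, standard deviation 1.
std_normal = NormalDist(mu=0, sigma=1)

def share_within(k):
    """Share of a normal distribution within k standard deviations of the mean."""
    return std_normal.cdf(k) - std_normal.cdf(-k)

within = {k: share_within(k) for k in (1, 2, 3)}
for k, share in within.items():
    print(f"within {k} standard deviation(s): {share:.4f}")

def pearson_r(xs, ys):
    """Pearson correlation coefficient of two equal-length lists of numbers."""
    mx, my = mean(xs), mean(ys)
    sxy = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    sxx = sum((x - mx) ** 2 for x in xs)
    syy = sum((y - my) ** 2 for y in ys)
    return sxy / sqrt(sxx * syy)

# Hypothetical data: hours studied and exam scores for five students (made up).
hours = [1, 2, 3, 4, 5]
scores = [52, 60, 61, 70, 77]

r = pearson_r(hours, scores)
print("correlation between hours and scores:", round(r, 3))
```

A value of `r` close to 1 indicates a strong positive linear relationship in this made-up data. It says nothing about whether studying caused the higher scores.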