# 3: Summarizing Data


• 3.1: Measures of the Center of the Data
The mean and the median both help you find the "center" of a data set. The mean is the most commonly used measure of center, but the median is usually the better measure when a data set contains outliers or extreme values. The mode is the most frequently occurring value (or values) in the data set. Together, the mean, median, and mode give a first summary of the data you are analyzing.
• 3.2: Skewness and the Mean, Median, and Mode
Looking at the distribution of data can reveal a lot about the relationship between the mean, the median, and the mode. There are three types of distributions: a right (or positively) skewed distribution, a left (or negatively) skewed distribution, and a symmetrical distribution.
• 3.3: Measures of the Spread of the Data
An important characteristic of any set of data is the variation in the data. In some data sets, the data values are concentrated close to the mean; in others, they are spread more widely around it. The most common measure of variation, or spread, is the standard deviation, a number that measures how far data values are from their mean.
• 3.4: Measures of the Location of the Data
The values that divide a rank-ordered set of data into 100 equal parts are called percentiles and are used to compare and interpret data. For example, an observation at the 50th percentile is greater than about 50% of the other observations in the set. Quartiles divide data into quarters. The first quartile is the 25th percentile, the second quartile is the 50th percentile, and the third quartile is the 75th percentile. The interquartile range is the range of the middle 50% of the data values.
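
The summaries listed above can be tried out on a small data set. The following is a minimal sketch using Python's standard `statistics` module; the sample values are hypothetical and chosen only for illustration, and the quartiles use the module's default "exclusive" method, which may differ slightly from the convention used in the sections themselves.

```python
import statistics

# Hypothetical sample data (already rank-ordered); 40 is a deliberate outlier.
data = [2, 3, 3, 5, 7, 8, 9, 12, 40]

# 3.1: measures of center
mean = statistics.mean(data)      # pulled upward by the outlier
median = statistics.median(data)  # middle value: 7
mode = statistics.mode(data)      # most frequent value: 3

# 3.2: in this right-skewed sample the mean exceeds the median
right_skew_hint = mean > median

# 3.3: spread, using the sample standard deviation
stdev = statistics.stdev(data)

# 3.4: quartiles (25th, 50th, 75th percentiles) and interquartile range
q1, q2, q3 = statistics.quantiles(data, n=4)
iqr = q3 - q1

print(f"mean={mean:.2f} median={median} mode={mode}")
print(f"stdev={stdev:.2f} Q1={q1} Q2={q2} Q3={q3} IQR={iqr}")
```

Because the single large value 40 drags the mean above the median, this sample also illustrates why the median is often preferred when outliers are present.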