Glossary
| Word(s) | Definition | Image | Caption | Link | Source |
|---|---|---|---|---|---|
| Statistics | a science that deals with any aspect of the collection, analysis, interpretation, and presentation of data | ||||
| Data | a collection of observations | ||||
| Population | the collection of all individuals or items under consideration in a statistical study | ||||
| Census | a study that involves the entire population | ||||
| Sampling | a process of obtaining a sample from the population | ||||
| Sample | a part of the population from which information is obtained | ||||
| Representative sample | a sample that reflects as closely as possible the relevant characteristics of the population under consideration | ||||
| Descriptive statistics | the methods for organizing and summarizing information | ||||
| Inferential statistics | the methods for drawing and measuring the reliability of conclusions about a population based on information obtained from a sample of the population | ||||
| Sampling error | the natural variation that results from selecting a sample to represent a larger population | ||||
| Non-sampling error | an issue that affects the reliability of sampling data other than the natural variation | ||||
| Observational study | a study in which researchers simply observe characteristics and take measurements | ||||
| Designed experiment | a study in which the data do not exist until someone does "the experiment" that produces the data | ||||
| Statistically significant result | a result that is very unlikely to occur by chance | ||||
| Practically significant result | a result that is big enough to be meaningful in the real world regardless of its statistical significance | ||||
| Lurking variable | a variable that causes the changes in the two variables under consideration | ||||
| Sampling bias | a measure of how not representative the sample is due to not all members of the population being equally likely to be selected | ||||
| Simple random sampling | sampling procedure for which each possible sample of a given size is equally likely to be the one obtained | ||||
| Sampling with replacement | sampling in which every selected member of the population is returned to the population and may be selected again | ||||
| Sampling without replacement | sampling in which every member of the population may be chosen for inclusion in a sample only once | ||||
| Systematic sampling | a method for selecting a random sample by randomly picking the first object and then every k-th object after that for some k approximately equal to the number of individuals in the population divided by the desired sample size | ||||
| Cluster sampling | a method for selecting a random sample by dividing the population into groups and using simple random sampling to select a set of groups from which every object is included in the sample | ||||
| Stratified sampling | a method for selecting a random sample by dividing the population into strata and then using simple random sampling to pick objects from each stratum so that each stratum is represented in the sample proportionally to its size | ||||
| Voluntary response sampling | a method of sampling in which the respondents themselves decide whether to be included | ||||
| Convenience sampling | a nonrandom method of selecting a sample; this method selects individuals that are easily accessible and may result in biased data | ||||
| Placebo | a fake drug used in the testing of medication | ||||
| Control group | in an experiment, the group that does not receive the experimental treatment | ||||
| Experimental group | in an experiment, the group that is exposed to the treatment | ||||
| Explanatory variable | a variable that we think explains or causes changes in the response variable | ||||
| Response variable | a variable that measures an outcome or result of a study | ||||
| Treatment | a specific condition applied to the individuals in an experiment | ||||
| Blinding | a technique where the subjects do not know whether they are receiving a treatment or a placebo | ||||
| Double-blinding | a technique where neither the subjects nor the data recorders know which treatment each subject receives | ||||
| Statistical variable | a characteristic that varies from one object of the population to another | ||||
| Qualitative data | data that consist of names and labels describing the attributes of a population | ||||
| Quantitative data | data that consist of numbers that are the result of counting or measuring attributes of a population | ||||
| Categorical data | qualitative data that do not have a natural ordering | ||||
| Ordinal data | qualitative data that have a natural ordering | ||||
| Discrete data | quantitative data whose possible values can all be listed | ||||
| Continuous data | quantitative data whose possible values form an interval | ||||
| Nominal level of measurement | a level of measurement that captures variation in kind or quality but not in amount | ||||
| Ordinal level of measurement | a level of measurement that yields rank-ordered data | ||||
| Interval level of measurement | a level of measurement at which differences between values are meaningful but there is no natural zero | ||||
| Ratio level of measurement | a level of measurement at which differences between values are meaningful and there is a natural zero | ||||
| Frequency | the number of times a particular distinct value occurs | ||||
| Frequency distribution table | a listing of the distinct values and their frequencies | ||||
| Relative frequency | the frequency divided by the total number of observations | ||||
| Relative frequency distribution table | a listing of the distinct values and their relative frequencies | ||||
| Bar chart | a chart that displays the distinct values of the qualitative data on a horizontal axis and the relative frequencies (or frequencies) of those values on a vertical axis | ||||
| Pie chart | a disk divided into wedge-shaped pieces proportional to the relative frequencies of the qualitative data | ||||
| Pareto chart | a chart that consists of bars that are sorted into order by category size (largest to smallest) | ||||
| Single-value grouping | a way to group quantitative data into categories in which each category represents a single possible value | ||||
| Interval grouping | a way to group quantitative data into categories in which each category is an interval of values called a class | ||||
| Class | a category in which quantitative data can be arranged | ||||
| Lower class limit | the smallest value that could go in a class | ||||
| Upper class limit | the smallest value that could go into the next higher class | ||||
| Class width | the difference between the lower limit of a class and the lower limit of the next-higher class | ||||
| Class midpoint | the average of the lower limit of a class and the lower limit of the next-higher class | ||||
| Histogram | a chart that displays the classes of the quantitative data on a horizontal axis and the frequencies (or relative frequencies) of those classes on a vertical axis | ||||
| Dot plot | a chart in which each observation is recorded by placing a dot over the appropriate value on the horizontal axis | ||||
| Stem-and-leaf diagram | a chart in which each observation is recorded by placing its unit digit in the row with the appropriate quantity for its number of tens | ||||
| Frequency polygon | a graph of a frequency distribution that shows the number of instances of obtained scores, usually with the data points connected by straight lines | ||||
| Cumulative frequency of a class | the sum of the frequencies for that class and all previous classes | ||||
| Ogive | a cumulative frequency polygon | ||||
| Distribution of data | a table, graph, or formula that provides the values of the observations and how often they occur | ||||
| Shape of the distribution | one of the features of the distribution that characterizes the visual appearance of the distribution | ||||
| Scatterplot | a chart in which each observation is plotted as a point on the coordinate plane | ||||
| Time-series | a time-ordered sequence of observations taken at regular intervals | ||||
| Contingency table | a two-way frequency table used to group bivariate data | ||||
| Center | the most "typical values" of the data set | ||||
| Mean | the sum of the observations divided by the number of observations | ||||
| Median | the number that divides the bottom 50% of the data from the top 50%; same as the 50-th percentile | ||||
| Mode | the most frequently occurring value in the data set | ||||
| Resistant measure | a measure that is not sensitive to a few extreme observations | ||||
| Non-Resistant measure | a measure that is sensitive to a few extreme observations | ||||
| Parameter | a descriptive measure of the population | ||||
| Statistic | a descriptive measure of a sample | ||||
| Estimator | a value that is used to estimate the parameter | ||||
| Estimating | an attempt to estimate the parameter | ||||
| Biased estimator | an estimator that doesn't correctly estimate the unknown parameter on average in the long run | ||||
| Unbiased estimator | an estimator that correctly estimates the unknown parameter on average in the long run | ||||
| Population mean | the mean of the population | ||||
| Sample mean | the mean of a sample | ||||
| Measures of variation | measures that indicate the amount of variation, or spread, in a data set | ||||
| Range | the difference between the maximum (largest) and minimum (smallest) observations | ||||
| Variance | the average squared deviation of the data in a dataset from its mean | ||||
| Standard deviation | the square root of the variance | ||||
| Population variance | the variance of the population | ||||
| Sample variance | the variance of a sample | ||||
| Population standard deviation | the standard deviation of the population | ||||
| Sample standard deviation | the standard deviation of a sample | ||||
| Outlier | an observation with an "extreme" deviation from the mean; in the context of linear regression - an observation that lies too far from the regression line, relative to other data points | ||||
| Three standard deviations rule | the rule according to which almost all the observations in any data set lie within 3 standard deviations to either side of the mean | ||||
| Chebyshev's Theorem | the rule stating that at least 1 - 1/k² of the observations lie within k standard deviations (k>1) of the mean, regardless of the shape of the distribution | ||||
| Empirical rule | the rule that gives the approximate distribution of observations within 1 standard deviation (68%), 2 standard deviations (95%), and 3 standard deviations (99.7%) of the mean when the shape of the distribution can be assumed normal | ||||
| Zeroth quartile | the value for which 0% of the data is less than it; same as the minimum | ||||
| First quartile | the value for which 25% of the data is less than it | ||||
| Second quartile | the value for which 50% of the data is less than it; same as the median | ||||
| Third quartile | the value for which 75% of the data is less than it | ||||
| Fourth quartile | the value for which 100% of the data is less than it; same as the maximum | ||||
| Five-number summary | a numerical summary that consists of the Q0, Q1, Q2, Q3, Q4 | ||||
| Interquartile range | a measure of variability, defined to be the difference between the third and first quartiles | ||||
| Boxplot | a graphical representation of the five-number summary of a dataset, obtained by drawing a box that ranges from Q1 to Q3 and "whiskers" that extend to the most extreme observations that still lie within the adjacent values | ||||
| Probability | a measure of how likely something is to happen | ||||
| Experiment | an action whose result is not certain | ||||
| Simple outcome | one possible result of an experiment | ||||
| Sample space | the set of all outcomes of an experiment | ||||
| Event | a collection of outcomes from the sample space | ||||
| Impossible event | an event that never occurs | ||||
| Certain event | an event that always occurs | ||||
| (not A) | the event that occurs when A doesn't occur | ||||
| (A and B) | the event that occurs when both A and B occur | ||||
| (A or B) | the event that occurs when A occurs, B occurs, or both occur | ||||
| Mutually exclusive events | the events that have no common outcomes | ||||
| Marginal probability | the probability of an event obtained from the rightmost column or the bottom row of a contingency table | ||||
| Joint probability | the probability of an event obtained from the cell at the intersection of a row and a column of a contingency table | ||||
| Type 1 error | the event in which a false positive occurs; in the context of hypothesis testing - rejecting a true H0 claim | ||||
| Type 2 error | the event in which a false negative occurs; in the context of hypothesis testing - failing to reject a false H0 claim | ||||
| Conditional probability | the probability of event A knowing that B has occurred | ||||
| Independent events | events such that the occurrence of one doesn't affect the probability of another | ||||
| Complement rule | for any event A, the probability of A not happening is P(not A) = 1 - P(A) | ||||
| General addition rule | for any two events, A and B, the probability of A or B is P(A or B) = P(A) + P(B) - P(A and B) | ||||
| Special addition rule | if events A and B are mutually exclusive, then P(A or B) = P(A) + P(B) | ||||
| General multiplication rule | for any two events, A and B, the probability of A and B is P(A and B) = P(A\|B)P(B) | ||||
| Special multiplication rule | if events A and B are independent, then P(A and B) = P(A)P(B) | ||||
| Fair coin | a coin for which the outcomes, heads and tails, are equally likely | ||||
| Unfair coin | a coin for which the outcomes, heads and tails, are not equally likely | ||||
| Permutation | an arrangement of distinct objects in order | ||||
| Combination | a collection of distinct objects | ||||
| Basic counting principle | the rule that states that when there are m ways to do one thing, and n ways to do another, then altogether there are m×n ways of doing both | ||||
| Factorial | the product of the natural numbers less than or equal to the given number | ||||
| Special permutation rule | the rule that states that the number of permutations of n objects is n! | ||||
| nPr | the number of permutations of r objects chosen from a set of n objects | ||||
| nCr | the number of combinations of r objects chosen from a set of n objects | ||||
| Random variable | an unknown quantity whose value depends on chance | ||||
| Probability distribution table | the table that summarizes the possible values of a discrete random variable and their corresponding probabilities | ||||
| Probability histogram | a graph that displays the possible values of a discrete random variable on the horizontal axis and their corresponding probabilities on the vertical axis | ||||
| Discrete random variable | a random variable whose possible values can all be listed | ||||
| Continuous random variable | a random variable whose possible values form a continuous interval | ||||
| Expected value | the average value of the random variable over a large number of experiments | ||||
| Standard deviation | the square root of the average squared deviation of the values of the random variable from its expected value | ||||
| Bernoulli trials | a sequence of the same binomial experiment performed independently n times | ||||
| Binomial random variable | a discrete random variable that represents the number of successes among n Bernoulli trials | ||||
| Binomial experiment | an experiment in which there are exactly two possible outcomes - success and failure, with the probability of success P(S)=p and the probability of failure P(F)=1-p | ||||
| Continuous random variable | a random variable whose possible values form an interval | ||||
| Probability density curve | a curve that (1) is always on or above the horizontal axis; (2) has the total area between itself and the horizontal axis equal to 1 | ||||
| Standard normal curve | a bell-shaped probability density curve that (1) has its peak at 0; (2) is symmetric about 0; (3) extends indefinitely in both directions, approaching but never touching the horizontal axis; (4) satisfies the empirical rule | ||||
| Standard normal variable | a continuous random variable that has the standard normal probability density curve | ||||
| T-curve | a probability density curve that (1) extends indefinitely in both directions, approaching, but never touching, the horizontal axis; (2) is symmetric about 0; (3) increasingly looks like the standard normal curve as the number of degrees of freedom becomes larger | ||||
| Chi-square curve | a probability density curve that (1) starts at 0 on the horizontal axis and extends indefinitely to the right, approaching, but never touching, the horizontal axis; (2) is right-skewed; (3) as the number of degrees of freedom becomes larger, it increasingly looks like a normal curve | ||||
| F-curve | a probability density curve that (1) starts at 0 on the horizontal axis and extends indefinitely to the right, approaching, but never touching, the horizontal axis; (2) is right-skewed; (3) possesses the reciprocal property | ||||
| Uniform random variable | a random variable whose probability density function is portrayed as a horizontal line 1/(b-a) above the x-axis over the interval from a to b | ||||
| Normal probability curve | a bell-shaped probability density curve that (1) has its peak at μ; (2) is symmetric about μ; (3) extends indefinitely in both directions, approaching but never touching the horizontal axis; (4) satisfies the empirical rule | ||||
| Normal random variable | a variable that has a normal probability density curve | ||||
| Normality plot | a plot used to determine whether the population is normal based on the sample of a small size | ||||
| Normal approximation | a technique used to approximate the probabilities of a binomial random variable using the normal random variable with the same mean and standard deviation | ||||
| Correction for continuity | an adjustment made in the binomial approximation to account for the missing probabilities at the end of the range | ||||
| Sample mean | a random variable whose values are the averages of the samples of size n from the population | ||||
| Sample proportion | a random variable that represents the number of cases falling into one category of the variable divided by the number of cases in the sample | ||||
| Sample sum | a random variable that represents the sum of the samples of size n from the population | ||||
| Sample variance | a random variable whose possible values are the variances of the samples of size n from the population | ||||
| The Central Limit Theorem for Sample Means | for sufficiently large samples (n>30), the distribution of the sample mean is approximately normal | ||||
| The Central Limit Theorem for Sample Proportions | for sufficiently large samples (np>10 and n(1-p)>10), the distribution of the sample proportion is approximately normal | ||||
| The Central Limit Theorem for Sample Sums | for sufficiently large samples (n>30), the distribution of the sample sum is approximately normal | ||||
| Point estimate | the value of a statistic used to estimate the parameter | ||||
| Biased estimate | a statistic such that the mean of all its possible values doesn't equal the parameter | ||||
| Unbiased estimate | a statistic such that the mean of all its possible values equals the parameter | ||||
| Confidence interval | an interval of numbers obtained from a point estimate of a parameter along with the percentage confidence that the parameter lies in the range | ||||
| Confidence level | the confidence we have that the parameter lies in the confidence interval (i.e., that the confidence interval contains the parameter) | ||||
| Confidence estimate | the confidence level and confidence interval | ||||
| Margin of error | the distance from the center of a confidence interval to the end | ||||
| Null hypothesis | a statistical claim to be tested, stated in the form of an equation | ||||
| Alternative hypothesis | a statistical claim in the form of an inequality to be tested as an alternative to the null hypothesis | ||||
| Hypothesis test | a test to decide whether the null hypothesis should be rejected in favor of the alternative hypothesis or not | ||||
| Significance level | the largest tolerated probability of a Type 1 error | ||||
| Test statistic | a value that measures how far a particular observation is away from its expected value | ||||
| Critical value | a value that defines a rejection region for a hypothesis test | ||||
| P-value | the probability, computed assuming the null hypothesis is true, of obtaining a test statistic at least as far from the assumed value as the one actually observed | ||||
| Hypothesis | a testable claim, often implied by a theory, which is either true or false | ||||
| Statistical claim | a statement about a population parameter in the form of an equation or an inequality | ||||
| Independent samples | a pair of samples in which the observations in one sample do not influence the observations in the other | ||||
| Dependent samples | a pair of samples in which the observations in one sample somehow influence the observations in the other | ||||
| Paired samples | a pair of samples in which the observations are paired in a distinct way | ||||
| Pooled proportion | the proportion obtained by treating the two samples as one | ||||
| Pooled standard deviation | the standard deviation computed by treating the two samples as one | ||||
| Goodness-of-fit test | a hypothesis test in which the null hypothesis is "the variable has the specified distribution", and the alternative hypothesis is "the variable doesn't have the specified distribution" | ||||
| Homogeneity test | a hypothesis test in which the null hypothesis is "the distribution of one variable is the same for each value of the other variable", and the alternative hypothesis is "the distribution of one variable is not the same for all values of the other variable" | ||||
| Independence test | a hypothesis test in which the null hypothesis is "the variables are independent", and the alternative hypothesis is "the variables are dependent" | ||||
| Homogeneity | the quality of being similar or comparable in kind or nature | ||||
| Independent variables | statistical variables such that knowing the outcome of one doesn't affect the probability of another variable's outcome | ||||
| Uniform discrete distribution | a probability distribution in which the frequencies are evenly spread out across the values of a discrete variable | ||||
| Categorical variable | a qualitative variable associated with categories | ||||
| One-Way ANOVA | a hypothesis test in which the null hypothesis is "all the population means are equal", and the alternative hypothesis is "not all the population means are equal" | ||||
| Two-Way ANOVA | a hypothesis test that includes two nominal independent variables, regardless of their numbers of levels, and one dependent variable measured on a scale | ||||
| Coefficient of determination | a numerical measure of the proportion of the variation in the response variable that is explained by the regression on the explanatory variable | ||||
| Correlation coefficient | a numerical measure of the strength of the linear relationship between two variables | ||||
| Extrapolation | using the regression line to make predictions for values of the explanatory variable outside the range of the observed data | ||||
| Influential observation | an observation whose removal causes the regression equation to change considerably | ||||
| Explanatory variable, predictor variable, independent variable | a variable used to explain the outcome variable | ||||
| Outcome variable, response variable, dependent variable | a variable that changes in response to the explanatory variable | ||||
| Predicted value | the output computed using the regression line | ||||
| Negative association | large values of one variable are associated with small values of the other, and vice versa | ||||
| Positive association | large values of one variable are associated with large values of the other, and small values with small values | ||||
| Linear relationship | data tend to cluster around a straight line when plotted on a scatterplot | ||||
| Explained difference | the difference between the predicted and the average values | ||||
| Unexplained difference | the difference between the predicted and the observed values | ||||
| Residual plot | a plot in which residuals are plotted against the values of the explanatory variable | ||||
| Least-squares regression line | the line for which the sum of squared vertical distances from the data points to the line is as small as possible | ||||
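
Several entries above describe computations rather than concepts (Mean, Median, Mode, Range, Variance, Standard deviation, Factorial, nPr, nCr). The sketch below works through them on a small data set using only the Python standard library. The data values are invented for illustration. Dividing by n - 1 for the sample variance is a common convention that the glossary itself does not state, so it is an assumption here. `math.perm` and `math.comb` require Python 3.8 or later.

```python
import math
import statistics

# Hypothetical data set, invented purely for illustration.
data = [2, 4, 4, 5, 7, 8]
n = len(data)

# Measures of center
mean = sum(data) / n                  # sum of the observations / number of observations
median = statistics.median(data)      # divides the bottom 50% from the top 50%
mode = statistics.mode(data)          # most frequently occurring value

# Measures of variation
data_range = max(data) - min(data)    # maximum minus minimum
squared_deviations = [(x - mean) ** 2 for x in data]

# Variance as the average squared deviation from the mean (divide by n).
population_variance = sum(squared_deviations) / n
# Assumption: the usual sample-variance convention divides by n - 1 instead.
sample_variance = sum(squared_deviations) / (n - 1)

# Standard deviation is the square root of the variance.
population_sd = math.sqrt(population_variance)
sample_sd = math.sqrt(sample_variance)

# Counting rules
factorial_5 = math.factorial(5)       # 5! = 5 * 4 * 3 * 2 * 1
permutations = math.perm(5, 2)        # nPr: ordered arrangements of 2 objects out of 5
combinations = math.comb(5, 2)        # nCr: unordered collections of 2 objects out of 5

print("mean:", mean, "median:", median, "mode:", mode)
print("range:", data_range)
print("population variance:", population_variance, "sample variance:", sample_variance)
print("5!:", factorial_5, "5P2:", permutations, "5C2:", combinations)
```

Each permutation count equals the matching combination count multiplied by r!, since every collection of r objects can be arranged in r! orders; here 5P2 = 5C2 × 2!.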



