Mathematics LibreTexts

11.1: One Mean Z Test


    Section 0.1: Introduction to Hypothesis Testing - Fundamentals

    Let’s start with a definition of a hypothesis.

    Definition: Hypothesis

    A hypothesis is a testable claim that is either true or false.

    For example, someone may claim that Earth is round. To test a hypothesis means to verify its truthfulness in a scientific way that is acceptable to both the opponents and the proponents of the hypothesis. How would we test whether Earth is round or not? We may send a sailing expedition that travels only west, and we would agree that if the expedition comes all the way around then Earth is round, and if it falls off the edge then Earth is flat.

    By definition, the hypothesis can be either true or false. Testing the hypothesis may also result in two outcomes: the test will either confirm or reject the hypothesis. If the expedition sails all the way around, the true hypothesis will be confirmed. If the expedition falls off the edge, the false hypothesis will be rejected. If either of these scenarios were to happen, the testing procedure would serve its purpose of figuring out the truth. But is it possible that something goes wrong? What if Earth is round but the boat falls off Niagara Falls? Then we would believe that it fell off the edge of Earth, and for the wrong reason we would reject a hypothesis that is true. On the other hand, what if Earth is flat, and the expedition only sailed around a continent due to a compass malfunction? Then we would believe that it sailed around Earth, and for the wrong reason we would confirm a hypothesis that is false. If either of these scenarios were to happen, the testing procedure would return the wrong result.

    | Earth is round | Confirmed | Rejected |
    |---|---|---|
    | True | Sailed around Earth: CORRECT | Fell off Niagara Falls: ERROR! |
    | False | Sailed around a continent: ERROR! | Fell off the edge of Earth: CORRECT |

    No matter how perfect the testing procedure is – there is always a chance of making an error.

    Definition: Type 1 and 2 errors

    Rejecting a true hypothesis is called a type 1 error. Confirming, or failing to reject, a false hypothesis is called a type 2 error.

    The following table summarizes all possible outcomes of the hypothesis testing:

    | A hypothesis | Confirmed | Rejected |
    |---|---|---|
    | True | Correct Inference | Error Type 1 |
    | False | Error Type 2 | Correct Inference |

    The difference between the two types of errors is clear and depends on the context. Consider the following example. When a suspect is brought to a trial, they are either guilty or not, and there are 2 outcomes of the trial – the suspect is either convicted or cleared. You can clearly see the difference between the two types of errors as it is not the same to clear the guilty and convict the innocent.

    | A suspect is … | Convicted | Cleared |
    |---|---|---|
    | Guilty | Correct Inference | Error Type 1 |
    | Innocent | Error Type 2 | Correct Inference |

    At different times different justice systems may prioritize one type of error over the other one.

    In the testing procedure that we will develop, we will set a maximum threshold for type 1 errors and we will call it alpha – the significance level.

    In this course, we will learn how to make and test statistical claims.

    Definition: Statistical claim

    A statistical claim is a statement about a population parameter in the form of an equation or an inequality.

    Example \(\PageIndex{0.1.1}\)

    For example,

    • “The average height of students is less than 170 cm” is a statistical claim about the population mean, and it can be expressed symbolically as \(\mu<170\).
    • “The proportion of females on campus is more than 60%” is a statistical claim about the population proportion, and it can be expressed symbolically as \(p>0.60\).
    • “The students’ GPA variance is 1.21” is a statistical claim about the population variance, and it can be expressed symbolically as \(\sigma^2=1.21\).
    • “The company’s average salary is different from the state’s average of $67,565” is a statistical claim about the population mean, and it can be expressed symbolically as \(\mu\neq67565\).

    Next, we will develop a formal procedure to test statistical claims.

    Section 0.2: Introduction to Hypothesis Testing - Choosing Hypotheses

    Due to the nature of the relations that we discovered in statistics we will always be testing a hypothesis in the context of another hypothesis – one in the form of an equation and the other in the form of an inequality.

    Definition: Null hypothesis

    A null hypothesis is a statistical claim to be tested, stated in the form of an equation. It is usually labeled as \(H_0\).

    Definition: Alternative hypothesis

    An alternative hypothesis is a statistical claim in the form of an inequality to be tested as an alternative to the null hypothesis. It is usually labeled as \(H_a\) or \(H_1\).

    A testing procedure can be classified as left-tail, two-tail, or right-tail depending on the inequality used in the alternative hypothesis:

    | Inequality in \(H_a\) | \(<\) | \(\neq\) | \(>\) |
    |---|---|---|---|
    | Type of test | LT | TT | RT |

    Example \(\PageIndex{0.2.1}\)

    For example,

    • “The average height of students is less than 170 cm” can be expressed symbolically as \(\mu<170\). Since the claim is in the form of an inequality, it must be set as the alternative hypothesis. Therefore the null hypothesis is \(\mu=170\) and the test is left-tail, that is:

    \(H_0: \mu=170\)

    \(H_a: \mu<170\)

    LT

    • “The proportion of females on campus is more than 60%” can be expressed symbolically as \(p>0.60\). Since the claim is in the form of an inequality, it must be set as the alternative hypothesis. Therefore the null hypothesis is \(p=0.60\) and the test is right-tail, that is:

    \(H_0: p=0.60\)

    \(H_a: p>0.60\)

    RT

    • “The students’ GPA variance is 1.21” can be expressed symbolically as \(\sigma^2=1.21\). Since the claim is in the form of an equation, it must be set as the null hypothesis. Therefore the alternative hypothesis is \(\sigma^2\neq1.21\) and the test is two-tail, that is:

    \(H_0: \sigma^2=1.21\)

    \(H_a: \sigma^2\neq1.21\)

    TT

    • “The company’s average salary is different from the state’s average of $67,565” can be expressed symbolically as \(\mu\neq67565\). Since the claim is in the form of an inequality, it must be set as the alternative hypothesis. Therefore the null hypothesis is \(\mu=67565\) and the test is two-tail, that is:

    \(H_0: \mu=67565\)

    \(H_a: \mu\neq67565\)

    TT

    Although right now naming the procedure as left-tail, right-tail, or two-tail may not make sense, soon, we will be using this information as a part of the testing procedure.

    The idea is simple: the inequality defines a spectrum which we will use to evaluate the strength of the evidence. For example, in a right-tail test only values on the right side of the spectrum will be considered strong evidence. The further to the right, the stronger the evidence.
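    The rule used in the examples above (an inequality claim becomes the alternative hypothesis, an equation claim becomes the null hypothesis) can be sketched in a few lines of Python. The function name `set_up_hypotheses` and the plain-text symbols such as `mu` and `!=` are illustrative assumptions, not notation from this text.

```python
def set_up_hypotheses(parameter, relation, value):
    """Return (H0, Ha, test_type) for a claim such as 'mu < 170'.

    relation is one of '<', '>', '!=' or '='.
    This helper is hypothetical and only mirrors the rule in the text.
    """
    tails = {"<": "LT", ">": "RT", "!=": "TT"}
    # A claim stated as an equation becomes the null hypothesis,
    # so its alternative is the two-tail inequality.
    alt_relation = "!=" if relation == "=" else relation
    if alt_relation not in tails:
        raise ValueError(f"unsupported relation: {relation!r}")
    h0 = f"{parameter} = {value}"
    ha = f"{parameter} {alt_relation} {value}"
    return h0, ha, tails[alt_relation]


print(set_up_hypotheses("mu", "<", 170))        # ('mu = 170', 'mu < 170', 'LT')
print(set_up_hypotheses("sigma^2", "=", 1.21))  # ('sigma^2 = 1.21', 'sigma^2 != 1.21', 'TT')
```

    Whatever form the claim takes, the null hypothesis is always the equation, and the type of the test is read from the inequality in the alternative hypothesis.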

    Section 0.3: Introduction to Hypothesis Testing - Test Statistic

    Next, we will discuss the concept of a test statistic and what constitutes evidence against the null hypothesis.

    In statistical hypothesis testing:

    1. We will treat the null hypothesis as the assumed status quo.
    2. The goal of the testing procedure is to evaluate the strength of any evidence against the null hypothesis.
    3. We will not reject the null hypothesis unless the evidence strongly suggests the alternative hypothesis.

    In other words, we are not going to change the status quo unless the evidence is beyond a reasonable doubt. One may think of this presumption of the status quo as the equivalent of the presumption of innocence in a justice system. Next, we need to clarify what constitutes evidence and what evidence is beyond a reasonable doubt.

    What is considered evidence against the status quo? Whether we have evidence or not depends on the inequality in the alternative hypothesis and on the particular observed value of a sample statistic. Consider the following example in which the null hypothesis assumes that the average height is \(170\) cm. We are going to evaluate whether a particular observation is evidence or not for different alternative hypotheses:

    • If the average height of a sample is less than \(170\), it is interpreted as evidence against the null hypothesis in the LT and TT tests but not in the RT test.
    • If the average height of a sample is exactly \(170\), it cannot be interpreted as evidence against the null hypothesis, since it agrees with it exactly.
    • If the average height of a sample is greater than \(170\), it is interpreted as evidence against the null hypothesis in the TT and RT tests but not in the LT test.

    The following table summarizes whether a certain observation may be interpreted as evidence in favor of the alternative hypothesis:

    | Observation | \(H_a: \mu<170\) | \(H_a: \mu\neq170\) | \(H_a: \mu>170\) |
    |---|---|---|---|
    | \(<170\) | yes | yes | no |
    | \(=170\) | no | no | no |
    | \(>170\) | no | yes | yes |
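    The table above can be reproduced with a small Python sketch. The function name `is_evidence_against_null` and the sample values 168 and 172 are made up for illustration. The function only checks the direction of an observation and says nothing yet about how strong the evidence is.

```python
def is_evidence_against_null(observed, null_value, test_type):
    """Direction check only: does the observation point toward Ha?

    test_type is 'LT', 'TT' or 'RT' (hypothetical labels matching the text).
    """
    if test_type == "LT":
        return observed < null_value
    if test_type == "RT":
        return observed > null_value
    if test_type == "TT":
        return observed != null_value
    raise ValueError(f"unknown test type: {test_type!r}")


# One row per observation, columns in the order LT, TT, RT.
for observed in (168, 170, 172):
    row = [is_evidence_against_null(observed, 170, t) for t in ("LT", "TT", "RT")]
    print(observed, row)
```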

    Definition: Significance level

    A significance level is the maximum tolerated probability of a type 1 error.

    Usually, the significance level is determined from the context of the problem. The most common value is 5%, but it can be brought up to 10% or down to 1%. The smaller the significance level, the stricter the test. What is considered strong evidence at the 5% significance level may not be considered strong evidence at the 1% significance level.

    Definition: Test statistic

    A test statistic is a value that measures how far a particular observation is away from what it is expected to be according to the null hypothesis.

    The way we compute the test statistic will vary for different procedures. When the \(Z\)-distribution is involved, the test statistic is simply the \(z\)-score of the sample statistic. When other distributions are involved, such as \(T\), \(\chi^2\), or \(F\), the test statistic is analogous to the \(z\)-score but is not called so. Note that each sample has a test statistic value that measures its strength as evidence. So how exactly can a test statistic help us determine whether we have evidence or not?

    Observations that are smaller than expected will have a smaller test statistic value than observations that are larger than expected. So, a small test statistic is evidence that the population parameter is smaller than assumed in the null hypothesis. Similarly, a large test statistic is evidence that the population parameter is larger than assumed in the null hypothesis.

    For example, when a \(z\)-score is used as the test statistic, we know that its average is zero, so a negative test statistic means "small" and a positive test statistic means "large". The intuition is similar when Student's \(T\)-distribution is involved. But for other distributions, such as \(\chi^2\) and \(F\), the definitions of "small" and "large" are not so intuitive.
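    As a minimal sketch of the \(z\)-score case, the following Python function computes the \(z\)-score of a sample mean, assuming the population standard deviation is known. The function name and the numbers (a null mean of 170, a standard deviation of 10, and samples of size 25) are made up for illustration.

```python
from math import sqrt


def z_test_statistic(sample_mean, mu0, sigma, n):
    """z-score of a sample mean under the null hypothesis mu = mu0.

    sigma is the (assumed known) population standard deviation,
    n is the sample size.
    """
    standard_error = sigma / sqrt(n)
    return (sample_mean - mu0) / standard_error


# Hypothetical numbers: a sample mean below 170 gives a negative ("small")
# statistic, a sample mean above 170 gives a positive ("large") one.
print(z_test_statistic(166, 170, 10, 25))
print(z_test_statistic(173, 170, 10, 25))
```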

    Now that we know what observations constitute evidence against the null hypothesis, we need to decide what strong evidence is. In other words, at which point is the test statistic far enough from the expected value for us to reject the null hypothesis? This is where the significance level comes into play.

    Section 0.4: Introduction to Hypothesis Testing - Critical Value Approach (CVA)

    In the critical value approach (CVA), "too far" is defined by a range of values called the "rejection region" (RR). If the test statistic falls into the RR, it is interpreted as strong evidence against the null hypothesis in favor of the considered alternative.

    1. Draw the region according to the type of the test. For demonstration purposes we use a generic probability density curve of some distribution \(X\). But in an actual application it can be \(Z\), \(T\), \(\chi^2\), or \(F\) depending on the procedure.

    [Figure: rejection regions for the left-tail, two-tail, and right-tail tests]

    It should be easy to see the association between the type of the test and the way the rejection region is drawn.

    2. The area of the region must be equal to \(\alpha\), the significance level. By having the area of the rejection region equal to \(\alpha\) we guarantee that the probability of a type 1 error is not greater than the preset significance level.

    [Figure: rejection regions with total area \(\alpha\)]

    Geometrically the rejection region is defined by its boundaries which we call the critical values:

    • for a right-tail test, the rejection region is the right tail with the total area \(\alpha\), so it is defined by only the right critical value which can be expressed as \(x_\alpha\)
    • for a left-tail test, the rejection region is the left tail with the total area \(\alpha\), so it is defined by only the left critical value which can be expressed as \(x_{1-\alpha}\)
    • for a two-tail test, the rejection region with the total area \(\alpha\) is split evenly between the left and the right tail, so there are two critical values, the left and the right, \(x_{1-\alpha/2}\) and \(x_{\alpha/2}\)
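    For the standard normal distribution \(Z\), these critical values can be computed with Python's standard library, as in the sketch below. The function name `critical_values` is an illustrative assumption. Note that the text's \(x_\alpha\) is the value with area \(\alpha\) to its right, so it corresponds to the inverse CDF evaluated at \(1-\alpha\).

```python
from statistics import NormalDist


def critical_values(alpha, test_type):
    """Critical values of a Z test as a tuple (hypothetical helper).

    Uses the text's convention: x_alpha has area alpha to its RIGHT,
    so x_alpha = inv_cdf(1 - alpha) and x_{1-alpha} = inv_cdf(alpha).
    """
    z = NormalDist()  # standard normal: mean 0, standard deviation 1
    if test_type == "RT":
        return (z.inv_cdf(1 - alpha),)
    if test_type == "LT":
        return (z.inv_cdf(alpha),)
    if test_type == "TT":
        right = z.inv_cdf(1 - alpha / 2)
        return (-right, right)  # symmetric because Z is symmetric about 0
    raise ValueError(f"unknown test type: {test_type!r}")


for test_type in ("LT", "TT", "RT"):
    print(test_type, [round(c, 3) for c in critical_values(0.05, test_type)])
```

    The symmetry shortcut in the two-tail case holds for \(Z\) and \(T\) but not for \(\chi^2\) or \(F\), where the two critical values have to be found separately.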

    In the Critical Value Approach, the decision whether to reject the null hypothesis or not is based on whether the test statistic is in the rejection region or not.
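    The decision rule, and the claim that a rejection region of area \(\alpha\) keeps the probability of a type 1 error at \(\alpha\), can be checked with a small simulation. This is only a sketch for a \(Z\) test: the function names, the population values (mean 170, standard deviation 10), the sample size, and the random seed are all assumptions made for illustration.

```python
import random
from math import sqrt
from statistics import NormalDist, fmean


def in_rejection_region(z0, alpha, test_type):
    """True if the Z test statistic z0 falls in the rejection region."""
    z = NormalDist()
    if test_type == "LT":
        return z0 < z.inv_cdf(alpha)
    if test_type == "RT":
        return z0 > z.inv_cdf(1 - alpha)
    if test_type == "TT":
        return abs(z0) > z.inv_cdf(1 - alpha / 2)
    raise ValueError(f"unknown test type: {test_type!r}")


def simulated_type1_rate(alpha, test_type, mu0=170.0, sigma=10.0,
                         n=10, trials=20000, seed=0):
    """Fraction of samples that reject H0 when H0 is actually true."""
    rng = random.Random(seed)
    rejections = 0
    for _ in range(trials):
        # Draw the sample from a population where the null hypothesis holds.
        sample = [rng.gauss(mu0, sigma) for _ in range(n)]
        z0 = (fmean(sample) - mu0) / (sigma / sqrt(n))
        if in_rejection_region(z0, alpha, test_type):
            rejections += 1
    return rejections / trials


print(in_rejection_region(-1.58, 0.05, "LT"))  # False: -1.58 is right of -1.645
print(simulated_type1_rate(0.05, "LT"))        # should be close to 0.05
```

    Since every simulated sample comes from a population in which the null hypothesis is true, each rejection is a type 1 error, and the observed rejection rate should be close to \(\alpha\).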

    Section 0.5: Introduction to Hypothesis Testing - P-Value Approach (PVA)

    In the p-value approach (PVA), "too far" is defined by the probability that a statistic is at least as far from the expected value as the current observation. We call this probability the p-value of the test statistic.

    The p-value is the area of a region determined by the type of the test and the value of the test statistic \(x_0\), as shown below. Symbolically, the p-value can be expressed in the following way:

    • In the right-tail test it is the area to the right of the test statistic, \(P(X>x_0)\).
    • In the left-tail test it is the area to the left of the test statistic, \(P(X<x_0)\).
    • In the two-tail test it is twice the smaller of the two areas, \(2\cdot\min\{P(X<x_0),\,P(X>x_0)\}\).

     

    [Figure: p-value regions for the left-tail, two-tail, and right-tail tests]

    Note that \(X\) stands for some generic distribution. But in a real application it will be \(Z\), \(T\), \(\chi^2\), or \(F\).

    In the P-Value Approach, the decision whether to reject the null hypothesis or not is based on the relation between the p-value and \(\alpha\). If the p-value of the test statistic is less than \(\alpha\), then we consider the test statistic to be strong evidence against the null hypothesis in favor of the alternative.

    In summary, the p-value is a measure of how likely we were to observe such or stronger evidence assuming that the null hypothesis is true. When the p-value is less than a preset threshold, we consider our observation strong evidence against the null hypothesis.
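    For a \(Z\) test, the three p-value rules can be sketched with the standard normal CDF from Python's standard library. The function name `p_value` is an illustrative assumption, and the test statistic \(-1.58\) is the one computed later in this section.

```python
from statistics import NormalDist


def p_value(z0, test_type):
    """p-value of a Z test statistic z0 (hypothetical helper)."""
    left = NormalDist().cdf(z0)  # area to the left of z0
    right = 1 - left             # area to the right of z0
    if test_type == "LT":
        return left
    if test_type == "RT":
        return right
    if test_type == "TT":
        return 2 * min(left, right)
    raise ValueError(f"unknown test type: {test_type!r}")


alpha = 0.05
p = p_value(-1.58, "LT")
print(round(p, 3), "reject H0" if p < alpha else "do not reject H0")
```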

    Section 0.6: Introduction to Hypothesis Testing - Conclusion

    Let’s summarize all the steps:

    In step 1, we decide on the hypotheses that need to be tested and the type of the test.

    In step 2, we decide on the significance level alpha.

    In step 3, we compute the test statistic using the formula that is specific for each procedure.

    In step 4, we perform the testing using either the critical value approach or the p-value approach.

    In step 5, we will draw the conclusion to the testing procedure and decide whether to reject the null hypothesis or not. Note that the result is the same regardless of the approach.

    In step 6, we interpret the results.

    Due to the nature of the relations that we discovered in statistics, the purpose of the test is never to confirm the null hypothesis but to evaluate the strength of the evidence against it. When there is enough evidence against it, the null hypothesis is rejected in favor of the alternative hypothesis. However, the lack of evidence against the null hypothesis does not imply that the null hypothesis is true. In other words, failing to prove that something is false does not necessarily make it true. For example, when there is not enough evidence to convict a suspect, it does not necessarily imply that they are innocent. As a result, there are only two possible logical conclusions to the hypothesis testing procedure: either reject the null hypothesis in favor of the alternative hypothesis, or do not reject the null hypothesis due to the lack of evidence. Again, the testing procedure is not intended to prove or verify the null hypothesis.

    We summarize the procedure by creating the following template:

    Step 1

    The null hypothesis

    The alternative hypothesis

    Step 2

    The significance level

    Step 3

    The test statistic

    Step 4

    Testing

    Step 5

    Conclusion

    Step 6

    Interpretation: Under \(\alpha\cdot100\%\) significance level we DO or DO NOT have sufficient evidence to suggest the alternative hypothesis.

    Now we have completed the development of the hypothesis testing procedure that can be used to test a variety of statistical claims.

    Section 1: Hypothesis Test - One Mean \(Z\) Procedure

    Next, we will learn how to apply the One Mean \(Z\) Procedure to test a statistical claim about a population mean. Consider the following example.

    An incubation period is the time between when you contract a virus and when your symptoms start. Assume that the population of incubation periods for a novel coronavirus is normally distributed with a population standard deviation of \(2\) days. By surveying randomly selected local hospitals, a researcher was able to obtain the following sample of incubation periods of \(10\) patients:

    6, 2, 4, 3, 3, 7, 5, 8, 5, 7

    At \(5\%\) significance level, test the claim that the average incubation period of the novel coronavirus is less than \(6\) days.

    Note that the average of the sample is \(\bar{x}_i=5\) and the population standard deviation is given as \(\sigma=2\).

    Now, let’s identify the statistical claim that needs to be tested:

    “the average incubation period of the novel coronavirus is less than \(6\) days”

    The key word “average” suggests that the claim is about the parameter \(\mu\), so we can symbolically express the claim as

    \(\mu<6\)

    Since the claim is about the population mean and the standard deviation is known we will use the one mean \(Z\) procedure. Let’s check if all necessary assumptions are satisfied:

    • The sample is simple random
    • The population is normally distributed
    • The population standard deviation, \(\sigma\), is known and \(\sigma=2\)

    We will use the following template to perform the hypothesis testing:

    In step 1, we will set up the hypothesis:

    Since the claim is in the form of an inequality, it must be set as the alternative hypothesis. Therefore the null hypothesis is \(\mu=6\) and the test is left-tail, that is:

    \(H_0: \mu=6\)

    \(H_a: \mu<6\)

    LT

    In step 2, we will identify the significance level:

    The significance level can always be found in the text of the problem. In our case it is \(5\%\), thus:

    \(\alpha=0.05\)

    In step 3, we will find the test statistic using the formula:

    \(z_0=\frac{\bar{x}_i-\mu_0}{\sigma/\sqrt{n}}=\frac{5-6}{2/\sqrt{10}}=-1.58\)

    In step 4, we will use either the critical value approach or the p-value approach to test the claim:

    • In the critical value approach, we construct the rejection region:

    RR: less than \(-z_{0.05}=-1.645\)

    • In the p-value approach, we compute the p-value:

    P-Value: \(P(Z<-1.58)=0.057\)

    In step 5, we will draw the conclusion:

    • In the critical value approach, we must check whether the test statistic is in the rejection region or not. Our test statistic is \(-1.58\) and it is to the right of the critical value \(-1.645\), thus it is not in the rejection region.
    • In the p-value approach, we must check whether the p-value is less than the significance level or not. Our p-value is \(0.057\) and it is greater than \(\alpha\).

    Both tests suggest that we DO NOT reject the null hypothesis in favor of the alternative.

    In step 6, we will interpret the results:

    Under \(5\%\) significance level we DO NOT have sufficient evidence to suggest that the mean incubation period is less than \(6\) days.
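    The whole worked example can be replayed in Python as a check on the arithmetic. The sketch below uses only the standard library, and the variable names are illustrative. It keeps the unrounded test statistic, so its p-value may differ slightly in later decimals from the one computed from the rounded value \(-1.58\).

```python
from math import sqrt
from statistics import NormalDist, fmean

# Data and settings taken from the example in the text.
sample = [6, 2, 4, 3, 3, 7, 5, 8, 5, 7]
mu0 = 6       # value of the mean under H0
sigma = 2     # known population standard deviation
alpha = 0.05  # significance level

# Step 3: test statistic (z-score of the sample mean).
x_bar = fmean(sample)
z0 = (x_bar - mu0) / (sigma / sqrt(len(sample)))

# Step 4: left-tail critical value and left-tail p-value.
critical = NormalDist().inv_cdf(alpha)
p = NormalDist().cdf(z0)

# Step 5: both approaches must lead to the same decision.
reject_cva = z0 < critical
reject_pva = p < alpha

print("sample mean:", x_bar)
print("test statistic:", round(z0, 2))
print("critical value:", round(critical, 3))
print("p-value:", round(p, 3))
print("reject H0:", reject_cva, reject_pva)
```

    The rounded values should agree with the ones obtained by hand above, and neither approach rejects the null hypothesis.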

    We discussed how to apply the One Mean \(Z\) Procedure to test a statistical claim about a population mean when the population standard deviation is given.


    11.1: One Mean Z Test is shared under a not declared license and was authored, remixed, and/or curated by LibreTexts.
