# Test Coin

https://math.stackexchange.com/questions/2033370/how-to-determine-the-number-of-coin-tosses-to-identify-one-biased-coin-from-anot/2033739#2033739

Suppose there are two coins, and the probability that each coin flips a head is $$p$$ and $$q$$, respectively, with $$p, q \in [0,1]$$ given and known. If you are free to flip one of the coins, how many times $$n$$ do you have to flip it to decide, at some significance level $$\left( \textrm{say } \alpha = 0.05 \right)$$, whether it’s the $$p$$ coin or the $$q$$ coin that you’ve been flipping?

The number of heads after $$n$$ flips of a coin follows a binomial distribution, with mean $$pn$$ for the $$p$$ coin and $$qn$$ for the $$q$$ coin.

*Figure: two binomial distributions with $$n = 20$$, means $$pn = 10$$ and $$qn = 14$$.*
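As a quick sketch of the two distributions in the figure, we can compute the binomial probability mass functions directly. The concrete values $$p = 0.5$$ and $$q = 0.7$$ are assumptions chosen to match the figure's means ($$pn = 10$$, $$qn = 14$$ with $$n = 20$$); they are not given in the problem statement.

```python
from math import comb

def binom_pmf(k, n, p):
    """P(exactly k heads in n flips of a coin with head-probability p)."""
    return comb(n, k) * p**k * (1 - p)**(n - k)

# Hypothetical values consistent with the figure: n = 20, p = 0.5, q = 0.7,
# so the means are p*n = 10 and q*n = 14.
n, p, q = 20, 0.5, 0.7
pmf_p = [binom_pmf(k, n, p) for k in range(n + 1)]
pmf_q = [binom_pmf(k, n, q) for k in range(n + 1)]
```

Each list sums to 1, and the means $$\sum_k k \cdot P(k)$$ come out to exactly 10 and 14.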

## The Usual Hypothesis Test

In the usual hypothesis test, for example with data $$x_i, i=1, 2, 3, \dots, n$$ from a random variable $$X$$, to test whether the mean $$\mu$$ is $$\leq$$ some constant $$\mu_0$$:

\begin{align}
H_0 & : \mu \leq \mu_0 \quad ( \textrm{and } X \sim N(\mu_0, \textrm{ some } \sigma^2 ) ) \\
H_1 & : \mu > \mu_0
\end{align}

If the sample mean of the data points $$\overline{x}$$ is “too large compared to” $$\mu_0$$, then we reject the null hypothesis $$H_0$$.

If we have the probability distribution of the random variable (even if we don’t know the true value of the mean $$\mu$$), we may be able to know something about the probability distribution of a statistic obtained from manipulating the sample data, e.g. the sample mean.  This, the probability distribution of a statistic (obtained from manipulating sample data), is called the sampling distribution.  And a property of the sampling distribution, the standard deviation of a statistic, is the standard error.  For example, the standard error of the mean is:

Sample Data: $$x$$ $$\qquad$$ Sample Mean: $$\overline{x}$$

Variance: $$Var(x)$$ $$\qquad$$ Standard Deviation: $$StDev(x) = \sigma(x) = \sqrt{Var(x)}$$

Variance of the Sample Mean (using independence of the $$x_i$$): $$Var( \overline{x} ) = Var \left( \frac{1}{n} \sum_{i=1}^{n}{ x_i } \right) = \frac{1}{n^2} \sum_{i=1}^{n} { Var(x_i) } = \frac{1}{n^2} n Var(x) = \frac{1}{n} Var(x) = {\sigma^2 \over n}$$

Standard Deviation of the Sample Mean, Standard Error of the Mean: $$\frac{1}{\sqrt{n}} StDev(x) = {\sigma \over \sqrt{n}}$$

Thus, if the data are $$i.i.d.$$ (independent and identically distributed), then the sample mean $$\overline{x}$$ we obtain from the data has a standard deviation of $$\frac{\sigma}{\sqrt{n}}$$.  This standard deviation, being smaller than the standard deviation $$\sigma$$ of the original $$X$$, means that $$\overline{X}$$ is narrower around the mean than $$X$$. So $$\overline{X}$$ lets us hone in on what the data say about $$\mu$$ more precisely than a single observation does, i.e. it gives a narrower, more precise “range of certainty” from the sample data, at the same significance level.

Thus, given our sample $$x_i, i = 1, \dots, n$$, we can calculate the statistic $$\overline{x} = \frac{1}{n} \sum_{i=1}^{n} {x_i}$$, our sample mean.  From the data (or given information), we would like to calculate the standard error of the mean, the standard deviation of this sample mean as a random variable (where the sample mean is a statistic, i.e. can be treated as a random variable): $$\frac{1}{\sqrt{n}} StDev(x) = {\sigma \over \sqrt{n}}$$. This standard error of the mean gives us a “range of certainty” around the $$\overline{x}$$ with which to make an inference.
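The $$\sigma/\sqrt{n}$$ shrinkage can be checked with a small Monte-Carlo experiment: draw many samples, take each sample's mean, and compare the empirical spread of those means to $$\sigma/\sqrt{n}$$. The parameters ($$\mu = 0$$, $$\sigma = 2$$, $$n = 25$$) are arbitrary illustration values, not from the coin problem.

```python
import random
import statistics

random.seed(0)

# Simulate many samples of size n from N(mu, sigma^2) and compare the
# spread of the sample means to the theoretical standard error sigma/sqrt(n).
mu, sigma, n, trials = 0.0, 2.0, 25, 20000
sample_means = [
    statistics.fmean(random.gauss(mu, sigma) for _ in range(n))
    for _ in range(trials)
]

empirical_se = statistics.stdev(sample_means)
theoretical_se = sigma / n**0.5  # 2 / 5 = 0.4
```

With 20,000 trials the empirical standard error lands very close to the theoretical 0.4.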

## A. If we know/are given the true standard deviation $$\sigma$$

If we are given the true standard deviation $$\sigma$$ of the random variable $$X$$, then we can calculate the standard error of the sample mean: $$\frac{\sigma}{\sqrt{n}}$$.  So under the null hypothesis $$H_0: \mu \leq \mu_0$$, we want to check if the null hypothesis can hold against a test using the sample data.

### A.a Digression about $$H_0: \mu \leq \mu_0$$ and $$H_0: \mu = \mu_0$$

If the $$\mu$$ we infer from the sample data is “too extreme,” in this case “too large” compared to $$\mu_0$$ (i.e. the test statistic exceeds some critical value $$c(\mu_0)$$ that depends on $$\mu_0$$), we reject the null hypothesis. If instead we tested against some $$\mu_1 < \mu_0$$ (also in the range of our null hypothesis $$H_0: \mu \leq \mu_0$$), the critical value $$c(\mu_1)$$ would be less extreme than $$c(\mu_0)$$, in other words $$c(\mu_1) < c(\mu_0)$$, and thus it would be “easier to reject” the null hypothesis using $$c(\mu_1)$$. Rejecting a null hypothesis is reaching a conclusion, so the test ought to be conservative: we would like the test to be “the hardest to reject” that we can. The “hardest to reject” point in the range of $$H_0: \mu \leq \mu_0$$ is $$\mu = \mu_0$$, where the critical value $$c(\mu_0)$$ is the largest possible. Testing against a $$\mu_1 < \mu_0$$ could give a test statistic that is too extreme/large for $$\mu_1$$ (i.e. $$t > c(\mu_1)$$) but not too extreme/large for $$\mu_0$$ (i.e. $$t \not> c(\mu_0)$$). But if we test using $$\mu_0$$, and the test statistic is extreme/large enough to reject the null hypothesis $$\mu = \mu_0$$, that also rejects every other null hypothesis $$\mu = \mu_1$$ with $$\mu_1 < \mu_0$$.

So under the null hypothesis $$H_0: \mu \leq \mu_0$$ or the “effective” null hypothesis $$H_0: \mu = \mu_0$$, we have that $$X \sim N(\mu_0, \sigma^2)$$ with $$\sigma$$ known, and we have that $$\overline{X} \sim N(\mu_0, \sigma^2/n)$$.  This means that

$$\frac{\overline{X} - \mu_0} { ^{\sigma}/_{\sqrt{n}} } \sim N(0, 1)$$

Then we can use a standard normal table to find where on the standard normal the $$\alpha = 0.05$$ cutoff lies: for a one-tailed test, the cutoff is at $$Z_{\alpha} = 1.645$$ where $$Z \sim N(0, 1)$$.  So if

$$\frac{\overline{X} - \mu_0} { ^{\sigma}/_{\sqrt{n}} } > 1.645 = Z_{\alpha}$$,

then this result is “too large compared to $$\mu_0$$” so we reject the null hypothesis $$H_0: \mu \leq \mu_0$$.  If $$\frac{\overline{X} - \mu_0} { ^{\sigma}/_{\sqrt{n}} } \leq 1.645 = Z_{\alpha}$$, then we fail to reject the null hypothesis $$H_0: \mu \leq \mu_0$$.
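The known-$$\sigma$$ test above is easy to express in code. A minimal sketch, using the standard library's `NormalDist` to get the $$1 - \alpha$$ quantile (which reproduces the 1.645 cutoff); the numbers in the usage line are hypothetical.

```python
from statistics import NormalDist

def z_test_upper(xbar, mu0, sigma, n, alpha=0.05):
    """One-sided z-test of H0: mu <= mu0 with sigma known.
    Returns (z statistic, critical value Z_alpha, reject H0?)."""
    z = (xbar - mu0) / (sigma / n**0.5)
    z_alpha = NormalDist().inv_cdf(1 - alpha)  # ~1.645 for alpha = 0.05
    return z, z_alpha, z > z_alpha

# Hypothetical numbers: xbar = 10.5, mu0 = 10, sigma = 2, n = 64
# gives z = 0.5 / (2/8) = 2.0 > 1.645, so we reject H0.
z, z_alpha, reject = z_test_upper(10.5, 10, 2, 64)
```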

## B. If we don’t know the standard deviation $$\sigma$$

If we don’t know the value of the standard deviation $$\sigma$$ of our random variable $$X \sim N( \mu, \sigma^2 )$$ (which would be somewhat expected if we already don’t know the value of the mean $$\mu$$ of $$X$$), then we need to estimate $$\sigma$$ from our data $$x_i, i = 1, 2, \dots, n$$.  We can estimate $$\sigma$$ with the sample standard deviation $$s = \sqrt{ \frac{1}{n-1} \sum_{i=1}^{n} { \left[ (x_i - \overline{x})^2 \right] } }$$, or rather by computing the sample variance $$s^2 = { \frac{1}{n-1} \sum_{i=1}^{n} { \left[ (x_i - \overline{x})^2 \right] } }$$ and then taking its square root.

However, note that while the estimator for the sample variance is unbiased:

\begin{align}
\mathbb{E}\left[s^2\right] & = \mathbb{E}\left[ \frac{1}{n-1} \sum_{i=1}^{n} { \left[ (x_i - \overline{x})^2 \right] } \right] \\
& = \frac{1}{n-1} \mathbb{E} \left[ \sum_{i=1}^{n} { \left[ (x_i - \overline{x})^2 \right] } \right] = \frac{1}{n-1} \mathbb{E} \left[ \sum_{i=1}^{n} { \left[ (x_i - \mu + \mu - \overline{x})^2 \right] } \right] \\
& = \frac{1}{n-1} \mathbb{E} \left[ \sum_{i=1}^{n} { \left[ \left( (x_i - \mu) - (\overline{x} - \mu) \right)^2 \right] } \right] \\
& = \frac{1}{n-1} \mathbb{E} \left[ \sum_{i=1}^{n} { \left[  (x_i - \mu)^2 - 2 (x_i - \mu) (\overline{x} - \mu) + (\overline{x} - \mu)^2  \right] } \right] \\
& = \frac{1}{n-1} \mathbb{E} \left[   \sum_{i=1}^{n} { (x_i - \mu)^2 } - 2 (\overline{x} - \mu) \sum_{i=1}^{n} { (x_i - \mu) } + \sum_{i=1}^{n} { (\overline{x} - \mu)^2 }   \right]  \\
& = \frac{1}{n-1} \mathbb{E} \left[   \sum_{i=1}^{n} { (x_i - \mu)^2 } - 2 (\overline{x} - \mu)   (n \overline{x} - n \mu) + n (\overline{x} - \mu)^2    \right]   \\
& = \frac{1}{n-1} \mathbb{E} \left[   \sum_{i=1}^{n} { (x_i - \mu)^2 } - 2 n (\overline{x} - \mu)^2 + n (\overline{x} - \mu)^2    \right]   \\
& = \frac{1}{n-1} \mathbb{E} \left[   \sum_{i=1}^{n} { (x_i - \mu)^2 } - n (\overline{x} - \mu)^2    \right]   \\
& = \frac{1}{n-1} \left(    \sum_{i=1}^{n} { \mathbb{E} \left[ (x_i - \mu)^2 \right] } - n \mathbb{E} \left[ (\overline{x} - \mu)^2 \right]  \right)  \\
& = \frac{1}{n-1} \left(    \sum_{i=1}^{n} { \mathbb{E} \left[ x_i^2 - 2 \mu x_i + \mu^2 \right] } - n \mathbb{E} \left[ \overline{x}^2 - 2 \mu \overline{x} + \mu^2 \right]  \right) \\
& = \frac{1}{n-1} \left(    \sum_{i=1}^{n} { \left( \mathbb{E} \left[ x_i^2 \right] - 2 \mu \mathbb{E} [x_i] + \mu^2 \right) } - n \left( \mathbb{E} \left[ \overline{x}^2 \right] - 2 \mu \mathbb{E} [\overline{x}] + \mu^2 \right)  \right) \\
& = \frac{1}{n-1} \left(    \sum_{i=1}^{n} { \left( \mathbb{E} \left[ x_i^2 \right] - 2 \mu^2 + \mu^2 \right) } - n \left( \mathbb{E} \left[ \overline{x}^2 \right] - 2 \mu^2 + \mu^2 \right)  \right) \\
& = \frac{1}{n-1} \left(    \sum_{i=1}^{n} { \left( \mathbb{E} \left[ x_i^2 \right] - \left( \mathbb{E} [x_i] \right)^2 \right) } - n \left( \mathbb{E} \left[ \overline{x}^2 \right] -  \left( \mathbb{E} [\overline{x}] \right)^2 \right)  \right) \\
& = \frac{1}{n-1} \left(    \sum_{i=1}^{n} { Var(x_i) } - n \, Var(\overline{x})  \right) = \frac{1}{n-1} \left(    n \sigma^2 - n \frac{\sigma^2}{n} \right) \\
&  = \frac{1}{n-1} \left(    n \sigma^2 - \sigma^2 \right)  = \sigma^2 \\
\end{align}

that does not allow us to say that the square root of the above estimator gives us an unbiased estimator for the standard deviation $$\sigma$$. In other words:

$$\mathbb{E}\left[s^2\right] = \mathbb{E}\left[ \frac{1}{n-1} \sum_{i=1}^{n} { \left[ (x_i - \overline{x})^2 \right] } \right] = \sigma^2$$

but

$$\mathbb{E} [s] = \mathbb{E} \left[ \sqrt{ \frac{1}{n-1} \sum_{i=1}^{n} { \left[ (x_i - \overline{x})^2 \right] } } \right] \neq \sigma$$

because the expectation function and the square root function do not commute:

$$\sigma = \sqrt{\sigma^2} = \sqrt{ \mathbb{E}[s^2] } \neq \mathbb{E}[\sqrt{s^2}] = \mathbb{E}[s]$$

### B.a The sample standard deviation $$s = \sqrt{s^2} = \sqrt{ \frac{1}{n-1} \sum_{i=1}^{n} { \left[ (x_i - \overline{x})^2 \right] } }$$ is a biased estimator of $$\sigma$$

In fact, we can infer the direction of the bias of $$\mathbb{E} [s]$$. The square root function $$f(x) = \sqrt{x}$$ is a concave function. A concave function $$f$$ satisfies:

$$\forall x_1, x_2 \in X, \forall t \in [0, 1]: \quad f(tx_1 + (1 - t) x_2 ) \geq tf(x_1) + (1 - t) f(x_2)$$

The left-hand side of the inequality is the blue portion of the curve $$\{ f( \textrm{mixture of } x_1 \textrm{ and } x_2 ) \}$$ and the right-hand side of the inequality is the red line segment $$\{ \textrm{a mixture of } f(x_1) \textrm{ and } f(x_2) \}$$. We can see visually what it means for a function to be concave: between two arbitrary $$x$$-values $$x_1$$ and $$x_2$$, the blue portion always lies $$\geq$$ the red segment.

Jensen’s Inequality says that if $$g(x)$$ is a convex function, then:

$$g( \mathbb{E}[X] ) \leq \mathbb{E}\left[ g(X) \right]$$

and if $$f(x)$$ is a concave function, then:

$$f( \mathbb{E}[X] ) \geq \mathbb{E}\left[ f(X) \right]$$

The figure above showing the concave function $$f(x) = \sqrt{x}$$ gives an intuitive illustration of Jensen’s Inequality as well (since Jensen’s Inequality can be said to be a generalization of the “mixture” of $$x_1$$ and $$x_2$$ property of convex and concave functions to the expectation operator). The left-hand side $$f(\mathbb{E}[X])$$ is like $$f( \textrm{a mixture of } X \textrm{ values} )$$ and the right-hand side $$\mathbb{E}\left[ f(X) \right]$$ is like $${\textrm{a mixture of } f(X) \textrm{ values} }$$ where the “mixture” in both cases is the “long-term mixture” of $$X$$ values that is determined by the probability distribution of $$X$$.
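Jensen's inequality for the concave square root can be checked numerically. A minimal sketch: take a non-negative random variable (here, an arbitrary choice of $$X = Z^2$$ with $$Z$$ standard normal, so $$\mathbb{E}[X] = 1$$ and $$\mathbb{E}[\sqrt{X}] = \mathbb{E}|Z| = \sqrt{2/\pi} \approx 0.798$$) and compare $$f(\mathbb{E}[X])$$ against $$\mathbb{E}[f(X)]$$.

```python
import random

random.seed(1)

# Monte-Carlo check of Jensen's inequality for the concave f(x) = sqrt(x):
# f(E[X]) >= E[f(X)].  X here is chosen as Z^2 for Z ~ N(0, 1).
samples = [random.gauss(0, 1) ** 2 for _ in range(100000)]

mean_of_x = sum(samples) / len(samples)                    # E[X], about 1
f_of_mean = mean_of_x ** 0.5                               # f(E[X]), about 1
mean_of_f = sum(x ** 0.5 for x in samples) / len(samples)  # E[f(X)], about 0.798
```

The gap between the two sides is exactly the kind of gap that makes $$\mathbb{E}[s] = \mathbb{E}[\sqrt{s^2}]$$ fall below $$\sqrt{\mathbb{E}[s^2]} = \sigma$$.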

Since $$f(z) = \sqrt{z}$$ is a concave function, going back to our estimation of the standard deviation of $$X$$ using $$\sqrt{s^2}$$, we have
\begin{align}
f( \mathbb{E}[Z] ) & \geq \mathbb{E}\left[ f(Z) \right] \longrightarrow \\
\sqrt{\mathbb{E}[Z]} & \geq \mathbb{E}\left[ \sqrt{Z} \right] \longrightarrow \\
\sqrt{ \mathbb{E}[s^2] } & \geq \mathbb{E}\left[ \sqrt{s^2} \right] \longrightarrow \\
\sqrt{ Var(X) } & \geq \mathbb{E}\left[s\right] \\
\textrm{StDev} (X) = \sigma(X) & \geq \mathbb{E}\left[s\right] \\
\end{align}

Thus, $$\mathbb{E} [s] = \mathbb{E} \left[ \sqrt{ \frac{1}{n-1} \sum_{i=1}^{n} { \left[ (x_i - \overline{x})^2 \right] } } \right] \leq \sigma$$. So $$s$$ is a biased estimator that underestimates the true $$\sigma$$.
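The underestimation is visible in simulation: averaging $$s$$ over many small samples from $$N(0, \sigma^2)$$ lands below $$\sigma$$, and (for $$n = 5$$) close to the value $$c_4(5)\,\sigma \approx 0.940\,\sigma$$ discussed below. The sample size and trial count are arbitrary illustration choices.

```python
import random
import statistics

random.seed(2)

# Average the sample standard deviation s over many small samples drawn
# from N(0, sigma^2); the average falls below sigma, near c4(5) * sigma.
sigma, n, trials = 1.0, 5, 50000
avg_s = statistics.fmean(
    statistics.stdev(random.gauss(0, sigma) for _ in range(n))
    for _ in range(trials)
)
```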

However, the exact bias $$\textrm{Bias}(s) = \mathbb{E} [s] - \sigma$$ is not as clean to derive.

https://en.wikipedia.org/wiki/Unbiased_estimation_of_standard_deviation

$$\frac{(n-1)s^2}{\sigma^2} = \frac{1}{\sigma^2} \sum_{i=1}^{n} { \left[ (x_i - \overline{x})^2 \right] } \sim$$ a $$\chi^2$$ distribution with $$n-1$$ degrees of freedom. In addition, $$\sqrt{ \frac{(n-1)s^2}{\sigma^2} } = \frac{\sqrt{n-1}\,s}{\sigma} = \frac{1}{\sigma} \sqrt{ \sum_{i=1}^{n} { \left[ (x_i - \overline{x})^2 \right] } } \sim$$ a $$\chi$$ distribution with $$n-1$$ degrees of freedom. A $$\chi$$ distribution with $$k$$ degrees of freedom has mean $$\mathbb{E} \left[ \frac{\sqrt{n-1}\,s}{\sigma} \right] = \mu_{\chi} = \sqrt{2} \frac{\Gamma ( ^{(k+1)} / _2 ) } { \Gamma ( ^k / _2 )}$$ where $$\Gamma(z)$$ is the Gamma function.

https://en.wikipedia.org/wiki/Gamma_function

If $$n$$ is a positive integer, then $$\Gamma(n) = (n - 1)!$$. If $$z$$ is a complex number that is not a non-positive integer, then $$\Gamma(z) = \int_{0}^{\infty}{x^{z-1} e^{-x} dx}$$. For non-positive integers, $$\Gamma(z)$$ goes to $$\infty$$ or $$-\infty$$.
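Python's standard library exposes the Gamma function as `math.gamma`, so the factorial identity (and the interpolation between integers) can be verified directly:

```python
import math

# For positive integers n, Gamma(n) = (n - 1)!.
int_checks = [(n, math.gamma(n), math.factorial(n - 1)) for n in range(1, 7)]

# Gamma also interpolates between the integers, e.g. Gamma(1/2) = sqrt(pi).
gamma_half = math.gamma(0.5)
```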

From the mean of a $$\chi$$ distribution above, we have:

$$\mathbb{E}[s] = {1 \over \sqrt{n - 1} } \cdot \mu_{\chi} \cdot \sigma$$

and replacing $$k$$ with $$n-1$$ degrees of freedom for the value of $$\mu_{\chi}$$, we have:

$$\mathbb{E}[s] = \sqrt{ {2 \over n - 1} } \cdot { \Gamma(^n/_2) \over \Gamma(^{n-1}/_2) } \cdot \sigma$$

Wikipedia tells us that:

$$\sqrt{ {2 \over n - 1} } \cdot { \Gamma(^n/_2) \over \Gamma(^{n-1}/_2) } = c_4(n) = 1 - {1 \over 4n} - {7 \over 32n^2} - {19 \over 128n^3} - O(n^{-4})$$

So we have:

$$\textrm{Bias} (s) = \mathbb{E}[s] - \sigma = c_4(n) \cdot \sigma - \sigma = ( c_4(n) - 1) \cdot \sigma$$

$$= \left( \left( 1 - {1 \over 4n} - {7 \over 32n^2} - {19 \over 128n^3} - O(n^{-4}) \right) - 1 \right) \cdot \sigma = - \left( {1 \over 4n} + {7 \over 32n^2} + {19 \over 128n^3} + O(n^{-4}) \right) \cdot \sigma$$

Thus, as $$n$$ becomes large, the magnitude of the bias becomes small.

From Wikipedia, these are the values of $$n$$, $$c_4(n)$$, and the numerical value of $$c_4(n)$$:

\begin{array}{|l|r|c|}
\hline
n & c_4(n) & \textrm{Numerical value of } c_4(n) \\
\hline
2 & \sqrt{2 \over \pi} & 0.798… \\
3 & {\sqrt{\pi} \over 2} & 0.886… \\
5 & {3 \over 4}\sqrt{\pi \over 2} & 0.940… \\
10 & {128 \over 105}\sqrt{2 \over \pi} & 0.973… \\
100 & - & 0.997… \\
\hline
\end{array}
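The closed forms in the table can be double-checked by computing $$c_4(n)$$ directly from its Gamma-function definition, alongside the truncated asymptotic expansion:

```python
import math

def c4(n):
    """Exact c4(n) = sqrt(2 / (n - 1)) * Gamma(n / 2) / Gamma((n - 1) / 2)."""
    return math.sqrt(2 / (n - 1)) * math.gamma(n / 2) / math.gamma((n - 1) / 2)

def c4_series(n):
    """First terms of the expansion: 1 - 1/(4n) - 7/(32 n^2) - 19/(128 n^3)."""
    return 1 - 1 / (4 * n) - 7 / (32 * n ** 2) - 19 / (128 * n ** 3)
```

For example `c4(5)` reproduces $$\frac{3}{4}\sqrt{\pi/2} \approx 0.940$$, and at $$n = 100$$ the exact value and the series agree to many decimal places.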

Thus, for the most part, we don’t have to worry too much about this bias, especially with large $$n$$. So we have
$$\mathbb{E}[\hat{\sigma}] = \mathbb{E}[s] = \mathbb{E}[\sqrt{s^2}] = \mathbb{E} \left[ \sqrt{ \frac{1}{n-1} \sum_{i=1}^{n} { \left[ (x_i - \overline{x})^2 \right] } } \right] \approx \sigma$$

More rigorously, our estimator $$\hat{\sigma} = s = \sqrt{ \frac{1}{n-1} \sum_{i=1}^{n} { \left[ (x_i - \overline{x})^2 \right] } }$$ is a consistent estimator of $$\sigma$$ (even though it is a biased estimator of $$\sigma$$).

An estimator is consistent if $$\forall \epsilon > 0$$:

$$\lim\limits_{n \to \infty} \textrm{Pr } (|\hat{\theta} – \theta| > \epsilon ) = 0$$

In other words, as $$n \to \infty$$, the probability that our estimator $$\hat{\theta}$$ “misses” the true value of the parameter $$\theta$$ by greater than some arbitrary positive amount (no matter how small) goes to $$0$$.

For the sample standard deviation $$s$$ as our estimator of the true standard deviation $$\sigma$$ (i.e. let $$\hat{\sigma} = s$$), we have

$$\mathbb{E}[s] = c_4(n) \cdot \sigma \longrightarrow \sigma \qquad \textrm{and} \qquad Var(s) = \mathbb{E}[s^2] - \left( \mathbb{E}[s] \right)^2 = \left( 1 - c_4(n)^2 \right) \sigma^2 \longrightarrow 0$$

as $$n \to \infty$$, since $$c_4(n) \to 1$$. So $$s$$ concentrates around a value, $$\mathbb{E}[s]$$, that approaches $$\sigma$$, and by Chebyshev’s inequality, for any $$\epsilon > 0$$,

$$\lim_{n \to \infty} \textrm{Pr } (|\hat{\sigma} - \sigma| > \epsilon) = 0$$

Since $$s$$ is a consistent estimator of $$\sigma$$, we are fine to use $$s$$ to estimate $$\sigma$$ as long as we have large $$n$$.

So back to the matter at hand: we want to know the sampling distribution of $$\overline{X}$$ to see “what we can say” about $$\overline{X}$$, specifically, the standard deviation of $$\overline{X}$$, i.e. the standard error of the mean of $$X$$. Not knowing the true standard deviation $$\sigma$$ of $$X$$, we use a consistent estimator of $$\sigma$$ to estimate it: $$s = \sqrt{{1 \over n-1} \sum_{i=1}^n {(x_i – \overline{x})^2}}$$.

So instead of the case where we know the value of $$\sigma$$,
$$\overline{X} \sim N(\mu, \sigma^2/n)$$
we now have, with the estimate $$s$$ standing in for $$\sigma$$,
$$\overline{X} \quad “\sim” \quad N(\mu, s^2/n)$$

When we know the value of $$\sigma$$, we have
$${ \overline{X} - \mu \over \sigma/\sqrt{n} } \sim N(0,1)$$
When we don’t know the value of $$\sigma$$ and use the estimate $$s$$ instead, then rather than something like
$${ \overline{X} - \mu \over s/\sqrt{n} } \quad “\sim” \quad N(0,1)$$
we actually have the exact distribution:
$${ \overline{X} - \mu \over s/\sqrt{n} } \sim T_{n-1}$$
the Student’s t-distribution with $$n-1$$ degrees of freedom.

Thus, finally, when we don’t know the true standard deviation $$\sigma$$, under the null hypothesis $$H_0: \mu \leq \mu_0$$, we can use the expression above to create a test statistic
$$t = { \overline{x} - \mu_0 \over s/\sqrt{n} } \sim T_{n-1}$$
and check it against the Student’s t-distribution with $$n-1$$ degrees of freedom, $$T_{n-1}$$, with some critical value at some significance level, say $$\alpha = 0.05$$.

So if the test statistic exceeds our critical value $$T_{n-1, \alpha}$$ at significance level $$\alpha = 0.05$$:

$$t = { \overline{x} - \mu_0 \over s/\sqrt{n} } > T_{n-1, \alpha}$$

then we reject our null hypothesis $$H_0: \mu \leq \mu_0$$ at $$\alpha = 0.05$$ significance level. If not, then we fail to reject our null hypothesis.
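The t-test recipe can be sketched in a few lines. The data below are made up for illustration, and since the standard library has no t-distribution quantile function, the one-sided critical value 1.833 for 9 degrees of freedom at $$\alpha = 0.05$$ is taken from a standard t-table.

```python
import statistics

def t_statistic(data, mu0):
    """t = (xbar - mu0) / (s / sqrt(n)) for a one-sample test of H0: mu <= mu0."""
    n = len(data)
    xbar = statistics.fmean(data)
    s = statistics.stdev(data)  # sample standard deviation, n - 1 denominator
    return (xbar - mu0) / (s / n**0.5)

# Hypothetical sample of n = 10 points; one-sided t-table critical value
# for 9 degrees of freedom at alpha = 0.05 is about 1.833.
data = [10.8, 11.2, 9.9, 10.5, 11.0, 10.7, 10.9, 11.3, 10.4, 10.6]
t = t_statistic(data, mu0=10.0)
reject = t > 1.833
```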


Back to our case with 2 coins.  Let’s say we want to test whether our coin is the $$p$$ coin, and let’s arbitrarily call the smaller probability $$p$$, i.e. $$p \leq q$$.  We know that coin flips give us a binomial distribution, and we know the standard error of the mean proportion of heads from $$n$$ flips.  So a 0.05 significance level corresponds to some cutoff value $$c$$ with $$c > p$$.  But note that if $$c$$ ends up very large relative to $$q$$, e.g. it gets close to $$q$$ or even exceeds $$q$$, we are in a weird situation.

Instead, we can decide on some cutoff value $$c$$ between $$p$$ and $$q$$.  If we move $$c$$ around, the significance level and the power of the test change, whether we are testing against $$p$$ or against $$q$$.
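The trade-off between significance and power can be resolved numerically with an exact binomial calculation, a brute-force sketch of the original question. Assumptions beyond the original statement: $$p < q$$, the test is of $$H_0$$: "the coin is the $$p$$ coin", and we additionally require power at least $$1 - \beta$$ against the $$q$$ coin (with $$\beta = 0.05$$ chosen here to mirror $$\alpha$$).

```python
from math import comb

def binom_sf(k, n, p):
    """P(X >= k) for X ~ Binomial(n, p)."""
    return sum(comb(n, i) * p**i * (1 - p)**(n - i) for i in range(k, n + 1))

def flips_needed(p, q, alpha=0.05, beta=0.05):
    """Smallest n (with head-count cutoff c) such that rejecting
    H0: 'coin is p' when we see >= c heads has significance <= alpha
    and power >= 1 - beta against the q coin.  Assumes p < q."""
    for n in range(1, 1000):
        # smallest cutoff c with P(X >= c | p coin) <= alpha
        c = next(c for c in range(n + 2) if binom_sf(c, n, p) <= alpha)
        # power: probability the q coin produces >= c heads
        if binom_sf(c, n, q) >= 1 - beta:
            return n, c
    raise ValueError("more than 1000 flips required")
```

For example, `flips_needed(0.5, 0.7)` returns an $$(n, c)$$ pair whose cutoff simultaneously keeps the false-rejection probability under the $$p$$ coin at most 0.05 and the detection probability under the $$q$$ coin at least 0.95.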