How to increase power in statistics?

Have you ever run a test that you were sure would deliver good results for your company, only to get a result you didn’t expect? Your test failed and you were baffled. This might be because the statistical power of your test wasn’t high enough and you failed to reject a false null hypothesis. This article will teach you how to increase power in statistics.

To increase your power you need a good understanding of the term. This article begins by describing what power is in statistics. It then describes how power is calculated and how to increase it, why high power is important, and what level of statistical power is the industry standard among conversion rate experts and researchers.

How to increase power? #1: What is power in statistics?

Power in statistics is the chance that a test rejects a false null hypothesis. When you are testing and comparing two variants, the null hypothesis is the assumption that there is no real difference between them. When the null hypothesis is false but your test fails to reject it, there was a good result to find and you are missing out on it.

Power can range from 0% to 100%. The higher your power is, the lower the chance of failing to reject a false null hypothesis. Here is an example of two tests evaluated with different statistical power levels.

Test 1:

Control sessions: 10000

Control conversions: 1000

Variant sessions: 10000

Variant conversions: 1000

Power: 80%

In this example, there is an 80% chance that a true positive result will be detected and the false null hypothesis rejected. Hence you also have a 20% chance of not detecting a true positive result and failing to reject a false null hypothesis.

Test 2:

Control sessions: 15000

Control conversions: 1000

Variant sessions: 15000

Variant conversions: 1000

Power: 90%

In this example, there is a 90% chance that a true positive result will be detected and the false null hypothesis rejected. Hence you also have a 10% chance of not detecting a true positive result and failing to reject a false null hypothesis. This shows clearly how to increase power in statistics: more traffic equals more power!
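
To see how traffic drives power, here is a minimal sketch that approximates the power of a two-sided test on two conversion rates with a normal approximation. The function name and the true conversion rates (10% for the control, 11% for the variant) are assumptions made up for illustration; they are not the numbers from the two tests above.

```python
from math import sqrt
from statistics import NormalDist


def proportion_test_power(p_control, p_variant, sessions_per_variant, alpha=0.05):
    """Approximate power of a two-sided two-proportion z-test."""
    z = NormalDist()
    # Standard error of the difference between the two observed rates.
    se = sqrt(p_control * (1 - p_control) / sessions_per_variant
              + p_variant * (1 - p_variant) / sessions_per_variant)
    # How many standard errors the true difference is away from zero.
    shift = abs(p_variant - p_control) / se
    z_crit = z.inv_cdf(1 - alpha / 2)
    # Chance that the test statistic lands in either rejection region.
    return z.cdf(shift - z_crit) + z.cdf(-shift - z_crit)


# Hypothetical true rates: 10% (control) versus 11% (variant).
for sessions in (5000, 10000, 15000):
    power = proportion_test_power(0.10, 0.11, sessions)
    print(f"{sessions} sessions per variant: power is about {power:.0%}")
```

Under these assumptions the power goes up every time the number of sessions per variant goes up, while the conversion rates and the significance level stay the same.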


Wikipedia has this definition of power:

”The power of a binary hypothesis test is the probability that the test rejects the null hypothesis (H0) when a specific alternative hypothesis (H1) is true. The statistical power ranges from 0 to 1, and as statistical power increases, the probability of making a type II error (wrongly failing to reject the null hypothesis) decreases.”

A good thing to remember here is that H1 denotes the alternative hypothesis and H0 the null hypothesis. The alternative hypothesis is the result you are looking for. With a power of 80%, when H1 is true you have an 80% chance of correctly rejecting H0 and a 20% chance of wrongly failing to reject it (a type II error).
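
One way to make this definition concrete is to simulate many experiments and count how often H0 gets rejected. The sketch below is only an illustration: the function name, the group size of 50, the standard deviation of 1 and the true difference of 0.5 are all made-up assumptions, and it uses a simple z-test with a known standard deviation.

```python
import random
from math import sqrt
from statistics import NormalDist, fmean


def simulated_rejection_rate(true_difference, n=50, sigma=1.0, alpha=0.05,
                             runs=2000, seed=1):
    """Share of simulated experiments in which a two-sided z-test rejects H0."""
    rng = random.Random(seed)
    z_crit = NormalDist().inv_cdf(1 - alpha / 2)
    se = sigma * sqrt(2 / n)  # standard error of the difference in means
    rejections = 0
    for _ in range(runs):
        control = [rng.gauss(0.0, sigma) for _ in range(n)]
        variant = [rng.gauss(true_difference, sigma) for _ in range(n)]
        z_stat = (fmean(variant) - fmean(control)) / se
        if abs(z_stat) > z_crit:
            rejections += 1
    return rejections / runs


# H1 is true: the rejection rate estimates the power.
power = simulated_rejection_rate(true_difference=0.5)
# H0 is true: the rejection rate estimates the type I error rate.
false_positive_rate = simulated_rejection_rate(true_difference=0.0)
print(f"H0 rejected when H1 is true: {power:.0%}")
print(f"H0 rejected when H0 is true: {false_positive_rate:.0%}")
```

The first number is an estimate of the power and one minus that number is the type II error rate. The second number should land close to the chosen α of 5%.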

I like to think of it as being on a battleship. So here we go:

You are on a battleship in the South Pacific.

On the bridge of the ship, you are responsible for the radar screens. You have two radar screens, one for detecting the null hypothesis and another one for detecting the alternative hypothesis.

When your vessel encounters a ship, the commander will shout out if it is an enemy vessel, which in this example will be the null hypothesis.

You need to detect if the enemy ship is being followed by a submarine filled with valuable war treasure, because it would be a waste to miss out on all the sweet loot!

A ship on its own we will consider to be a true null hypothesis. We will consider it a false null hypothesis when it’s being followed by a submarine. 

You look at your null hypothesis radar to start detecting a false null hypothesis submarine. The onboard radar has a power of 80%. This means you have an 80% chance of detecting the false null hypothesis submarine.

You consider this to be an acceptable risk because raising the power would affect your other radar, the one detecting the alternative hypothesis.

When a friendly ship appears, your commander will shout this out. In this example, it is considered the alternative hypothesis.

You need to detect if the friendly ship is being followed by a submarine with nuclear warheads, because it would otherwise blow up the friendly ship, which is filled to the brim with spoils of war.

When a ship is on its own it will be considered a true alternative hypothesis. The ship will be considered a false alternative hypothesis when it’s being followed by the submarine with nuclear warheads.

You look at your alternative hypothesis radar and start detecting the false alternative hypothesis submarine. This radar has a different setting, which is statistical significance.

The setting is set to a significance level of 95%. 

This means you have a 95% chance of detecting the false alternative hypothesis submarine with nuclear warheads. The two radars are linked: lowering this level would increase power, but then you are more likely to miss a submarine with nuclear warheads. Raising it has the opposite effect and increases the chance of missing out on a submarine filled with the spoils of war.

You consider this an acceptable risk because you think it’s better to miss out on a submarine filled with treasure (missing a false null hypothesis) than to miss a submarine that will blow up a friendly ship (missing a false alternative hypothesis)! You leave the settings of the radar as they are because this is what most radar experts (researchers) have used for years and always had great results with.

How to increase power? #2: How is power calculated in statistics?

Calculating power in statistics by hand is incredibly difficult, as shown in this article by Real Statistics. Usually, the same result can be achieved with free calculators on the internet, letting those do the heavy lifting for you.

It is wise to understand the data you are putting into these calculators, because then you can make sure your calculation is correct. They use a lot of terms that are common in statistics. We will look at one calculator and explain the different terms.

To calculate the power I like to use this website:

https://www.stat.ubc.ca/~rollin/stats/ssize/n2.html

The first step is to set the option to calculate Power:

Then you need to input the following data to make it work:

  • μ1: The mean of population 1
  • μ2: The mean of population 2
  • σ: Common standard deviation
  • One-sided or two-sided test
  • α: Type 1 error rate
  • n: The sample size

 μ: The mean of a population

Arithmetic mean

To understand the mean of a population you need to know what a mean is in arithmetic. It’s the sum of a set of numerical values divided by the number of items in the set. For example, you could add 3 + 4 + 5 + 6 and get 18 in total. The set contains 4 numbers, so you divide 18 by 4 to get the mean, which is 4.5. Check out this example from GeeksforGeeks.

Population calculation

A population in statistics is a collection of persons, items, or objects. What we need is the population mean for the first two fields. To get this we will need to use this formula.

$$\mu = \frac{\sum_{i=1}^{N} x_i}{N}$$

This is equal to the formula we discussed before:

$$\mu = \frac{x_1 + x_2 + \dots + x_N}{N}$$

Let’s say you are testing for 7 days and these are the visitors you are getting for one variant:

  • Day 1: 10000 visitors
  • Day 2: 20000 visitors
  • Day 3: 15000 visitors
  • Day 4: 18000 visitors
  • Day 5: 17000 visitors
  • Day 6: 12000 visitors
  • Day 7: 21000 visitors

This means that to get the mean of the population you need to add all the visitors together and divide the total by 7 days.

(10000 + 20000 + 15000 + 18000 + 17000 + 12000 + 21000) / 7 = 113000 / 7 ≈ 16,143 = μ1

Do the same thing for the visitors of your other variant and you will have the first two fields of the calculator filled in.

  • Day 1: 12000 visitors
  • Day 2: 18000 visitors
  • Day 3: 14000 visitors
  • Day 4: 12000 visitors
  • Day 5: 15000 visitors
  • Day 6: 17000 visitors
  • Day 7: 24000 visitors

(12000 + 18000 + 14000 + 12000 + 15000 + 17000 + 24000) / 7 = 112000 / 7 = 16000 = μ2
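
If you would rather not add these numbers up by hand, the same two means can be computed with Python's built-in `statistics` module. This is just a small sketch; the variable names are my own.

```python
from statistics import fmean

# Daily visitors for the two variants, taken from the lists above.
variant_1 = [10000, 20000, 15000, 18000, 17000, 12000, 21000]
variant_2 = [12000, 18000, 14000, 12000, 15000, 17000, 24000]

# fmean adds all the values together and divides by the number of values.
mu1 = fmean(variant_1)
mu2 = fmean(variant_2)

print(f"mu1 = {mu1:.0f}, mu2 = {mu2:.0f}")
```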

σ: Sigma

Next up is sigma. Sigma is also known as the common standard deviation. It’s often safe to assume that both populations have equal standard deviation so we will use one population to calculate this value. 

You need to subtract the mean of your population from every number in the range and square the result:

  • (12000 – 16000)² = (-4000)² = 16000000
  • (18000 – 16000)² = (2000)² = 4000000
  • (14000 – 16000)² = (-2000)² = 4000000
  • (12000 – 16000)² = (-4000)² = 16000000
  • (15000 – 16000)² = (-1000)² = 1000000
  • (17000 – 16000)² = (1000)² = 1000000
  • (24000 – 16000)² = (8000)² = 64000000

Then you need to work out the mean of these numbers using the same technique as in the previous example.

(16000000 + 4000000 + 4000000 + 16000000 + 1000000 + 1000000 + 64000000) / 7 = 106000000 / 7 ≈ 15,142,857

This is the variance. To get the standard deviation you need to take the square root of the variance.

√(15,142,857) ≈ 3891

This technique is used when calculating for the entire population. If you are analyzing a sample of the population you will need a slightly different formula, one that divides by the number of items minus one instead of the number of items.
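
Both versions of the standard deviation are available in Python's built-in `statistics` module, so the steps above can be checked with a short sketch like this one. The variable names are my own.

```python
from statistics import pstdev, stdev

# Daily visitors of the variant used in the calculation above.
visitors = [12000, 18000, 14000, 12000, 15000, 17000, 24000]

# Population formula: the squared deviations are divided by N.
population_sigma = pstdev(visitors)
# Sample formula: the squared deviations are divided by N - 1.
sample_sigma = stdev(visitors)

print(f"population: {population_sigma:.0f}, sample: {sample_sigma:.0f}")
```

The sample version always comes out a little higher than the population version, and the gap shrinks as you collect more days of data.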

You can skip this step entirely by going here and selecting sample or population and inputting your daily sessions separated by a comma:

https://www.calculator.net/standard-deviation-calculator.html

You can increase power here by improving your process: if your standard deviation is lower, your power will be higher.

One-sided or two-sided

Now you need to decide if you want a one-sided or two-sided test. Both tests have upsides and downsides, and you must understand what happens to your result when choosing one.

One-tailed test

You can increase power in statistics by choosing a one-tailed test. These tests have more statistical power because you are only testing in one direction. You should only choose this test if the effect you are testing can only exist, or only matters, in one direction. An example of a one-tailed hypothesis is testing whether something is above a certain percentage. When the result is underneath this percentage it doesn’t matter, so you are testing in one direction. Learn more about the differences between these tests here:

https://www.statisticshowto.com/probability-and-statistics/hypothesis-testing/one-tailed-test-or-two/

Two-tailed test

When you are testing whether the result might be above or underneath a certain percentage, both results matter. It’s important to know if the variant is worse so you can optimize it, and to implement the variant if the result is better. Usually, with conversion optimization, you want to choose a two-tailed test.
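
The difference in power between the two kinds of test can be sketched with a normal approximation. The function name and the inputs below (a true difference of 5, a standard deviation of 15 and 100 observations per group, tested at α = 0.05) are hypothetical and only there to illustrate the idea.

```python
from math import sqrt
from statistics import NormalDist


def z_test_power(difference, sigma, n, alpha=0.05, two_sided=True):
    """Approximate power of a z-test comparing two means, n per group."""
    z = NormalDist()
    # True difference expressed in standard errors.
    shift = abs(difference) / (sigma * sqrt(2 / n))
    if two_sided:
        # alpha is split over both tails, so the threshold is stricter.
        z_crit = z.inv_cdf(1 - alpha / 2)
        return z.cdf(shift - z_crit) + z.cdf(-shift - z_crit)
    # One-sided: all of alpha sits in the tail where the effect is expected.
    return z.cdf(shift - z.inv_cdf(1 - alpha))


two_tailed = z_test_power(5, 15, 100, two_sided=True)
one_tailed = z_test_power(5, 15, 100, two_sided=False)
print(f"two-tailed: {two_tailed:.0%}, one-tailed: {one_tailed:.0%}")
```

The one-tailed number is only valid when the true effect really lies in the direction you are testing. An effect in the other direction would go undetected.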

Type 1 error rate

Now you need to decide on the number for α, which is your type 1 error rate. As discussed before, your type 1 error rate is the chance of rejecting a true null hypothesis, which is a false positive. I recommend leaving this at 0.05, a 5% chance, which is a research standard because it’s a good balance between power and significance.

The sample size

Now you need to take the sample size of the variant for which you calculated sigma and input that into the calculator. This is simply the sessions of one variant added together.

12000 + 18000 + 14000 + 12000 + 15000 + 17000 + 24000 = 112000

Now hit calculate and bam, you have your statistical power and all the tools you need to increase it!

It turns out that in my example we have a power of 100%!
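
If you want to check a result like this without the website, the calculation can be approximated in a few lines of Python. This is a minimal sketch that uses a normal approximation for comparing two means. It is not necessarily the exact method the calculator uses, and the function name is my own. The inputs are the ones from this example: the two means, σ = 3891, α = 0.05, a two-sided test and a sample size of 112000.

```python
from math import sqrt
from statistics import NormalDist


def two_sample_power(mu1, mu2, sigma, n, alpha=0.05, two_sided=True):
    """Approximate power for comparing two means with n per group."""
    z = NormalDist()
    se = sigma * sqrt(2 / n)       # standard error of the difference
    shift = abs(mu1 - mu2) / se    # true difference in standard errors
    if two_sided:
        z_crit = z.inv_cdf(1 - alpha / 2)
        return z.cdf(shift - z_crit) + z.cdf(-shift - z_crit)
    return z.cdf(shift - z.inv_cdf(1 - alpha))


# Inputs from the worked example in this article.
power = two_sample_power(mu1=113000 / 7, mu2=16000, sigma=3891, n=112000)
print(f"power: {power:.4f}")
```

Changing `n`, `sigma`, `alpha` or `two_sided` in this sketch shows how each input of the calculator moves the power up or down.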

How to increase power? #3: Why is power important?

Because power in statistics is the chance that a test rejects a false null hypothesis, you need to make sure that it isn’t too low, as you would be missing out on great results for your study. The higher your power, the smaller the chance of missing a real effect.

How to increase power? #4: What power should you aim for?

Most researchers like to use 80% power combined with 95% significance because this offers a reasonable risk of both a type 1 and a type 2 error. But depending on the circumstances of your study you can choose to change these numbers. Do remember that you cannot have 100% power and 100% significance at the same time, because these two push and pull against each other. With 100% significance you would never reject the null hypothesis, and so never reach an alternative hypothesis.
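
The push and pull between significance and power can be sketched as follows. The function name and the default inputs (a true difference of 5, a standard deviation of 15 and 100 observations per group) are hypothetical. Only the significance level changes between the runs.

```python
from math import sqrt
from statistics import NormalDist


def power_at(alpha, difference=5.0, sigma=15.0, n=100):
    """Approximate power of a two-sided z-test at a given alpha."""
    z = NormalDist()
    shift = difference / (sigma * sqrt(2 / n))
    z_crit = z.inv_cdf(1 - alpha / 2)
    return z.cdf(shift - z_crit) + z.cdf(-shift - z_crit)


# Same data, three different significance levels (90%, 95%, 99%).
for alpha in (0.10, 0.05, 0.01):
    print(f"significance {1 - alpha:.0%}: power {power_at(alpha):.0%}")
```

With everything else held equal, a stricter significance level leaves you with less power.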

How to increase power in statistics? #5: Summary

Now you know everything that affects power and how to increase power in statistics for A/B testing purposes. Here is a brief summary:

  • Use one-tailed instead of two-tailed
    • If your type of hypothesis allows this kind of one-directional testing you can use this to increase your power!
  • Increasing traffic
    • Obviously increasing your traffic will increase your sample size and this is the best way of increasing your power.
  • Accepting a higher type 1 error rate (α)
    • The problem with this is that you are increasing the chance of rejecting a true null hypothesis, so I would not recommend messing with this.
  • Improve your sessions gathering systems
    • If the standard deviation is lower between your testing days you will have a higher statistical power.
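
Since more traffic is the most dependable lever in this list, it can help to turn the question around and ask how large a sample you need for a target power. Below is a rough sketch using the standard normal-approximation formula for comparing two means. The function name and the example inputs (a difference of 5 and a standard deviation of 15) are hypothetical.

```python
from math import ceil
from statistics import NormalDist


def required_sample_size(difference, sigma, power=0.80, alpha=0.05):
    """Approximate sample size per group for a two-sided test of two means."""
    z = NormalDist()
    z_alpha = z.inv_cdf(1 - alpha / 2)  # threshold for the significance level
    z_power = z.inv_cdf(power)          # extra margin for the target power
    # Round up because you cannot collect a fraction of an observation.
    return ceil(2 * ((z_alpha + z_power) * sigma / difference) ** 2)


n_80 = required_sample_size(difference=5, sigma=15, power=0.80)
n_90 = required_sample_size(difference=5, sigma=15, power=0.90)
print(f"80% power: {n_80} per group, 90% power: {n_90} per group")
```

The sketch also reflects the last point in the list: halving `sigma` cuts the required sample size to roughly a quarter.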

I hope this article has given you the insights you need to increase your power. Now that we have discussed what power is and how it is calculated, you should be able to run the numbers yourself and make your risk assessment accordingly.

If you have any questions or corrections about this article you can drop them in the comments below! If you found this article interesting, feel free to share it on social media to help us out!

Want to know more about conversion rate optimization every week? Subscribe to our newsletter and receive the latest blogs to your mail every week!

You are now ready to start testing with a statistical backbone. Set up your first test with Google Optimize today!
