plotly.py/doc/unconverted/python/anova.md at main · plotly/plotly.py

Learn how to perform one-way and two-way ANOVA tests using Python.

New to Plotly?

Plotly's Python library is free and open source! Get started by downloading the client and reading the primer.
You can set up Plotly to work in online or offline mode, or in Jupyter notebooks.
We also have a quick-reference cheatsheet (new!) to help you get started!

Imports

The tutorial below imports NumPy, Pandas, SciPy, and Statsmodels.

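# Legacy imports: in plotly 4+, plotly.plotly moved to the chart_studio package
# and FigureFactory was superseded by plotly.figure_factory.
import plotly.plotly as py
import plotly.graph_objs as go
from plotly.tools import FigureFactory as FF

import numpy as np
import pandas as pd
import scipy

import statsmodels
import statsmodels.api as sm
from statsmodels.formula.api import ols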

One-Way ANOVA

An Analysis of Variance (ANOVA) test is a generalization of the t-test to more than two groups. The null hypothesis states that the populations from which the groups of data were sampled all have equal means. More succinctly:

$$ \begin{align*} \mu_1 = \mu_2 = \dots = \mu_n \end{align*} $$

for $n$ groups of data. The alternative hypothesis is that at least one of the equalities in the above equation fails to hold, that is, at least one group mean differs from the others.

moore = sm.datasets.get_rdataset("Moore", "car", cache=True)
data = moore.data
data = data.rename(columns={"partner.status": "partner_status"})  # make name pythonic

moore_lm = ols('conformity ~ C(fcategory, Sum)*C(partner_status, Sum)', data=data).fit()
table = sm.stats.anova_lm(moore_lm, typ=2)  # Type 2 ANOVA DataFrame
print(table)
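To make the hypothesis above concrete, here is a minimal sketch of a one-way ANOVA on a single factor. The three groups are synthetic data invented for illustration (they are not the Moore dataset), and the F-statistic is computed by hand as the ratio of between-group to within-group mean squares, then compared against `scipy.stats.f_oneway`.

```python
import numpy as np
from scipy import stats

# Hypothetical data: three groups of 20 observations with different true means.
rng = np.random.default_rng(0)
groups = [rng.normal(loc=mu, scale=1.0, size=20) for mu in (5.0, 5.5, 7.0)]

all_values = np.concatenate(groups)
grand_mean = all_values.mean()
k = len(groups)             # number of groups
n_total = all_values.size   # total number of observations

# Between-group and within-group sums of squares.
ss_between = sum(g.size * (g.mean() - grand_mean) ** 2 for g in groups)
ss_within = sum(((g - g.mean()) ** 2).sum() for g in groups)

# F = (between-group mean square) / (within-group mean square)
f_manual = (ss_between / (k - 1)) / (ss_within / (n_total - k))

# scipy runs the same test and also returns the p-value.
f_scipy, p_value = stats.f_oneway(*groups)
print(f_manual, f_scipy, p_value)
```

A large F means the group means are spread out much more than the scatter within each group would explain, which is evidence against the null hypothesis of equal means.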

The ANOVA table reports an F-statistic for each term, along with the corresponding p-value in the PR(>F) column. The two are closely connected, as they are two ways of expressing the same evidence. When we set a significance level at the start of a statistical test (usually 0.05), we are saying that if the test statistic falls in the most extreme 5% of its distribution under the null hypothesis (for the F-test, the right tail), then we treat this as evidence against the null.

The p-value is the area under the F-distribution curve to the right of the observed F value. Therefore:

$$ \begin{align*} Pr(>F) = p \end{align*} $$
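The relationship between an F value and its p-value can be sketched with `scipy.stats.f`. The F value and degrees of freedom below are hypothetical numbers chosen for illustration; they are not taken from the tables in this tutorial.

```python
from scipy import stats

# Hypothetical values, for illustration only.
f_value = 4.0   # observed F statistic
df_num = 2      # numerator degrees of freedom (e.g. number of groups - 1)
df_den = 57     # denominator (residual) degrees of freedom

# Survival function = 1 - CDF = area of the right tail beyond f_value.
p_value = stats.f.sf(f_value, df_num, df_den)

# Going the other way: the critical F value that leaves 5% in the right tail.
f_critical = stats.f.isf(0.05, df_num, df_den)

print(p_value, f_critical)
```

Comparing the observed F against the critical value is equivalent to comparing the p-value against the significance level: the observed F exceeds the critical value exactly when the p-value is below 0.05.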

For more information on the choice of 0.05 for a significance level, check out this page.

Let us import some data for our next analysis. This time some data on tooth growth:

data = pd.read_csv('https://raw.githubusercontent.com/plotly/datasets/master/tooth_growth_csv')
df = data[0:10]

table = FF.create_table(df)
py.iplot(table, filename='tooth-data-sample')

Two-Way ANOVA

In a Two-Way ANOVA, there are two explanatory variables (factors) to consider. The question is whether the response variable (tooth length, len) is related to the two factors supp and dose, and to their interaction, through the model:

$$ \begin{align*} len = supp + dose + supp \times dose \end{align*} $$

formula = 'len ~ C(supp) + C(dose) + C(supp):C(dose)'
model = ols(formula, data).fit()
aov_table = statsmodels.stats.anova.anova_lm(model, typ=2)
print(aov_table)
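To show what the two-way table is decomposing, here is a minimal sketch that computes the sums of squares and F-statistics by hand for a balanced design. The data, the column names `a`, `b` and `y`, and the effect sizes are all hypothetical; this decomposition assumes every cell has the same number of observations, and for such balanced data it should agree with the sums of squares reported by `anova_lm`.

```python
import numpy as np
import pandas as pd

# Hypothetical balanced design: 2 levels of factor "a", 3 levels of factor "b",
# 10 observations per cell. Column names are made up for this sketch.
rng = np.random.default_rng(1)
rows = []
for a in ("x", "y"):
    for b in (1, 2, 3):
        effect = (2.0 if a == "y" else 0.0) + 1.5 * b
        for value in rng.normal(loc=effect, scale=1.0, size=10):
            rows.append({"a": a, "b": b, "y": value})
df = pd.DataFrame(rows)

grand = df["y"].mean()
n_a, n_b = df["a"].nunique(), df["b"].nunique()
n_cell = len(df) // (n_a * n_b)   # observations per cell (balanced)

mean_a = df.groupby("a")["y"].mean()
mean_b = df.groupby("b")["y"].mean()
mean_ab = df.groupby(["a", "b"])["y"].mean()

# Main effects, interaction, and residual sums of squares.
ss_a = n_b * n_cell * ((mean_a - grand) ** 2).sum()
ss_b = n_a * n_cell * ((mean_b - grand) ** 2).sum()
ss_cells = n_cell * ((mean_ab - grand) ** 2).sum()
ss_ab = ss_cells - ss_a - ss_b
cell_means = df.groupby(["a", "b"])["y"].transform("mean")
ss_error = ((df["y"] - cell_means) ** 2).sum()
ss_total = ((df["y"] - grand) ** 2).sum()

# Each F statistic is a term's mean square divided by the residual mean square.
ms_error = ss_error / (len(df) - n_a * n_b)
f_a = (ss_a / (n_a - 1)) / ms_error
f_b = (ss_b / (n_b - 1)) / ms_error
f_ab = (ss_ab / ((n_a - 1) * (n_b - 1))) / ms_error
print(f_a, f_b, f_ab)
```

The three F values correspond to the two main-effect rows and the interaction row of a two-way ANOVA table; the residual row supplies the denominator for all of them.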

