Plotly's Python library is free and open source! Get started by downloading the client and reading the primer.
You can set up Plotly to work in online or offline mode, or in Jupyter notebooks.
We also have a quick-reference cheatsheet (new!) to help you get started!
The tutorial below imports NumPy, pandas, and SciPy.
import plotly.plotly as py
import plotly.graph_objs as go
from plotly.tools import FigureFactory as FF

import numpy as np
import pandas as pd
import scipy
We will import a dataset to perform our discrete frequency analysis on. We will look at the consumption of alcohol by country in 2010.
data = pd.read_csv('https://raw.githubusercontent.com/plotly/datasets/master/2010_alcohol_consumption_by_country.csv')
df = data[0:10]  # first ten rows, for display only

table = FF.create_table(df)
py.iplot(table, filename='alcohol-data-sample')
We can produce a histogram plot of the data with the y-axis representing the probability distribution of the data.
x = data['alcohol'].values.tolist()

# histnorm='probability' scales each bar to the fraction of values in its bin
trace = go.Histogram(x=x, histnorm='probability',
                     xbins=dict(start=np.min(x), size=0.25, end=np.max(x)),
                     marker=dict(color='rgb(25, 25, 100)'))

layout = go.Layout(
    title="Histogram with Probability Distribution"
)

fig = go.Figure(data=go.Data([trace]), layout=layout)
py.iplot(fig, filename='histogram-prob-dist')
# With no histnorm, each bar shows the raw count of values in its bin
trace = go.Histogram(x=x,
                     xbins=dict(start=np.min(x), size=0.25, end=np.max(x)),
                     marker=dict(color='rgb(25, 25, 100)'))

layout = go.Layout(
    title="Histogram with Frequency Count"
)

fig = go.Figure(data=go.Data([trace]), layout=layout)
py.iplot(fig, filename='histogram-discrete-freq-count')
# histnorm='percent' scales each bar to the percentage of values in its bin
trace = go.Histogram(x=x, histnorm='percent',
                     xbins=dict(start=np.min(x), size=0.25, end=np.max(x)),
                     marker=dict(color='rgb(50, 50, 125)'))

layout = go.Layout(
    title="Histogram with Percentage"
)

fig = go.Figure(data=go.Data([trace]), layout=layout)
py.iplot(fig, filename='histogram-percentage')
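The three histograms above differ only in how bar heights are scaled: raw counts, counts divided by the total (`histnorm='probability'`), and that fraction times 100 (`histnorm='percent'`). The sketch below reproduces the arithmetic in plain Python. The helper `bin_counts` and the sample values are hypothetical, chosen for illustration, and are not taken from the alcohol dataset. Plotly's own handling of bin edges may differ in detail.

```python
from collections import Counter


def bin_counts(values, start, size):
    """Count values per fixed-width bin [start + k*size, start + (k+1)*size).

    Hypothetical helper for illustration; assumes every value is >= start.
    """
    counts = Counter(int((v - start) // size) for v in values)
    n_bins = max(counts) + 1
    return [counts.get(k, 0) for k in range(n_bins)]


# Made-up sample values (not from the dataset used in this tutorial).
sample = [0.0, 0.25, 0.25, 0.5, 0.75, 1.0, 1.25, 1.25, 1.25, 2.0]

counts = bin_counts(sample, start=0.0, size=0.5)   # frequency count
total = sum(counts)
probability = [c / total for c in counts]          # heights sum to 1
percent = [100 * p for p in probability]           # heights sum to 100

print(counts)
print(probability)
print(percent)
```

Because all three lists come from the same counts, the shape of the histogram is identical in each case; only the y-axis scale changes.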
We can also take the cumulative sum of our dataset and then plot the cumulative distribution function, or CDF, as a scatter plot.
cumsum = np.cumsum(x)

trace = go.Scatter(x=[i for i in range(len(cumsum))],
                   y=10 * cumsum / np.linalg.norm(cumsum),
                   marker=dict(color='rgb(150, 25, 120)'))

layout = go.Layout(
    title="Cumulative Distribution Function"
)

fig = go.Figure(data=go.Data([trace]), layout=layout)
py.iplot(fig, filename='cdf-dataset')
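The code above plots a scaled running total of the values in the order they appear in the file. If you want the empirical CDF in the usual statistical sense (the fraction of observations less than or equal to each value), one common approach is to sort the data first. The sketch below is a minimal plain-Python version. The function name `empirical_cdf` and the sample values are hypothetical and not part of the original tutorial.

```python
def empirical_cdf(values):
    """Return (sorted_values, cumulative_fractions) for the empirical CDF.

    Hypothetical helper for illustration only.
    """
    ordered = sorted(values)
    n = len(ordered)
    # At the k-th smallest value (1-based), a fraction k/n of the data is <= it.
    fractions = [(k + 1) / n for k in range(n)]
    return ordered, fractions


# Made-up sample values (not from the dataset used in this tutorial).
sample = [2.0, 0.5, 1.5, 0.5, 3.0]
xs, ys = empirical_cdf(sample)

for value, fraction in zip(xs, ys):
    print(value, fraction)
```

The resulting `xs` and `ys` could be passed as the `x` and `y` arguments of a `go.Scatter` trace. Unlike the running total, the curve is non-decreasing in `x` and ends at exactly 1.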