How to make interactive Distplots in Python with Plotly.

Combined statistical representations with px.histogram

Several representations of statistical distributions are available in plotly, such as histograms, violin plots, box plots (see the complete list here). It is also possible to combine several representations in the same plot.

For example, the plotly.express function px.histogram can add a subplot with a different statistical representation than the histogram, given by the parameter marginal. Plotly Express is the easy-to-use, high-level interface to Plotly, which operates on a variety of types of data and produces easy-to-style figures.

import plotly.express as px

df = px.data.tips()
fig = px.histogram(df, x="total_bill", y="tip", color="sex", marginal="rug",
                   hover_data=df.columns)
fig.show()

import plotly.express as px

df = px.data.tips()
fig = px.histogram(df, x="total_bill", y="tip", color="sex",
                   marginal="box",  # or violin, rug
                   hover_data=df.columns)
fig.show()

Combined statistical representations in Dash

Dash is a framework for building analytical apps in Python using Plotly figures. To run the app below, install Dash with pip install dash, download the code, and run python app.py.

Get started with the official Dash docs and learn how to effortlessly style & deploy apps like this with Dash Enterprise.

from IPython.display import IFrame

snippet_url = 'https://python-docs-dash-snippets.herokuapp.com/python-docs-dash-snippets/'
IFrame(snippet_url + 'distplot', width='100%', height=1200)


Combined statistical representations with distplot figure factory

The distplot figure factory displays a combination of statistical representations of numerical data, such as histogram, kernel density estimation or normal curve, and rug plot.
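To make "kernel density estimation" concrete, here is a minimal standard-library sketch of a one-dimensional Gaussian KDE: the estimated density at a point is the average of Gaussian bumps centred on each sample. The function name gaussian_kde and the bandwidth choice (Scott's rule of thumb) are assumptions made for this illustration. The curve drawn by create_distplot may use a different bandwidth or scaling.

```python
import math
import random
import statistics


def gaussian_kde(samples, bandwidth=None):
    """Return a function that evaluates a Gaussian KDE of `samples` at a point."""
    n = len(samples)
    if bandwidth is None:
        # Scott's rule of thumb for 1-D data (an assumption for this sketch).
        bandwidth = statistics.stdev(samples) * n ** (-1 / 5)
    scale = 1.0 / (n * bandwidth * math.sqrt(2 * math.pi))

    def density(x):
        # Sum one Gaussian kernel per sample, then normalise.
        return scale * sum(
            math.exp(-0.5 * ((x - s) / bandwidth) ** 2) for s in samples
        )

    return density


random.seed(1)
data = [random.gauss(0, 1) for _ in range(1000)]
density = gaussian_kde(data)

# The estimate is high near the centre of the data and low in the tails.
print(density(0.0), density(3.0))
```

A smaller bandwidth gives a more jagged curve that follows individual samples, while a larger one smooths the estimate.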

Basic Distplot

A histogram, a kde plot and a rug plot are displayed.

import plotly.figure_factory as ff
import numpy as np

np.random.seed(1)

x = np.random.randn(1000)
hist_data = [x]
group_labels = ['distplot']  # name of the dataset

fig = ff.create_distplot(hist_data, group_labels)
fig.show()

Plot Multiple Datasets

import plotly.figure_factory as ff
import numpy as np

# Add histogram data
x1 = np.random.randn(200) - 2
x2 = np.random.randn(200)
x3 = np.random.randn(200) + 2
x4 = np.random.randn(200) + 4

# Group data together
hist_data = [x1, x2, x3, x4]

group_labels = ['Group 1', 'Group 2', 'Group 3', 'Group 4']

# Create distplot with custom bin_size
fig = ff.create_distplot(hist_data, group_labels, bin_size=.2)
fig.show()

Use Multiple Bin Sizes

Different bin sizes are used for the different datasets with the bin_size argument.

import plotly.figure_factory as ff
import numpy as np

# Add histogram data
x1 = np.random.randn(200) - 2
x2 = np.random.randn(200)
x3 = np.random.randn(200) + 2
x4 = np.random.randn(200) + 4

# Group data together
hist_data = [x1, x2, x3, x4]

group_labels = ['Group 1', 'Group 2', 'Group 3', 'Group 4']

# Create distplot with one bin size per dataset
fig = ff.create_distplot(hist_data, group_labels, bin_size=[.1, .25, .5, 1])
fig.show()

Customize Rug Text, Colors & Title

import plotly.figure_factory as ff
import numpy as np

x1 = np.random.randn(26)
x2 = np.random.randn(26) + .5

group_labels = ['2014', '2015']

rug_text_one = ['a', 'b', 'c', 'd', 'e', 'f', 'g', 'h', 'i', 'j', 'k', 'l', 'm',
                'n', 'o', 'p', 'q', 'r', 's', 't', 'u', 'v', 'w', 'x', 'y', 'z']

rug_text_two = ['aa', 'bb', 'cc', 'dd', 'ee', 'ff', 'gg', 'hh', 'ii', 'jj', 'kk', 'll', 'mm',
                'nn', 'oo', 'pp', 'qq', 'rr', 'ss', 'tt', 'uu', 'vv', 'ww', 'xx', 'yy', 'zz']

rug_text = [rug_text_one, rug_text_two]  # for hover in rug plot
colors = ['rgb(0, 0, 100)', 'rgb(0, 200, 200)']

# Create distplot with custom bin_size
fig = ff.create_distplot(
    [x1, x2], group_labels, bin_size=.2,
    rug_text=rug_text, colors=colors)

fig.update_layout(title_text='Customized Distplot')
fig.show()

Plot Normal Curve

import plotly.figure_factory as ff
import numpy as np

x1 = np.random.randn(200)
x2 = np.random.randn(200) + 2

group_labels = ['Group 1', 'Group 2']

colors = ['slategray', 'magenta']

# Create distplot with curve_type set to 'normal'
fig = ff.create_distplot([x1, x2], group_labels, bin_size=.5,
                         curve_type='normal',  # override default 'kde'
                         colors=colors)

# Add title
fig.update_layout(title_text='Distplot with Normal Distribution')
fig.show()
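With curve_type='normal', the overlaid curve is a normal density fitted to each dataset rather than a KDE. As a rough, library-independent sketch of that idea, the standard library's NormalDist can estimate a mean and standard deviation from samples and evaluate the resulting density. This is only an illustration: create_distplot may estimate the parameters or scale the curve differently.

```python
import random
from statistics import NormalDist

random.seed(0)
data = [random.gauss(2.0, 1.0) for _ in range(200)]

# Estimate the mean and (sample) standard deviation from the data.
fitted = NormalDist.from_samples(data)
print(fitted.mean, fitted.stdev)

# Evaluate the fitted density on a few evenly spaced points across the data range.
lo, hi = min(data), max(data)
xs = [lo + i * (hi - lo) / 4 for i in range(5)]
for x in xs:
    print(round(x, 2), round(fitted.pdf(x), 4))
```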

Plot Only Curve and Rug

import plotly.figure_factory as ff
import numpy as np

x1 = np.random.randn(200) - 1
x2 = np.random.randn(200)
x3 = np.random.randn(200) + 1

hist_data = [x1, x2, x3]

group_labels = ['Group 1', 'Group 2', 'Group 3']
colors = ['#333F44', '#37AA9C', '#94F3E4']

# Create distplot without the histogram (show_hist=False)
fig = ff.create_distplot(hist_data, group_labels, show_hist=False, colors=colors)

# Add title
fig.update_layout(title_text='Curve and Rug Plot')
fig.show()

Plot Only Hist and Rug

import plotly.figure_factory as ff
import numpy as np

x1 = np.random.randn(200) - 1
x2 = np.random.randn(200)
x3 = np.random.randn(200) + 1

hist_data = [x1, x2, x3]

group_labels = ['Group 1', 'Group 2', 'Group 3']
colors = ['#835AF1', '#7FA6EE', '#B8F7D4']

# Create distplot without the curve (show_curve=False)
fig = ff.create_distplot(hist_data, group_labels, colors=colors, bin_size=.25,
                         show_curve=False)

# Add title
fig.update_layout(title_text='Hist and Rug Plot')
fig.show()

Plot Hist and Rug with Different Bin Sizes

import plotly.figure_factory as ff
import numpy as np

x1 = np.random.randn(200) - 2
x2 = np.random.randn(200)
x3 = np.random.randn(200) + 2

hist_data = [x1, x2, x3]

group_labels = ['Group 1', 'Group 2', 'Group 3']
colors = ['#393E46', '#2BCDC1', '#F66095']

fig = ff.create_distplot(hist_data, group_labels, colors=colors,
                         bin_size=[0.3, 0.2, 0.1], show_curve=False)

# Add title
fig.update(layout_title_text='Hist and Rug Plot')
fig.show()

Plot Only Hist and Curve

import plotly.figure_factory as ff
import numpy as np

x1 = np.random.randn(200) - 2
x2 = np.random.randn(200)
x3 = np.random.randn(200) + 2

hist_data = [x1, x2, x3]

group_labels = ['Group 1', 'Group 2', 'Group 3']
colors = ['#A56CC1', '#A6ACEC', '#63F5EF']

# Create distplot without the rug (show_rug=False)
fig = ff.create_distplot(hist_data, group_labels, colors=colors,
                         bin_size=.2, show_rug=False)

# Add title
fig.update_layout(title_text='Hist and Curve Plot')
fig.show()

Distplot with Pandas

import plotly.figure_factory as ff
import numpy as np
import pandas as pd

df = pd.DataFrame({'2012': np.random.randn(200),
                   '2013': np.random.randn(200) + 1})
fig = ff.create_distplot([df[c] for c in df.columns], df.columns, bin_size=.25)
fig.show()

Reference

For more information on ff.create_distplot(), see the full function reference.
