import streamlit as st
import pandas as pd
from palmerpenguins import load_penguins

penguins = load_penguins()

st.header("Penguin Dataset Explorer")

selected_rows = st.dataframe(penguins, on_select="rerun")

st.write(selected_rows)
16 Bidirectional Inputs - Dataframes, Charts and Maps
In more recent versions of Streamlit, it has become possible to feed certain interactions with graphs - like selecting a series of points on a scatterplot - or with maps - like zooming to focus in on a subset of markers - back into the app.
This can allow you to do things like provide a dataframe below a map that responsively filters to only show the data relating to the points currently visible on a map.
This is a more advanced topic. It is recommended that you have become comfortable with inputs and outputs in Streamlit before attempting to use the contents of this chapter.
16.1 Dataframes
In the case of dataframes, maybe we want to allow users to easily select a subset of rows to be plotted on a graph or map, or use this subset of rows to calculate some summary statistics.
Let’s load in the penguins dataset.
Notice that we have now saved the output of st.dataframe to a variable, and also added the parameter on_select="rerun".
Before we start filtering by what is returned, let’s first just see what actually is returned and explore how this updates.
Note that selecting a subset of cells within the table is not sufficient to register a selection.
You must select full rows using the selection column at the far left of the table, to the left of the index column if it is displayed.
We can then use the selected row indices to restrict the rows we use for subsequent calculations.
import streamlit as st
import pandas as pd
import numpy as np
import plotly.express as px
from palmerpenguins import load_penguins

penguins = load_penguins()

st.header("Penguin Dataset Explorer")

selected_rows = st.dataframe(penguins, on_select="rerun")

# Positional indices of the rows the user has selected
row_indices = selected_rows["selection"]["rows"]

# Note: print() writes to the terminal running the app, not to the app page
print(f"You've selected {len(row_indices)} penguins")

filtered_df = penguins.iloc[row_indices]

st.write(f"Mean Weight: {np.mean(filtered_df['body_mass_g'])}")

st.plotly_chart(
    px.pie(
        pd.DataFrame(filtered_df['sex']).value_counts(dropna=False).reset_index(),
        values='count', names='sex', title="Sex of Selected Penguins"
    )
)

st.subheader("Your Filtered Dataframe")

st.dataframe(filtered_df)
The selection_mode parameter can be passed to st.dataframe to allow selection of single or multiple rows, single or multiple columns, or some combination of the two.
Note that enabling column selection disables column sorting.
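When both row and column selection are enabled, the returned selection holds a list of row positions and a list of column names, and either list is empty until the user selects something. The sketch below is one possible way to handle that, falling back to the full table when nothing is selected. The helper name resolve_selection and the example events are hypothetical, and the plain dictionaries stand in for the dictionary-like object Streamlit returns.

```python
from typing import Any, Mapping, Sequence


def resolve_selection(
    event: Mapping[str, Any],
    n_rows: int,
    all_columns: Sequence[str],
) -> tuple[list[int], list[str]]:
    """Return (row positions, column names) to use, falling back to everything."""
    selection = event.get("selection", {})
    rows = list(selection.get("rows", []))
    columns = list(selection.get("columns", []))
    # An empty list means nothing is selected on that axis, so keep it all.
    if not rows:
        rows = list(range(n_rows))
    if not columns:
        columns = list(all_columns)
    return rows, columns


# Hypothetical events, shaped like the value shown by st.write(selected_rows).
example_event = {"selection": {"rows": [0, 2], "columns": ["species"]}}
empty_event = {"selection": {"rows": [], "columns": []}}

print(resolve_selection(example_event, 4, ["species", "island", "sex"]))
print(resolve_selection(empty_event, 2, ["species", "island"]))

# In an app this might be wired up roughly as follows (not run here):
# event = st.dataframe(
#     penguins, on_select="rerun",
#     selection_mode=["multi-row", "multi-column"],
# )
# rows, cols = resolve_selection(event, len(penguins), list(penguins.columns))
# st.dataframe(penguins.iloc[rows][cols])
```

Falling back to the whole table is only one design choice; showing a prompt asking the user to make a selection would work just as well.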
16.2 Graphs
Streamlit also supports monitoring st.plotly_chart, st.altair_chart, and st.vega_lite_chart for point selections and using this as an input for further actions.
In this book we focus on the use of plotly; take a look at the Streamlit documentation to see how this could work with the Altair and Vega Lite plotting libraries instead.
When hovering over the plot, users are given options such as ‘box select’ (to choose a box-shaped subset of points) or ‘lasso select’ (to select an irregular set of points).
Let’s start by creating a scatterplot of the penguins dataset.
Notice that we have now saved the output of st.plotly_chart to a variable, and also added the parameter on_select="rerun".
import streamlit as st
import pandas as pd
import plotly.express as px
from palmerpenguins import load_penguins

penguins = load_penguins()

fig = px.scatter(penguins, x="body_mass_g", y="bill_length_mm", color="species")

selected_data = st.plotly_chart(fig, on_select="rerun")

st.write(selected_data)
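To get a feel for working with the returned value, here is a minimal sketch that summarises a selection. It assumes the returned object is dictionary-like, with a list of points under selection, each carrying its x and y values. The helper name summarise_selection and the example event are hypothetical, so compare them with what st.write(selected_data) shows in your own app.

```python
from statistics import mean
from typing import Any, Mapping


def summarise_selection(event: Mapping[str, Any]) -> dict[str, Any]:
    """Count the selected points and average their x and y values."""
    points = event.get("selection", {}).get("points", [])
    if not points:
        # Nothing selected yet, so there is nothing to average.
        return {"n_points": 0, "mean_x": None, "mean_y": None}
    return {
        "n_points": len(points),
        "mean_x": mean(p["x"] for p in points),
        "mean_y": mean(p["y"] for p in points),
    }


# Hypothetical event: two selected points (x = body mass, y = bill length).
example_event = {
    "selection": {
        "points": [
            {"curve_number": 0, "point_index": 3, "x": 3450, "y": 36.7},
            {"curve_number": 0, "point_index": 7, "x": 4675, "y": 39.2},
        ],
        "point_indices": [3, 7],
    }
}

print(summarise_selection(example_event))
```

In an app, the result could be passed to st.write or st.metric after calling summarise_selection(selected_data).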
Now let’s see how we could use this to update some outputs.
import streamlit as st
import pandas as pd
import plotly.express as px
from palmerpenguins import load_penguins

penguins = load_penguins()

fig = px.scatter(penguins, x="body_mass_g", y="bill_length_mm")

selected_data = st.plotly_chart(fig, on_select="rerun")

selected_point_indices = [
    point
    for point in selected_data["selection"]["point_indices"]
]

st.dataframe(
    penguins.iloc[selected_point_indices, :]
)
Here, we’ve chosen a very simple example where no colour is applied to the points in the graph.
If the color parameter is passed to px.scatter, the resulting point indices relate to the rows for that colour only. For example, if we coloured by species, a point_index value of 139 wouldn’t refer to index 139 in the original dataset; it would be point 139 for that particular species.
Always explore and test the outputs of your filtering carefully to ensure it’s returning what you think it’s returning!
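One possible way to translate per-colour point indices back to rows of the original dataset is sketched below. It rests on two assumptions that you should check against your own chart: that plotly express draws one trace per distinct colour value in order of first appearance in the data, and that each selected point reports its trace as curve_number alongside point_index. The helper name original_row_positions and the example data are hypothetical.

```python
from typing import Any, Hashable, Mapping, Sequence


def original_row_positions(
    group_labels: Sequence[Hashable],
    points: Sequence[Mapping[str, Any]],
) -> list[int]:
    """Map (curve_number, point_index) pairs back to original row positions."""
    # Collect the row positions belonging to each label. Dicts preserve
    # insertion order, so traces are ordered by first appearance (assumption).
    trace_rows: dict[Hashable, list[int]] = {}
    for position, label in enumerate(group_labels):
        trace_rows.setdefault(label, []).append(position)
    traces = list(trace_rows.values())
    return sorted(
        traces[p["curve_number"]][p["point_index"]] for p in points
    )


# Hypothetical colour column and selection.
species = ["Adelie", "Adelie", "Gentoo", "Adelie", "Chinstrap", "Gentoo"]
points = [
    {"curve_number": 1, "point_index": 1},  # second Gentoo point
    {"curve_number": 0, "point_index": 2},  # third Adelie point
]

print(original_row_positions(species, points))
```

In an app, the labels might come from penguins["species"].tolist() and the points from selected_data["selection"]["points"], with the result passed to penguins.iloc.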
As of the time of writing (August 2024), this feature is quite new and there are not many examples of more advanced usage of it.
16.3 Maps
For maps, we need to use the external streamlit_folium library, which doesn’t come bundled with Streamlit itself. It must be installed via pip (pip install streamlit-folium) before use.
16.3.1 Filtering with the bidirectional Folium Component
When using this component, data is constantly being returned as the user interacts with the map.
Let’s take a look at what is returned each time the map is updated.
import geopandas
import pandas as pd
import matplotlib.pyplot as plt
import streamlit as st
import folium
from streamlit_folium import st_folium

gp_list_gdf_sw = geopandas.read_file("https://files.catbox.moe/atzk26.gpkg")

# Filter out instances with no geometry
gp_list_gdf_sw = gp_list_gdf_sw[~gp_list_gdf_sw['geometry'].is_empty]

# Create a geometry list from the GeoDataFrame
geo_df_list = [[point.xy[1][0], point.xy[0][0]] for point in gp_list_gdf_sw.geometry]

gp_map_tooltip = folium.Map(
    location=[50.7, -4.2],
    zoom_start=8,
    tiles='openstreetmap',
)

for i, coordinates in enumerate(geo_df_list):
    gp_map_tooltip = gp_map_tooltip.add_child(
        folium.Marker(
            location=coordinates,
            tooltip=gp_list_gdf_sw['name'].values[i],
            icon=folium.Icon(icon="user-md", prefix='fa', color="black")
        )
    )

returned_map_data = st_folium(gp_map_tooltip)

st.write(returned_map_data)
16.3.1.1 Using the returned data
Let’s use the bounds of the map to filter the dataframe so that it contains only the points within the area the user has zoomed to.
import geopandas
import pandas as pd
import matplotlib.pyplot as plt
import streamlit as st
import folium
from streamlit_folium import st_folium

gp_list_gdf_sw = geopandas.read_file(
    "https://files.catbox.moe/atzk26.gpkg"
)

# Filter out instances with no geometry
gp_list_gdf_sw = gp_list_gdf_sw[~gp_list_gdf_sw['geometry'].is_empty]

# Create a geometry list from the GeoDataFrame
geo_df_list = [[point.xy[1][0], point.xy[0][0]] for point in gp_list_gdf_sw.geometry]

gp_map_tooltip = folium.Map(
    location=[50.7, -4.2],
    zoom_start=8,
    tiles='openstreetmap',
)

for i, coordinates in enumerate(geo_df_list):
    gp_map_tooltip = gp_map_tooltip.add_child(
        folium.Marker(
            location=coordinates,
            tooltip=gp_list_gdf_sw['name'].values[i],
            icon=folium.Icon(icon="user-md", prefix='fa', color="black")
        )
    )

returned_map_data = st_folium(gp_map_tooltip)

xmin = returned_map_data['bounds']['_southWest']['lng']
xmax = returned_map_data['bounds']['_northEast']['lng']
ymin = returned_map_data['bounds']['_southWest']['lat']
ymax = returned_map_data['bounds']['_northEast']['lat']

gp_list_gdf_filtered = gp_list_gdf_sw.cx[xmin:xmax, ymin:ymax]

st.write(f"Returning data for {len(gp_list_gdf_filtered)} practices")

st.dataframe(
    gp_list_gdf_filtered[['name', 'address_1', 'postcode', 'Total List Size']]
)
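To make the bounding-box logic explicit, the sketch below performs the same kind of filter in plain Python on (name, latitude, longitude) tuples. It also guards against the bounds being missing or containing None, which is an assumption about what may be returned before the map has fully rendered. The helper name names_within_bounds and the example places are hypothetical.

```python
from typing import Any, Mapping, Optional, Sequence


def names_within_bounds(
    places: Sequence[tuple[str, float, float]],
    bounds: Optional[Mapping[str, Any]],
) -> list[str]:
    """Return names of (name, lat, lng) places inside the map bounds.

    If the bounds are missing or incomplete, every name is returned.
    """
    try:
        south = bounds["_southWest"]["lat"]
        west = bounds["_southWest"]["lng"]
        north = bounds["_northEast"]["lat"]
        east = bounds["_northEast"]["lng"]
    except (KeyError, TypeError):
        return [name for name, _, _ in places]
    if None in (south, west, north, east):
        return [name for name, _, _ in places]
    return [
        name
        for name, lat, lng in places
        if south <= lat <= north and west <= lng <= east
    ]


# Hypothetical places and bounds, shaped like returned_map_data['bounds'].
places = [
    ("Practice A", 50.72, -3.53),
    ("Practice B", 50.37, -4.14),
    ("Practice C", 51.45, -2.59),
]
bounds = {
    "_southWest": {"lat": 50.0, "lng": -5.0},
    "_northEast": {"lat": 51.0, "lng": -3.0},
}

print(names_within_bounds(places, bounds))
```

This simple comparison does not handle a map view that crosses the 180° meridian, which is unlikely to matter for a regional map like the one above.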