Shiv Jaipersaud

Streamlit: A Powerful Tool for Data Science


The Importance of Data Science

As data becomes more prevalent in every aspect of today’s world, it has never been more important to examine the trends and patterns that shift from day to day. With both competition and consumption steadily increasing, anticipating the demands of tomorrow can help you stay one step ahead.


What is Streamlit?

Streamlit is an open-source Python framework that allows businesses and individuals alike to easily turn their data into interactive visual displays. This means that users of Streamlit can visualize existing trends in their data without needing the services of specialized data scientists and programmers. Instead, with a bit of simple Python knowledge, Streamlit handles most of the work for you.


How to Set Up Streamlit

Streamlit is a Python library, so it can be used from any Python IDE (Integrated Development Environment). Our recommended environment is Microsoft’s Visual Studio Code. Once an IDE is set up, a Python environment needs to be created. This can be done using a graphical interface like Anaconda Distribution or from the command line using venv. Finally, to install Streamlit, simply run “pip install streamlit” in the terminal while the environment is active. And there you have it. Streamlit and Python are ready to go!


Utilizing Streamlit and Integration with Other Python Libraries

Now that Streamlit is set up, let’s talk commands. Once Streamlit is imported into a Python script (conventionally with “import streamlit as st”), a handful of main commands become available, the most important being “st.write”. To use st.write, first load data into Python. This can be done in a single line using the pandas function read_csv, which turns a CSV file into a DataFrame object. Then apply any formatting you want before calling “st.write(DataFrameVariable)” to instantly display that data as a visual table!

Now, CSV files and tables are fairly similar, so the real usefulness of Streamlit comes from its seamless integration with other Python libraries. For example, Plotly turns data into graphs and choropleth maps. Streamlit can render these maps and graphs directly onto webpages and dashboards using the same st.write function. In addition, Streamlit makes these graphs interactive through “selectboxes” (st.selectbox), drop-down menus that filter the graphs and apply changes in real time.


Snowflake Integration

But why should we use Streamlit over simply producing graphs with other visualization tools such as Microsoft Excel? One reason is Snowflake, which announced its acquisition of Streamlit on March 2, 2022. Snowflake is a cloud data platform that allows easy storage and manipulation of large databases in the cloud, thereby allowing multiple sources to push data updates in real time. Snowflake also provides a secure cloud space to deploy large applications.

With its Streamlit acquisition, Snowflake created a single integrated environment that allows both development of apps with Python scripts and easy deployment of those apps directly to the cloud. The best part is that setting up Snowflake is as easy as setting up Streamlit. After a pip install of the Snowflake Python connector and an import of it as a Python library, Snowflake is ready to be used alongside Streamlit in the same IDE! Now changes to any application can be sent to the cloud straight away, allowing for testing in real time.
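As a rough sketch of how the connector and Streamlit can work together, the example below assumes the connector was installed with “pip install snowflake-connector-python” and that credentials are stored in environment variables. The variable names, the function names, and the query are all assumptions for illustration, not a prescribed setup.

```python
import os

import pandas as pd


def open_connection():
    """Open a Snowflake connection; the environment variable names are hypothetical."""
    import snowflake.connector  # provided by the snowflake-connector-python package

    return snowflake.connector.connect(
        account=os.environ["SNOWFLAKE_ACCOUNT"],
        user=os.environ["SNOWFLAKE_USER"],
        password=os.environ["SNOWFLAKE_PASSWORD"],
    )


def query_to_dataframe(conn, sql: str) -> pd.DataFrame:
    """Run a query through a DB-API style connection and return a DataFrame."""
    cur = conn.cursor()
    try:
        cur.execute(sql)
        # The first field of each description entry is the column name.
        columns = [col[0] for col in cur.description]
        return pd.DataFrame(cur.fetchall(), columns=columns)
    finally:
        cur.close()


def show_snowflake_version():
    """Display the result of a simple query in a Streamlit app."""
    import streamlit as st

    conn = open_connection()
    try:
        st.write(query_to_dataframe(conn, "SELECT CURRENT_VERSION() AS VERSION"))
    finally:
        conn.close()
```

In an app script, calling show_snowflake_version() at the bottom of the file and starting it with “streamlit run” would display the query result as a table. Because query_to_dataframe relies only on the standard cursor interface, it can also be tried against a local database before pointing it at Snowflake.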


Limitations of Streamlit

While Streamlit is a powerful visualization and filtering tool, it comes with a set of limitations.

  1. Dataset Size: While Streamlit can quickly transform formatted data into a visual display, large datasets can slow processing and transformation. This is a challenge for users who need real-time transformations on a large dataset with multiple filters applied.

  2. Dependency on Data Processing Libraries: While Streamlit is excellent at turning formatted data into visual forms, preparing and processing the data itself is not part of Streamlit’s capabilities. This leaves it to the user to determine which Python libraries are compatible and fit their needs, and then to learn how to use those libraries properly.

  3. Availability of Learning Material: While Streamlit is easy to use and great for non-programmers, the learning material available is limited to the official documentation and guides in Streamlit Docs, plus community-created guides and discussions. This can be challenging for an inexperienced end user to navigate when looking for specific effects. As Streamlit becomes more popular, more resources should become available.


As shown above, the limitations of Streamlit can pose some challenges to end users. However, most of these challenges can be worked around with some extra resources. Extra memory and processing power can help with dataset processing times, while investing time in finding the right Python library or walkthrough can help produce the specific Streamlit output desired.


Overall

Despite its challenges, Streamlit remains a powerful tool for data visualization. Its quick transformations and ease of use provide users with the speed and accessibility needed to view and adapt to daily changes in data. As data becomes increasingly relevant in all aspects of life, Streamlit has already proven valuable in navigating that world.
