Python Modules & Packages
Feb 15, 2023 · 3 min read
Among the thousands of modules and packages available for Python, here are the ones you should learn first, along with some of my personal favorites.
I use Pluralsight, an online education platform, to help me learn skills like Python faster and to track my progress. I highly recommend signing up for free to use their Skill IQ tests to assess your proficiency in Python and receive helpful feedback on what you should focus your learning on!
NOTE: This article was inspired by my much larger article on Learn How to Program in Python.
Python is a toy that comes with batteries included. Right from the start, you have access to a set of helpful modules and packages from Python's standard library. To learn the language effectively, I recommend getting comfortable with the standard library first.
Python's Standard Library:
"Python’s standard library is very extensive, offering a wide range of facilities... The library contains built-in modules (written in C) that provide access to system functionality such as file I/O that would otherwise be inaccessible to Python programmers, as well as modules written in Python that provide standardized solutions for many problems that occur in everyday programming. Some of these modules are explicitly designed to encourage and enhance the portability of Python programs by abstracting away platform-specifics into platform-neutral APIs."
The standard library is quite large. Because of that, here are some module highlights that you can keep in mind:
sys - Provides access to some variables used or maintained by the interpreter and to functions that interact strongly with the interpreter. It is always available.
os - A portable way of using operating system dependent functionality.
subprocess - Allows you to spawn new processes, connect to their input/output/error pipes, and obtain their return codes.
traceback - A standard interface to extract, format and print stack traces of Python programs. It exactly mimics the behavior of the Python interpreter when it prints a stack trace.
datetime - Supplies classes for manipulating dates and times.
time - Provides various time-related functions.
zoneinfo - Provides IANA time zone support. It uses your system's time zone data if available and falls back to the first-party tzdata package otherwise.
json - Used to encode and decode JSON, a lightweight format for storing and transporting data.
urllib - A package that contains several modules for working with URLs.
random - Implements pseudo-random number generators for various distributions.
math - Provides access to the mathematical functions defined by the C standard. These functions cannot be used with complex numbers; use the cmath module for those.
statistics - Provides functions for calculating mathematical statistics of numeric data and is aimed at the level of graphing and scientific calculators.
itertools - Provides a standardized collection of tools for building custom iterations that are fast, memory efficient, and form an “iterator algebra” that makes it possible to construct specialized tools succinctly and efficiently in pure Python.
functools - For higher-order functions that act on or return other functions.
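To make a few of these highlights concrete, here is a minimal sketch that touches json, statistics, itertools, math, functools, and datetime in one short script. The sample data and the names RAW, summarize, fib, and one_week_after are invented for illustration and do not come from any library.

```python
import json
import math
import statistics
from datetime import datetime, timedelta, timezone
from functools import lru_cache
from itertools import accumulate

# Hypothetical sample data, invented for this sketch.
RAW = '{"sensor": "demo", "readings": [2.0, 4.0, 4.0, 6.0]}'


def summarize(raw: str) -> dict:
    """Parse a JSON string and compute a few summary numbers."""
    data = json.loads(raw)  # json: text -> Python objects
    readings = data["readings"]
    return {
        "sensor": data["sensor"],
        # statistics: basic descriptive statistics
        "mean": statistics.mean(readings),
        # itertools: running sums produced lazily, then collected into a list
        "running_total": list(accumulate(readings)),
        # math: real-valued functions only (no complex numbers)
        "root_of_total": math.sqrt(sum(readings)),
    }


@lru_cache(maxsize=None)  # functools: cache results of a pure function
def fib(n: int) -> int:
    """Return the n-th Fibonacci number (fib(0) == 0, fib(1) == 1)."""
    return n if n < 2 else fib(n - 1) + fib(n - 2)


def one_week_after(moment: datetime) -> datetime:
    """datetime: date arithmetic using timedelta."""
    return moment + timedelta(weeks=1)


if __name__ == "__main__":
    print(json.dumps(summarize(RAW), indent=2))
    print(fib(30))
    print(one_week_after(datetime(2023, 2, 15, tzinfo=timezone.utc)).isoformat())
```

None of this needs anything beyond a standard Python 3 install, which is the point of learning these modules first.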
Other Modules/Packages:
"The standard library isn't the only place you'll find excellent importable modules to use with your code. The Python community also supports a thriving collection of third-party modules... If you want a preview, check out the community-run repository: http://pypi.python.org" - Head First Python, 2nd Edition
Adding new modules and packages to your Python toolkit should be secondary to learning the standard library. However, you'll inevitably want to use other tools available from the Python community. I've found that it helps to first narrow the focus of your learning to a small, manageable set of functions, modules, and packages.
For example, here is a list of modules/packages that I like! Each of these can be installed from pypi.org (see the next section).
numpy - The fundamental package for scientific computing with Python.
pandas - A powerful Python data analysis toolkit.
sqlalchemy - A SQL toolkit and Object Relational Mapper.
requests - A simple, yet elegant, HTTP library.
pyautogui - A cross-platform GUI automation tool for human beings that is used to programmatically control the mouse and keyboard.
scipy - A useful package for mathematics, science, and engineering. The SciPy library depends on NumPy, which provides convenient and fast N-dimensional array manipulation.
statsmodels - Provides a complement to SciPy for statistical computations, including descriptive statistics and estimation/inference for statistical models.
jupyter - Installs the Jupyter system, including the notebook, qtconsole, and IPython kernel.
pystan - An interface to Stan, a package for Bayesian inference.
pymer4 - Makes it simple to perform multi-level modeling and fitting a variety of standard regression models with robust, bootstrapped, and permuted estimators. Inspired by R's lme4 package.
prophet - Used for forecasting time series data where non-linear trends can be fit with yearly, weekly, and daily seasonality, plus holiday effects.
Django - A high-level web framework that encourages rapid development and clean, pragmatic design.
dash - A framework for building reactive web-apps that ties modern UI elements like drop-downs, sliders, and graphs directly to your analytical Python code.
plotly - An interactive data visualization library.
matplotlib - A comprehensive library for creating static, animated, and interactive visualizations in Python.
seaborn - A data visualization library that provides a high-level interface for drawing attractive statistical graphics.
plotnine - An implementation of a grammar of graphics in Python, based on R's ggplot2 library, which allows users to compose plots by explicitly mapping data to the visual objects that make up the plot.
scikit-learn - The most popular package for machine learning and data science/mining.
streamlit - A framework to quickly create beautiful and performant data applications.
tensorflow - A library that allows easy deployment of high-performance, numerical computations across a variety of hardware (i.e., CPUs, GPUs, and TPUs) and platforms (i.e., desktops, server clusters, and edge devices).
apache-airflow - A platform that makes it easy to programmatically author, schedule, and monitor workflows.
boto3 - Used to create, configure, access, and manage AWS services.
psycopg2 - The most popular PostgreSQL database adapter for the Python programming language.
beautifulsoup4 - A library that makes it easy to scrape information from web pages.
python-dotenv - Reads key-value pairs from a .env file and can set them as environment variables. It is installed as python-dotenv and imported as dotenv.
pydantic - Data validation and settings management using type hints.
typing - Defines a standard notation for type annotations that can be used to document code concisely. On modern Python versions it ships with the standard library, so no installation is needed.
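Once you start installing third-party packages, it helps to be able to check what is already available in your environment. The sketch below is one possible way to do that using only the standard library (importlib.metadata requires Python 3.8 or newer). The helper names installed_version and is_importable are hypothetical, and the PACKAGES mapping covers only a few entries from the list above. Note that the name you install is not always the name you import: scikit-learn is imported as sklearn, and beautifulsoup4 as bs4.

```python
from importlib import metadata
from importlib.util import find_spec

# Install name (as published on PyPI) -> top-level import name.
# Only a handful of packages from the list are shown here.
PACKAGES = {
    "numpy": "numpy",
    "requests": "requests",
    "scikit-learn": "sklearn",
    "beautifulsoup4": "bs4",
}


def installed_version(dist_name: str):
    """Return the installed version string, or None if it is not installed."""
    try:
        return metadata.version(dist_name)
    except metadata.PackageNotFoundError:
        return None


def is_importable(module_name: str) -> bool:
    """Return True if a top-level module with this name can be located."""
    return find_spec(module_name) is not None


if __name__ == "__main__":
    for dist_name, module_name in PACKAGES.items():
        version = installed_version(dist_name)
        status = version if version is not None else "not installed"
        print(f"{dist_name} (import {module_name}): {status}, "
              f"importable={is_importable(module_name)}")
```

The output depends entirely on your own environment, so a package reported as "not installed" simply means it has not been added yet.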