If your work is related to SQL & databases, you may be frustrated with troubleshooting your SQL code or SQL practice since:
- Terminal does not have a beautiful UI & is hard to manipulate your code
- IDE such as MySQLWorkbench does not support multiple outputs
Let’s see a demo output of a tool, ipython-sql, I would like to introduce today.
You may recognize that demo output looks like a screenshot of Jupyter Notebook. You’re right! We all like Jupyter since the notebook supports multiple outputs, which makes your debugging work much easier!
BTW one of the amazing features of Databricks is similar to the above screenshot in that you can run SQL code directly in Databricks notebook to query databases and data warehouse.
Python users can also use SQLAlchemy. However, ipython-sql can save your time in inputting print command or other commands
This post assumes you have the following programs:
- Python 3.9 with
- Any SQL server (local or remote)
Go to your projects dir. Created a dir named
SQL, and enable
# Go to your dir projects
pipenv shell with Python3.9
pipenv shell - python=3.9
Install required packages:
(SQL)$ pipenv install jupyter jupyterlab ipython-sql mysqlclient pandas matplotlib
mysqlclient is the connection drive for
ipython-sql to connect MySQL databases. If you have different databases, you need to download the required drivers.
ipython-sql uses SQLAlchemy connection string.
Add the current
pipenv shell to
For more info, see Jupyter Notebook Kernels: How to Add, Change, Remove.
(SQL)$ ipython kernel install --name "SQL" --user
- - -
Run the following SQL code on your terminals to create a database
create database if not exists leetcode;
Run Jupyter lab or notebook:
Add a new
.ipynb, and you should be able to select kernel
Load sql extension and setup MySQL connection string.
For more info about SQL connection string, see SQLAlchemy - Database URLs.
# load extension sql
Run the following SQL code on your Jupyter Notebook to create a demo table
SELECT in two ways
# Opt 2
# Variable substitution
Assign output to a variable
# Assign output to a variable
Convert a variable from
result = result.DataFrame()
result = %sql select player_id, sum(games_played) as games_played from Activity group by player_id
For complicated visualizations, convert the output from
pandas.core.frame.DataFrame then render your charts by your favorite visualization tools such as plotly.