Ipython-Sql
TL;DR
If your work is related to SQL & databases, you may be frustrated with troubleshooting your SQL code or SQL practice since:
- Terminal does not have a beautiful UI & is hard to manipulate your code
- IDE such as MySQLWorkbench does not support multiple outputs
Let’s see a demo output of a tool, ipython-sql, I would like to introduce today.
You may recognize that demo output looks like a screenshot of Jupyter Notebook. You’re right! We all like Jupyter since the notebook supports multiple outputs, which makes your debugging work much easier!
BTW one of the amazing features of Databricks is similar to the above screenshot in that you can run SQL code directly in Databricks notebook to query databases and data warehouse.
Python users can also use SQLAlchemy. However, ipython-sql can save your time in inputting print command or other commands
Installation
This post assumes you have the following programs:
- Python 3.9 with
pip
- Any SQL server (local or remote)
- optional:
pipenv
installed bypip
Go to your projects dir. Created a dir namedSQL
, and enablepipenv
:
1 | # Go to your dir projects |
Spawns a pipenv
shell with Python3.9
1 | pipenv shell - python=3.9 |
Install required packages:
1 | (SQL)$ pipenv install jupyter jupyterlab ipython-sql mysqlclient pandas matplotlib |
mysqlclient
is the connection drive for ipython-sql
to connect MySQL databases. If you have different databases, you need to download the required drivers. ipython-sql
uses SQLAlchemy connection string.
Add the current pipenv
shell to Jupyter
kernel.
For more info, see Jupyter Notebook Kernels: How to Add, Change, Remove.
1 | (SQL)$ ipython kernel install --name "SQL" --user |
- - -
Demo
Demo Database
Run the following SQL code on your terminals to create a database leetcode
1 | create database if not exists leetcode; |
Run Jupyter
Run Jupyter lab or notebook:
1 | jupyter lab |
Add a new .ipynb
, and you should be able to select kernel SQL
:
Load packages
Load sql extension and setup MySQL connection string.
For more info about SQL connection string, see SQLAlchemy - Database URLs.
1 | # load extension sql |
Demo Table
Run the following SQL code on your Jupyter Notebook to create a demo table Activity
:
1 | %%sql |
Demo Outputs
SELECT
in two ways
1 | %%sql |
Or
1 | # Opt 2 |
Variable substitution
1 | # Variable substitution |
Assign output to a variable
1 | # Assign output to a variable |
Convert a variable from sql.run.ResultSet
to pandas.core.frame.DataFrame
1 | result = result.DataFrame() |
Simple visualizations
1 | result = %sql select player_id, sum(games_played) as games_played from Activity group by player_id |
For complicated visualizations, convert the output from sql.run.ResultSet
to pandas.core.frame.DataFrame
then render your charts by your favorite visualization tools such as plotly.