Python Pandas read_sql() Method



The read_sql() method in Python's Pandas library is a powerful tool for loading a database table into a Pandas DataFrame or executing SQL queries and retrieving the results directly into a DataFrame. This function is useful for integrating SQL data processing within your Pandas workflow. It supports various database backends, including SQLite, PostgreSQL, and MySQL through libraries such as sqlite3, psycopg2, or pymysql.

This method is a convenient wrapper that uses the functionalities of read_sql_table() and read_sql_query(). Depending on the input provided, it automatically assigns the task to the appropriate function, read_sql_query() for executing raw SQL queries and read_sql_table() for loading an entire table by name. This design ensures backward compatibility and simplifies working with databases.

Syntax

Following is the syntax of the Python Pandas read_sql() method −

 pandas.read_sql(sql, con, index_col=None, coerce_float=True, params=None, parse_dates=None, columns=None, chunksize=None, dtype_backend=<no_default>, dtype=None) 

Parameters

The Python Pandas read_sql() method accepts the below parameters −

  • sql: A string representing the SQL query or the name of a table.

  • con: Database connection object. You can choose ADBC Connection, SQLAlchemy connectable, str, or sqlite3 connection.

  • index_col: Specifies a column (or multiple columns for MultiIndex) to set as the DataFrame index.

  • coerce_float: It is an optional parameter to convert non-numeric data to floats if possible.

  • params: Parameters for the SQL query, useful for query parameterization.

  • parse_dates: Columns to parse as dates, with options for specifying date formats or detailed parsing arguments.

  • columns: Specifies column names to select from the SQL table (used only when reading a table).

  • chunksize: Read data in chunks of specified size.

  • dtype_backend: Defines the back-end data type for the DataFrame, either NumPy-backed nullable dtypes or PyArrow-backed nullable ArrowDtypes.

  • dtype: Data type for the DataFrame or its columns, ignored if a table is passed instead of a query.

Return Value

The Pandas read_sql() method returns a pandas DataFrame containing the query results.

Example: Reading SQL Data

Here is a basic example demonstrating reading a SQL tabular data using the Pandas read_sql() method. Here we will retrieve the data from a database table using a simple SELECT query.

 import pandas as pd import sqlite3 # Create a connection to the database conn = sqlite3.connect(":memory:") # Create a sample table conn.execute("CREATE TABLE employees (id INTEGER, name TEXT, salary REAL, age INTEGER)") conn.execute("INSERT INTO employees VALUES (1, 'Ravi', 50000, 28), (2, 'Priya', 60000, 30), (3, 'Kiran', 65000, 35)") # Query the table query = "SELECT * FROM employees" df = pd.read_sql(query, conn) # Display the Output print("DataFrame from SQL table:") print(df) 

When we run above program, it produces following result −

 DataFrame from SQL table: 
idnamesalaryage
01Ravi50000.028
12Priya60000.030
23Kiran65000.035

Example: Specifying Columns to Read from SQL Database

This example demonstrates reading specific columns from a SQL using the pandas read_sql() method with the SELECT query.

 import pandas as pd import sqlite3 # Create a connection to the database conn = sqlite3.connect(":memory:") # Create a sample table conn.execute("CREATE TABLE employees (id INTEGER, name TEXT, salary REAL, age INTEGER)") conn.execute("INSERT INTO employees VALUES (1, 'Ravi', 50000, 28), (2, 'Priya', 60000, 30), (3, 'Kiran', 65000, 35)") # Query the table to select few columns query = "SELECT name, salary FROM employees" df = pd.read_sql(query, conn) # Display the output print("DataFrame with Selected Columns from SQL:") print(df) 

While executing the above code we obtain the following output −

 DataFrame with Selected Columns from SQL: 
namesalary
0Ravi50000.0
1Priya60000.0
2Kiran65000.0

Example: Filtering rows while Reading SQL Using Pandas read_SQL()

The following example demonstrates using the read_sql() method with the WHERE clause for filtering the rows when loading SQL data into Pandas DataFrame.

 import pandas as pd import sqlite3 # Create a connection to the database conn = sqlite3.connect(":memory:") # Create a sample table conn.execute("CREATE TABLE employees (id INTEGER, name TEXT, salary REAL, age INTEGER)") conn.execute("INSERT INTO employees VALUES (1, 'Ravi', 50000, 28), (2, 'Priya', 60000, 30), (3, 'Kiran', 65000, 35)") # Query the table to filter rows query = "SELECT * FROM employees WHERE salary > 55000" df = pd.read_sql(query, conn) # Display the output print("DataFrame with Filtered Rows from SQL:") print(df) 

Following is an output of the above code −

 DataFrame with Filtered Rows from SQL: 
idnamesalaryage
12Priya60000.030
23Kiran65000.035

Example: Specifying Column to Set it as DataFrame Index

This example demonstrates how to read SQL data into a Pandas DataFrame using the read_sql() method. In this example, we explicitly specified a column name to set it as the DataFrame index using the index_col parameter.

 import pandas as pd import sqlite3 # Create a connection to the database conn = sqlite3.connect(":memory:") # Create a sample table conn.execute("CREATE TABLE employees (id INTEGER, name TEXT, salary REAL, age INTEGER)") conn.execute("INSERT INTO employees VALUES (1, 'Ravi', 50000, 28), (2, 'Priya', 60000, 30), (3, 'Kiran', 65000, 35)") # Query the table query = "SELECT * FROM employees" # Read the table df = pd.read_sql(query, conn, index_col="id") # Display the output print("DataFrame from SQL:") print(df) 

Following is an output of the above code −

 DataFrame from SQL: 
namesalaryage
id
1Ravi50000.028
2Priya60000.030
3Kiran65000.035
python_pandas_io_tool.htm
Advertisements
close