Python Pandas read_table() Method



The read_table() method in Python's Pandas library reads a general delimited text file (TSV, CSV, or any other delimited format) into a Pandas DataFrame. It accepts input from several sources, including local files, URLs, and cloud storage services, and it supports custom delimiters, which makes it well suited to loading structured data for analysis tasks.

The read_table() method works much like read_csv(); the main difference lies in their default delimiters and intended use. The read_csv() method is designed for comma-separated values (CSV) files and uses a comma (,) as its default delimiter. The read_table() method is intended for general delimited files and uses a tab (\t) as its default delimiter. Either method can read the other's format when the sep parameter is set explicitly.
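As a minimal sketch of this difference, the snippet below reads the same small table with both methods. The sample strings and variable names are hypothetical and exist only for illustration.

```python
from io import StringIO

import pandas as pd

# Hypothetical sample data: the same table in tab- and comma-separated form
tsv_text = "Car\tPrice\nBMW\t50000\nAudi\t45000\n"
csv_text = "Car,Price\nBMW,50000\nAudi,45000\n"

# read_table() splits on tabs by default
df_table = pd.read_table(StringIO(tsv_text))

# read_csv() gives the same result once sep is set to a tab
df_csv_tab = pd.read_csv(StringIO(tsv_text), sep="\t")

# read_table() can also read comma-separated text when sep is set to a comma
df_table_comma = pd.read_table(StringIO(csv_text), sep=",")

print(df_table.equals(df_csv_tab))      # True
print(df_table.equals(df_table_comma))  # True
```

In practice the choice between the two is mostly a matter of which default delimiter matches the file.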

Syntax

The syntax of the read_table() method is as follows −

 pandas.read_table(filepath_or_buffer, *, sep=<no_default>, delimiter=None, header='infer', names=<no_default>, index_col=None, usecols=None, dtype=None, engine=None, converters=None, true_values=None, false_values=None, skipinitialspace=False, skiprows=None, skipfooter=0, nrows=None, na_values=None, keep_default_na=True, na_filter=True, verbose=<no_default>, skip_blank_lines=True, parse_dates=False, infer_datetime_format=<no_default>, keep_date_col=<no_default>, date_parser=<no_default>, date_format=None, dayfirst=False, cache_dates=True, iterator=False, chunksize=None, compression='infer', thousands=None, decimal='.', lineterminator=None, quotechar='"', quoting=0, doublequote=True, escapechar=None, comment=None, encoding=None, encoding_errors='strict', dialect=None, on_bad_lines='error', delim_whitespace=<no_default>, low_memory=True, memory_map=False, float_precision=None, storage_options=None, dtype_backend=<no_default>) 

Parameters

The Python Pandas read_table() method accepts the below parameters −

  • filepath_or_buffer: The file path, URL, or file-like object to read data from. Supports various schemes like http, ftp, s3, etc.

  • sep: Specifies the delimiter (Character or regex pattern) to use. Defaults to \t (tab-delimited files).

  • delimiter: An alternative to sep parameter for specifying delimiters.

  • header: Specifies the row number to use as column names. Defaults to 'infer': the first row is used as the header (the same as header=0) unless names is passed, in which case no row is treated as a header (the same as header=None).

  • names: A list of column names to use. If the file has no header row, pass header=None along with names. If the file does contain a header row, pass header=0 explicitly so that the custom names replace it.

  • index_col: Specifies a column (or multiple columns) to set as the DataFrame index.

  • usecols: Specifies which columns to load.

  • dtype: Defines the data type of columns.

  • engine: Specifies the parser engine to use. Available options are "c", "python", and "pyarrow".

  • converters: This parameter takes a function or a dictionary of functions for converting values in specified columns.

  • true_values and false_values: Values to consider as True and False, in addition to case-insensitive True and False.

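  • skiprows: The number of lines to skip at the start of the file, a list of 0-indexed line numbers to skip, or a callable that returns True for each line number to skip.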

  • skipinitialspace: If True, skips spaces after delimiters.

  • skipfooter: Number of lines to skip at the bottom of the file. Not supported by the "c" engine.

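  • keep_default_na: Whether to treat the default missing-value markers (such as an empty string, NA, and N/A) as NaN when parsing. Set it to False to recognize only the markers passed in na_values.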

  • na_filter: Whether to detect missing value markers. Passing na_filter=False can improve performance when reading a large file that has no missing data.

  • skip_blank_lines: Skip over blank lines when reading.

  • parse_dates: Column/columns to parse as dates.

  • chunksize: Number of lines to read per chunk. When set, the file is read in pieces and a TextFileReader is returned for iteration.

  • Other: It takes many other parameters for fine tuning the parsing behavior.
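Several of these parameters are often combined in a single call. The following is a minimal sketch, assuming a hypothetical pipe-delimited input with a leading comment line and no header row; the data, column names, and "yes"/"no" markers are invented for illustration.

```python
from io import StringIO

import pandas as pd

# Hypothetical pipe-delimited data: one comment line on top, no header row
raw = (
    "# exported inventory\n"
    "101| BMW|50000|yes\n"
    "102| Audi|45000|no\n"
    "103| Lexus|52000|yes\n"
)

df = pd.read_table(
    StringIO(raw),
    sep="|",                                   # custom delimiter instead of a tab
    skiprows=1,                                # skip the comment line
    header=None,                               # the data has no header row
    names=["id", "car", "price", "in_stock"],  # so supply column names
    index_col="id",                            # use the id column as the index
    skipinitialspace=True,                     # drop the space after each delimiter
    dtype={"price": "float64"},                # force a column data type
    true_values=["yes"],                       # parse yes/no as booleans
    false_values=["no"],
)

print(df)
print(df.dtypes)
```

Here the in_stock column is parsed as booleans because every value in it matches one of the true_values or false_values entries.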

Return Value

The Pandas read_table() method returns a Pandas DataFrame containing the data from a general delimited text file. If the iterator or chunksize parameter is specified, it returns a TextFileReader instead, which yields the data in pieces.
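The sketch below illustrates the TextFileReader case using a small hypothetical in-memory table. It assumes pandas 1.2 or later, where the reader can be used as a context manager.

```python
from io import StringIO

import pandas as pd

# Hypothetical tab-delimited data with a header and five rows
raw = "id\tvalue\n" + "".join(f"{i}\t{i * 10}\n" for i in range(5))

# chunksize=2 makes read_table() return a TextFileReader, not a DataFrame
with pd.read_table(StringIO(raw), chunksize=2) as reader:
    reader_type = type(reader).__name__
    # Each item yielded by the reader is a DataFrame of at most 2 rows
    chunk_sizes = [len(chunk) for chunk in reader]

print(reader_type)   # TextFileReader
print(chunk_sizes)   # [2, 2, 1]
```

Reading in chunks keeps memory use bounded when a file is too large to load at once.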

Example: Reading a Tab-Delimited File

Before running the code below, save the following tab-delimited data in a text file named foo.txt. The columns are separated by a single tab character.

 Car	Date_of_purchase
 BMW	10-10-2024
 Lexus	12-10-2024
 Audi	17-10-2024
 Mercedes	16-10-2024
 Jaguar	19-10-2024
 Bentley	22-10-2024

Here is a basic example that reads this tab-delimited text file using the pandas read_table() method.

 import pandas as pd

 # Reading a tab-delimited file
 df = pd.read_table('foo.txt')
 print("DataFrame from Tab-Delimited File:")
 print(df)

When we run the above program, it produces the following result −

 DataFrame from Tab-Delimited File:
         Car Date_of_purchase
 0       BMW       10-10-2024
 1     Lexus       12-10-2024
 2      Audi       17-10-2024
 3  Mercedes       16-10-2024
 4    Jaguar       19-10-2024
 5   Bentley       22-10-2024
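By default the Date_of_purchase column is read as plain strings. As a hedged sketch, the parse_dates and dayfirst parameters can convert it to datetime values while reading. The first three rows of the sample data are inlined here through StringIO so that the snippet is self-contained.

```python
from io import StringIO

import pandas as pd

# First three rows of the sample data, inlined for a self-contained sketch
raw = (
    "Car\tDate_of_purchase\n"
    "BMW\t10-10-2024\n"
    "Lexus\t12-10-2024\n"
    "Audi\t17-10-2024\n"
)

# dayfirst=True reads 12-10-2024 as 12 October rather than 10 December
df = pd.read_table(StringIO(raw), parse_dates=["Date_of_purchase"], dayfirst=True)

print(df.dtypes)
print(df["Date_of_purchase"].dt.day.tolist())    # [10, 12, 17]
print(df["Date_of_purchase"].dt.month.tolist())  # [10, 10, 10]
```

Once the column holds datetime values, the .dt accessor can be used for date arithmetic and filtering.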

Example: Specifying Columns to Read from a Tab-Delimited File

This example demonstrates reading specific columns from a tab-delimited text file using the pandas read_table() method with the usecols parameter.

 import pandas as pd

 # Reading only the Car column from a tab-delimited file
 df = pd.read_table('foo.txt', usecols=["Car"])
 print("Selected Columns from File:")
 print(df)

When the above code is executed, it produces the following output −

 Selected Columns from File:
         Car
 0       BMW
 1     Lexus
 2      Audi
 3  Mercedes
 4    Jaguar
 5   Bentley

Example: Handling Missing Values

This example demonstrates how to treat specific values as missing (NaN) while reading the data, using the read_table() method with the na_values and keep_default_na parameters.

 import pandas as pd

 # Tab-delimited sample data
 data = (
     "Name\tAge\tSalary\n"
     "Ravi\t30\t50000\n"
     "Kiran\tNA\t60000\n"
     "Priya\t35\tN/A\n"
 )

 # Save data to a text file
 with open("foo.txt", "w") as file:
     file.write(data)

 # Reading file with custom NA values
 df = pd.read_table("foo.txt", na_values=["NA", "N/A"], keep_default_na=False)
 print("DataFrame with Custom Missing Values:")
 print(df)

Following is the output of the above code −

 DataFrame with Custom Missing Values:
     Name   Age   Salary
 0   Ravi  30.0  50000.0
 1  Kiran   NaN  60000.0
 2  Priya  35.0      NaN
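To see what keep_default_na=False changes, the following sketch reads one hypothetical table twice. The string "null" is one of the default missing-value markers, so it becomes NaN only when the defaults are kept.

```python
from io import StringIO

import pandas as pd

# Hypothetical data: "NA" and "null" both appear as placeholders
raw = "Name\tAge\tSalary\nRavi\t30\t50000\nKiran\tNA\tnull\n"

# Default behavior: both "NA" and "null" are treated as missing
df_default = pd.read_table(StringIO(raw))

# Only "NA" is treated as missing; "null" is kept as an ordinary string
df_custom = pd.read_table(StringIO(raw), na_values=["NA"], keep_default_na=False)

print(df_default)
print(df_custom)
```

With the defaults disabled, the Salary column stays a text column because "null" is no longer converted to NaN.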

Example: Skipping Rows and Columns While Reading Tab-Delimited Data

This example demonstrates how to skip specific rows and read only selected columns, using the read_table() method with the skiprows and usecols parameters respectively.

 import pandas as pd
 # Import StringIO to load a file-like object
 from io import StringIO

 # Tab-delimited sample data; line 0 is intentionally blank
 data = (
     "\n"
     "Name\tAge\tSalary\n"
     "Ravi\t30\t50000\n"
     "Kiran\tNA\t60000\n"
     "Priya\t35\tN/A\n"
 )

 # Use StringIO to convert the string data into a file-like object
 obj = StringIO(data)

 # Skip lines 0 (blank) and 2 (the Ravi row), and read only two columns
 df = pd.read_table(obj, skiprows=lambda x: x in [0, 2], usecols=["Name", "Salary"])
 print("DataFrame with Skipped Rows and Selected Columns:")
 print(df)

When we run the above program, it produces the following result −

 DataFrame with Skipped Rows and Selected Columns:
     Name   Salary
 0  Kiran  60000.0
 1  Priya      NaN