
SciPy - Working With Different File Formats
SciPy is versatile when it comes to working with different file formats. Beyond the standard .mat, .npy and .npz formats, the SciPy ecosystem can work with other file types such as text files, CSV files, images and sound files, which are commonly encountered in scientific computing, data analysis and machine learning. Some of these formats are handled by SciPy itself, while others rely on NumPy or companion libraries.
Let's have a look at how to use SciPy with these various file formats in detail −
Text and CSV Files
Text and CSV files are among the most common formats for storing and exchanging tabular data. NumPy, which SciPy builds on, provides efficient tools for reading from and writing to these formats, making it easy to handle datasets for scientific analysis and machine learning.
Reading Text and CSV Files
NumPy provides the numpy.loadtxt() and numpy.genfromtxt() functions for loading data from text and CSV files. The scipy.io module does not define functions with these names, so they are called through NumPy.
Using numpy.loadtxt()
loadtxt() is suitable for well-formatted numeric data with no missing values. It loads data directly into a NumPy array which can then be used for analysis. Here is an example of using the numpy.loadtxt() function −
import numpy as np

# Load data from a CSV file with a comma delimiter
data = np.loadtxt('data.csv', delimiter=',')
print(data)
Running this code prints the contents of data.csv as a NumPy array of floats, so the exact output depends on the file.
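As a small illustration of loadtxt() on well-formed numeric data, the sketch below reads from an in-memory string instead of a file. The CSV contents and column names are made up for the example; skiprows and usecols are standard loadtxt() parameters.

```python
import io
import numpy as np

# In-memory stand-in for a small CSV file (hypothetical contents)
csv_text = io.StringIO(
    "x,y,z\n"
    "1.0,2.0,3.0\n"
    "4.0,5.0,6.0\n"
)

# skiprows=1 skips the header line; usecols keeps the first and third columns
data = np.loadtxt(csv_text, delimiter=",", skiprows=1, usecols=(0, 2))

print(data.shape)
print(data)
```

Because loadtxt() raises an error on missing or non-numeric fields, it works best when the file layout is known in advance.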
Using numpy.genfromtxt()
genfromtxt() is more versatile: it handles missing values and various data types such as strings and floats, which makes it a good fit for text files with inconsistent data. Below is an example that handles missing values in a .csv file −
import numpy as np

# Load data, filling missing values with zero
data = np.genfromtxt('/files/data_with_missing.csv', delimiter=',', filling_values=0)
print(data)
Following is the output of handling the missing values using the numpy.genfromtxt() function −
[[ 0.  0.  0.  0.]
 [ 0.  0.  0. 88.]
 [ 0. 27.  0. 92.]
 [ 0. 22.  0. 95.]
 [ 0.  0.  0. 70.]]
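Filling gaps with zero can distort later statistics, so an alternative is to leave missing entries as NaN and use NaN-aware functions. The following is a minimal sketch of that approach, assuming a made-up two-column file held in memory; the names and scores are illustrative only.

```python
import io
import numpy as np

# In-memory stand-in for a CSV file with one missing score (hypothetical contents)
csv_text = io.StringIO(
    "name,score\n"
    "alice,88\n"
    "bob,\n"
    "carol,95\n"
)

# Read only the numeric column; the empty field becomes NaN by default
scores = np.genfromtxt(csv_text, delimiter=",", skip_header=1, usecols=(1,))

# Boolean mask marking which rows had no value
missing = np.isnan(scores)

# nanmean ignores the missing entry instead of treating it as zero
mean_score = np.nanmean(scores)

print(missing)
print(mean_score)
```

Whether to fill with a constant or keep NaN depends on the analysis; the mask makes it possible to decide later.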
Writing Text and CSV Files
SciPy itself doesn't provide direct functions for writing to text and CSV files, so we typically rely on NumPy's savetxt() function or Python's built-in csv module.
Using NumPy's savetxt with SciPy Data
np.savetxt() is versatile and supports writing arrays to text and CSV files, allowing control over delimiters, formatting and headers. Following is an example of using NumPy's savetxt() function with SciPy data −
import numpy as np

# Sample 2D array
data = np.array([[1, 2, 3], [4, 5, 6], [7, 8, 9]])

# Save as CSV with comma delimiter
np.savetxt('/files/output.csv', data, delimiter=',', fmt='%d')
print("The file has been updated")
Following is the output of writing the data to the CSV file −
The file has been updated
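The header and formatting controls of savetxt() can be sketched as follows. This example writes to a temporary directory so it does not touch real data; the column names a and b are placeholders chosen for illustration.

```python
import os
import tempfile
import numpy as np

data = np.array([[1.5, 2.25], [3.0, 4.125]])

# Write into a temporary directory (the file name is arbitrary)
path = os.path.join(tempfile.mkdtemp(), "output.csv")

# header adds a column-name line; comments="" stops NumPy prefixing it with "# "
np.savetxt(path, data, delimiter=",", fmt="%.3f", header="a,b", comments="")

# Inspect the header line that was written
with open(path) as f:
    first_line = f.readline().strip()

# Read the numbers back, skipping the header
restored = np.loadtxt(path, delimiter=",", skiprows=1)

print(first_line)
print(restored)
```

Note that fmt controls how many digits are stored, so values with more precision than the format allows are rounded on disk.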
Working with Pandas for Advanced CSV Operations
For complex CSV files, or files with headers and mixed data types, pandas is a useful library that offers additional features such as parsing dates and filtering columns by name. Here is an example of reading and writing a CSV file with pandas −
import pandas as pd

# Reading a CSV file with headers
df = pd.read_csv('/files/data.csv')

# Writing a DataFrame to CSV
df.to_csv('/files/pandas_output.csv', index=False)
print("The file has been updated")
Following is the output of working with pandas for advanced CSV operations −
The file has been updated
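Date parsing and column selection by name might look like the sketch below. The column names and values are invented for the example and the data is read from an in-memory string; usecols and parse_dates are standard read_csv() parameters.

```python
import io
import pandas as pd

# In-memory stand-in for a CSV file with headers and mixed types (hypothetical contents)
csv_text = io.StringIO(
    "date,sensor,value,note\n"
    "2024-01-01,a,1.5,ok\n"
    "2024-01-02,b,2.5,ok\n"
    "2024-01-03,a,3.5,check\n"
)

# Keep three columns by name and parse the date column into datetimes
df = pd.read_csv(csv_text, usecols=["date", "sensor", "value"], parse_dates=["date"])

# Filter rows, then hand the numeric column to NumPy/SciPy as an array
sensor_a = df[df["sensor"] == "a"]
values = sensor_a["value"].to_numpy()

print(df.dtypes)
print(values)
```

Converting a column with to_numpy() gives a plain array that SciPy functions accept directly.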
Image Files
SciPy's scipy.ndimage module processes images once they are loaded as NumPy arrays, but it does not read or write image files. Older SciPy releases offered limited image I/O helpers in scipy.misc, and these have since been removed. External libraries such as Pillow or imageio offer better support for reading and writing images, which can be converted to NumPy arrays for use with SciPy.
Reading Images
We can use the imageio or Pillow libraries to load images and convert them to NumPy arrays. Following is an example of loading an image using imageio −
import imageio
import numpy as np

# Read an image
image = imageio.v2.imread('/Images/2d_fft.jpeg')
print(image.shape)  # Check dimensions of the image array
Here is the output of reading the image using the imageio library −
(400, 1200, 3)
Writing Images
To save an array as an image, we can use the imageio.imwrite() function. Here is an example that saves an array as an image with the help of the imwrite() function −
from imageio import imwrite
import numpy as np

image = np.array([
    [[255, 0, 0], [255, 128, 0], [255, 255, 0], [128, 255, 0], [0, 255, 0]],
    [[0, 255, 128], [0, 255, 255], [0, 128, 255], [0, 0, 255], [128, 0, 255]],
    [[255, 0, 255], [128, 128, 128], [0, 0, 0], [128, 128, 128], [255, 0, 255]],
    [[128, 0, 255], [0, 0, 255], [0, 128, 255], [0, 255, 255], [0, 255, 128]],
    [[0, 255, 0], [128, 255, 0], [255, 255, 0], [255, 128, 0], [255, 0, 0]]
], dtype=np.uint8)

# Save a NumPy array as an image
imwrite('/Images/output.png', image)
When we execute the above code, the array is saved as an image that we can check in the specified location.
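Once an image is available as a NumPy array, scipy.ndimage can operate on it directly. The sketch below uses a randomly generated RGB array as a stand-in for a loaded image, so it runs without any image file; the array size and sigma values are arbitrary choices for illustration.

```python
import numpy as np
from scipy import ndimage

# Synthetic 64x64 RGB array standing in for an image loaded from disk
rng = np.random.default_rng(0)
image = rng.integers(0, 256, size=(64, 64, 3), dtype=np.uint8)

# Blur along the two spatial axes only; sigma 0 leaves the channel axis untouched
blurred = ndimage.gaussian_filter(image, sigma=(2, 2, 0))

print(blurred.shape)
print(blurred.dtype)
```

The result keeps the shape and dtype of the input, so it can be passed back to an image writer without further conversion.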
Sound Files
SciPy provides scipy.io.wavfile to read and write .wav audio files, a popular format for storing uncompressed sound data. For more complex audio formats we can consider using the soundfile or librosa libraries.
Reading WAV files
To read an audio file, the scipy.io.wavfile.read() function can be used. This function returns the sample rate and the audio data in the form of a NumPy array where each element represents a sample in the audio waveform. Here is an example which reads the given input audio −
from scipy.io import wavfile

# Read the WAV file
sampling_rate, audio_data = wavfile.read('/files/sample-3s.wav')

# Display sampling rate and audio data details
print(f"Sampling Rate: {sampling_rate} Hz")  # Frequency of audio samples
print(f"Audio Data Shape: {audio_data.shape}")  # Shape of the array, e.g., (n_samples,) or (n_samples, n_channels)
print(f"Data Type: {audio_data.dtype}")  # Type of the data, often int16 or float32
Here is the output of reading the .wav audio file −
Sampling Rate: 44100 Hz
Audio Data Shape: (140928, 2)
Data Type: int16
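Stereo int16 data like this is often converted to floating point and mixed down to one channel before further processing. The following sketch shows one common way to do that; it uses a synthetic array in place of the audio_data returned by wavfile.read(), so the sample values here are random rather than real audio.

```python
import numpy as np

# Synthetic stand-in for stereo int16 data returned by wavfile.read()
sampling_rate = 44100
rng = np.random.default_rng(0)
audio_data = rng.integers(-32768, 32768, size=(sampling_rate, 2), dtype=np.int16)

# Scale int16 samples to floats in the range [-1.0, 1.0)
audio_float = audio_data.astype(np.float32) / 32768.0

# Average the two channels to get a mono signal
mono = audio_float.mean(axis=1)

# Duration follows from the number of samples and the sampling rate
duration_seconds = audio_data.shape[0] / sampling_rate

print(mono.shape)
print(duration_seconds)
```

Dividing by 32768 assumes 16-bit input; other sample widths need a different scale factor.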
Writing WAV files
We can create or modify audio data and save it back to a .wav file using scipy.io.wavfile.write() −
import numpy as np
from scipy.io.wavfile import write

# Set parameters
sampling_rate = 44100  # 44.1 kHz standard sampling rate
duration = 2  # 2 seconds
frequency = 440  # 440 Hz tone (A4 note)

# Generate a sine wave
t = np.linspace(0, duration, int(sampling_rate * duration), endpoint=False)
audio_data = 0.5 * np.sin(2 * np.pi * frequency * t)

# Save the sine wave as a .wav file
write('/files/generated_audio.wav', sampling_rate, audio_data.astype(np.float32))
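The array's dtype determines the sample format of the file that wavfile.write() produces, so a float32 array yields a 32-bit float WAV. Where 16-bit PCM is preferred, the signal can be scaled to int16 first. The sketch below is one way to do this and to check the result by reading the file back; the temporary path and the one-second duration are choices made for the example.

```python
import os
import tempfile
import numpy as np
from scipy.io import wavfile

sampling_rate = 44100
t = np.linspace(0, 1.0, sampling_rate, endpoint=False)
tone = 0.5 * np.sin(2 * np.pi * 440 * t)

# Scale the [-1, 1] float signal to the int16 range for 16-bit PCM
pcm = (tone * 32767).astype(np.int16)

# Write into a temporary directory (the file name is arbitrary)
path = os.path.join(tempfile.mkdtemp(), "tone.wav")
wavfile.write(path, sampling_rate, pcm)

# Read the file back to confirm the rate and samples survived the round trip
rate_back, data_back = wavfile.read(path)

print(rate_back)
print(data_back.dtype)
```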
HDF5 Files
HDF5 is a popular format for large datasets, and while SciPy itself doesn't directly support HDF5, we can use the h5py library to work with these files. HDF5 allows hierarchical organization of data and is particularly useful for machine learning and high-performance computing.
Here is an example of writing data to an HDF5 file and reading it back −
import h5py
import numpy as np

# Write data to an HDF5 file
with h5py.File('data.h5', 'w') as file:
    file.create_dataset('array', data=np.array([1, 2, 3]))

# Read data from an HDF5 file
with h5py.File('data.h5', 'r') as file:
    array = file['array'][:]
    print(array)
Below is the output of reading the data back from the .h5 file −
[1 2 3]
JSON and XML Files
For hierarchical or structured data, JSON and XML formats are often used. SciPy doesn't have native support for these, but Python's built-in json and xml libraries can parse them and we can convert the parsed data into NumPy arrays if needed.
Following is an example of writing and reading JSON data −
import json
import numpy as np

# Write JSON data
data = {'array': [1, 2, 3]}
with open('data.json', 'w') as f:
    json.dump(data, f)

# Read JSON data
with open('data.json', 'r') as f:
    data = json.load(f)
    array = np.array(data['array'])
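XML can be handled in a similar way with the standard library's xml.etree.ElementTree module. The sketch below parses a small XML string and collects its numeric values into a NumPy array; the element names data and value are made up for this example.

```python
import xml.etree.ElementTree as ET
import numpy as np

# Small XML document held in a string (hypothetical structure)
xml_text = """
<data>
  <value>1.0</value>
  <value>2.5</value>
  <value>4.0</value>
</data>
""".strip()

# Parse the document and get the root <data> element
root = ET.fromstring(xml_text)

# Convert the text of each <value> element to float and build an array
array = np.array([float(el.text) for el in root.findall("value")])

print(array)
```

For a file on disk, ET.parse() followed by getroot() gives the same root element.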