
Machine Learning - Types of Data
Data in machine learning are broadly categorized into two types − numerical (quantitative) and categorical (qualitative) data. The numerical data can be measured, counted or given a numerical value, for example, age, height, income, etc. The categorical data is non-numeric data that can be arranged in categories with or without meaningful order, for example, gender, blood group, etc.
Further, the numerical data can be categorized into discrete and continuous data. The categorical data can also be categorized into two types − nominal and ordinal. Let's understand these types of data in machine learning in detail.

What is Data in Machine Learning?
Data in machine learning is a set of observations or measurements used to train, validate, and test a machine learning model. Data is crucial in machine learning because it is the foundation for building accurate models.
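As a rough illustration of how one set of observations can serve all three purposes, the sketch below shuffles a dataset and cuts it into training, validation, and test portions. The function name `split_dataset` and the 70/15/15 proportions are assumptions chosen for the example, not something this tutorial prescribes.

```python
import random

def split_dataset(rows, train=0.7, validation=0.15, seed=0):
    """Shuffle rows and split them into train, validation, and test lists.

    The proportions are illustrative defaults; whatever is left after the
    train and validation portions becomes the test set.
    """
    rows = list(rows)
    random.Random(seed).shuffle(rows)  # seeded so the split is repeatable
    n_train = round(len(rows) * train)
    n_val = round(len(rows) * validation)
    return (
        rows[:n_train],
        rows[n_train:n_train + n_val],
        rows[n_train + n_val:],
    )

# 20 hypothetical observations, identified here by their index
train_rows, val_rows, test_rows = split_dataset(range(20))
print(len(train_rows), len(val_rows), len(test_rows))  # 14 3 3
```

Every observation ends up in exactly one of the three lists, so the model is never evaluated on data it was trained on.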
What are Types of Data?
The data used in machine learning can be broadly categorized into two types −
Numerical (Quantitative) Data
Numerical (quantitative) data is data that can be measured, counted, or given a numerical value. Examples of numerical data are age, height, income, number of students in a class, number of books on a shelf, shoe size, etc.
Numerical data can be categorized into the following two types −
- Discrete Data
- Continuous Data
1. Discrete Data
Discrete data is numerical data that is countable and can only take certain separate values, usually whole numbers. Examples of discrete data are number of students in a class, number of books on a shelf, shoe size, number of ducks in a pond, etc.
2. Continuous Data
Continuous data is numerical data that can take any value within a specified range, including fractions and decimals. Examples of continuous data are age, height, weight, income, time, temperature, etc.
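The distinction can be sketched in code. The helper below, `guess_numeric_kind`, is a hypothetical name, and its rule (whole numbers only means discrete) is a simple heuristic rather than a definitive test. For instance, shoe sizes with half steps are discrete but would be reported as continuous.

```python
from numbers import Real

def guess_numeric_kind(values):
    """Heuristic guess: 'discrete' if every value is a whole number,
    otherwise 'continuous'. Domain knowledge should have the final say."""
    if not all(isinstance(v, Real) and not isinstance(v, bool) for v in values):
        raise ValueError("expected numeric values only")
    if all(float(v).is_integer() for v in values):
        return "discrete"
    return "continuous"

students_per_class = [28, 31, 25, 30]         # counts: whole numbers only
heights_cm = [162.5, 170.2, 181.0, 175.4]     # measurements: any value in a range

print(guess_numeric_kind(students_per_class))  # discrete
print(guess_numeric_kind(heights_cm))          # continuous
```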
What is true zero?
True zero represents the absence of the quantity being measured. For example, height, weight, age, and temperature in Kelvin have a true zero: a height of 0 cm means no height at all, and 0 K (absolute zero) is the lowest possible temperature rather than an arbitrary reference point. Temperature in Celsius or Fahrenheit, by contrast, has a false zero: 0 °C marks the freezing point of water, not the absence of temperature.
We can categorize numerical data into the following two types on the basis of true zero −
- Interval data − quantitative data with equal intervals between data points but no true zero. Examples are temperature (Fahrenheit), temperature (Celsius), pH, SAT score (200-800), credit score (300-850), etc.
- Ratio data − same as interval data but with a true zero. Examples are weight in kg, number of students, income, speed, etc.
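A short worked example shows why the true zero matters. Differences are meaningful on both interval and ratio scales, but ratios are meaningful only when zero means "none". The temperatures used here (10 °C and 20 °C) are arbitrary illustrative values.

```python
def celsius_to_kelvin(celsius):
    """Convert a Celsius reading to Kelvin (a scale with a true zero)."""
    return celsius + 273.15

cold_c, warm_c = 10.0, 20.0
cold_k, warm_k = celsius_to_kelvin(cold_c), celsius_to_kelvin(warm_c)

# Differences agree on both scales: a 10-degree gap either way.
diff_c = warm_c - cold_c
diff_k = warm_k - cold_k

# Ratios do not agree: the Celsius ratio depends on where zero was placed.
naive_ratio = warm_c / cold_c   # 2.0, but 20 °C is not "twice as hot" as 10 °C
true_ratio = warm_k / cold_k    # about 1.035 on the Kelvin scale

print(diff_c, round(diff_k, 2))
print(naive_ratio, round(true_ratio, 3))
```

The Celsius ratio of 2.0 is an artifact of the arbitrary zero point, while the Kelvin ratio describes an actual physical relationship.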
Categorical (Qualitative) Data
Categorical (qualitative) data is non-numeric data that can be grouped into categories, with or without a meaningful order. Examples are gender, blood group, hair color, nationality, school grades, level of education, range of income, ratings, etc.
Categorical data can be divided into the following two types −
- Nominal Data
- Ordinal Data
1. Nominal Data
Nominal data is categorical data that cannot be arranged in an order or rank. Examples of nominal data are gender, blood group, hair color, nationality, etc.
2. Ordinal Data
Ordinal data is categorical data that can be ordered or ranked by a specific attribute. Examples of ordinal data are school grades, level of education, range of income, ratings, etc.
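The difference between nominal and ordinal data often shows up when categories are turned into numbers for a model. The sketch below uses two common approaches, one-hot encoding for nominal values and rank encoding for ordinal values. The function names and sample values are illustrative assumptions, and real projects typically rely on a library for this step.

```python
def one_hot_encode(values):
    """Encode nominal values as 0/1 vectors so no order is implied."""
    categories = sorted(set(values))  # sorted only to make columns repeatable
    encoded = [[1 if value == c else 0 for c in categories] for value in values]
    return encoded, categories

def ordinal_encode(values, order):
    """Encode ordinal values as integer ranks, using an explicit order."""
    rank = {label: position for position, label in enumerate(order)}
    return [rank[value] for value in values]

# Nominal: blood groups have no ranking
blood_groups = ["A", "O", "B", "O"]
encoded, categories = one_hot_encode(blood_groups)
print(categories)  # ['A', 'B', 'O']
print(encoded)     # [[1, 0, 0], [0, 0, 1], [0, 1, 0], [0, 0, 1]]

# Ordinal: education levels have a meaningful order, supplied by us
education = ["bachelor", "high school", "master"]
education_order = ["high school", "bachelor", "master"]
print(ordinal_encode(education, education_order))  # [1, 0, 2]
```

Assigning plain integers to nominal values would suggest an order that does not exist, which is why the two kinds of data are encoded differently here.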
The Four Levels of Data Measurement
We can categorize data into four levels − nominal, ordinal, interval, and ratio. These levels of measurement are distinguished on the basis of the following four features −
- Categories − Data can be grouped into distinct categories.
- Rank Order − The categories or values can be arranged in a meaningful order.
- Equal Difference − The difference between consecutive values is the same across the scale.
- True Zero − Zero represents the absence of the quantity being measured.
The following table shows how the four levels of measurement relate to the four features discussed above.
Feature | Nominal | Ordinal | Interval | Ratio |
---|---|---|---|---|
Categories | Yes | Yes | Yes | Yes |
Rank Order | No | Yes | Yes | Yes |
Equal Difference | No | No | Yes | Yes |
True Zero | No | No | No | Yes |
Nominal data is categorical data with no meaningful order, whereas ordinal data is categorical data with a meaningful order. The concept of true zero differentiates interval data from ratio data: ratio data is the same as interval data except that it has a true zero.
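The four levels are cumulative: each level has every feature of the one before it plus one more. The sketch below encodes that idea in a small lookup. The names `FEATURES`, `LEVELS`, `features_of`, and `level_from_features` are hypothetical and exist only for this illustration.

```python
FEATURES = ("categories", "rank_order", "equal_difference", "true_zero")
LEVELS = ("nominal", "ordinal", "interval", "ratio")

def features_of(level):
    """Return which of the four features a level of measurement has."""
    count = LEVELS.index(level) + 1  # nominal has 1 feature, ratio has all 4
    return {feature: i < count for i, feature in enumerate(FEATURES)}

def level_from_features(rank_order=False, equal_difference=False, true_zero=False):
    """Return the level of measurement implied by a set of features.

    Every level can be categorized, so 'categories' is always assumed.
    """
    flags = (True, rank_order, equal_difference, true_zero)
    count = sum(flags)
    # Features build on each other, e.g. a true zero requires equal differences.
    if flags != tuple(i < count for i in range(len(FEATURES))):
        raise ValueError("features must be cumulative")
    return LEVELS[count - 1]

print(features_of("ordinal"))
print(level_from_features(rank_order=True, equal_difference=True))  # interval
print(level_from_features(rank_order=True, equal_difference=True, true_zero=True))  # ratio
```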