What Is Supervised Learning?
Supervised learning is the most widely used machine learning paradigm. The model learns from a dataset of labeled examples — input-output pairs where the correct answer (label) is known. The goal is to learn a mapping function that can accurately predict outputs for new, unseen inputs.
Think of it like a teacher grading homework: the model sees the question (input) and the correct answer (label), learns the pattern, and eventually can answer new questions on its own.
Two Main Tasks
Classification
Predicting a discrete category or class label.
Examples:
- Email: spam or not spam
- Image: cat, dog, or bird
- Medical test: positive or negative
- Transaction: fraudulent or legitimate
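Classification can be made concrete with a deliberately tiny sketch: a 1-nearest-neighbor rule on a toy spam dataset. The feature values and labels below are invented for illustration only.

```python
# Toy classification: label messages "spam" or "not spam" from two
# invented numeric features (e.g. link count, exclamation count).
train = [
    ((1, 0), "not spam"),
    ((0, 1), "not spam"),
    ((7, 5), "spam"),
    ((9, 6), "spam"),
]

def predict(x):
    """Classify x by copying the label of its nearest training example (1-NN)."""
    def dist2(a, b):
        return sum((ai - bi) ** 2 for ai, bi in zip(a, b))
    _, label = min(train, key=lambda pair: dist2(pair[0], x))
    return label

print(predict((8, 5)))  # nearest neighbor is (7, 5) -> "spam"
print(predict((0, 0)))  # nearest neighbor is (1, 0) -> "not spam"
```

The output is always one of a fixed set of class labels, which is what makes this classification rather than regression.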
Regression
Predicting a continuous numerical value.
Examples:
- House price based on features (size, location, bedrooms)
- Stock price for the next trading day
- Patient's blood pressure based on lifestyle factors
- Energy consumption based on weather conditions
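The house-price example can be sketched as simple linear regression fit in closed form. The sizes and prices below are invented, and chosen to lie exactly on a line so the fitted coefficients are easy to check.

```python
# Toy regression: fit price = slope * size + intercept by ordinary least squares.
sizes  = [50, 70, 90, 110]     # square meters (invented data)
prices = [150, 200, 250, 300]  # thousands

n = len(sizes)
mean_x = sum(sizes) / n
mean_y = sum(prices) / n
# Closed-form solution for simple (one-feature) linear regression.
slope = sum((x - mean_x) * (y - mean_y) for x, y in zip(sizes, prices)) \
        / sum((x - mean_x) ** 2 for x in sizes)
intercept = mean_y - slope * mean_x

print(slope, intercept)          # 2.5 25.0 (this toy data is exactly linear)
print(slope * 100 + intercept)   # predicted price for 100 m^2 -> 275.0
```

Unlike classification, the prediction is a continuous number, so any value on the fitted line is a legal output.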
Common Supervised Learning Algorithms
| Algorithm | Task Type | Strengths |
|---|---|---|
| Linear Regression | Regression | Simple, interpretable, fast |
| Logistic Regression | Classification | Probabilistic outputs, efficient |
| Decision Trees | Both | Interpretable, handles non-linear data |
| Random Forest | Both | High accuracy, resistant to overfitting |
| Support Vector Machines | Both | Effective in high-dimensional spaces |
| k-Nearest Neighbors | Both | Simple; no explicit training phase (lazy learner) |
| Neural Networks | Both | Handles complex, non-linear patterns |
| Extreme Learning Machines | Both | Ultra-fast training, lightweight |
The Supervised Learning Workflow
1. Collect Data — Gather a representative dataset with input features and target labels
2. Split Data — Divide into training set (typically 70–80%) and test set (20–30%)
3. Choose Algorithm — Select based on data type, size, and problem requirements
4. Train Model — Feed training data to the algorithm; it learns the mapping function
5. Evaluate — Test on held-out data using metrics like accuracy, precision, recall, F1 score (classification) or MSE, MAE, R² (regression)
6. Tune — Adjust hyperparameters, add regularization, or try different algorithms
7. Deploy — Put the model into production
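The split–train–evaluate core of this workflow can be sketched in a few lines. The dataset is invented, and a trivial majority-class baseline stands in for the chosen algorithm; any real model would slot into the same structure.

```python
import random

# Workflow steps 2, 4, 5 on a toy labeled dataset:
# split the data, "train" a majority-class baseline, evaluate on held-out data.
random.seed(0)  # fixed seed so the split is reproducible
data = [(x, "pos" if x > 5 else "neg") for x in range(11)]  # invented labels
random.shuffle(data)

split = int(0.8 * len(data))              # 80/20 train-test split
train, test = data[:split], data[split:]

# "Training": learn the most frequent label in the training set.
labels = [y for _, y in train]
majority = max(set(labels), key=labels.count)

# Evaluation: accuracy of the baseline on unseen examples.
accuracy = sum(y == majority for _, y in test) / len(test)
print(majority, accuracy)
```

Note that the model never sees the test set during training; that separation is what makes the reported accuracy an honest estimate of generalization.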
Key Concepts
Overfitting vs. Underfitting
- Overfitting: The model memorizes the training data (including noise) and performs poorly on new data
- Underfitting: The model is too simple to capture the underlying patterns
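Overfitting is easy to demonstrate numerically: on the same noisy data, a high-degree polynomial achieves a lower training error than a straight line, even though the underlying relationship here is linear by construction. The data below is synthetic.

```python
import numpy as np

# Overfitting demo: a degree-9 polynomial chases the noise in 10 training
# points (driving training error toward zero), while a straight line keeps
# the underlying linear trend.
rng = np.random.default_rng(0)
x_train = np.linspace(0, 1, 10)
x_test  = np.linspace(0.05, 0.95, 10)
true_fn = lambda x: 2 * x + 1                      # the real relationship
y_train = true_fn(x_train) + rng.normal(0, 0.2, x_train.shape)
y_test  = true_fn(x_test)  + rng.normal(0, 0.2, x_test.shape)

errs = {}
for degree in (1, 9):
    coeffs = np.polyfit(x_train, y_train, degree)   # least-squares poly fit
    errs[degree] = (
        np.mean((np.polyval(coeffs, x_train) - y_train) ** 2),  # train MSE
        np.mean((np.polyval(coeffs, x_test) - y_test) ** 2),    # test MSE
    )
    print(degree, errs[degree])
```

The flexible model always wins on training error; whether it also wins on test error is exactly what the bias-variance discussion below is about.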
Bias-Variance Tradeoff
- High bias: Model makes strong assumptions, misses important patterns (underfitting)
- High variance: Model is too sensitive to training data, captures noise (overfitting)
- The goal is to find the sweet spot between the two
Cross-Validation
A technique for robust model evaluation: the data is split into multiple folds, and the model is trained and tested on different combinations. This provides a more reliable estimate of performance than a single train/test split.
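The fold mechanics can be sketched by hand. The "data" and per-fold score below are placeholders; a real model would be fit on each training portion and scored on the held-out fold.

```python
# 5-fold cross-validation by hand: each fold serves exactly once as the
# test set, and the remaining folds form the training set.
data = list(range(20))          # stand-in for 20 labeled examples
k = 5
fold_size = len(data) // k

scores = []
for i in range(k):
    test  = data[i * fold_size:(i + 1) * fold_size]
    train = data[:i * fold_size] + data[(i + 1) * fold_size:]
    # A real model would be trained on `train` and scored on `test`;
    # this dummy score just records the training fraction of each split.
    scores.append(len(train) / len(data))

avg = sum(scores) / len(scores)
print(avg)  # average over folds -> 0.8
```

Averaging the k scores smooths out the luck of any single split, which is why the estimate is more reliable than one train/test split.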
Supervised Learning with ELMs
Extreme Learning Machines excel in supervised learning scenarios where speed is critical. Because ELMs solve for output weights analytically (no iterative training), they can train on labeled datasets in milliseconds — enabling rapid prototyping, real-time model updates, and deployment on edge devices where computational resources are limited.
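The idea of solving for output weights analytically can be sketched in a few lines of numpy. This is a minimal illustration, not a production ELM: the hidden layer is random and fixed, and the output weights come from a single pseudoinverse solve. The hidden size and toy target are arbitrary choices.

```python
import numpy as np

# Minimal ELM sketch: random fixed hidden layer + analytic output weights.
# The hidden weights are never trained; beta comes from one least-squares solve.
rng = np.random.default_rng(42)

def elm_fit(X, y, hidden=50):
    W = rng.normal(size=(X.shape[1], hidden))   # random input weights (fixed)
    b = rng.normal(size=hidden)                 # random biases (fixed)
    H = np.tanh(X @ W + b)                      # hidden-layer activations
    beta = np.linalg.pinv(H) @ y                # analytic output weights
    return W, b, beta

def elm_predict(model, X):
    W, b, beta = model
    return np.tanh(X @ W + b) @ beta

# Fit a toy 1-D regression target in one shot — no iterative training loop.
X = np.linspace(-1, 1, 100).reshape(-1, 1)
y = np.sin(3 * X[:, 0])
model = elm_fit(X, y)
mse = np.mean((elm_predict(model, X) - y) ** 2)
print(mse)  # training error is tiny despite zero gradient descent
```

Because the only expensive step is one pseudoinverse, training cost scales with a single linear-algebra solve rather than with epochs of gradient descent, which is the source of the speed claims above.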