Differences: Primary Key, Foreign Key, Index and Unique Contraint

TL;DR

Key/Index Type	good	bad
Primary Key	fast lookup	Requires additional storage space (benefit outweighs the cost)
Foreign Key	data integrity	additional table lookups
Index	Faster read	Slower write operations (need to update both table and index)
Composite Index	Supports queries on multiple fields	Requires more storage space
Unique Constraint	business logic	Overhead from uniqueness checks during write operations

Primary Key

Uniquely identifies each record in a table. Cannot be NULL

id is Primary Key here.

CREATE TABLE users (
    id INT PRIMARY KEY,
    username VARCHAR(50)
);

SELECT * FROM users WHERE id = 1;

Foreign Key

Creates a link between two tables, ensuring referential integrity

user_id is Foreign Key here.

CREATE TABLE orders (
    order_id INT PRIMARY KEY,
    user_id INT,
    FOREIGN KEY (user_id) REFERENCES users(id)
    ON DELETE CASCADE
    ON UPDATE CASCADE
);

SELECT orders.*, users.username 
FROM orders 
JOIN users ON orders.user_id = users.id;

Plus, here’s how foreign key relationships handle parent-child data changes:

CASCADE: If the parent gets updated or deleted, the child follows.

SET NULL: If the child can’t find the parent, it just marks the parent as “unknown.”

NO ACTION: If there are children, the parent can’t be touched.
It checks at every step, so it doesn’t wait until after the operation is completed to determine that “it’s not allowed.”

RESTRICT: Pretty much the same as NO ACTION, just immediately stops you from doing anything.

Index

Improves query performance

B-tree index (default)

Without an index, the database would perform a full table scan

--
CREATE INDEX idx_lastname ON employees(last_name);

-- with index now
SELECT * FROM employees WHERE last_name BETWEEN 'A' AND 'C';

Hash index

Best for exact equality comparisons

Used in memory tables and when extremely fast exact lookups are needed

CREATE TABLE cache (
    id INT,
    data VARCHAR(100),
    INDEX USING HASH (id)
) ENGINE=MEMORY;

-- yes
SELECT * FROM cache WHERE id = 123;

-- no
SELECT * FROM cache WHERE id > 100;

Bitmap index

Ideal for columns with few unique values (enum-like)

status

gender

flag, t or f

CREATE TABLE orders (
    id INT PRIMARY KEY,
    status ENUM('pending', 'processing', 'shipped', 'delivered') 
);

CREATE BITMAP INDEX idx_status ON orders(status);

Full-text index

For searching within large text fields.

Improves performance over LIKE queries.

-- Without index
SELECT * FROM articles 
WHERE content LIKE '%apple%' 
   OR content LIKE '%banana%';

-- With index
SELECT * FROM articles 
WHERE MATCH(content) 
AGAINST('apple banana' IN NATURAL LANGUAGE MODE);

Composite Index

Index on multiple columns

idx_name_age is just for naming purposes, it's not used when writing queries

CREATE INDEX idx_name_age ON users(last_name, first_name);

-- Uses composite index
SELECT * FROM users 
WHERE last_name = 'Smith' AND first_name = 'John';

Unique Constraint

Ensures no duplicate values

Can be NULL (unlike PRIMARY KEY)

CREATE TABLE users (
    id INT PRIMARY KEY,
    email VARCHAR(100) UNIQUE
);

Choosing Between Unique Constraint and Primary Key

Simple, small database only need a unique identifier, go using primary key is enough.

To ensure uniqueness of both an ID and another field (like email), using both. This is common with ORM tools.

In microservices architecture, Unique Constraints are often used as natural keys. For example, when checking product inventory, you're more likely to use a product code than an ID. ISBN for books is another example.

Natural Key vs. Surrogate Key

Natural Key (NK): Inherent attrs that are unique and meaningful in business logic

Surrogate Key (SK): Artificially created, no business logic significance

Primary Keys can be either NK or SK.

Unique Constraints are typically used for NK.

In most cases, SK are implemented as Primary Keys, while NK are implemented as Unique Constraints.