Differences: Primary Key, Foreign Key, Index and Unique Contraint
TL;DR
Key/Index Type | good | bad |
Primary Key | fast lookup | Requires additional storage space (benefit outweighs the cost) |
Foreign Key | data integrity | additional table lookups |
Index | Faster read | Slower write operations (need to update both table and index) |
Composite Index | Supports queries on multiple fields | Requires more storage space |
Unique Constraint | business logic | Overhead from uniqueness checks during write operations |
Primary Key
Uniquely identifies each record in a table. Cannot be NULL
id
is Primary Key here.
CREATE TABLE users (
id INT PRIMARY KEY,
username VARCHAR(50)
);
SELECT * FROM users WHERE id = 1;
Foreign Key
Creates a link between two tables, ensuring referential integrity
user_id
is Foreign Key here.
CREATE TABLE orders (
order_id INT PRIMARY KEY,
user_id INT,
FOREIGN KEY (user_id) REFERENCES users(id)
ON DELETE CASCADE
ON UPDATE CASCADE
);
SELECT orders.*, users.username
FROM orders
JOIN users ON orders.user_id = users.id;
Plus, here’s how foreign key relationships handle parent-child data changes:
- CASCADE: If the parent gets updated or deleted, the child follows.
- SET NULL: If the child can’t find the parent, it just marks the parent as “unknown.”
- NO ACTION: If there are children, the parent can’t be touched.
It checks at every step, so it doesn’t wait until after the operation is completed to determine that “it’s not allowed.”
- RESTRICT: Pretty much the same as NO ACTION, just immediately stops you from doing anything.
Index
Improves query performance
B-tree index (default)
Without an index, the database would perform a full table scan
--
CREATE INDEX idx_lastname ON employees(last_name);
-- with index now
SELECT * FROM employees WHERE last_name BETWEEN 'A' AND 'C';
Hash index
Best for exact equality comparisons
Used in memory tables and when extremely fast exact lookups are needed
CREATE TABLE cache (
id INT,
data VARCHAR(100),
INDEX USING HASH (id)
) ENGINE=MEMORY;
-- yes
SELECT * FROM cache WHERE id = 123;
-- no
SELECT * FROM cache WHERE id > 100;
Bitmap index
Ideal for columns with few unique values (enum-like)
- status
- gender
- flag, t or f
CREATE TABLE orders (
id INT PRIMARY KEY,
status ENUM('pending', 'processing', 'shipped', 'delivered')
);
CREATE BITMAP INDEX idx_status ON orders(status);
Full-text index
For searching within large text fields.
Improves performance over LIKE queries.
-- Without index
SELECT * FROM articles
WHERE content LIKE '%apple%'
OR content LIKE '%banana%';
-- With index
SELECT * FROM articles
WHERE MATCH(content)
AGAINST('apple banana' IN NATURAL LANGUAGE MODE);
Composite Index
Index on multiple columns
idx_name_age
is just for naming purposes, it's not used when writing queries
CREATE INDEX idx_name_age ON users(last_name, first_name);
-- Uses composite index
SELECT * FROM users
WHERE last_name = 'Smith' AND first_name = 'John';
Unique Constraint
- Ensures no duplicate values
Can be NULL (unlike PRIMARY KEY)
CREATE TABLE users (
id INT PRIMARY KEY,
email VARCHAR(100) UNIQUE
);
Choosing Between Unique Constraint and Primary Key
Simple, small database only need a unique identifier, go using primary key is enough.
To ensure uniqueness of both an ID and another field (like email), using both. This is common with ORM tools.
In microservices architecture, Unique Constraints are often used as natural keys. For example, when checking product inventory, you're more likely to use a product code than an ID. ISBN for books is another example.
Natural Key vs. Surrogate Key
- Natural Key (NK): Inherent attrs that are unique and meaningful in business logic
- Surrogate Key (SK): Artificially created, no business logic significance
Primary Keys can be either NK or SK.
Unique Constraints are typically used for NK.
In most cases, SK are implemented as Primary Keys, while NK are implemented as Unique Constraints.