UUID Deep Dive: RFC 4122, Versions, and Best Practices

UUID (Universally Unique Identifier) is one of those things every developer uses daily but few understand at depth. The version you choose, and how you store UUIDs, can have major implications for database performance, security, and system architecture. This guide covers the UUID layout, the main versions, collision odds, database performance, and the most common security mistake.

UUID Structure (RFC 4122)

A UUID is 128 bits, canonically formatted as 32 hexadecimal characters in 5 hyphen-separated groups:

550e8400 - e29b - 41d4 - a716 - 446655440000
8 chars    4      4      4      12 chars
time_low   time   ver+   var+   node
           mid    hi     clock

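The 13th hex digit (the version nibble) identifies the version: 1, 4, or 5 for the RFC 4122 versions covered here, and 7 for the newer time-ordered format. The top two bits of the 17th hex digit (the first digit of the fourth group) are the variant: RFC 4122 UUIDs always have 10xx here (hex 8, 9, a, or b). In the example above, the 4 in 41d4 marks version 4 and the a in a716 marks the RFC 4122 variant.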
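As a small illustration of where those bits live, the sketch below reads the version and variant straight out of the example string and then cross-checks the result against Python's standard `uuid` module. The variable names are only for illustration.

```python
import uuid

text = "550e8400-e29b-41d4-a716-446655440000"
hex_digits = text.replace("-", "")  # 32 hex characters, no hyphens

# 13th hex digit (index 12) is the version nibble
version = int(hex_digits[12], 16)

# Top two bits of the 17th hex digit (index 16) are the variant
variant_bits = int(hex_digits[16], 16) >> 2

print(version)            # version number
print(bin(variant_bits))  # 0b10 marks the RFC 4122 variant

# The standard library exposes the same information
parsed = uuid.UUID(text)
print(parsed.version, parsed.variant)
```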

UUID Versions Explained

Version 1 — Time + Node

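v1 generates uniqueness from two main sources: a 60-bit timestamp (100-nanosecond intervals since October 15, 1582) and a 48-bit node (typically the host's MAC address or a random value). A 14-bit clock sequence fills the remaining bits and guards against duplicates if the clock is set backwards.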

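Advantages: The embedded timestamp increases monotonically per generator, so the creation time can be recovered from the ID. Be aware that the canonical layout puts the low-order timestamp bits first, so v1 UUIDs do not sort chronologically as strings or raw bytes. To get the B-tree benefit of appending at the end of an index (which is much faster than random insertion), the time fields have to be reordered before storage, as MySQL's UUID_TO_BIN(uuid, 1) does, or a time-ordered version such as v7 used instead.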

Disadvantage: Using the real MAC address leaks network topology. Most modern implementations use a random node to mitigate this.
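To make the timestamp field concrete, here is a minimal sketch that recovers the creation time from a v1 UUID using Python's standard `uuid` module. The helper name `v1_timestamp` and the constant name are illustrative, not part of any library.

```python
import uuid
from datetime import datetime, timezone

# Number of 100-ns intervals between 1582-10-15 (the UUID epoch)
# and 1970-01-01 (the Unix epoch)
GREGORIAN_TO_UNIX_100NS = 0x01B21DD213814000

def v1_timestamp(u: uuid.UUID) -> datetime:
    """Return the creation time embedded in a version 1 UUID (UTC)."""
    if u.version != 1:
        raise ValueError("not a version 1 UUID")
    unix_100ns = u.time - GREGORIAN_TO_UNIX_100NS  # u.time is the 60-bit timestamp
    return datetime.fromtimestamp(unix_100ns / 10_000_000, tz=timezone.utc)

u = uuid.uuid1()
print(u, v1_timestamp(u).isoformat())
```

The same property that makes this possible is also the privacy cost of v1: anyone holding the ID can read when it was created.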

Version 4 — Random (Default Choice)

v4 uses 122 bits of cryptographically random data. The remaining 6 bits encode the version and variant. No state, no coordination, no infrastructure needed.

// JavaScript (built-in)
crypto.randomUUID()

# Python
import uuid
str(uuid.uuid4())

// Go (third-party package github.com/google/uuid)
import "github.com/google/uuid"
uuid.New().String()

// C#
Guid.NewGuid().ToString()

-- PostgreSQL 13+ (earlier versions need the pgcrypto extension)
SELECT gen_random_uuid();
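The split between 122 random bits and 6 fixed bits can be sketched by hand. The following is an illustration only (in practice the built-in generators above are the better choice), and `make_v4` is a hypothetical helper name.

```python
import os
import uuid

def make_v4() -> uuid.UUID:
    """Build a version 4 UUID by hand from 16 random bytes."""
    raw = bytearray(os.urandom(16))   # 128 bits from the OS random source
    raw[6] = (raw[6] & 0x0F) | 0x40   # top 4 bits of byte 6 become the version (4)
    raw[8] = (raw[8] & 0x3F) | 0x80   # top 2 bits of byte 8 become the variant (10)
    return uuid.UUID(bytes=bytes(raw))

u = make_v4()
print(u, u.version)
```

Six of the 128 bits are overwritten, which is where the figure of 122 random bits comes from.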

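v4 is the right default for most use cases: request IDs, feature flag keys, test data identifiers, and database primary keys on tables small enough that index locality is not a concern. It also suits identifiers that must be unpredictable, such as session IDs, but only when the generator draws from a cryptographically secure random source.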

Version 5 — Namespace + SHA-1 (Deterministic)

v5 is deterministic: the same namespace UUID + name string always produces the same output UUID. It hashes the concatenation using SHA-1 and sets the appropriate version/variant bits.
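The hashing steps described above can be sketched in a few lines. This is a hand-rolled illustration checked against Python's standard `uuid.uuid5`; the helper name `make_v5` is hypothetical.

```python
import hashlib
import uuid

def make_v5(namespace: uuid.UUID, name: str) -> uuid.UUID:
    """Derive a version 5 UUID from a namespace UUID and a name."""
    # Hash the 16 namespace bytes followed by the UTF-8 encoded name
    digest = hashlib.sha1(namespace.bytes + name.encode("utf-8")).digest()
    raw = bytearray(digest[:16])      # keep the first 128 of SHA-1's 160 bits
    raw[6] = (raw[6] & 0x0F) | 0x50   # version 5
    raw[8] = (raw[8] & 0x3F) | 0x80   # RFC 4122 variant
    return uuid.UUID(bytes=bytes(raw))

mine = make_v5(uuid.NAMESPACE_DNS, "example.com")
reference = uuid.uuid5(uuid.NAMESPACE_DNS, "example.com")
print(mine == reference)
```

Because the output depends only on the inputs, two services that agree on the namespace and the exact name string will derive the same ID without talking to each other.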

ℹ️

Standard Namespace UUIDs

RFC 4122 defines four well-known namespaces: DNS (6ba7b810-9dad-11d1-80b4-00c04fd430c8), URL, OID, and X.500. Use these for interoperability — any implementation using the same namespace and name will produce the same UUID.

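// Assumes the npm "uuid" package, which exports v5 along with the DNS and URL namespaces.
// eventPayload, documentTitle, and customNamespace are placeholders for your own values.
import { v5 as uuidv5 } from 'uuid';

// v5 use cases:
// 1. Stable user ID from email (same email → same UUID across services)
uuidv5("alice@example.com", uuidv5.DNS)
// → the same UUID on every call, in every service

// 2. Deduplicate events by content hash
// (serialize with a stable key order, or equal payloads may hash differently)
uuidv5(JSON.stringify(eventPayload), uuidv5.URL)

// 3. Deterministic IDs for documents, scoped to your own namespace UUID
uuidv5(documentTitle, customNamespace)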

Nil UUID

All 128 bits zero: 00000000-0000-0000-0000-000000000000. Used as a sentinel "no value" or "unset" indicator — equivalent to null but typed as a UUID. Common in databases where a nullable UUID column would complicate queries.

Collision Probability

For v4, the probability that two given UUIDs are identical is 1 / 2^122 (about 1.9 × 10⁻³⁷). By the birthday bound, reaching a 50% chance of at least one collision takes roughly 2.7 × 10¹⁸ UUIDs, which is about 86 years of generating 1 billion UUIDs per second.
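The figures above follow from the standard birthday approximation, n ≈ sqrt(2 · N · ln(1 / (1 − p))), where N is the number of possible values. A short sketch of the arithmetic, with illustrative names:

```python
import math

RANDOM_BITS = 122
SPACE = 2 ** RANDOM_BITS  # number of distinct v4 UUIDs

def ids_for_collision_probability(p: float) -> float:
    """Approximate number of random IDs needed for collision probability p."""
    return math.sqrt(2 * SPACE * math.log(1 / (1 - p)))

n_half = ids_for_collision_probability(0.5)
seconds = n_half / 1e9                    # at one billion UUIDs per second
years = seconds / (365.25 * 24 * 3600)

print(f"{n_half:.2e} UUIDs, roughly {years:.0f} years at 1e9 per second")
```

Lowering `p` shows how quickly the required count drops, which is useful when deciding whether a shorter random ID would still be safe for a given workload.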

💡

Practical Uniqueness

For any real application, UUID v4 collision probability is so astronomically low it can be ignored entirely. The risk of your database hardware failing is orders of magnitude higher than generating two identical v4 UUIDs.

Database Performance Considerations

This is where UUID version choice matters most in production systems:

ID Strategy	Insert Perf	Globally Unique	Client-Generated	Sortable
Auto-increment INT	Excellent	No (per-DB)	No	Yes
UUID v4 (random)	Degrades at scale	Yes	Yes	No
UUID v1 (time)	Fair (good with time fields reordered)	Yes	Yes	Only with time fields reordered
UUID v7 (timestamp prefix)	Excellent	Yes	Yes	Yes
ULID	Excellent	Yes	Yes	Yes
CUID2	Degrades at scale	Yes	Yes	No (not sortable by design)

Random v4 UUIDs cause B-tree index fragmentation because each new row inserts at a random position in the index rather than appending to the end. At millions of rows this triggers frequent page splits and cache misses, measurably degrading INSERT performance.
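The difference in insert position can be seen with a toy simulation. The sketch below treats a sorted Python list as a stand-in for a B-tree index and counts how many inserts land at the right-most position. It is a simplification (real indexes have pages, fill factors, and caching), and the "ordered" keys use a plain counter as a hypothetical substitute for a timestamp prefix.

```python
import bisect
import secrets
import uuid

def append_fraction(keys):
    """Fraction of inserts that land at the end of a sorted index."""
    index = []
    appends = 0
    for key in keys:
        pos = bisect.bisect_left(index, key)
        if pos == len(index):
            appends += 1          # new key is the largest so far: an append
        index.insert(pos, key)    # otherwise it lands somewhere in the middle
    return appends / len(index)

N = 2000
random_keys = [uuid.uuid4().int for _ in range(N)]
# Hypothetical time-ordered keys: counter in the high bits, random low bits
ordered_keys = [(i << 80) | secrets.randbits(80) for i in range(N)]

print("random v4 :", append_fraction(random_keys))
print("ordered   :", append_fraction(ordered_keys))
```

With random keys almost every insert lands mid-index, while ordered keys always append, which is the locality that time-prefixed identifiers are designed to preserve.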

UUID v7 — The Modern Answer

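UUID v7 (standardized in RFC 9562 in 2024, which obsoletes RFC 4122) starts with a 48-bit Unix millisecond timestamp, followed by the version and variant bits and 74 random bits. This makes UUIDs sort by creation time at millisecond granularity (ordering within the same millisecond is random unless the generator adds a counter), combining the B-tree locality of sequential IDs with the global uniqueness of random UUIDs. PostgreSQL 18 ships a built-in uuidv7() function. MySQL's built-in UUID() still returns v1, so on MySQL and on older PostgreSQL releases v7 values are usually generated in the application or by a library.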
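For runtimes without a built-in v7 generator, the layout is simple enough to sketch by hand. The following is a minimal illustration, not a complete implementation: it has no counter, so IDs created within the same millisecond are not ordered relative to each other. The helper name `make_v7` and its optional `unix_ms` parameter are assumptions made for this example.

```python
import os
import time
import uuid
from typing import Optional

def make_v7(unix_ms: Optional[int] = None) -> uuid.UUID:
    """Build a version 7 UUID: 48-bit millisecond timestamp, then random bits."""
    if unix_ms is None:
        unix_ms = time.time_ns() // 1_000_000
    # 6 timestamp bytes (big-endian) followed by 10 random bytes
    raw = bytearray(unix_ms.to_bytes(6, "big") + os.urandom(10))
    raw[6] = (raw[6] & 0x0F) | 0x70   # version 7
    raw[8] = (raw[8] & 0x3F) | 0x80   # RFC 4122 / RFC 9562 variant
    return uuid.UUID(bytes=bytes(raw))

earlier = make_v7(1_700_000_000_000)
later = make_v7(1_700_000_000_001)
print(earlier < later)   # a later millisecond always sorts after an earlier one
print(make_v7())
```

Because the timestamp occupies the most significant bytes, ordering holds for the string form, the raw bytes, and the integer value alike.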

UUIDs Are Identifiers, Not Secrets

A common mistake: using a UUID as a secret token (password reset link, email verification token, API key). UUID v4 has 122 bits of entropy — which sounds secure — but UUIDs are designed to be shared. They appear in URLs, logs, headers, and error messages. If they're guessable from context or leaked through logs, security breaks down.

For secrets, generate a dedicated cryptographic random token using crypto.randomBytes(32).toString('hex') (Node.js) or the equivalent in your language — and store only its hash.
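A Python equivalent of that advice might look like the sketch below, using only the standard library. The function names are hypothetical, and where the digest gets stored (database column, cache, and so on) is left out.

```python
import hashlib
import hmac
import secrets

def issue_token():
    """Create a secret token and the digest that should be stored."""
    token = secrets.token_hex(32)  # 32 random bytes as 64 hex characters
    digest = hashlib.sha256(token.encode("ascii")).hexdigest()
    return token, digest           # send token to the user, persist only digest

def verify_token(presented: str, stored_digest: str) -> bool:
    """Check a presented token against the stored digest in constant time."""
    candidate = hashlib.sha256(presented.encode("utf-8")).hexdigest()
    return hmac.compare_digest(candidate, stored_digest)

token, digest = issue_token()
print(verify_token(token, digest))
print(verify_token("not-the-token", digest))
```

A plain SHA-256 digest is reasonable here because the token itself is long and random. That reasoning does not carry over to user-chosen passwords, which need a dedicated password hashing function.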