Skip to content

PostgreSQL Data Types

PostgreSQL offers a rich variety of data types. Choosing the right type ensures data integrity, optimizes storage, and improves query performance. This chapter covers all essential data types with practical examples.

Why Data Types Matter

┌─────────────────────────────────────────────────────────────────┐
│                   Importance of Data Types                       │
├─────────────────────────────────────────────────────────────────┤
│                                                                  │
│   ✅ Data Integrity    - Prevents invalid data from being stored│
│   ✅ Storage Efficiency - Uses appropriate amount of space      │
│   ✅ Query Performance  - Faster comparisons and calculations   │
│   ✅ Clear Intent       - Documents what data is expected       │
│                                                                  │
│   Example:                                                       │
│   email VARCHAR(255)  → Only text, max 255 characters           │
│   age INTEGER         → Only whole numbers                       │
│   price DECIMAL(10,2) → Numbers with 2 decimal places           │
│   is_active BOOLEAN   → Only true or false                      │
│                                                                  │
└─────────────────────────────────────────────────────────────────┘

Numeric Types

Integer Types

TypeStorageRangeUse Case
SMALLINT2 bytes-32,768 to 32,767Age, small counts
INTEGER4 bytes-2.1 billion to 2.1 billionMost integer needs
BIGINT8 bytes-9.2 quintillion to 9.2 quintillionLarge IDs, big data
sql
-- Creating a table with integer types
CREATE TABLE product_inventory (
    id INTEGER PRIMARY KEY,
    quantity SMALLINT,           -- Small numbers
    views INTEGER,               -- Medium numbers
    total_sold BIGINT           -- Large numbers
);

-- SERIAL: Auto-incrementing integer
CREATE TABLE users (
    id SERIAL PRIMARY KEY,       -- Auto-generates: 1, 2, 3, ...
    name VARCHAR(100)
);

-- BIGSERIAL for large tables
CREATE TABLE events (
    id BIGSERIAL PRIMARY KEY,    -- For billions of rows
    event_name VARCHAR(200)
);

Decimal Types

TypeDescriptionPrecisionUse Case
DECIMAL(p,s)Exact precisionUser-definedMoney, precise calculations
NUMERIC(p,s)Same as DECIMALUser-definedSame as DECIMAL
REAL4-byte floating point6 decimal digitsScientific (approximate)
DOUBLE PRECISION8-byte floating point15 decimal digitsScientific (approximate)
sql
-- DECIMAL/NUMERIC for money (exact precision)
CREATE TABLE products (
    id SERIAL PRIMARY KEY,
    name VARCHAR(100),
    price DECIMAL(10, 2),        -- Up to 99999999.99
    tax_rate DECIMAL(5, 4)       -- Up to 9.9999 (like 0.0825)
);

-- Insert examples
INSERT INTO products (name, price, tax_rate) VALUES
    ('Laptop', 999.99, 0.0825),
    ('Mouse', 29.50, 0.0825);

-- Precise calculations
SELECT
    name,
    price,
    price * tax_rate AS tax,
    price + (price * tax_rate) AS total
FROM products;

Avoid FLOAT for Money

Never use REAL or DOUBLE PRECISION for money! Floating-point numbers can have rounding errors.

sql
-- BAD: Can cause rounding errors
price REAL  -- 0.1 + 0.2 might not equal 0.3

-- GOOD: Exact precision
price DECIMAL(10, 2)

Text Types

Character Types

TypeDescriptionUse Case
CHAR(n)Fixed length, padded with spacesFixed codes (country, status)
VARCHAR(n)Variable length, max n charactersMost text fields
TEXTUnlimited lengthLong content
sql
CREATE TABLE articles (
    id SERIAL PRIMARY KEY,
    title VARCHAR(200),          -- Up to 200 characters
    slug VARCHAR(100),           -- URL-friendly title
    status CHAR(1),              -- 'D'raft, 'P'ublished, 'A'rchived
    content TEXT,                -- Full article body
    summary VARCHAR(500)
);

-- Examples
INSERT INTO articles (title, slug, status, content, summary)
VALUES (
    'Getting Started with PostgreSQL',
    'getting-started-postgresql',
    'P',
    'This is the full article content that can be very long...',
    'A quick introduction to PostgreSQL basics.'
);

Text Operations

sql
-- String functions
SELECT
    UPPER(name) AS uppercase,
    LOWER(email) AS lowercase,
    LENGTH(name) AS name_length,
    CONCAT(first_name, ' ', last_name) AS full_name,
    SUBSTRING(phone FROM 1 FOR 3) AS area_code
FROM users;

-- Pattern matching
SELECT * FROM users WHERE email LIKE '%@gmail.com';
SELECT * FROM users WHERE name ILIKE 'john%';  -- Case insensitive

Date and Time Types

Available Types

TypeStorageDescriptionExample
DATE4 bytesDate only'2024-01-15'
TIME8 bytesTime only'14:30:00'
TIMESTAMP8 bytesDate and time'2024-01-15 14:30:00'
TIMESTAMPTZ8 bytesWith timezone'2024-01-15 14:30:00+07'
INTERVAL16 bytesTime duration'1 year 2 months'
sql
CREATE TABLE events (
    id SERIAL PRIMARY KEY,
    name VARCHAR(200),
    event_date DATE,
    start_time TIME,
    created_at TIMESTAMP DEFAULT NOW(),
    scheduled_at TIMESTAMPTZ,
    duration INTERVAL
);

-- Insert with various date/time formats
INSERT INTO events (name, event_date, start_time, scheduled_at, duration)
VALUES (
    'Team Meeting',
    '2024-03-15',
    '09:00:00',
    '2024-03-15 09:00:00+07:00',
    '2 hours'
);

-- Date arithmetic
SELECT
    event_date,
    event_date + INTERVAL '1 week' AS next_week,
    event_date - INTERVAL '1 month' AS last_month,
    scheduled_at + duration AS end_time,
    AGE(event_date, CURRENT_DATE) AS days_until
FROM events;

Common Date Functions

sql
-- Current date/time
SELECT
    CURRENT_DATE,                     -- 2024-01-15
    CURRENT_TIME,                     -- 14:30:00.123456
    CURRENT_TIMESTAMP,                -- 2024-01-15 14:30:00.123456
    NOW();                            -- Same as CURRENT_TIMESTAMP

-- Extract parts
SELECT
    EXTRACT(YEAR FROM created_at) AS year,
    EXTRACT(MONTH FROM created_at) AS month,
    EXTRACT(DAY FROM created_at) AS day,
    EXTRACT(HOUR FROM created_at) AS hour,
    EXTRACT(DOW FROM created_at) AS day_of_week  -- 0=Sunday
FROM events;

-- Format dates
SELECT
    TO_CHAR(created_at, 'YYYY-MM-DD') AS iso_date,
    TO_CHAR(created_at, 'Month DD, YYYY') AS pretty_date,
    TO_CHAR(created_at, 'HH12:MI AM') AS time_12hr
FROM events;

-- Date truncation
SELECT
    DATE_TRUNC('month', created_at) AS month_start,
    DATE_TRUNC('year', created_at) AS year_start,
    DATE_TRUNC('hour', created_at) AS hour_start
FROM events;

Use TIMESTAMPTZ

Always use TIMESTAMPTZ instead of TIMESTAMP for real applications. It handles timezone conversions automatically and prevents timezone-related bugs.

sql
-- PostgreSQL converts to/from UTC automatically
SET timezone = 'Asia/Phnom_Penh';
SELECT '2024-01-15 09:00:00+00'::timestamptz;
-- Shows: 2024-01-15 16:00:00+07

Boolean Type

sql
CREATE TABLE users (
    id SERIAL PRIMARY KEY,
    email VARCHAR(255) UNIQUE,
    is_active BOOLEAN DEFAULT true,
    is_admin BOOLEAN DEFAULT false,
    email_verified BOOLEAN DEFAULT false
);

-- Boolean values: true, false, NULL
INSERT INTO users (email, is_active, is_admin, email_verified)
VALUES
    ('admin@example.com', true, true, true),
    ('user@example.com', true, false, false),
    ('inactive@example.com', false, false, true);

-- Querying booleans
SELECT * FROM users WHERE is_active;           -- Same as is_active = true
SELECT * FROM users WHERE NOT is_admin;        -- Same as is_admin = false
SELECT * FROM users WHERE is_active AND email_verified;

UUID Type

UUID (Universally Unique Identifier) is perfect for distributed systems or when you don't want sequential IDs.

sql
-- Enable UUID extension
CREATE EXTENSION IF NOT EXISTS "uuid-ossp";

CREATE TABLE sessions (
    id UUID DEFAULT uuid_generate_v4() PRIMARY KEY,
    user_id INTEGER REFERENCES users(id),
    created_at TIMESTAMPTZ DEFAULT NOW(),
    expires_at TIMESTAMPTZ
);

-- Insert (UUID generated automatically)
INSERT INTO sessions (user_id, expires_at)
VALUES (1, NOW() + INTERVAL '24 hours')
RETURNING id;

-- Result: id = 'a0eebc99-9c0b-4ef8-bb6d-6bb9bd380a11'

JSON Types

PostgreSQL has excellent JSON support with two types:

TypeDescriptionUse Case
JSONStores as text, validates on insertInfrequent access
JSONBBinary format, supports indexingFrequent queries (recommended)
sql
CREATE TABLE products (
    id SERIAL PRIMARY KEY,
    name VARCHAR(200),
    attributes JSONB,
    metadata JSON
);

-- Insert JSON data
INSERT INTO products (name, attributes, metadata) VALUES
(
    'Gaming Laptop',
    '{
        "brand": "TechPro",
        "specs": {
            "cpu": "Intel i9",
            "ram": "32GB",
            "storage": "1TB SSD"
        },
        "colors": ["black", "silver"],
        "in_stock": true,
        "price": 1299.99
    }',
    '{"created_by": "admin", "version": 1}'
);

Querying JSON Data

sql
-- Access JSON fields
SELECT
    name,
    attributes->>'brand' AS brand,              -- Get as text
    attributes->'specs'->>'cpu' AS cpu,         -- Nested access
    attributes->'specs' AS all_specs,           -- Get as JSON
    attributes->'colors'->0 AS first_color      -- Array access
FROM products;

-- Filter by JSON values
SELECT * FROM products
WHERE attributes->>'brand' = 'TechPro';

SELECT * FROM products
WHERE (attributes->'specs'->>'ram')::integer >= 16;

-- Check if key exists
SELECT * FROM products
WHERE attributes ? 'colors';

-- Check if value is in array
SELECT * FROM products
WHERE attributes->'colors' ? 'black';

-- Update JSON fields
UPDATE products
SET attributes = jsonb_set(attributes, '{price}', '1199.99')
WHERE id = 1;

-- Add new key
UPDATE products
SET attributes = attributes || '{"warranty": "2 years"}'::jsonb
WHERE id = 1;

JSONB vs JSON

Always prefer JSONB over JSON:

  • JSONB is faster for queries
  • JSONB supports indexing
  • JSONB removes duplicate keys and whitespace
  • JSON only validates and stores as text

Array Types

PostgreSQL supports arrays of any data type.

sql
CREATE TABLE posts (
    id SERIAL PRIMARY KEY,
    title VARCHAR(200),
    tags TEXT[],
    ratings INTEGER[],
    view_history TIMESTAMP[]
);

-- Insert arrays
INSERT INTO posts (title, tags, ratings) VALUES
    ('PostgreSQL Tips', ARRAY['database', 'postgresql', 'tutorial'], ARRAY[5, 4, 5, 5]),
    ('Web Development', '{"html", "css", "javascript"}', '{4, 3, 5}');

-- Query arrays
SELECT
    title,
    tags[1] AS first_tag,           -- Arrays are 1-indexed
    array_length(tags, 1) AS tag_count,
    'postgresql' = ANY(tags) AS has_postgresql_tag
FROM posts;

-- Filter by array contents
SELECT * FROM posts WHERE 'database' = ANY(tags);
SELECT * FROM posts WHERE tags @> ARRAY['postgresql'];  -- Contains

-- Array functions
SELECT
    array_append(tags, 'new_tag') AS with_new_tag,
    array_remove(tags, 'tutorial') AS without_tutorial,
    array_cat(tags, ARRAY['extra1', 'extra2']) AS combined
FROM posts;

-- Unnest: Convert array to rows
SELECT id, title, unnest(tags) AS tag FROM posts;

Enum Types

Enums ensure values come from a predefined list.

sql
-- Create enum type
CREATE TYPE order_status AS ENUM (
    'pending',
    'processing',
    'shipped',
    'delivered',
    'cancelled'
);

CREATE TYPE priority_level AS ENUM ('low', 'medium', 'high', 'critical');

-- Use in table
CREATE TABLE orders (
    id SERIAL PRIMARY KEY,
    customer_id INTEGER,
    status order_status DEFAULT 'pending',
    priority priority_level DEFAULT 'medium',
    created_at TIMESTAMPTZ DEFAULT NOW()
);

-- Insert (only valid enum values allowed)
INSERT INTO orders (customer_id, status, priority)
VALUES (1, 'processing', 'high');

-- Invalid value causes error
-- INSERT INTO orders (status) VALUES ('invalid'); -- Error!

-- Query
SELECT * FROM orders WHERE status = 'pending';
SELECT * FROM orders WHERE priority IN ('high', 'critical');

-- Add new value to enum
ALTER TYPE order_status ADD VALUE 'refunded' AFTER 'cancelled';

Network Types

PostgreSQL has specialized types for network data.

sql
CREATE TABLE server_logs (
    id SERIAL PRIMARY KEY,
    client_ip INET,
    server_network CIDR,
    mac_address MACADDR,
    created_at TIMESTAMPTZ DEFAULT NOW()
);

INSERT INTO server_logs (client_ip, server_network, mac_address)
VALUES (
    '192.168.1.100',
    '192.168.1.0/24',
    '08:00:2b:01:02:03'
);

-- Network operations
SELECT
    client_ip,
    host(client_ip) AS ip_text,
    client_ip << '192.168.0.0/16' AS is_private,  -- Is contained in
    network(server_network) AS network_address
FROM server_logs;

Special Types

Money Type

sql
CREATE TABLE transactions (
    id SERIAL PRIMARY KEY,
    amount MONEY,
    currency CHAR(3) DEFAULT 'USD'
);

INSERT INTO transactions (amount) VALUES ('$1,234.56');

SELECT
    amount,
    amount::numeric AS as_number
FROM transactions;

Geometric Types

sql
CREATE TABLE locations (
    id SERIAL PRIMARY KEY,
    name VARCHAR(100),
    coordinates POINT,
    boundary POLYGON
);

INSERT INTO locations (name, coordinates)
VALUES ('Office', POINT(11.5564, 104.9282));

-- Calculate distance
SELECT
    a.name,
    b.name,
    a.coordinates <-> b.coordinates AS distance
FROM locations a, locations b
WHERE a.id != b.id;

Type Conversion

sql
-- Explicit casting
SELECT
    '100'::INTEGER,
    '3.14'::DECIMAL,
    '2024-01-15'::DATE,
    'true'::BOOLEAN,
    123::TEXT;

-- Using CAST
SELECT CAST('100' AS INTEGER);

-- Common conversions
SELECT
    price::TEXT || ' USD' AS price_text,
    created_at::DATE AS date_only,
    (views::DECIMAL / 1000)::DECIMAL(10,1) AS views_k
FROM products;

Summary

CategoryTypesUse Case
NumbersINTEGER, BIGINT, DECIMALCounts, IDs, money
TextVARCHAR, TEXTNames, content
Date/TimeDATE, TIMESTAMPTZ, INTERVALScheduling, logs
BooleanBOOLEANFlags, toggles
UUIDUUIDDistributed IDs
JSONJSONBFlexible data
Arraystype[]Multiple values
EnumCustomLimited choices

Common Type Conversion Pitfalls

Silent Truncation

sql
-- VARCHAR truncates without warning in some modes
CREATE TABLE test (name VARCHAR(5));
INSERT INTO test VALUES ('Hello World');  -- Error: value too long!

-- TEXT has no limit
CREATE TABLE test2 (name TEXT);
INSERT INTO test2 VALUES ('Hello World');  -- Works fine

Numeric Precision Loss

sql
-- REAL/FLOAT loses precision
SELECT 0.1::REAL + 0.2::REAL = 0.3::REAL;  -- false!
SELECT 0.1::REAL + 0.2::REAL;               -- 0.30000001192092896

-- DECIMAL maintains precision
SELECT 0.1::DECIMAL + 0.2::DECIMAL = 0.3::DECIMAL;  -- true

Date/Time Pitfalls

sql
-- TIMESTAMP vs TIMESTAMPTZ
SET timezone = 'America/New_York';

-- Without timezone (dangerous!)
SELECT '2024-01-15 12:00:00'::TIMESTAMP;
-- Returns: 2024-01-15 12:00:00 (no timezone info)

-- With timezone (recommended)
SELECT '2024-01-15 12:00:00'::TIMESTAMPTZ;
-- Returns: 2024-01-15 12:00:00-05 (includes timezone)

JSON vs JSONB Gotchas

sql
-- JSON preserves everything (including duplicates)
SELECT '{"a": 1, "a": 2}'::JSON;  -- {"a": 1, "a": 2}

-- JSONB removes duplicates (keeps last value)
SELECT '{"a": 1, "a": 2}'::JSONB;  -- {"a": 2}

-- JSONB doesn't preserve key order
SELECT '{"z": 1, "a": 2}'::JSONB;  -- {"a": 2, "z": 1}

Real-World Type Selection Guide

┌─────────────────────────────────────────────────────────────────┐
│                   Type Selection Decision Tree                   │
├─────────────────────────────────────────────────────────────────┤
│                                                                  │
│   Storing Money?                                                 │
│   └── YES → DECIMAL(19,4) or NUMERIC(19,4)                      │
│                                                                  │
│   Storing IDs?                                                   │
│   └── Distributed System → UUID                                 │
│   └── Single Database → SERIAL/BIGSERIAL                        │
│                                                                  │
│   Storing Dates/Times?                                           │
│   └── Date only → DATE                                          │
│   └── With time → TIMESTAMPTZ (always!)                         │
│   └── Duration → INTERVAL                                        │
│                                                                  │
│   Storing Text?                                                  │
│   └── Fixed length (country codes) → CHAR(n)                    │
│   └── Known max length → VARCHAR(n)                             │
│   └── Unknown/variable length → TEXT                            │
│                                                                  │
│   Storing Flexible Data?                                         │
│   └── Need to query it → JSONB                                  │
│   └── Just store/retrieve → JSON                                │
│                                                                  │
└─────────────────────────────────────────────────────────────────┘

PostgreSQL-Specific Types Worth Knowing

Range Types

sql
-- Built-in range types for intervals
CREATE TABLE reservations (
    id SERIAL PRIMARY KEY,
    room_id INTEGER,
    during TSTZRANGE,  -- Timestamp range
    EXCLUDE USING GIST (room_id WITH =, during WITH &&)  -- No overlaps!
);

-- Insert a reservation
INSERT INTO reservations (room_id, during) VALUES
    (1, '[2024-01-15 14:00, 2024-01-15 16:00)');

-- Query overlapping reservations
SELECT * FROM reservations
WHERE during && '[2024-01-15 15:00, 2024-01-15 17:00)';

Domain Types (Custom Validation)

sql
-- Create a domain for email validation
CREATE DOMAIN email AS VARCHAR(255)
CHECK (VALUE ~ '^[A-Za-z0-9._%+-]+@[A-Za-z0-9.-]+\.[A-Za-z]{2,}$');

-- Use in table
CREATE TABLE users (
    id SERIAL PRIMARY KEY,
    user_email email NOT NULL  -- Automatically validates!
);

-- This works
INSERT INTO users (user_email) VALUES ('test@example.com');

-- This fails with validation error
INSERT INTO users (user_email) VALUES ('invalid-email');

Best Practices

Choosing Data Types

  1. Use the smallest type that fits - SMALLINT vs BIGINT
  2. Use DECIMAL for money - Never FLOAT
  3. Use TIMESTAMPTZ over TIMESTAMP - Timezone awareness
  4. Use JSONB over JSON - Better performance
  5. Use ENUM for fixed choices - Prevents invalid values
  6. Consider arrays for simple lists - Avoids extra tables
  7. Use domains for reusable validation - Email, phone, etc.

What's Next?

Now that you understand data types, let's learn advanced querying techniques!

👉 Continue to Chapter 5: Queries