Sitemap

TimescaleDB: Continuous Aggregates

2 min readSep 1, 2023
Press enter or click to view image in full size

Hey everyone, in this blog we will learn how to create continuous aggregates in timescaledb,

To understand continuous aggregates, we will take a scenario where we have events table that contains response_time for the api hits to a given endpoint

CREATE TABLE events (
id bigserial,
endoint varchar,
response_time smallint,
created_at timestamptz
);

Now we will convert this table to a hypertable

SELECT create_hypertable('events', 'created_at', chunk_time_interval => INTERVAL '1 day'

Now let’s add some randomized data to the table

INSERT INTO events 
(id, endpoint, response_time, created_at)
VALUES
(generate_series(1, 10000000), '/health', floor(random() * 900 + 100), timestamptz '2023-08-01 00:00:00 UTC' + random() * Interval '31 day')

Now let’s create a continuous aggregate over this table which will help us fetch average response time and daily hits per endpoint on a day basis

  1. Create a materialized view (continuous aggregate requires a special parameter time_bucket on the partitioning column of hypertable)
CREATE MATERIALIZED VIEW events_summary_daily
WITH (timescaledb.continuous) AS
SELECT endpoint,
time_bucket(INTERVAL '1 day', created_at) AS day,
AVG(response_time) AS avg_response_time,
COUNT(*) AS total_hits
FROM events
GROUP BY endpoint, day;

2. Create a refresh policy

start_offset -> It determines how much of historical data to consider relative to the time when policy runs

end_offset -> It determines how recent data to be considered

schedule_interval -> Refresh Interval

SELECT add_continuous_aggregate_policy('events_summary_daily',
start_offset => INTERVAL '1 month',
end_offset => INTERVAL '1 day',
schedule_interval => INTERVAL '1 day');

And voilla, we are done.

--

--

No responses yet