Does modifying table data affect index scanning?
Redrock Postgres introduced undo logging. When a tuple is modified, index records are marked for deletion and transaction status information is recorded for index records. If each index is not updated, only related index columns are updated. Modifying table data does not affect index scanning.
Create a table, insert some data, and create an index on the id
field of the table. Execute the query that involves index scanning in advance to load the index data to the shared buffer.
CREATE TABLE t_large (id integer, name text);
INSERT INTO t_large (id, name)
SELECT i, repeat('Pg', 32)
FROM generate_series(1, 1000000) AS s(i);
CREATE INDEX large_idx ON t_large (id);
SET max_parallel_workers_per_gather = 0;
SELECT count(*) FROM t_large;
Let’s first confirm that the query does use index scanning and record the execution plan and statistics for the query.
EXPLAIN (analyze, buffers, costs off)
SELECT count(*) FROM t_large;
QUERY PLAN
---------------------------------------------------------------------------------------------------
Aggregate (actual time=102.074..102.075 rows=1 loops=1)
Buffers: shared hit=2
-> Index Only Scan using large_idx on t_large (actual time=0.017..57.880 rows=1000000 loops=1)
Heap Fetches: 0
Buffers: shared hit=2
Planning time: 0.075 ms
Execution time: 102.094 ms
Let’s make some updates to some of the data in the table and delete some of the data.
UPDATE t_large SET name = 'dummy' WHERE mod(id, 100) = 1;
DELETE FROM t_large WHERE mod(id, 100) = 50;
Let’s look again at the execution plan used by the original query.
EXPLAIN (analyze, buffers, costs off)
SELECT count(*) FROM t_large;
QUERY PLAN
--------------------------------------------------------------------------------------------------
Aggregate (actual time=101.703..101.703 rows=1 loops=1)
Buffers: shared hit=2
-> Index Only Scan using large_idx on t_large (actual time=0.017..57.207 rows=990000 loops=1)
Heap Fetches: 0
Buffers: shared hit=2
Planning time: 0.079 ms
Execution time: 101.724 ms
In the above example, after some data is updated and deleted from the t_large
table, the execution plan used by the original query does not change, and the query execution time is not affected.