Postgres and tables internal organization

Question

I found an explanation of how things work internally in postgresql. There was the following picture:

and the following explanation:

After the header comes an array of item identifiers, each an (offset, length) pair pointing to the actual item.

Because an item identifier is never moved until it is freed, its index can be used on a long-term basis to reference an item, even when the item itself is moved around on the page to compact free space. A pointer to an item, called a CTID (ItemPointer), is created by PostgreSQL and consists of a page number and the index of an item identifier.

Could you clear up a couple of things here?

1. Am I right that the items near the page header are the CTIDs themselves, or are items and CTIDs different things?
2. Is it the CTIDs that never move around, or the rows?
3. Depending on the answers, I may understand what the following means exactly: "Because an item identifier is never moved until it is freed, its index can be used on a long-term basis to reference an item, even when the item itself is moved around on the page to compact free space." Still, a more detailed explanation would be nice.

Laurenz Albe · Accepted Answer

What is called “item” in the picture is a “line pointer” in PostgreSQL jargon. It is defined in src/include/storage/itemid.h:

/*
 * A line pointer on a buffer page.  See buffer page definitions and comments
 * for an explanation of how line pointers are used.
 *
 * In some cases a line pointer is "in use" but does not have any associated
 * storage on the page.  By convention, lp_len == 0 in every line pointer
 * that does not have storage, independently of its lp_flags state.
 */
typedef struct ItemIdData
{
    unsigned    lp_off:15,      /* offset to tuple (from start of page) */
                lp_flags:2,     /* state of line pointer, see below */
                lp_len:15;      /* byte length of tuple */
} ItemIdData;

typedef ItemIdData *ItemId;

These line pointers are stored in an array right after the page header.
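To make the layout concrete, here is a minimal sketch that reuses the same bit-field widths in a standalone struct. The names LinePointer, make_line_pointer, line_pointer_clear_storage and line_pointer_has_storage are hypothetical helpers for illustration, not PostgreSQL functions; the flag values mirror the LP_* constants in itemid.h. That the struct packs into 4 bytes is an assumption about the compiler (it holds for the usual gcc and clang targets).

```c
#include <assert.h>
#include <stddef.h>

/* Same field widths as ItemIdData: 15 + 2 + 15 = 32 bits. */
typedef struct LinePointer
{
    unsigned    lp_off:15,      /* offset to tuple (from start of page) */
                lp_flags:2,     /* state of line pointer */
                lp_len:15;      /* byte length of tuple */
} LinePointer;

/* Values mirror LP_UNUSED, LP_NORMAL, LP_REDIRECT, LP_DEAD in itemid.h. */
enum { LP_UNUSED = 0, LP_NORMAL = 1, LP_REDIRECT = 2, LP_DEAD = 3 };

/* Hypothetical helper: build a line pointer for a tuple stored at 'off'. */
LinePointer make_line_pointer(unsigned off, unsigned len)
{
    LinePointer lp;

    /* 15 bits can only hold values up to 32767. */
    assert(off < (1u << 15) && len < (1u << 15));
    lp.lp_off = off;
    lp.lp_flags = LP_NORMAL;
    lp.lp_len = len;
    return lp;
}

/* Hypothetical helper: keep the slot, but give up its storage. */
void line_pointer_clear_storage(LinePointer *lp)
{
    lp->lp_off = 0;
    lp->lp_flags = LP_DEAD;
    lp->lp_len = 0;     /* convention: no storage means lp_len == 0 */
}

/* A line pointer has storage exactly when lp_len is nonzero. */
int line_pointer_has_storage(LinePointer lp)
{
    return lp.lp_len != 0;
}
```

With these helpers, make_line_pointer(8152, 40) describes a 40-byte tuple starting at byte 8152 of the page, and after line_pointer_clear_storage the slot still exists in the array but no longer points at any data.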

See the excellent documentation in src/include/storage/bufpage.h:

/*
 * A postgres disk page is an abstraction layered on top of a postgres
 * disk block (which is simply a unit of i/o, see block.h).
 *
 * specifically, while a disk block can be unformatted, a postgres
 * disk page is always a slotted page of the form:
 *
 * +----------------+---------------------------------+
 * | PageHeaderData | linp1 linp2 linp3 ...           |
 * +-----------+----+---------------------------------+
 * | ... linpN |                                      |
 * +-----------+--------------------------------------+
 * |           ^ pd_lower                             |
 * |                                                  |
 * |             v pd_upper                           |
 * +-------------+------------------------------------+
 * |             | tupleN ...                         |
 * +-------------+------------------+-----------------+
 * |       ... tuple3 tuple2 tuple1 | "special space" |
 * +--------------------------------+-----------------+
 *                                  ^ pd_special
 *
 * NOTES:
 *
 * linp1..N form an ItemId (line pointer) array.  ItemPointers point
 * to a physical block number and a logical offset (line pointer
 * number) within that block/page.  Note that OffsetNumbers
 * conventionally start at 1, not 0.
 *
 * tuple1..N are added "backwards" on the page.  Since an ItemPointer
 * offset is used to access an ItemId entry rather than an actual
 * byte-offset position, tuples can be physically shuffled on a page
 * whenever the need arises.  This indirection also keeps crash recovery
 * relatively simple, because the low-level details of page space
 * management can be controlled by standard buffer page code during
 * logging, and during recovery.
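The indirection described in that comment can be sketched with a toy slotted page. This is a simplified model, not PostgreSQL's code: ToyPage, ToyLinePointer and the toy_page_* functions are hypothetical names, the page is 128 bytes instead of the default 8192, and the line pointer array is kept in a separate fixed-size member rather than inside the page bytes. What it does show is that tuples are added backwards from the end of the page, and that compaction rewrites only the stored offsets while the 1-based offset numbers stay valid.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

#define TOY_PAGE_SIZE 128   /* toy value; PostgreSQL's default block is 8192 */
#define TOY_MAX_ITEMS 8

typedef struct ToyLinePointer { uint16_t off; uint16_t len; } ToyLinePointer;

typedef struct ToyPage
{
    ToyLinePointer linp[TOY_MAX_ITEMS]; /* line pointer array, grows forward */
    int         nitems;                 /* number of line pointers in use */
    uint16_t    upper;                  /* start of tuple data, moves backward */
    char        data[TOY_PAGE_SIZE];
} ToyPage;

void toy_page_init(ToyPage *p)
{
    memset(p, 0, sizeof *p);
    p->upper = TOY_PAGE_SIZE;
}

/* Store a string as a "tuple"; returns its 1-based offset number, 0 if full. */
int toy_page_add(ToyPage *p, const char *item)
{
    size_t len = strlen(item) + 1;

    if (p->nitems == TOY_MAX_ITEMS || len > p->upper)
        return 0;
    p->upper = (uint16_t) (p->upper - len);
    memcpy(p->data + p->upper, item, len);
    p->linp[p->nitems].off = p->upper;
    p->linp[p->nitems].len = (uint16_t) len;
    return ++p->nitems;
}

/* Look a tuple up through its line pointer; NULL if it has no storage. */
const char *toy_page_get(const ToyPage *p, int offnum)
{
    assert(offnum >= 1 && offnum <= p->nitems);
    return p->linp[offnum - 1].len ? p->data + p->linp[offnum - 1].off : NULL;
}

/* Free the tuple's storage but keep its slot in the array. */
void toy_page_delete(ToyPage *p, int offnum)
{
    assert(offnum >= 1 && offnum <= p->nitems);
    p->linp[offnum - 1].len = 0;
}

/* Repack live tuples against the end of the page; only 'off' values change. */
void toy_page_compact(ToyPage *p)
{
    char     scratch[TOY_PAGE_SIZE];
    uint16_t upper = TOY_PAGE_SIZE;

    for (int i = 0; i < p->nitems; i++)
    {
        ToyLinePointer *lp = &p->linp[i];

        if (lp->len == 0)
            continue;
        upper = (uint16_t) (upper - lp->len);
        memcpy(scratch + upper, p->data + lp->off, lp->len);
        lp->off = upper;
    }
    memcpy(p->data + upper, scratch + upper, (size_t) (TOY_PAGE_SIZE - upper));
    p->upper = upper;
}
```

If you add "alpha", "beta" and "gamma", delete offset number 2 and compact, then "gamma" is physically moved to a new byte position, yet toy_page_get(&page, 3) still returns it. Anything that remembered "item 3 on this page" never has to be told about the move.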

Answers to your questions:

1. The ctid of a tuple is its physical address, consisting of the block number (starting at 0) and the line pointer number (starting at 1). You can read the line pointer number off the ctid of a table row: it is the second number. For example, (321,5) is the fifth line pointer on the 322nd page.
2. The location of the actual tuple in the block is not fixed: it is stored in lp_off. That allows PostgreSQL to move the data around in a block without changing the physical address (tid) of the tuples. The line pointer's slot in the array never changes; only the lp_off stored in it is updated.
3. As explained above, the actual data can move within the block, but the line pointer's position doesn't change. The ctid of a tuple is what is stored in an index entry. The statement should be clear now.
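As a worked example of the addressing, the sketch below turns the text form of a ctid into the two places you would look. It rests on assumptions that are not stated above: the default block size of 8192 bytes, a 24-byte page header, and 4-byte line pointers. It also ignores that large tables are split into several segment files. ToyCtid and the ctid_* functions are hypothetical names, not PostgreSQL APIs.

```c
#include <assert.h>
#include <stdint.h>
#include <stdio.h>

#define ASSUMED_BLCKSZ            8192  /* assumption: default block size */
#define ASSUMED_PAGE_HEADER_SIZE  24    /* assumption: size of the page header */
#define ASSUMED_LINE_POINTER_SIZE 4     /* one 32-bit ItemIdData */

typedef struct ToyCtid
{
    uint32_t block;     /* block number, starts at 0 */
    uint16_t offnum;    /* line pointer number, starts at 1 */
} ToyCtid;

/* Parse text such as "(321,5)"; returns 1 on success, 0 otherwise. */
int ctid_parse(const char *text, ToyCtid *out)
{
    unsigned block, offnum;

    if (sscanf(text, "(%u,%u)", &block, &offnum) != 2)
        return 0;
    if (offnum < 1 || offnum > 65535)
        return 0;       /* offset numbers start at 1 */
    out->block = block;
    out->offnum = (uint16_t) offnum;
    return 1;
}

/* Byte position where the page starts, counted from block 0. */
uint64_t ctid_page_start(ToyCtid tid)
{
    return (uint64_t) tid.block * ASSUMED_BLCKSZ;
}

/* Byte position of the line pointer inside its page. */
uint32_t ctid_line_pointer_offset(ToyCtid tid)
{
    assert(tid.offnum >= 1);
    return ASSUMED_PAGE_HEADER_SIZE
        + (uint32_t) (tid.offnum - 1) * ASSUMED_LINE_POINTER_SIZE;
}
```

For (321,5), ctid_page_start gives 2629632 (321 * 8192) and ctid_line_pointer_offset gives 40 (24 + 4 * 4). Neither number says where the tuple bytes are: for that you still have to read lp_off from the line pointer found at that position.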

Tags:

postgresql

EngineerSpock