I know how the joins work and what they do.
I want to know in which situations each of these joins is used in Postgres.
Merge join is used when the inputs from both joined tables are already sorted on the join columns. In that case a merge join is typically faster and uses less memory than a hash join. Hash join is used when the inputs are not already sorted on the join columns.
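To make the "already sorted" requirement concrete, here is a minimal Python sketch of a merge join over two lists of rows. It is only an illustration of the general technique, not Postgres code; the function name `merge_join`, the dict-based rows, and the `id` key are assumptions made up for this example.

```python
def merge_join(left, right, key):
    """Equi-join two lists of dict rows that are ALREADY sorted on `key`.

    Illustrative sketch only; returns a list of (left_row, right_row) pairs.
    """
    result = []
    i = j = 0
    while i < len(left) and j < len(right):
        lk, rk = left[i][key], right[j][key]
        if lk < rk:
            i += 1          # left side is behind, advance it
        elif lk > rk:
            j += 1          # right side is behind, advance it
        else:
            # Find the run of right rows that share this key value.
            j_end = j
            while j_end < len(right) and right[j_end][key] == lk:
                j_end += 1
            # Pair every left row with this key with every right row in the run.
            while i < len(left) and left[i][key] == lk:
                for r in right[j:j_end]:
                    result.append((left[i], r))
                i += 1
            j = j_end
    return result


# Hypothetical sample data, both sides sorted on "id".
left_rows = [{"id": 1, "a": "x"}, {"id": 2, "a": "y"}, {"id": 4, "a": "w"}]
right_rows = [{"id": 2, "b": "p"}, {"id": 3, "b": "r"}, {"id": 4, "b": "s"}]
pairs = merge_join(left_rows, right_rows, "id")
```

Each input is walked once from start to end, and no hash table is built, which is why this approach is attractive when the sort order is already available.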
Nested loop joins are particularly efficient if the outer relation is small, because then the inner loop won't be executed too often.
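A minimal Python sketch of the idea, assuming rows are plain dicts and the join condition is passed in as a function (the names `nested_loop_join` and `predicate` are made up for this example and are not Postgres internals):

```python
def nested_loop_join(outer, inner, predicate):
    """Join by scanning `inner` once for every row of `outer`.

    Illustrative sketch only; returns a list of (outer_row, inner_row) pairs.
    """
    result = []
    for o in outer:            # one pass over the outer relation
        for i in inner:        # inner relation is rescanned per outer row
            if predicate(o, i):
                result.append((o, i))
    return result


# Hypothetical sample data: a one-row outer side keeps the work small.
customers = [{"id": 7, "name": "Ann"}]
orders = [{"customer_id": 7, "total": 10},
          {"customer_id": 8, "total": 20},
          {"customer_id": 7, "total": 30}]
matches = nested_loop_join(customers, orders,
                           lambda c, o: c["id"] == o["customer_id"])
```

The work grows with the number of outer rows multiplied by the cost of one pass over the inner side, so a small outer relation keeps it cheap. Because the condition is an arbitrary function, the same loop also handles conditions that are not equalities, such as `c["id"] < o["customer_id"]`.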
Hash join: the right relation is first scanned and loaded into a hash table, using its join attributes as hash keys. Next, the left relation is scanned, and the join attribute values of each row are used as hash keys to locate the matching rows in the hash table.
The three join algorithms are nested loop join, merge join, and hash join.
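The build and probe phases described above can be sketched in Python as follows. This is a simplified in-memory illustration under the assumption that rows are dicts sharing one join key; the name `hash_join` is made up for this example and the code does not model how Postgres manages memory.

```python
from collections import defaultdict


def hash_join(left, right, key):
    """Equi-join: build a hash table on `right`, then probe it with `left`.

    Illustrative sketch only; returns a list of (left_row, right_row) pairs.
    """
    # Build phase: scan the right relation once and bucket rows by join key.
    table = defaultdict(list)
    for r in right:
        table[r[key]].append(r)

    # Probe phase: scan the left relation once and look up matching rows.
    result = []
    for l in left:
        for r in table.get(l[key], []):
            result.append((l, r))
    return result


# Hypothetical sample data; neither side needs to be sorted.
left_rows = [{"id": 3, "a": "x"}, {"id": 1, "a": "y"}, {"id": 2, "a": "z"}]
right_rows = [{"id": 2, "b": "p"}, {"id": 3, "b": "q"}, {"id": 3, "b": "r"}]
pairs = hash_join(left_rows, right_rows, "id")
```

Each relation is scanned only once, but the whole hash table for the right relation has to be held in memory, and the lookup only works for equality conditions.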
The following are a few rules of thumb:
Nested loop joins are preferred if one of the sides of the join has few rows. Nested loop joins are also used as the only option if the join condition does not use the equality operator.
Hash Joins are preferred if the join condition uses an equality operator, both sides of the join are large, and the hash table fits into work_mem.
Merge Joins are preferred if the join condition uses an equality operator and both sides of the join are large, but can be sorted on the join condition efficiently (for example, if there is an index on the expressions used in the join column).
A typical OLTP query that chooses only one row from one table and the associated rows from another table will always use a nested loop join as the only efficient method.
Queries that join tables with many rows (which cannot be filtered out before the join) would be very inefficient with a nested loop join and will always use a hash or merge join if the join condition allows it.
The optimizer considers each of these join strategies and uses the one that promises the lowest costs. The most important factor on which this decision is based is the estimated row count from both sides of the join. Consequently, wrong optimizer choices are usually caused by misestimates in the row counts.