Our logical replication recently stopped because of key conflicts - the wals were coming down but weren't being applied. I'd like to setup some monitoring so I can get notified when this happens, but I wasn't sure which fields to use.
I think I should be able to use the pg_stat_subscription view, but I wasn't sure from its description exactly what its columns represent.
Can I compare the source's pg_current_wal_lsn() to the replica's pg_stat_subscription.received_lsn to make sure the data is coming down?
received_lsn pg_lsn Last write-ahead log location received, the initial value of this field being 0
And would comparing the replica's pg_stat_subscription.received_lsn to its pg_stat_subscription.latest_end_lsn let me know if the data is getting replayed?
latest_end_lsn pg_lsn Last write-ahead log location reported to origin WAL sender
It's that last question which I'm most unsure about - does 'reported to origin WAL sender' imply that the replication was replayed successfully on the replica? Or is there some other field I'd need to use to know how far the replica has successfully applied the changes from the source database?
You can find logical-replication lag from source database using the below query:
SELECT pg_wal_lsn_diff(pg_current_wal_lsn(), confirmed_flush_lsn)
FROM pg_replication_slots;
pg_current_wal_lsn()
: This function returns the current wal lsn position in source database.confirmed_flush_lsn
: this field is present in pg_replication_slots
view. This value tells that till which position in WAL (source) the subscriber/logical-client has received the changes sent from source database. This value is sent by subscriber/logical-client via feedback message to source database.You can use above query in monitoring script.
If you love us? You can donate to us via Paypal or buy me a coffee so we can maintain and grow! Thank you!
Donate Us With