Background: I am migrating from postgreSQL to Vertica and found, that there are some issues in IDENTITY or AUTO_INCREMENT columns. One of these issues is, that vertica cannot assign values to IDENTITY columns or alter a column, that already has data into an IDENTITY column. Therefore I created a sequence and set the default value of the column to be unique doing:
SELECT MAX(id_column) FROM MY_SCHEMA.my_table;
which is 12345
CREATE SEQUENCE MY_SCHEMA.seq_id_column MINVALUE 12346 CACHE 1;
ALTER TABLE MY_SCHEMA.my_table
ALTER COLUMN id_column SET DEFAULT(MY_SCHEMA.seq_id_column.nextval);
ALTER TABLE MY_SCHEMA.log ADD UNIQUE(id_column);
Which works as expected. In this case, I have the cache deactivated, as I am on a single node installation and I want my ID column to be contiguous. However, this is not an option on a cluster installation as the needed lock leads to a bottleneck.
Question: In a vertica cluster with several nodes, how can I access the ID of the last insert in a session (without an additional select)?
E.g. in postgreSQL I could do something like
INSERT INTO MY_SCHEMA.my_table RETURNING id_column;
which does not work in Vertica. Furthermore, the LAST_INSERT_ID() function of Vertica does not work for named sequences. I also feel, that querying the current_value of MY_SCHEMA.seq_id_column could be giving wrong results due to caching, but I am unsure about this.
Why no additional SELECT?
To my knowledge, the select will only give correct values after a commit. I cannot do a commit after every single insert due to performance.
The comments from LukStorms pointed me in the right direction.
The NEXTVAL() function (as far as I have tested) gives contiguous values in the case, where one single session queries them. Furthermore, on concurrent access, if issued after an insert, CURRVAL retrieves the cached value, which is guaranteed to be unique but not necessarily contiguous. As I never call NEXTVAL anywhere else as in my default clause, this solves the problem for me, although there might be cases, where an additional call to NEXTVAL between inserts increments the sequence counter.
One case I can think of (and that I will test in the future) is what happens if AUTO COMMIT is set to OFF, which is ON by default for the vertica client drivers.
UPDATE:
This even seems to work with AUTOCOMMIT being OFF (shown using the vertica-python client driver, where C is the connection and cur the cursor):
cur.execute("SELECT NEXTVAL('my_schema.my_sequence');")
cur.fetchall()
--> 1
cur.execute("SELECT CURRVAL('my_schema.my_sequence');")
cur.fetchall()
--> 1
cur.execute("SET SESSION AUTOCOMMIT TO OFF")
cur.execute("SELECT NEXTVAL('my_schema.my_sequence');")
cur.execute("SELECT NEXTVAL('my_schema.my_sequence');")
cur.execute("SELECT NEXTVAL('my_schema.my_sequence');")
cur.execute("SELECT CURRVAL('my_schema.my_sequence');")
cur.fetchall()
--> 4
However, this seems to be unchanged during a rollback of the connection. So the following happens:
C.rollback()
cur.execute("SELECT CURRVAL('my_schema.my_sequence');")
cur.fetchall()
--> 4
If you love us? You can donate to us via Paypal or buy me a coffee so we can maintain and grow! Thank you!
Donate Us With