I have the following query (this is the simplified version of a much more complicated query):
SELECT * FROM TPM_TASK
WHERE (PROJECTID, VERSIONID) IN ((3,1), (24,1), (4,1))
In code I will be building that (PROJECTID,VERSIONID) key list programmatically, and this list could potentially be a couple thousand pairs long.
My question is how Oracle will optimize this query given that ProjectId and VersionId are indexed.  Will the list be converted to a hash table, similar to a join against a temp table?  Or will each key lookup be done one at a time?
I tried this query under my test database and got:
SELECT STATEMENT    68.0    68  2989732 19  8759    68                  ALL_ROWS                                            
   TABLE ACCESS (FULL)  68.0    68  2989732 19  8759    1   TPMDBO  TPM_TASK    FULL    TABLE   ANALYZED    1
However, I believe this database doesn't have enough data to warrant an index scan. I tried the query on production and got:
SELECT STATEMENT    19.0    19  230367  23  9683    19                  ALL_ROWS                                            
   INLIST ITERATOR                      1                                                               
      TABLE ACCESS (BY INDEX ROWID) 19.0    19  230367  23  9683    1   TPMDBO  TPM_TASK    BY INDEX ROWID  TABLE   ANALYZED    1                                       
         INDEX (RANGE SCAN) 4.0 4   64457   29      1   TPMDBO  TPM_H1_TASK RANGE SCAN  INDEX   ANALYZED                1                           
This seems to hit the index, however I'm not sure what INLIST ITERATOR means. I'm guessing this means that Oracle is iterating through the list and doing a table access for each item in the list, which would probably not be too efficient with thousands of keys. However, perhaps Oracle is smart enough to optimize this better if I actually did give it several thousand keys.
NOTE: I don't want to load these keys into a temp table because frankly I don't like the way temp tables work under Oracle, and they usually end up in more frustration than they're worth (in my non-expert opinion anyway.)
The optimizer should base its decision on the number of items in the list and the number of rows in the table. If the table has millions of rows and the list has even a couple of thousand items, I would generally expect that it would use the index to do a couple thousand single-row lookups. If the table has a few thousand rows and the list has a couple thousand items, I'd expect that the optimizer to do a full scan of the table. In the middle, of course, is where all the interesting stuff happens and where it gets harder to work out exactly what plan the optimizer is going to choose.
In general, however, dynamically building this sort of query is going to be problematic from a performance perspective not because of how expensive a particular query execution is but because the queries you're generating are not sharable. Since you can't use bind variables (or, if you are using bind variables, you'll need a different number of bind variables). That forces Oracle to do a rather expensive hard parse of the query every time and puts pressure on your shared pool which will likely force out other queries that are sharable which will cause more hard parsing in the system. You'll generally be better served tossing the data you want to match on into a temporary table (or even a permanent table) so that your query can then be made sharable and parsed just once.
To Branko's comment, while Oracle is limited to 1000 literals in an IN list, that is only if you are using the "normal" syntax, i.e.
WHERE projectID IN (1,2,3,...,N)
If you use the tuple syntax that you posted earlier, however, you can have an unlimited number of elements.
So, for example, I'll get an error if I build up a query with 2000 items in the IN list
SQL> ed
Wrote file afiedt.buf
  1  declare
  2    l_sql_stmt varchar2(32000);
  3    l_cnt      integer;
  4  begin
  5    l_sql_stmt := 'select count(*) from emp where empno in (';
  6    for i in 1..2000
  7    loop
  8      l_sql_stmt := l_sql_stmt || '(1),';
  9    end loop;
 10    l_sql_stmt := rtrim(l_sql_stmt,',') || ')';
 11  --  p.l( l_sql_stmt );
 12    execute immediate l_sql_stmt into l_cnt;
 13* end;
SQL> /
declare
*
ERROR at line 1:
ORA-01795: maximum number of expressions in a list is 1000
ORA-06512: at line 12
But not if I use the tuple syntax
SQL> ed
Wrote file afiedt.buf
  1  declare
  2    l_sql_stmt varchar2(32000);
  3    l_cnt      integer;
  4  begin
  5    l_sql_stmt := 'select count(*) from emp where (empno,empno) in (';
  6    for i in 1..2000
  7    loop
  8      l_sql_stmt := l_sql_stmt || '(1,1),';
  9    end loop;
 10    l_sql_stmt := rtrim(l_sql_stmt,',') || ')';
 11  --  p.l( l_sql_stmt );
 12    execute immediate l_sql_stmt into l_cnt;
 13* end;
SQL> /
PL/SQL procedure successfully completed.
A better solution, which doesn't require temp tables, may be to put the data into a PL/SQL table, and then join to it. Tom Kyte has an excellent example here: PL/SQL Table join example
Hope that helps.
If you love us? You can donate to us via Paypal or buy me a coffee so we can maintain and grow! Thank you!
Donate Us With