Logo Questions Linux Laravel Mysql Ubuntu Git Menu
 

Anomalous behaviour of Excel's INDEX() function between references and literal arrays

This question relates to the Excel function INDEX(), which is documented by Microsoft here.

If I have an Excel range in, say, A1:B2:

A B
1 2
3 4

and I use the formula

=INDEX(A1:B2, 0, 0)

then the full range is returned to the worksheet (as a dynamic array). This agrees with the documentation.

If the final argument, col_num, is missing:

=INDEX(A1:B2, 0, )

then the result is the same, again matching the documentation.

But if col_num is omitted:

=INDEX(A1:B2, 0)

then a #REF! error is returned.

However, if the first argument to INDEX() is changed from a range reference to the equivalent literal array:

=INDEX({1, 2; 3, 4}, 0)

then the full range is returned again.

(Edit for clarity: The value of the second argument, row_num, is not important, as long as it is not negative or out of range; the behaviour described above is exhibited for any valid value of row_num, including zero or 'missing'.)

The #REF! error only occurs if col_num is omitted and the input reference is two-dimensional (i.e. has multiple rows and multiple columns). If, instead, the reference has only one row and/or only one column, the behaviour reverts to returning the full range, matching the behaviour for a literal array.

I cannot find this apparent anomaly documented anywhere. Moreover, this is the only discrepancy I can find between the two cases of the first argument to INDEX() being a reference or a literal array; in all other scenarios, the behaviour seems identical between the two.

Is this discrepancy deliberate? Is there any documentation on this behaviour that I have missed?

like image 248
Neil T Avatar asked Nov 02 '25 08:11

Neil T


1 Answers

With some help from @Rory (many thanks), I think I can now explain this behaviour.

The key things to understand are:

  • In Excel, some function parameters can accept either reference or array arguments. The behaviour of the function may differ depending on whether a reference or an array is passed.

  • INDEX() is such a function. The documentation refers to the "array form" and the "reference form". The syntax is almost identical (except for the rare "reference form" case where multiple ranges are passed).

  • The documentation states that the array parameter of the "array form" can accept "a range of cells or an array constant". Crucially, "range of cells" here does not mean a reference to a cell range on a worksheet such as A1:B2 (this would be a reference). Rather, it means an in-memory cell range, as might be returned by a nested function call to an outer function (examples below).

  • Therefore, INDEX(A1:B2, ...) always calls the "reference form" of the INDEX() fucntion, never the "array form".

  • As pointed out by @Rory in the comments, there is a minor difference in the documentation between the two forms: in the "array form", the column_num parameter is optional for a 2D range ("2D" meaning > 1 row, > 1 column). This is not the case for the "reference form".

This explains why, in this unique case of a 2D reference with an omitted column_num, a #REF! error is returned, whereas the equivalent call on a literal array (or an in-memory cell range) succeeds because the "array form" supports this.

Examples:

=INDEX(A1:B2, 0) : fails (reference form; col_num is required for a 2D reference)

=INDEX(A1:B2, 0, ) : succeeds (reference form; the comma means that col_num is present, even though 'missing')

=INDEX({1, 2; 3, 4}, 0) : succeeds (array form; col_num is optional for a 2D array)

=INDEX(INDEX(A1:B2, 0, 0), 0) : fails (a _reference_ is passed to the outer INDEX() call)

=INDEX(SQRT(INDEX(A1:B2, 0, 0)), 0) : succeeds (an _array_ is passed to the outer INDEX() call)

This last example is the "range of cells" case: the inner INDEX() function returns a reference, which is processed by the SQRT() function, returning an array ("range of cells"), which then calls the "array form" of the outer INDEX() function.

I still do not understand why this difference is necessary; I can see no reason why the "reference form" should not behave identically to the array form in this regard. Presumably there is a reason, which I would love to hear if anyone knows it!

like image 73
Neil T Avatar answered Nov 05 '25 14:11

Neil T



Donate For Us

If you love us? You can donate to us via Paypal or buy me a coffee so we can maintain and grow! Thank you!