I'd like to know why the following join returns values in an unordered fashion. Does distinct imply that the order of the IQueryable
passed is not guaranteed, therefore EF does not bother generating an order by clause in the SQL.
var currentUtc = DateTime.Now;
var data = (from a in ItemsA
join c in Customers on a.Fk_CustomerID equals c.ID into customerSubset
from cs in customerSubset.DefaultIfEmpty()
where cs.Age > 18)
orderby a.ID ascending
select new
{
a.ID,
a.someProperty,
cs.CustomerUID,
CustomerName = cs.Name,
UpdateUTC = currentUtc
}).Distinct().Take(1000).ToList();
The strange thing is that removing the distinct adds the order by clause in the inner query of the generated SQL (you can use Linqpad to see the SQL generated for eg.)
Now if I replace the last line with
.Distinct().OrderBy(x => s.ID).Take(1000).ToList();
I'm getting the order by clause in the SQL, whether or not I have the orderby a.ID ascending
in the inner query. This makes sense, but why is chaining distinct after orderby also not yield the same order?
The DISTINCT operation in SQL doesn't guarantee order. Internally it performs a sort before determining if rows are identical.
Even that order isn't guaranteed though because the query engine can partition the data for parallel processing, then recombine the partitioned data to produce the final result.
To guarantee a specific order, the ORDER BY
clause or the OrderBy()
call should be the last one before Skip()
or Take()
If you love us? You can donate to us via Paypal or buy me a coffee so we can maintain and grow! Thank you!
Donate Us With