Logo Questions Linux Laravel Mysql Ubuntu Git Menu
 

Hash Table Data Structure in SQL Server

For the last few days, I've been reading an ebook on data structures and well, frankly speaking, many things are already gone from my head. Just reviewing them and trying to make clear again. I was going through hash tables and get to familiar with it again. So I know and heard, SQL Server uses hash tables internally and many of the threads of stackoverflow.com and forums.asp.net asked about creating hash tables in SQL Server as it stores temporary data. So let me give an example that I've used in a stored procedure using temp table: (Avoid it and it's too long. Just for an example)

1st:

CREATE PROCEDURE [dbo].[Orders]
    @OrderLine int
AS
BEGIN
    DECLARE @t1 TABLE(Date1 date, 
                      OrderID VARCHAR(MAX), 
                      EmployeeName VARCHAR(MAX), 
                      DeliveryDate date, 
                      StoreName VARCHAR(MAX),
                      DeliveryAddress VARCHAR(MAX), 
                      ItemName VARCHAR(MAX), 
                      Quantity FLOAT)

    INSERT INTO @t1(Date1, OrderID, EmployeeName, DeliveryDate, StoreName, DeliveryAddress, ItemName, Quantity)
        (SELECT DISTINCT 
             CONVERT(VARCHAR(11), DemandOrder.POCreationDate, 6) AS DemandOrderDate, 
             DemandOrder.OrderID, EmployeeDetails.EmployeeName,
             CONVERT(DATE, DemandOrder.DeliveryDate) AS ExpectedDeliveryDate, 
             StoreDetails.StoreName,
             DemandOrder.DeliveryAddress, Item.ItemName, 
             DemandOrderLine.Quantity 
         FROM 
             DemandOrder 
         INNER JOIN 
             DemandOrderLine ON DemandOrder.OrderID = DemandOrderLine.OrderID 
         INNER JOIN 
             Item on DemandOrderLine.ItemID=Item.ItemID 
         INNER JOIN 
             EmployeeDetails ON EmployeeDetails.EmployeeID = DemandOrder.EmployeeID 
         INNER JOIN 
             StoreDetails ON DemandOrderLine.StoreID = StoreDetails.StoreID
         WHERE 
             DemandOrderLine.OrderLine = @OrderLine)

    DECLARE @t2 TABLE(Approvedby VARCHAR(MAX)) 

    INSERT INTO @t2(Approvedby)
        (SELECT EmployeeDetails.EmployeeName 
         FROM EmployeeDetails 
         INNER JOIN DemandOrderLine ON DemandOrderLine.ApprovedBy = EmployeeDetails.EmployeeID)

    SELECT DISTINCT 
        CONVERT(VARCHAR(11), Date1, 6) AS Date, 
        OrderID, EmployeeName,
        CONVERT(VARCHAR(11), DeliveryDate, 6) AS ExpectedDeliveryDate, 
        StoreName, Approvedby, DeliveryAddress, 
        ItemName, Quantity  
    FROM 
        @t1 
    CROSS JOIN 
        @t2
END   

Another one, from an example, that says in stored procedure, hash tables can't be used. So here it's:

2nd:

CREATE PROCEDURE TempTable AS ---- It's actually not possible in SP

CREATE table #Color
(
    Color varchar(10) PRIMARY key
)

INSERT INTO #color 
    SELECT 'Red' 
    UNION 
    SELECT 'White'
    UNION 
    SELECT 'green'
    UNION 
    SELECT 'Yellow'
    UNION 
    SELECT 'blue'

DROP TABLE #color

CREATE table #Color
(
    Color varchar(10) PRIMARY key
)

INSERT INTO #color 
    SELECT 'Red' 
    UNION 
    SELECT 'White'
    UNION 
    SELECT 'green'
    UNION 
    SELECT 'Yellow'
    UNION 
    SELECT 'blue'

DROP TABLE #color
GO

So my question is can I say the 1st one is an example of hash table as it uses temp tables and if not, why can't we use it in the stored procedure? Again, if it's created internally, why do we need to create a hash table again for working purposes (Though it has performance issues, just wondering to know if the above examples serve for the purpose). Thanks.

Note: I faced an interview last month and was discussing about it. That's why making sure if I was correct in my views.

like image 670
AT-2017 Avatar asked Mar 02 '26 10:03

AT-2017


2 Answers

Hash-based algorithms are important for any powerful database. These are used for aggregation and join operations. Hash-based joins have been there since version 7.0 -- which is really old (thanks to Martin Smith). You can read more about them in the documentation.

SQL Server 2014 introduced hash-based indexes for memory optimized tables (see here). These are an explicit use of hash tables. In general, though, the tree-based indexes are more powerful because they can be used in more situations:

  • For range lookups (including like).
  • For partial key matches.
  • For order by.

A hash index can only be used for an exact equality match (and group by).

like image 191
Gordon Linoff Avatar answered Mar 04 '26 22:03

Gordon Linoff


I know im a little late to the party, but I dont think anyone has directly answered your original question.

The first is an example of a table variable and the second is an example of a local table, both are created in the tempdb

The difference between them is that a table variable is not created in memory and cant have a clustered index. Also a local (hash) table will stick around until that single connection ends, while a table variable is only available for the batch its declared in. A global table (using a double hash before it) will be available to all connections and persist until all connections using it are closed.

One final thing, the only reason you cant use that local table in a stored procedure is because it uses the same name twice, even though you've used drop table it evaluates it based on the creates in the batch first. So it wont execute anything and moan it already exists.

like image 39
Reece Avatar answered Mar 05 '26 00:03

Reece