Logo Questions Linux Laravel Mysql Ubuntu Git Menu
 

Hive: date range is not fetching the data from the table

I am new to this platform. Trying to accomplish a task in Hive but failing to get it done.

Have a table with the following structure:

Table1:

ID DATE       VAL1  VAL2  VAL3
01 12/12/2017 12.23 56.45 78.9
02 05/06/2019 65.88 65.22 98.22
03 09/08/2018 86.00 23.00 56.05
04 09/19/2019 34.09 75.04 77.89
05 03/16/2018 87.45 45.56 89.89
06 06/06/2018 67.66 45.00 67.90
07 01/01/2018 23.21 90.99 56.33
08 07/03/2018 67.22 67.22 33.11
09 05/13/2017 12.12 98.33 78.89

DataTypes:

ID   INT
DATE STRING
VAL1 DOUBLE
VAL2 DOUBLE
VAL3 DOUBLE

Hive Query:

SELECT * FROM Table1 WHERE DATE BETWEEN date_format(current_date + 
interval '-12' month,'MM/dd/YYYY') AND 
date_format(current_date,'MM/dd/YYYY');

This above query is not fetching the records from the table.

Actual O/P:

OK
Time taken: 65.515 seconds
hive>

Expected O/P:

ID DATE       VAL1  VAL2  VAL3
02 05/06/2019 65.88 65.22 98.22
03 09/08/2018 86.00 23.00 56.05
04 09/19/2019 34.09 75.04 77.89
06 06/06/2018 67.66 45.00 67.90
08 07/03/2018 67.22 67.22 33.11

Any help to point out the mistake or missed part in the query will be really helpful. Thanks in advance.

like image 829
rocking Avatar asked Oct 25 '25 14:10

rocking


1 Answers

The format you are using is incorrect one. Correct format is 'MM/dd/yyyy', not 'MM/dd/YYYY'. Your date is not in sortable format, this is why you need to convert column date to sortable format 'yyyy-MM-dd' before applying BETWEEN filter:

SELECT * 
  FROM Table1 
 WHERE from_unixtime(unix_timestamp(`DATE`,'MM/dd/yyyy'),'yyyy-MM-dd') BETWEEN add_months(current_date,-12) AND current_date;

For better understanding consider this example with comparing two dates in not-sortable format:

hive> select '09/19/2018' > '08/01/2019'; --bad data format
OK
true
Time taken: 5.632 seconds, Fetched: 1 row(s)

This is why BETWEEN operator will work incorrect with such dates.

Let's compare the same dates in correct format:

hive> select '2018-09-19' > '2019-08-01';
OK
false
Time taken: 0.095 seconds, Fetched: 1 row(s)

Also see SimpleDateFormat class docs as a format template reference.

like image 129
leftjoin Avatar answered Oct 28 '25 03:10

leftjoin