Logo Questions Linux Laravel Mysql Ubuntu Git Menu
 

Is there a way to modify datetime objects through the Django ORM Query?

We've a Django, Postgresql database that contains objects with:

object_date = models.DateTimeField()

as a field.

We need to count the objects by hour per day, so we need to remove some of the extra time data, for example: minutes, seconds and microseconds.

We can remove the extra time data in python:

query = MyModel.objects.values('object_date')
data = [tweet['tweet_date'].replace(minute=0, second=0, microsecond=0) for tweet in query

Which leaves us with a list containing the date and hour.

My Question: Is there a better, faster, cleaner way to do this in the query itself?

like image 693
a_Fraley Avatar asked Oct 17 '25 14:10

a_Fraley


2 Answers

If you simply want to obtain the dates without the time data, you can use extra to declare calculated fields:

query = MyModel.objects
    .extra(select={
        'object_date_group': 'CAST(object_date AS DATE)',
        'object_hour_group': 'EXTRACT(HOUR FROM object_date)'
    })
    .values('object_date_group', 'object_hour_group')

You don't gain too much from just that, though; the database is now sending you even more data.

However, with these additional fields, you can use aggregation to instantly get the counts you were looking for, by adding one line:

query = MyModel.objects
    .extra(select={
        'object_date_group': 'CAST(object_date AS DATE)',
        'object_hour_group': 'EXTRACT(HOUR FROM object_date)'
    })
    .values('object_date_group', 'object_hour_group')
    .annotate(count=Count('*'))

Alternatively, you could use any valid SQL to combine what I made two fields into one field, by formatting it into a string, for example. The nice thing about doing that, is that you can then use the tuples to construct a Counter for convenient querying (use values_list()).

This query will certainly be more efficient than doing the counting in Python. For a background job that may not be so important, however.

One downside is that this code is not portable; for one, it does not work on SQLite, which you may still be using for testing purposes. In that case, you might save yourself the trouble and write a raw query right away, which will be just as unportable but more readable.

Update

As of 1.10 it is possible to perform this query nicely using expressions, thanks to the addition of TruncHour. Here's a suggestion for how the solution could look:

from collections import Counter
from django.db.models import Count
from django.db.models.functions import TruncHour

counts_by_group = Counter(dict(
    MyModel.objects
        .annotate(object_group=TruncHour('object_date'))
        .values_list('object_group')
        .annotate(count=Count('object_group'))
)) # query with counts_by_group[datetime.datetime(year, month, day, hour)]

It's elegant, efficient and portable. :)

like image 76
Thijs van Dien Avatar answered Oct 19 '25 04:10

Thijs van Dien


count = len(MyModel.objects.filter(object_date__range=(beginning_of_hour, end_of_hour)))

or

count = MyModel.objects.filter(object_date__range=(beginning_of_hour, end_of_hour)).count()

Assuming I understand what you're asking for, this returns the number of objects that have a date within a specific time range. Set the range to be from the beginning of the hour until the end of the hour and you will return all objects created in that hour. Count() or len() can be used depending on the desired use. For more information on that check out https://docs.djangoproject.com/en/1.9/ref/models/querysets/#count

like image 33
Written Avatar answered Oct 19 '25 02:10

Written