
Datetimes for Julia dataframes

pandas has a number of very handy utilities for manipulating datetime indices. Is there any similar functionality in Julia? I have not found any tutorials for working with such things, though it obviously must be possible.

Some examples of pandas utilities:

dti = pd.to_datetime(
    ["1/1/2018", np.datetime64("2018-01-01"), datetime.datetime(2018, 1, 1)]
)

dti = pd.date_range("2018-01-01", periods=3, freq="H")

dti = dti.tz_localize("UTC")

dti.tz_convert("US/Pacific")

idx = pd.date_range("2018-01-01", periods=5, freq="H")
ts = pd.Series(range(len(idx)), index=idx)
ts.resample("2H").mean()

Igor Rivin asked Oct 21 '25 04:10


1 Answer

Julia libraries follow a "do one thing and do it well" philosophy, so the ecosystem is laid out more like Unix (a battery of small tools that combine to accomplish a common goal) than like Python's. Hence there are separate libraries for DataFrames and Dates (Dates is part of the standard library):

julia> using Dates, DataFrames

Going through some of the examples from your question:

Pandas

dti = pd.to_datetime(
    ["1/1/2018", np.datetime64("2018-01-01"), datetime.datetime(2018, 1, 1)]
)

Julia

julia> DataFrame(dti=[Date("1/1/2018", "m/d/y"), Date("2018-01-01"), Date(2018,1,1)])
3×1 DataFrame
 Row │ dti
     │ Date
─────┼────────────
   1 │ 2018-01-01
   2 │ 2018-01-01
   3 │ 2018-01-01

Pandas

dti = pd.date_range("2018-01-01", periods=3, freq="H")

Julia

julia> dti = DateTime("2018-01-01") .+ Hour.(0:2)
3-element Vector{DateTime}:
 2018-01-01T00:00:00
 2018-01-01T01:00:00
 2018-01-01T02:00:00
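
When the end point is known rather than the number of periods, a range with a Period step is an alternative. This is a minimal sketch that uses only the Dates standard library; the variable names r and hours are chosen here for illustration.

```julia
using Dates

# A range from start to stop (inclusive) with a one-hour step.
r = DateTime(2018, 1, 1):Hour(1):DateTime(2018, 1, 1, 2)

# collect materializes the lazy range into a Vector{DateTime}.
hours = collect(r)
```

The range itself is lazy, so collect is only needed when an actual vector is required, for example as a DataFrame column.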

Pandas

dti = dti.tz_localize("UTC")

dti.tz_convert("US/Pacific")

Julia

Note that time zones live in a separate Julia library, TimeZones. Additionally, "US/Pacific" is a legacy time zone name, which is why it is requested with TimeZones.Class(:LEGACY) below.

julia> using TimeZones

julia> dti = ZonedDateTime.(dti, tz"UTC")
3-element Vector{ZonedDateTime}:
 2018-01-01T00:00:00+00:00
 2018-01-01T01:00:00+00:00
 2018-01-01T02:00:00+00:00

julia> astimezone.(dti, TimeZone("US/Pacific", TimeZones.Class(:LEGACY)))
3-element Vector{ZonedDateTime}:
 2017-12-31T16:00:00-08:00
 2017-12-31T17:00:00-08:00
 2017-12-31T18:00:00-08:00
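
Because "US/Pacific" is only a legacy alias, the same conversion can probably be written more simply with the current IANA name, "America/Los_Angeles". The sketch below is self-contained and rebuilds the three timestamps; the variable names utc and pacific are chosen here for illustration.

```julia
using Dates, TimeZones

# Three hourly timestamps, localized to UTC.
utc = ZonedDateTime.(DateTime("2018-01-01") .+ Hour.(0:2), tz"UTC")

# "America/Los_Angeles" is the current name of the zone that "US/Pacific" aliases,
# so no legacy class is needed to look it up.
pacific = astimezone.(utc, tz"America/Los_Angeles")
```

astimezone keeps the instant in time unchanged and only changes the zone it is expressed in, so utc and pacific compare equal element by element.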

Pandas

idx = pd.date_range("2018-01-01", periods=5, freq="H")
ts = pd.Series(range(len(idx)), index=idx)
ts.resample("2H").mean()

Julia

For resampling or other complex manipulations you will want to use the split-apply-combine pattern (see https://docs.juliahub.com/DataFrames/AR9oZ/1.3.1/man/split_apply_combine/): floor each timestamp to the start of its bin, then group on the floored column and aggregate.

julia> df = DataFrame(date=DateTime("2018-01-01")  .+ Hour.(0:4), vals=1:5)
5×2 DataFrame
 Row │ date                 vals
     │ DateTime             Int64
─────┼────────────────────────────
   1 │ 2018-01-01T00:00:00      1
   2 │ 2018-01-01T01:00:00      2
   3 │ 2018-01-01T02:00:00      3
   4 │ 2018-01-01T03:00:00      4
   5 │ 2018-01-01T04:00:00      5

julia> df.date2 = floor.(df.date, Hour(2));

julia> using StatsBase

julia> combine(groupby(df, :date2), :vals => mean => :vals_mean)
3×2 DataFrame
 Row │ date2                vals_mean
     │ DateTime             Float64
─────┼────────────────────────────────
   1 │ 2018-01-01T00:00:00        1.5
   2 │ 2018-01-01T02:00:00        3.5
   3 │ 2018-01-01T04:00:00        5.0
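
The floor-then-group steps above can be wrapped into a small function. This is only a sketch: resample_mean is a hypothetical helper, not part of DataFrames.jl, and the output column names :bin and :mean are assumptions made here. It returns one row per bin, which is the shape pandas' resample produces.

```julia
using Dates, DataFrames, Statistics

# Hypothetical helper (not a DataFrames.jl function): floor `timecol` to bins of
# width `period`, then average `valcol` within each bin.
function resample_mean(df::AbstractDataFrame, timecol::Symbol, valcol::Symbol, period::Period)
    # transform returns a copy with the extra :bin column; df itself is not modified.
    binned = transform(df, timecol => ByRow(t -> floor(t, period)) => :bin)
    return combine(groupby(binned, :bin), valcol => mean => :mean)
end

df = DataFrame(date=DateTime("2018-01-01") .+ Hour.(0:4), vals=1:5)
res = resample_mean(df, :date, :vals, Hour(2))
```

Passing Hour(2) mirrors resample("2H"); any other Period, such as Minute(30) or Day(1), should work the same way.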

Przemyslaw Szufel answered Oct 22 '25 23:10