
Create dataframe and set column as datetime?

I want to create a pandas dataframe df like:

df = pd.DataFrame(
    {
        "group": ["A", "A", "A", "A", "A"],
        "date": ["2020-01-02", "2020-01-13", "2020-02-01", "2020-02-23", "2020-03-05"],
        "value": [10, 20, 16, 31, 56],
    }
)

I want to specify column "date" as dtype=datetime64[ns] while creating the dataframe, not afterwards.

So not like this:

df = pd.DataFrame(
    {
        "group": ["A", "A", "A", "A", "A"],
        "date": ["2020-01-02", "2020-01-13", "2020-02-01", "2020-02-23", "2020-03-05"],
        "value": [10, 20, 16, 31, 56],
    }
)
df["date"] = pd.to_datetime(df["date"])

But like this:

df = pd.DataFrame(
    {
        "group": ["A", "A", "A", "A", "A"],
        "date": pd.Series(
            ["2020-01-02", "2020-01-13", "2020-02-01", "2020-02-23", "2020-03-05"],
            dtype=np.datetime64,
        ),
        "value": [10, 20, 16, 31, 56],
    }
)

but that gives the error:

ValueError: The 'datetime64' dtype has no unit. Please pass in 'datetime64[ns]' instead.

Doing that

df = pd.DataFrame(
    {
        "group": ["A", "A", "A", "A", "A"],
        "date": pd.Series(
            ["2020-01-02", "2020-01-13", "2020-02-01", "2020-02-23", "2020-03-05"],
            dtype=np.datetime64[ns],
        ),
        "value": [10, 20, 16, 31, 56],
    }
) 

I get this error:

NameError: name 'ns' is not defined

So how can I set a specific column as type "datetime64[ns]"?

Vega asked Sep 04 '25 16:09


1 Answer

You can use pd.to_datetime within the dict, as follows:

import pandas as pd

df = pd.DataFrame(
    {
        "group": ["A", "A", "A", "A", "A"],
        "date": pd.to_datetime(["2020-01-02", "2020-01-13", "2020-02-01", "2020-02-23", "2020-03-05"]),
        "value": [10, 20, 16, 31, 56],
    }
)

The date column then has dtype datetime64[ns], as df.info() shows:

df.info()

<class 'pandas.core.frame.DataFrame'>
RangeIndex: 5 entries, 0 to 4
Data columns (total 3 columns):
 #   Column  Non-Null Count  Dtype         
---  ------  --------------  -----         
 0   group   5 non-null      object        
 1   date    5 non-null      datetime64[ns]
 2   value   5 non-null      int64         
dtypes: datetime64[ns](1), int64(1), object(1)
memory usage: 248.0+ bytes
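
As a possible alternative that stays closer to the attempt in the question, the dtype can be passed as a string. The NameError comes from np.datetime64[ns] being parsed as Python indexing with a variable named ns, so quoting the whole dtype name avoids it. This is a minimal sketch; the variable name dates is only introduced here for readability.

```python
import pandas as pd

# Hypothetical helper variable holding the date strings from the question.
dates = ["2020-01-02", "2020-01-13", "2020-02-01", "2020-02-23", "2020-03-05"]

df = pd.DataFrame(
    {
        "group": ["A", "A", "A", "A", "A"],
        # Pass the dtype as a string: "datetime64[ns]", not np.datetime64[ns].
        "date": pd.Series(dates, dtype="datetime64[ns]"),
        "value": [10, 20, 16, 31, 56],
    }
)

# Inspect the resulting column dtypes.
print(df.dtypes)
```

Because the unit is spelled out in the dtype string, the column resolution does not depend on whatever default pd.to_datetime picks in a given pandas version.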
SeaBean answered Sep 07 '25 16:09