Hello, I am trying to convert a list of X and Y coordinates to lines. I want to map this data by grouping on the IDs and also on time. My code executes successfully as long as I group by one column, but I run into errors when I group by two columns. I referred to this question.
Here's some sample data:
ID  X           Y           Hour
1   -87.78976   41.97658    16
1   -87.66991   41.92355    16
1   -87.59887   41.708447   17
2   -87.73956   41.876827   16
2   -87.68161   41.79886    16
2   -87.5999    41.7083     16
3   -87.59918   41.708485   17
3   -87.59857   41.708393   17
3   -87.64391   41.675133   17
Here's my code:
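import pandas as pd
from geopandas import GeoDataFrame
from shapely.geometry import Point, LineString
df = pd.read_csv("snow_gps.csv", sep=';')
# zip the coordinates into Point objects and convert to a GeoDataFrame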
geometry = [Point(xy) for xy in zip(df.X, df.Y)]
geo_df = GeoDataFrame(df, geometry=geometry)
# aggregate these points with groupby
geo_df = geo_df.groupby(['track_seg_point_id', 'Hour'])['geometry'].apply(lambda x: LineString(x.tolist()))
geo_df = GeoDataFrame(geo_df, geometry='geometry')
Here is the error: ValueError: LineStrings must have at least 2 coordinate tuples
This is the final result I am trying to get:
ID          Hour     geometry
1           16       LINESTRING (-87.78976 41.97658, -87.66991 41.9... 
1           17       LINESTRING (-87.78964000000001 41.976634999999... 
1           18       LINESTRING (-87.78958 41.97663499999999, -87.6... 
2           16       LINESTRING (-87.78958 41.976612, -87.669785 41... 
2           17       LINESTRING (-87.78958 41.976624, -87.66978 41.... 
3           16       LINESTRING (-87.78958 41.97666, -87.6695199999... 
3           17       LINESTRING (-87.78954 41.976665, -87.66927 41.... 
Any suggestions or ideas on how to group by multiple columns would be great.
Your code is fine; the problem is your data.
If you group by ID and Hour, only one point falls into the group with an ID of 1 and an Hour of 17. A LineString has to consist of 2 or more Points (it must have at least 2 coordinate tuples), so that group raises the ValueError. I added another point to your sample data:
    ID   X          Y           Hour
    1   -87.78976   41.97658    16
    1   -87.66991   41.92355    16
    1   -87.59887   41.708447   17
    1   -87.48234   41.677342   17
    2   -87.73956   41.876827   16
    2   -87.68161   41.79886    16
    2   -87.5999    41.7083     16
    3   -87.59918   41.708485   17
    3   -87.59857   41.708393   17
    3   -87.64391   41.675133   17
As you can see, the code below is almost identical to yours:
    import pandas as pd
    import geopandas as gpd
    from shapely.geometry import Point, LineString
    # the separator is a regex, so use a raw string and the python parser engine
    df = pd.read_csv("snow_gps.csv", sep=r'\s*,\s*', engine='python')
    # zip the coordinates into Point objects and convert to a GeoDataFrame
    geometry = [Point(xy) for xy in zip(df.X, df.Y)]
    geo_df = gpd.GeoDataFrame(df, geometry=geometry)
    geo_df2 = geo_df.groupby(['ID', 'Hour'])['geometry'].apply(lambda x: LineString(x.tolist()))
    geo_df2 = gpd.GeoDataFrame(geo_df2, geometry='geometry')
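If you cannot add points to the real data, one option is to drop the groups that are too short before building the lines. The following is only a sketch, assuming the columns are named ID, X, Y and Hour as in the sample data; the names `raw`, `sizes`, `too_short`, `valid` and `lines` are just for illustration, and the data is read from an inline string rather than from your CSV file.

```python
import io

import pandas as pd
from shapely.geometry import LineString

# The original sample data from the question (ID 1 / Hour 17 has a single point).
raw = """ID,X,Y,Hour
1,-87.78976,41.97658,16
1,-87.66991,41.92355,16
1,-87.59887,41.708447,17
2,-87.73956,41.876827,16
2,-87.68161,41.79886,16
2,-87.5999,41.7083,16
3,-87.59918,41.708485,17
3,-87.59857,41.708393,17
3,-87.64391,41.675133,17
"""
df = pd.read_csv(io.StringIO(raw))

# Count the points in each (ID, Hour) group to see which groups are too short.
sizes = df.groupby(["ID", "Hour"]).size()
too_short = sizes[sizes < 2]
print(too_short)

# Keep only the rows whose (ID, Hour) group has at least 2 points.
valid = df.groupby(["ID", "Hour"]).filter(lambda g: len(g) >= 2)

# Build one LineString per remaining group from its (X, Y) pairs.
lines = {
    key: LineString(list(zip(group["X"], group["Y"])))
    for key, group in valid.groupby(["ID", "Hour"])
}
print(lines)
```

Printing `too_short` first shows which ID and Hour combinations would trigger the ValueError, so you can decide whether dropping them is acceptable for your map. The same grouping and `GeoDataFrame` steps as in the code above can then be run on the filtered rows.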