Logo Questions Linux Laravel Mysql Ubuntu Git Menu
 

Databricks - FileNotFoundException

I'm sorry if this is basic and I missed something simple. I'm trying to run the code below to iterate through files in a folder and merge all files that start with a specific string, into a dataframe. All files sit in a lake.

file_list=[]
path = "/dbfs/rawdata/2019/01/01/parent/"
files  = dbutils.fs.ls(path)
for file in files:
    if(file.name.startswith("CW")):
       file_list.append(file.name)
df = spark.read.load(path=file_list)

# check point
print("Shape: ", df.count(),"," , len(df.columns))
db.printSchema()

This looks fine to me, but apparently something is wrong here. I'm getting an error on this line:
files = dbutils.fs.ls(path)

Error message reads:

java.io.FileNotFoundException: File/6199764716474501/dbfs/rawdata/2019/01/01/parent does not exist.

The path, the files, and everything else definitely exist. I tried with and without the 'dbfs' part. Could it be a permission issue? Something else? I Googled for a solution. Still can't get traction with this.

like image 504
ASH Avatar asked Jun 05 '26 19:06

ASH


1 Answers

Make sure you have a folder named "dbfs" if your parent folder starts from "rawdata" the path should be "/rawdata/2019/01/01/parent" or "rawdata/2019/01/01/parent".

The error is thrown in case of incorrect path.

like image 54
Abraham Avatar answered Jun 08 '26 17:06

Abraham