I'm using data stored in Avro 1.7.4 format and trying to use Pig for data manipulation. When I LOAD the data and then STORE it again, I get the following error:
ERROR 2116: Output Location Validation Failed for: 'file:///home/pig/100/test.avro More info to follow: Can't redefine: Employees
Any ideas / suggestions would be appreciated.
Thanks.
The Employees field appears in two places in the schema.
Partial schema:
{
  "name" : "Employees",
  "type" : [ "null", {
    "type" : "array",
    "items" : {
      "type" : "record",
      "name" : "CheckResponsibleEmployee",
      "fields" : [ {
        "name" : "Id",
        "type" : "string"
      }, {
        "name" : "Name",
        "type" : "string"
      }, {
        "name" : "Job",
        "type" : "Job"
      }, {
        "name" : "Time",
        "type" : [ "null", "Date" ],
        "default" : null
      } ]
    }
  } ],
  "default" : null
}
and in another place (though I think this one is okay):
{
  "name" : "Employees",
  "type" : "ResponsibleEmployees"
}
I'm simply running this script (with the piggybank, avro 1.7.4, mapred, etc. libraries loaded):
data = LOAD 'part-m-00000.avro' USING AvroStorage();
STORE data INTO 'output.avro' USING AvroStorage();
Pig Stack Trace
---------------
ERROR 2116:
Output Location Validation Failed for: 'file:///home/pig/100/test.avro More info to follow:
Can't redefine: Employees
org.apache.pig.impl.logicalLayer.FrontendException: ERROR 1002: Unable to store alias posdata
at org.apache.pig.PigServer$Graph.registerQuery(PigServer.java:1635)
at org.apache.pig.PigServer.registerQuery(PigServer.java:575)
at org.apache.pig.tools.grunt.GruntParser.processPig(GruntParser.java:1093)
at org.apache.pig.tools.pigscript.parser.PigScriptParser.parse(PigScriptParser.java:501)
at org.apache.pig.tools.grunt.GruntParser.parseStopOnError(GruntParser.java:198)
at org.apache.pig.tools.grunt.GruntParser.parseStopOnError(GruntParser.java:173)
at org.apache.pig.tools.grunt.Grunt.run(Grunt.java:69)
at org.apache.pig.Main.run(Main.java:541)
at org.apache.pig.Main.main(Main.java:156)
at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:57)
at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
at java.lang.reflect.Method.invoke(Method.java:606)
at org.apache.hadoop.util.RunJar.main(RunJar.java:212)
Caused by: org.apache.pig.impl.plan.VisitorException: ERROR 2116:
Output Location Validation Failed for: 'file:///home/pig/100/test.avro More info to follow:
Can't redefine: Employees
at org.apache.pig.newplan.logical.rules.InputOutputFileValidator$InputOutputFileVisitor.visit(InputOutputFileValidator.java:75)
at org.apache.pig.newplan.logical.relational.LOStore.accept(LOStore.java:66)
at org.apache.pig.newplan.DepthFirstWalker.depthFirst(DepthFirstWalker.java:64)
at org.apache.pig.newplan.DepthFirstWalker.depthFirst(DepthFirstWalker.java:66)
at org.apache.pig.newplan.DepthFirstWalker.walk(DepthFirstWalker.java:53)
at org.apache.pig.newplan.PlanVisitor.visit(PlanVisitor.java:52)
at org.apache.pig.newplan.logical.rules.InputOutputFileValidator.validate(InputOutputFileValidator.java:45)
at org.apache.pig.backend.hadoop.executionengine.HExecutionEngine.compile(HExecutionEngine.java:300)
at org.apache.pig.PigServer.compilePp(PigServer.java:1380)
at org.apache.pig.PigServer.executeCompiledLogicalPlan(PigServer.java:1305)
at org.apache.pig.PigServer.execute(PigServer.java:1297)
at org.apache.pig.PigServer.access$400(PigServer.java:122)
at org.apache.pig.PigServer$Graph.registerQuery(PigServer.java:1630)
... 13 more
Caused by: org.apache.avro.SchemaParseException: Can't redefine: Employees
at org.apache.avro.Schema$Names.put(Schema.java:1019)
at org.apache.avro.Schema$NamedSchema.writeNameRef(Schema.java:496)
at org.apache.avro.Schema$RecordSchema.toJson(Schema.java:611)
at org.apache.avro.Schema$UnionSchema.toJson(Schema.java:799)
at org.apache.avro.Schema$RecordSchema.fieldsToJson(Schema.java:633)
at org.apache.avro.Schema$RecordSchema.toJson(Schema.java:620)
at org.apache.avro.Schema$UnionSchema.toJson(Schema.java:799)
at org.apache.avro.Schema$RecordSchema.fieldsToJson(Schema.java:633)
at org.apache.avro.Schema$RecordSchema.toJson(Schema.java:620)
at org.apache.avro.Schema$ArraySchema.toJson(Schema.java:722)
at org.apache.avro.Schema$UnionSchema.toJson(Schema.java:799)
at org.apache.avro.Schema$RecordSchema.fieldsToJson(Schema.java:633)
at org.apache.avro.Schema$RecordSchema.toJson(Schema.java:620)
at org.apache.avro.Schema$ArraySchema.toJson(Schema.java:722)
at org.apache.avro.Schema$UnionSchema.toJson(Schema.java:799)
at org.apache.avro.Schema$RecordSchema.fieldsToJson(Schema.java:633)
at org.apache.avro.Schema$RecordSchema.toJson(Schema.java:620)
at org.apache.avro.Schema$ArraySchema.toJson(Schema.java:722)
at org.apache.avro.Schema$UnionSchema.toJson(Schema.java:799)
at org.apache.avro.Schema$RecordSchema.fieldsToJson(Schema.java:633)
at org.apache.avro.Schema$RecordSchema.toJson(Schema.java:620)
at org.apache.avro.Schema$UnionSchema.toJson(Schema.java:799)
at org.apache.avro.Schema$RecordSchema.fieldsToJson(Schema.java:633)
at org.apache.avro.Schema$RecordSchema.toJson(Schema.java:620)
at org.apache.avro.Schema.toString(Schema.java:291)
at org.apache.avro.Schema.toString(Schema.java:281)
at org.apache.pig.builtin.AvroStorage.setOutputAvroSchema(AvroStorage.java:504)
at org.apache.pig.builtin.AvroStorage.checkSchema(AvroStorage.java:495)
at org.apache.pig.newplan.logical.rules.InputOutputFileValidator$InputOutputFileVisitor.visit(InputOutputFileValidator.java:65)
... 25 more
The problem is ambiguity of field names in the Pig Latin schema. I solved it by correcting the Avro schema so that it no longer contains fields with the same name that refer to different record types.
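To find such fields before running the Pig script, something like the following rough sketch might help. It is not part of Pig or Avro: it only walks the schema JSON with the Python standard library and lists field names that are declared with more than one distinct type. The helper names (`collect_field_types`, `conflicting_field_names`) and the `Root`, `Check` and `Shop` records in the example are made up for illustration, and named type references such as "ResponsibleEmployees" are compared as plain strings, not resolved.

```python
import json
from collections import defaultdict


def collect_field_types(schema, seen=None):
    """Map each field name to the set of distinct type definitions it is declared with."""
    if seen is None:
        seen = defaultdict(set)
    if isinstance(schema, list):
        # Union: visit every branch.
        for branch in schema:
            collect_field_types(branch, seen)
    elif isinstance(schema, dict):
        kind = schema.get("type")
        if kind == "record":
            for field in schema.get("fields", []):
                # Canonical JSON text, so identical definitions compare equal.
                seen[field["name"]].add(json.dumps(field["type"], sort_keys=True))
                collect_field_types(field["type"], seen)
        elif kind == "array":
            collect_field_types(schema.get("items"), seen)
        elif kind == "map":
            collect_field_types(schema.get("values"), seen)
        elif isinstance(kind, (dict, list)):
            # Nested type definition, e.g. {"type": {"type": "array", ...}}.
            collect_field_types(kind, seen)
    # Plain strings (primitives or named references) are not followed.
    return seen


def conflicting_field_names(schema):
    """Return the sorted field names that occur with more than one distinct type."""
    found = collect_field_types(schema)
    return sorted(name for name, types in found.items() if len(types) > 1)


# Hypothetical schema: "Employees" is used with two different types,
# mirroring the two schema fragments from the question.
example = {
    "type": "record", "name": "Root",
    "fields": [
        {"name": "Check", "type": {
            "type": "record", "name": "Check",
            "fields": [
                {"name": "Employees", "default": None, "type": ["null", {
                    "type": "array",
                    "items": {
                        "type": "record", "name": "CheckResponsibleEmployee",
                        "fields": [{"name": "Id", "type": "string"}]}}]},
            ]}},
        {"name": "Shop", "type": {
            "type": "record", "name": "Shop",
            "fields": [{"name": "Employees", "type": "ResponsibleEmployees"}]}},
    ],
}

print(conflicting_field_names(example))
```

Any name this reports is a candidate for renaming in the Avro schema. Whether a given name actually triggers "Can't redefine" depends on how AvroStorage derives record names, which this sketch does not model.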