Pig >> mail # user >> Parsing variable schema


Parsing variable schema
Here is a snippet showing how the schema is applied to tuples:

String serializedSchema = p.getProperty(signature + SCHEMA_FILE);
if (serializedSchema != null) {
    try {
        resourceSchema = new ResourceSchema(Utils.getSchemaFromString(serializedSchema));
    } catch (ParserException e) {
        mLog.error("Unable to parse serialized schema " + serializedSchema, e);
    }
}
Is there a good way to define multiple "serializedSchema" values that could be
applied to different types of tuples (different kinds of log lines)? I am able
to push this logic into a UDF that parses each record based on a schema data
structure I build within it. I am wondering whether this can be done in the
LoadFunc itself.
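
For illustration, a per-record-type schema lookup of the kind described above could look roughly like the following. This is a minimal sketch in plain Java, not the actual UDF or loader: the class name MultiSchemaSketch, the LOGIN/ERROR tags, the schema strings, and the assumption that the first tab-separated field names the record type are all hypothetical. In a real LoadFunc each schema string would presumably still go through Utils.getSchemaFromString, as in the snippet above.

```java
import java.util.LinkedHashMap;
import java.util.Map;

// Hypothetical sketch: none of these names come from Pig or from the original loader.
class MultiSchemaSketch {

    // Record-type tag -> Pig-style schema string, one entry per kind of log line.
    static final Map<String, String> SCHEMAS = new LinkedHashMap<>();
    static {
        SCHEMAS.put("LOGIN", "user:chararray, ts:long");
        SCHEMAS.put("ERROR", "code:int, message:chararray, ts:long");
    }

    // Assumes the first tab-separated field of each line names its record type.
    static String schemaFor(String line) {
        String tag = line.split("\t", 2)[0];
        return SCHEMAS.get(tag); // null when the record type is unknown
    }

    // Pairs the values that follow the tag with the field names of the matching schema.
    static Map<String, String> parse(String line) {
        String schema = schemaFor(line);
        if (schema == null) {
            return null; // caller decides whether to skip or log unknown lines
        }
        String[] parts = line.split("\t", -1);
        String[] fields = schema.split(",");
        Map<String, String> record = new LinkedHashMap<>();
        for (int i = 0; i < fields.length && i + 1 < parts.length; i++) {
            String name = fields[i].trim().split(":")[0];
            record.put(name, parts[i + 1]);
        }
        return record;
    }

    public static void main(String[] args) {
        System.out.println(parse("LOGIN\tprashant\t1355337060"));
        System.out.println(parse("ERROR\t500\tdisk full\t1355337061"));
    }
}
```

The sketch only selects a schema per line and names the fields; type conversion and the hand-off to Pig tuples are left out.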

Thanks,
Prashant
Jonathan Coveney 2012-12-12, 18:07
Prashant Kommireddi 2012-12-13, 07:51