Pig >> mail # user >> String Representation of DataBag and its Schema


String Representation of DataBag and its Schema
In Java, I am trying to convert a DataBag from its String representation,
together with its schema String, into a valid DataBag object:

String databag_string = "{(apples,1024)}";
String schema_string = "b1:bag{t1:tuple(a:chararray,b:long)}";

I've tried implementing something along these lines, but I believe it's
heading in the wrong direction, and then I get stuck:

        String[] aliases = {"b1", "t1", "a", "b"};
        byte[] types = {DataType.BAG, DataType.TUPLE, DataType.CHARARRAY, DataType.LONG};
        // Flat list of four sibling fields; the bag/tuple nesting from
        // schema_string is not expressed here.
        List<Schema.FieldSchema> fsList = new ArrayList<Schema.FieldSchema>();
        for (int i = 0; i < aliases.length; i++) {
            fsList.add(new Schema.FieldSchema(aliases[i], types[i]));
        }
        Schema origSchema = new Schema(fsList);
        ResourceSchema rsSchema = new ResourceSchema(origSchema);
        Schema genSchema = Schema.getPigSchema(rsSchema);
        ResourceSchema.ResourceFieldSchema[] rfschema = rsSchema.getFields();
        // ... lost here, maybe Utf8StorageConverter c = new Utf8StorageConverter(); ???

An ideal process would be along the lines of:

DataBag d = BagFactory.getInstance().newDefaultBag();
d.something(databag_string, schema_string);    // ??? no idea what this process could be
d.toString().equals(databag_string) == true.

Thanks, -Dan
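
The round trip asked for above (string plus schema in, an object whose
string form equals the input out) can be sketched in plain Java. This is
only an illustration of the idea and deliberately uses no Pig classes, so
it runs on its own. The class BagStringSketch, its methods parseBag and
formatBag, and the String[] of type names standing in for schema_string are
all hypothetical. It assumes a bag of flat tuples, handles only chararray
and long, and does no escaping of commas or parentheses inside values.
Within Pig itself, the pieces that appear to cover this are
Utils.getSchemaFromString (schema string to Schema) and
Utf8StorageConverter.bytesToBag (bytes plus a ResourceFieldSchema to a
DataBag); that route is not exercised here.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.StringJoiner;

/** Stand-alone sketch, not Pig code: parses a flat bag literal such as "{(apples,1024)}". */
final class BagStringSketch {

    /**
     * Parses a bag literal whose tuples contain only scalar fields.
     * fieldTypes names the type of each tuple field in order; only
     * "chararray" and "long" are handled in this sketch.
     */
    static List<List<Object>> parseBag(String text, String[] fieldTypes) {
        String s = text.trim();
        if (s.length() < 2 || s.charAt(0) != '{' || s.charAt(s.length() - 1) != '}') {
            throw new IllegalArgumentException("not a bag literal: " + text);
        }
        String body = s.substring(1, s.length() - 1);
        List<List<Object>> bag = new ArrayList<>();
        int pos = 0;
        while (pos < body.length()) {
            int open = body.indexOf('(', pos);
            if (open < 0) {
                break; // no further tuples
            }
            int close = body.indexOf(')', open);
            if (close < 0) {
                throw new IllegalArgumentException("unclosed tuple: " + text);
            }
            // Split the tuple body on commas, keeping empty trailing fields.
            String[] raw = body.substring(open + 1, close).split(",", -1);
            if (raw.length != fieldTypes.length) {
                throw new IllegalArgumentException("field count does not match schema");
            }
            List<Object> tuple = new ArrayList<>();
            for (int i = 0; i < raw.length; i++) {
                tuple.add(convert(raw[i], fieldTypes[i]));
            }
            bag.add(tuple);
            pos = close + 1;
        }
        return bag;
    }

    /** Converts one raw field to a Java value according to its declared type. */
    private static Object convert(String raw, String type) {
        switch (type) {
            case "chararray":
                return raw;
            case "long":
                return Long.valueOf(raw.trim());
            default:
                throw new IllegalArgumentException("unsupported type: " + type);
        }
    }

    /** Renders the parsed bag back into the "{(a,b),(c,d)}" form. */
    static String formatBag(List<List<Object>> bag) {
        StringJoiner tuples = new StringJoiner(",", "{", "}");
        for (List<Object> tuple : bag) {
            StringJoiner fields = new StringJoiner(",", "(", ")");
            for (Object field : tuple) {
                fields.add(String.valueOf(field));
            }
            tuples.add(fields.toString());
        }
        return tuples.toString();
    }

    public static void main(String[] args) {
        String databagString = "{(apples,1024)}";
        String[] fieldTypes = {"chararray", "long"}; // stands in for t1:tuple(a:chararray,b:long)
        List<List<Object>> bag = parseBag(databagString, fieldTypes);
        System.out.println(formatBag(bag).equals(databagString)); // prints true
    }
}
```

The second field comes back as a Long rather than a String, which is the
part the schema contributes: without it, "1024" could not be told apart
from a chararray.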