Hive, mail # user - Templeton create table with custom inputformat

RE: Templeton create table with custom inputformat
Peter Marron 2013-08-02, 09:08
To answer my own question, in case anybody else gets as stuck as I was.
The answer is that although

                "INPUTFORMAT 'com.trilliumsoftware.profiling.LookupInputFormat' OUTPUTFORMAT 'org.apache.hadoop.hive.ql.io.HiveIgnoreKeyTextOutputFormat'"

is conceptually an object, and so you might expect it to be an object in the JSON, it is not. It is just a string, passed as the value of "storedAs".
So the JSON I need to create my table with a custom INPUTFORMAT and OUTPUTFORMAT looks like this:

{ "columns" : [
        { "name": "Year", "type": "string" },
        { "name": "Home", "type": "string" },
        { "name": "Away", "type": "string" },
        { "name": "Score", "type": "string" },
        { "name": "Venue", "type": "string" },
        { "name": "Attendance", "type": "string" }
  ],
  "location" : "/user/pmarron/Ex/_output/rows",
  "format": {
        "rowFormat": { "fieldsTerminatedBy": "," },
        "storedAs" : "INPUTFORMAT 'com.trilliumsoftware.profiling.LookupInputFormat' OUTPUTFORMAT 'org.apache.hadoop.hive.ql.io.HiveIgnoreKeyTextOutputFormat'"
  }
}
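
For anyone generating this payload from a script, the following is a minimal Python sketch of the same table description. The helper name custom_stored_as is hypothetical and only for illustration; the class names, columns and location are the ones shown above. The only point it makes is that the INPUTFORMAT/OUTPUTFORMAT clause goes into "storedAs" as a plain string.

```python
import json


def custom_stored_as(input_format: str, output_format: str) -> str:
    """Hypothetical helper: build the storedAs string for custom formats."""
    return f"INPUTFORMAT '{input_format}' OUTPUTFORMAT '{output_format}'"


def build_table_description() -> dict:
    """Return the table description from the message as a Python dict."""
    column_names = ["Year", "Home", "Away", "Score", "Venue", "Attendance"]
    return {
        "columns": [{"name": name, "type": "string"} for name in column_names],
        "location": "/user/pmarron/Ex/_output/rows",
        "format": {
            "rowFormat": {"fieldsTerminatedBy": ","},
            # A plain string, not a nested JSON object.
            "storedAs": custom_stored_as(
                "com.trilliumsoftware.profiling.LookupInputFormat",
                "org.apache.hadoop.hive.ql.io.HiveIgnoreKeyTextOutputFormat",
            ),
        },
    }


if __name__ == "__main__":
    # Print the JSON document that would be saved as the request body.
    print(json.dumps(build_table_description(), indent=2))
```

Writing the printed output to a file gives a body that can be passed to curl with -d @file in the same way as createtable.json.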

Hope that helps someone, it certainly would have helped me.

From: Peter Marron [mailto:[EMAIL PROTECTED]]
Sent: 29 July 2013 08:47
Subject: Templeton create table with custom inputformat


(I'm a little bit behind in reading the lists, so apologies if this is a duplicate question.)

I am running Templeton v1 (?) and HCatalog 0.5.0 with Hive 0.11.0 over Hadoop 1.0.4.

I can use something like this:

curl -s -X PUT -H 'Content-type: application/json' -d @createtable.json 'http://hpcluster1:50111/templeton/v1/ddl/database/default/table/ordinals?user.name=pmarron'

to successfully create a Hive table in my metastore, where the file createtable.json looks like this:

{ "external": true,
  "columns" : [
        { "name": "english", "type": "string" },
        { "name": "number", "type": "string" },
        { "name": "italian", "type": "string" }
  ],
  "format": {
      "storedAs" : "rcfile",
      "rowFormat": { "fieldsTerminatedBy": "," }
  }
}
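
The same PUT can be sketched in Python with only the standard library. This is an illustration under assumptions, not part of Templeton: the function names are hypothetical, and the host, path and user.name are simply copied from the curl command above. The sketch builds the request without sending it; send_request is defined separately for use against a real server.

```python
import json
import urllib.request

# URL copied from the curl command above.
TEMPLETON_URL = (
    "http://hpcluster1:50111/templeton/v1/ddl/database/default"
    "/table/ordinals?user.name=pmarron"
)

# Same content as createtable.json.
TABLE_DESCRIPTION = {
    "external": True,
    "columns": [
        {"name": "english", "type": "string"},
        {"name": "number", "type": "string"},
        {"name": "italian", "type": "string"},
    ],
    "format": {
        "storedAs": "rcfile",
        "rowFormat": {"fieldsTerminatedBy": ","},
    },
}


def build_create_table_request(description: dict) -> urllib.request.Request:
    """Hypothetical helper: wrap the table description in a PUT request."""
    body = json.dumps(description).encode("utf-8")
    return urllib.request.Request(
        TEMPLETON_URL,
        data=body,
        method="PUT",
        headers={"Content-Type": "application/json"},
    )


def send_request(request: urllib.request.Request) -> str:
    """Send the request and return the response body (needs a live server)."""
    with urllib.request.urlopen(request) as response:
        return response.read().decode("utf-8")


if __name__ == "__main__":
    request = build_create_table_request(TABLE_DESCRIPTION)
    # Only inspect the request here; call send_request(request) to submit it.
    print(request.get_method(), request.full_url)
```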

Now, I can change the "storedAs" argument to be "rcfile", "sequencefile", "textfile" or "orc" and they all work.
However, I can't work out any syntax that allows me to create a table with a custom InputFormat.
Is there some way to create a table over the Templeton RESTful interface with a custom InputFormat?

Also, I can't find the source code where this JSON is parsed. Is it shipped with the Hive 0.11 source?
If so, can someone tell me where?

Many thanks in advance.