Hive >> mail # user >> Is there any API that tells me what files comprise a hive table?


demian rosas 2013-10-07, 18:01
Nitin Pawar 2013-10-07, 18:13
demian rosas 2013-10-07, 18:50
Edward Capriolo 2013-10-07, 18:52
Sanjay Subramanian 2013-10-07, 19:04
demian rosas 2013-10-07, 19:16
demian rosas 2013-10-07, 18:56
Re: Is there any API that tells me what files comprise a hive table?
You may want to ask on the hdfs-dev list about the format of the edit log.
It holds a lot of information, and I am not sure how accurate my
understanding of it is.

At a previous job, we converted the daily edit log into a partitioned
Hive table and did exactly what you want to do.
Sadly, we could not open-source that product.
On Tue, Oct 8, 2013 at 12:26 AM, demian rosas <[EMAIL PROTECTED]> wrote:

> Edward,
>
> Thanks a lot for this info !!!
>
> This gives me a clearer picture of the problem and how I can approach it.
>
> Cheers.
>
>
> On 7 October 2013 11:52, Edward Capriolo <[EMAIL PROTECTED]> wrote:
>
>> Not a direct API.
>>
>> What I do is this. From java/thrift:
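>> Table t = client.getTable("default", "name_of_table"); // (database, table)
>> Path p = new Path(t.getSd().getLocation());
>> FileSystem fs = p.getFileSystem(conf);
>> FileStatus[] f = fs.listStatus(p); // top level only; partitions are subdirectories
>> // your logic here, e.g. f[i].getPath(), f[i].getLen(), f[i].getModificationTime()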
>>
>>
>>
>>
>> On Mon, Oct 7, 2013 at 2:01 PM, demian rosas <[EMAIL PROTECTED]> wrote:
>>
>>> Hi all,
>>>
>>> I want to track the changes made to the files of a Hive table.
>>>
>>> I wonder whether there is any API that I can use to find out the
>>> following:
>>>
>>> 1. What files in hdfs constitute a hive table.
>>> 2. What is the size of each of these files.
>>> 3. The time stamp of the creation/last update to each of these files.
>>>
>>>
>>> Also in a wider view, is there any API that can do the above mentioned
>>> for HDFS files in general (not only hive specific)?
>>>
>>> Thanks a lot in advance.
>>>
>>> Cheers.
>>>
>>>
>>>
>>
>
--
Nitin Pawar
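
For anyone who would rather not link against the Hadoop and Hive client
libraries, the three items in the original question (path, size, timestamp)
also appear in the output of a recursive shell listing of the table location,
e.g. `hdfs dfs -ls -R <location>` (`hadoop fs -lsr` on older releases). The
following is only a rough sketch of parsing that output with plain Java. The
class name `HdfsLsParser`, the `Entry` type, and the sample listing are made
up for illustration, and the column layout (permissions, replication, owner,
group, size, date, time, path) is assumed to match your Hadoop version. Note
that the timestamp shown is the last modification time; as far as I know HDFS
does not record a separate creation time.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical helper: parses the text printed by a recursive HDFS shell listing.
class HdfsLsParser {

    /** One regular file from the listing. */
    static final class Entry {
        final String path;
        final long sizeBytes;
        final String modified; // "yyyy-MM-dd HH:mm", exactly as printed by the shell

        Entry(String path, long sizeBytes, String modified) {
            this.path = path;
            this.sizeBytes = sizeBytes;
            this.modified = modified;
        }
    }

    /** Returns the regular files in the listing; directories and headers are skipped. */
    static List<Entry> parse(String listing) {
        List<Entry> files = new ArrayList<>();
        for (String line : listing.split("\\R")) {
            String trimmed = line.trim();
            // Skip blank lines, the "Found N items" header, and directories ('d' permissions).
            if (trimmed.isEmpty() || trimmed.startsWith("Found ") || trimmed.startsWith("d")) {
                continue;
            }
            // Assumed columns: permissions, replication, owner, group, size, date, time, path.
            // The limit of 8 keeps a path that contains spaces in one piece.
            String[] c = trimmed.split("\\s+", 8);
            if (c.length < 8) {
                continue; // not a line in the assumed layout
            }
            files.add(new Entry(c[7], Long.parseLong(c[4]), c[5] + " " + c[6]));
        }
        return files;
    }

    public static void main(String[] args) {
        // Made-up sample output; real input would come from the shell command.
        String sample =
            "drwxr-xr-x   - hive hadoop          0 2013-10-07 18:01 /warehouse/t/dt=2013-10-07\n"
          + "-rw-r--r--   3 hive hadoop    1048576 2013-10-07 18:01 /warehouse/t/dt=2013-10-07/000000_0\n";
        for (Entry e : parse(sample)) {
            System.out.println(e.path + "\t" + e.sizeBytes + "\t" + e.modified);
        }
    }
}
```

Saving the parsed entries from one run and comparing them with the next run
(new paths, changed sizes, newer timestamps) would be one simple way to track
changes, though on a large warehouse the edit log approach mentioned above is
likely to scale better.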
demian rosas 2013-10-07, 19:22