Hive, mail # user - Is there any API that tells me what files comprise a hive table?


Re: Is there any API that tells me what files comprise a hive table?
Nitin Pawar 2013-10-07, 19:04
You may want to reach out to hdfs dev about the format of the editlog. There
is a lot of information in it, and I am not sure how accurate my
understanding is.

At a previous job, we converted the daily editlog into a partitioned Hive
table and did exactly what you want to do.
Sadly, we could not open-source that product.
On Tue, Oct 8, 2013 at 12:26 AM, demian rosas <[EMAIL PROTECTED]> wrote:

> Edward,
>
> Thanks a lot for this info !!!
>
> This gives me a clearer picture of the problem and how I can approach it.
>
> Cheers.
>
>
> On 7 October 2013 11:52, Edward Capriolo <[EMAIL PROTECTED]> wrote:
>
>> Not a direct API.
>>
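>> What I do is this. From java/thrift:
>> Table t = client.getTable("name_of_table");
>> Path p = new Path(t.getSd().getLocation());
>> FileSystem fs = FileSystem.get(conf);
>> // listStatus is not recursive; for partitioned tables use fs.listFiles(p, true)
>> FileStatus[] f = fs.listStatus(p);
>> // your logic here, e.g. f[i].getPath(), f[i].getLen(), f[i].getModificationTime()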
>>
>>
>>
>>
>> On Mon, Oct 7, 2013 at 2:01 PM, demian rosas <[EMAIL PROTECTED]> wrote:
>>
>>> Hi all,
>>>
>>> I want to track the changes made to the files of a Hive table.
>>>
>>> I wonder whether there is any API that I can use to find out the
>>> following:
>>>
>>> 1. What files in hdfs constitute a hive table.
>>> 2. What is the size of each of these files.
>>> 3. The time stamp of the creation/last update to each of these files.
>>>
>>>
>>> Also, more generally, is there any API that can do the above for HDFS
>>> files in general (not only Hive-specific ones)?
>>>
>>> Thanks a lot in advance.
>>>
>>> Cheers.
>>>
>>>
>>>
>>
>
--
Nitin Pawar
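
For the change-tracking goal in the original question, one possible approach (not the editlog method mentioned above) is to take a listing of the table directory at two points in time and compare them. The sketch below is plain Java with no Hadoop dependency; it assumes each listing has already been reduced to a map from path string to size and modification time, for example from FileStatus.getPath(), getLen() and getModificationTime(). The names TableFileDiff, FileInfo and Diff are hypothetical and are not part of Hive or Hadoop. Note that FileStatus reports a modification time, not a creation time.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

// Hypothetical helper, not part of Hive or Hadoop.
class TableFileDiff {

    /** Size and modification time of one file (assumed to come from a FileStatus). */
    static final class FileInfo {
        final long length;
        final long modificationTime;

        FileInfo(long length, long modificationTime) {
            this.length = length;
            this.modificationTime = modificationTime;
        }
    }

    /** Paths that appeared, disappeared, or changed between two listings. */
    static final class Diff {
        final List<String> added = new ArrayList<>();
        final List<String> removed = new ArrayList<>();
        final List<String> modified = new ArrayList<>();
    }

    /** Compares two snapshots keyed by path; each result list is sorted. */
    static Diff diff(Map<String, FileInfo> before, Map<String, FileInfo> after) {
        Diff d = new Diff();
        for (Map.Entry<String, FileInfo> e : after.entrySet()) {
            FileInfo old = before.get(e.getKey());
            FileInfo cur = e.getValue();
            if (old == null) {
                d.added.add(e.getKey());
            } else if (old.length != cur.length
                    || old.modificationTime != cur.modificationTime) {
                d.modified.add(e.getKey());
            }
        }
        for (String path : before.keySet()) {
            if (!after.containsKey(path)) {
                d.removed.add(path);
            }
        }
        Collections.sort(d.added);
        Collections.sort(d.removed);
        Collections.sort(d.modified);
        return d;
    }

    public static void main(String[] args) {
        // Made-up example paths and values, for illustration only.
        Map<String, FileInfo> before = new HashMap<>();
        before.put("/warehouse/t/000000_0", new FileInfo(100L, 1000L));
        before.put("/warehouse/t/000001_0", new FileInfo(200L, 1000L));

        Map<String, FileInfo> after = new HashMap<>();
        after.put("/warehouse/t/000000_0", new FileInfo(150L, 2000L));
        after.put("/warehouse/t/000002_0", new FileInfo(300L, 2000L));

        Diff d = diff(before, after);
        System.out.println("added:    " + d.added);
        System.out.println("removed:  " + d.removed);
        System.out.println("modified: " + d.modified);
    }
}
```

Polling like this only sees the state at each snapshot, so a file that is created and deleted between two listings goes unnoticed; the editlog-based approach described earlier in the thread would not have that gap.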