Hive >> mail # user >> UPDATE statement in Hive?


RE: UPDATE statement in Hive?
There is no UPDATE statement at this time. Because Hadoop does not support updating a file in place, an UPDATE in Hive, though possible, would just be syntactic sugar for merging the new values into the old data and then rewriting the table with the merged output. You can achieve this today by staging the new data in another table, merging it with the old table through a left outer join, and running an INSERT OVERWRITE on the old table with the result. Also note that queries running against the table while it is being rewritten may fail.
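The merge-and-rewrite pattern can be sketched as follows. This is only an illustration under stated assumptions: it uses Python's sqlite3 module as a local stand-in so that it runs without a cluster, and the table names user_visits and new_visits, their columns, and the sample rows are hypothetical. It also assumes the staged table holds at most one row per user. In Hive, the delete-and-reinsert step at the end would be a single INSERT OVERWRITE TABLE ... SELECT.

```python
import sqlite3


def merge_visits(conn):
    """Rebuild user_visits from the old rows merged with staged new_visits.

    Assumes new_visits holds at most one row per user_id.
    """
    cur = conn.cursor()
    # Step 1: merge. The left outer join keeps every old row and picks up
    # a newer last_visited_at when the staged table has one. The second
    # SELECT adds users that appear only in the staged data.
    cur.execute("""
        CREATE TABLE merged AS
        SELECT o.user_id AS user_id,
               o.first_visited_at AS first_visited_at,
               CASE WHEN n.visited_at IS NOT NULL
                         AND n.visited_at > o.last_visited_at
                    THEN n.visited_at
                    ELSE o.last_visited_at
               END AS last_visited_at
        FROM user_visits o
        LEFT OUTER JOIN new_visits n ON o.user_id = n.user_id
        UNION ALL
        SELECT n.user_id, n.visited_at, n.visited_at
        FROM new_visits n
        LEFT OUTER JOIN user_visits o ON n.user_id = o.user_id
        WHERE o.user_id IS NULL
    """)
    # Step 2: rewrite the old table with the merged output
    # (the stand-in for INSERT OVERWRITE in Hive).
    cur.execute("DELETE FROM user_visits")
    cur.execute("INSERT INTO user_visits SELECT * FROM merged")
    cur.execute("DROP TABLE merged")
    conn.commit()


def demo_merge():
    """Run the merge on a few hypothetical rows and return the result."""
    conn = sqlite3.connect(":memory:")
    conn.executescript("""
        CREATE TABLE user_visits (
            user_id INTEGER, first_visited_at TEXT, last_visited_at TEXT);
        CREATE TABLE new_visits (user_id INTEGER, visited_at TEXT);
        INSERT INTO user_visits VALUES (1, '2009-01-05', '2009-06-01');
        INSERT INTO user_visits VALUES (2, '2009-03-10', '2009-03-10');
        INSERT INTO new_visits VALUES (1, '2009-07-28');
        INSERT INTO new_visits VALUES (3, '2009-07-28');
    """)
    merge_visits(conn)
    return conn.execute(
        "SELECT user_id, first_visited_at, last_visited_at "
        "FROM user_visits ORDER BY user_id").fetchall()
```

Dates are stored as ISO-8601 strings here so that plain string comparison orders them correctly; a real table would likely use whatever timestamp representation the rest of the pipeline already uses.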

Another option is to change your schema so that the table stores the changes to each row rather than the row values themselves, and then change your queries to take the new schema into account.
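For the first-visit/last-visit use case, one way to read this suggestion is an append-only table of visit events, with the first and last visit derived at query time. The sketch below again uses Python's sqlite3 module purely as a local stand-in; the visit_events table, its columns, and the sample rows are hypothetical.

```python
import sqlite3


def visit_summary(conn):
    """Derive first and last visit per user from append-only events."""
    return conn.execute("""
        SELECT user_id,
               MIN(visited_at) AS first_visited_at,
               MAX(visited_at) AS last_visited_at
        FROM visit_events
        GROUP BY user_id
        ORDER BY user_id
    """).fetchall()


def demo_events():
    """Load a few hypothetical events and summarize them."""
    conn = sqlite3.connect(":memory:")
    conn.execute(
        "CREATE TABLE visit_events (user_id INTEGER, visited_at TEXT)")
    # Rows are only ever appended; nothing is updated in place.
    conn.executemany(
        "INSERT INTO visit_events VALUES (?, ?)",
        [(1, '2009-01-05'), (2, '2009-03-10'),
         (1, '2009-07-28'), (3, '2009-07-28')])
    return visit_summary(conn)
```

With this layout each day's visitors can be appended (for example as a new partition) and no existing row ever needs to change, at the cost of an aggregation on every read.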

Ashish

________________________________________
From: Saurabh Nanda [[EMAIL PROTECTED]]
Sent: Tuesday, July 28, 2009 3:41 AM
To: [EMAIL PROTECTED]
Subject: UPDATE statement in Hive?

Is there an UPDATE statement in Hive? If not, are there any plans for adding support for it in the future?

This is why I ask: I want to maintain a table that stores the first and last visit time for each user ID. This covers the entire year, not a single day; the goal is to understand how many visitors we got in the last 1/3/6 months, etc.

I can add new users into a separate partition to get around the limitation of not being able to append rows to a table. However, I don't know how to update the last_visited_at column for each user.

Is this best achieved by storing this table outside of Hive in a traditional RDBMS? I could query Hive over JDBC for a list of today's distinct visitors and update the 'external' table based on that list.

Saurabh.
--
http://nandz.blogspot.com
http://foodieforlife.blogspot.com