Hive, mail # user - Reflect MySQL updates into Hive


Re: Reflect MySQL updates into Hive
Mohammad Tariq 2012-12-26, 14:52
Hello Ibrahim,

Sorry for the late response. Those replies were for Kshiva. I saw his
question (exactly the same as this one) multiple times on the Pig mailing
list as well, so I just thought of giving him some pointers on how to use
the list. I should have specified that properly. Apologies for the
nuisance.

Coming back to the actual point: yes, the flow is fine. Normally people do
it like this, but I was looking for some alternate way so that we don't
have to go through this long process for the updates. I'll let you know
once I find something useful, but so far I haven't found anything better
than what Dean sir has suggested. Please do let me know if you find
something before I do.

Many thanks.
Best Regards,
Tariq
+91-9741563634
https://mtariq.jux.com/
On Wed, Dec 26, 2012 at 7:24 PM, Ibrahim Yakti <[EMAIL PROTECTED]> wrote:

> After more reading, a suggested scenario looks like this:
>
> MySQL ---(Extract / Load)---> HDFS ---> Load into HBase ---> Read as an
> external table in Hive ---(Transform Data & Join Tables)---> Use Hive for
> joins & queries ---> Update HBase as needed & reload in Hive.
>
> What do you think, please?
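
For illustration, a minimal HiveQL sketch of the "read HBase as an external
table in Hive" step in the flow above, using the standard HBase storage
handler; the table names, columns, and column-family mapping are made-up
placeholders, not something taken from this thread:

  -- Hypothetical Hive table backed by an existing HBase table "orders".
  -- Column names and the HBase column mapping are placeholders.
  CREATE EXTERNAL TABLE orders_hbase (
    order_id  STRING,
    status    STRING,
    total     DOUBLE
  )
  STORED BY 'org.apache.hadoop.hive.hbase.HBaseStorageHandler'
  WITH SERDEPROPERTIES (
    'hbase.columns.mapping' = ':key,d:status,d:total'
  )
  TBLPROPERTIES ('hbase.table.name' = 'orders');

  -- Joins and queries then run in Hive over orders_hbase. Because the
  -- storage handler reads HBase at query time, rows updated in HBase are
  -- visible to later Hive queries without reloading any files.
  SELECT status, COUNT(*) FROM orders_hbase GROUP BY status;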
>
>
>
> --
> Ibrahim
>
>
> On Wed, Dec 26, 2012 at 9:27 AM, Ibrahim Yakti <[EMAIL PROTECTED]> wrote:
>
>> Mohammad, I am not sure if the answers & the link were meant for me or
>> for Kshiva's question.
>>
>> If I have partitioned my data based on status, for example, then when I
>> run the update query it will add the updated data to a new partition
>> ("success" or "shipped", for example) and keep the old data ("confirmed"
>> or "paid", for example), right?
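
As a rough sketch of that situation (all table names and values below are
made-up examples): with a status-partitioned table, re-importing an order
whose status changed adds a row under the new status partition, and the old
row stays in the old partition until that partition is rewritten:

  -- Hypothetical status-partitioned table.
  CREATE TABLE orders_by_status (
    order_id STRING,
    total    DOUBLE
  )
  PARTITIONED BY (status STRING);

  -- staging_orders is a hypothetical staging table holding the MySQL extract.
  -- The "update" re-import writes the order into its new partition...
  INSERT INTO TABLE orders_by_status PARTITION (status = 'shipped')
  SELECT order_id, total FROM staging_orders WHERE order_id = '42';

  -- ...while the stale copy remains in the old partition until that
  -- partition is overwritten (or dropped), e.g.:
  INSERT OVERWRITE TABLE orders_by_status PARTITION (status = 'paid')
  SELECT order_id, total FROM staging_orders WHERE status = 'paid';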
>>
>>
>> --
>> Ibrahim
>>
>>
>>> On Tue, Dec 25, 2012 at 8:59 AM, Mohammad Tariq <[EMAIL PROTECTED]> wrote:
>>
>>> Also, have a look at this :
>>> http://www.catb.org/~esr/faqs/smart-questions.html
>>>
>>> Best Regards,
>>> Tariq
>>> +91-9741563634
>>> https://mtariq.jux.com/
>>>
>>>
>>>> On Tue, Dec 25, 2012 at 11:26 AM, Mohammad Tariq <[EMAIL PROTECTED]> wrote:
>>>
>>>> Have a look at Beeswax.
>>>>
>>>> BTW, do you have access to Google at your station? You have asked the
>>>> same question on the Pig mailing list as well, and that too twice.
>>>>
>>>> Best Regards,
>>>> Tariq
>>>> +91-9741563634
>>>> https://mtariq.jux.com/
>>>>
>>>>
>>>>> On Tue, Dec 25, 2012 at 11:20 AM, Kshiva Kps <[EMAIL PROTECTED]> wrote:
>>>>
>>>>> Hi,
>>>>>
>>>>> Are there any Hive editors where we can write 100 to 150 Hive scripts?
>>>>> I believe it is not easy to do all those scripts in CLI mode.
>>>>> Something like an IDE for Java / TOAD for SQL. Please advise, many thanks.
>>>>>
>>>>>
>>>>> Thanks
>>>>>
>>>>> On Mon, Dec 24, 2012 at 8:21 PM, Dean Wampler <
>>>>> [EMAIL PROTECTED]> wrote:
>>>>>
>>>>>> This is not as hard as it sounds. The hardest part is setting up the
>>>>>> incremental query against your MySQL database. Then you can write the
>>>>>> results to new files in the HDFS directory for the table and Hive will see
>>>>>> them immediately. Yes, even though Hive doesn't support updates, it doesn't
>>>>>> care how many files are in the directory. The trick is to avoid lots of
>>>>>> little files.
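
A minimal sketch of that step, assuming a hypothetical orders table and
staging path; each batch extracted from MySQL lands as a new file that Hive
picks up on the next query:

  -- One way to move an incremental extract into the table's directory:
  -- LOAD DATA just moves the files, so Hive sees them immediately.
  -- (The path and table name are hypothetical.)
  LOAD DATA INPATH '/staging/orders/2012-12-26' INTO TABLE orders;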
>>>>>>
>>>>>> As others have suggested, you should consider partitioning the data,
>>>>>> perhaps by time. Say you import a few HDFS blocks' worth of data each
>>>>>> day; then use year/month/day partitioning to speed up your Hive queries.
>>>>>> You'll need to add the partitions to the table as you go, but you can add
>>>>>> them in batches, for example once a month for that month's partitions.
>>>>>> Hive doesn't care if the partition directories don't exist yet or if they
>>>>>> are empty. I also recommend using an external table, which gives you more
>>>>>> flexibility on directory layout, etc.
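
A rough HiveQL sketch of that layout, showing a partitioned, external variant
of the same hypothetical orders table; the paths, columns, and delimiter are
assumptions for illustration only:

  -- External table: Hive does not own the directory layout, so dropping
  -- the table leaves the data in place.
  CREATE EXTERNAL TABLE orders (
    order_id STRING,
    total    DOUBLE
  )
  PARTITIONED BY (year INT, month INT, day INT)
  ROW FORMAT DELIMITED FIELDS TERMINATED BY '\t'
  LOCATION '/data/orders';

  -- Partitions can be declared in batches, e.g. a month at a time; it is
  -- fine if the directories are still empty or do not exist yet.
  ALTER TABLE orders ADD IF NOT EXISTS
    PARTITION (year = 2012, month = 12, day = 26)
    LOCATION '/data/orders/2012/12/26';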
>>>>>>
>>>>>> Sqoop might be the easiest tool for importing the data, as it will
>>>>>> even generate a Hive table schema from the original MySQL table. However,
>>>>>> that feature may not be useful in this case, as you already have the table.
>>>>>>
>>>>>> I think Oozie is horribly complex to use and overkill for this