Hive >> mail # user >> Reflect MySQL updates into Hive


Re: Reflect MySQL updates into Hive
Thanks Mohammad, I will be waiting ... meanwhile, it seems I will get into
HBase and give it a try ... unless someone advises something
better/easier.
--
Ibrahim
On Wed, Dec 26, 2012 at 5:52 PM, Mohammad Tariq <[EMAIL PROTECTED]> wrote:

> Hello Ibrahim,
>
>            Sorry for the late response. Those replies were for Kshiva. I
> saw his question (exactly the same as this one) multiple times on the Pig
> mailing list as well, so I just thought of giving him some pointers on how
> to use the list. I should have specified that properly. Apologies for the
> confusion.
>
> Coming back to the actual point: yes, the flow is fine. Normally people do
> it like this. But I was looking for some alternate way, so that we don't
> have to go through this long process for the updates. I'll let you know
> once I find something useful, but so far I haven't found anything better
> than what Dean has suggested. Please do let me know if you find something
> before me.
>
> Many thanks.
>
>
> Best Regards,
> Tariq
> +91-9741563634
> https://mtariq.jux.com/
>
>
> On Wed, Dec 26, 2012 at 7:24 PM, Ibrahim Yakti <[EMAIL PROTECTED]> wrote:
>
>> After more reading, a suggested scenario looks like:
>>
>> MySQL --(extract / load)--> HDFS --> load into HBase --> read as an
>> external table in Hive --(transform data & join tables)--> use Hive for
>> joins & queries --> update HBase as needed & reload in Hive.
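
A minimal HiveQL sketch of the "read as an external table in Hive" step in
this flow, assuming a hypothetical HBase table named "orders" with a single
column family "cf"; all table, column, and family names here are illustrative
only, not taken from the thread:

-- Hive table backed by a hypothetical HBase table "orders", whose row key
-- is the order id and whose column family "cf" holds attributes loaded
-- from MySQL.
CREATE EXTERNAL TABLE hbase_orders (
  order_id STRING,
  status   STRING,
  amount   DOUBLE
)
STORED BY 'org.apache.hadoop.hive.hbase.HBaseStorageHandler'
WITH SERDEPROPERTIES ("hbase.columns.mapping" = ":key,cf:status,cf:amount")
TBLPROPERTIES ("hbase.table.name" = "orders");

-- Queries and joins run directly against the HBase-backed table; because
-- the storage handler reads HBase at query time, updates written to HBase
-- show up in subsequent Hive queries.
SELECT status, COUNT(*) FROM hbase_orders GROUP BY status;
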
>>
>> What do you think please?
>>
>>
>>
>> --
>> Ibrahim
>>
>>
>> On Wed, Dec 26, 2012 at 9:27 AM, Ibrahim Yakti <[EMAIL PROTECTED]> wrote:
>>
>>> Mohammad, I am not sure if the answers & the link were to me or to
>>> Kshiva's question.
>>>
>>> If I have partitioned my data based on status, for example, then when I
>>> run the update query it will add the updated data to a new partition
>>> ('success' or 'shipped', for example) and keep the old data ('confirmed'
>>> or 'paid', for example), right?
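
For reference, a hedged HiveQL sketch of what status-based partitioning
implies; the orders_by_status and staging_orders tables are made up for
illustration. Overwriting one partition only replaces the files under that
partition's directory, so rows loaded earlier into another status partition
stay there until that partition is rewritten as well:

-- Hypothetical orders table partitioned by status.
CREATE TABLE orders_by_status (
  order_id STRING,
  amount   DOUBLE
)
PARTITIONED BY (status STRING);

-- Rewriting the 'shipped' partition only touches .../status=shipped;
-- rows for the same orders that were loaded earlier into status='paid'
-- remain in that partition until it is overwritten too.
INSERT OVERWRITE TABLE orders_by_status PARTITION (status = 'shipped')
SELECT order_id, amount
FROM   staging_orders
WHERE  status = 'shipped';
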
>>>
>>>
>>> --
>>> Ibrahim
>>>
>>>
>>> On Tue, Dec 25, 2012 at 8:59 AM, Mohammad Tariq <[EMAIL PROTECTED]> wrote:
>>>
>>>> Also, have a look at this :
>>>> http://www.catb.org/~esr/faqs/smart-questions.html
>>>>
>>>> Best Regards,
>>>> Tariq
>>>> +91-9741563634
>>>> https://mtariq.jux.com/
>>>>
>>>>
>>>> On Tue, Dec 25, 2012 at 11:26 AM, Mohammad Tariq <[EMAIL PROTECTED]> wrote:
>>>>
>>>>> Have a look at Beeswax.
>>>>>
>>>>> BTW, do you have access to Google at your station? The same question was
>>>>> posted on the Pig mailing list as well, that too twice.
>>>>>
>>>>> Best Regards,
>>>>> Tariq
>>>>> +91-9741563634
>>>>> https://mtariq.jux.com/
>>>>>
>>>>>
>>>>> On Tue, Dec 25, 2012 at 11:20 AM, Kshiva Kps <[EMAIL PROTECTED]> wrote:
>>>>>
>>>>>> Hi,
>>>>>>
>>>>>> Are there any Hive editors where we can write 100 to 150 Hive scripts?
>>>>>> I believe it is not easy to do all the scripts in CLI mode. Something
>>>>>> like an IDE for Java, or TOAD for SQL. Please advise, many thanks.
>>>>>>
>>>>>>
>>>>>> Thanks
>>>>>>
>>>>>> On Mon, Dec 24, 2012 at 8:21 PM, Dean Wampler <[EMAIL PROTECTED]> wrote:
>>>>>>
>>>>>>> This is not as hard as it sounds. The hardest part is setting up the
>>>>>>> incremental query against your MySQL database. Then you can write the
>>>>>>> results to new files in the HDFS directory for the table and Hive will see
>>>>>>> them immediately. Yes, even though Hive doesn't support updates, it doesn't
>>>>>>> care how many files are in the directory. The trick is to avoid lots of
>>>>>>> little files.
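
A small HiveQL sketch of the append step described above, assuming a
non-partitioned orders table and assuming the incremental extract from MySQL
has already been written to an HDFS staging path (the path and table name are
illustrative only):

-- Each incremental extract lands as one reasonably sized file (or a few)
-- under a staging directory, e.g. written by Sqoop or a custom export job.
-- Without OVERWRITE, LOAD DATA simply moves the new files into the table's
-- directory alongside the existing ones, and Hive picks them up on the
-- next query.
LOAD DATA INPATH '/staging/orders/2012-12-26'
INTO TABLE orders;
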
>>>>>>>
>>>>>>> As others have suggested, you should consider partitioning the data,
>>>>>>> perhaps by time. Say you import a few HDFS blocks' worth of data each
>>>>>>> day; then use year/month/day partitioning to speed up your Hive queries.
>>>>>>> You'll need to add the partitions to the table as you go, but actually, you
>>>>>>> can add those once a month, for example, for all partitions. Hive doesn't
>>>>>>> care if the partition directories don't exist yet or the directories are
>>>>>>> empty. I also recommend using an external table, which gives you more
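
A hedged HiveQL sketch of the external, date-partitioned layout described in
the last paragraph; the location, table and column names, and dates are made
up for illustration:

-- External table: dropping it leaves the data files in place, and Hive
-- simply reads whatever files currently exist under each partition
-- directory.
CREATE EXTERNAL TABLE orders_by_day (
  order_id STRING,
  status   STRING,
  amount   DOUBLE
)
PARTITIONED BY (year INT, month INT, day INT)
ROW FORMAT DELIMITED FIELDS TERMINATED BY '\t'
LOCATION '/data/orders';

-- Partitions can be registered ahead of time, e.g. once a month for the
-- whole month; empty or not-yet-created directories are fine.
ALTER TABLE orders_by_day ADD IF NOT EXISTS
  PARTITION (year = 2012, month = 12, day = 26)
  LOCATION '/data/orders/2012/12/26';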