I wonder if the JDBC driver over Hive could help you. If your legacy ETL job
can talk to a JDBC driver, that is one (slow) way of writing to HDFS, though I
don't have any experience doing it myself.
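As a starting point, something along these lines might work (a minimal, untested sketch: the host `hive-host`, port 10000, user `etl_user`, and table `etl_target` are placeholders, the `hive-jdbc` jar must be on the classpath, and `INSERT ... VALUES` needs a Hive version that supports it):

```java
import java.sql.Connection;
import java.sql.DriverManager;
import java.sql.PreparedStatement;

public class HiveJdbcSketch {

    // Build a HiveServer2 JDBC URL; host/port/db are whatever your cluster uses.
    static String hiveUrl(String host, int port, String db) {
        return "jdbc:hive2://" + host + ":" + port + "/" + db;
    }

    public static void main(String[] args) throws Exception {
        // Placeholder host and credentials -- adjust for your cluster.
        String url = hiveUrl("hive-host", 10000, "default");
        try (Connection conn = DriverManager.getConnection(url, "etl_user", "");
             PreparedStatement ps = conn.prepareStatement(
                 "INSERT INTO TABLE etl_target VALUES (?, ?)")) {
            ps.setInt(1, 1);
            ps.setString(2, "example row");
            // Each statement is compiled into MapReduce jobs on the cluster,
            // which is why this path is slow for bulk writes.
            ps.execute();
        }
    }
}
```

For bulk loads, staging files on HDFS and running LOAD DATA or INSERT ... SELECT is usually far faster than row-by-row JDBC inserts.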
On Tue, Oct 8, 2013 at 10:07 AM, Jitendra Yadav wrote:
> Hi Bertrand,
> Thanks for your reply.
> As per my understanding, the open source tools you mentioned do not
> support procedural language (PL) flexibility. Right?
> I was looking for some other alternatives so that we can migrate our
> existing code rather than creating Java UDFs etc. So handling complex
> ETL business logic is still very difficult on Hadoop in terms of
> coding, QA and performance?
> On 10/8/13, Bertrand Dechoux <[EMAIL PROTECTED]> wrote:
> > open source : Pig, Hive, Cascading ...
> > other : Talend ...
> > Is that the answer you are expecting or are you looking for something
> > specific?
> > Regards
> > Bertrand
> > On Tue, Oct 8, 2013 at 5:47 PM, Jitendra Yadav
> > <[EMAIL PROTECTED]>wrote:
> >> Hi All,
> >> We are planning to consolidate our 3 existing warehouse databases onto
> >> a Hadoop cluster. In our testing phase we have designed the target
> >> environment and transferred the data from source to target (not yet in
> >> sync, but almost complete). These legacy systems were using
> >> traditional ETL/replication mechanisms like GoldenGate, loaders,
> >> PL/SQL, etc. FYI, we are using 80% PL/SQL code and SQL Server
> >> packages in the current environment; however, we have re-written some
> >> of the ETL jobs in Java and Python MR but are looking for more, and
> >> easier, alternatives.
> >> What is the best approach we should follow to complete this process?
> >> While suggesting, please take effort and timing into consideration (if
> >> possible).
> >> Please guide.
> >> Regards
> >> Jitendra