This is fairly easy to do in Pig with a UDF: FILTER for values > threshold,
GROUP ALL, then use a nested FOREACH that does an ORDER BY on the timestamp
and calls your UDF on the sorted bag in the GENERATE.
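
As a rough illustration of what the UDF itself might do, here is a minimal sketch in plain Python (Pig can also run UDFs written in Jython). It assumes the bag arrives as (timestamp_in_seconds, value) tuples that are already filtered and sorted as described above. The names long_runs, min_duration and max_gap are made up for this example, and max_gap encodes an assumption about your log's sampling interval: two consecutive samples further apart than that are treated as separate runs.

```python
def long_runs(sorted_samples, min_duration=30, max_gap=1):
    """Return runs of samples whose total span exceeds min_duration seconds.

    sorted_samples: list of (timestamp_seconds, value) tuples, assumed to be
        already filtered to value > threshold and ordered by timestamp.
    max_gap: ASSUMPTION, not from the original post. Largest spacing in
        seconds between two samples that still count as the same run.
    """
    runs = []
    current = []
    for ts, value in sorted_samples:
        # A gap larger than max_gap means the value dropped below the
        # threshold in between, so the current run ends here.
        if current and ts - current[-1][0] > max_gap:
            if current[-1][0] - current[0][0] > min_duration:
                runs.append(current)
            current = []
        current.append((ts, value))
    # Flush the final run once the input is exhausted.
    if current and current[-1][0] - current[0][0] > min_duration:
        runs.append(current)
    return runs


if __name__ == "__main__":
    # One 40-second run (kept) and one 9-second run (dropped).
    samples = [(t, 5) for t in range(0, 41)] + [(t, 7) for t in range(100, 110)]
    for run in long_runs(samples):
        print(run[0][0], run[-1][0], len(run))
```

To call something like this from Pig you would still need to register it as a UDF and declare its output schema, which is left out here.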
On Mar 29, 2012 11:03 AM, "banermatt" <[EMAIL PROTECTED]> wrote:
> I'm developing a log file anomaly detection system on a Hadoop cluster.
> I'm looking for a way to process queries like: "select all values when
> value>threshold for a duration>30 seconds". Do you know a tool which could
> help me to process such a query?
> I read up on the scripting languages Pig, Hive and Jaql, which seem to have
> very similar applications. I tried them but was not able to do what I
> Thank you in advance,
> Sent from the Hadoop core-user mailing list archive at Nabble.com.