Roger,
A basic time series construct is the "sliding" window applied to
time/value data sorted by time. A sample implementation is on my GitHub:
https://github.com/jpatanooga/Caduceus/tree/master/src/tv/floe/caduceus/hadoop/movingaverage
There are two jobs in there, one that uses the shuffle and one that
does not, to illustrate the difference. I have a blog draft coming
that accompanies this code; I'll follow up and send you a copy of the
draft.
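To make the sliding window idea concrete outside of Hadoop, here is a
minimal plain-Java sketch. It is not the code from the repository above;
the class and method names (SlidingWindowSketch, Point, movingAverage)
are made up for illustration, and the in-memory sort only stands in for
whatever ordering step a real job would use.

```java
import java.util.ArrayDeque;
import java.util.ArrayList;
import java.util.Comparator;
import java.util.Deque;
import java.util.List;

// Hypothetical illustration class; not taken from the Caduceus repository.
class SlidingWindowSketch {

    // One (timestamp, value) sample.
    static final class Point {
        final long time;
        final double value;

        Point(long time, double value) {
            this.time = time;
            this.value = value;
        }
    }

    // Sorts a copy of the points by timestamp, then emits the mean of each
    // full window of `windowSize` consecutive samples, sliding one sample
    // at a time.
    static List<Double> movingAverage(List<Point> points, int windowSize) {
        if (windowSize <= 0) {
            throw new IllegalArgumentException("windowSize must be positive");
        }
        List<Point> sorted = new ArrayList<>(points);
        sorted.sort(Comparator.comparingLong((Point p) -> p.time));

        List<Double> averages = new ArrayList<>();
        Deque<Double> window = new ArrayDeque<>();
        double sum = 0.0;
        for (Point p : sorted) {
            window.addLast(p.value);
            sum += p.value;
            if (window.size() > windowSize) {
                // Drop the oldest sample so the window keeps a fixed size.
                sum -= window.removeFirst();
            }
            if (window.size() == windowSize) {
                averages.add(sum / windowSize);
            }
        }
        return averages;
    }
}
```

For example, the values 10, 20, 30, 40 (given in any time order) with a
window size of 2 yield the averages 15, 25, 35.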
From that code you should be able to build out a more complex time
series / DSP process (using it as base code), something along the
lines of a 1NN classifier:
https://openpdc.svn.codeplex.com/svn/Hadoop/Current%20Version/
https://openpdc.svn.codeplex.com/svn/Hadoop/Current%20Version/docs/openPDC%20Datamining%20Tools%20Guide.pdf
https://openpdc.svn.codeplex.com/svn/Hadoop/Current%20Version/src/TVA/Hadoop/MapReduce/Datamining/SAX/SlidingTSClassifier_kNN.java
I'm in the process of updating that older openPDC code to be more
modern and modular for general data sources.
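As a rough idea of what a 1NN step over windows looks like, here is a
small plain-Java sketch. It is not the openPDC classifier: that code
works with SAX, while this sketch swaps in plain squared Euclidean
distance on raw window values to stay short. The names
(OneNearestNeighborSketch, Labeled, classify) are hypothetical.

```java
import java.util.List;

// Hypothetical illustration class; not taken from the openPDC code.
class OneNearestNeighborSketch {

    // A training window together with its class label.
    static final class Labeled {
        final double[] window;
        final String label;

        Labeled(double[] window, String label) {
            this.window = window;
            this.label = label;
        }
    }

    // Squared Euclidean distance between two equal-length windows.
    static double squaredDistance(double[] a, double[] b) {
        if (a.length != b.length) {
            throw new IllegalArgumentException("windows must have equal length");
        }
        double sum = 0.0;
        for (int i = 0; i < a.length; i++) {
            double diff = a[i] - b[i];
            sum += diff * diff;
        }
        return sum;
    }

    // Returns the label of the training window closest to the query window.
    static String classify(List<Labeled> training, double[] query) {
        if (training.isEmpty()) {
            throw new IllegalArgumentException("training set must not be empty");
        }
        String bestLabel = null;
        double bestDistance = Double.POSITIVE_INFINITY;
        for (Labeled example : training) {
            double d = squaredDistance(example.window, query);
            if (d < bestDistance) {
                bestDistance = d;
                bestLabel = example.label;
            }
        }
        return bestLabel;
    }
}
```

Each window produced by the sliding step would be passed to classify()
as the query, so the two pieces compose naturally.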
Josh
On Sat, Mar 5, 2011 at 12:05 AM, Roger Smith <[EMAIL PROTECTED]> wrote:
> All -
> I wonder if any of you have integrated a DSP library with Hadoop.
> We are considering using Hadoop to process time series data, but don't
> want to write standard DSP functions.
>
> Roger.
>
--
Twitter: @jpatanooga
Solution Architect @ Cloudera
hadoop: http://www.cloudera.com
blog: http://jpatterson.floe.tv