I am pretty new to NiFi and I’m struggling with something that (in my mind) should be very easy to do 😉
My flow consists of a JSON file being processed by different processors that extract different pieces of information and enrich the data. Each processor is implemented as an ExecuteStreamCommand and outputs the information it extracted as a small JSON element. As an example, one of the modules determines the language of one of the fields in the original JSON and outputs something like:

{ "language" : "en" }

Every module extracts a different piece of information, and each one outputs only that small fragment rather than the whole document, because my goal was to reduce the amount of data going around.

What would be the best way to merge the responses from all modules back into my original JSON once everything has been processed? The resulting JSON will then continue in the flow for further processing.
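
To make the goal concrete, here is a minimal Python sketch of the merge I am after. It is only an illustration of the desired result, not a NiFi configuration: the function name `merge_fragments`, the sample document, and the `sentiment` fragment are hypothetical, and it assumes every fragment is a flat JSON object whose keys do not clash with each other.

```python
import json


def merge_fragments(original, fragments):
    """Return a copy of `original` with the keys of every fragment added to it."""
    merged = dict(original)  # shallow copy so the input document is left untouched
    for fragment in fragments:
        merged.update(fragment)  # a later fragment wins if two share a key
    return merged


# Hypothetical original document and module outputs, for illustration only.
original = json.loads('{"id": 1, "text": "Hello world"}')
fragments = [
    json.loads('{"language": "en"}'),
    json.loads('{"sentiment": "positive"}'),
]

result = merge_fragments(original, fragments)
print(json.dumps(result))
```

In the real flow the six fragments would arrive as separate flow files sharing the same ${filename}, so the open question is which processor (or combination of processors) can perform this kind of merge.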

I tried using the MergeContent processor, but none of its output formats produces JSON, so I’m a bit stuck. Right now, the merge strategy is set to “Bin-Packing Algorithm” with the Correlation Attribute Name set to ${filename}. The minimum and maximum number of entries are both set to the expected number of elements to merge (6 in this case).

I tried the “Defragment” strategy as well, but the system complained about a missing fragment.index attribute (which I tried to provide through an UpdateAttribute processor, but that does not seem to work either).

Jean-Sébastien Vachon
