Accumulo >> mail # user >> Best practices in sizing values?

Best practices in sizing values?
I have an application that stores blocks of unstructured text.  Normally the text is relatively small (under 500 KB), but in some cases it can reach several gigabytes.
I am considering a size threshold: below it, I store the text in the value of my mutation; above it, I write the text to HDFS and store only a reference to its HDFS location in the value.  I would like advice on where that threshold should be (best practice) or must be (system limitation).
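
To make the idea concrete, here is a minimal sketch of the threshold scheme using only the Java standard library. Everything in it is hypothetical illustration: the class name ValueSizer, the one-byte 'I'/'R' marker, and the threshold parameter are assumptions, not Accumulo API, and the threshold number itself is exactly what is being asked about. Writing the large text to HDFS is assumed to happen elsewhere and is not shown.

```java
import java.nio.charset.StandardCharsets;

// Hypothetical helper: decides whether text goes inline or by reference,
// and builds the bytes that would be placed in the mutation's value.
final class ValueSizer {
    static final byte INLINE = 'I';     // assumed marker: value holds the text itself
    static final byte REFERENCE = 'R';  // assumed marker: value holds an HDFS path

    // True when the text is small enough to store directly in the value.
    static boolean fitsInline(long sizeBytes, long thresholdBytes) {
        return sizeBytes <= thresholdBytes;
    }

    // Value layout for small text: marker byte followed by the text bytes.
    static byte[] inlineValue(byte[] text) {
        return withMarker(INLINE, text);
    }

    // Value layout for large text: marker byte followed by the UTF-8 path.
    static byte[] referenceValue(String hdfsPath) {
        return withMarker(REFERENCE, hdfsPath.getBytes(StandardCharsets.UTF_8));
    }

    // Readers check the first byte to learn how to interpret the rest.
    static boolean isReference(byte[] value) {
        return value.length > 0 && value[0] == REFERENCE;
    }

    private static byte[] withMarker(byte marker, byte[] payload) {
        byte[] value = new byte[payload.length + 1];
        value[0] = marker;
        System.arraycopy(payload, 0, value, 1, payload.length);
        return value;
    }
}
```

The size check takes a long rather than a byte array so that multi-gigabyte text never has to be loaded into memory just to decide where it goes.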
Also, can I stream data into a value instead of passing a byte array, similar to how CLOBs and BLOBs are handled in an RDBMS?