Hi William,

Glad to hear you're enjoying Kudu so far! Always nice to hear from excited
Kudu-explorers.

You're right about the current limitations of Kudu when it comes to binary
storage. Due to these, the use-cases you mentioned for large files in Kudu
are currently fairly uncommon. There are unsupported workarounds for this
limit (e.g. --unlock_unsafe_flags --max_cell_size_bytes=<more than 64KB>),
but values above 64KB are, as the flag name implies, unsafe, untested, and
not officially supported. If you'd like to experiment with these, you're
more than welcome to (and please report back with what you find!).

There may be others more qualified to discuss storing larger data in Kudu,
but my understanding of it is that Kudu stores groups of rows together in a
columnar format (rowsets), and rolls these based on size. If one (or more)
of these columns is particularly large, the resulting rowset might be a
single row, and you might hit performance walls, etc. There may be more
issues I'm not aware of. Overall it's just very different from what Kudu
handles well at the moment (not to deter you, I am interested in seeing
what you find if you do pursue this).
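
To make the rowset point a bit more concrete, here is a rough
back-of-the-envelope sketch in Python. It is not how Kudu actually decides
when to roll a rowset. The 32 MiB target, the row sizes, and the
rows_per_rowset helper are all assumptions I made up for illustration.

```python
# Back-of-the-envelope sketch; all sizes here are assumed, not Kudu defaults.
def rows_per_rowset(target_rowset_bytes: int, avg_row_bytes: int) -> int:
    """Estimate how many rows fit before a rowset reaches its target size."""
    if target_rowset_bytes <= 0 or avg_row_bytes <= 0:
        raise ValueError("sizes must be positive")
    # A rowset holds at least one row, even if that row exceeds the target.
    return max(1, target_rowset_bytes // avg_row_bytes)


TARGET = 32 * 1024 * 1024  # hypothetical 32 MiB rowset target

small = rows_per_rowset(TARGET, 200)               # roughly 200-byte rows
large = rows_per_rowset(TARGET, 48 * 1024 * 1024)  # one 48 MiB cell per row
```

With these made-up numbers, small rows pack by the hundred thousand into
one rowset, while a single oversized cell leaves the rowset holding just one
row, which is the case I would expect to hit performance walls.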

Andrew

On Fri, Sep 8, 2017 at 1:37 PM, William Li <[EMAIL PROTECTED]>
wrote:

--
Andrew Wong