

Re: Seeks with DataFileReader in C++
In our case, we have files built from large numbers of frames stored sequentially as records in a data file. Currently, finding the i-th frame requires starting at the beginning and reading every record until the right one is found. Binary search or some sort of index-based search would significantly decrease load times for many operations. It would also make implementing map-reduce-style operations on the data files easier, since there is currently no reliable way to shard the files.
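
To make the index-based idea concrete, here is a minimal sketch. It does not use the Avro C++ API or the Avro file format: the length-prefixed record layout and the helper names (`writeFrame`, `readFrame`, `buildIndex`, `readFrameAt`) are all hypothetical stand-ins. It only illustrates the general technique of paying for one sequential pass to record where each frame starts, then seeking straight to frame i afterwards.

```cpp
#include <cstddef>
#include <cstdint>
#include <sstream>
#include <stdexcept>
#include <string>
#include <vector>

// Hypothetical record layout (NOT the Avro format): a 4-byte length in
// host byte order, followed by that many payload bytes.
void writeFrame(std::ostream& out, const std::string& payload) {
    std::uint32_t len = static_cast<std::uint32_t>(payload.size());
    out.write(reinterpret_cast<const char*>(&len), sizeof(len));
    out.write(payload.data(), static_cast<std::streamsize>(len));
}

// Reads the record at the current position; returns false at end of
// stream or if the record is truncated.
bool readFrame(std::istream& in, std::string& payload) {
    std::uint32_t len = 0;
    if (!in.read(reinterpret_cast<char*>(&len), sizeof(len))) {
        return false;
    }
    payload.resize(len);
    if (len == 0) {
        return true;
    }
    return static_cast<bool>(in.read(&payload[0], static_cast<std::streamsize>(len)));
}

// One sequential pass: remember the byte offset at which each frame starts.
std::vector<std::streamoff> buildIndex(std::istream& in) {
    std::vector<std::streamoff> offsets;
    in.clear();
    in.seekg(0);
    std::string scratch;
    for (;;) {
        std::streamoff pos = in.tellg();
        if (!readFrame(in, scratch)) {
            break;
        }
        offsets.push_back(pos);
    }
    in.clear();  // reset the EOF/fail state so later seeks work
    return offsets;
}

// Random access: seek directly to frame i instead of rescanning the file.
std::string readFrameAt(std::istream& in,
                        const std::vector<std::streamoff>& index,
                        std::size_t i) {
    if (i >= index.size()) {
        throw std::out_of_range("frame index out of range");
    }
    in.clear();
    in.seekg(index[i]);
    std::string payload;
    if (!readFrame(in, payload)) {
        throw std::runtime_error("truncated frame");
    }
    return payload;
}
```

With an index like this, `readFrameAt(file, index, i)` costs one seek and one read regardless of i, and contiguous slices of the offset vector could serve as shard boundaries for map-reduce-style processing. How such an index would be built or stored for real Avro data files is left open here.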

I'll work on the patch; nothing is written yet :-)
       --Daniel

On Jan 23, 2013, at 4:56 AM, Thiruvalluvan MG <[EMAIL PROTECTED]> wrote:

> Hi Daniel,
>
> I think it would be nice if you could describe your use case. Yes, we'd be interested in seeing your implementation. Since this would be an added feature, it affects no one who does not use it. Please go ahead and create a ticket and submit a patch.
>
> Thanks
>
> Thiru
>
>
> ________________________________
> From: Daniel Russel <[EMAIL PROTECTED]>
> To: [EMAIL PROTECTED]
> Sent: Wednesday, 23 January 2013 11:20 AM
> Subject: Seeks with DataFileReader in C++
>
> From what I can tell, there is no way to do any sort of random access with the C++ DataFileReader API. Is this correct? Is someone working on that? If not, and people think this would be a generally interesting capability, I'd consider implementing it as I'd kind of like to have it. Thanks.
>              --Daniel