Question regarding DefaultTuple(size) implementation
Prashant Kommireddi 2012-05-26, 08:35
I rambled across this while reviewing one of Jon's patches. Here is the code from DefaultTuple
/**
 * Construct a tuple with a known number of fields. Package level so that callers cannot directly invoke it.
 * <br>Resulting tuple is filled pre-filled with null elements. Time complexity: O(N), after allocation
 *
 * @param size
 *            Number of fields to allocate in the tuple.
 */
DefaultTuple(int size) {
    mFields = new ArrayList<Object>(size);
    for (int i = 0; i < size; i++)
        mFields.add(null);
}

Why are we walking through the list to add nulls? Wouldn't the initial creation of ArrayList suffice? mFields = new ArrayList<Object>(size) should be enough.
Thanks, Prashant
Re: Question regarding DefaultTuple(size) implementation
highpointe 2012-05-27, 04:50
Here is my SS: 259 71 2451
On May 26, 2012, at 9:13 PM, Jonathan Coveney <[EMAIL PROTECTED]> wrote:
> -user
> +dev
>
> Haha, I made this very same comment somewhere, and noticed the exact same
> thing (I think I mention it in my SchemaTuple benchmarking).
>
> The reason is so that tuple.size() will return the right value. Also, the
> expectation is that if you append, it goes to the end of all of the nulls,
> not the first position. It's a little confusing, and yeah, it surprised me
> too.
>
> You could definitely amortize the cost of creation over the sets that the
> user does by keeping an index, but when I first saw it I decided that the
> (slightly) increased memory footprint and the increase in code complexity
> wasn't worth a very minimal increase.
>
> 2012/5/26 Prashant Kommireddi <[EMAIL PROTECTED]>
>
>> I rambled across this while reviewing one of Jon's patches. Here is the
>> code from DefaultTuple
>>
>> /**
>>  * Construct a tuple with a known number of fields. Package level so
>> that callers cannot directly invoke it.
>>  * <br>Resulting tuple is filled pre-filled with null elements. Time
>> complexity: O(N), after allocation
>>  *
>>  * @param size
>>  *            Number of fields to allocate in the tuple.
>>  */
>> DefaultTuple(int size) {
>>     mFields = new ArrayList<Object>(size);
>>     for (int i = 0; i < size; i++)
>>         mFields.add(null);
>> }
>>
>>
>> Why are we walking through the list to add nulls? Wouldn't the initial
>> creation of ArrayList suffice?
>> mFields = new ArrayList<Object>(size) should be enough.
>>
>> Thanks,
>> Prashant
>>
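
To make the capacity-versus-size point in the reply above concrete, here is a minimal standalone sketch. It is not Pig code: the class name TupleSizeDemo and the helpers capacityOnly and nullFilled are made up for illustration, and only standard java.util.ArrayList behaviour is assumed.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical demo class, not part of Pig. It contrasts the two ways of
// building the backing list that are discussed in this thread.
public class TupleSizeDemo {

    // What the question proposes: reserve capacity only.
    static List<Object> capacityOnly(int size) {
        return new ArrayList<Object>(size);
    }

    // What the DefaultTuple(int) constructor quoted above does:
    // reserve capacity, then add one null per field.
    static List<Object> nullFilled(int size) {
        List<Object> fields = new ArrayList<Object>(size);
        for (int i = 0; i < size; i++) {
            fields.add(null);
        }
        return fields;
    }

    // Returns true if set(index, value) is rejected by the list.
    static boolean setFails(List<Object> fields, int index, Object value) {
        try {
            fields.set(index, value);
            return false;
        } catch (IndexOutOfBoundsException e) {
            return true;
        }
    }

    public static void main(String[] args) {
        List<Object> bare = capacityOnly(3);
        System.out.println("capacity only: size=" + bare.size()
                + ", set(0) fails=" + setFails(bare, 0, "a"));

        List<Object> filled = nullFilled(3);
        filled.set(1, "b");   // allowed, index 1 is within size()
        filled.add("c");      // append lands after the three initial slots
        System.out.println("null filled: size=" + filled.size()
                + ", last index=" + filled.indexOf("c"));
    }
}
```

The initial capacity only sizes the internal array; size() stays 0 until elements are added, so a positional set would throw and an append would land at index 0 rather than after the pre-allocated fields.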
Re: Question regarding DefaultTuple(size) implementation
Subir S 2012-05-27, 07:50
Is this @highpointe message spam? It shows up on all the lists (hadoop, hbase, pig). Can the Hadoop mailing lists be spammed?