I believe the default SerDe (LazySimpleSerDe) takes a parameter called 'serialization.last.column.takes.rest'. Setting it to true might solve your issue: restoMsg would then have to be declared as a string that receives the rest of the line, delimiters included, and you might have to parse it into an array in the query.
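Something along these lines might work. It is an untested sketch: the table name log_rest and the alias restoParts are made up for illustration, the column list and location are copied from your message, and I am assuming your Hive version honours the property on a SEQUENCEFILE table.

```sql
-- Hypothetical table name (log_rest); columns and location copied from the original DDL.
CREATE EXTERNAL TABLE IF NOT EXISTS log_rest (
  dt string,
  txOperOpciResto string,
  idRegPerf string,
  oper string,
  opcion string,
  accion string,
  servc string,
  canal string,
  platf string,
  codIdioma string,
  pais string,
  lacre string,
  dirIP string,
  restoMsg string  -- a plain string now, not array<string>
)
ROW FORMAT SERDE 'org.apache.hadoop.hive.serde2.lazy.LazySimpleSerDe'
WITH SERDEPROPERTIES (
  'field.delim' = '|',
  -- Assumption: with this set, the last column keeps everything after dirIP, '|' included.
  'serialization.last.column.takes.rest' = 'true'
)
STORED AS SEQUENCEFILE
LOCATION '/user/hadoop-user/uc3/seq/';

-- split() takes a regular expression, so the pipe has to be escaped.
SELECT dt, split(restoMsg, '\\|') AS restoParts
FROM log_rest;
```

If that works, restoParts[0], restoParts[1] and so on give you the individual variable columns, and size(restoParts) tells you how many a given row has.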
On May 30, 2012, at 9:27 AM, <[EMAIL PROTECTED]> <[EMAIL PROTECTED]> wrote:
> I’m trying to define a table over an external file. My file has 12 fixed columns followed by a varying number of columns that depends on some of the fixed ones. I tried to define the table as:
> CREATE EXTERNAL TABLE IF NOT EXISTS log_array (
> dt string,
> txOperOpciResto string,
> idRegPerf string,
> oper string,
> opcion string,
> accion string,
> servc string,
> canal string,
> platf string,
> codIdioma string,
> pais string,
> lacre string,
> dirIP string,
> restoMsg array<string>)
> ROW FORMAT DELIMITED
> FIELDS TERMINATED BY '|'
> COLLECTION ITEMS TERMINATED BY '|'
> STORED AS SEQUENCEFILE
> LOCATION '/user/hadoop-user/uc3/seq/';
> So what I tried was to put the whole varying part into an array field (restoMsg). The trick does not work because both delimiters, for fields and for collection items, are the same. My restoMsg field only gets one column and the rest are omitted.
> Is there any way to get that last part without custom code? If not, what classes should I create to do this, and how would I define the table then?
> Ramón Pin