Hive >> mail # user >> Hive select shows null after successful data load


Sunita Arvind 2013-06-18, 18:24
Nitin Pawar 2013-06-18, 18:26
Sunita Arvind 2013-06-18, 20:29
Nitin Pawar 2013-06-18, 20:35
Sunita Arvind 2013-06-18, 22:58
Stephen Sprague 2013-06-19, 00:38
Sunita Arvind 2013-06-19, 01:35
Sunita Arvind 2013-06-19, 01:58
Richa Sharma 2013-06-19, 06:17
Sunita Arvind 2013-06-19, 11:34
Stephen Sprague 2013-06-19, 14:38
Sunita Arvind 2013-06-19, 15:24
Ramki Palle 2013-06-19, 16:11
Sunita Arvind 2013-06-19, 17:00
Stephen Sprague 2013-06-19, 19:08
Sunita Arvind 2013-06-19, 19:29
Sunita Arvind 2013-06-20, 02:54
Re: Hive select shows null after successful data load
hooray!  Over one hurdle and onto the next one.  So something about that
one nested array caused the problem. Very strange. I wonder if there is a
smaller test case to look at, since it seems not all arrays break it; I
see one for the attribute "values".

As to the formatting issue, I don't believe the native hive client has much
to offer there; it's bare bones and record oriented. Beeline seems to be
another open-source hive client which looks to have more options, so you
might have a gander at that, though I don't think it has anything special
for pretty printing arrays, maps or structs. But I could be wrong.

And then of course nothing is stopping you from exploring piping that
gnarly stuff into python (or whatever) and having it come out the other end
all nice and pretty, and then posting that here. :)
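For instance, here's a rough sketch of that pipe idea, not tested against real hive output: assuming each selected column prints as a JSON-style array on its own line (as in your output below), a few lines of python can zip them back into one row per job:

```python
import json
import sys

def rows_from_hive_arrays(lines):
    """Zip parallel JSON-array columns (one per line) into per-job rows.

    Assumes each non-empty line is a valid JSON array, which may not
    hold for all hive output; adjust the parsing as needed.
    """
    cols = [json.loads(line) for line in lines if line.strip()]
    # zip(*cols) pairs up the i-th element of every column
    return ["\t".join(str(v) for v in row) for row in zip(*cols)]

if __name__ == "__main__":
    # e.g.  hive -e 'select ...' | python pretty.py   (script name is made up)
    for row in rows_from_hive_arrays(sys.stdin.readlines()):
        print(row)
```

In theory that would print company, title and location tab-separated, one job per line.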
On Wed, Jun 19, 2013 at 7:54 PM, Sunita Arvind <[EMAIL PROTECTED]> wrote:

> Finally I could get it to work. The issue resolves once I remove the arrays
> within the position structure, so that is a limitation of the serde. I
> changed 'industries' to STRING and 'jobFunctions' to MAP<STRING,STRING>, and
> I can query the table just fine now. Here is the complete DDL for reference:
>
> create external table linkedin_Jobsearch (
>   jobs STRUCT<
>     values : ARRAY<STRUCT<
>       company : STRUCT<
>         id : STRING,
>         name : STRING>,
>       postingDate : STRUCT<
>         year : STRING,
>         day : STRING,
>         month : STRING>,
>       descriptionSnippet : STRING,
>       expirationDate : STRUCT<
>         year : STRING,
>         day : STRING,
>         month : STRING>,
>       position : STRUCT<
>         jobFunctions : MAP<STRING,STRING>,  -- these were arrays of structs in my previous attempts
>         industries : STRING,
>         title : STRING,
>         jobType : STRUCT<
>           code : STRING,
>           name : STRING>,
>         experienceLevel : STRUCT<
>           code : STRING,
>           name : STRING>>,
>       id : STRING,
>       customerJobCode : STRING,
>       skillsAndExperience : STRING,
>       salary : STRING,
>       jobPoster : STRUCT<
>         id : STRING,
>         firstName : STRING,
>         lastName : STRING,
>         headline : STRING>,
>       referralBonus : STRING,
>       locationDescription : STRING>>>
> )
> ROW FORMAT SERDE 'com.cloudera.hive.serde.JSONSerDe'
> LOCATION '/user/sunita/tables/jobs';
>
> Thanks Stephen for sharing your thoughts. It helped.
>
> Also, if someone / Stephen could help me display this information in a
> useful manner, that would be great. Right now all the values show up as
> arrays. Here is what I mean, for a query like this:
> hive> select jobs.values.company.name, jobs.values.position.title,
> jobs.values.locationdescription from linkedin_jobsearch;
>
> This is the output:
>
> ["CyberCoders","CyberCoders","CyberCoders","Management Science
> Associates","Google","Google","CyberCoders","CyberCoders","HP","Sigmaways","Global
> Data Consultancy","Global Data
> Consultancy","CyberCoders","CyberCoders","CyberCoders","VMware","CD IT
> Recruitment","CD IT Recruitment","Digital Reasoning Systems","AOL"]
> ["Software Engineer-Hadoop, HDFS, HBase, Pig- Vertica Analytics","Software
> Engineer-Hadoop, HDFS, HBase, Pig- Vertica Analytics","Software
> Engineer-Hadoop, HDFS, HBase, Pig- Vertica Analytics","Data
> Architect","Systems Engineer, Site Reliability Engineering","Systems
> Engineer, Site Reliability Engineering","NoSQL Engineer - MongoDB for big
> data, web crawling - RELO OFFER","NoSQL Engineer - MongoDB for big data,
> web crawling - RELO OFFER","Hadoop Database Administrator Medicare","Hadoop
> / Big Data Consultant","Lead Hadoop developer","Head of Big Data -
> Hadoop","Hadoop Engineer - Hadoop, Operations, Linux Admin, Java,
> Storage","Sr. Hadoop Administrator - Hadoop, MapReduce, HDFS","Sr. Hadoop
> Administrator - Hadoop, MapReduce, HDFS","Software Engineer - Big
> Data","Hadoop Team Lead Consultant - Global Leader in Big Data
> solutions","Hadoop Administrator Consultant - Global Leader in Big Data
> solutions","Java Developer","Sr.Software Engineer-Big Data-Hadoop"]
> ["Pittsburgh, PA","Pittsburgh, PA","Harrisburg, PA","Pittsburgh, PA
> (Shadyside area near Bakery Square)","Pittsburgh, PA, USA","Pittsburgh,
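btw, for one row per job instead of those parallel arrays you could try
exploding the values array with a lateral view. An untested sketch against
your DDL, so the exact syntax (backticks around `values`, etc.) may need
tweaking:

```sql
SELECT j.company.name, j.position.title, j.locationDescription
FROM linkedin_jobsearch
LATERAL VIEW explode(jobs.`values`) v AS j;
```

That should give you one scalar value per column per row rather than one
big array per column.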
Sunita Arvind 2013-06-21, 08:42