Hi, I have two questions about Hive's behavior when using 'add jar'.
I am testing my own implementation of a Hive InputFormat and SerDe, packaged in a jar, on my single-machine cluster running in pseudo-distributed mode. The jar includes a properties file at its top level.
In my custom code, I try to load the properties file by calling this.getClass().getResourceAsStream("my.properties").
I am sure that my.properties exists in my jar file, but at runtime the call returns null. I am not sure why. Does anyone have an idea?
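One possible cause, offered as a guess and not a confirmed diagnosis: Class.getResourceAsStream resolves a name without a leading "/" relative to the package of the class, so if the custom class lives in a package, "my.properties" is looked up inside that package directory and not at the top level of the jar. The sketch below tries the absolute name first and then falls back to the thread context class loader. The class name PropertiesLookup and its helper methods are hypothetical, and whether the fallback is needed under 'add jar' is an assumption.

```java
import java.io.IOException;
import java.io.InputStream;
import java.io.UncheckedIOException;
import java.util.Properties;

// Hypothetical helper, not part of Hive or of the original code.
public class PropertiesLookup {

    // Turn a resource name into an absolute one ("/my.properties"),
    // so Class.getResourceAsStream searches from the classpath root.
    public static String toAbsolute(String name) {
        return name.startsWith("/") ? name : "/" + name;
    }

    // Returns the loaded properties, or null if the resource is not found.
    public static Properties load(Class<?> owner, String name) {
        String absolute = toAbsolute(name);

        // First attempt: absolute lookup through the owner's class loader.
        InputStream in = owner.getResourceAsStream(absolute);

        // Assumed fallback: the thread context class loader, in case the jar
        // was attached to a different loader. ClassLoader lookups take the
        // name without the leading slash.
        if (in == null) {
            ClassLoader cl = Thread.currentThread().getContextClassLoader();
            if (cl != null) {
                in = cl.getResourceAsStream(absolute.substring(1));
            }
        }
        if (in == null) {
            return null;
        }

        try (InputStream stream = in) {
            Properties props = new Properties();
            props.load(stream);
            return props;
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }
}
```

In the custom InputFormat or SerDe this would be called as PropertiesLookup.load(this.getClass(), "my.properties"), with a null check on the result.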
My second question: when my test data is small, below the threshold set by hive.exec.mode.local.auto.inputbytes.max (I am not sure I typed that correctly), Hive runs my query locally. In that case the query fails with a class-not-found error for my custom class (such as the InputFormat), even though I ran 'add jar xxxx.jar' in my session. If the test data is big, the query runs on the cluster without any problem and finds the class in my jar. Is this normal? Why can't Hive in local mode find a class in a jar that has already been added?
My environment is CDH3U5, which includes Hive 0.7.1.
Mark Grover 2012-12-22, 00:20
java8964 java8964 2012-12-22, 01:59