Sqoop >> mail # user >> Null values and escaping


Re: Null values and escaping
Ah, now I understand. I think you can get the desired effect of using '\N'
by providing the arguments --null-string '\\N' and --null-non-string '\\N'
without setting "escaped-by". Take a look at the note at the bottom of this
page from the "Apache Sqoop Cookbook":
http://books.google.com/books?id=bxBnjitgIAYC&pg=PT36&lpg=PT36&dq=sqoop+escape+by+null-string&source=bl&ots=JKuOI3l5Px&sig=WlDF4aWA_kTbM9lbgLmYXHUK6Uo&hl=en&sa=X&ei=0OPIUuzHLNHkoATMyYG4AQ&ved=0CEAQ6AEwAg#v=onepage&q=sqoop%20escape%20by%20null-string&f=false

-Abe
On Sat, Jan 4, 2014 at 6:06 PM, redshift-etl-user <[EMAIL PROTECTED]> wrote:

> Abe,
>
> Sure - the problem is that whatever I specify as the null-string, say X, I
> don't see how I can distinguish between that X and an actual string X in
> the resulting text file. Any ideas?
>
> Thanks.
> On Jan 4, 2014 1:47 AM, "Abraham Elmahrek" <[EMAIL PROTECTED]> wrote:
>
>> Hey There,
>>
>> Have you tried the --null-string option? See
>> http://sqoop.apache.org/docs/1.4.4/SqoopUserGuide.html#idp3491496 for
>> more details. It should change null string values to whatever string you
>> specify.
>>
>> -Abe
>>
>>
>> On Fri, Jan 3, 2014 at 4:34 AM, redshift-etl-user <[EMAIL PROTECTED]> wrote:
>>
>>> I'm importing from a DB into a text file, and I need to distinguish
>>> between null and non-null strings. Is there a combination of parameters
>>> (i.e. escaped-by, enclosed-by, and null-string) that yields unambiguous
>>> output strings? With the default options "null-string" is "null", and so
>>> there's no way of distinguishing between a null string and the string
>>> "null" in the output file.
>>>
>>> One solution to this would be to avoid escaping the specified null
>>> string. That way we could specify "escaped-by" as "\" and "null-string" as
>>> "\N" and get "\N" in the output as opposed to "\\N" for null strings. That
>>> way it's guaranteed to be different from any non-null string.
>>>
>>> In the generated code's toString() method this would mean changing from
>>>
>>> __sb.append(FieldFormatter.escapeAndEnclose(STRING==null?"\\N":STRING,
>>> delimiters));
>>>
>>> to
>>>
>>> __sb.append(STRING==null?"\\N":FieldFormatter.escapeAndEnclose(STRING,
>>> delimiters));
>>>
>>> Thoughts? Any ideas for a workaround?
>>>
>>>  Thanks!
>>>
>>
>>
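
The ambiguity described in the original message, and the effect of the proposed reordering, can be illustrated with a small standalone sketch. This is not Sqoop code: the escape() helper below is a simplified, hypothetical stand-in for FieldFormatter.escapeAndEnclose that only doubles the escape character and escapes a comma delimiter, and the class and method names are made up for illustration. It assumes "escaped-by" is a backslash and the null string is backslash-N.

```java
public class NullEscapeSketch {
    // Assumed settings for this sketch: backslash as the escape character,
    // comma as the field delimiter, and backslash-N as the null token.
    static final char ESCAPE = '\\';
    static final char DELIM = ',';
    static final String NULL_TOKEN = "\\N"; // two characters: backslash, N

    // Hypothetical, simplified stand-in for FieldFormatter.escapeAndEnclose:
    // prefixes every escape character and delimiter with the escape character.
    static String escape(String s) {
        StringBuilder sb = new StringBuilder();
        for (char c : s.toCharArray()) {
            if (c == ESCAPE || c == DELIM) {
                sb.append(ESCAPE);
            }
            sb.append(c);
        }
        return sb.toString();
    }

    // Order described as the current generated code in the thread:
    // substitute the null token first, then escape the result.
    static String substituteThenEscape(String value) {
        return escape(value == null ? NULL_TOKEN : value);
    }

    // Order proposed in the thread: emit the null token unescaped,
    // and escape only real (non-null) values.
    static String escapeNonNullOnly(String value) {
        return value == null ? NULL_TOKEN : escape(value);
    }

    public static void main(String[] args) {
        String literal = "\\N"; // a real, non-null string: backslash, N

        // Both lines print \\N, so null and the literal are indistinguishable.
        System.out.println(substituteThenEscape(null));
        System.out.println(substituteThenEscape(literal));

        // Prints \N for null and \\N for the literal, so they differ.
        System.out.println(escapeNonNullOnly(null));
        System.out.println(escapeNonNullOnly(literal));
    }
}
```

With the second ordering, a lone backslash followed by N can only come from a null, because every backslash in a non-null value is doubled by the escaping step.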