Home | About | Sematext search-lucene.com search-hadoop.com
 Search Hadoop and all its subprojects:

Switch to Plain View
Sqoop, mail # dev - Review Request: Securing passwords in sqoop 1.x


+
Seetharam Venkatesh 2013-03-06, 07:23
+
Seetharam Venkatesh 2013-03-06, 07:24
+
Jarek Cecho 2013-03-10, 00:18
+
Seetharam Venkatesh 2013-03-10, 07:02
+
Jarek Cecho 2013-03-10, 20:41
+
Seetharam Venkatesh 2013-03-11, 07:00
Copy link to this message
-
Re: Review Request: Securing passwords in sqoop 1.x
Jarek Cecho 2013-03-12, 04:30


> On March 10, 2013, 12:18 a.m., Jarek Cecho wrote:
> > Hi Venkatesh,
> > thank you very much for working on this and please accept my late review. The changes looks good me the except few nits highlighted below.
> >
> > I do have just one bigger concern regarding saved jobs. It seems to me that the code will load the password from file and store it inside the metastore, which might be seen as a security hole. What about explicitly not saving the password in metastore and instead loading it from the file each subsequent execution?
>
> Seetharam Venkatesh wrote:
>     Jarek, Thanks for your time reviewing this.
>    
>     I'm not sure about your suspicion on storing clear text passwords in the metastore. It isn't. How are you arriving at that?
>     I'm only reading password from a file which should be under user'e home directory with 400 permissions which implies no one except the user running sqoop will have access to.
>    
>     Makes sense?
>
> Jarek Cecho wrote:
>     Hi Venkatesh,
>     you're welcome. I'm glad to help get this in!
>    
>     I believe that I do understand proposed functionality correctly. I'm trying to look for potential corner cases and one possible issue that I was able to think of is with saved jobs. Imagine following example (real one):
>    
>     $ sqoop job --create jarcec -- import --connect jdbc:mysql://mysql/clusterdb --username sqoop --passwordPath sqoop-password --table texts
>     $ sqoop job --show jarcec | grep password
>     db.password = secret-password
>    
>     It's important to mention that this will work only with sqoop.metastore.client.record.password set to true. Nevertheless, I believe that the passwords shouldn't be serialized into the metastore in case that --password-file is enabled in any case. What do you think?
>
> Seetharam Venkatesh wrote:
>     Metastore stores sqoop tool template? If so, then this would be insecure. Do folks use this? Two options come to mind:
>    
>     * If metastore is widely used, disable storing passwords in metastore if file is specified
>     * If not, print a warning when metastore is used
>    
>     Lemme know what you think?
>

I've seen people using the Sqoop 1 metastore, even though it's probably not very common. Nevertheless, I would prefer to secure that as it's potential security hole.

I think that solution to this is quite simple. Entire metastore serialization is happening in SqoopConfiguration.writeProperties() and the password is already handled in special if-else statements on rows 632-644. I think that we just need to add new branch to that part of the code. And similarly to the SqoopConfiguration.loadProperties that will deserialize the properties from the metastore.
> On March 10, 2013, 12:18 a.m., Jarek Cecho wrote:
> > src/java/org/apache/sqoop/mapreduce/db/DBConfiguration.java, lines 71-72
> > <https://reviews.apache.org/r/9771/diff/1/?file=267024#file267024line71>
> >
> >     Can we put also login and entire JDBC url into the credentials object?
>
> Seetharam Venkatesh wrote:
>     We could but KISS. Only passwords are secure credentials but not sure why the entire JDBC URI should be in there? Appreciate if you could elaborate. Thanks!
>
> Jarek Cecho wrote:
>     I personally see this functionality more as securing Sqoop rather than just scratching the password. Thinking from the attacker's point of view having valid username and valid database URL is something that will significantly help me to compromise remote system. For example, knowing login will allow me to attack one specific account on the remote database box, whereas otherwise it's a tuple username-password that I need to guess. Similarly for the URL - it's exposing where the database is physically running which is information that can be abused.
>
> Seetharam Venkatesh wrote:
>     Well, it depends. I'm no expert but IMHO JDBC URI's are always public, there is nothing secretive about it. Its also largely convention over configuration. Agree? May be once we get this base functionality in, we can have another jira to actually secure the entire URI along with username and password.

I see what you're saying. Nor URL nor login are super secrets, but my point is that they are exposing information that might help possible attacker. I'm fine with creating follow up jira for this.

Och, please accept my apologies for the confusion. The code functionality is definitely decoupled. I was talking here only from the documentation perspective. E.g. to describe both features independently on each other.
- Jarek
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/9771/#review17652
On March 6, 2013, 7:24 a.m., Seetharam Venkatesh wrote:
+
Seetharam Venkatesh 2013-03-13, 21:41
+
Seetharam Venkatesh 2013-03-13, 21:41
+
Jarek Cecho 2013-03-14, 04:30
+
Seetharam Venkatesh 2013-03-14, 08:24
+
Jarek Cecho 2013-03-17, 19:37
+
Seetharam Venkatesh 2013-03-19, 08:30
+
Jarek Cecho 2013-03-20, 19:21