After reviewing the class's (not very complicated) code, I have some
questions I hope someone can answer:
- (more general question) Are there many use cases for
DBInputFormat? Do most Hadoop jobs take their input from files or DBs?
- What happens when the database is updated during the mappers' data
retrieval phase? Is there a way to lock the database before the data
retrieval phase and release it afterwards?
- Since all mappers open a connection to the same database server, one
cannot use hundreds of mappers. Is there a solution to this problem?