What is the purpose of the function org.apache.hadoop.mapreduce.Mapper.run () in Hadoop?

advertisements

What is the purpose of the org.apache.hadoop.mapreduce.Mapper.run() function in Hadoop? The setup() is called before calling the map() and the clean() is called after the map(). The documentation for the run() says

Expert users can override this method for more complete control over the execution of the Mapper.

I am looking for the practical purpose of this function.


The default run() method simply takes each key / value pair supplied by the context and calls the map() method:

public void run(Context context) throws IOException, InterruptedException {
    setup(context);
    while (context.nextKeyValue()) {
       map(context.getCurrentKey(), context.getCurrentValue(), context);
    }
    cleanup(context);
}

If you wanted to do more than that ... you'd need to override it. For example, the MultithreadedMapper class