There are various methods to control the jobs running on a data node:
–> Stop the TaskTracker on the node. MapReduce tasks will then no longer be scheduled on that node. However, data can still be fetched from the DataNode, so we need to stop the DataNode as well.
–> By using the capacity scheduler, we can limit the share of the cluster that a particular job can use. However, the scheduler will still allocate free resources beyond that capacity for better utilization of the cluster.
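As a rough illustration of the capacity scheduler point above, the fragment below sketches a capacity-scheduler.xml for a Hadoop 1.x (MRv1) cluster. The queue name "limited" and the percentages are hypothetical. The sketch also assumes that the capacity scheduler is already enabled in mapred-site.xml and that both queues are listed in mapred.queue.names. The maximum-capacity setting is what caps how far a queue can grow into free resources beyond its guaranteed capacity.

```xml
<configuration>
  <!-- Hypothetical queue "limited": guaranteed 30% of the cluster's slots -->
  <property>
    <name>mapred.capacity-scheduler.queue.limited.capacity</name>
    <value>30</value>
  </property>
  <!-- Upper bound: the queue may borrow idle slots, but never more than 50% -->
  <property>
    <name>mapred.capacity-scheduler.queue.limited.maximum-capacity</name>
    <value>50</value>
  </property>
  <!-- The remaining guaranteed capacity goes to the default queue -->
  <property>
    <name>mapred.capacity-scheduler.queue.default.capacity</name>
    <value>70</value>
  </property>
</configuration>
```

Under these assumptions, a job would be directed to that queue by setting mapred.job.queue.name to limited when it is submitted. Property names differ between Hadoop versions (YARN-based releases use yarn.scheduler.capacity.* instead), so check the documentation for the release in use.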
Trash is very similar to the Recycle Bin. It is very useful for Hadoop users in case of accidental deletion of files and directories. If trash is enabled and a file is deleted, the file is moved to the trash directory instead of being deleted. Files in the .Trash directory are removed permanently after a user-configurable time interval; until that interval expires, a deleted file can be restored by moving it back to its original location.
How to configure Trash in Hadoop
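Steps to configure trash in Hadoop: First stop the cluster, then add the following property to core-site.xml:
    <property>
        <name>fs.trash.interval</name>
        <value>30</value>
    </property>
The value is given in minutes, so deleted files are kept for 30 minutes here; a value of 0 disables trash. Once the cluster is restarted with this property set, Hadoop creates the .Trash directory automatically under the user's HDFS home directory.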