(83)What is a daemon?
A daemon is a service or process that runs in the background. You can verify which Hadoop daemons are running with the `jps` command.
(84)Is Namenode machine same as datanode machine as in terms of hardware?
It depends upon the cluster. In a single-node cluster configuration, both run on the same machine, but in development and testing environments they run on different machines.
(85)What is a heartbeat in HDFS?
A heartbeat is a signal indicating that a node is alive. A datanode sends heartbeats to the namenode, and similarly a task tracker sends heartbeats to the job tracker. If at any point the namenode fails to receive a heartbeat, it indicates a problem with the datanode; likewise, a missing heartbeat tells the job tracker that the task tracker can no longer perform its tasks.
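The namenode-side bookkeeping can be sketched as a timeout check over the last heartbeat seen from each datanode. This is an illustrative simulation, not Hadoop's actual code: the class and the "silent for 10 intervals means dead" rule are assumptions made for the sketch (the 3-second heartbeat interval matches the classic HDFS default, but the real dead-node timeout is configured separately).

```python
import time

# Hypothetical sketch of a namenode-side heartbeat monitor.
# The names and the 10x-interval dead-node rule are illustrative
# assumptions, not Hadoop internals.
HEARTBEAT_INTERVAL = 3.0                    # datanodes report every 3 s
DEAD_NODE_TIMEOUT = 10 * HEARTBEAT_INTERVAL

class HeartbeatMonitor:
    """Track the last heartbeat time per datanode."""
    def __init__(self):
        self.last_seen = {}

    def receive_heartbeat(self, node_id, now=None):
        self.last_seen[node_id] = now if now is not None else time.time()

    def dead_nodes(self, now=None):
        now = now if now is not None else time.time()
        return [n for n, t in self.last_seen.items()
                if now - t > DEAD_NODE_TIMEOUT]

monitor = HeartbeatMonitor()
monitor.receive_heartbeat("datanode-1", now=0.0)
monitor.receive_heartbeat("datanode-2", now=25.0)
# At t=40 s, datanode-1 has been silent for 40 s (> 30 s timeout).
print(monitor.dead_nodes(now=40.0))  # ['datanode-1']
```

The same pattern applies between task trackers and the job tracker: the receiver only ever checks elapsed time since the last signal.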
(86)Are Namenode and job tracker on the same host?
It is possible to run all daemons on a single machine but in production environment namenode and jobtracker run on different hosts.
(87)What is a ‘block’ in HDFS?
It refers to the minimum amount of data that can be read or written. The default block size in Hadoop is 64 MB, compared to 8192 bytes in Unix/Linux. In HDFS, files are broken into block-sized chunks, which are stored as independent units.
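As a quick illustration of the chunking, a file is split into block-sized pieces, with only the last piece allowed to be smaller than the block size. The helper below is a sketch for this answer, not a Hadoop API:

```python
BLOCK_SIZE = 64 * 1024 * 1024  # 64 MB, the classic HDFS default

def split_into_blocks(file_size, block_size=BLOCK_SIZE):
    """Return the sizes of the block-sized chunks a file occupies."""
    full, rest = divmod(file_size, block_size)
    return [block_size] * full + ([rest] if rest else [])

# A 200 MB file becomes three full 64 MB blocks plus one 8 MB block.
mb = 1024 * 1024
print([b // mb for b in split_into_blocks(200 * mb)])  # [64, 64, 64, 8]
```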
(89)If we want to copy 10 blocks from one machine to another, but another machine can copy only 8.5 blocks, can the blocks be broken at the time of replication?
Blocks cannot be broken down at the time of replication; they are always copied as whole units from one machine to another. The master node will figure out the actual storage space required and place only whole blocks where they fit.
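Since blocks are copied whole, a machine with room for "8.5 blocks" can accept only 8 of them; the fractional leftover space goes unused. A hedged sketch of that placement arithmetic (not Hadoop code):

```python
def blocks_that_fit(free_space, block_size):
    # Blocks are never split during replication, so only whole
    # blocks count toward capacity; leftover space is unused.
    return free_space // block_size

BLOCK = 64  # MB, classic default block size
print(blocks_that_fit(free_space=int(8.5 * BLOCK), block_size=BLOCK))  # 8
```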
(90)How indexing is done in HDFS?
Hadoop has its own indexing based on the block size. Once the data is stored, HDFS keeps, with the last part of the data, a pointer indicating where the next part of the data is located.
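Conceptually, this amounts to the namenode keeping an ordered list of block IDs per file, so knowing the current block tells a client which block the data continues in. A toy sketch of that idea; the path, block IDs, and function are invented for illustration:

```python
# Toy model of a namenode block map: file path -> ordered block IDs.
# All names here are hypothetical, not real HDFS identifiers.
block_map = {
    "/logs/app.log": ["blk_1001", "blk_1002", "blk_1003"],
}

def next_block(path, current):
    """Given the current block ID, return the ID of the block that follows."""
    blocks = block_map[path]
    i = blocks.index(current)
    return blocks[i + 1] if i + 1 < len(blocks) else None

print(next_block("/logs/app.log", "blk_1002"))  # blk_1003
print(next_block("/logs/app.log", "blk_1003"))  # None (end of file)
```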
(91)If a data Node is full how it’s identified?
When data is stored in a datanode, the metadata of that data is stored in the namenode, so the namenode can identify from its metadata when a datanode is full.
(92)If datanodes increase, then do we need to upgrade Namenode?
We do not need to upgrade the namenode, because it does not store the actual data, only the metadata, so that requirement arises rarely.
(93)Are job tracker and task trackers present in separate machines?
Yes, because the job tracker is a single point of failure, so in production it runs on a separate machine from the task trackers.
(79)What is streaming access?
Streaming access is one of the most important aspects of HDFS, as it relies on the principle of 'Write Once, Read Many'. HDFS does not focus much on the storage of individual records; rather, it focuses on retrieving data as sequential streams at the highest possible throughput.
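The 'Write Once, Read Many' pattern means data is written sequentially once and then read front-to-back in large chunks, never by random access. A minimal illustration in plain Python, with a local file standing in for an HDFS file:

```python
import os
import tempfile

CHUNK = 4096  # read in large sequential chunks, never seeking backwards

def stream_read(path, chunk_size=CHUNK):
    """Yield a file's contents sequentially, front to back."""
    with open(path, "rb") as f:
        while True:
            chunk = f.read(chunk_size)
            if not chunk:
                break
            yield chunk

# Write once...
path = os.path.join(tempfile.mkdtemp(), "records.dat")
with open(path, "wb") as f:
    f.write(b"x" * 10000)

# ...read many times, always as a sequential stream.
total = sum(len(c) for c in stream_read(path))
print(total)  # 10000
```

Reading this way lets the file system optimize for sustained throughput rather than seek latency, which is exactly the trade-off HDFS makes.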
Hadoop Dump Questions