Describe five projects in the big data ecosystem


The Apache Ambari project aims to make Hadoop management simpler by developing
software for provisioning, managing, and monitoring Hadoop clusters. It provides
an intuitive, easy-to-use Hadoop management web UI.

Provisioning a Hadoop Cluster. Ambari provides a step-by-step wizard for
installing Hadoop services across any number of hosts. It also handles the
configuration of the Hadoop services for the cluster.

Managing a Hadoop Cluster. Ambari acts as a central management point for
starting, stopping, and reconfiguring the Hadoop services across the cluster.



Flume is a distributed, reliable, and available service for efficiently
collecting, aggregating, and moving large amounts of data. Its architecture is
based on streaming data flows. A Flume agent receives events from an external
producer, such as a web server, through its source. The source passes the events
to the channel, which buffers them. The sink then takes the events from the
channel and delivers them to HDFS.
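
The source, channel, and sink flow described above can be modeled with a small
sketch. This is only a toy illustration in plain Python, not Flume's actual API:
the names Channel, run_source, run_sink, and hdfs_store are hypothetical, and a
Python list stands in for HDFS.

```python
from collections import deque


class Channel:
    """Toy stand-in for a Flume channel: a FIFO buffer between source and sink."""

    def __init__(self):
        self._events = deque()

    def put(self, event):
        self._events.append(event)

    def take(self):
        # Return the oldest buffered event, or None when the channel is empty.
        return self._events.popleft() if self._events else None


def run_source(log_lines, channel):
    """Source step: accept events (here, web server log lines) and buffer them."""
    for line in log_lines:
        channel.put(line)


def run_sink(channel, store):
    """Sink step: drain the channel and append each event to the store."""
    event = channel.take()
    while event is not None:
        store.append(event)
        event = channel.take()


channel = Channel()
hdfs_store = []  # hypothetical stand-in for files written to HDFS
run_source(["GET /index.html", "GET /about.html"], channel)
run_sink(channel, hdfs_store)
print(hdfs_store)
```

The channel decouples the two ends: the source can keep accepting events even if
the sink is temporarily slower, because events wait in the buffer until taken.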


Sqoop is another project in the big data ecosystem. The tool is designed for
transferring large amounts of data between Apache Hadoop and structured data
stores such as relational databases. It supports incremental imports: it can be
run repeatedly to import only the updates made to a database since the last
import. Data can also be imported directly into Hive and HBase. In the other
direction, an export moves data from Hadoop into a relational database.
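
The idea of importing only what changed since the last import can be sketched as
follows. This is a toy model in plain Python, not Sqoop itself: the function
incremental_import and the orders table are hypothetical, and the check_column
and last_value parameters only mirror the role of Sqoop's --check-column and
--last-value options in append mode.

```python
def incremental_import(table_rows, check_column, last_value):
    """Return rows whose check column exceeds last_value, plus the new high-water mark."""
    new_rows = [row for row in table_rows if row[check_column] > last_value]
    # If nothing new arrived, keep the previous high-water mark.
    new_last_value = max((row[check_column] for row in new_rows), default=last_value)
    return new_rows, new_last_value


# Hypothetical source table in a relational database.
orders = [{"id": 1, "item": "disk"}, {"id": 2, "item": "cable"}]
first_batch, last_seen = incremental_import(orders, "id", 0)

# A new row is added to the database after the first import.
orders.append({"id": 3, "item": "rack"})
second_batch, last_seen = incremental_import(orders, "id", last_seen)
print(second_batch)
```

The second run picks up only the row added after the first import, because the
saved value of the check column marks where the previous import stopped.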



Another project in the big data ecosystem is Apache Zeppelin. It is open-source
software developed under the Apache Software Foundation, and it provides a
web-based notebook interface. An Apache Zeppelin notebook covers data ingestion,
data discovery, data analytics, data visualization, and collaboration.



Apache Kafka is a distributed streaming platform. It follows the
publish-subscribe model: applications can read and write streams of data, much
as they would with a messaging system.
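
The publish-subscribe idea can be sketched with a toy in-memory model. This is
not the Kafka client API: the Topic class and its publish and poll methods are
hypothetical, and the sketch assumes a single append-only log in which each
consumer tracks its own read position.

```python
class Topic:
    """Toy append-only log with a separate read offset per consumer."""

    def __init__(self):
        self._log = []
        self._offsets = {}

    def publish(self, message):
        # Append the message and return its position in the log.
        self._log.append(message)
        return len(self._log) - 1

    def poll(self, consumer):
        # Return every message this consumer has not yet seen, then advance its offset.
        start = self._offsets.get(consumer, 0)
        messages = self._log[start:]
        self._offsets[consumer] = len(self._log)
        return messages


clicks = Topic()
clicks.publish("page=/home")
clicks.publish("page=/cart")

billing_first = clicks.poll("billing")
clicks.publish("page=/checkout")
billing_second = clicks.poll("billing")
audit_all = clicks.poll("audit")
print(billing_second, audit_all)
```

Because messages stay in the log after being read, a consumer that subscribes
later (here "audit") still receives the whole stream, while "billing" only gets
what arrived since its last poll.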

Kafka can also be used to write scalable stream processing applications that
react to events in real time. Its Kafka Streams component is a client library
for building such applications and microservices.

