Big Data in simple terms is a combination of structured and unstructured business data. Big Data deals with current day to day transactional data of the business, which is very complex in nature. Big Data is named one among the finest artificial intelligence tools around the global market, since its inception. However, Big Data had its own limitations in terms of storage, size, analysis, searching, sharing and presentation of data to business users.
A traditional enterprise approach that consists of a server, database and user was launched by end users. But, the database server had a bottleneck of processing huge chunks of data, under a single processor. To overcome this limitation, Google has introduced a Map Reduce Algorithm, which can process the data among a set of distributed systems. This algorithm and Big Data were later transformed into an Open Source Java framework called Hadoop by Doug Cutting and his Team. Hadoop is distributed by multiple vendors across the globe, depending on their business needs. This article intends to shed some light on Big Data technologies namely Hive and Hue.
Most of the operations in the Hadoop ecosystem are operated through command line interface but there wasn’t any user interface designed during the initial releases of Hadoop. Hue is a web user interface that performs some of the common activities with Hadoop ecosystem or Hadoop based frameworks. Hue was launched and developed by an open source Hadoop framework called Cloudera.
Hive was launched by Facebook, during the initial stages of development and later it was taken over by Apache Software Foundation. This Apache project on Hive has embedded it into the Hadoop Ecosystem. Hive was designed to interact with data stored in HDFS (Hadoop Distribution File System). Hive is similar to SQL like query language. Hive is basically, used to query and retrieve the data from HDFS. This kind of query language using Hive is known as HiveQL or HQL.
Head to Head Comparison Between Hive vs Hue (Infographics)
Below is the Top 6 Comparision Between Hive vs HUE
Key Differences between Hive vs Hue
Hue is a web user interface that provides a number of services across the Cloudera based Hadoop framework. Some of the key features include HDFS file browser, Pig editor, Hive editor, Job browser, Hadoop shell, User admin permissions, Impala editor, Ozzie web interface and Hadoop API Access. But, Hive is an analytic SQL query language that can query or manipulate the data stored in a database. Some of the key features of Hive include Map-Reduce algorithm, OLAP (online analytical processing), creating schemas on databases, performing DML & DDL operations such as CREATE, ALTER, INSERT, SELECT, UPDATE, DELETE, DROP statements on HDFS.
Hue provides a web user interface along with the file path to browse HDFS. This web UI layout helps the users to browse the files, similar to that of an average windows user locating his files on his machine. This additional feature in Hue, also helps users manually upload or move files across different directories over web UI. Files stored on the HDFS can be accessed using file browser option on Hue. Hue can be a handy tool for users who don’t prefer UNIX command line interface. But, Hive is utilized to create schemas, databases to query the database. The DML & DDL statements in Hive (CREATE, ALTER, INSERT, SELECT, UPDATE, DELETE, DROP) helps users to analyze the data stored on HDFS as per business requirements. Hive can manually process and upload the data from Text files to tables. But it cannot move the files across different directories.
Hue provides a user interface to track down job status of the map reduce jobs. These jobs can be browsed through the job browser option on the web UI. Job status on hue is represented in the form of color coding (red, green, yellow and black). Green-Successful completed jobs, Yellow – Currently running jobs, Red – failed jobs and Black – Jobs abandoned by the user manually. But, Hive, on the other hand, utilizes Map-Reduce algorithm to process the data stored on HDFS. Hive can be operated either using command line interface or web editors like Hue. Hive is usually utilized to analyze complex unstructured data. This type of analytical operations performed using Hive are scheduled as Map Reduce jobs in Hadoop ecosystem.
Hue provides a web user interface to programming languages like Hive, which can be a handy tool for users to avoid syntax errors while executing queries. Hue also returns the result set and logs after the successful query execution. Hue also provides users to analyze the data in the form of charts (pie and bar charts). Hive editor can be accessed via query editors’ option on Hue. But, Hive without hue cannot be accessed over a web editor. Visualizations cannot be created using Hive. Hive only displays the result set at the command prompt level.
Hue allows users to create and configure file permissions on HDFS. The file permissions and user roles can be accessed via security option listed on the browser. Hue provides users to track down Ozzie workflows to process the jobs scheduled on job browser. Hue also allows users to browse and access tables and databases via metastore manager and database editors. But, Hive has secured with Kerberos 2.0 authentication along with Hadoop Cluster. The workflows scheduled using Ozzie cannot be tracked using Hive. All the data stored in the form of schemas and databases can also be viewed using HiveQL or Hive.
Hive vs Hue Comparision Table
Following is the Comparision Table between Hive and Hue are as follows
Basis of Comparison
Inventor / Invention Hive was launched by Apache Software Foundation. Hue was launched by Cloudera.
Scope/ Meaning Hive or HiveQL is an analytic query language used to process and retrieve data from a data warehouse. Hue is a Web UI that facilitates the users to interact with the Hadoop ecosystem.
Installation/ Configuration Hive can be installed or configured using command line Interface of a Hadoop Ecosystem. Hue can be installed or configured only using a web browser.
Functionality Hive uses map-reduce algorithm to process and analyze the data. Hue provides Web UI editor to access Hive and other programming languages.
Implementation Hive is Implemented and accessed using a command line interface or a web UI Interface. Hue is implemented on a web browser to access multiple programs installed on Cloudera.
Dependency Hive can be embedded across multiple Hadoop Frameworks. Hue is only available on Cloudera Based Hadoop Framework.
Conclusion – Hive vs Hue
In conclusion, we have covered the introduction, key differences and few comparisons on big data technologies Hive & Hue. We also have seen some of the similarities in Hive, which are also present in SQL query language. Hue is a one-stop web UI application that has all the services across the Hadoop big data ecosystem. Hive and Hue both can be utilized and configured in the Hadoop based frameworks depending on the end user requirements. There is a lot of information available over the web along with pre-configured Hadoop virtual machines to get a brief idea of Hive & Hue implementation. Both Hive and Hue have a key role to play in modern-day Big Data analytics.