What’s the Scoop on Hadoop?


Hadoop is an open-source software framework designed to process large amounts of data across clusters of computers using relatively simple programming models. It scales from a single server to thousands, with each machine providing local computation and storage.  One major advantage of this design is that, instead of relying on hardware redundancy for reliability, Hadoop detects and handles failures at the application layer, so the overall solution stays available without requiring fault-tolerant hardware in each individual server.  The system is presently gaining popularity with IT managers in the form of Hadoop appliances. A Hadoop Big Data appliance is built to perform deep data analysis outside the normal operating system and deliver those results to the operating system for further processing.
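The "relatively simple programming" Hadoop relies on is the MapReduce model: a map step that emits key/value pairs, a shuffle that groups them by key, and a reduce step that combines each group. Real Hadoop jobs are typically written against its Java MapReduce API and run in parallel across the cluster; the sketch below is a single-machine Python illustration of the flow only, using a classic word-count as the example.

```python
from collections import defaultdict

# Illustrative single-machine sketch of the map -> shuffle -> reduce
# flow that Hadoop distributes across a cluster. This is NOT Hadoop's
# actual API; it just shows the shape of the programming model.

def map_phase(record):
    # Emit one (key, value) pair per word in the input record.
    for word in record.split():
        yield (word.lower(), 1)

def shuffle(pairs):
    # Group values by key, as Hadoop does between map and reduce.
    groups = defaultdict(list)
    for key, value in pairs:
        groups[key].append(value)
    return groups

def reduce_phase(key, values):
    # Combine all values for one key into a final result.
    return (key, sum(values))

records = ["Hadoop scales out", "Hadoop tolerates failures"]
pairs = [p for r in records for p in map_phase(r)]
result = dict(reduce_phase(k, v) for k, v in shuffle(pairs).items())
# result["hadoop"] == 2
```

Because each map call touches only its own record and each reduce call only its own key, the work divides cleanly across however many servers the cluster has, which is what makes the model scale from one machine to thousands.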

The open, schema-free approach to storage offered by Hadoop allows large volumes of unstructured or partially-structured data – such as web server logs, image and video files, independent data sets, and email – to be housed, and then manipulated and transformed externally before being moved to a final internal storage location such as a data warehouse. By doing all of the heavy analysis of this data in raw form in Hadoop (versus the structured format of the operating system), expensive disk space and computing resources can be offloaded and reserved so the operating system can handle more of the direct processing responsibilities.
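Web server logs are a good example of the kind of raw-form transformation described above: each line is semi-structured text, and the job of the Hadoop layer is to boil it down to a compact summary that the warehouse actually loads. The sketch below runs on a few in-memory lines purely for illustration; in Hadoop, the same per-line logic would run in parallel across terabytes of log files. The log format shown is a common Apache-style layout, assumed here for the example.

```python
# Illustrative only: reduce raw web server log lines to a small
# summary (hits per HTTP status code) before loading a warehouse.
# In Hadoop this logic would run distributed across many nodes.

raw_logs = [
    '10.0.0.1 - - [12/Mar/2012:10:02:01] "GET /index.html HTTP/1.1" 200 1043',
    '10.0.0.2 - - [12/Mar/2012:10:02:03] "GET /missing HTTP/1.1" 404 233',
    '10.0.0.1 - - [12/Mar/2012:10:02:07] "POST /login HTTP/1.1" 200 512',
]

def status_counts(lines):
    counts = {}
    for line in lines:
        # The status code is the first field after the quoted request.
        status = line.split('"')[2].split()[0]
        counts[status] = counts.get(status, 0) + 1
    return counts

summary = status_counts(raw_logs)
# summary == {"200": 2, "404": 1}
```

Only the small summary table ever reaches the warehouse; the bulky raw lines stay in cheap Hadoop storage, which is exactly the disk- and compute-saving trade-off the paragraph describes.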

Hadoop is becoming popular in the form of Big Data appliances because they essentially offer plug-and-play processing capability to an operating system. By installing software connectors to the appliance, it becomes another rack of servers in the system architecture. However, because the appliance runs the open-source Hadoop software on commodity hardware, it is much less expensive than an equivalent amount of capacity (in terms of CPUs, storage space, nodes, etc.) configured to run in the operating system.  This is its real appeal for IT departments running on tight budgets and limited resources.

Hadoop does not have to run within an appliance; the software can be deployed on an independently designed and installed server system. However, with a limited number of true Hadoop experts in the industry, IT departments operating with limited resources, and the typical difficulty of sizing hardware appropriately for a software application, appliances make sense for most customers.  They are supported by the supplier and designed to perform to the specifications of the client, leaving the internal IT team free to focus on business-specific issues.

There are a couple of Hadoop appliances on the market today, offering anywhere from 4 to 18 nodes, with 12 cores, 48 GB RAM, and 28 to 36 TB of capacity per node.  Businesses interested in deploying such a system, however, should not feel restricted to these offerings. Experienced suppliers of server systems, such as NEI, can design and deliver a Hadoop appliance in any size and configuration to suit the needs of a client. NEI’s best-in-class approach to Big Data appliance design means it will work flawlessly with any operating system employed by the business, and integrate seamlessly into the existing architecture.
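When sizing a cluster from per-node figures like those above, note that raw disk capacity is not the same as usable capacity: Hadoop's file system (HDFS) stores each block multiple times for fault tolerance, three copies by default. The quick arithmetic below, which ignores file-system and OS overhead for simplicity, shows how the quoted appliance range translates into usable space.

```python
# Rough sizing arithmetic for the appliance range quoted above.
# HDFS keeps 3 replicas of each block by default, so usable space
# is roughly raw capacity / 3 (overheads ignored for simplicity).

def usable_tb(nodes, tb_per_node, replication=3):
    return nodes * tb_per_node / replication

small = usable_tb(4, 28)    # 4 nodes x 28 TB raw -> ~37 TB usable
large = usable_tb(18, 36)   # 18 nodes x 36 TB raw -> 216 TB usable
```

This is one reason right-sizing is harder than it looks, and why having the supplier design the appliance to the client's actual data volume, as described earlier, is valuable.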

Contact NEI online or by calling (877) 792-9099 to learn more about Hadoop and how NEI can scale a Hadoop cluster to meet your requirements.
