Concurrent Seeks to Lower Hadoop Adoption Barrier for SQL Coders

David Ramel, Application Development Trends Magazine
November 21, 2013
http://adtmag.com/articles/2013/11/21/concurrent.aspx

Concurrent Inc. this week released Cascading Lingual, an open source project designed to give developers a standards-based SQL interface for creating and working with Big Data applications on Apache Hadoop. It works with the company’s Cascading application framework used by Java developers for building Hadoop analytics and data management apps.

Concurrent also announced an upcoming upgrade of that framework, Cascading 2.5, which includes compatibility with the recent Hadoop 2 upgrade featuring long-awaited support for YARN.

Cascading Lingual joins the growing list of projects aimed at lowering the Hadoop adoption barrier by making Hadoop Big Data application development more accessible to SQL-savvy developers and integrating existing legacy systems with Hadoop applications. “Cascading Lingual enables virtually anyone familiar with SQL to instantly work with data stored on Hadoop using their JDBC-compliant BI [business intelligence] or desktop tool of choice,” the company said. “Enterprises benefit as they can execute on Big Data strategies using existing in-house resources, skills sets and product investments.”

Developers have the option of using standard Java JDBC interfaces to build apps or Concurrent’s own Cascading APIs that facilitate solutions built with ANSI-standard SQL and custom code written in Java, Scala or Clojure.
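
For readers who want to see what the standard JDBC route looks like, here is a minimal sketch using only the `java.sql` interfaces from the JDK. The `jdbc:lingual:<platform>` URL form and the `example.employees` table are assumptions made for illustration and are not taken from Concurrent's announcement; check the Lingual documentation for the actual connection settings.

```java
import java.sql.Connection;
import java.sql.DriverManager;
import java.sql.ResultSet;
import java.sql.SQLException;
import java.sql.Statement;

public class LingualJdbcSketch {

    // Assumption: Lingual JDBC URLs take the form "jdbc:lingual:<platform>",
    // for example "local" or "hadoop". Verify against the Lingual docs.
    static String lingualUrl(String platform) {
        return "jdbc:lingual:" + platform;
    }

    // Runs a query over any JDBC connection and returns the number of rows read.
    static int countRows(Connection connection, String sql) throws SQLException {
        int rows = 0;
        try (Statement statement = connection.createStatement();
             ResultSet results = statement.executeQuery(sql)) {
            while (results.next()) {
                rows++;
            }
        }
        return rows;
    }

    public static void main(String[] args) {
        // Hypothetical schema and table names, for illustration only.
        String sql = "SELECT \"name\" FROM \"example\".\"employees\"";
        try (Connection connection = DriverManager.getConnection(lingualUrl("local"))) {
            System.out.println("rows: " + countRows(connection, sql));
        } catch (SQLException e) {
            // Without a matching JDBC driver on the classpath, this branch runs.
            System.err.println("Could not query: " + e.getMessage());
        }
    }
}
```

Because `countRows` depends only on the `java.sql.Connection` interface, the same method would work against any JDBC-compliant source, which is the portability argument the company is making.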

Along with the standards-based SQL support and JDBC driver, Cascading Lingual features an interactive SQL Shell, a command-line interface for executing SQL commands against Hadoop. A Catalog command-line tool is provided for curating database tables that map to Hadoop files and other resources. Another key feature of Cascading Lingual is a Data Provider mechanism that allows a single SQL command to query multiple external data stores simultaneously, Concurrent said. Developers can just “cut and paste” existing SQL code to work with Hadoop data or migrate applications to Hadoop clusters.

The company said Cascading Lingual was created via collaboration between the developers of the Cascading Java API and developers of Optiq, a dynamic data management framework and SQL parser originally written by Julian Hyde, who also authored the Java-based Mondrian Online Analytical Processing (OLAP) engine.

The upcoming 2.5 release of the Cascading application framework, on which Lingual is based, gives developers access to the new capabilities of Hadoop 2. Released last month, Hadoop 2 supports YARN (sometimes referred to as “Yet Another Resource Negotiator”), a long-anticipated upgrade to the MapReduce batch job processing framework and a core component of Hadoop. Concurrent said Cascading users will now be able to seamlessly upgrade their applications to Hadoop 2. “Furthermore, Big Data applications using domain specific languages (DSLs), such as the widely used Scalding (Scala on Cascading), Cascalog (Clojure on Cascading) and PyCascading (Jython on Cascading) languages, will also seamlessly migrate to Hadoop 2,” the company said.

Concurrent said the framework is used by companies such as Twitter, eBay and Trulia “to streamline data processing, data filtering and workflow optimization for large volumes of unstructured and semi-structured data.”

Cascading 2.5 reportedly will be available for download soon under the open source Apache 2.0 License Agreement. Cascading Lingual is available for download now.