Announcing CloudBase: a data warehouse system built on top of Hadoop

There has been mention in this group (at least in nutch-dev, but I’m copying nutch-user as there may be interest there as well) of the desire for an easier way to get at the data stored by Nutch in the Hadoop HDFS. We have just released to open source a database abstraction layer that lets you query your HDFS data directly using ANSI SQL. No modification whatsoever is needed to your HDFS storage.


CloudBase is a data warehouse system built on top of Hadoop. It is developed by ( and is released to the open source community under the GNU General Public License 2.0.


Some of the salient features of CloudBase are:


1) Supports ANSI SQL as its query language

2) Provides a JDBC driver, so you can use any JDBC database manager application to connect to CloudBase

3) Allows you to push the results of queries into an RDBMS using that RDBMS's JDBC driver

4) Supports string and date/time functions as described in the JDBC specification

5) Supports regular expressions in the LIKE clause

6) Supports subqueries and views

7) Supports ORDER BY, GROUP BY, and HAVING clauses
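
To give a rough idea of what the JDBC and SQL support above could look like from client code, here is a minimal sketch in Java. It uses only the standard java.sql API. The JDBC URL, the table name (pages), and the column name (host) are placeholders made up for illustration, and the use of COUNT(*) is an assumption; the real connection string and schema depend on the CloudBase driver and on how your data is mapped to tables.

```java
import java.sql.Connection;
import java.sql.DriverManager;
import java.sql.ResultSet;
import java.sql.SQLException;
import java.sql.Statement;

public class CloudBaseQuerySketch {
    // Hypothetical JDBC URL: the real format is defined by the CloudBase driver.
    static final String HYPOTHETICAL_URL = "jdbc:cloudbase://localhost/nutch";

    // Builds an ANSI SQL query that uses GROUP BY, HAVING and ORDER BY.
    // The table and column names (pages, host) are invented for this example.
    static String hostCountQuery(int minPages) {
        return "SELECT host, COUNT(*) AS page_count "
             + "FROM pages "
             + "GROUP BY host "
             + "HAVING COUNT(*) >= " + minPages + " "
             + "ORDER BY page_count DESC";
    }

    // Runs the query over any JDBC connection, prints each row,
    // and returns the number of rows read.
    static int printHostCounts(Connection conn, int minPages) throws SQLException {
        int rows = 0;
        try (Statement stmt = conn.createStatement();
             ResultSet rs = stmt.executeQuery(hostCountQuery(minPages))) {
            while (rs.next()) {
                System.out.println(rs.getString("host") + "\t" + rs.getLong("page_count"));
                rows++;
            }
        }
        return rows;
    }

    public static void main(String[] args) {
        try (Connection conn = DriverManager.getConnection(HYPOTHETICAL_URL)) {
            printHostCounts(conn, 10);
        } catch (SQLException e) {
            // Reached unless a driver that accepts the placeholder URL is on the classpath.
            System.err.println("Could not query CloudBase: " + e.getMessage());
        }
    }
}
```

Because the query runs through a plain java.sql.Connection, the same method would work against any JDBC source, which is also how a JDBC database manager application would talk to CloudBase.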


CloudBase site:

CloudBase discussion group:


Read more about CloudBase here:




- leo