
Apache TLP + Incubator List, February 2018


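Lately I keep running into open source projects I had never heard of, only to find that they had been published under Apache for years. So I took the plunge and compiled a list of the Apache TLPs and Incubator projects.

See also "10 Little-Known, Recent Apache Projects Useful for IoT - February 2018", which summarizes the projects that caught my attention while compiling this list.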


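1. What Is Apache Software?

Open source software managed by the Apache Software Foundation is called Apache software. Its license is on the permissive side: derivative products can be used commercially without publishing their source, subject to certain conditions, so new software is added at an active pace. In the 1990s Apache was synonymous with the web server, but in recent years it has been strong in big-data software, starting with Apache Hadoop.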


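1.1. What Is the Incubator?

An open source project that applies to become Apache software and clears certain conditions is allowed to carry the Apache name, is registered in the Incubator as an experimental project, and receives support. (Such a project is also called a podling.)

Under certain conditions, an Incubator project is either discontinued or graduates. On graduation it becomes either a subproject of an existing Apache TLP or a new TLP.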


1.2. What Is a TLP?

A TLP (Top Level Project) has its own dedicated committee, called a PMC (Project Management Committee), and is run as an official Apache project.


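2. TLP List

According to Apache Projects, there are currently about 170 TLPs. Java-related projects dominated through the 2000s, while big-data projects have become the mainstream in the 2010s.

Some committees (TLPs) host multiple open source projects, and some TLPs host none at all. Note that Category is self-declared by each TLP, so many leave it unset and the level of granularity is not necessarily consistent.

The TLPs are listed below in descending order of the date they were established as a TLP. Among the projects from 2014 onward, several are currently in the spotlight.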

| Committee | Category | Description | Established |
|---|---|---|---|
| Apache Trafodion | big-data | webscale SQL-on-Hadoop solution enabling transactional or operational workloads. | 2017/12 |
| Apache Guacamole | network | providing performant, browser-based remote access | 2017/11 |
| Apache Impala | | a high-performance distributed SQL engine | 2017/11 |
| Apache Mnemonic | | a transparent nonvolatile hybrid memory oriented library for Big data, High-performance computing, and Analytics | 2017/11 |
| Apache Juneau | | a toolkit for marshalling POJOs to a wide variety of content types using a common framework, and for creating sophisticated self-documenting REST interfaces and microservices using VERY little code | 2017/10 |
| Apache Kibble | | an interactive project activity analyzer and aggregator | 2017/10 |
| Apache PredictionIO | big-data | a machine learning server built on top of state-of-the-art open source stack, that enables developers to manage and deploy production-ready predictive services for various kinds of machine learning tasks | 2017/10 |
| Apache DRAT | | large scale code license analysis, auditing and reporting | 2017/9 |
| Apache RocketMQ | | a fast, low latency, reliable, scalable, distributed, easy to use message-oriented middleware, especially for processing large amounts of streaming data | 2017/9 |
| Apache Royale | | improving developer productivity in creating applications for wherever Javascript runs (and other runtimes) | 2017/9 |
| Apache Fluo | | Storage and incremental processing of large data sets | 2017/7 |
| Apache MADlib | | Scalable, Big Data, SQL-driven machine learning framework for Data Scientists | 2017/7 |
| Apache Streams | | interoperability of online profiles and activity feeds | 2017/7 |
| Apache Atlas | | scalable and extensible set of core foundational governance services | 2017/6 |
| Apache Mynewt | | embedded OS optimized for networking and built for remote management of constrained devices | 2017/6 |
| Apache SystemML | | A machine learning platform optimal for big data | 2017/5 |
| Apache CarbonData | big-data | indexed columnar data format for fast analytics on big data platform | 2017/4 |
| Apache Fineract | | Platform for Digital Financial Services | 2017/4 |
| Apache Metron | | Real-time big data security | 2017/4 |
| Apache Ranger | | framework to enable, monitor and manage comprehensive data security across the Hadoop platform. | 2017/1 |
| Apache Beam | big-data | Programming model, SDKs, and runners for defining and executing data processing pipelines | 2016/12 |
| Apache Eagle | | open source analytics solution for identifying security and performance issues instantly on big data platforms | 2016/12 |
| Apache Geode | | Low latency, high concurrency data management solutions | 2016/11 |
| Apache Kudu | | A distributed columnar storage engine built for the Apache Hadoop ecosystem | 2016/7 |
| Apache Twill | | Use Apache Hadoop YARN's distributed capabilities with a programming model that is similar to running threads | 2016/6 |
| Apache Bahir | | Extensions to distributed analytic platforms such as Apache Spark | 2016/5 |
| Apache TinkerPop | | A graph computing framework for both graph databases (OLTP) and graph analytic systems (OLAP) | 2016/5 |
| Apache Zeppelin | big-data | A web-based notebook that enables interactive data analytics | 2016/5 |
| Apache Apex | big-data | Enterprise-grade unified stream and batch processing engine | 2016/4 |
| Apache AsterixDB | | open source Big Data Management System | 2016/4 |
| Apache Johnzon | | JSR-353 compliant JSON parsing; modules to help with JSR-353 as well as JSR-374 and JSR-367 | 2016/4 |
| Apache Sentry | | Fine grained authorization to data and metadata in Apache Hadoop | 2016/3 |
| Apache Arrow | | Powering Columnar In-Memory Analytics | 2016/1 |
| Apache Brooklyn | cloud | Framework for modeling, monitoring, and managing applications through autonomic blueprints | 2015/11 |
| Apache Groovy | library | A multi-faceted language for the Java platform | 2015/11 |
| Apache Kylin | | Extreme OLAP Engine for Big Data | 2015/11 |
| Apache REEF | big-data | Retainable Evaluator Execution Framework | 2015/11 |
| Apache Calcite | big-data, hadoop, sql | Dynamic data management framework | 2015/10 |
| Apache Yetus | build-management, library, testing | Collection of libraries and tools that enable contribution and release processes for software projects | 2015/9 |
| Apache Ignite | big-data, cloud, data-management-platform, database, distributed-sql-database, hadoop, iot, osgi, sql | High-performance, integrated and distributed in-memory platform for computing and transacting on large-scale data sets in real-time | 2015/8 |
| Apache Lens | big-data | Unified analytics platform | 2015/8 |
| Apache Serf | library | High performance C-based HTTP client library built upon the Apache Portable Runtime (APR) library | 2015/8 |
| Apache Usergrid | | The BaaS Framework you run | 2015/8 |
| Apache NiFi | | Easy to use, powerful, and reliable system to process and distribute data | 2015/7 |
| Apache Whimsy | content | Tools that help automate various administrative tasks or information lookup activities | 2015/5 |
| Apache ORC | big-data, database, hadoop, library | the smallest, fastest columnar storage for Hadoop workloads | 2015/4 |
| Apache Parquet | big-data | columnar storage format available to any project in the Apache Hadoop ecosystem | 2015/4 |
| Apache Aurora | | Mesos framework for long-running services and cron jobs | 2015/3 |
| Apache Polygene | library | community based effort exploring Composite Oriented Programming for domain centric application development | 2015/3 |
| Apache Samza | big-data | distributed stream processing framework | 2015/1 |
| Apache Falcon | big-data | Data management and processing platform. | 2014/12 |
| Apache Flink | big-data | platform for scalable batch and stream data processing | 2014/12 |
| Apache BookKeeper | big-data | Replicated log service which can be used to build replicated state machines | 2014/11 |
| Apache Drill | big-data | Schema-free SQL Query Engine for Apache Hadoop, NoSQL and Cloud Storage | 2014/11 |
| Apache MetaModel | big-data, database, library | common interface for discovery, exploration of metadata and querying of different types of data sources | 2014/11 |
| Apache Storm | big-data | Distributed, real-time computation system | 2014/9 |
| Apache Celix | network | Implementation of the OSGi specification adapted to C | 2014/7 |
| Apache Tez | big-data | High-performance and scalable distributed data processing framework | 2014/7 |
| Apache VXQuery | big-data, xml | A parallel XQuery processor | 2014/7 |
| Apache Phoenix | big-data, database | High performance relational database layer over Apache HBase for low latency applications | 2014/5 |
| Apache Allura | content | Forge software for hosting software projects | 2014/3 |
| Apache Olingo | library | OASIS OData protocol libraries | 2014/3 |
| Apache Tajo | big-data | Big data warehouse system on Apache Hadoop | 2014/3 |
| Apache Knox | big-data | Simplify and normalize the deployment and implementation of secure Hadoop clusters | 2014/2 |
| Apache Open Climate Workbench | content | Climate model evaluation | 2014/2 |
| Apache Spark | big-data | Fast and general engine for large-scale data processing | 2014/2 |
| Apache Helix | big-data, cloud | A cluster management framework for partitioned and replicated distributed resources | 2013/12 |
| Apache Ambari | big-data | Hadoop cluster management | 2013/11 |
| Apache Marmotta | | An Open Platform for Linked Data | 2013/11 |
| Apache Chukwa | | Open source data collection system for monitoring large distributed systems. | 2013/10 |
| Apache jclouds | cloud, library | Java cloud APIs and abstractions | 2013/10 |
| Apache Curator | database, library | Java libraries that make using Apache ZooKeeper easier | 2013/9 |
| Apache JSPWiki | content | Leading open source WikiWiki engine, feature-rich and built around standard J2EE components (Java, servlets, JSP). | 2013/7 |
| Apache Mesos | cloud | a cluster manager that provides efficient resource isolation and sharing across distributed applications | 2013/6 |
| Apache DeltaSpike | javaee | Portable CDI extensions that provide useful features for Java application developers | 2013/4 |
| Apache Bloodhound | build-management | Issue tracking, wiki and repository browser | 2013/3 |
| Apache CloudStack | cloud | Infrastructure as a Service solution | 2013/3 |
| Apache cTAKES | content | Natural language processing (NLP) tool for information extraction from electronic medical record clinical free-text | 2013/3 |
| Apache Clerezza | content, osgi | Semantically linked data for OSGi | 2013/2 |
| Apache Crunch | big-data, library | Simple and Efficient MapReduce Pipelines | 2013/2 |
| Apache Oltu | library | OAuth protocol implementation in Java | 2013/1 |
| Apache OpenMeetings | network | OpenMeetings: Web-Conferencing and real-time collaboration | 2013/1 |
| Apache Flex | web-framework | Application framework for expressive web applications that deploy to all major browsers, desktops and devices. | 2012/12 |
| Apache Kafka | big-data | Distributed publish-subscribe messaging system | 2012/11 |
| Apache Syncope | identity, security | Managing digital identities in enterprise environments | 2012/11 |
| Apache Cordova | library, mobile | Platform for building native mobile applications using HTML, CSS and JavaScript | 2012/10 |
| Apache Isis | web-framework | Framework for rapidly developing domain-driven apps in Java | 2012/10 |
| Apache OpenOffice | content | An open-source, office-document productivity suite | 2012/10 |
| Apache Airavata | big-data, cloud, network | Workflow and Computational Job Management Middleware | 2012/9 |
| Apache Bigtop | big-data | Apache Hadoop ecosystem integration and distribution project | 2012/9 |
| Apache SIS | library | Spatial Information System | 2012/9 |
| Apache Stanbol | content | Reusable components for semantic content management | 2012/9 |
| Apache Any23 | content | Anything to Triples | 2012/8 |
| Apache Lucene.Net | database | Search engine library targeted at .NET runtime users. | 2012/8 |
| Apache Oozie | big-data | A workflow scheduler system to manage Apache Hadoop jobs. | 2012/8 |
| Apache Steve | library | Apache's Python based single transferable vote software system | 2012/7 |
| Apache Flume | big-data | A reliable service for efficiently collecting, aggregating, and moving large amounts of log data | 2012/6 |
| Apache VCL | cloud | Virtual Computing Lab | 2012/6 |
| Apache Giraph | big-data | Iterative graph processing system built for high scalability | 2012/5 |
| Apache Hama | big-data | a Bulk Synchronous Parallel computing framework on top of Apache Hadoop | 2012/5 |
| Apache ManifoldCF | content | Framework for connecting source content repositories to target repositories or indexes. | 2012/5 |
| Apache Creadur | | Comprehension and auditing of software distributions | 2012/4 |
| Apache Jena | library | Java framework for building Semantic Web applications | 2012/4 |
| Apache Accumulo | database | Sorted, distributed key/value store | 2012/3 |
| Apache Lucy | database | Search engine library for dynamic languages | 2012/3 |
| Apache Sqoop | big-data | Bulk Data Transfer for Apache Hadoop and Structured Datastores | 2012/3 |
| Apache Bval | javaee, library | Apache BVal: JSR-303 Bean Validation Implementation and Extensions | 2012/2 |
| Apache OpenNLP | library | Machine learning based toolkit for the processing of natural language text | 2012/2 |
| Apache Empire-db | database | Relational Data Persistence | 2012/1 |
| Apache Gora | database | ORM framework for column stores such as Apache HBase and Apache Cassandra with a specific focus on Hadoop | 2012/1 |
| Apache JMeter | testing | Java performance and functional testing | 2011/10 |
| Apache Libcloud | cloud, library | Unified interface to the cloud | 2011/5 |
| Apache Chemistry | library | CMIS (Content Management Interoperability Services) Clients and Servers | 2011/2 |
| Apache River | javaee | Jini service oriented architecture | 2011/1 |
| Apache Aries | library | Enterprise OSGi application programming model | 2010/12 |
| Apache OODT | web-framework | Object Oriented Data Technology (middleware metadata) | 2010/11 |
| Apache ZooKeeper | database | Centralized service for maintaining configuration information | 2010/11 |
| Apache Thrift | http, library, network | Framework for scalable cross-language services development | 2010/10 |
| Apache Hive | database | Data warehouse infrastructure using the Apache Hadoop Database | 2010/9 |
| Apache Pig | database | Platform for analyzing large data sets | 2010/9 |
| Apache Shiro | library, web-framework | Powerful and easy-to-use application security framework | 2010/9 |
| Apache jUDDI | | Java implementation of the Universal Description, Discovery, and Integration specification | 2010/8 |
| Apache Karaf | osgi, network | Server-side OSGi distribution | 2010/6 |
| Apache Avro | big-data, library | A Serialization System | 2010/4 |
| Apache HBase | database | Apache Hadoop Database | 2010/4 |
| Apache Mahout | library | Scalable machine learning library | 2010/4 |
| Apache Nutch | web-framework | Open Source Web Search Software | 2010/4 |
| Apache Tika | library | Content Analysis and Detection Toolkit | 2010/4 |
| Apache Traffic Server | http | A fast, scalable and extensible HTTP/1.1 compliant caching proxy server | 2010/4 |
| Apache UIMA | | Framework and annotators for unstructured information analysis | 2010/3 |
| Apache Cassandra | database | Highly scalable second-generation distributed database | 2010/2 |
| Apache Subversion | build-management | Version Control | 2010/2 |
| Apache Axis | http, network, xml | Java SOAP Engine | 2009/12 |
| Apache OpenWebBeans | javaee | OpenWebBeans: JSR-299 Context and Dependency Injection for Java EE Platform Implementation | 2009/12 |
| Apache Pivot | library | Rich Internet applications in Java | 2009/12 |
| Apache Community Development | | Resources to help people become involved with Apache projects | 2009/11 |
| Apache PDFBox | content, library | Java library for working with PDF documents | 2009/10 |
| Apache Sling | | Web Framework for JCR Content Repositories | 2009/6 |
| Apache Camel | network, osgi | Spring based Integration Framework which implements the Enterprise Integration Patterns | 2008/12 |
| Apache Attic | | A home for dormant projects | 2008/11 |
| Apache Buildr | build-management | Simple and intuitive build system for Java applications | 2008/11 |
| Apache CouchDB | big-data, cloud, content, database, http, network | RESTful document database | 2008/11 |
| Apache Qpid | network | Multiple language implementation of the latest Advanced Message Queuing Protocol (AMQP) | 2008/11 |
| Apache CXF | library, network, xml | Service Framework | 2008/4 |
| Apache Archiva | build-management | Build Artifact Repository Manager | 2008/3 |
| Apache Hadoop | database | Distributed computing platform | 2008/1 |
| Apache Synapse | http, network, xml | Enterprise Service Bus and Mediation Framework | 2007/12 |
| Apache HttpComponents | http, library, network | Java toolset of low level HTTP components | 2007/11 |
| Apache ServiceMix | network, osgi, xml | Enterprise Service Bus | 2007/9 |
| Apache ODE | network, xml | Orchestration Director Engine: Business Process Management (BPM), Process Orchestration and Workflow through service composition. | 2007/7 |
| Apache Commons | http, library, network | Reusable Java components | 2007/6 |
| Apache Wicket | web-framework | Component-based Java Web Application Framework. | 2007/6 |
| Apache OpenJPA | database, javaee, library | OpenJPA: Object Relational Mapping for Java | 2007/5 |
| Apache POI | content, library | Java API for OLE 2 Compound and OOXML Documents | 2007/5 |
| Apache TomEE | network | Java EE Web Profile built on Apache Tomcat | 2007/5 |
| Apache Turbine | web-framework | A Java Servlet Web Application Framework and associated component library | 2007/5 |
| Apache Felix | network | OSGi Framework and components | 2007/3 |
| Apache Roller | content | Java blog server | 2007/2 |
| Apache ActiveMQ | network | Distributed Messaging System | 2007/1 |
| Apache Cayenne | database, library, network, web-framework, xml | User-friendly Java ORM with Tools | 2006/12 |
| Apache OFBiz | content, database, http, network, web-framework, xml | Open for Business: enterprise automation software | 2006/12 |
| Apache Tiles | web-framework | A templating framework for web application user interfaces | 2006/12 |
| Apache Labs | | A place for innovation where committers of the foundation can experiment with new ideas | 2006/11 |
| Apache MINA | network | Multipurpose Infrastructure for Network Application | 2006/10 |
| Apache Velocity | library | A Java Templating Engine | 2006/10 |
| Apache Santuario | library, security, xml | XML Security in Java and C++ | 2006/6 |
| Apache Jackrabbit | database, library, network, xml | Content Repository for Java | 2006/3 |
| Apache Tapestry | web-framework | Component-based Java Web Application Framework | 2006/2 |
| Apache Tomcat | http, javaee, network | A Java Servlet and JSP Container | 2005/5 |
| Apache Directory | network | Apache Directory Server | 2005/2 |
| Apache MyFaces | javaee, web-framework | JavaServer(tm) Faces implementation and components | 2005/2 |
| Apache Xerces | xml | XML parsers in Java, C++ and Perl | 2005/2 |
| Apache Lucene | database, library, search | Search engine library | 2005/1 |
| Apache Xalan | xml | XSLT processors in Java and C++ | 2004/10 |
| Apache XML Graphics | graphics | Conversion from XML to graphical output | 2004/10 |
| Apache SpamAssassin | mail | Mail filter to identify spam | 2004/6 |
| Apache Forrest | build-management, database, graphics, http, network, web-framework, xml | Aggregated multi-channel documentation, separation of concerns | 2004/5 |
| Apache Geronimo | http, javaee, network, web-framework | Java2, Enterprise Edition (J2EE) container | 2004/5 |
| Apache Struts | web-framework | Model 2 framework for building Java web applications | 2004/3 |
| Apache Gump | build-management, testing | Continuous integration of open source projects | 2004/2 |
| Apache Portals | web-framework | Portal technology | 2004/2 |
| Apache Logging Services | | Cross-language logging services | 2003/12 |
| Apache Maven | build-management | Java project management and comprehension tools | 2003/3 |
| Apache Cocoon | database, graphics, http, network, web-framework, xml | Web development framework: separation of concerns, component-based | 2003/1 |
| Apache James | mail, network | Java Apache Mail Enterprise Server | 2003/1 |
| Apache Web Services | | Projects related to Web Services | 2003/1 |
| Apache Ant | build-management | Java-based build tool | 2002/11 |
| Apache Incubator | | Entry path for projects and codebases wishing to become part of the Foundation's efforts | 2002/10 |
| Apache DB | | Database access | 2002/7 |
| Apache Portable Runtime (APR) | library | Apache Portable Runtime libraries | 2000/12 |
| Apache Tcl | | Dynamic websites using TCL | 2000/7 |
| Apache mod_perl | httpd-module | Dynamic websites using Perl | 2000/3 |
| Apache HTTP Server | http, httpd-module, network | Apache Web Server (httpd) | 1995/2 |
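The ordering used for the TLP list (newest Established month first) and the rough "which categories dominate which decade" reading can be reproduced with a few lines of Python. This is only a minimal sketch: the six rows are copied by hand from the list above, the Established date is written as a hypothetical `(year, month)` tuple, and the names `ROWS`, `newest_first`, and `categories_by_decade` are made up for illustration.

```python
from collections import Counter

# A handful of rows copied from the TLP list: (committee, categories, (year, month)).
# The tuple date format is an assumption made for this sketch.
ROWS = [
    ("Apache Tomcat", ["http", "javaee", "network"], (2005, 5)),
    ("Apache Spark", ["big-data"], (2014, 2)),
    ("Apache HTTP Server", ["http", "httpd-module", "network"], (1995, 2)),
    ("Apache Beam", ["big-data"], (2016, 12)),
    ("Apache Impala", [], (2017, 11)),  # no category declared
    ("Apache Kafka", ["big-data"], (2012, 11)),
]


def newest_first(rows):
    """Sort rows by Established date, most recent first."""
    return sorted(rows, key=lambda row: row[2], reverse=True)


def categories_by_decade(rows):
    """Count category tags per decade; rows without a category add nothing."""
    counts = {}
    for _name, categories, (year, _month) in rows:
        decade = year // 10 * 10
        counts.setdefault(decade, Counter()).update(categories)
    return counts


if __name__ == "__main__":
    for name, categories, (year, month) in newest_first(ROWS):
        print(f"{year}/{month}\t{name}\t{', '.join(categories)}")
    for decade, counter in sorted(categories_by_decade(ROWS).items()):
        print(decade, dict(counter))
```

Comparing `(year, month)` tuples rather than date strings keeps month 12 ahead of month 2 within the same year, which a plain string sort would get wrong.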


3. Incubator List

Incubator projects, including past ones, are all listed at Apache Incubator Projects. Here I list only the 50 or so that are currently in progress, in descending order of start date.

| Project | Description | Start Date |
|---|---|---|
| Coral | Coral is a data processing system to flexibly control the runtime behaviors of a job to adapt to varying deployment characteristics. | 2018/2/4 |
| ECharts | ECharts is a charting and data visualization library written in JavaScript. | 2018/1/18 |
| PLC4X | PLC4X is a set of libraries for communicating with industrial programmable logic controllers (PLCs) using a variety of protocols but with a shared API. | 2017/12/18 |
| SkyWalking | SkyWalking is an APM (application performance monitor), especially for microservice, Cloud Native and container-based architecture systems. Also known as a distributed tracing system. It provides an automatic way to instrument applications: no need to change any of the source code of the target application; and a collector with a very high efficiency streaming module. | 2017/12/8 |
| ServiceComb | ServiceComb is a microservice framework that provides a set of tools and components to make development and deployment of cloud applications easier. | 2017/11/22 |
| Crail | Crail is a storage platform for sharing performance critical data in distributed data processing jobs at very high speed. | 2017/11/1 |
| SDAP | SDAP is an integrated data analytic center for Big Science problems. | 2017/10/22 |
| PageSpeed | PageSpeed represents a series of open source technologies to help make the web faster by rewriting web pages to reduce latency and bandwidth. | 2017/9/30 |
| Amaterasu | Apache Amaterasu is a framework providing continuous deployment for Big Data pipelines. | 2017/9/7 |
| Daffodil | Apache Daffodil is an implementation of the Data Format Description Language (DFDL) used to convert between fixed format data and XML/JSON. | 2017/8/27 |
| Heron | A real-time, distributed, fault-tolerant stream processing engine. | 2017/6/23 |
| Livy | Livy is a web service that exposes a REST interface for managing long running Apache Spark contexts in your cluster. With Livy, new applications can be built on top of Apache Spark that require fine grained interaction with many Spark contexts. | 2017/6/5 |
| Pulsar | Pulsar is a highly scalable, low latency messaging platform running on commodity hardware. It provides simple pub-sub semantics over topics, guaranteed at-least-once delivery of messages, automatic cursor management for subscribers, and cross-datacenter replication. | 2017/6/1 |
| Superset | Superset is an enterprise-ready web application for data exploration, data visualization and dashboarding. | 2017/5/21 |
| Gobblin | Gobblin is a distributed data integration framework that simplifies common aspects of big data integration such as data ingestion, replication, organization and lifecycle management for both streaming and batch data ecosystems. | 2017/2/23 |
| MXNet | A Flexible and Efficient Library for Deep Learning | 2017/1/23 |
| Ratis | Ratis is a Java implementation of the Raft consensus protocol | 2017/1/3 |
| Griffin | Griffin is an open source Data Quality solution for distributed data systems at any scale in both streaming and batch data contexts | 2016/12/5 |
| Weex | Weex is a framework for building Mobile cross-platform high performance UI. | 2016/11/30 |
| OpenWhisk | distributed Serverless computing platform | 2016/11/23 |
| NetBeans | NetBeans is a development environment, tooling platform and application framework. | 2016/10/1 |
| Spot | Apache Spot is a platform for network telemetry built on an open data model and Apache Hadoop. | 2016/9/23 |
| Hivemall | Hivemall is a library for machine learning implemented as Hive UDFs/UDAFs/UDTFs. | 2016/9/13 |
| Annotator | Annotator provides annotation enabling code for browsers, servers, and humans. | 2016/8/30 |
| AriaTosca | ARIA TOSCA project offers an easily consumable Software Development Kit (SDK) and a Command Line Interface (CLI) to implement TOSCA (Topology and Orchestration Specification of Cloud Applications) based solutions. | 2016/8/27 |
| SensSoft | SensSoft is a software tool usability testing platform | 2016/7/13 |
| Traffic Control | Traffic Control allows you to build a large scale content delivery network using open source. | 2016/7/12 |
| Pony Mail | Pony Mail is a mail-archiving, archive viewing, and interaction service, that can be integrated with many email platforms. | 2016/5/27 |
| Gossip | Gossip is an implementation of the Gossip Protocol. | 2016/4/28 |
| Airflow | Airflow is a workflow automation and scheduling system that can be used to author and manage data pipelines. | 2016/3/31 |
| Quickstep | Quickstep is a high-performance database engine. | 2016/3/29 |
| Omid | Omid is a flexible, reliable, highly performant and scalable ACID transactional framework that allows client applications to execute transactions on top of MVCC key/value-based NoSQL datastores (currently Apache HBase) providing Snapshot Isolation guarantees on the accessed data. | 2016/3/28 |
| Gearpump | Gearpump is a reactive real-time streaming engine based on the micro-service Actor model. | 2016/3/8 |
| Tephra | Tephra is a system for providing globally consistent transactions on top of Apache HBase and other storage engines. | 2016/3/7 |
| Edgent | Edgent is a stream processing programming model and lightweight runtime to execute analytics at devices on the edge or at the gateway. (Formerly known as Quarks) | 2016/2/29 |
| Joshua | Joshua is a statistical machine translation toolkit | 2016/2/13 |
| iota | Open source system that enables the orchestration of IoT devices. | 2016/1/20 |
| Milagro | Distributed Cryptography; M-Pin protocol for Identity and Trust | 2015/12/21 |
| Toree | Toree provides applications with a mechanism to interactively and remotely access Apache Spark. | 2015/12/2 |
| S2Graph | S2Graph is a distributed and scalable OLTP graph database built on Apache HBase to support fast traversal of extremely large graphs. | 2015/11/29 |
| Unomi | Unomi is a reference implementation of the OASIS Context Server specification currently being worked on by the OASIS Context Server Technical Committee. It provides a high-performance user profile and event tracking server. | 2015/10/5 |
| Rya | Rya (pronounced "ree-uh" /rēə/) is a cloud-based RDF triple store that supports SPARQL queries. Rya is a scalable RDF data management system built on top of Accumulo. Rya uses novel storage methods, indexing schemes, and query processing techniques that scale to billions of triples across multiple nodes. Rya provides fast and easy access to the data through SPARQL, a conventional query mechanism for RDF data. | 2015/9/18 |
| HAWQ | HAWQ is an advanced enterprise SQL on Hadoop analytic engine built around a robust and high-performance massively-parallel processing (MPP) SQL framework evolved from Pivotal Greenplum Database. | 2015/9/4 |
| FreeMarker | FreeMarker is a template engine, i.e. a generic tool to generate text output based on templates. FreeMarker is implemented in Java as a class library for programmers. | 2015/7/1 |
| SINGA | SINGA is a distributed deep learning platform. | 2015/3/17 |
| Myriad | Myriad enables co-existence of Apache Hadoop YARN and Apache Mesos together on the same cluster and allows dynamic resource allocations across both Hadoop and other applications running on the same physical data center infrastructure. | 2015/3/1 |
| SAMOA | SAMOA provides a collection of distributed streaming algorithms for the most common data mining and machine learning tasks such as classification, clustering, and regression, as well as programming abstractions to develop new algorithms that run on top of distributed stream processing engines (DSPEs). It features a pluggable architecture that allows it to run on several DSPEs such as Apache Storm, Apache S4, and Apache Samza. | 2014/12/15 |
| Tamaya | Tamaya is a highly flexible configuration solution based on a modular, extensible and injectable key/value based design, which should provide a minimal but extendible modern and functional API leveraging SE, ME and EE environments. | 2014/11/14 |
| HTrace | HTrace is a tracing framework intended for use with distributed systems written in Java. | 2014/11/11 |
| Taverna | Taverna is a domain-independent suite of tools used to design and execute data-driven workflows. | 2014/10/20 |
| Slider | Slider is a collection of tools and technologies to package, deploy, and manage long running applications on Apache Hadoop YARN clusters. | 2014/4/29 |
| DataFu | DataFu provides a collection of Hadoop MapReduce jobs and functions in higher level languages based on it to perform data analysis. It provides functions for common statistics tasks (e.g. quantiles, sampling), PageRank, stream sessionization, and set and bag operations. DataFu also provides Hadoop jobs for incremental data processing in MapReduce. | 2014/1/5 |
| BatchEE | The BatchEE project aims to provide a JBatch implementation (aka JSR352) and a set of useful extensions for this specification. | 2013/10/3 |
| ODF Toolkit | Java modules that allow programmatic creation, scanning and manipulation of OpenDocument Format (ISO/IEC 26300 == ODF) documents | 2011/8/1 |