Apache TLP + Incubator List, February 2018

Last updated at 2018-02-11. Posted at 2018-02-11.

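Lately I kept running into open source projects I had never heard of, only to find that Apache had been publishing them for years. So I made up my mind to compile a list of the Apache TLPs and Incubator projects.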

See also [10 little-known, up-to-date Apache projects useful for IoT - February 2018](https://qiita.com/toast-uz/items/4e1601bbc07d1d774418), which collects the projects that caught my attention while I was compiling the list.

1. What is Apache software?

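Open source software managed by the Apache Software Foundation is called Apache software. Its license is on the permissive side: derivative products may be used commercially without publishing their source, subject to certain conditions, so new software is added at a brisk pace. In the 1990s Apache was synonymous with the web server, but in recent years it has been strong in big data software, starting with Apache Hadoop.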

1.1. What is the Incubator?

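An open source project that applies to become Apache software and meets certain conditions is allowed to bear the Apache name, is registered in the Incubator as an experimental project, and receives support. (Such a project is also called a podling.)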

Under certain conditions, an Incubator project is either discontinued or graduates. On graduation, it becomes either a subproject of an existing Apache TLP or a new TLP.

1.2. What is a TLP?

A TLP (Top Level Project) has its own dedicated committee, called a PMC (Project Management Committee), and is run as an official Apache project.

2. TLP list

According to Apache Projects, there are currently about 170 TLPs. Through the 2000s Java-related projects were the mainstream; in the 2010s big data projects have taken over.

Some committees (TLPs) host several open source projects, and some TLPs host none at all. Note that Category is self-declared by each TLP, so many entries are blank and the level of granularity is not necessarily consistent.

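The TLPs are listed below in descending order of the date they were established as a TLP. Several of the projects from 2014 onward are currently in the spotlight.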

Committee	Category	Description	Established
Apache Trafodion	big-data	webscale SQL-on-Hadoop solution enabling transactional or operational workloads.	2017/12
Apache Guacamole	network	providing performant, browser-based remote access	2017/11
Apache Impala		a high-performance distributed SQL engine	2017/11
Apache Mnemonic		a transparent nonvolatile hybrid memory oriented library for Big data, High-performance computing, and Analytics	2017/11
Apache Juneau		a toolkit for marshalling POJOs to a wide variety of content types using a common framework, and for creating sophisticated self-documenting REST interfaces and microservices using VERY little code	2017/10
Apache Kibble		an interactive project activity analyzer and aggregator	2017/10
Apache PredictionIO	big-data	a machine learning server built on top of state-of-the-art open source stack, that enables developers to manage and deploy production-ready predictive services for various kinds of machine learning tasks	2017/10
Apache DRAT		large scale code license analysis, auditing and reporting	2017/9
Apache RocketMQ		a fast, low latency, reliable, scalable, distributed, easy to use message-oriented middleware, especially for processing large amounts of streaming data	2017/9
Apache Royale		improving developer productivity in creating applications for wherever Javascript runs (and other runtimes)	2017/9
Apache Fluo		Storage and incremental processing of large data sets	2017/7
Apache MADlib		Scalable, Big Data, SQL-driven machine learning framework for Data Scientists	2017/7
Apache Streams		interoperability of online profiles and activity feeds	2017/7
Apache Atlas		scalable and extensible set of core foundational governance services	2017/6
Apache Mynewt		embedded OS optimized for networking and built for remote management of constrained devices	2017/6
Apache SystemML		A machine learning platform optimal for big data	2017/5
Apache CarbonData	big-data	indexed columnar data format for fast analytics on big data platform	2017/4
Apache Fineract		Platform for Digital Financial Services	2017/4
Apache Metron		Real-time big data security	2017/4
Apache Ranger		framework to enable, monitor and manage comprehensive data security across the Hadoop platform.	2017/1
Apache Beam	big-data	Programming model, SDKs, and runners for defining and executing data processing pipelines	2016/12
Apache Eagle		open source analytics solution for identifying security and performance issues instantly on big data platforms	2016/12
Apache Geode		Low latency, high concurrency data management solutions	2016/11
Apache Kudu		A distributed columnar storage engine built for the Apache Hadoop ecosystem	2016/7
Apache Twill		Use Apache Hadoop YARN's distributed capabilities with a programming model that is similar to running threads	2016/6
Apache Bahir		Extensions to distributed analytic platforms such as Apache Spark	2016/5
Apache TinkerPop		A graph computing framework for both graph databases (OLTP) and graph analytic systems (OLAP)	2016/5
Apache Zeppelin	big-data	A web-based notebook that enables interactive data analytics	2016/5
Apache Apex	big-data	Enterprise-grade unified stream and batch processing engine	2016/4
Apache AsterixDB		open source Big Data Management System	2016/4
Apache Johnzon		JSR-353 compliant JSON parsing; modules to help with JSR-353 as well as JSR-374 and JSR-367	2016/4
Apache Sentry		Fine grained authorization to data and metadata in Apache Hadoop	2016/3
Apache Arrow		Powering Columnar In-Memory Analytics	2016/1
Apache Brooklyn	cloud	Framework for modeling, monitoring, and managing applications through autonomic blueprints	2015/11
Apache Groovy	library	A multi-faceted language for the Java platform	2015/11
Apache Kylin		Extreme OLAP Engine for Big Data	2015/11
Apache REEF	big-data	Retainable Evaluator Execution Framework	2015/11
Apache Calcite	big-data, hadoop, sql	Dynamic data management framework	2015/10
Apache Yetus	build-management, library, testing	Collection of libraries and tools that enable contribution and release processes for software projects	2015/9
Apache Ignite	big-data, cloud, data-management-platform, database, distributed-sql-database, hadoop, iot, osgi, sql	High-performance, integrated and distributed in-memory platform for computing and transacting on large-scale data sets in real-time	2015/8
Apache Lens	big-data	Unified analytics platform	2015/8
Apache Serf	library	High performance C-based HTTP client library built upon the Apache Portable Runtime (APR) library	2015/8
Apache Usergrid		The BaaS Framework you run	2015/8
Apache NiFi		Easy to use, powerful, and reliable system to process and distribute data	2015/7
Apache Whimsy	content	Tools that help automate various administrative tasks or information lookup activities	2015/5
Apache ORC	big-data, database, hadoop, library	the smallest, fastest columnar storage for Hadoop workloads	2015/4
Apache Parquet	big-data	columnar storage format available to any project in the Apache Hadoop ecosystem	2015/4
Apache Aurora		Mesos framework for long-running services and cron jobs	2015/3
Apache Polygene	library	community based effort exploring Composite Oriented Programming for domain centric application development	2015/3
Apache Samza	big-data	distributed stream processing framework	2015/1
Apache Falcon	big-data	Data management and processing platform.	2014/12
Apache Flink	big-data	platform for scalable batch and stream data processing	2014/12
Apache BookKeeper	big-data	Replicated log service which can be used to build replicated state machines	2014/11
Apache Drill	big-data	Schema-free SQL Query Engine for Apache Hadoop, NoSQL and Cloud Storage	2014/11
Apache MetaModel	big-data, database, library	common interface for discovery, exploration of metadata and querying of different types of data sources	2014/11
Apache Storm	big-data	Distributed, real-time computation system	2014/9
Apache Celix	network	Implementation of the OSGi specification adapted to C	2014/7
Apache Tez	big-data	High-performance and scalable distributed data processing framework	2014/7
Apache VXQuery	big-data, xml	A parallel XQuery processor	2014/7
Apache Phoenix	big-data, database	High performance relational database layer over Apache HBase for low latency applications	2014/5
Apache Allura	content	Forge software for hosting software projects	2014/3
Apache Olingo	library	OASIS OData protocol libraries	2014/3
Apache Tajo	big-data	Big data warehouse system on Apache Hadoop	2014/3
Apache Knox	big-data	Simplify and normalize the deployment and implementation of secure Hadoop clusters	2014/2
Apache Open Climate Workbench	content	Climate model evaluation	2014/2
Apache Spark	big-data	Fast and general engine for large-scale data processing	2014/2
Apache Helix	big-data, cloud	A cluster management framework for partitioned and replicated distributed resources	2013/12
Apache Ambari	big-data	Hadoop cluster management	2013/11
Apache Marmotta		An Open Platform for Linked Data	2013/11
Apache Chukwa		Open source data collection system for monitoring large distributed systems.	2013/10
Apache jclouds	cloud, library	Java cloud APIs and abstractions	2013/10
Apache Curator	database, library	Java libraries that make using Apache ZooKeeper easier	2013/9
Apache JSPWiki	content	Leading open source WikiWiki engine, feature-rich and built around standard J2EE components (Java, servlets, JSP).	2013/7
Apache Mesos	cloud	a cluster manager that provides efficient resource isolation and sharing across distributed applications	2013/6
Apache DeltaSpike	javaee	Portable CDI extensions that provide useful features for Java application developers	2013/4
Apache Bloodhound	build-management	Issue tracking, wiki and repository browser	2013/3
Apache CloudStack	cloud	Infrastructure as a Service solution	2013/3
Apache cTAKES	content	Natural language processing (NLP) tool for information extraction from electronic medical record clinical free-text	2013/3
Apache Clerezza	content, osgi	Semantically linked data for OSGi	2013/2
Apache Crunch	big-data, library	Simple and Efficient MapReduce Pipelines	2013/2
Apache Oltu	library	OAuth protocol implementation in Java	2013/1
Apache OpenMeetings	network	OpenMeetings: Web-Conferencing and real-time collaboration	2013/1
Apache Flex	web-framework	Application framework for expressive web applications that deploy to all major browsers, desktops and devices.	2012/12
Apache Kafka	big-data	Distributed publish-subscribe messaging system	2012/11
Apache Syncope	identity, security	Managing digital identities in enterprise environments	2012/11
Apache Cordova	library, mobile	Platform for building native mobile applications using HTML, CSS and JavaScript	2012/10
Apache Isis	web-framework	Framework for rapidly developing domain-driven apps in Java	2012/10
Apache OpenOffice	content	An open-source, office-document productivity suite	2012/10
Apache Airavata	big-data, cloud, network	Workflow and Computational Job Management Middleware	2012/9
Apache Bigtop	big-data	Apache Hadoop ecosystem integration and distribution project	2012/9
Apache SIS	library	Spatial Information System	2012/9
Apache Stanbol	content	Reusable components for semantic content management	2012/9
Apache Any23	content	Anything to Triples	2012/8
Apache Lucene.Net	database	Search engine library targeted at .NET runtime users.	2012/8
Apache Oozie	big-data	A workflow scheduler system to manage Apache Hadoop jobs.	2012/8
Apache Steve	library	Apache's Python based single transferable vote software system	2012/7
Apache Flume	big-data	A reliable service for efficiently collecting, aggregating, and moving large amounts of log data	2012/6
Apache VCL	cloud	Virtual Computing Lab	2012/6
Apache Giraph	big-data	Iterative graph processing system built for high scalability	2012/5
Apache Hama	big-data	a Bulk Synchronous Parallel computing framework on top of Apache Hadoop	2012/5
Apache ManifoldCF	content	Framework for connecting source content repositories to target repositories or indexes.	2012/5
Apache Creadur		Comprehension and auditing of software distributions	2012/4
Apache Jena	library	Java framework for building Semantic Web applications	2012/4
Apache Accumulo	database	Sorted, distributed key/value store	2012/3
Apache Lucy	database	Search engine library for dynamic languages	2012/3
Apache Sqoop	big-data	Bulk Data Transfer for Apache Hadoop and Structured Datastores	2012/3
Apache Bval	javaee, library	Apache BVal: JSR-303 Bean Validation Implementation and Extensions	2012/2
Apache OpenNLP	library	Machine learning based toolkit for the processing of natural language text	2012/2
Apache Empire-db	database	Relational Data Persistence	2012/1
Apache Gora	database	ORM framework for column stores such as Apache HBase and Apache Cassandra with a specific focus on Hadoop	2012/1
Apache JMeter	testing	Java performance and functional testing	2011/10
Apache Libcloud	cloud, library	Unified interface to the cloud	2011/5
Apache Chemistry	library	CMIS (Content Management Interoperability Services) Clients and Servers	2011/2
Apache River	javaee	Jini service oriented architecture	2011/1
Apache Aries	library	Enterprise OSGi application programming model	2010/12
Apache OODT	web-framework	Object Oriented Data Technology (middleware metadata)	2010/11
Apache ZooKeeper	database	Centralized service for maintaining configuration information	2010/11
Apache Thrift	http, library, network	Framework for scalable cross-language services development	2010/10
Apache Hive	database	Data warehouse infrastructure using the Apache Hadoop Database	2010/9
Apache Pig	database	Platform for analyzing large data sets	2010/9
Apache Shiro	library, web-framework	Powerful and easy-to-use application security framework	2010/9
Apache jUDDI		Java implementation of the Universal Description, Discovery, and Integration specification	2010/8
Apache Karaf	osgi, network	Server-side OSGi distribution	2010/6
Apache Avro	big-data, library	A Serialization System	2010/4
Apache HBase	database	Apache Hadoop Database	2010/4
Apache Mahout	library	Scalable machine learning library	2010/4
Apache Nutch	web-framework	Open Source Web Search Software	2010/4
Apache Tika	library	Content Analysis and Detection Toolkit	2010/4
Apache Traffic Server	http	A fast, scalable and extensible HTTP/1.1 compliant caching proxy server	2010/4
Apache UIMA		Framework and annotators for unstructured information analysis	2010/3
Apache Cassandra	database	Highly scalable second-generation distributed database	2010/2
Apache Subversion	build-management	Version Control	2010/2
Apache Axis	http, network, xml	Java SOAP Engine	2009/12
Apache OpenWebBeans	javaee	OpenWebBeans: JSR-299 Context and Dependency Injection for Java EE Platform Implementation	2009/12
Apache Pivot	library	Rich Internet applications in Java	2009/12
Apache Community Development		Resources to help people become involved with Apache projects	2009/11
Apache PDFBox	content, library	Java library for working with PDF documents	2009/10
Apache Sling		Web Framework for JCR Content Repositories	2009/6
Apache Camel	network, osgi	Spring based Integration Framework which implements the Enterprise Integration Patterns	2008/12
Apache Attic		A home for dormant projects	2008/11
Apache Buildr	build-management	Simple and intuitive build system for Java applications	2008/11
Apache CouchDB	big-data, cloud, content, database, http, network	RESTful document database	2008/11
Apache Qpid	network	Multiple language implementation of the latest Advanced Message Queuing Protocol (AMQP)	2008/11
Apache CXF	library, network, xml	Service Framework	2008/4
Apache Archiva	build-management	Build Artifact Repository Manager	2008/3
Apache Hadoop	database	Distributed computing platform	2008/1
Apache Synapse	http, network, xml	Enterprise Service Bus and Mediation Framework	2007/12
Apache HttpComponents	http, library, network	Java toolset of low level HTTP components	2007/11
Apache ServiceMix	network, osgi, xml	Enterprise Service Bus	2007/9
Apache ODE	network, xml	Orchestration Director Engine: Business Process Management (BPM), Process Orchestration and Workflow through service composition.	2007/7
Apache Commons	http, library, network	Reusable Java components	2007/6
Apache Wicket	web-framework	Component-based Java Web Application Framework.	2007/6
Apache OpenJPA	database, javaee, library	OpenJPA: Object Relational Mapping for Java	2007/5
Apache POI	content, library	Java API for OLE 2 Compound and OOXML Documents	2007/5
Apache TomEE	network	Java EE Web Profile built on Apache Tomcat	2007/5
Apache Turbine	web-framework	A Java Servlet Web Application Framework and associated component library	2007/5
Apache Felix	network	OSGi Framework and components	2007/3
Apache Roller	content	Java blog server	2007/2
Apache ActiveMQ	network	Distributed Messaging System	2007/1
Apache Cayenne	database, library, network, web-framework, xml	User-friendly Java ORM with Tools	2006/12
Apache OFBiz	content, database, http, network, web-framework, xml	Open for Business: enterprise automation software	2006/12
Apache Tiles	web-framework	A templating framework for web application user interfaces	2006/12
Apache Labs		A place for innovation where committers of the foundation can experiment with new ideas	2006/11
Apache MINA	network	Multipurpose Infrastructure for Network Application	2006/10
Apache Velocity	library	A Java Templating Engine	2006/10
Apache Santuario	library, security, xml	XML Security in Java and C++	2006/6
Apache Jackrabbit	database, library, network, xml	Content Repository for Java	2006/3
Apache Tapestry	web-framework	Component-based Java Web Application Framework	2006/2
Apache Tomcat	http, javaee, network	A Java Servlet and JSP Container	2005/5
Apache Directory	network	Apache Directory Server	2005/2
Apache MyFaces	javaee, web-framework	JavaServer(tm) Faces implementation and components	2005/2
Apache Xerces	xml	XML parsers in Java, C++ and Perl	2005/2
Apache Lucene	database, library, search	Search engine library	2005/1
Apache Xalan	xml	XSLT processors in Java and C++	2004/10
Apache XML Graphics	graphics	Conversion from XML to graphical output	2004/10
Apache SpamAssassin	mail	Mail filter to identify spam	2004/6
Apache Forrest	build-management, database, graphics, http, network, web-framework, xml	Aggregated multi-channel documentation, separation of concerns	2004/5
Apache Geronimo	http, javaee, network, web-framework	Java2, Enterprise Edition (J2EE) container	2004/5
Apache Struts	web-framework	Model 2 framework for building Java web applications	2004/3
Apache Gump	build-management, testing	Continuous integration of open source projects	2004/2
Apache Portals	web-framework	Portal technology	2004/2
Apache Logging Services		Cross-language logging services	2003/12
Apache Maven	build-management	Java project management and comprehension tools	2003/3
Apache Cocoon	database, graphics, http, network, web-framework, xml	Web development framework: separation of concerns, component-based	2003/1
Apache James	mail, network	Java Apache Mail Enterprise Server	2003/1
Apache Web Services		Projects related to Web Services	2003/1
Apache Ant	build-management	Java-based build tool	2002/11
Apache Incubator		Entry path for projects and codebases wishing to become part of the Foundation's efforts	2002/10
Apache DB		Database access	2002/7
Apache Portable Runtime (APR)	library	Apache Portable Runtime libraries	2000/12
Apache Tcl		Dynamic websites using TCL	2000/7
Apache mod_perl	httpd-module	Dynamic websites using Perl	2000/3
Apache HTTP Server	http, httpd-module, network	Apache Web Server (httpd)	1995/2

3. Incubator list

The Apache Incubator Projects page lists all Incubator projects, including past ones. Here, only the 50-odd projects currently in progress are listed, in descending order of start date.

Project	Description	Start Date
Coral	Coral is a data processing system to flexibly control the runtime behaviors of a job to adapt to varying deployment characteristics.	2018/2/4
ECharts	ECharts is a charting and data visualization library written in JavaScript.	2018/1/18
PLC4X	PLC4X is a set of libraries for communicating with industrial programmable logic controllers (PLCs) using a variety of protocols but with a shared API.	2017/12/18
SkyWalking	SkyWalking is an APM (application performance monitor), especially for microservice, Cloud Native and container-based architecture systems. Also known as a distributed tracing system. It provides an automatic way to instrument applications: no need to change any of the source code of the target application; and a collector with a very high efficiency streaming module.	2017/12/8
ServiceComb	ServiceComb is a microservice framework that provides a set of tools and components to make development and deployment of cloud applications easier.	2017/11/22
Crail	Crail is a storage platform for sharing performance critical data in distributed data processing jobs at very high speed.	2017/11/1
SDAP	SDAP is an integrated data analytic center for Big Science problems.	2017/10/22
PageSpeed	PageSpeed represents a series of open source technologies to help make the web faster by rewriting web pages to reduce latency and bandwidth.	2017/9/30
Amaterasu	Apache Amaterasu is a framework providing continuous deployment for Big Data pipelines.	2017/9/7
Daffodil	Apache Daffodil is an implementation of the Data Format Description Language (DFDL) used to convert between fixed format data and XML/JSON.	2017/8/27
Heron	A real-time, distributed, fault-tolerant stream processing engine.	2017/6/23
Livy	Livy is a web service that exposes a REST interface for managing long running Apache Spark contexts in your cluster. With Livy, new applications can be built on top of Apache Spark that require fine grained interaction with many Spark contexts.	2017/6/5
Pulsar	Pulsar is a highly scalable, low latency messaging platform running on commodity hardware. It provides simple pub-sub semantics over topics, guaranteed at-least-once delivery of messages, automatic cursor management for subscribers, and cross-datacenter replication.	2017/6/1
Superset	Superset is an enterprise-ready web application for data exploration, data visualization and dashboarding.	2017/5/21
Gobblin	Gobblin is a distributed data integration framework that simplifies common aspects of big data integration such as data ingestion, replication, organization and lifecycle management for both streaming and batch data ecosystems.	2017/2/23
MXNet	A Flexible and Efficient Library for Deep Learning	2017/1/23
Ratis	Ratis is a Java implementation of the RAFT consensus protocol	2017/1/3
Griffin	Griffin is an open source Data Quality solution for distributed data systems at any scale in both streaming and batch data contexts	2016/12/5
Weex	Weex is a framework for building Mobile cross-platform high performance UI.	2016/11/30
OpenWhisk	distributed Serverless computing platform	2016/11/23
NetBeans	NetBeans is a development environment, tooling platform and application framework.	2016/10/1
Spot	Apache Spot is a platform for network telemetry built on an open data model and Apache Hadoop.	2016/9/23
Hivemall	Hivemall is a library for machine learning implemented as Hive UDFs/UDAFs/UDTFs.	2016/9/13
Annotator	Annotator provides annotation enabling code for browsers, servers, and humans.	2016/8/30
AriaTosca	ARIA TOSCA project offers an easily consumable Software Development Kit(SDK) and a Command Line Interface(CLI) to implement TOSCA(Topology and Orchestration Specification of Cloud Applications) based solutions.	2016/8/27
SensSoft	SensSoft is a software tool usability testing platform	2016/7/13
Traffic Control	Traffic Control allows you to build a large scale content delivery network using open source.	2016/7/12
Pony Mail	Pony Mail is a mail-archiving, archive viewing, and interaction service, that can be integrated with many email platforms.	2016/5/27
Gossip	Gossip is an implementation of the Gossip Protocol.	2016/4/28
Airflow	Airflow is a workflow automation and scheduling system that can be used to author and manage data pipelines.	2016/3/31
Quickstep	Quickstep is a high-performance database engine.	2016/3/29
Omid	Omid is a flexible, reliable, high performant and scalable ACID transactional framework that allows client applications to execute transactions on top of MVCC key/value-based NoSQL datastores (currently Apache HBase) providing Snapshot Isolation guarantees on the accessed data.	2016/3/28
Gearpump	Gearpump is a reactive real-time streaming engine based on the micro-service Actor model.	2016/3/8
Tephra	Tephra is a system for providing globally consistent transactions on top of Apache HBase and other storage engines.	2016/3/7
Edgent	Edgent is a stream processing programming model and lightweight runtime to execute analytics at devices on the edge or at the gateway. (Formerly known as Quarks)	2016/2/29
Joshua	Joshua is a statistical machine translation toolkit	2016/2/13
iota	Open source system that enables the orchestration of IoT devices.	2016/1/20
Milagro	Distributed Cryptography; M-Pin protocol for Identity and Trust	2015/12/21
Toree	Toree provides applications with a mechanism to interactively and remotely access Apache Spark.	2015/12/2
S2Graph	S2Graph is a distributed and scalable OLTP graph database built on Apache HBase to support fast traversal of extremely large graphs.	2015/11/29
Unomi	Unomi is a reference implementation of the OASIS Context Server specification currently being worked on by the OASIS Context Server Technical Committee. It provides a high-performance user profile and event tracking server.	2015/10/5
Rya	Rya (pronounced "ree-uh" /rēə/) is a cloud-based RDF triple store that supports SPARQL queries. Rya is a scalable RDF data management system built on top of Accumulo. Rya uses novel storage methods, indexing schemes, and query processing techniques that scale to billions of triples across multiple nodes. Rya provides fast and easy access to the data through SPARQL, a conventional query mechanism for RDF data.	2015/9/18
HAWQ	HAWQ is an advanced enterprise SQL on Hadoop analytic engine built around a robust and high-performance massively-parallel processing (MPP) SQL framework evolved from Pivotal Greenplum Database.	2015/9/4
FreeMarker	FreeMarker is a template engine, i.e. a generic tool to generate text output based on templates. FreeMarker is implemented in Java as a class library for programmers.	2015/7/1
SINGA	SINGA is a distributed deep learning platform.	2015/3/17
Myriad	Myriad enables co-existence of Apache Hadoop YARN and Apache Mesos together on the same cluster and allows dynamic resource allocations across both Hadoop and other applications running on the same physical data center infrastructure.	2015/3/1
SAMOA	SAMOA provides a collection of distributed streaming algorithms for the most common data mining and machine learning tasks such as classification, clustering, and regression, as well as programming abstractions to develop new algorithms that run on top of distributed stream processing engines (DSPEs). It features a pluggable architecture that allows it to run on several DSPEs such as Apache Storm, Apache S4, and Apache Samza.	2014/12/15
Tamaya	Tamaya is a highly flexible configuration solution based on a modular, extensible and injectable key/value based design, which should provide a minimal but extendible modern and functional API leveraging SE, ME and EE environments.	2014/11/14
HTrace	HTrace is a tracing framework intended for use with distributed systems written in Java.	2014/11/11
Taverna	Taverna is a domain-independent suite of tools used to design and execute data-driven workflows.	2014/10/20
Slider	Slider is a collection of tools and technologies to package, deploy, and manage long running applications on Apache Hadoop YARN clusters.	2014/4/29
DataFu	DataFu provides a collection of Hadoop MapReduce jobs and functions in higher level languages based on it to perform data analysis. It provides functions for common statistics tasks (e.g. quantiles, sampling), PageRank, stream sessionization, and set and bag operations. DataFu also provides Hadoop jobs for incremental data processing in MapReduce.	2014/1/5
BatchEE	The BatchEE project aims to provide a JBatch implementation (aka JSR352) and a set of useful extensions for this specification.	2013/10/3
ODF Toolkit	Java modules that allow programmatic creation, scanning and manipulation of OpenDocument Format (ISO/IEC 26300 == ODF) documents	2011/8/1
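
The newest-first ordering of the Incubator table can be reproduced with a short sketch like the one below. It is only an illustration, not the method actually used to build the table: the names `PODLINGS`, `parse_start_date` and `newest_first` are hypothetical, and the sample rows are a handful of (project, start date) pairs copied from the table. Because the dates are written without zero padding (`2017/6/23`, `2017/12/18`), the sketch parses them instead of comparing them as strings. A plain string sort would wrongly place June ahead of December.

```python
from datetime import date

# A few (project, start date) pairs copied from the table, deliberately shuffled.
PODLINGS = [
    ("Quickstep", "2016/3/29"),
    ("MXNet", "2017/1/23"),
    ("OpenWhisk", "2016/11/23"),
    ("Heron", "2017/6/23"),
    ("PLC4X", "2017/12/18"),
    ("SINGA", "2015/3/17"),
]


def parse_start_date(text: str) -> date:
    """Parse a non-zero-padded 'YYYY/M/D' string into a date object."""
    year, month, day = (int(part) for part in text.split("/"))
    return date(year, month, day)


def newest_first(rows):
    """Return the rows sorted by start date, newest first."""
    return sorted(rows, key=lambda row: parse_start_date(row[1]), reverse=True)


if __name__ == "__main__":
    for project, started in newest_first(PODLINGS):
        print(f"{started}\t{project}")
```

Run as a script, this prints the six sample podlings from PLC4X (2017/12/18) down to SINGA (2015/3/17), matching their relative order in the table.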
