Wednesday, December 31, 2014

Talend Open Studio for Big Data

  1. Introduction
    1. Talend provides a versatile open source big data product that simplifies working with big data technologies and helps improve business performance, without the need for specialist knowledge or resources.
  2. Features
    1. Integration at Cluster Scale
      1. Talend’s big data product combines big data components for MapReduce 2.0 (YARN), Hadoop, HBase, Hive, HCatalog, Oozie, Sqoop and Pig into a unified open source environment so you can quickly load, extract, transform and process large and diverse data sets from disparate systems.
    2. Big Data Without The Need To Write / Maintain Code
      1. Ready to Use Big Data Connectors
        1. Talend provides an easy-to-use graphical environment in which developers visually map big data sources and targets without having to learn and write complicated code. Talend Big Data runs 100% natively on Hadoop, so jobs scale with the cluster. Once a big data connection is configured, the underlying code is generated automatically and can be deployed remotely as a job that runs natively on your big data cluster, using HDFS, Pig, HCatalog, HBase, Sqoop or Hive.
      2. Big Data Distribution and Big Data Appliance Support
        1. Talend's big data components have been tested and certified to work with leading big data Hadoop distributions, including Amazon EMR, Cloudera, IBM PureData, Hortonworks, MapR, Pivotal Greenplum, Pivotal HD, and SAP HANA. Talend provides out-of-the-box support for big data platforms from the leading appliance vendors including Greenplum/Pivotal, Netezza, Teradata, and Vertica.
      3. Open Source
        1. Because the Studio is released under the Apache software license, developers can use it freely under that license's permissive terms. As Talend’s big data products rely on standard Hadoop APIs, users can migrate their data integration jobs between different Hadoop distributions without worrying about underlying platform dependencies. Support for Apache Oozie is provided out-of-the-box, allowing operators to schedule their data jobs through open source software.
      4. Pull Source Data from Anywhere Including NoSQL
        1. With 800+ connectors, Talend integrates almost any data source so you can transform and integrate data in real time or in batch. Pre-built connectors for HBase, MongoDB, Cassandra, CouchDB, Couchbase, Neo4J and Riak speed development without requiring specific NoSQL knowledge. Talend big data components can be configured to bulk upload data to Hadoop or another big data appliance, either as a manual process or on an automatic schedule for incremental data updates.
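
The idea of scheduled incremental updates can be illustrated with a small, generic sketch. This is not Talend's generated code or API: the class `IncrementalLoadSketch`, the `SourceRow` type, and the "last modified" watermark strategy are all assumptions made for illustration. Each run picks up only the rows changed since the previous run, then advances the watermark.

```java
import java.time.Instant;
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

// Hypothetical illustration of watermark-based incremental loading.
// None of these names come from Talend; they exist only for this sketch.
final class IncrementalLoadSketch {

    /** Hypothetical source record: an id plus its last-modified time. */
    static final class SourceRow {
        final String id;
        final Instant modifiedAt;

        SourceRow(String id, Instant modifiedAt) {
            this.id = id;
            this.modifiedAt = modifiedAt;
        }
    }

    /** Keep only rows modified strictly after the previous run's watermark. */
    static List<SourceRow> selectChanged(List<SourceRow> rows, Instant watermark) {
        List<SourceRow> changed = new ArrayList<>();
        for (SourceRow row : rows) {
            if (row.modifiedAt.isAfter(watermark)) {
                changed.add(row);
            }
        }
        return changed;
    }

    /** Newest modification time in the batch, or the old watermark if the batch is empty. */
    static Instant nextWatermark(List<SourceRow> batch, Instant current) {
        Instant latest = current;
        for (SourceRow row : batch) {
            if (row.modifiedAt.isAfter(latest)) {
                latest = row.modifiedAt;
            }
        }
        return latest;
    }

    public static void main(String[] args) {
        List<SourceRow> source = Arrays.asList(
                new SourceRow("a", Instant.parse("2014-12-30T00:00:00Z")),
                new SourceRow("b", Instant.parse("2014-12-31T00:00:00Z")),
                new SourceRow("c", Instant.parse("2014-12-31T12:00:00Z")));

        // Watermark left behind by the previous (hypothetical) scheduled run.
        Instant watermark = Instant.parse("2014-12-30T12:00:00Z");

        List<SourceRow> batch = selectChanged(source, watermark);
        for (SourceRow row : batch) {
            System.out.println("would upload row " + row.id);
        }
        System.out.println("next watermark: " + nextWatermark(batch, watermark));
    }
}
```

In a real deployment the watermark would be persisted between runs and the upload step would write to HDFS or another target; both are left out here to keep the sketch self-contained.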
  3. Products
    1. https://www.talend.com/products/big-data/matrix

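      | FEATURES                                               | Talend Open Studio for Big Data | Talend Enterprise Big Data | Talend Platform for Big Data |
      |--------------------------------------------------------|---------------------------------|----------------------------|------------------------------|
      | Job Designer                                           | x                               | x                          | x                            |
      | Components for HDFS, HBase, HCatalog, Hive, Pig, Sqoop | x                               | x                          | x                            |
      | Hadoop Job Scheduler                                   | x                               | x                          | x                            |
      | NoSQL Support                                          | x                               | x                          | x                            |
      | Versioning                                             | x                               | x                          | x                            |
      | Shared Repository                                      |                                 | x                          | x                            |
      | Reporting and Dashboards                               |                                 |                            | x                            |
      | Hadoop Profiling, Parsing and Matching                 |                                 |                            | x                            |
      | Indemnification/Warranty and Talend Support            |                                 | x                          | x                            |
      | License                                                | Open Source                     | Subscription               | Subscription                 |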


  4. Reference
    1. http://www.talend.com/
