Data Ingestion

Bring 'em on!

IIP simplifies data ingestion to help you make complete sense of your data. IIP supports all data, whether structured or unstructured, batch or real-time, from sources ranging from NoSQL databases to Twitter and Facebook feeds. You can ingest data into the platform from sources such as relational databases, data streams, file servers, and JMS queues. The platform comes pre-loaded with several data source types, and you can customize these to suit your business needs.

IIP's data ingestion tools work seamlessly to enable:
  • File ingestion through FTP, SFTP and URL
  • Data streaming through Kafka
  • Twitter and Facebook data ingestion through in-built adapters
  • JDBC ingestion from DB2, Oracle, PostgreSQL, MySQL, SAP HANA, and SQL Server

IIP allows you to synchronize your Hive tables in the current Hadoop environment and get a unified view of new as well as existing data. Data is ingested for processing and analysis based on the data source, execution configuration and specified schema. After it is ingested, data is loaded into the specified data store and used for data exploration, modeling, processing, and analytics.

Data Modeling

Subtract noise, amplify insights

View it. Filter it. Refine it. IIP's Data Explorer performs data modeling on the ingested data. It uses exploratory data analysis to extract hidden and actionable information from 'noisy' data. Data scientists and analysts can create different data views easily by selecting columns, joining data and applying filters. The data modeling workbench features several innovative functionalities such as:

  • List of all schemas, files and streaming data in the explorer workbench
  • Data transformation to flatten XML and JSON
  • Search functionality that looks for specific schemas
  • Drag-and-drop facility to move schemas into the graphic view and preview the selected schema
  • Column-level functions such as aggregate and mathematical operations, sorting, filters, aliases and coalesce
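
To make the XML and JSON flattening listed above more concrete, here is a minimal Python sketch of what flattening a nested JSON record into column-like fields can look like. It is an illustration only, not IIP's implementation: the `flatten` function, the dotted-path naming and the sample record are all assumptions made for this example.

```python
from typing import Any, Dict


def flatten(record: Any, prefix: str = "", sep: str = ".") -> Dict[str, Any]:
    """Flatten nested dicts and lists into a single-level dict.

    Hypothetical helper for illustration; keys are joined with `sep`,
    and list items are addressed by their position.
    """
    flat: Dict[str, Any] = {}
    if isinstance(record, dict):
        for key, value in record.items():
            path = f"{prefix}{sep}{key}" if prefix else str(key)
            flat.update(flatten(value, path, sep))
    elif isinstance(record, list):
        for index, value in enumerate(record):
            path = f"{prefix}{sep}{index}" if prefix else str(index)
            flat.update(flatten(value, path, sep))
    else:
        # Leaf value: store it under the path built so far.
        flat[prefix] = record
    return flat


if __name__ == "__main__":
    nested = {"user": {"name": "Ana", "tags": ["a", "b"]}, "id": 7}
    for column, value in flatten(nested).items():
        print(column, "=", value)
```

Each nested field becomes its own flat column name, which is the shape a tabular view or a join needs. Note that empty objects and empty lists produce no columns in this sketch.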

IIP also allows you to:

  • Perform join operations between different schemas
  • Write SQL/Hive queries to generate views
  • Persist output into HDFS
  • Edit/update queries generated by the graphic interface
  • Publish data models to external systems and downstream applications
  • Retrieve data models through a REST API web service

Data Preparation

Turn raw data into an actionable asset

Assess it. Structure it. Clean it. Enrich it. IIP's data preparation workbench leverages industry-leading vendor Trifacta to help you transform your data. With an automatically generated visual representation of your data, the tool empowers you to quickly identify anomalies and outliers, as well as aggregate, standardize, and combine your data into an actionable asset.

In addition, the Trifacta data preparation workbench allows you to:
  • Work with all data formats, such as web logs and JSON, with automatic data recognition and structuring
  • Quickly assess data quality with metrics such as invalid values, missing values, outliers and value distribution
  • Transform data with rich functionality, including pivoting, joining, aggregating, time period calculations, splitting, and formatting
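
As a rough illustration of the data quality metrics named above (missing values, invalid values and outliers), the following Python sketch profiles a single numeric column. It does not reflect how Trifacta or IIP compute these metrics: the `profile_numeric_column` function and the choice of the common 1.5 × IQR rule for outliers are assumptions made for this example.

```python
import math
import statistics
from typing import Dict, List, Optional


def profile_numeric_column(raw_values: List[Optional[str]]) -> Dict[str, int]:
    """Count missing, invalid and outlier entries in a column of raw strings.

    Hypothetical helper for illustration only.
    """
    missing = 0
    invalid = 0
    numbers: List[float] = []

    for raw in raw_values:
        if raw is None or raw.strip() == "":
            missing += 1  # nothing was supplied
            continue
        try:
            number = float(raw)
        except ValueError:
            invalid += 1  # text that does not parse as a number
            continue
        if math.isfinite(number):
            numbers.append(number)
        else:
            invalid += 1  # "nan" and "inf" parse, but are not usable values

    outliers = 0
    if len(numbers) >= 4:
        # Assumed rule: flag values beyond 1.5 x IQR from the quartiles.
        q1, _, q3 = statistics.quantiles(numbers, n=4)
        spread = q3 - q1
        low, high = q1 - 1.5 * spread, q3 + 1.5 * spread
        outliers = sum(1 for x in numbers if x < low or x > high)

    return {
        "total": len(raw_values),
        "missing": missing,
        "invalid": invalid,
        "outliers": outliers,
    }


if __name__ == "__main__":
    column = ["10", "11", "12", "13", "14", "1000", "", None, "abc"]
    print(profile_numeric_column(column))
```

A summary like this is the kind of signal a visual profile surfaces at a glance, so you know which columns need cleaning before you transform them.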

What makes IIP Data Preparation and Exploration with Trifacta unique?

Interactive Exploration: Better explore and refine data with instant and continuous feedback on compelling visual representations.

Predictive Transformation: Transform data intuitively with suggested transformations that constantly take input from the data and the user to intelligently recommend ways to manipulate the data.

Intelligent Execution: Experience unmatched performance with a seamless transition from interactive exploration to optimized, full-scale processing using the IIP execution framework.

Collaborative Governance: Focus on wrangling, not security, with assured support for security, data lineage and the IIP access framework.

Data Pipeline

Manage your data, your way

IIP's Data Pipeline hosts several components that work together to manage data and to edit, modify and reactivate pipelines. It enables you to easily sequence, schedule and execute data jobs. The scheduling feature lets you capture the business logic of your data management and plan and run your tasks accordingly. Further, the fully customizable Workflow Manager allows you to confidently set and execute tasks.

With in-built data pipeline capabilities, you can define data-driven workflows and parameters for data transformations. Managing data jobs has never been this quick or easy.
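
To show what sequencing data jobs involves in general, here is a small Python sketch that orders jobs by their dependencies and runs them one after another. It is not the IIP Workflow Manager or its API: the `run_pipeline` function, the job names and the dependency map are hypothetical, and the ordering relies on the standard-library `graphlib` module (Python 3.9 or later).

```python
from graphlib import TopologicalSorter
from typing import Callable, Dict, List, Set


def run_pipeline(
    jobs: Dict[str, Callable[[], None]],
    depends_on: Dict[str, Set[str]],
) -> List[str]:
    """Run every job after the jobs it depends on; return the execution order.

    Hypothetical helper for illustration only.
    """
    referenced = set(depends_on)
    for deps in depends_on.values():
        referenced |= deps
    unknown = referenced - set(jobs)
    if unknown:
        raise KeyError(f"dependencies refer to undefined jobs: {sorted(unknown)}")

    # Map each job to its prerequisites; jobs without an entry have none.
    graph = {name: depends_on.get(name, set()) for name in jobs}

    # static_order() raises graphlib.CycleError if the dependencies loop.
    order = list(TopologicalSorter(graph).static_order())
    for name in order:
        jobs[name]()
    return order


if __name__ == "__main__":
    jobs = {
        "report": lambda: print("building report"),
        "ingest": lambda: print("ingesting data"),
        "clean": lambda: print("cleaning data"),
    }
    run_pipeline(jobs, {"clean": {"ingest"}, "report": {"clean"}})
```

Declaring dependencies rather than a fixed order means a job can be added or rewired without rewriting the whole sequence. Scheduling, retries and parameters are left out of this sketch.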

Data Science

Making data science agile

IIP's Data Science module provides agility by integrating with RStudio and allowing you to use various modeling algorithms from the R libraries on CRAN.

IIP leverages natural language processing (NLP) techniques such as n-grams for text analysis, including text cleanup, stop-word removal and stemming. Further, the Sentiment Analyzer can analyze the overall context of a document and determine the sentiment attached to different words.
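
For readers new to these terms, the Python sketch below walks through text cleanup, stop-word removal and n-gram extraction on a single sentence. It only illustrates the general techniques and is not IIP's text analysis code: the function names and the tiny stop-word list are assumptions for this example, and stemming is left out because a realistic stemmer needs a dedicated algorithm or library.

```python
import re
from typing import List, Tuple

# Deliberately tiny, assumed stop-word list; real lists are much longer.
STOP_WORDS = frozenset({"a", "an", "and", "the", "is", "of", "to", "in", "it"})


def tokenize(text: str) -> List[str]:
    """Text cleanup: lowercase the text and keep only word-like tokens."""
    return re.findall(r"[a-z0-9']+", text.lower())


def remove_stop_words(tokens: List[str]) -> List[str]:
    """Drop very common words that carry little meaning on their own."""
    return [token for token in tokens if token not in STOP_WORDS]


def ngrams(tokens: List[str], n: int) -> List[Tuple[str, ...]]:
    """Return every run of n consecutive tokens."""
    if n < 1:
        raise ValueError("n must be at least 1")
    return [tuple(tokens[i:i + n]) for i in range(len(tokens) - n + 1)]


if __name__ == "__main__":
    sentence = "The service is fast, and the staff is friendly!"
    tokens = remove_stop_words(tokenize(sentence))
    print(tokens)
    print(ngrams(tokens, 2))
```

Bigrams such as ("staff", "friendly") keep word pairs together, which gives later steps like sentiment scoring more context than single words do.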

The Data Science module also includes a workbench with tools for investigative and operational analytics along with sandbox environments to build experimental and analytic workflows. Essentially, IIP allows you to go on a data treasure hunt and win the bounty.

Analysis and Visualization

Don't just print it, paint it

IIP enhances your data analysis and visualization experience and allows you to share, collaborate on and integrate reports with Tableau®, QlikView, and other third-party BI tools that offer rich, interactive visualizations.

IIP understands that generating reports is not enough. To truly understand data, one must be able to interact with it and visualize it across scenarios. IIP enables this with rich out-of-the-box visualization features such as bar, tree and pie charts, or with your favorite custom display tools, allowing you to mould data into comprehensible insights.

Ready to experience IIP?

Gain a competitive advantage with an easy, one-click installation of IIP on AWS Marketplace.
Buy it
Take your data out for a spin and get hands-on with IIP – free of cost.
Try it