The Workflow component is the central point for managing, scheduling, and monitoring the processes that handle ingestion, enrichment, and dissemination of the platform's working data.
Ingestion processes extract data from its external source, transform it into the platform's internal graph representation of the working data, and load it. Enrichment processes generate new data based on the current state of the working data. Implementations of enrichment processes can range from simple link generation based on existing data to machine-learning-based approaches.
Separating the different concerns of data processing while sharing the internal data storage allows workflows to be reused on different data on demand.
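As a minimal sketch of how ingestion and enrichment can stay separate while sharing one working data set, the following Python example uses an in-memory set of triples. The `ex:` prefix, the record fields, and the function names are assumptions made for illustration; they are not part of the platform.

```python
from typing import Dict, Iterable, Set, Tuple

Triple = Tuple[str, str, str]  # (subject, predicate, object)


def ingest(records: Iterable[dict]) -> Set[Triple]:
    """Transform source records into triples (hypothetical 'ex:' vocabulary)."""
    triples: Set[Triple] = set()
    for record in records:
        subject = f"ex:{record['id']}"
        for key, value in record.items():
            if key != "id":
                triples.add((subject, f"ex:{key}", str(value)))
    return triples


def enrich_same_author(graph: Set[Triple]) -> Set[Triple]:
    """Simple link generation: relate subjects that share an ex:author value."""
    by_author: Dict[str, Set[str]] = {}
    for s, p, o in graph:
        if p == "ex:author":
            by_author.setdefault(o, set()).add(s)
    links: Set[Triple] = set()
    for subjects in by_author.values():
        for a in subjects:
            for b in subjects:
                if a != b:
                    links.add((a, "ex:relatedTo", b))
    return links


# Illustrative source data; any other record set could reuse the same steps.
records = [
    {"id": "doc1", "author": "Alice", "title": "Graphs"},
    {"id": "doc2", "author": "Alice", "title": "Provenance"},
    {"id": "doc3", "author": "Bob", "title": "Messaging"},
]
working_data = ingest(records)
working_data |= enrich_same_author(working_data)
```

Because the enrichment step only reads and writes the shared triple set, it can run on any data that an ingestion step has loaded, which is the reuse property described above.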
The graph management components are in charge of storing and distributing the platform's internal data, which is stored exclusively in graph format. They also maintain the provenance of the data, which is the key feature of the platform.
The graph management components provide efficient update and query interfaces for the platform's internal use, and a UI that lets platform administrators visualize the state of the internal data.
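To make the idea of update and query interfaces with provenance more concrete, here is a small Python sketch under stated assumptions. `ProvenanceGraph`, its method names, and the activity identifiers are hypothetical; a real deployment would more likely rely on a graph store and a provenance vocabulary rather than an in-memory dictionary.

```python
from dataclasses import dataclass, field
from typing import Dict, Iterable, List, Optional, Set, Tuple

Triple = Tuple[str, str, str]  # (subject, predicate, object)


@dataclass
class ProvenanceGraph:
    """Hypothetical in-memory graph that records which activity produced each triple."""

    # Maps each triple to the set of activities (e.g. workflow runs) that asserted it.
    _origins: Dict[Triple, Set[str]] = field(default_factory=dict)

    def update(self, triples: Iterable[Triple], activity: str) -> None:
        """Add triples and remember the activity that produced them."""
        for triple in triples:
            self._origins.setdefault(triple, set()).add(activity)

    def query(self, s: Optional[str] = None, p: Optional[str] = None,
              o: Optional[str] = None) -> List[Triple]:
        """Return triples matching the pattern; None acts as a wildcard."""
        pattern = (s, p, o)
        return sorted(
            t for t in self._origins
            if all(want is None or want == got for want, got in zip(pattern, t))
        )

    def provenance(self, triple: Triple) -> Set[str]:
        """Return the activities that asserted the given triple."""
        return set(self._origins.get(triple, set()))


graph = ProvenanceGraph()
graph.update([("ex:doc1", "ex:author", "Alice")], activity="ingest-run-1")
graph.update([("ex:doc1", "ex:relatedTo", "ex:doc2")], activity="enrich-run-1")
graph.update([("ex:doc1", "ex:author", "Alice")], activity="ingest-run-2")
```

Keeping the producing activity next to every triple is one simple way to answer "where did this data come from" for both ingested and enriched statements.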
Distribution components expose the public subset of the working data, either as APIs or as applications. Possible implementations include SPARQL endpoints, REST(ful) document-based APIs, simple data dumps, a CKAN data registry, and a platform provenance browser.
Platform components communicate with each other through a dedicated messaging service. Using such a component makes it possible to implement persistent, fire-and-forget messaging and publish/subscribe communication within the platform.
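The messaging pattern can be sketched in a few lines of Python. This is an illustration only: the `Broker` class, the topic name, and the subscriber names are assumptions, and the in-memory queues merely stand in for the durable storage a real message broker would provide.

```python
from collections import defaultdict, deque


class Broker:
    """Hypothetical in-memory broker with one queue per subscriber and topic."""

    def __init__(self):
        # topic -> {subscriber name -> queue of undelivered messages}
        self._queues = defaultdict(dict)

    def subscribe(self, topic, subscriber):
        """Create a queue for the subscriber if it does not exist yet."""
        self._queues[topic].setdefault(subscriber, deque())

    def publish(self, topic, message):
        """Fire-and-forget: the publisher does not wait for any consumer."""
        for queue in self._queues[topic].values():
            queue.append(message)

    def consume(self, topic, subscriber):
        """Return the next pending message for the subscriber, or None."""
        queue = self._queues[topic].get(subscriber)
        return queue.popleft() if queue else None


broker = Broker()
broker.subscribe("graph.updated", "indexer")
broker.subscribe("graph.updated", "provenance-browser")
broker.publish("graph.updated", {"graph": "ex:working-data"})
```

Each subscriber receives its own copy of the message and can consume it later, so the publisher and the consumers do not need to be running at the same moment.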
Service discovery provides the backbone for the dynamic composition of the platform components. The idea is that one should be able to add new internal services (e.g. an NER plugin for workflows) without having to restart the whole platform.
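A rough Python sketch of the lookup-at-runtime idea follows. The `ServiceRegistry` class, the service name `ner`, and the address are hypothetical; an actual platform would typically delegate this to a dedicated discovery service.

```python
class ServiceRegistry:
    """Hypothetical registry that maps service names to known addresses."""

    def __init__(self):
        self._services = {}

    def register(self, name, address):
        """Announce a service instance while the platform keeps running."""
        self._services.setdefault(name, []).append(address)

    def deregister(self, name, address):
        """Remove a service instance, if it is registered."""
        addresses = self._services.get(name, [])
        if address in addresses:
            addresses.remove(address)

    def lookup(self, name):
        """Return a copy of the addresses currently registered for the name."""
        return list(self._services.get(name, []))


def resolve_service(registry, service_name):
    """Pick the first available address at call time, or None if absent."""
    addresses = registry.lookup(service_name)
    return addresses[0] if addresses else None


registry = ServiceRegistry()
before = resolve_service(registry, "ner")            # plugin not yet available
registry.register("ner", "http://ner-plugin:5000")   # illustrative address
after = resolve_service(registry, "ner")             # found without a restart
```

Because a workflow resolves the service each time it needs it, a newly registered plugin becomes usable on the next lookup rather than after a platform restart.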
Platform Deployment includes the configurations and tools needed to set up and maintain either the whole component-based ATTX platform or individual components. This component also addresses load-balancing and high-availability configurations.