Tracking Service-Oriented and Web-Oriented Architecture

SOA & WOA Magazine

SOA Web Services Journal - Enterprise Data Fabric (EDF)

Formalizing middle-tier data management

Performance, scalability, and availability of traditional n-tier architectures have become enormously important to most IT architects. First, most meaningful business processes have at least a few real-time sub-processes, where latency and throughput are extremely critical factors to satisfy the mandated service level agreements. Second, data is becoming a dynamic entity, with an explosive increase in the number and types of data sources. With the advent of industry trends like algorithmic trading in capital markets and Radio Frequency Identification (RFID) in retail supply chains, instant analysis and distribution of relevant data streams is becoming a key requirement in most industries.

To address these challenges of modern IT, infrastructures such as J2EE (or Java EE going forward) need to support event-driven architectures in addition to traditional request-reply models. Such IT prerequisites drive the need for sophisticated middle-tier data management as a foundation for building, deploying, and operating mission-critical applications.

Enter EDF
Conceptually, an enterprise data fabric (EDF) represents a distributed middle-tier operational infrastructure that leverages main memory and disk across multiple disparate hardware nodes to store, analyze, distribute, and replicate data. By managing operational information in the middle tier, an EDF promotes the co-location of applications and the data they require as opposed to a typical siloed approach. By combining the essential features of caching, databases, messaging, and event processing, an EDF engenders a "distributed, active" data management approach that's quite antithetical to the "centralized, passive" approaches traditionally adopted.

Data Persistence and Virtualization
An EDF supports multiple distributed data caching and persistence topologies, virtualizing information from multiple data sources to guarantee high performance with location transparency. This enables consuming applications to use a single API to instantly access data irrespective of the underlying data source. Data can be managed in multiple heterogeneous formats (objects, tables, XML documents) and accessed via multiple programmatic and query languages such as Java, C++, SOAP, XPath, SQL, and OQL.
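The location transparency described above amounts to a read-through access pattern: the caller asks for a key through one API and never sees whether the value came from the in-memory fabric or a backing data source. A minimal sketch in plain Java, where `DataFabricRegion` and its loader are hypothetical illustrations rather than a real EDF API:

```java
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;
import java.util.function.Function;

// Hypothetical sketch of a location-transparent data region. The consuming
// application calls get(); whether the value is served from local memory or
// loaded from an underlying source (database, XML store) is invisible to it.
public class DataFabricRegion<K, V> {
    private final Map<K, V> localCache = new ConcurrentHashMap<>();
    private final Function<K, V> loader; // stands in for the backing data source

    public DataFabricRegion(Function<K, V> loader) {
        this.loader = loader;
    }

    // Single access point: hit the in-memory copy first, fall through
    // to the data source on a miss and retain the loaded value.
    public V get(K key) {
        return localCache.computeIfAbsent(key, loader);
    }

    public void put(K key, V value) {
        localCache.put(key, value);
    }
}
```

A real fabric would layer the heterogeneous query languages listed above on top of this single access path; the sketch only shows the caching and read-through behavior.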

Reliable Data Distribution
To support deployments like computational grids that potentially span hundreds or even thousands of nodes, an EDF provides a high-speed transport layer that utilizes multiple protocols such as TCP/IP and IP multicast. This layer enables scalable, reliable delivery of data across distributed members. Further, an EDF's distributed event notification service lets updates to a data element be propagated through a listener framework. Data distribution in an EDF may be push-based (changes propagated on update) or pull-based (changes propagated only on request). With the push-based model, distribution can be configured as synchronous or asynchronous. Unlike an enterprise messaging system, an EDF offers an intuitive object-based programming model without the overhead of message formats, headers, and payloads. Sharing data objects combined with event notification enables parallel data processing in grid-like topologies.
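The push-based, listener-driven model above can be sketched as follows (the `ReplicatedRegion` class and its listener interface are hypothetical illustrations; a real EDF would push updates over the network rather than in-process):

```java
import java.util.List;
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.CopyOnWriteArrayList;

// Hypothetical sketch of push-based distribution: members register listeners
// on a region, and every update is propagated to all of them at write time.
// Consumers work with plain data objects, not message headers and payloads.
public class ReplicatedRegion<K, V> {
    public interface UpdateListener<K, V> {
        void onUpdate(K key, V newValue);
    }

    private final Map<K, V> data = new ConcurrentHashMap<>();
    private final List<UpdateListener<K, V>> listeners = new CopyOnWriteArrayList<>();

    public void addListener(UpdateListener<K, V> listener) {
        listeners.add(listener);
    }

    // Synchronous push: put() does not return until every listener has
    // observed the change. An asynchronous variant would hand the
    // notification loop to a background thread instead.
    public void put(K key, V value) {
        data.put(key, value);
        for (UpdateListener<K, V> l : listeners) {
            l.onUpdate(key, value);
        }
    }

    public V get(K key) { // pull-based access: read only on request
        return data.get(key);
    }
}
```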

Continuous Analytics
Besides storing and virtualizing static data elements, an EDF can also manage fast-changing data streams through a continuous querying model in which new events are constantly analyzed against pre-defined queries or patterns of interest. This model is optimized to rapidly determine which queries are affected by a particular event and to notify the relevant client applications via a callback interface with minimal latency. An EDF provides a storage model for events that simultaneously allows correlation and querying across both real-time and non-real-time data.
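The core of the continuous querying model is an inversion of the usual database interaction: queries are registered once, and each arriving event is matched against the standing queries. A minimal in-process sketch (the `ContinuousQueryEngine` class is a hypothetical illustration; a production engine would index the queries rather than scan them linearly):

```java
import java.util.ArrayList;
import java.util.List;
import java.util.function.Consumer;
import java.util.function.Predicate;

// Hypothetical sketch of continuous querying: a client registers a standing
// query (a predicate) together with a callback, and every incoming event is
// evaluated against all registered queries as it arrives.
public class ContinuousQueryEngine<E> {
    private final List<Predicate<E>> queries = new ArrayList<>();
    private final List<Consumer<E>> callbacks = new ArrayList<>();

    public void register(Predicate<E> query, Consumer<E> callback) {
        queries.add(query);
        callbacks.add(callback);
    }

    // Evaluate a new event against all standing queries and notify
    // the callback of every query the event satisfies.
    public void onEvent(E event) {
        for (int i = 0; i < queries.size(); i++) {
            if (queries.get(i).test(event)) {
                callbacks.get(i).accept(event);
            }
        }
    }
}
```

A trading application, for instance, might register `price -> price > 100` once and then receive a callback for every tick that crosses the threshold, instead of polling with repeated SQL queries.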

High Availability
For business-critical operations, it's essential to ensure data availability at all times and avoid single points of failure (SPOF). An EDF provides high availability through replication to one or more "mirror" (backup) cache nodes. A mirror synchronously receives all data changes from the primary nodes across the entire distributed system, guaranteeing a complete backup of the data at all times. An EDF also supports disk persistence to guarantee data recovery even in scenarios involving complete application failure.
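The essential property of synchronous mirroring is that a write is applied to the backup before the call returns, so the mirror is a complete copy at every moment and can take over immediately on primary failure. A single-process sketch of that guarantee (the `MirroredCache` class is a hypothetical illustration; a real mirror would live on a separate node and replication would cross the network):

```java
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;

// Hypothetical sketch of synchronous mirroring: every write is applied to
// the mirror before put() returns, so the backup never lags the primary
// and can serve reads if the primary node is lost.
public class MirroredCache<K, V> {
    private final Map<K, V> primary = new ConcurrentHashMap<>();
    private final Map<K, V> mirror = new ConcurrentHashMap<>();

    public void put(K key, V value) {
        primary.put(key, value);
        mirror.put(key, value); // synchronous replication before returning
    }

    public V get(K key) {
        return primary.get(key);
    }

    // Failover path: serve reads from the mirror when the primary is down.
    public V getFromMirror(K key) {
        return mirror.get(key);
    }
}
```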

EDF in Today's SOA Infrastructure
As IT organizations strive to deploy SOA infrastructures to support real-time business processes, middle-tier data management via an EDF-like infrastructure becomes an absolute necessity. An EDF serves as data middleware augmenting other Service Oriented Architecture (SOA) components. Just as application servers like WebLogic provide a container for applications, an EDF acts as a data container that can be embedded in the JVM of an application server instance or shared across clusters of application servers. In relevant scenarios, an EDF can act either as a JTS-compliant transaction manager or participate in container-managed transactions as an XA resource.

In the broad context of an SOA, an EDF serves as a data middleware complement to an Enterprise Service Bus (ESB), which handles service/application integration, routing, and data transformations. Just as an ESB manages service flow and control, an EDF provides a distributed operational data storage layer from which ESB components can access data in multiple formats and in a highly available fashion. Alternatively, an EDF can expose the data held in the fabric as a service (with WSDL port types and operations), plugging easily into an ESB as one of the services. An EDF also complements the data mediation services offered by a typical Service Data Objects (SDO) implementation by providing supplementary functions such as data caching, data distribution, and continuous analytics. Similarly, business process management (BPM) engines in SOAs can use a data fabric as a distributed persistence layer for contextual state and frequently accessed data. These strategies address another extremely important dimension of service orientation: service latency and availability.

More Stories By Bharath Rangarajan

Bharath Rangarajan is director of product marketing at GemStone Systems, where he oversees product positioning and market strategy for the GemFire product line. He has more than seven years of experience in enterprise software, dealing with data management issues in functional areas such as EAI, B2B collaboration and supply chain management. He has led technical and product teams at companies including i2 Technologies, SeeBeyond Corp. and Candle Corp.

Most Recent Comments
Tangosol Coherence 08/14/06 03:14:10 PM EDT

Sounds a lot like Tangosol Coherence ..
