SOA & WOA Magazine

On the Road to Web Service-Level Management

Web services are now delivering on the promise of interconnecting systems within and between organizations. But the benefits of open interoperability among such distributed resources come at a cost: they increase the complexity of the computing environment that has to be managed.

Earlier this year Gartner published a report defining a Web services management platform as one of four platforms required for successful Web service deployments. In this article we'll look at the role traditional systems management has played in the enterprise, requirements specific to managing a Web services environment, and how Web services in a managed environment can improve service levels while reducing the overhead and costs associated with managing complex distributed environments.

Systems Management
Since Web services are WSDL-described network-accessible software components, some of the basic requirements for the management of Web services are the same as those for the management of applications. Traditionally, application management has meant the monitoring and control of an application throughout its life cycle (installation through startup, execution, configuration, to final shutdown and decommissioning).

Monitoring includes collecting metrics and events relevant to the application from the application's execution environment (hardware, operating system, etc.), as well as the application itself. Control could include installation, configuration, startup, shutdown, and general health and status, as well as real-time tuning to ensure optimal performance. In order for an application to be "monitorable" and controllable, it needs to expose basic management information including identification, status, metrics, configuration, operations, and events. This set of information is referred to as the manageability model for the application. The information populating this model may be supplied explicitly by the application, implicitly through the application's environment, or through both channels.
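
As a minimal sketch, the manageability model described above could be represented as a simple data structure. The class, field, and method names here (`ManageabilityModel`, `record_metric`, `emit_event`, `invoke`) are hypothetical and chosen only for illustration; they are not taken from any management standard.

```python
from dataclasses import dataclass, field
from typing import Callable, Dict, List


@dataclass
class ManageabilityModel:
    """Hypothetical container for the six kinds of management information."""
    identification: str                      # unique name of the application
    status: str = "unknown"                  # e.g. "up", "down", "degraded"
    metrics: Dict[str, float] = field(default_factory=dict)
    configuration: Dict[str, str] = field(default_factory=dict)
    operations: Dict[str, Callable[[], None]] = field(default_factory=dict)
    events: List[str] = field(default_factory=list)

    def record_metric(self, name: str, value: float) -> None:
        self.metrics[name] = value

    def emit_event(self, description: str) -> None:
        self.events.append(description)

    def invoke(self, operation: str) -> None:
        # Run a control operation that the application has registered.
        self.operations[operation]()


# Illustrative usage with an assumed application name.
model = ManageabilityModel(identification="order-service")
model.operations["start"] = lambda: setattr(model, "status", "up")
model.invoke("start")
model.record_metric("requests_handled", 42.0)
model.emit_event("started")
```

In this sketch the application supplies the information explicitly; an environment-supplied value, such as host CPU load, could be written into `metrics` by whatever component observes it.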

The typical architectural model for enterprise systems management is the manager-agent. In this model, the management system communicates with an agent using a predetermined protocol that may be proprietary or based on standards. The agent is usually local to the application and responsible for communicating with the managed applications and the management system. The agent forwards events from the application to the management system, and forwards requests from the management system to the application.
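
The two forwarding directions of the manager-agent model can be sketched as follows. This is an illustration only: `Manager`, `Agent`, and their methods are hypothetical names, and in-process method calls stand in for whatever proprietary or standard protocol a real product would use.

```python
from typing import Callable, Dict, List


class Manager:
    """Hypothetical management system that receives forwarded events."""

    def __init__(self) -> None:
        self.received_events: List[str] = []

    def on_event(self, source: str, event: str) -> None:
        self.received_events.append(f"{source}: {event}")


class Agent:
    """Hypothetical agent sitting between one application and the manager."""

    def __init__(self, app_name: str, manager: Manager,
                 handlers: Dict[str, Callable[[], str]]) -> None:
        self.app_name = app_name
        self.manager = manager
        self.handlers = handlers  # operations the application exposes

    def forward_event(self, event: str) -> None:
        # Direction 1: application -> management system
        self.manager.on_event(self.app_name, event)

    def forward_request(self, operation: str) -> str:
        # Direction 2: management system -> application
        if operation not in self.handlers:
            return "unsupported"
        return self.handlers[operation]()


manager = Manager()
agent = Agent("billing", manager, {"status": lambda: "up"})
agent.forward_event("started")
```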

Management Standards
A number of management models and protocols were developed to standardize the manager-agent architecture, protocol, and information model. Even so, the most popular system management products deployed today, CA Unicenter and IBM/Tivoli, rely on proprietary protocols between their managers and agents. Fortunately, they also consume information from managed resources using standardized management technologies like the Simple Network Management Protocol (SNMP) and the Common Information Model (CIM).

SNMP, from the IETF, was developed as a stopgap solution for accessing management information on a device. SNMP, however, lacks native support for operations, secure authorization, and relationships. At the same time, it became obvious that the information model needed to be standardized independently of how the information was expressed or accessed.

The Distributed Management Task Force (DMTF) has been defining CIM, a standard manageability model for IT resources, for over five years. One of the most important benefits of CIM is its common vocabulary for simple concepts like status and description. The information in a CIM model can be accessed using Web-Based Enterprise Management (WBEM), which defines a protocol that carries XML-encoded CIM meta-schema, classes, and instances over HTTP. While many of the system, device, and network models are mature and complete, the application model is still being developed.

Nearly all the existing management technologies, protocols, and models have focused on managing the configuration and status of specific resources rather than business views of distributed applications and systems (in essence, business services). End-to-end views of the resources used by a business system are hard to develop and understand, and the status of those systems is even harder to infer. The traditional coarse granularity of management status (up, down, degraded, etc.) does not provide enough context to tell a busy operator or business administrator whether a system, or service, needs attention. More importantly, it doesn't tell them whether the system is behaving as expected by its users, namely their business partners, suppliers, customers, or other internal personnel.

Web Service Management
The emergence of service-oriented architectures (SOA) and industry commitment to Web services and Grid computing require new management capabilities in order to truly achieve the visions and goals of these initiatives. Both Web services and Grid services rely on deploying services that are loosely coupled across heterogeneous, dynamic environments. To effectively manage such a dynamic environment, it is critical that management decisions be based on messages (context and payload), service descriptions, and service provider information, and be decoupled from traditional management facilities. Management of this new environment extends past the traditional systems and applications to include administration of services and to provide management visibility and control at the service level.

Gartner defines the Web Services Management Platform (WSMP) as "a set of software services that is designed to help coordinate the activities of services while they are being used." Different service provider platforms (e.g., WebSphere, .NET, etc.) will not be able to manage services deployed on other platforms due to disparate management components. Gartner views a WSMP as the bridge to enable and provide interoperability for managing services across platforms.

Service-Based Management
Today most businesses are not investing in IT without a clear return on investment, lower total cost of ownership, and clearly demonstrated cost savings. Investments made in Web services and future Grid service initiatives offer the opportunity to meet these requirements, but the services need to be deployed in a consistent, repeatable, and manageable fashion. Compared with service-based management, traditional operations management has not been able to offer the management functionality needed to meet these requirements.

In order to achieve optimal IT investment, there must be strategic alignment between business requirements and IT investment and management to support that alignment, i.e., service-level management (SLM). SLM is achieved through the proper definition of services, relationships between services, and their correlation and representation as business processes. Traditional operations management platforms have been narrowly focused on specific systems and applications as opposed to a service-based dynamic environment.

Service-based management requires the provision and consumption of services in a nonintrusive manner while maintaining the loosely coupled nature of SOA. The WSMP does precisely this by acting as a transparent intermediary, or broker, between consumers and providers of services. The broker handles requests and manages the runtime provisioning of service endpoints to the requests dynamically. It finds the most appropriate service for service requests on demand. The broker then supports the interaction between consumer and provider with management facilities for availability, versioning, provisioning, configuration management, logging, auditing, alerting, error management, transformation, and integration with security facilities for authentication and authorization.

A WSMP that transparently mediates between Consumers and Providers to resolve service requests on demand offers two important advantages:

1.   Messages received can be intercepted or inspected for additional information pertaining to the consumer or the request and taken into consideration; for example, identity, geography, time, price, etc., offering opportunities for differentiated service offerings or intelligent routing based on context.
2.   Management can be proactive in resolving error conditions such as service failures or potential service-level breaches by intelligently routing messages to best available resources.
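
Both advantages can be illustrated with a small routing sketch: the broker inspects the context of a request (here, only geography) and routes around a failed provider. The `Request` and `Endpoint` types, the `resolve` function, and the example URLs are all hypothetical and do not correspond to any WSMP product's API.

```python
from dataclasses import dataclass
from typing import List


@dataclass
class Request:
    consumer: str      # identity of the requester
    region: str        # geography taken from the message context
    operation: str


@dataclass
class Endpoint:
    url: str
    region: str
    healthy: bool = True


def resolve(request: Request, endpoints: List[Endpoint]) -> Endpoint:
    """Pick an endpoint using message context, skipping failed providers."""
    healthy = [e for e in endpoints if e.healthy]
    if not healthy:
        raise RuntimeError("no healthy endpoint available")
    # Prefer an endpoint in the consumer's region, else any healthy one.
    for endpoint in healthy:
        if endpoint.region == request.region:
            return endpoint
    return healthy[0]


endpoints = [
    Endpoint("http://eu.example.com/quote", "eu"),
    Endpoint("http://us.example.com/quote", "us"),
]
request = Request(consumer="acme", region="us", operation="getQuote")
first_choice = resolve(request, endpoints)
endpoints[1].healthy = False            # simulate a provider failure
fallback = resolve(request, endpoints)  # request is rerouted transparently
```

The consumer issues the same request both times; only the broker's choice of endpoint changes.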

A WSMP should not be seen as a replacement for systems management, but rather as a conduit and extension to external, broader management facilities, such as those offered by Tivoli, CA, and BMC, which can correlate and add management to the infrastructure used to implement the Web service (hardware, networks, etc.).

Web Service-Level Management
Web services and their description form the basis for a common definition of assets between business and IT. Once the process-based Web services and individual Web services are identified and described they can both have service-level objectives (SLO) and service-level agreements (SLA) associated with them. This then enables business and IT to have a common definition of assets to support the alignment of business needs and IT resources, as well as define a level against which the service will be measured and managed.

For systems management to really meet the needs of the organization, both the resources being managed (Web services) and the manager (WSMP) need to take on certain responsibilities. The resources must provide enough information and operational interfaces such that the resources can be centrally monitored and controlled. The Manager must be capable of analyzing the information provided by individual resources and correlating information from multiple resources, and provide the ability to act on the information to better manage the Quality of Service (QoS) being achieved by services and offered to consumers. The manager, where appropriate, should also be able to manage the environment proactively such that every attempt is made to meet the declared QoS, whether those be internal SLO or a more formal SLA.

The major roles involved in interactions are the Provider and the Consumer; therefore, management should be addressed from the perspective of both roles. A Manager role, when seen from the Provider's perspective, has management capabilities and visibility beyond the Web service and into the service instance or implementation. The Provider, therefore, has management capabilities (visibility and control) over elements of the architecture that a Consumer would not, e.g. the hosting environment. The separation of service from its implementation and environment means that the lower-level elements that support the service are important to the Provider so they can manage at both levels. For example, service Providers may want to replicate service instances, launch new service instances to meet SLA at peak load times, or perhaps failover to alternative service instances hosted elsewhere. In this case, being able to manage the service as it is exposed to the Consumer and the elements of the architecture supporting that service is critical to meeting business objectives.

From the Consumer's perspective, only the service as defined by the service definition they have consumed can be managed. The Consumer acting as a Manager will have visibility and control of the requesters, but in most cases will not be offered management beyond visibility (metering and monitoring) for the service. For example, the Consumer (or a third party) may want to, and be allowed to, look at the performance and availability metrics or measurements offered by a service it consumes to ensure adherence to SLA.

There are three interrelated arenas of responsibility for Web services:

1.   Service monitoring and reporting: Monitoring and reporting on the usage, health, and QoS being delivered by services
2.   Service execution management: Routing of requests, differentiated service offerings, security, and fault management
3.   Service environment management: Dependency management, deployment, nondisruptive versioning, and upgrades

Service Monitoring and Reporting
At the very least, a Manager has to be capable of garnering metrics and events about a service's performance, availability, usage, and configuration (policy driven). Metrics represent information logged by the service, gathered periodically, or requested at a point in time, and should be as raw as possible such that ambiguity in their meaning is avoided, since any type of metric or measurement is meaningless without knowing the formula used to derive value. The Manager also correlates information from multiple services such that measurements and monitoring cover entire transactions or processes that may involve multiple service interactions occurring over extended periods of time. Event monitoring includes listening for events posted by services that signal significant individual events, e.g. failure or state changes. Based on this management information, the Managers are capable of showing the health of the services and the system overall at any point in time.

Collected metrics are used to calculate measurements of performance, availability, and usage. With the Manager defining and recording measurements, services can be monitored to ensure they are meeting agreed-upon service levels. Breach conditions and early warnings of potential breach conditions can then be monitored and managed. The SLAs that define measurements should include the formulas used to calculate the parameters of the SLA (average response time, throughput, and availability). SLAs are defined for specific customers, so metric collection needs to be consumer aware and tie directly into the policies for authentication and authorization for the service. SLAs also define periods of time for which service-level objectives are applicable. Again, this definition affects configuration of the services in terms of warning events and metrics.
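
To make the point about explicit formulas concrete, here is a minimal sketch that derives the three SLA parameters from raw samples and flags breach and early-warning conditions. The formulas, the `Sample` type, and the 80% warning ratio are assumptions made for illustration; a real SLA would state its own.

```python
from dataclasses import dataclass
from typing import List


@dataclass
class Sample:
    response_ms: float   # raw metric: time taken to answer one request
    succeeded: bool      # raw metric: whether the request was served


def average_response_ms(samples: List[Sample]) -> float:
    return sum(s.response_ms for s in samples) / len(samples)


def availability(samples: List[Sample]) -> float:
    # Assumed formula: fraction of requests that succeeded.
    return sum(1 for s in samples if s.succeeded) / len(samples)


def throughput_per_s(samples: List[Sample], window_s: float) -> float:
    # Assumed formula: requests observed divided by the window length.
    return len(samples) / window_s


def check(value: float, limit: float, warn_ratio: float = 0.8) -> str:
    """Classify an upper-bound measurement as 'ok', 'warning', or 'breach'."""
    if value > limit:
        return "breach"
    if value > limit * warn_ratio:
        return "warning"
    return "ok"


samples = [Sample(100.0, True), Sample(200.0, True),
           Sample(300.0, True), Sample(400.0, False)]
avg = average_response_ms(samples)              # (100 + 200 + 300 + 400) / 4
avail = availability(samples)                   # 3 of 4 requests succeeded
rate = throughput_per_s(samples, window_s=2.0)  # 4 requests in 2 seconds
state = check(avg, limit=300.0)                 # 250 ms against a 300 ms limit
```

Because the raw samples are retained, a different agreed formula (for example, a percentile instead of the mean) could be computed from the same data. Making the collection consumer aware would amount to keeping a separate list of samples per authenticated consumer.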

Service Execution Management
Management of Web services is not just about providing metrics concerned with performance and availability; it is also about configuring the policies that drive execution and deliver the best QoS from the services being managed. This includes managing error conditions and fault situations, managing the load of requests based on the performance of available services and the identity of requesters, and ensuring that security policies for authentication and authorization are adhered to.

Web services that are provisioned may well be supported by many instances of the service so that anticipated capacity can be met within the agreed service levels defined for the service. This means that requests for service need to be intercepted or inspected so they can be routed to the most appropriate service instance available. In Grid computing, this can also mean life cycle management, where more service instances can be launched (capacity on demand) to support the Consumers and meet SLA. Routing may also be affected by policies concerning the identity of the requester. Important Consumers may be shown greater consideration when routing their requests, or Consumers with a more stringent SLA may take priority over others with looser agreements, such that all SLA are met.
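
One way to picture this kind of routing is a dispatcher that serves the most stringent SLA first and sends each request to the least loaded instance. This is a sketch under stated assumptions: the `Instance` type, the `dispatch` function, and the rule that a smaller response-time target means a more stringent SLA are all hypothetical.

```python
import heapq
from dataclasses import dataclass
from typing import List, Tuple


@dataclass
class Instance:
    name: str
    active_requests: int = 0


def dispatch(pending: List[Tuple[str, float]],
             instances: List[Instance]) -> List[Tuple[str, str]]:
    """Assign requests to instances.

    pending holds (consumer, sla_response_ms) pairs. A smaller SLA target is
    treated as more stringent and is served first. Each request goes to the
    instance with the fewest active requests at that moment.
    """
    # The arrival order breaks ties between consumers with equal SLA targets.
    queue = [(sla_ms, order, consumer)
             for order, (consumer, sla_ms) in enumerate(pending)]
    heapq.heapify(queue)
    assignments = []
    while queue:
        _, _, consumer = heapq.heappop(queue)
        target = min(instances, key=lambda i: i.active_requests)
        target.active_requests += 1
        assignments.append((consumer, target.name))
    return assignments


instances = [Instance("a", active_requests=2), Instance("b", active_requests=0)]
pending = [("bronze", 2000.0), ("gold", 200.0), ("silver", 800.0)]
assignments = dispatch(pending, instances)
```

Capacity on demand would fit into the same loop: if every instance is above some load threshold, the Manager could launch a new `Instance` and append it to the list before choosing a target.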

Web services can also evolve over time, especially where Consumer requirements drive changes into existing services. Routing must therefore be cognizant of versions and compatibility between services, such that rolling upgrades as well as side-by-side versions of services can be managed appropriately.
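
A version-aware router might look like the sketch below. The compatibility rule used here (same major version, deployed minor version at least the requested one) is an assumption for illustration, as are the `compatible` and `route` names and the example URLs; actual compatibility would come from the declared lineage of each service.

```python
from typing import Dict, Optional, Tuple

Version = Tuple[int, int]  # (major, minor)


def compatible(requested: Version, deployed: Version) -> bool:
    # Assumed rule: same major version, deployed minor >= requested minor.
    return deployed[0] == requested[0] and deployed[1] >= requested[1]


def route(requested: Version, deployed: Dict[Version, str]) -> Optional[str]:
    """Return the endpoint of the newest compatible version, or None."""
    candidates = [v for v in deployed if compatible(requested, v)]
    if not candidates:
        return None
    return deployed[max(candidates)]


# Side-by-side versions of one hypothetical service.
deployed = {
    (1, 0): "http://example.com/orders/v1.0",
    (1, 2): "http://example.com/orders/v1.2",
    (2, 0): "http://example.com/orders/v2.0",
}
```

With this table, `route((1, 1), deployed)` selects the 1.2 endpoint, while a request for major version 3 returns `None`. A rolling upgrade then consists of adding the new version to the table and removing the old one once no Consumer requires it.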

Managing the interactions between Consumers and Providers includes security and is part of managing the overall QoS. A managed environment provides security services that enable applications to enforce Access Control, Identity Management, and Entitlement Management. Requirements that define Quality of Protection (QoP) may well be defined in SLA, and policies that enforce that QoP are part of the configuration of a service and need to be enforced and managed.

Errors and failures are inevitable in any environment but should be managed (resolved) wherever possible such that Consumers are unaware of problems and can continue undisturbed. Errors and fault conditions need to be intercepted by the Manager before being returned to the Consumer to see if there are any ways in which the faults can be resolved. This is, again, based on configuration metadata and driven by policies that govern retry attempts with the requested service, routing to alternate services, back off times, and overall timeouts for requests.
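
The policy elements named above (retry attempts, alternate services, back-off times, and an overall timeout) can be combined in a short sketch. The function name `call_with_policy`, its parameters, and the exponential back-off schedule are assumptions for illustration; providers are modeled as plain callables.

```python
import time
from typing import Callable, List, Optional


def call_with_policy(providers: List[Callable[[], str]],
                     retries_per_provider: int = 2,
                     backoff_s: float = 0.01,
                     overall_timeout_s: float = 1.0) -> str:
    """Try each provider in order, retrying with back-off, within one deadline."""
    deadline = time.monotonic() + overall_timeout_s
    last_error: Optional[Exception] = None
    for provider in providers:
        for attempt in range(retries_per_provider):
            if time.monotonic() >= deadline:
                raise TimeoutError("overall timeout exceeded") from last_error
            try:
                return provider()
            except Exception as error:   # fault intercepted before the Consumer sees it
                last_error = error
                time.sleep(backoff_s * (2 ** attempt))  # back off before the next attempt
    raise RuntimeError("all providers failed") from last_error


calls: List[str] = []


def failing_primary() -> str:
    calls.append("primary")
    raise ConnectionError("primary unavailable")


def healthy_alternate() -> str:
    calls.append("alternate")
    return "ok"


result = call_with_policy([failing_primary, healthy_alternate])
```

The Consumer receives the alternate's response and never sees the two failed attempts against the primary. Only when every provider fails or the deadline passes is an error returned.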

Service Environment Management
As more Web services are deployed in organizations, they are unlikely to be isolated and will depend on other services to complete the functionality they offer. This means that deployed services will need to be managed in terms of their dependencies and relationships to other services. Management therefore needs to ascertain information about services and their relationships at deployment time so that they can be better managed. Relationships include requirements and dependencies - for example process-based Web services should define the services (more appropriately service types) they require for the activities defined within the process. Relationships can also be declared with respect to the lineage of the service, which encapsulates compatibility or noncompatibility between service versions. Relationships should also define conflicting or exclusive relationships where only a single instance of a service can be available at one time in a managed environment. These relationships help the Manager maintain the QoS of the services it manages in terms of availability, problem cause analysis, and evolving service deployment.
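
Declared dependencies make problem cause analysis a graph question. The sketch below, with a hypothetical `impacted_by` function and invented service names, finds every service affected when one service fails.

```python
from typing import Dict, List, Set


def impacted_by(failed: str, depends_on: Dict[str, List[str]]) -> Set[str]:
    """Return every service that directly or indirectly depends on `failed`."""
    impacted: Set[str] = set()
    frontier = [failed]
    while frontier:
        current = frontier.pop()
        for service, dependencies in depends_on.items():
            if current in dependencies and service not in impacted:
                impacted.add(service)
                frontier.append(service)
    return impacted


# Hypothetical dependency declarations captured at deployment time.
depends_on = {
    "order-process": ["inventory", "payment"],
    "payment": ["credit-check"],
    "inventory": [],
    "credit-check": [],
}
affected = impacted_by("credit-check", depends_on)
```

Here a failure in `credit-check` affects `payment` directly and `order-process` indirectly, so a Manager could report one root cause rather than three unrelated alarms.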

Information specific to a service definition (functional) can also be related to other information concerning the service, for example the generic SLA that applies to this service, or other service definitions (operational) that can be utilized at runtime for monitoring and configuring services.

The Manager uses these declared relationships and associated meta information at deployment time and runtime. Once deployed, the Manager is responsible for publishing the service to the appropriate discovery mechanism, making management and operational controls available to management applications and consoles, and accommodating non-disruptive evolution of the service. Evolution includes managing side-by-side versions of services and routing appropriately between them based on Consumer requests and rolling upgrades, where service implementations can be changed and extended without disruption to the Consumers.

All three areas of management are interrelated and need to work collaboratively within a WSMP to deliver real business value to an organization. For example, performance and availability information needs to be made available to the execution management facilities so that they can effectively manage QoS; service-level objectives need to be considered for reporting, monitoring, and execution management; security and identity information needs to be used for differentiated service offerings; and service relationships need to be used to correlate management information and metrics and to support nondisruptive change management.

Moving to Web Service-Level Management
In order for Web services to be manageable in an interoperable manner, three building blocks must be standardized: the basic manageability information that the components of the Web services architecture must support, how that information is accessed, and how the manageable components are discovered.

The first of the building blocks, the minimum, basic information required for managing Web services and their environment, is being developed and standardized in the W3C's Web Services Architecture Working Group's Management Task Force. The Web Services Architecture includes a requirement that implementations of the architecture must be manageable. A task force was initiated to satisfy this requirement and is working to publish a manageability model for each of the components of the Web services architecture, i.e., services, hosting environment, and discovery agency. The manageability model must include identification, configuration, metrics, and events for the components.

The second and third of the building blocks, access to and discovery of the management information, are being developed and standardized in the OASIS Management Protocol Technical Committee (MPTC). The MPTC is defining how to access manageability information for any managed resource using Web services. The same specification should also work for Web services as a specific type of managed resource.

These building blocks have been discussed in terms of how to manage Web services in particular, but you can see that the same principles can be applied to managing IT resources in general. The use of Web services to expose IT resources on Grid systems is being developed and standardized by GGF at Globus. They are also defining how to manage these IT resources using Web services and Grid services. It is logical that they should be able to leverage the same foundation building blocks being developed by the W3C and the OASIS MPTC.

Conclusion
Management of distributed computing environments has always been difficult, but the dynamic nature of Web services and the more loosely coupled nature of interactions make it even more difficult to control and administer. However, Web services, coupled with service-level management in a WSMP, offer opportunities for organizations to better leverage their existing IT investment, better manage IT assets and resources in alignment with business objectives, and introduce autonomic concepts in managing their IT infrastructure that effectively reduce the overhead and cost of managing and maintaining their operational systems. Web services management should not be seen as a requirement that arises only once services are deployed and proliferating throughout the enterprise; it is a requirement for day-one deployment of your first Web service.

Web service management platforms and Grid computing are critical components on the path to creating dynamic, service-centric networks of self-managing computing resources, and, as Gartner rightly points out, "enterprises that do not embrace the producer and management platforms will fail to deliver any Web services beyond trivial initiatives through 2004."

More Stories By Mark Potts

Mark Potts is a fellow and chief technology officer within HP's management software business. Prior to joining HP, Mark was the founder and chief technology officer of Talking Blocks, which HP acquired in September 2003. Talking Blocks was a software company that provided products for managing SOA and Web services.
