Saturday, May 19, 2012

Viking Culture - An Agile Approach

What?
What do Vikings have in common with lean and agile approaches to Software Engineering or Enterprise Architecture? How does your quest find its way in troubled and uncertain waters? I have heard talks about engineering that date back to the 18th century, but I would like to go back to the 7th century. As I was reading "The Hammer and the Cross" by Robert Ferguson, some similarities with a long-gone period struck me: engineering with scarce resources and the ability to cope in a hostile environment, while still enjoying cultural, mercantile and military success for a few hundred years. They discovered America 500 years before Columbus, and they certainly influenced the northern parts of Europe (e.g. Great Britain and Normandy). (This is no cultural ranking or justification, but is used to highlight some aspects of agile and lean practices. See footnote.)


Odin
Risky knowledge seeking
Gaining knowledge was very central in Viking culture and Northern heathendom. You were obliged to seek knowledge and experience, at any risk. Odin (the main god) sacrificed one eye to obtain knowledge. So at the very foundation of the culture is bold knowledge seeking. Knowledge seeking implies that failure will happen. Together with transparency (see below), this provided a basis for trust and forgiveness when failing, rather than embarrassment.
This is most certainly a foundation of software engineering as well; you will get nowhere without seeking knowledge. Be inquisitive, find combinations, test new technology, and understand its application. Then be patient as you convince stakeholders and project members of the new possibilities and how to apply them. Be bold and brave!


Don't give an oath
Early in the book Ferguson recounts how a Viking party, meeting with a king of England, was reluctant to give an oath. The agile view would be that only when you get into a situation do you know what to do. You just can't promise that you will act in a certain way; it depends on the situation.
This relates to up-front design, estimates and project plans. The nature of any project with a degree of uncertainty is that you will face elements that you can't foresee. So just don't promise anything; you will regret it.


Admit and go free
This is also about transparency. Very central to the culture was the "Ting". This shows a strong natural inclination towards involving the whole community in common decisions and in courts of law. Living sparsely across the country, communities met at both local and central places to discuss law and to judge actions. The gatherings show a balance between consensus-driven decisions and local actions depending on the situation. They also show that transparency is important for reaching consensus, and for individuals to understand the purpose and interpretation of law. If you failed, you would be forgiven, provided that your actions were well intended.
Viking "Longhouse"
As an example: if you killed someone (in daylight, or by fire at night...) and you passed three settlements without telling anyone, then you would be sentenced to death (or be declared an "outlaw"). But if you admitted your action (it may have been an accident, or depending on the situation it was considered self-defense ;-) ), that provided transparency and involved the community in your humble action.
In engineering, it is vital that errors are acted upon and that, by admitting them and taking responsibility, you take part in the team and handle things up-front, instead of pretending that nothing happened. Your actions will come to light some day, so it is better to face them.
Enterprise Architecture is about the same balance between business and IT: central steering (principles, target architecture and business value) combined with local actions (understanding purpose and executing projects).

You can't cheat nature
Surviving in a climate where the soil is frozen six months of the year takes planning, diligence, and risk-taking. In such conditions, everything depends on how well you plan, cooperate, build craftsmanship and run your life. You can't survive by lying or cheating. You can't cheat nature.
In our discipline, the system you build had better work. You can't pretend that it works. It is not a report, not a picture, not some abstract stuff: it is working software. It is out there for every user to see. You can't hide.
On the other hand, if you have vast resources you can make it happen even without a plan or good craftsmanship. It will just take longer and cost far too much. So even without an agile approach working software is possible, but that is not a desirable path.

History has its stories of success and failure. Success occurs now and then, but everything seems to fail in the end. Where are we heading?


(There are certainly other aspects of the Vikings that do not fit. The book is really about why the Viking age came to such a turbulent end. And there are certainly aspects of other great cultures of ancient times that would serve as examples.)
Creative Commons License
Viking Culture - An Agile Approach by Tormod Varhaugvik is licensed under a Creative Commons Attribution-ShareAlike 4.0 International License.

Wednesday, February 29, 2012

Module and Aggregate design in the CAH

As a result of the successful PoC we ran, we have detailed our design for Modules and Aggregates in the Continual Aggregate Hub. And maybe this really is our Kinder Surprise; three things in one:
  1. Maintainable - Modular, clear and clean functional code as close to business terms as possible. Domain Specific Language.
  2. Testable - Full test coverage. Exploit test-driven development and let the business write test cases as spreadsheets.
  3. Performance - Linearly scalable. Cost of hardware determines speed, not development time. Too often you start out with nice modules but end up cross-cutting the functionality and chopping up transactions into something far from the business terms, which makes a system hard to maintain. We will achieve high performance without rewriting.
This article explains the logical design of the Module responsible for assessment of income tax, the Aggregate that contains the assessment data, our DSL, deployment issues, and lifetime issues. They represent components of the CAH. Assessment of income tax is a type of "Validate, consolidate and fix", and the assessment data is of type "Fixed values".
We will release the Java source for this, and I expect our contractors (Ergo/Bekk) to blog and talk about the implementation. They too are eager on this subject and will focus on this direction of software design.

Tax Assessment Module
This is a logical view of the Module 'Selvangivelse 2010', handling the tax assessment form and the case handling on it for the year 2010. There will exist one such Module for each income year; it consumes supporting xml-documents for the assessment of tax for that year. The Module is where all business logic lives, and the Modules (of different types) together form the processing chain around the TaxInfo Aggregate Hub. It is layered as defined in DDD, and as you see, the Aggregate Hub is only present at the bottom - through a Repository - which is also the boundary between pure Java logic and the "grid infrastructure / scalable cache".
Selvangivelse (Tax form) Module
The core business logic is in 'SelvangivelseService'. This is where the aggregates from the 'SupportRepo' are extracted (through an Anti-Corruption Layer) and put into the business domain; all business logic is contained here, and it is where the real consistency checks reside (today there are 3000 assessment rules and 4000 consistency checks). 'SupportRepo' is read-only for this Module and may consist of as many as 47 different schemas, supplied by other Modules earlier in the "tax pipeline". 'SelvangivelseService' persists to and reads from 'SelvangivelseRepo', and makes sure that the latest version is consistent, or handles manual case handling on it. Tax assessment forms may also be delivered independently by the taxpayer (when no information about that taxpayer exists in 'SupportRepo'), so it is important to have a full audit of all changes. Tax Calculation is another Module; it is fed by 'SelvangivelseRepo', which for that component is read-only.
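As a sketch of this flow (the service and repository names echo the text; the helper types and method signatures are hypothetical):

    import java.util.List;

    // Sketch only: SupportRepo, SelvangivelseRepo and the ACL echo the text;
    // TaxPayerId, Year, SupportDocument and Anomalies are hypothetical helpers.
    public class SelvangivelseService {

        private final SupportRepo supportRepo;             // read-only, up to 47 schemas
        private final SelvangivelseRepo selvangivelseRepo; // owned by this Module
        private final SupportAcl acl;                      // Anti-Corruption Layer

        public SelvangivelseService(SupportRepo support, SelvangivelseRepo repo, SupportAcl acl) {
            this.supportRepo = support;
            this.selvangivelseRepo = repo;
            this.acl = acl;
        }

        public void assess(TaxPayerId id, Year year) {
            // Extract supporting aggregates and translate them into this
            // Module's domain model, keeping foreign schemas at the boundary.
            List<SupportDocument> raw = supportRepo.findFor(id, year);
            Selvangivelse form = acl.toDomain(raw);

            // The real consistency checks (thousands of rules) run here.
            Anomalies anomalies = ConsistencyRules.check(form);

            // Persist the latest consistent version, with full audit.
            selvangivelseRepo.store(form, anomalies);
        }
    }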
There is also a vertical line in the illustration: usage of the Selvangivelse (the assessment form) must be deployed separately from the production of it. (This "read-only stack" has been discussed previously; it provides better up-time and lets us migrate assessment forms in from the mainframe without converting the logic.)
Most of the business logic in 'SelvangivelseService' is static (a Java construct), so that it does not need any object allocation (it is faster). Object creation is mostly limited to the Entities present in the Aggregates. Limiting the amount of functionality present in the Aggregates that go into the cache makes the cache more robust against software upgrades. This may be in conflict with good object orientation, but our findings show a good balance. The tough business logic is actually not a concern of the cache-able objects, but a concern of the Service. In other words, a good match for DDD and a high-performance system!
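A minimal sketch of that balance (hypothetical names): a thin, cache-friendly entity, and the heavier rule kept as a static construct in the service layer:

    // Thin, cache-able entity: mostly state, little behaviour, so that a
    // software upgrade rarely invalidates objects already in the grid.
    public class Felt {                  // one field of the tax form
        private final String id;
        private final long amount;
        public Felt(String id, long amount) { this.id = id; this.amount = amount; }
        public String id() { return id; }
        public long amount() { return amount; }
    }

    // The tough logic as a static construct: no object allocation per
    // evaluation, and no behaviour riding along in the cached objects.
    public final class ConsistencyRules {
        private ConsistencyRules() {}
        public static boolean deductionCovered(Felt income, Felt deduction) {
            return income.amount() >= deduction.amount();
        }
    }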

Domain Specific Language
We see that maintenance is enhanced by having clear and functional code as close to business terms as possible. Good class and method names have been used, and we believe this is the right direction. We do not try to foresee changes to the business logic that may appear in the future, risking only to bloat the business code with generic constructs we may never need and that hinder understanding. We acknowledge that things have to be rewritten now and then, and that we must refactor in the future without affecting historic information or code (see deployment further down). Note that we have already separated functional areas into Modules in the design, meaning that one year's tax assessment handling may be completely different from another's.
Our DSL has been demonstrated to business people, and the terms relate to existing literature (the "Tax-ABC" is a nasty beast of 1100 pages ;-) ). The business confirms that they can read and understand the code, but we do not expect them to program. We expect them to give us test cases. Close communication is vital, and anyone can define test cases as columns in a spreadsheet. (There are harder test scenarios that cannot be represented as a table, but tables are a good start and actually cover most of what we have seen in this domain so far.)
Example DSL for summarizing fields in the Tax form
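The DSL itself was shown as an image in the original post. As a hypothetical stand-in - not the released source, and assuming static factory methods like sumOf, felt and kontroll - summarizing fields might read along these lines:

    // Summarize two income fields of the form into a total field.
    sumOf(felt("2.1.1"), felt("2.1.2")).into(felt("2.3"));

    // A consistency check expressed in the same vocabulary.
    kontroll("deductions may not exceed income")
        .when(felt("3.2.1").greaterThan(felt("2.3")))
        .reportAnomaly();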

The DSL approach is more feasible for us than a Rule Engine. Partly because there are not really that many rules, but rather data composition, validation and calculations, which a normal programming language is so good at. Also, with a clear validation layer/component, the class names and the freedom to program Java actually make the rule-set more understandable; it does not get so fragmented. The information model is central; it too has to be maintained and stay flexible. And last but not least: because of lifetime requirements, the support for source handling (e.g. GitHub), refactoring and code quality (e.g. Sonar) is so much more mature and well known in the Java world that any Rule Engine vendor just can't compete. (There certainly are domains where a Rule Engine is a good fit, but for our domain, we can wait.)

Logical design of the xml super-document
Aggregate store: the XML-document
All aggregates are stored in a super-document structure consisting of sections. The content of these sections is generic, except for the head. (See Aggregate Store For the Enterprise Application.)
The aggregate has a head that is common for all document types in the TaxInfo database. It defines the main keys and the protocol for exchanging them. It pretty much resembles the header of a message, or the key-object of the Aggregate in Domain Driven Design. The main aggregate boundary is defined by whom it concerns, who reported the data, the schema type and the legitimate period. These are the static, long-lasting properties of an aggregate in the domain of the CAH; we do not expect them to change without rebuilding the whole CAH. 'State' is the protocol, and as long as the 'state' is 'private', no other Module can use the document.
The other sections of the document are owned exclusively by the Module that produces such content, and belong to the domain that Module implements. 'Case' is the state and process information that the Module mandates; in the example, the document may be 'public' in any of the different phases that tax handling goes through. For example, for a typical taxpayer (identified by 'concerns'), when an income year is finished there will exist 4 documents of this type, each representing a phase in the tax process (prognosis, prefilled, delivered and assessed).
The Aggregate section is where the main business information is. All relevant content is stored here (also copied content, even though it may be present in other supporting documents). This makes the document valid on its own; it need not be assembled at query time or for later archiving. Any field may either be registered uniquely in this aggregate, or be copied and reference some other document as its master. The reference is held in 'ref GUID' and is used by the business logic in 'SelvangivelseService' to stitch objects in the supporting aggregates together and create the domain object model of the Module.
The Anomalies section contains validation errors and other defects in the Aggregate, and only concerns this occurrence. We may assess such a document even though it has anomalies, and the information here is relevant for giving the taxpayer more insight.
The Audit section contains all changes to the aggregate, including automatic handling, to provide insight into what the system has done during assessment. This log contains all changes from the first action in the first phase of assessment, not just those of the present document.
Both Anomalies and Audit can reference any field in the Aggregate.
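Putting the sections together, such a super-document might look roughly like this (a sketch; the element names are illustrative, not the actual schema):

    <document>
      <head>
        <concerns>person:12345</concerns>        <!-- who the data is about -->
        <reportedBy>employer:9876</reportedBy>   <!-- who reported the data -->
        <schemaType>selvangivelse-2010</schemaType>
        <period>2010</period>
        <state>public</state>   <!-- while 'private', no other Module may use it -->
      </head>
      <case phase="prefilled"/> <!-- prognosis | prefilled | delivered | assessed -->
      <aggregate>
        <felt id="2.1.1" refGUID="...">350000</felt> <!-- copied; master elsewhere -->
        <felt id="2.3">350000</felt>                 <!-- registered uniquely here -->
      </aggregate>
      <anomalies>
        <anomaly feltRef="2.1.1">reported by two employers</anomaly>
      </anomalies>
      <audit>
        <change feltRef="2.3" when="2011-03-01" by="system">summed from 2.1.*</change>
      </audit>
    </document>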
Aggregates are not stored all of the time; only at specific steps in the business process are things stored as xml, mostly when legislation states that we must have an official record, for example when we send the pre-filled form to the taxpayer. The rest of the time the business logic runs - either automatically because of events, or as manual case handling - and updates the objects without any persistence at all. Sweet!

Deployment
Logical deployment view
Every node is functionally equal and has the exact same business logic. To achieve scale the data is partitioned between the different nodes.
This illustration shows how the different types of aggregates are partitioned between the nodes, where the distribution key co-locates all aggregates that belong to the same "tax-family". This is transparent to the business logic, and the grid software makes sure that partitioning, jobs, indexes, failover, etc. are handled. By co-locating we know that all data access is local to the VM, which gives high performance. The business logic functions even with a different distribution key; it only takes more time to complete. Some jobs will span VMs and use more time, but Map-Reduce makes sure that most of the work is handled within one VM.
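With GemFire (the grid used in the PoC), this co-location can be expressed through its PartitionResolver hook; a sketch along these lines, where the AggregateKey type and its tax-family accessor are hypothetical:

    import com.gemstone.gemfire.cache.EntryOperation;
    import com.gemstone.gemfire.cache.PartitionResolver;

    // Routes every aggregate belonging to the same "tax-family" to the same
    // node, so that jobs touching one family stay local to that VM.
    public class TaxFamilyResolver implements PartitionResolver<AggregateKey, Object> {

        @Override
        public Object getRoutingObject(EntryOperation<AggregateKey, Object> op) {
            return op.getKey().taxFamilyId(); // hypothetical accessor on the key
        }

        @Override
        public String getName() { return "taxFamilyResolver"; }

        @Override
        public void close() {} // nothing to release in this sketch
    }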
We have learned that memory usage is pretty linear with time usage, and that the business logic is not the main driver: the cost difference between 10 and 100 rules matters much less than 2 or 3 KB of extra aggregate size. So fight for effective aggregates, and build clear Java business logic!

Lifetime deployment
Where do you draw the line?
There are deployment challenges as time goes by. Technology will change over time, and at some point we will have a new OS, JVM, or new grid software to run this. We have looked into different deployment models and believe we have the strategy and tool-set to manage this. At some point we must refactor big-time and take in new infrastructure software, but we do not want to affect previous years' data or logic. We want to handle historic information side-by-side in some Module, and the xml representation of the aggregate is the least common denominator; it is stored in TaxInfo. Modules deployed on different infrastructure must be able to communicate via services. Modules running on an old platform will co-exist with Modules on a new platform. In these scenarios we do not need high performance, so they need not co-exist in the same grid, but can communicate via WebServices.
In the above-mentioned solution space, it is important to distinguish between source and deployment. We may very well have a common source, with forks where necessary. Even though we deploy separate Modules on separate platforms, we can still keep control with a common source. (I may get back with some better illustrations on this later.)

Conclusion and challenges
It has now been shown that our domain fits this new architecture and that the CAH concept holds. We also think there has never been a better time to rewrite the existing legacy: we have a platform that performs, and we can build high-level test cases (both constructed and regression tests) "brick by brick" until we have full coverage. We also understand the core of the domain much better now, and we should hurry, because people carrying the core knowledge will soon leave for retirement.
This type of design need not run on a grid platform only because of performance. It also benefits from the aggregates being stored asynchronously: the persistence logic is out of the business logic, and there is no need to "save as soon as the user has pressed the button". It allows you to think horizontally through the business logic layer, and not vertically, as Java architecture sadly has forced us to for so long.
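A sketch of what that decoupling can look like (names hypothetical): the service reacts to business state changes, and only milestones requiring an official record trigger a store:

    // Sketch: persistence decoupled from user interaction. Only business
    // milestones (e.g. sending the pre-filled form) trigger a store.
    public class AssessmentProcess {

        private final SelvangivelseRepo repo; // the grid stores asynchronously

        public AssessmentProcess(SelvangivelseRepo repo) { this.repo = repo; }

        public void moveTo(Selvangivelse form, Phase phase) {
            form.enter(phase);
            if (phase.requiresOfficialRecord()) {
                repo.store(form); // the xml snapshot kept for this milestone
            }
            // otherwise: keep working on live objects, no persistence at all
        }
    }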
As presented in the PoC, we now have a fantastic opportunity to achieve the Kinder Surprise, but we certainly need good steering to make a better day.

Kinder Surprise; simpler, cheaper, faster!

Creative Commons License
Module and Aggregate design in the CAH by Tormod Varhaugvik is licensed under a Creative Commons Attribution-ShareAlike 4.0 International License.

Saturday, January 21, 2012

Tax Norway's PoC results

Our results from the Proof of Concept have been presented at Software 2012 (Massivt skalerbar skatteberegning - "Massively scalable tax calculation") and Ark 2012 (Kinderegget: enklere, billigere og mye raskere - "The Kinder egg: simpler, cheaper and much faster") here in Oslo. We have tested the essential complexity (the core of the problem, although a subset of the overall functionality) of our quite complex multi-phase tax assessment process (up to 47 sub-forms assembled onto a "tax-foundation" form of 800 fields with 4000 rules and 3000 validations), and the subsequent tax calculation (also quite complex).
As background, there is a context to this, a concept, and a logical design.

Our findings show that a 12-server grid (Intel i7) with 500 GB of RAM will process everything in less than 5 minutes (for a population of 5.1 million), on a hardware platform costing 5% of today's expenses. We also see that full test coverage is highly achievable and will of course drastically reduce maintenance cost in the long run. Plain Java with good class and method names (a DSL, "Domain Specific Language") makes this rock.

This platform handles tax forms at over 50,000 forms per second.
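A rough sanity check, using only the figures above: 5.1 million taxpayers in under 5 minutes is at least 5,100,000 / 300 s ≈ 17,000 taxpayers per second, and since each taxpayer's assessment assembles several sub-forms (up to 47), a rate above 50,000 individual forms per second is consistent with the 5-minute figure.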

The aggregates (as defined in Domain Driven Design) really make the difference, and in this domain they are a great fit.


BTW: this type of in-memory architecture will also benefit applications that do not need scale. Storage is asynchronous from the usage scenarios; just store when the right business state is met. The business logic and information model stay nice and clean. No persistence tweaking anymore :-)


See you at SW2012!


Creative Commons License
Tax Norway's PoC results by Tormod Varhaugvik is licensed under a Creative Commons Attribution-ShareAlike 4.0 International License.

Wednesday, December 14, 2011

Don't let the Enterprise Service Bus lead to Context Bleeding

In this article I would like to discuss bad usage of integration patterns and SOA tools, and why I often favor the Anti-Corruption Layer pattern of Domain Driven Design. I observe that many SOA projects do not end up with something more service oriented, but actually with an even bigger ball of mud. Suddenly many more systems must be live, projects slow down and things get more complicated. The ESB becomes One Ring to Rule them All, which is not a good thing.
Why is this?


Good service orientation should bring clear separation of concerns, easier maintenance, less code, independent deploy and release cycles, more frequent releases, easier sourcing, and a higher degree of flexibility (among others). The idea of a bus (ESB) is good enough, but it does not relieve you of the real challenge: complexity and functional dependency. Where you previously had FTP files separating silos, both silos must now be up and running at the same time. When things break, you have 100,000 broken messages on the bus to clean up. These messages are a long way from home; they have broken out of context. They are probably better understood within their own domain.

The challenge is to find a design and migration strategy with lower maintenance cost in the long run. You should make things simpler and testable by using DDD on your system portfolio.
The intention behind these integration patterns is good; the Aggregate and Canonical patterns promise encapsulation, but often end up handling complexity outside of its context. That leads to a tough maintenance situation.


Scenario

The initial stage where silos send and depend on information directly from each other:
Silos supporting and depending on each other

An ESB tries to make things easier, but the dependency is still there
Secondly, the ESB comes to the "rescue". We just put a product in between and pretend that we now have loose coupling. We may get technically looser coupling and reuse of services/formats, but the functional dependency is still there. Most probably this is not a situation with less maintenance; you have just introduced more architecture. In most integration scenarios, people with deep knowledge of their silo talk to each other, directly. The canonical format supported by an integration team is just a man in the middle, striving to understand the complexity behind services and messages. Then your total maintenance organization does not scale, and the functional throughput of projects slows down, because the integration team needs to know "everything".
Black Octopus tentacles

There is also another problem with this approach. Most tools (and actually their prescribed usage as taught in class) let the ESB product place adapters that go into the silo. Many silos have boundaries, like CICS, but others offer database connections, so that adapters for the ESB actually glue into the implementation of the silo. Now we are getting into an even more serious maintenance hell. Each silo has a maintenance cycle and an organization supporting the complex system it is. By not involving this organization, and not letting it support the services the silo actually offers to the environment, you will have trouble. The organization must know what the silo is being used for. How else are they going to support an SLA, or make sure that only consistent data leaves the silo? This is illustrated as a black octopus with tentacles into the systems, tying the different parts of the organization even closer together.

Context Bleeding
This very quickly leads to context bleeding. Every ESB vendor has tools and repositories for maintaining the canonical format. The problem is that it is maintained outside its domain and outside the organization supporting it. Now many Entities and Aggregates are outside their Bounded Context... Or even worse, they are replicated outside, endorsed in a more generic representation, where the integration team has put an extra layer of complexity on top. This generic representation also hides the ubiquitous language, making communication between organizations even harder. And just to add to this: how testable is it? This is your perfect "big ball of mud". You do not want to handle complexity outside its domain.
It is a much better situation when those with deep know-how of the silo construct and support the services and canonical messages it offers. The integration team should mostly be concerned with structure, not content.

The same can be said about business processes orchestrated outside of their domain; it easily degenerates into a "make minor" approach that does not enhance ease of maintenance. Too often there is high coupling between process state and domain state. (See Enterprise Wide Business Process Handling.)
Better approach

Dark green ACL maintained by each silo
So I think a better approach is to use Domain Driven Design and the Anti-Corruption Layer. This pattern better describes the ownership and purpose of integration with another domain, while keeping clear boundaries. Maintenance and release cycles are now aligned with the silos' own cycles. I also think there is a better chance of higher-level services where systems cooperate. This leads to simpler integration scenarios, illustrated by a slimmer ESB.
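As a sketch of what such a boundary looks like in code (hypothetical names; the point is that the translation is owned and maintained by the consuming silo):

    // Owned by the consuming silo: translates a message from another silo's
    // context into this domain's model, so foreign concepts never leak in.
    public class PartyAcl {

        public Customer toDomain(PartyMessage external) {
            // Map only what this domain needs; the rest of the foreign
            // context deliberately stops at the boundary.
            return new Customer(
                    new CustomerId(external.getPartyIdentifier()),
                    external.getLegalName());
        }
    }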

This is not complete without emphasizing the importance of functional decomposition between the silos, so that they have a clearer objective in the system portfolio. But that takes time, and often you need an in-between solution. ESB tools are nice for such ad-hoc work, but don't let it become your new legacy. Strive for services at a business level of granularity, so that you limit the "chatting" between systems and make usage more understandable (but this standardization is more a business challenge than an IT challenge). Too many ESBs end up as CRUD repositories, illustrating only an open, bleeding wound of the silo.

The objective is: Low Coupling - High Cohesion. Software design big or small - the same rules apply.

Creative Commons License
Don't let the Enterprise Service Bus lead to Context Bleeding by Tormod Varhaugvik is licensed under a Creative Commons Attribution-ShareAlike 4.0 International License.

Tuesday, September 27, 2011

Tax Norway's Proof of Concept


Why a Proof of Concept?

Continual Aggregate Hub and Processing
In our Target Architecture there is one central part (we will test concepts from the Continual Aggregate Hub, "Restaurant" and Aggregate Storage), and it has to do with tax and fee calculation. It does collection, assessment and calculation in one scalable architecture, and it has to be flexible as to collecting new types of information and putting new calculations on them. We believe it is unlikely that we will find a COTS product in this area, as the data and its rules are highly domestic and decided by our fellow politicians.
Our priorities:
  1. Maintenance - Modular, clear and clean functional code
  2. Testable - Full test coverage
  3. Speed - Linearly scalable. Cost of hardware determines speed, not development time.
Yes we can!
We have seen that so many properties of the large-scale financial architectures (such as the ones John Davies has talked about over the years) are similar to ours, that we have to find out if our domain and challenges can be solved the same way.
This is really the core of the PoC: we have to play, test and learn what this architecture is like for our domain and how it affects our ability to change. We need to show this to our stakeholders: business, architects, operations, programmers, designers etc. We must have this experience to reduce risk in planning, understand the cost, communicate between these groups, and describe the solution in more detail.
In practice we do a full-volume test for 2009 and 2010, but with limited amounts of basic data and business rules, and not fully end-to-end. We will tackle the core challenges with a small rule-set, and also have a "version 1" blueprint for new business initiatives we know will be coming. We target 2 years of data in memory, and assessment and tax calculation at 5000 tps. We target hardware costs at 10% of today's level.
It is a playground for new technology. We are testing how the domain can be solved in this type of architecture; it is not a test of a product. Any acquisition of a product will be done later.

Just do it
Late this spring we opened a bid for participation in this Proof of Concept, where we presented our thoughts about the target architecture (we have gotten a lot of good feedback on this :-) ). EDB/Ergo and Bekk won the bid, and they teamed up with Incept5.
We are starting with GemFire as the processing architecture, and as much plain-vanilla POJO as we can. We will be working on this through January 2012. This will rock!

2012.01.22: The results are here, and will be presented at Software 2012.


Creative Commons License
Tax Norway's Proof of Concept by Tormod Varhaugvik is licensed under a Creative Commons Attribution-ShareAlike 4.0 International License.

Thursday, September 22, 2011

Enterprise wide business process handling


Introduction
(October 2014: this article has been updated with the Case Container.)
Understanding the requirements, functional and non-functional, is vital when the architecture is defined. The theme of this article is enterprise-wide business process handling. The challenge is to make different types of systems participate in the business workflow, and to understand what happened for later insight. BPM, Business Process Management, is a huge field with many tools, notation forms, patterns etc. (See process handling in Continual Aggregate Hub and cooperating federated systems.) (A system in this text is the same as a Module, although it may reflect that our current systems are not good Modules. We want our current systems to cooperate as well as our new Modules do.)
We have chosen to think Domain Driven Design and to elaborate on these main issues:
  • how to automate the business process between cooperating Modules
  • how to ensure that some manual (human) task is done
  • how to achieve high degree of flexibility in changing business processes
  • how to know the current state of a Party
  • how to know what has happened to a Party
  • how to measure efficiency in the business activities over time 
  • how to make very different types of systems cooperate
 And some non-functional requirements:
  • how to handle the operational aspects over time (batch vs events)
  • how to best support that a manual task is now automated
  • how to make it testable
  • how to have it scalable
  • how to change and add Services and Modules without downtime
  • how to change process without downtime
  • how to re-run a process, if something has failed for some time
  • how to handle orchestration (defined process) and choreography (event driven) side by side
Observations of SOA tool state and of integration patterns
I will not discuss this too much, but just summarize some observations (I'll get back to the how in a later post):
  • In-flight processes are a pain
  • Long-lasting processes are a pain (both are: software needs updates now and then)
  • There is often no loose coupling between the process and the state of some domain entity. A lot of processes occur and function within the domain (a "make-minor" approach by some process-system is not good)
  • Processes are hard to test
  • Scaling and transaction handling get complex
  • Tools have too much lock-in
  • The promise of BPEL visual modeling as a way to communicate with business fails
  • The canonical integration pattern often leads to context bleeding and tight coupling
  • The aggregate integration pattern is often a sign of complex integration that should probably be addressed by a system of its own
  • Business process state is hidden, and the history of events is lost or drowns in technical details
  • Message brokers are great at moving messages, but bad at history, and not a good tool for operational flexibility
  • Too many parties are involved, so maintenance gets slow

Our logical 5 level design
The main goal is to have full control of all events that flow between cooperating Modules, without ending up with an uncontrollable event-driven system. (An event-driven system may just as well be diagnosed with "attention-deficit/hyperactivity disorder (ADHD)".)
  • Enterprise Process Log
    This is purely a log component, with services for registering business events and answering queries about them. It has no knowledge of the business process, but of course has some defined keys and types that define a valid business event (a business activity leads to a business event). It is the Modules that emit domain events to the log component, and the domain defines what the events are. It is a place to store the business-level CQRS type of events (or Soft State, if you like, from BASE); the more detailed events are kept in the Modules. Any type of system can emit events, either live or in large batches. The implementation effort to send events is small, and the events may be purely informational and not necessarily used in an automated process. This log will give insight into what happened to a Party. The lifetime principle is taken care of, as this log must be very long-lasting, so we want it as simple as possible. (A minimal sketch of such an event is shown below, after this list; see also The BAM Challenge.)
  • Global Task-List
    This is a simple stand-alone task-list whose sole responsibility is to assign tasks to groups of Case Handlers (CH) that do manual tasks in the enterprise-wide domain. The task-list has no knowledge of the overall process. The tasks it receives contain enough information to explain what the task is, and to redirect the CH to the right GUI for doing the work. The tasks are defined and maintained in the Module that needs the manual work, but dispatched to this common task-list. These are tasks that are well defined in a workflow. When a task is done, or no longer relevant (the Module that owns the task decides), it is removed from the task-list.
  • Process Flow Automation
    Once we have the events in the event log, automating the next step is quite easy and can be done in a lightweight manner. A simple Message Driven Bean may forward the message in an event-driven manner via some JMS queue, or a large legacy system may query for a file once a week, because that is the best operational situation for it (operational flexibility is also discussed in the CAH). Events may also be logged that only come into operational use a year later, making maintenance flexible and history robust.
  • Case Container
    This is discussed in the article about a generic Case Container representing all Cases handled in our domain. Its purpose is to contain the Case with all its metadata, the process state, and references to all incoming and outgoing xml-documents.
  • Case Handling System (super domain)
    This is usually called Case Handling and concerns cases that live outside the existing Modules or systems and need the completion of many formal sub-processes, where at the start it is not possible to foresee how the case will be solved. This is typically where the known business process ends and a more ad-hoc process takes over. This system also supports the collection of different types of information relevant for such case-to-case work; this information may very well be external information collected manually.
    (By 2014 we have not gone any further into this part of the overall design. It seems that the CAH and the Case Container are sufficient.)
Above is the logical design, and this is what we think we need. You might say it follows a hub-and-spoke design, where the Modules are the spokes and these components comprise the hub. They are all discrete components that interact in a service-oriented manner, with each other and with other Modules or systems. The main idea is that this will enhance maintenance and reduce the need for customizing COTS.
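As referenced in the Enterprise Process Log item above, here is a minimal sketch of what an emitted business event might carry (field names are hypothetical; the emitting Module owns the definition):

    import java.time.Instant;

    // Hypothetical business-level event record, as an emitting Module might define it.
    public class BusinessEvent {
        private final String partyId;           // the Party the event concerns
        private final String eventType;         // e.g. "TAX_FORM_DELIVERED"
        private final String caseContainerRef;  // reference to the Case Container
        private final Instant occurredAt;       // when the business activity happened

        public BusinessEvent(String partyId, String eventType,
                             String caseContainerRef, Instant occurredAt) {
            this.partyId = partyId;
            this.eventType = eventType;
            this.caseContainerRef = caseContainerRef;
            this.occurredAt = occurredAt;
        }
    }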

Ill 1. Basic flow and service orientation
Illustration 1 shows a fire-and-forget situation (green) where Tax Calculation and Collection are interested in House events. Tax Calculation wants them live, and the process automation forwards the event; Collection gets them every 2 months, via file, and then issues service requests (yellow) to Party to get the details. The EPL event holds a reference to the Case Container. The notified Module then uses the Case Container to open all relevant information for this Case and the process state the Case is in.





Ill 2. Application layer interacts with EWPH
Illustration 2 shows a Module of DDD where the application layer interacts with the enterprise-wide process handling. The green line shows how the Case Container is shipped to the Module, where its references are opened by the Application layer and sent to the Domain layer for handling. The Case Container is really like the container of the logistics world: it brings a complete set of goods from one place to another.


We are implementing this with REST and XML, where feeds play an important part in transporting events and data. URIs represent addresses for Modules, and are linked to specific Case Container Types.

We do not mandate the usage of the Task-list. If a task-list internal to a Module is more efficient for users that handle tasks solely within that Module, that is OK, as long as it gives better maintenance in the long run.
Nor do we mandate the technology that does the work in the process flow automation; it may be Message Driven Beans forwarding to some queue, Mule, Camel, BPEL for some orchestrated process, or a simple query to file.
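For the Message Driven Bean case, the forwarding component can be this small (a sketch using standard Java EE APIs; the queue names are hypothetical):

    import javax.ejb.ActivationConfigProperty;
    import javax.ejb.MessageDriven;
    import javax.jms.Message;
    import javax.jms.MessageListener;

    // Listens for House events from the Enterprise Process Log and forwards
    // the ones a Module wants live; no process engine involved.
    @MessageDriven(activationConfig = {
        @ActivationConfigProperty(propertyName = "destinationType",
                                  propertyValue = "javax.jms.Queue"),
        @ActivationConfigProperty(propertyName = "destination",
                                  propertyValue = "jms/EplHouseEvents")})
    public class HouseEventForwarder implements MessageListener {

        @Override
        public void onMessage(Message event) {
            // Forward to the consuming Module's queue (the JMS send is
            // omitted in this sketch), or filter first if only some
            // events are relevant.
        }
    }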

Design and overview
Of course you still have to have a good design and understanding of the business processes, and it must be there to communicate between business and IT (BPMN is great for this). But as in other areas of systems development: design is for communicating between people, and implementation is for communicating with machines. Therefore a combined design-implementation (e.g. BPEL) will have a hard time achieving both.
The business process is not fragmented, but I argue that the implementation of the business process is best handled in the above-mentioned manner: the process will sometimes occur within a Module (system), and sometimes in between them.


Creative Commons License
Enterprise wide business process handling by Tormod Varhaugvik is licensed under a Creative Commons Attribution-ShareAlike 4.0 International License.

Tuesday, September 20, 2011

The BAM challenge


One interesting issue arose when we looked into the requirements for Business Activity Monitoring (BAM). The main questions to answer are: "How is the organization performing?" and "Are we more effective than last year?". Key Performance Indicators (KPIs) are well-established jargon, but how do you obtain them?
A KPI can only be answered over time, and for large governmental organizations that is a long time. Just reorganizing may take a year, and making the new organization perform better takes longer than that.

To measure, you have to have measuring points, and they are to be compared over time. It is not about a business performance measuring tool by itself; it is about which business activities were performed, and how much effort some business entity put into each activity.


There are three challenges in this:
  • First, a reorganization usually affects how account dimensions in the financial systems are organized; that makes the accounts discontinuous and disturbs the measurement of the effort put in. Some long-lasting Business Effort measured in cost (Business Cost, BC) must be defined. When the account dimensions are defined, they must also map to this BC, so that we keep continuity.
    There are challenges to this: "How much does an IT system cost?" and "What is the price of this automated task?" (Many organizations have a hard time identifying this.)
  • Secondly, the Business Activity (BA) itself is, over time, performed in different IT systems or done manually. From experience with performance tuning, I believe it is obvious that the IT systems must support a long-lasting BA that survives the various implementations of that activity. That is what the Enterprise Process Log (see Enterprise Wide Process Handling) is all about. This is where we collect and keep the BAs.
  • The BC and the BA must cover a comparable period of time.
The BAM tool is not the solution by itself, and we don't want too much tied up in some SOA implementation. Also, BAM tools are often concerned with what happens on the ESB, and there are so many other places that may emit BAs and BCs. The solution must be simple and must last long.
KPI1 = BC1 / BA1 for a period
KPI2 = BC2 / BA2 for a period
In our domain a BA would be "number of tax statements processed" or "complaints handled" during some time period. The BC would be "cost of people and systems for assessing tax statements", or "cost of the complaints department", during the same time period.
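For illustration only, with invented numbers: if the cost of people and systems for assessing tax statements is 120 MNOK for a year (BC), and 4 million tax statements were processed in that same year (BA), then KPI = 120,000,000 / 4,000,000 = 30 NOK per statement. The figure stays comparable across reorganizations as long as the BC and BA keep their definitions and period.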

We will collect the BAs from the Enterprise Process Log, and use our data warehouse to combine them with the BCs and do the analysis. The analytics might be done in Excel, although we may buy something for the analysis and reporting.
It is the long-lasting measuring points and a standardized period of time that provide the real value.
Creative Commons License
The BAM challenge by Tormod Varhaugvik is licensed under a Creative Commons Attribution-ShareAlike 4.0 International License.