Tuesday, October 03, 2006

Axis 2, Maven and the problems with distributed systems

I've had a lesson this evening in the fragility of distributed systems. I decided to try out Axis 2, the 1.0 release, and to further the education I brought down the source version to build it myself. Now Maven works with a whole list of repositories, and Axis has a whole heap of dependencies, which themselves have dependencies.

And like any good distributed system where you don't have control over the remote servers... these dependencies don't exist anymore. One was maven-itest-plugin, the other was a rather specifically named stax-utils-20060501.jar, and there could have been others. The solution, for anyone who wants to know, is to edit the project.properties file and add http://ws.zones.apache.org/~dims/maven/ as a repository; it all seems to work fine then.

But here is a good open source project, run by good guys who know their stuff, and they are being bitten by the age-old problem of distributed systems: they've designed something, or are using something, that assumes everything is always available. This is a common mistake that people make when building distributed systems, and SOA isn't going to solve this problem any time soon. What this example does show is that building robust systems is all about being prepared for failures and designing systems to fail gracefully.

Oh, and after I got it built and deployed, it threw a NullPointerException in the init...


4 comments:

Anonymous said...

In your post you talk about software "degrading gracefully".

Does this run at odds with the comments in this blog post about synchronized collections? It suggests that an application that has a problem but keeps on running is a "very bad thing".

I guess this is about the definition of "graceful"... when it has a problem it ought to let someone know, if it can.

Also, as usual, programmer experience counts (knowing what could go wrong).

Anonymous said...

"Does this run at odds to the comments in this blog post about synchronized collections - it suggests that an application that has a problem but keeps on running is a "very bad thing".
"

No, I don't think it's at odds - it's almost orthogonal really.....

ConcurrentModificationException is indicative of a pessimistic pattern that assumes you want a global, totally accurate view of everything for some well determined period of time.

It turns out that for a lot of concurrency and scaling reasons you want to avoid requiring this kind of accuracy. And in fact, for many things, it's completely unnecessary.

So, in this case, we would only have a "failure" were we to assert the tightest possible constraints on our concurrency model. And, if indeed you have such a failure, you'll have to stop.

In respect of graceful degradation, the looser our constraints, the better off we are in terms of our ability to continue to process. If, for example, we don't need a globally accurate picture, we can afford to lag on some updates rather than fail outright.
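
To make the contrast concrete, here is a minimal Java sketch. The class and method names are made up for illustration. ArrayList's iterator takes the strict view and throws ConcurrentModificationException when the list changes underneath it, while CopyOnWriteArrayList's iterator walks a snapshot and simply lags behind the late update.

```java
import java.util.ArrayList;
import java.util.ConcurrentModificationException;
import java.util.List;
import java.util.concurrent.CopyOnWriteArrayList;

final class IterationDemo {
    /** Adds an element while iterating; returns true if the iterator objected. */
    static boolean failsFast(List<String> list) {
        try {
            for (String s : list) {
                if (s.equals("a")) {
                    list.add("late"); // structural change during iteration
                }
            }
            return false; // the iterator tolerated the update
        } catch (ConcurrentModificationException e) {
            return true; // the iterator insisted on an exact view and gave up
        }
    }

    public static void main(String[] args) {
        List<String> strict = new ArrayList<>(List.of("a", "b"));
        List<String> relaxed = new CopyOnWriteArrayList<>(List.of("a", "b"));
        System.out.println("ArrayList fails fast: " + failsFast(strict));
        System.out.println("CopyOnWriteArrayList fails fast: " + failsFast(relaxed));
    }
}
```

The relaxed list still ends up containing the late element; the loop that was already running just never sees it, which is the kind of lag described above.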

In the case of Maven, I think replication is the only way to go rather than graceful degradation: either you can get what you need to build or you can't. Although it might be possible to allow some version skew in your build materials, which would yield a form of graceful degradation.

Regardless, the assumption that something is always available is a classic distributed systems mistake and not accounting for that in important things such as builds is bad. Whether it's a fault in Maven or just a deployment oversight, I wouldn't know - perhaps Maven has a workaround for this kind of thing like multiple sources for some piece of build material.
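
As a rough illustration of the "multiple sources" idea (this is not how Maven itself is implemented; MirrorResolver and Fetcher are hypothetical names), a resolver can try each repository in turn and only give up when every one of them has failed:

```java
import java.io.IOException;
import java.util.List;
import java.util.Optional;

final class MirrorResolver {
    /** Hypothetical abstraction over "download this artifact from that repository". */
    interface Fetcher {
        byte[] fetch(String repositoryUrl, String artifactPath) throws IOException;
    }

    /** Tries each repository in order; empty only if none of them could serve the artifact. */
    static Optional<byte[]> resolve(List<String> repositories, String artifactPath, Fetcher fetcher) {
        for (String repo : repositories) {
            try {
                return Optional.of(fetcher.fetch(repo, artifactPath));
            } catch (IOException e) {
                // One dead repository is expected; note it and move on to the next.
                System.err.println("Skipping " + repo + ": " + e.getMessage());
            }
        }
        return Optional.empty();
    }
}
```

The build still fails if no replica has the jar, but a single missing server no longer decides the outcome.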

My two cents,

Dan Creswell.
http://www.dancres.org/

Steve Jones said...

I'm with Dan here, it's horses for courses. If going wrong means that you get crap data and the system is unstable, then continuing is wrong. If, however, it's an air traffic control system, then it's a good idea to continue when you lose a logging thread.

If you are building a distributed system then you need to understand what is fatal and what you can cope with. Not being able to connect to a system doesn't mean you have to fail; you might drop over to another, more reliable but slower, communication mechanism.
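
A small Java sketch of that distinction, with invented names (FailoverSender, Channel) and the assumption that an IOException means "couldn't connect" while any unchecked exception means something genuinely fatal:

```java
import java.io.IOException;

final class FailoverSender {
    /** Hypothetical transport; IOException means the other side could not be reached. */
    interface Channel {
        void send(String message) throws IOException;
    }

    private final Channel primary;
    private final Channel fallback;

    FailoverSender(Channel primary, Channel fallback) {
        this.primary = primary;
        this.fallback = fallback;
    }

    /** Sends the message and reports which channel delivered it. */
    String send(String message) throws IOException {
        try {
            primary.send(message);
            return "primary";
        } catch (IOException unreachable) {
            // Something we can cope with: drop over to the slower channel.
            fallback.send(message);
            return "fallback";
        }
        // Unchecked exceptions (for example, a corrupt-state check) are deliberately
        // not caught here: those are treated as fatal and propagate to the caller.
    }
}
```

If the fallback can't connect either, its IOException reaches the caller, who then has to decide whether that is fatal.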

But I'll accept that with Maven it's just a matter of searching out where those jar files live!
