Monday, December 12, 2011

One Month of Open Source Development in 60 Seconds

In October, Wayne tweeted that I'm leading the Dash statistics for 2011 by number of commits and that I was #4 in 2010. Given my usual schedule as a full-time committer and lead of the CDO project, I'm not too surprised.

Of course I recognize and appreciate that a pretty large team has gathered around the CDO project and is making it increasingly successful. Following our tradition of making even complex systems comprehensible, I invite you to follow one month in the open source life of our swarm of busy little bumble bees:



[Embedded YouTube video]


Thank you, team, for making CDO!

Thank you, community, for using CDO!

Thank you, gource, for this nice visualization!

Friday, September 30, 2011

CDO 4.0 SR1 and 4.1 M2 are available

The CDO Model Repository has a new Downloads page that we're quite proud of:


It offers all kinds of nifty features like composite p2 repositories, repository contents pages, automatic release notes and, last but not least, our new help center (available for all 4.1 drops):


The help center contains the full reference documentation and will be augmented, step by step, with articles in the style of a programmer's guide.

For those who would like to play with CDO to get a first impression of how it feels, we're now offering example server and client products, ready to be installed on different platforms (see the first screenshot above).

We hope that these new services are convenient for you, and we'd appreciate your feedback to make them even better. Happy modeling...

Thursday, July 7, 2011

Concurrent Access to Models

Most people know that EMF models are inherently unsafe to access concurrently from multiple threads. It's immediately obvious when you look at the following code that has been generated with the standard JET templates.



Ed usually argues that it's the application's responsibility to control concurrent access to the model if it knows that multiple threads are involved. The application knows best how to do this efficiently for its specific access patterns and, ideally, how to avoid deadlocks. Note that adding synchronized modifiers everywhere is counterproductive: it wouldn't make the model completely thread-safe, and unordered access would likely end in deadlocks that are hard or impossible to resolve.

A model is basically nothing more than an object graph, and how a particular thread navigates through such an object graph is highly specific to the application. As a result, the most commonly implemented locking scope is the entire model. Only one thread at a time can access the model; all other threads must block on a single mutex:
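A minimal sketch of such a model-wide mutex in plain Java may help. It does not use EMF or EMF Transaction; `LockedModel` and `LockedModelDemo` are hypothetical names, and a simple list stands in for the model. Note how every access, even a read, has to be wrapped in the lock.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.locks.ReentrantLock;

// Hypothetical stand-in for a model; not an EMF or EMF Transaction API.
class LockedModel {
    private final ReentrantLock modelLock = new ReentrantLock(); // the single mutex
    private final List<String> employees = new ArrayList<>();

    // Writers must hold the model-wide lock.
    void addEmployee(String name) {
        modelLock.lock();
        try {
            employees.add(name);
        } finally {
            modelLock.unlock();
        }
    }

    // Readers must hold the very same lock, so they block writers and each other.
    int employeeCount() {
        modelLock.lock();
        try {
            return employees.size();
        } finally {
            modelLock.unlock();
        }
    }
}

class LockedModelDemo {
    // Starts the given number of threads, each adding one employee,
    // waits for all of them and returns the final employee count.
    static int run(int threadCount) {
        LockedModel model = new LockedModel();
        List<Thread> workers = new ArrayList<>();
        for (int i = 0; i < threadCount; i++) {
            String name = "Employee " + i;
            Thread worker = new Thread(() -> model.addEmployee(name));
            workers.add(worker);
            worker.start();
        }
        for (Thread worker : workers) {
            try {
                worker.join();
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt();
                throw new IllegalStateException(e);
            }
        }
        return model.employeeCount();
    }

    public static void main(String[] args) {
        System.out.println(run(100));
    }
}
```

The result is correct, but the threads are effectively serialized: while one of them holds the lock, all others wait, no matter which part of the model they are interested in.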

The EMF Transaction project supports a protocol for clients to read and write EMF models on multiple threads but it has two major drawbacks:
  • It is very coarse grained because the locking scope is the entire model.
  • It is intrusive because each single access to the model must be wrapped.
A one-way road! What if, instead of letting threads compete for access to the model, we hand a separate model copy to each thread? This is neither coarse-grained nor intrusive because each thread can access all model elements at all times with normal application code; no wrapper commands are needed.


This approach obviously enables concurrent threads to access the (their) model at any time, but hey, isn't it extremely expensive to instantiate the entire model multiple times? Of course it is! So let's go further down this road and see what can be done to solve the footprint issues.

Let's assume that in the most common scenarios the models can be pretty big but a single transaction, i.e., the number of objects changed between two consecutive commits, is rather small. Then we could refactor our model classes to delegate all model state access to a new kind of entity that can now be shared among the model objects of all open transactions. Let's call these shared entities revisions and their managing container a session.


The model objects are now very cheap in terms of footprint because they only store a pointer to their current revision in addition to some general EMF infrastructure such as the list of adapters. The revisions contain all the modeled state plus a version number (which is explained below).
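To make the structure concrete, here is a small plain-Java sketch. All class names (`Revision`, `Session`, `ModelObject`) are hypothetical and only mirror the terms used above; this is not the CDO API, and a string-keyed map stands in for the modeled state.

```java
import java.util.Collections;
import java.util.HashMap;
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;

// Hypothetical names throughout; this is not the CDO API.

/** Immutable holder of the modeled state of one object, plus a version number. */
final class Revision {
    final long id;
    final int version;
    final Map<String, Object> values;

    Revision(long id, int version, Map<String, Object> values) {
        this.id = id;
        this.version = version;
        this.values = Collections.unmodifiableMap(new HashMap<>(values));
    }
}

/** Container that manages the revisions shared by all open transactions. */
final class Session {
    private final Map<Long, Revision> revisions = new ConcurrentHashMap<>();

    void add(Revision revision) {
        revisions.put(revision.id, revision);
    }

    Revision getRevision(long id) {
        return revisions.get(id);
    }
}

/** Lightweight model object: its only state is a pointer to a revision. */
final class ModelObject {
    private final Revision revision;

    ModelObject(Revision revision) {
        this.revision = revision;
    }

    Object get(String feature) {
        return revision.values.get(feature); // all state access is delegated
    }

    Revision revision() {
        return revision;
    }
}

class SharedRevisionDemo {
    public static void main(String[] args) {
        Session session = new Session();
        Map<String, Object> state = new HashMap<>();
        state.put("name", "ACME");
        session.add(new Revision(1L, 1, state));

        // Two transactions each get their own model object for the same revision.
        ModelObject inTx1 = new ModelObject(session.getRevision(1L));
        ModelObject inTx2 = new ModelObject(session.getRevision(1L));

        System.out.println(inTx1 != inTx2);                       // distinct model objects
        System.out.println(inTx1.revision() == inTx2.revision()); // one shared revision
        System.out.println(inTx2.get("name"));
    }
}
```

The revisions in this sketch are immutable, which is what makes sharing them between reading threads safe.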

Nice, now the model can be read by multiple threads without main memory being blown up. But with this design the original problem of concurrent write access is not addressed! The modifications that one thread applies to a model object end up in a shared revision, possibly overwriting changes made by other threads.

It's obvious that transaction scoped writes must not alter the shared state. So we refactor our model classes again so that the setters automatically create and link copies of the used shared revisions. Let's call them transactional revisions.
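The copy-on-write behavior of the setters could look roughly like the following plain-Java sketch. The names are hypothetical and a map again stands in for the modeled state; CDO's generated classes are certainly more involved. Each `ModelObject` is assumed to be used by the single thread that owns its transaction.

```java
import java.util.HashMap;
import java.util.Map;

// Hypothetical names; a sketch of the copy-on-write idea, not the CDO implementation.

final class Revision {
    final long id;
    final int version;
    final Map<String, Object> values;

    Revision(long id, int version, Map<String, Object> values) {
        this.id = id;
        this.version = version;
        this.values = new HashMap<>(values); // defensive copy of the state
    }

    /** Private copy for one transaction; it keeps the version it was copied from. */
    Revision copy() {
        return new Revision(id, version, values);
    }
}

final class ModelObject {
    private final Revision shared;  // visible to all transactions, never written here
    private Revision transactional; // created lazily on the first write

    ModelObject(Revision shared) {
        this.shared = shared;
    }

    Object get(String feature) {
        Revision current = transactional != null ? transactional : shared;
        return current.values.get(feature);
    }

    void set(String feature, Object value) {
        if (transactional == null) {
            transactional = shared.copy(); // copy on first write
        }
        transactional.values.put(feature, value);
    }

    boolean isDirty() {
        return transactional != null;
    }
}

class CopyOnWriteDemo {
    public static void main(String[] args) {
        Map<String, Object> state = new HashMap<>();
        state.put("name", "ACME");
        Revision shared = new Revision(1L, 1, state);

        ModelObject inTx1 = new ModelObject(shared);
        ModelObject inTx2 = new ModelObject(shared);

        inTx1.set("name", "ACME Inc."); // only transaction 1 sees this change

        System.out.println(inTx1.get("name"));
        System.out.println(inTx2.get("name"));
        System.out.println(shared.values.get("name"));
    }
}
```

Only objects that are actually modified pay for a copy, which matches the assumption that a single transaction touches few objects.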


A simple implementation of a commit operation would execute these steps:
  1. Check the versions of all transactional revisions against the versions of the current shared revisions to detect conflicting commits of other transactions.
  2. Move the transactional revisions into the session.
  3. Notify other transactions so that they can eventually adjust their revision pointers to the new shared revisions. Note that conflict potential in these other transactions can be detected early at this point in time!
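
The three steps above can be sketched in plain Java as follows. This is an illustration under assumptions, not CDO code: the names are hypothetical, a single `name` field stands in for all modeled state, a transactional revision carries the version it was copied from, and a successful commit simply increments that version.

```java
import java.util.ArrayList;
import java.util.Collection;
import java.util.Collections;
import java.util.HashMap;
import java.util.List;
import java.util.Map;
import java.util.function.Consumer;

// Hypothetical names; a simplified commit, not the CDO implementation.

final class Revision {
    final long id;
    final int version; // for a transactional revision: the version it was copied from
    final String name; // stands in for all modeled state

    Revision(long id, int version, String name) {
        this.id = id;
        this.version = version;
        this.name = name;
    }
}

final class Session {
    private final Map<Long, Revision> shared = new HashMap<>();
    private final List<Consumer<Revision>> listeners = new ArrayList<>();

    synchronized void addShared(Revision revision) {
        shared.put(revision.id, revision);
    }

    synchronized Revision getShared(long id) {
        return shared.get(id);
    }

    // Other transactions register here to learn about new shared revisions.
    synchronized void addListener(Consumer<Revision> listener) {
        listeners.add(listener);
    }

    /** Returns true if the commit succeeded, false if a conflict was detected. */
    synchronized boolean commit(Collection<Revision> transactionalRevisions) {
        // Step 1: detect conflicting commits of other transactions.
        for (Revision dirty : transactionalRevisions) {
            Revision current = shared.get(dirty.id);
            if (current != null && current.version != dirty.version) {
                return false;
            }
        }
        // Step 2: move the transactional revisions into the session.
        List<Revision> committed = new ArrayList<>();
        for (Revision dirty : transactionalRevisions) {
            Revision next = new Revision(dirty.id, dirty.version + 1, dirty.name);
            shared.put(next.id, next);
            committed.add(next);
        }
        // Step 3: notify other transactions (done under the lock to keep the sketch short).
        for (Consumer<Revision> listener : listeners) {
            for (Revision revision : committed) {
                listener.accept(revision);
            }
        }
        return true;
    }
}

class CommitDemo {
    public static void main(String[] args) {
        Session session = new Session();
        session.addShared(new Revision(1L, 1, "ACME"));

        // Two transactions copied version 1 and changed the name independently.
        Revision fromTx1 = new Revision(1L, 1, "ACME Inc.");
        Revision fromTx2 = new Revision(1L, 1, "ACME Ltd.");

        System.out.println(session.commit(Collections.singletonList(fromTx1)));
        System.out.println(session.commit(Collections.singletonList(fromTx2)));
        System.out.println(session.getShared(1L).name);
    }
}
```

The second commit in `main` is rejected because its transactional revision is still based on version 1 while the shared revision has already moved on to version 2.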

That's it! Too simple?

It probably isn't that simple in many ways. But there's already a mature Eclipse technology available that takes care of all of the aforementioned aspects and more.

Surprise, surprise, it's the CDO Model Repository, a highly efficient and scalable runtime platform for your models. The following code snippet illustrates how to use CDO to let 100 threads modify the same model:

You may have noticed that in the above example code the commit operation of a background thread can fail because the company object has just been modified by a different thread. With CDO you can easily implement a pessimistic locking strategy by acquiring a single explicit write lock on the company object. Alternatively you can register shipped or custom conflict resolvers with your transactions if you prefer to stay optimistic as long as possible.
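
The failing commit and one way to stay optimistic can be simulated without CDO. The following plain-Java sketch is not CDO code and uses no CDO API; `CompanyRevision` and `OptimisticCommitDemo` are hypothetical names. An `AtomicReference` plays the role of the shared company revision, a failed `compareAndSet` stands for a failed commit, and each thread simply retries on a fresh copy.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.atomic.AtomicInteger;
import java.util.concurrent.atomic.AtomicReference;

// Hypothetical simulation in plain Java; no CDO classes are involved.

final class CompanyRevision {
    final int version;
    final int employeeCount;

    CompanyRevision(int version, int employeeCount) {
        this.version = version;
        this.employeeCount = employeeCount;
    }
}

class OptimisticCommitDemo {
    // Counts commits that failed because another thread was faster.
    static final AtomicInteger failedCommits = new AtomicInteger();

    // Lets the given number of threads each add one employee to the same company.
    static CompanyRevision run(int threadCount) {
        AtomicReference<CompanyRevision> shared =
                new AtomicReference<>(new CompanyRevision(1, 0));
        List<Thread> workers = new ArrayList<>();
        for (int i = 0; i < threadCount; i++) {
            Thread worker = new Thread(() -> {
                while (true) {
                    // Work on a private copy based on the current shared revision.
                    CompanyRevision base = shared.get();
                    CompanyRevision changed =
                            new CompanyRevision(base.version + 1, base.employeeCount + 1);
                    // "Commit": succeeds only if nobody else committed in between.
                    if (shared.compareAndSet(base, changed)) {
                        return;
                    }
                    failedCommits.incrementAndGet(); // conflict, retry on fresh state
                }
            });
            workers.add(worker);
            worker.start();
        }
        for (Thread worker : workers) {
            try {
                worker.join();
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt();
                throw new IllegalStateException(e);
            }
        }
        return shared.get();
    }

    public static void main(String[] args) {
        CompanyRevision result = run(100);
        System.out.println(result.employeeCount + " employees, version " + result.version);
        System.out.println("failed commits: " + failedCommits.get());
    }
}
```

The number of failed commits varies from run to run; the final employee count does not. A pessimistic variant would acquire a lock before reading `base`, trading the retries for waiting time.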

Happy multithreading!

Wednesday, June 22, 2011

Bringing It All Together

A header that makes sense these days in several regards.

The Annual Release

The committers of dozens of Eclipse projects have worked hard to fix bugs and integrate new features. And today is, once more, the magic day when all this work becomes available to the public in a single combined effort, called Indigo this time.



Traditionally the weeks immediately preceding the annual release are dedicated to fixing bugs, augmenting the documentation or enhancing the homepage. So much fun!

Release Engineering Tools

It’s my belief that the only project better than a project with a process is a project with a process that is tool supported. This year I’ve taken the chance to invest in the release engineering tools of my project, the CDO Model Repository.


An incremental project builder validates the third version segment of OSGi bundles against a baseline of implementation digests, similar to what PDE’s API Tools achieve for the major and minor version numbers:
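
The basic idea behind such a check can be sketched in a few lines of plain Java. This is only an assumption about the principle, not the actual builder: the class `MicroVersionCheck`, the use of SHA-256 and the rule "changed digest requires a higher third segment" are illustrative choices.

```java
import java.nio.charset.StandardCharsets;
import java.security.MessageDigest;
import java.security.NoSuchAlgorithmException;

// Hypothetical sketch of a digest-based version check; not the actual CDO tooling.
class MicroVersionCheck {

    /** Hex-encoded SHA-256 digest of the given implementation content. */
    static String digest(byte[] content) {
        try {
            MessageDigest sha = MessageDigest.getInstance("SHA-256");
            StringBuilder hex = new StringBuilder();
            for (byte b : sha.digest(content)) {
                hex.append(String.format("%02x", b));
            }
            return hex.toString();
        } catch (NoSuchAlgorithmException e) {
            throw new IllegalStateException(e);
        }
    }

    /** Numeric value of the given segment (0 = major, 1 = minor, 2 = micro). */
    static int segment(String version, int index) {
        return Integer.parseInt(version.split("\\.")[index]);
    }

    /**
     * Returns false if the implementation changed relative to the baseline
     * but the third version segment was not increased.
     */
    static boolean isValid(String baselineVersion, String baselineDigest,
                           String currentVersion, byte[] currentContent) {
        if (baselineDigest.equals(digest(currentContent))) {
            return true; // nothing changed, no version bump required
        }
        if (segment(currentVersion, 0) != segment(baselineVersion, 0)
                || segment(currentVersion, 1) != segment(baselineVersion, 1)) {
            return true; // major or minor changed; left to other tooling
        }
        return segment(currentVersion, 2) > segment(baselineVersion, 2);
    }

    public static void main(String[] args) {
        byte[] released = "return 1;".getBytes(StandardCharsets.UTF_8);
        byte[] modified = "return 2;".getBytes(StandardCharsets.UTF_8);
        String baselineDigest = digest(released);

        System.out.println(isValid("4.0.0", baselineDigest, "4.0.0", released));
        System.out.println(isValid("4.0.0", baselineDigest, "4.0.0", modified));
        System.out.println(isValid("4.0.0", baselineDigest, "4.0.1", modified));
    }
}
```

In `main`, only the second call reports a problem: the content differs from the baseline while the version is still 4.0.0.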


A generator for modular help plugins combines the JavaDocs of multiple source plugins and enables cross-references between them:


An automatic promotion service recognizes new builds from the continuous integration, copies them to downloads.eclipse.org, composes them into a number of p2 repositories and generates web pages.


Of course we also deliver a large number of new features with today’s CDO 4.0 release but I plan to write a separate article about those.

Eclipse Demo Camps

Yesterday I attended the Eclipse demo camp in Braunschweig. Alex has taken a nice photo of my CDO 3D show:


The camp started early, all the presenters managed to keep to the schedule, and many of us rounded off this interesting evening at a local bar, enjoying food and drinks. Next Tuesday I’ll have the pleasure of doing the same demo again in Hamburg. I'm going to bridge that time with some decent gardening.


Next Wednesday Martin and I will organize our first demo camp in Berlin. The idea was born at the end of the last EclipseCon in Santa Clara when Mike Milinkovich told me that he’d like to come to Berlin and give an Orion demo. So there we go.


The registrations have already exceeded our minimum expectation of 50 but personally I hope that Berlin and Brandenburg can do better! Please take a minute and invite your friends, colleagues and partners to this event.


The chance to talk to the director of the Eclipse Foundation will not come back any time soon for most of us. The other presenters will contribute to a cool line up, too, of course. I’m looking forward to meeting you next Wednesday evening.

Monday, March 21, 2011

CDO Enters the 3rd Dimension

Update: The new room is Ballroom B+C (not D, as in the printed schedule!)

EclipseCon is near and I'd like to invite you to attend Martin's and my talk CDO 3D on Monday shortly after lunch time.


As you may or may not know, CDO is a runtime environment for distributed shared EMF models. Especially for organizations with huge models (e.g. NASA, banks like UBS AG, etc.) CDO is indispensable and has become sort of modeling mainstream in the past years.


Although I've always invested a lot in cool animated Powerpoint slides and although CDO comes with really new functionality each year, we've noticed a slight tendency of the conference audience to opt for parallel talks about completely new modeling technology when forced to choose. This fact (and the guy who shouted "next year we get Pixar Studios" after my last EclipseCon talk) has made me think about new ways of presenting a complex distributed technology. That's why this year's talk is titled "CDO 3D".


We will have no Powerpoint slides anymore but fully focus on real-time demos of a distributed system with a CDO model repository server and two CDO client applications. The client applications have RCP user interfaces, as well as a self-made scripting console that we will use to demo the API usage of CDO and the immediate influence of local CDO calls on the entire system.


In addition, we've developed a 3D visualization frontend that renders the contents and activities in multiple Java virtual machines into a 3D canvas in real-time. We've instrumented these VMs so that the frontend can even visualize the method calls between the Java objects and the network traffic between the VMs. This diagram outlines the basic architecture of our presentation system:


If you're still asking yourself "What the hell is he talking about?" watch this short video:


Of course we'll also talk about some of the cool new features in CDO 4.0 like OCL queries, Blobs and Clobs, cross referencing and referential integrity checks, fail-over clusters and the brand new backend integration with MongoDB. I'm looking forward to seeing you in Santa Clara!