Outer Web Thought Log
May 12, 2004
Cocoon ramblings

Yesterday, I was at the J-Spring conference of the freshly minted Dutch JUG. I lost a previous version of my trip report because I had misconfigured ecto, so I'll briefly summarize here. I only saw a few talks since I spent much more time in the corridors chatting with people, but it still seemed like a genuinely interesting event. The opening keynote was about Java Studio Creator, which is a drag-and-drop IDE that came out of the Raven project. Jim Laden from Sun came over from the UK to present it, and Sun really could have sent a more experienced speaker, or at the very least someone better acquainted with the tool. His presentation and demo were quite bad, and running JSCreator on an underpowered laptop didn't do the tool justice. Underneath, JSCreator generates JSF code, and it was fun comparing that with what we have in Cocoon Woody/Forms: I think we captured all of what JSF does, minus the JCP stamp of course. I also went to another JSF talk, and to the closing keynote about Groovy. In the hallways and during the reception, I chatted with the MMBase guys, primarily about content management (duh) and open source licenses. Definitely a bunch of fun guys to hook up with.
My talk went well, primarily because it was the third or fourth time I delivered it, and because I was on right after the keynote, which was so bad that it was quite easy to do a bit better. I had plenty of impromptu chats with newbie Cocoon users and explorers during the day, and inevitably they all came up with the same two main remarks or concerns: performance and documentation.
While there is plenty of documentation for Cocoon, it is quite unorganized and requires some perseverance from its readers to locate what they need. This is caused by the main problem of Cocoon itself - which is the huge number of options to tackle a specific problem. While the "Power Trio" application development model is being pushed these days, all of the other/older options still exist, and must sometimes be used in combination with the Trio to attain a certain goal. Users find this confusing at best: they feel Cocoon is a huge beast and that they need to learn all about it before being able to make a good application design. If I contrast this with the number of Cocoon components we use on a daily basis, which shouldn't be more than 30% of what is available, it's easy to smile when people make this remark and tell them they should hire us for a workshop, but in the end it is clear that there's a lot of cruft in Cocoon and we should somehow address this. I sometimes feel that there's too much friendliness in the Cocoon community: we simply fail to delete someone's code -- afraid of hurting that guy's feelings. Also, while some application problems can be solved in multiple ways, these ways don't necessarily work well with each other, because of slight overlap or total orthogonality. The request for Cocoon books was there as well yesterday: we really need up-to-date 2.1 books.
Another recurring question was performance: because of its perceived complexity, people fear Cocoon will be much slower than a low-level Struts-based application. While people obviously mess up performance themselves by chaining multiple XSLT transformations or using complex XSLT on top of very large documents, Cocoon does have a certain overhead because of all of the machinery that comes with it. Also, some of the newer Trio stuff, like FlowScript, clearly hasn't been checked for performance sanity. I've been encountering this myself this week.
As some of you may know, we are currently building a not-so-easy Cocoon application, and we've been using the FlowScript/CForms combo for that. On Monday, out of sheer boredom and because I had just found out about ab, I started testing the publishing-only side of this application, which uses FlowScript to connect to a Java client object, which in turn uses HTTP (Jakarta httpclient) and XML to retrieve a document from the backend. Since all this HTTP and XML (de)serialization stuff comes with an overhead, I expected a difference between talking directly to the back-end's HTTP stack, and looking at the same document retrieved from the back-end by Cocoon, with some further (lightweight) transformation applied on top of it for display. I hadn't, however, expected the difference to be this big:

(average response time/request over 100 requests, no concurrency)
From experience, I know I'm able to get response times of 15 ms for a simple Cocoon pipeline on my test machine (just my laptop, duh): that must be the inherent overhead of using Cocoon on my hardware. Our backend application seems to be in the same range for normal-sized documents: about 15 ms as well. So we have a baseline response time of about 30 ms to account for in any case, and the rest of the response time will be needed for HTTP connection setup, retrieving the XML data from the backend, parsing it, and providing it to the publishing pipeline.
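For readers who want to reproduce this kind of measurement without ab, here is a minimal Java sketch of the same idea: run a request a fixed number of times, strictly one after the other (no concurrency), and report the mean wall-clock time per request. The class name `AverageLatency`, the method `averageMillis`, and the sleeping stand-in request are all assumptions made for illustration; none of them come from the application described here, and a real test would replace the stand-in with an actual HTTP GET against the page under test.

```java
// Hypothetical helper, not part of Cocoon or of the application discussed above.
class AverageLatency {

    /**
     * Runs the given request sequentially `requests` times and returns
     * the mean wall-clock time per run, in milliseconds.
     */
    static double averageMillis(Runnable request, int requests) {
        if (requests <= 0) {
            throw new IllegalArgumentException("requests must be positive");
        }
        long totalNanos = 0L;
        for (int i = 0; i < requests; i++) {
            long start = System.nanoTime();
            request.run();
            totalNanos += System.nanoTime() - start;
        }
        return totalNanos / 1_000_000.0 / requests;
    }

    public static void main(String[] args) {
        // Stand-in for a real request: sleeps 5 ms instead of fetching a page.
        Runnable fakeRequest = () -> {
            try {
                Thread.sleep(5);
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt();
            }
        };
        double average = averageMillis(fakeRequest, 100);
        System.out.printf("average response time: %.1f ms/request%n", average);
    }
}
```

A single sequential loop like this says nothing about behaviour under load, and the first few iterations will include JVM warm-up, so the numbers are only useful for comparing variants measured the same way on the same machine.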
Bruno also hacked together a Cocoon Action-based variant of the original FlowScript version, so that I was able to add that one to the comparison mix. As you can see, the FlowScript approach seems to have an average, built-in 60 ms penalty. 60 ms is huge in my mind, and since the Rhino stuff in Cocoon is now a bit abandoned and left on its own, and you encounter all sorts of integration mess between the Java and JavaScript worlds, we're not so convinced anymore that JS/FlowScript is a conceptually -and- technologically sound way to develop web applications.
I'm personally still quite sold on the continuations idea, but its implementation clearly seems to require a major overhaul, perhaps based on a different (non-scripting?) language, to be useful in serious production environments: JavaScript FlowScript makes your application between two and three times slower than using plain old Java (in Actions or pipeline components - we haven't ventured into the recently committed JavaFlow yet). Of course, raw performance isn't always the best way to gauge software quality, but at the very least it teaches us that speed problems haven't been encountered yet in a production environment, or else we would have heard about it. Hence we doubt whether it is of any use beyond playtime projects. Bear in mind, however, that it perhaps wasn't too smart of us to use FlowScript for mere publishing-oriented behaviour, and we should have made the switch to plain Java code earlier. OTOH, 80% of our application is about interactivity, forms and all that, and we expect/plan/design this to be fast as well. Besides, there's the constant, time-consuming, frustrating exploration of the current continuation/flow implementation in Cocoon - and the lack of time, energy and funding to come up with ways to address those issues. Oh, and the fact that we got grilled during our previous (premature and under-researched) attempt to do something about it, of course. But we're big boys with thick skins, and some time, some day, I'm sure something will happen.

Posted by stevenn at May 12, 2004 04:04 PM
Comments

Steven, thanks for speaking at the J-Spring conference. The evaluations are coming back at the moment, and from what I can see so far: you did well!!

Klaasjan Tukker, Chairman NL-JUG

Posted by: Klaasjan Tukker at May 12, 2004 10:41 PM

These are very interesting findings. Please, if you have a few minutes time to spare (I know) add a testcase for the javaflow implementation. I am *really* keen on seeing some numbers lined up!

Posted by: Torsten Curdt at May 13, 2004 07:20 AM

Same as Torsten: a test with JavaFlow would really be interesting.

We should not consider switching back to state-automata driven architectures because of a slowish JS implementation. Continuations allow so powerful interaction scenarios that we must work to make them faster. And JavaFlow is one of the answers to this.

Posted by: Sylvain Wallez at May 13, 2004 11:32 AM

Nice job producing some FUD, Steven. Why don't you spend the time actually profiling your application and demonstrate with certainty where the problem is, rather than just spouting your currently unfounded speculation. There may well be a problem with Rhino/JavaScript but you have _not_ demonstrated it.

Posted by: Chris Oliver at May 14, 2004 07:06 PM