Friday, July 27, 2012

Good article on crowdsourced benchmarks

In a recent article, Derrick Harris talked about how crowdsourcing could be the future of benchmarking. As someone who participates deeply in standardized benchmarks (TPC and SPEC), I wanted to comment on some of the important messages in his blog.

Derrick talks about benchmarking within the context of Hadoop, but in general the article applies to benchmarking across multiple technologies. While SPEC and TPC benchmarks have incredible industry credibility, it's hard to ignore the fact that Hadoop, NoSQL, and many open source projects have long since played a different game. I read blogs all the time that talk about simple developer-laptop performance tests. While these benchmarks (more realistically, performance experiments) aren't what would matter for enterprise application performance in a datacenter, after some review and adjustment they usually contain good bits of performance knowledge. I also see single-vendor performance results and claims that give very little information.

I have, in the past, talked about the value of standardized benchmarks. I talked about why doing such benchmarks at SPEC leads to unbiased and trusted results. I think the key reasons are the rigor and openness of the review and the focus on scenarios that matter within enterprise computing. SPEC also has years of benchmarking experience to leverage in avoiding common performance testing mistakes. It's impossible to compare a developer-laptop performance experiment to any SPEC benchmark result; the result from SPEC is likely far more credible. With SPEC, benchmark results are usually submitted by a large number of vendors, meaning the benchmark matters to the industry. With performance experiments, until there is community review and community participation, there is only one vendor, which leads to "one-off" tests that have less long-standing industry value. The scenario I wrote about - a Microsoft "benchmarketing" single-vendor result - is a very good example of how results from a single vendor don't have much value.

But there is a problem with some SPEC benchmarks - the community in which results are disclosed and benchmarks are designed is a closed one. It's great that SPEC is an unbiased third party to the vendors, but that doesn't mean the review includes the community of consumers of the results. I think Derrick reflects on this by talking about how "big benchmarks" aren't running workloads anyone runs in production. I disagree, but I do believe that, due to the lack of an open community, it's harder for consumers to understand how the results compare to their own workloads. I can personally attest to how SPECjEnterprise2010 and its predecessors have improved Java-based application servers for all customer applications. While it might not be clear how that specific benchmark matters to a specific customer's use of a Java-based application server, it is not true that improvements shown via the benchmark don't benefit the customer's applications over the long haul. In contrast to Derrick's views, this is why customers benefit from vendors participating in such benchmarking themselves - I don't think these improvements would have occurred if all benchmarking had been done without vendor involvement.

BTW, full disclosure of the performance experiment and results is critical. You can see from the recent Oracle ad issue that the whole industry loses without such disclosure. Any performance data should be explained within the context of publicly available tests, methodology, tuning, etc.

I think if you put some of these views together (Derrick's and my standardized-benchmark views), you'll start to see some possible common threads. Here, I think, are the key points:

1) We need open, community-based benchmarking in this day and age (crowdsourcing is one option; more open standardized benchmarking is another). By doing this, the results should be seen as not only trustworthy but also understandable.

2) Any benchmark, to have value, must have multiple participants actively engaged in publishing results and actively discussing the results and the technologies that led to them. By doing this, the benchmark will have long-standing industry value.

I hope this post generates discussion (positive and negative). I'd love to take action and start to figure out how the industry can move forward with open, community-based benchmarking.

Thursday, July 26, 2012

Mobile Development on Resume - Check

In the last few weeks, I've been working on a mobile application as a side project. I used IBM Worklight Studio (get the free-for-developers version here) to design and package the application. Today I used the Android SDK to deploy that application to my Motorola Photon (Android) smartphone.

This isn't really a performance-oriented post, but I wanted to quickly talk about how easy this was. I was able to get this application written and deployed without any knowledge of Android programming specifics.

Worklight allows you to use open, portable HTML5/JavaScript and popular AJAX widget libraries (jQuery/Dojo) to implement applications that look native on each device you target, all while allowing you to access device-specific features. So far, I've only deployed to my personal Android device (note that I'm not an Apple fan). If my wife lets me, I might deploy to her iPhone (she, unfortunately, is an Apple fan). The cool thing about Worklight is that I should be able to take the same HTML5/JS codebase and re-target it to the iPhone. My guess is that now that I have one version done, moving to the iPhone shouldn't take more than a few hours.

With HTML5/JavaScript, JavaScript mobile widgets, and embedded browsers becoming ubiquitous, this environment really does seem to be becoming what Java is to servers: write "once", run "everywhere". I did run into small issues (like Date formatting), so it's not perfect yet, but it's getting darn close. It tends to feel like applets did on the client years and years ago. I wonder if this development paradigm will, over time, make the mobile development experience as easy and open as Java has made writing server applications.

Thursday, July 12, 2012

Basic WebSphere Liberty Profile Tuning

I recently needed to go beyond out-of-the-box tuning for the WebSphere Liberty Profile. I was working with a system that was front-ending services hosted in Liberty, and we wanted to make sure that Liberty wasn't the bottleneck in the overall system. It turns out the tuning proved that Liberty wasn't an issue even with the out-of-the-box settings. However, since the tuning isn't yet documented, I wanted to share what I learned. There will be more tuning information coming in a refresh to the InfoCenter, so I'll update this post with that link when the formal documentation exists. [EDIT]The InfoCenter now has tuning information - see this topic[/EDIT]. Here is what I tuned:

JVM Heap Sizing:

I won't advise you on heap size tuning, as there is a wealth of information on JVM tuning that applies equally to WAS or Liberty and is mostly driven by your application's memory needs. Of course, Liberty has a lower memory footprint, but beyond that the tuning is similar. To do basic heap tuning, create a file named jvm.options in the same directory as the server with the following contents.
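# Lines beginning with # are treated as comments in jvm.options.
# Equal min/max heap avoids resize pauses under load; 1024m is only an example - size to your application.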
-Xms1024m
-Xmx1024m

Thread Pool Sizing:

Thread pool tuning is always interesting. It's easy to say that you should create roughly the same number of threads as (or slightly more than) the number of server processor threads if the application has no I/O. Unfortunately, no application is devoid of I/O, and therefore threads need to wait sometimes. As a result, you usually want to allocate 4 to 5 times as many threads as could execute with no I/O. Based on this (and assuming a server that has 16 cores), add the following to your server.xml. Of course, this totally depends on the I/O your application does. I suggest starting with a lower value and doing a performance run; if you can't saturate the application, tune it higher and repeat. If you set this value over the optimal value, it won't hurt you tremendously, but you will be more efficient if you get closer to the optimal value due to reduced context/thread switching.
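<!-- Sized for a 16-core server: maxThreads at ~5x cores allows for I/O waits; adjust coreThreads based on performance runs -->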
<executor name="LargeThreadPool" id="default" coreThreads="40" maxThreads="80" keepAlive="60s" stealPolicy="STRICT" rejectedWorkPolicy="CALLER_RUNS" />

HTTP Keep-Alive Tuning:

As my application was services-based, I wanted the clients to be able to send multiple requests over one connection using HTTP keep-alive to keep latency down. Otherwise, the connection would close after each request and I'd have to endure HTTP/TCP setup/teardown cost on every request, which can be slow and burn up ephemeral ports on a load client. If you want keep-alives to be controlled by the client and effectively unlimited from the server side, set the following option in server.xml (make sure this is under an httpEndpoint stanza):
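<!-- -1 removes the server-side limit on requests per connection; the client decides when to close -->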
<httpOptions maxKeepAliveRequests="-1" />

Monitoring:

I was pinged by an IBMer trying to monitor the performance of Liberty. There is a good overview in the InfoCenter of the monitor feature you can add to a server to allow JMX-based monitoring.
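If you want to poke at what the monitor feature exposes, here is a minimal sketch. It assumes the monitor-1.0 feature is enabled in server.xml and that the code runs inside the server (for example, from a servlet), so the Liberty MBeans appear on the platform MBeanServer. The "WebSphere" JMX domain is my assumption here; check the InfoCenter for the exact names your release registers.

import java.lang.management.ManagementFactory;
import java.util.Set;
import javax.management.MBeanServer;
import javax.management.ObjectName;

public class MonitorDump {
    // Print every MBean registered under the (assumed) WebSphere JMX domain,
    // e.g. the thread pool and servlet statistics provided by the monitor feature
    public static void dump() throws Exception {
        MBeanServer mbs = ManagementFactory.getPlatformMBeanServer();
        Set<ObjectName> names = mbs.queryNames(new ObjectName("WebSphere:*"), null);
        for (ObjectName name : names) {
            System.out.println(name);
        }
    }
}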

Updated 2012-08-09 - Added monitoring section.

Tuesday, July 10, 2012

Web Scale and Web 2.0/Mobile Changes to App Server Performance

Recently, I've been working on two performance projects.  The first relates to a Web 2.0 and Mobile application designed for web scale.  The second relates to recent performance improvements we made in SPECjEnterprise2010 for WebSphere Application Server 8.5, which is based on Servlet/JSP MVC presentation technology designed to run on a clustered application server configuration for scale-out.  I wanted to write about how the app server behaves differently between these applications based on their inherently different approaches to application architecture.

A few years ago, I remember discussing how Web 2.0 would change the performance profile of a typical application server handling requests from browsers.  It was pretty simple to see that Web 2.0 would increase the number of "service" requests (JAX-RS serving HTTP/XML or HTTP/JSON).  Other, less obvious changes to the performance profile of an application server are documented below.

Static content goes away completely, saving app server cycles.

It should already be a well-known practice that HTML pages, images, and stylesheets that people consider static shouldn't be served by an application server; they should instead be moved to an HTTP server or content distribution network (CDN).  A full-blown application server just isn't the fastest server for serving basic HTTP requests for files that don't change.

If you look at a typical Servlet/JSP (Web 1.0 server-side MVC) approach, you'll see JSP pages stored on the server with substantial static content, mixed with scriptlets or tag libs that add some dynamic content.  If you look at a typical web page (let's take Twitter as an example), my guess is the static content on any dynamic page (the HTML markup to organize the table containing the tweets) is around 70% of the page content, with 30% being actual dynamic data.  We have spent a lot of time making our app server perform well sending out this mixed static and dynamic data from our JSP engine.  The output of such data includes handling dynamic includes, character set conversions, processing of tag libraries, etc.  This act of outputting the JSP content is the aggregation of basically static content and true dynamic content from back-end systems like databases.

Once you move to Web 2.0 and Mobile, you can treat that static content as truly static by moving it to a web server or CDN.  Now the browser can do the page aggregation, leveraging AJAX calls to get only the dynamic content from the application server via JAX-RS services, static content in the form of HTML and JavaScript served from web servers or CDNs, and JavaScript libraries to combine the two sets of content.  All that work that used to be done in the JSP engine is removed from the application server, freeing up cycles for true dynamic service computing.
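To make that concrete, here is a minimal sketch of the dynamic half once the JSP aggregation is gone: a JAX-RS resource that returns only the dynamic data as JSON, while the HTML and JavaScript that render it are served statically from a web server or CDN. The /tweets path and the hard-coded payload are purely illustrative.

import javax.ws.rs.GET;
import javax.ws.rs.Path;
import javax.ws.rs.Produces;
import javax.ws.rs.core.MediaType;

@Path("/tweets")
public class TweetResource {
    // Only the truly dynamic ~30% of the page crosses the app server;
    // in a real service this would come from a database or cache
    @GET
    @Produces(MediaType.APPLICATION_JSON)
    public String latest() {
        return "[{\"user\":\"someone\",\"text\":\"hello\"}]";
    }
}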

Sessions/Authentication change in Web Scale applications, offering easier scale-out.

As customers start to implement mobile solutions, they are finding that the load those solutions drive ends up being "Web Scale" - scalable beyond the traffic generated by browsers alone - due to the always-accessible apps offered to their customers.

In SPECjEnterprise, or any full-blown Java EE application that uses HttpSession, sessions are typically sticky: a load balancer out front redirects the second and subsequent web requests from any client back to the server that last serviced that client, based on an HTTP session cookie that identifies the preferred server.  Additionally, this session data is typically replicated across a cluster so that, in case the primary server for any user fails, the user can be redirected to a server that has a copy of the stateful session data.  These architectures simply assume that if the session isn't loadable locally or replicated, the user must not be logged in yet.

If one wants to write an application that scales to web scale, this approach isn't sufficient.  You will find that most services of such web scale (Amazon S3, Twitter, etc.) force the user to log in before accessing any AJAX services.  In doing so, they associate a cookie-based token with that browser that acts as an authorization token each service can double-check before allowing access.  They can check this token against a central authority no matter which application server the user comes through.  This allows the infrastructure to stay stateless and scale in ways that the clustered HttpSession/sticky load balancer approach doesn't allow.

This approach changes the performance profile of each request, as it means each service call needs to first authenticate the token before performing the actual service work.  I'm still experimenting with ways of optimizing this (using application-centric data grids such as eXtreme Scale can help), but it seems this trade-off of peak request latency for the benefit of easier horizontal scale-out is to be expected in Web Scale architectures.
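As a rough sketch of what that per-request token check looks like, here is a hypothetical servlet filter sitting in front of the AJAX services. The authToken cookie name is made up, and the ConcurrentHashMap stands in for the central authority or data grid (such as eXtreme Scale) a real deployment would consult.

import java.io.IOException;
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.ConcurrentMap;
import javax.servlet.Filter;
import javax.servlet.FilterChain;
import javax.servlet.FilterConfig;
import javax.servlet.ServletException;
import javax.servlet.ServletRequest;
import javax.servlet.ServletResponse;
import javax.servlet.http.Cookie;
import javax.servlet.http.HttpServletRequest;
import javax.servlet.http.HttpServletResponse;

public class AuthTokenFilter implements Filter {
    // Stand-in for the central authority or data grid a real deployment would use
    private final ConcurrentMap<String, String> validTokens = new ConcurrentHashMap<String, String>();

    public void init(FilterConfig config) { }

    public void doFilter(ServletRequest req, ServletResponse resp, FilterChain chain)
            throws IOException, ServletException {
        String token = findToken((HttpServletRequest) req);
        // Every service call validates the token first; no HttpSession is involved,
        // so any server in the farm can handle any request
        if (token == null || !validTokens.containsKey(token)) {
            ((HttpServletResponse) resp).sendError(HttpServletResponse.SC_UNAUTHORIZED);
            return;
        }
        chain.doFilter(req, resp);
    }

    private String findToken(HttpServletRequest request) {
        Cookie[] cookies = request.getCookies();
        if (cookies == null) {
            return null;
        }
        for (Cookie c : cookies) {
            if ("authToken".equals(c.getName())) {
                return c.getValue();
            }
        }
        return null;
    }

    public void destroy() { }
}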

I think both of these changes are quite interesting when considering the performance of Web 2.0/Mobile and Web Scale applications, and neither is obvious until you start implementing such an architecture.  Both demonstrate simplifying approaches to web development that embrace web concepts while helping the performance and scale of your applications.

Have you seen any other changes to your application infrastructure as you have moved to Web 2.0 and Mobile?  If so, feel free to leave a comment.