Ajax data loading on the new details page: Multiget and Multicontroller

Multicontroller and Multiget:

So recently we went through a redesign of one of our web pages. This was pretty cool because it gave us a chance to rebuild the architecture from the ground up. We didn’t want to code super controllers, the kind that have too many lines of code and try to do everything in one place. That makes the code harder to maintain and reuse. Our page is also fairly long and has lots of data, so we didn’t want to load everything at once because that would be slow. Instead we use ajax and json to load in phases: load the important things first and let the user read that, then continue to load the rest of the page while they are reading. So our page is going to have many controllers (instead of one) and there is going to be lots of ajax.

Now the issues …
Adding more ajax urls and controllers isn’t free. Each ajax request carries the overhead of a network connection, and having a bunch of connections can kill your page performance. At the time of this writing, our new details page makes 25 separate ajax requests after the initial page load to get all the data it needs. To deal with these issues, we used a combination of Multiget and Multicontroller.

 

The implementation …
All the ajax urls we need for the page are stored in a single place in a javascript file.

"batch1": [
     "/myendpoint/trackbacks",
     "/myendpoint/neighborhoodstats",
     "/myendpoint/similars"
]

The good things about this design are:
1) If you want to know what ajax calls are being made and what data is being loaded, there is one place to go. You don’t have to dig through separate individual javascript widgets to find them.

2) You can separate the logic into different controllers. Instead of having one big controller that loads everything, you can give each kind of data its own controller, making the code more maintainable and reusable.

 

Multiget.
Now that it’s really easy to add more ajax urls, we need a way to deal with the network saturation problem mentioned earlier. Multiget is a concept for grouping individual ajax requests into a single batched request. Our implementation of it is shown in the images below. We group 3 separate ajax requests into a single multiget batch. The browser then sends a single network request for the batch to a web server in our data center. That web server then makes a separate network request for each ajax url. Those requests can be executed in parallel.

You might have noticed that we ended up with more tcp connections for multiget. This is true, but those connections are all within our data center, which is fast. The browser-side connection goes over the internet, which is slow (especially on mobile networks), so that is the one we minimize.
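To make the two halves of multiget concrete, here is a minimal sketch of the idea, not our production code. The `/multiget` path, the `url` query parameter, and the `fetchOne` callback are all made-up names for illustration; the only thing taken from the page itself is the `batch1` list. The first function builds the single request the browser would send, and the second shows the server-side fan-out, with one internal request per url, run in parallel.

```javascript
// Batch definition, mirroring the "batch1" list shown earlier.
const batches = {
  batch1: [
    "/myendpoint/trackbacks",
    "/myendpoint/neighborhoodstats",
    "/myendpoint/similars",
  ],
};

// Browser side: build ONE url that carries the whole batch.
// ASSUMPTION: the "/multiget" path and "url" parameter name are hypothetical.
function buildMultigetUrl(urls) {
  const query = urls.map((u) => "url=" + encodeURIComponent(u)).join("&");
  return "/multiget?" + query;
}

// Data center side: issue one internal request per url, in parallel,
// and return the results keyed by url.
// ASSUMPTION: fetchOne is any function (url) => Promise of that url's data.
async function handleMultiget(urls, fetchOne) {
  const results = await Promise.all(urls.map((u) => fetchOne(u)));
  const byUrl = {};
  urls.forEach((u, i) => {
    byUrl[u] = results[i];
  });
  return byUrl;
}
```

The browser pays for one slow connection over the internet, and the three fast connections inside `handleMultiget` stay within the data center.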

Ok fine, we have multiget, job is done right … not exactly.
I mentioned at the beginning that our details page includes 25 ajax requests. That means each time a user views our details page, and there are lots and lots of page views, our data center has to handle an additional 25 separate tcp network connections to load all the ajax data. Plus, each of those ajax requests has other overhead, such as user authentication or common java interceptor/filter code that executes for every request. Our data center has fast networks and servers, but it simply won’t scale if developers just keep adding more ajax endpoints.

 

Multicontroller.
Multicontroller is similar to Multiget but with one big difference: it will not open separate tcp connections to fulfill each ajax request. Instead it handles each ajax request within the same web server that received the initial batch request.

Now only a single tcp network connection is made, which drastically reduces the overhead costs associated with multiple network requests.

It’s not exactly perfect either…. (but nothing is)
1) Each request is handled serially. If you originally had one huge controller loading everything, this doesn’t make things any worse. Also, you could write a separate implementation that uses multiple threads to handle the requests in parallel.

2) You have less control over where a specific url is handled. For example, Redfin normally routes certain urls such as “/endpoint/similars” to very specific web servers because they are optimized to handle those requests. Because multicontroller always handles each request within the same server, you lose out on that optimization. You could adjust your routing logic to handle this problem, but so far we haven’t needed to.

Multicontroller is built on top of Spring MVC. The high-level pseudocode looks somewhat like…
String[] urlList = getAllUrlsFromRequest();
for (String url : urlList) {
    // request is a java representation of the url request
    HttpServletRequest request = getRequest(url);
    // use spring mvc to figure out which java class should be used to process the request
    Object handler = getHandlerForRequest(request);
    // now do the work internally: do not send a full network request, no special routing
    JSONView result = handler.process(request);
    // now write the result out to the jsp page …
}

With our new details page, you can mix and match multiget with multicontroller. You can even have many multicontroller requests within a multiget.

So as a developer, how do you know when you should use multiget or when to use multicontroller? Here are my initial thoughts:

1) If you have several controllers that load the same type of data, and each one is really quick, then you want to use multicontroller. For example, we have 3 ajax calls related to user personalization: one to load the user’s private notes, one to load the user’s photos, and one to load the user’s favorite homes. All 3 require user authentication, load a “login” object from the database, and respond within 20 milliseconds. This is a good use case for multicontroller, because now you only do authentication once, and retrieving common objects from the database (such as the login object) should be fast because it is already warm in Hibernate’s cache from the initial batch request.

2) If you have requests that need to be routed to special servers, use multiget.
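To illustrate the first point, here is a rough sketch of why grouping the personalization calls pays off. It is written in javascript only to keep it short and runnable; the real multicontroller is java on Spring MVC. The endpoint names, the `authenticate` stand-in, and passing the login object directly to each handler are all assumptions made for the example (in our case the sharing happens through the cache rather than a parameter).

```javascript
// ASSUMPTION: hypothetical in-process handlers for the three personalization
// calls. Each receives the already-loaded login object.
const handlers = {
  "/myendpoint/notes": (login) => ({ user: login.userId, notes: [] }),
  "/myendpoint/photos": (login) => ({ user: login.userId, photos: [] }),
  "/myendpoint/favorites": (login) => ({ user: login.userId, favorites: [] }),
};

// Counts how often the expensive shared work runs.
let authCalls = 0;

// Stand-in for authentication plus loading the "login" object.
function authenticate(sessionToken) {
  authCalls += 1;
  return { userId: "user-for-" + sessionToken };
}

// Multicontroller-style dispatch: authenticate once for the whole batch,
// then run every handler serially inside the same process, no network hop.
function handleMulticontroller(sessionToken, urls) {
  const login = authenticate(sessionToken);
  const results = {};
  for (const url of urls) {
    const handler = handlers[url];
    results[url] = handler ? handler(login) : { error: "no handler for url" };
  }
  return results;
}
```

Calling `handleMulticontroller` with all three urls runs `authenticate` once, where three separate ajax requests would each pay for it.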

 

Response Time:
Both Multiget and Multicontroller will only return once all the requests in the group have been completed. This means it will only be as fast as its slowest component. Keep this in mind when grouping requests.
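A quick back-of-the-envelope sketch of what this means when you pick a group. The latencies below are made-up numbers, and the model ignores the overhead of the batch request itself. It assumes the multiget fan-out runs in parallel, and that the multicontroller handles its requests serially as described above, in which case the individual times add up rather than just being bounded by the slowest one.

```javascript
// Requests run in parallel (multiget fan-out):
// the batch finishes when its slowest request finishes.
function parallelBatchMs(latenciesMs) {
  return Math.max(...latenciesMs);
}

// Requests run one after another (serial multicontroller):
// the batch takes the sum of the individual times.
function serialBatchMs(latenciesMs) {
  return latenciesMs.reduce((sum, ms) => sum + ms, 0);
}

// ASSUMPTION: illustrative latencies only, two quick calls and one slow one.
const group = [20, 20, 300];
console.log(parallelBatchMs(group)); // 300
console.log(serialBatchMs(group)); // 340
```

Either way the two 20 millisecond calls end up waiting on the 300 millisecond one, so the slow call probably belongs in a different group.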

 

Caching:
The cache header for the entire Multiget/Multicontroller response will be set to the lowest value of its individual requests.
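As a small sketch of that rule, assuming the cache header in question is a `Cache-Control` max-age in seconds (the function names and the choice of 0 for an empty batch are mine, for illustration only):

```javascript
// The batch can only be cached as long as its shortest-lived member.
// ASSUMPTION: an empty batch is treated as not cacheable (max-age of 0).
function batchMaxAge(maxAgesSeconds) {
  if (maxAgesSeconds.length === 0) return 0;
  return Math.min(...maxAgesSeconds);
}

// Build the header value for the combined response.
function cacheControlFor(maxAgesSeconds) {
  return "max-age=" + batchMaxAge(maxAgesSeconds);
}

// Three requests cacheable for 1 hour, 1 minute and 10 minutes:
console.log(cacheControlFor([3600, 60, 600])); // max-age=60
```

One request with a short cache time drags the whole group down to that value, which is another thing to keep in mind when grouping.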
