Introduction to AppEngine
AppEngine is the Platform as a Service (PaaS) offering from Google. It allows you to deploy and run Java and Python applications on Google's infrastructure. Java applications run in a sandboxed Servlet container that scales automatically. To deploy an application you just push a button in your IDE, without having to set up and manage an application server.
AppEngine problems
From time to time you’ll find some very negative feedback about AppEngine. In most cases this is not because AppEngine is bad, but because the applications were not designed to run on AppEngine. This article gives you some hints on how to design an application that works and performs well on AppEngine.
Google AppEngine is one of the most interesting cloud platforms for Java developers. It allows you to deploy applications within seconds and offers a broad range of services to make development easier. And it’s cheap; for small applications you don’t pay anything at all. This makes it the perfect platform for small-scale applications that can’t justify a dedicated server. The lack of good shared hosting solutions pushes most developers away from Java and into the PHP world for these kinds of small applications, but AppEngine solves that problem.
All this goodness comes at a price though. When you write applications the same way you would for a dedicated server, you’ll soon run into performance problems, even with the smallest application. That doesn’t mean AppEngine offers bad performance, but its architecture is so fundamentally different that you have to design your applications to match it. When you write your applications specifically for AppEngine you’ll unleash its full power and it will be a great platform.
The two main problems
The problems that you’ll face when using AppEngine come in two flavors which will both be discussed in this article:
- Instance startup times
- DataStore related problems
The first problem, instance startup time, is something you normally don’t care about. Most Java frameworks are designed to do as much processing as possible at application startup, because you rarely restart applications anyway. That’s very different on AppEngine. The core feature of AppEngine is its scalability. An application that gets a lot of load will automatically start extra virtual machines to spread the load over multiple machines. This doesn’t only happen under massive load; you’ll see new instances starting quite early. You’ll even run into this when your application gets hardly any load at all. To avoid wasting resources on idle applications, AppEngine stops all instances of an application that hasn’t received any requests for about a minute. So whatever kind of application you have, you’ll have to deal with new instances starting. Each time an instance starts you get a cold start of your application, including loading and starting all the frameworks you use. Every second counts now, because the user receives no response while the application is starting.
Google announced that AppEngine 1.4 will add the option to pay for reserved instances, along with an API to “warm up” new instances. This solves part of the startup problem, but you’ll have to pay for it. That may be no problem for large applications, but it is exactly what we were trying to avoid for smaller ones.
Performance gain 1 - Choose frameworks based on startup time
Many of the frameworks that are popular with Java developers add up to 25 seconds of startup time. Users will not wait 25 seconds to see a web page, so we have to improve this. The most important step is to get rid of frameworks that take very long to start up, and to configure the frameworks you do use for fast startup. This means not every framework is a good fit for AppEngine. An unfortunate example is Grails. Although Grails is one of my favorite frameworks in other environments, its 20+ second startup time is simply unacceptable on AppEngine. So, would I advise getting rid of all frameworks and using Servlets/JSP directly? Not really. That would set you back too much on productivity and code maintainability, and it’s not necessary either.
The two stacks I used a lot on AppEngine are the following:
- Weld, JSF2 and JAX-RS (more or less a stripped down Java EE 6 Web Profile)
- Spring 3.0 including Spring Web MVC
At this moment Spring still offers significantly better startup performance after some tuning, though. The Weld team is working hard on dramatically improving Weld’s startup time, which should make it a perfect fit for AppEngine in the upcoming version.
Performance gain 2 - Get rid of JPA/JDO
AppEngine offers two APIs to work with the DataStore. Remember that the DataStore is not a relational database. Because of that, both JPA and JDO lose some of their power:
- Relationship mappings are very limited
- Join queries are not supported
- Polymorphic queries are not supported
- Caching support works differently
Performance gain 3 - Don’t use classpath scanning
Whenever I use Spring I use annotations as much as possible to keep my XML configuration to a minimum. For declaring components I use @Controller/@Component instead of bean configuration in XML. However, this means the framework must scan for annotated classes at startup, which adds startup time. On AppEngine it’s always better to reduce scanning for classes.
Another example is RESTEasy. Normally I just let the framework scan for @Path annotations, but on AppEngine it’s better to use an explicit Application class instead. These are just two examples from frameworks I use a lot, but many other frameworks give you the same choice.
Performance gain 4 - Use memcache
Caching is useful for most web applications, but AppEngine gives you a great infrastructure for it. On AppEngine you can use MemCache which is a highly scalable distributed cache.
From an API point of view, MemCache is very similar to a HashMap, with methods such as put, get, delete and contains. Data in the cache can disappear at any moment (it’s not persistent), but will normally live until it expires. The expiration time is something you specify when you put something in the cache. The general idea is to put as much data in MemCache as you usefully can. Most web applications are read-mostly, which means there are many more users reading data than writing data.
Most people start by caching data from the DataStore. The DataStore is relatively slow (compared to a local RDBMS), so that’s a quick win. Objectify even supports this declaratively with annotations. You can go a step further though. For RESTful Web Services it’s useful to place JSON strings in the cache. Converting an object graph to a JSON string costs time, so why would you do that over and over again if the data didn’t change? The same is true for pages. You could create a Servlet filter that simply returns a cached page (the HTML) instead of re-rendering a page whose data didn’t change.
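The JSON caching idea can be sketched in plain Java. This is a minimal illustration under stated assumptions, not the AppEngine MemCache API: `ExpiringCache` is a hypothetical in-memory stand-in for MemCache, `JsonCacheSketch` and its method names are invented for this example, and the clock is passed in explicitly so the behaviour is easy to follow.

```java
import java.util.HashMap;
import java.util.Map;
import java.util.function.Supplier;

/** Hypothetical stand-in for MemCache: a map whose entries expire. Not the AppEngine API. */
class ExpiringCache {
    private static final class Entry {
        final String value;
        final long expiresAtMillis;

        Entry(String value, long expiresAtMillis) {
            this.value = value;
            this.expiresAtMillis = expiresAtMillis;
        }
    }

    private final Map<String, Entry> entries = new HashMap<>();

    /** Stores a value together with the moment it stops being valid. */
    void put(String key, String value, long ttlMillis, long nowMillis) {
        entries.put(key, new Entry(value, nowMillis + ttlMillis));
    }

    /** Returns the cached value, or null when it is missing or expired. */
    String get(String key, long nowMillis) {
        Entry entry = entries.get(key);
        if (entry == null) {
            return null;
        }
        if (nowMillis >= entry.expiresAtMillis) {
            entries.remove(key);
            return null;
        }
        return entry.value;
    }

    void delete(String key) {
        entries.remove(key);
    }
}

class JsonCacheSketch {
    private final ExpiringCache cache = new ExpiringCache();
    int renderCount = 0; // how often the expensive rendering actually ran

    /** Returns the cached JSON if present; otherwise renders it once and caches it. */
    String getJson(String key, long ttlMillis, long nowMillis, Supplier<String> render) {
        String cached = cache.get(key, nowMillis);
        if (cached != null) {
            return cached;
        }
        renderCount++;
        String json = render.get();
        cache.put(key, json, ttlMillis, nowMillis);
        return json;
    }

    /** Call this when the underlying data changes so stale JSON is not served. */
    void invalidate(String key) {
        cache.delete(key);
    }

    public static void main(String[] args) {
        JsonCacheSketch sketch = new JsonCacheSketch();
        sketch.getJson("books", 60_000, 0, () -> "[{\"title\":\"A\"}]");
        sketch.getJson("books", 60_000, 1_000, () -> "[{\"title\":\"A\"}]");
        System.out.println("renders: " + sketch.renderCount); // prints "renders: 1"
    }
}
```

Because cached data can disappear at any moment, the code treats a miss as normal and simply re-renders; the cache is only ever an optimisation, never the source of truth.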
DataStore usage
AppEngine’s DataStore is a non-relational, schemaless data store. Wait, let me repeat that: the DataStore is NOT relational. This is probably the most important thing to keep in mind while developing AppEngine applications. "No problem," you might say, "those NoSQL data stores are ultra scalable, so who would ever worry about performance?" Yes, the DataStore is extremely scalable. It has to store data for a virtually unlimited number of applications that all store a virtually unlimited amount of data. To be able to do that the DataStore must be distributed, so yes, it's scalable. But that doesn't go well together with traditional relational data.
Performance gain 5 - Join in-memory
Because the DataStore is so fundamentally different than a relational database, you must work with it in a different way too. First of all, there are no joins. The DataStore is basically one very large table, where each row can have its own set of columns. If there is only one table, a join doesn’t make much sense. Of course you still need relations between entities in your application, so we have to come up with something for that. Let’s take the following simple SQL query as an example:
SELECT emp.name, dep.name FROM employee emp
LEFT JOIN department dep ON dep.id = emp.dep_id
A first naive approach on AppEngine, here using books and their authors as the two related entities, could be:
- select all books
- iterate over books
- iterate over authorKeys for each book
- get author for each key
    Objectify ofy = ObjectifyService.begin();
    List<Book> books = ofy.query(Book.class).list();

    StringBuilder sb = new StringBuilder();
    for (Book book : books) {
        sb.append(book.getTitle());
        sb.append(": ");
        for (Key<Author> authorKey : book.getAuthorKeys()) {
            // One separate DataStore get for every author key of every book.
            final Author author = ofy.get(authorKey);
            sb.append(author.getFirstname()).append(author.getLastname()).append(", ");
        }
        sb.append("<br>");
    }
For each book we simply fetch every related author again, one at a time. Now we have a performance problem. In terms of the SQL example: if we have 500 employees, we would have 500 + 1 queries (the N + 1 problem). This approach wouldn’t perform on a relational database, and it doesn’t perform on AppEngine either.
One approach I use a lot in this case is an “in-memory join”:
- select all books
- select all authors
- build in-memory map of authors (key=authorId, value=author)
- iterate over books
- get author for book from in-memory map of authors
    Objectify ofy = ObjectifyService.begin();
    List<Book> books = ofy.query(Book.class).list();
    StringBuilder sb = new StringBuilder();

    // Load all authors with a single query and index them by id.
    final List<Author> authors = ofy.query(Author.class).list();
    final Map<Long, Author> authorMap = new HashMap<Long, Author>();
    for (Author author : authors) {
        authorMap.put(author.getId(), author);
    }

    for (Book book : books) {
        sb.append(book.getTitle());
        sb.append(": ");
        for (Key<Author> authorKey : book.getAuthorKeys()) {
            // Look the author up in memory instead of going back to the DataStore.
            final Author author = authorMap.get(authorKey.getId());
            sb.append(author.getFirstname()).append(author.getLastname()).append(", ");
        }
        sb.append("<br>");
    }
That seems very counter-intuitive if you’re from the relational world. Why do something in code that the database can do for you? Well, that’s the thing: the database can’t in this case. CPU cycles are relatively cheap on AppEngine, so that’s not really a bottleneck either. And the result can be cached in MemCache: either the “joined” set of books/authors, or just the author table (e.g. if books change more often).
This doesn’t work if you have millions of authors. You don’t want to (and can’t) load millions of authors into memory just to link 500 books to their authors. In that case you can use a bulk get. This is a normal get operation, but with multiple ids as arguments. Those objects will be loaded in one batch. The approach would be as follows:
- select all books
- build set of all required authors for all books
- batch get required authors
- iterate over books
- get author for book from in-memory map of authors
    Objectify ofy = ObjectifyService.begin();
    List<Book> books = ofy.query(Book.class).list();
    StringBuilder sb = new StringBuilder();

    // Collect only the author keys that the selected books actually reference.
    Set<Key<Author>> authorKeys = new HashSet<Key<Author>>();
    for (Book book : books) {
        authorKeys.addAll(book.getAuthorKeys());
    }

    // Load all required authors in one batch get.
    final Map<Key<Author>, Author> authorMap = ofy.get(authorKeys);

    for (Book book : books) {
        sb.append(book.getTitle());
        sb.append(": ");
        for (Key<Author> authorKey : book.getAuthorKeys()) {
            final Author author = authorMap.get(authorKey);
            sb.append(author.getFirstname()).append(author.getLastname()).append(", ");
        }
        sb.append("<br>");
    }
In the graph below you can see the difference in performance is dramatic. For a dataset of 1000 books and 5 authors the first approach takes over 20 seconds, while the other approaches are around 200-300ms.
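One plausible way to read these numbers is in terms of DataStore round trips. The sketch below is an illustration only, not a benchmark: `CountingStore` is a hypothetical in-memory stand-in that counts calls (it does not use Objectify or the AppEngine API), and it assumes every book has exactly one author.

```java
import java.util.ArrayList;
import java.util.Collection;
import java.util.HashMap;
import java.util.HashSet;
import java.util.List;
import java.util.Map;
import java.util.Set;

/** Hypothetical in-memory stand-in for the DataStore that counts round trips. */
class CountingStore {
    private final Map<Long, String> authors = new HashMap<>();
    private final List<Long> bookAuthorIds = new ArrayList<>(); // one author id per book
    int roundTrips = 0;

    void addAuthor(long id, String name) { authors.put(id, name); }

    void addBook(long authorId) { bookAuthorIds.add(authorId); }

    /** Query for all books: one round trip. */
    List<Long> queryAllBooks() {
        roundTrips++;
        return new ArrayList<>(bookAuthorIds);
    }

    /** Query for all authors: one round trip. */
    Map<Long, String> queryAllAuthors() {
        roundTrips++;
        return new HashMap<>(authors);
    }

    /** Single get: one round trip per call. */
    String get(long id) {
        roundTrips++;
        return authors.get(id);
    }

    /** Batch get: one round trip for all requested ids. */
    Map<Long, String> getAll(Collection<Long> ids) {
        roundTrips++;
        Map<Long, String> result = new HashMap<>();
        for (Long id : ids) {
            result.put(id, authors.get(id));
        }
        return result;
    }
}

class JoinStrategies {
    /** Builds a store in which book i is written by author (i % authorCount). */
    static CountingStore newStore(int bookCount, int authorCount) {
        CountingStore store = new CountingStore();
        for (int i = 0; i < authorCount; i++) {
            store.addAuthor(i, "Author " + i);
        }
        for (int i = 0; i < bookCount; i++) {
            store.addBook(i % authorCount);
        }
        return store;
    }

    /** Naive approach: one get per book, on top of the book query. */
    static int naive(CountingStore store) {
        for (Long authorId : store.queryAllBooks()) {
            store.get(authorId);
        }
        return store.roundTrips;
    }

    /** In-memory join: load all authors once, then look them up in a map. */
    static int inMemoryJoin(CountingStore store) {
        List<Long> books = store.queryAllBooks();
        Map<Long, String> authorMap = store.queryAllAuthors();
        for (Long authorId : books) {
            authorMap.get(authorId);
        }
        return store.roundTrips;
    }

    /** Batch get: collect the required ids, then fetch them in one call. */
    static int batchGet(CountingStore store) {
        List<Long> books = store.queryAllBooks();
        Set<Long> required = new HashSet<>(books);
        Map<Long, String> authorMap = store.getAll(required);
        for (Long authorId : books) {
            authorMap.get(authorId);
        }
        return store.roundTrips;
    }

    public static void main(String[] args) {
        System.out.println("naive: " + naive(newStore(1000, 5)));                 // naive: 1001
        System.out.println("in-memory join: " + inMemoryJoin(newStore(1000, 5))); // in-memory join: 2
        System.out.println("batch get: " + batchGet(newStore(1000, 5)));          // batch get: 2
    }
}
```

In this simplified model the naive approach grows with the number of books, while the other two stay at two round trips regardless of how many books there are.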
Performance gain 6 - De-normalize
In some cases you query two related entities so often that you would be better off de-normalizing the data. In the SQL example above we could get rid of all the extra code if we just added a departmentName field to the employee entity. Is that a better approach? Well, it depends. It’s definitely faster, but you have the overhead of keeping the two fields in sync somehow.
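The trade-off can be sketched in plain Java. This is a hedged illustration with hypothetical names (`Employee`, `DenormalizedDirectory`, `renameDepartment`), using in-memory collections instead of the DataStore: reads need no join at all, but every department rename has to touch each employee that carries a copy of the name.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

/** Employee carrying a denormalized copy of its department's name. */
class Employee {
    final String name;
    final long departmentId;
    String departmentName; // duplicated from the department, must be kept in sync

    Employee(String name, long departmentId, String departmentName) {
        this.name = name;
        this.departmentId = departmentId;
        this.departmentName = departmentName;
    }
}

/** Hypothetical in-memory stand-in for the two entity kinds. */
class DenormalizedDirectory {
    private final Map<Long, String> departmentNames = new HashMap<>();
    private final List<Employee> employees = new ArrayList<>();

    void addDepartment(long id, String name) {
        departmentNames.put(id, name);
    }

    /** Copies the current department name into the new employee. */
    void addEmployee(String name, long departmentId) {
        employees.add(new Employee(name, departmentId, departmentNames.get(departmentId)));
    }

    /** Read path: no join and no second lookup, the name is already on the employee. */
    List<String> listEmployees() {
        List<String> lines = new ArrayList<>();
        for (Employee employee : employees) {
            lines.add(employee.name + " (" + employee.departmentName + ")");
        }
        return lines;
    }

    /** Write path: the price of denormalizing. Returns how many employees were rewritten. */
    int renameDepartment(long id, String newName) {
        departmentNames.put(id, newName);
        int updated = 0;
        for (Employee employee : employees) {
            if (employee.departmentId == id) {
                employee.departmentName = newName;
                updated++;
            }
        }
        return updated;
    }

    public static void main(String[] args) {
        DenormalizedDirectory directory = new DenormalizedDirectory();
        directory.addDepartment(1L, "Sales");
        directory.addEmployee("Alice", 1L);
        directory.renameDepartment(1L, "Marketing");
        System.out.println(directory.listEmployees()); // prints [Alice (Marketing)]
    }
}
```

Whether this pays off depends on how often departments are renamed compared to how often employees are listed.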
I hope this article helps in getting applications to run better on AppEngine. It's not hard at all, just different. And you'll get a great platform for it in return.
Thx for this. You seem to be mixing book & employee examples--the post might be clearer if you stuck with one.
I see you use JSF2. myfaces or mojarra? Do you use a component library like richfaces or primefaces?
Which jax-rs impl do you use?
I read that Seam doesn't work on appengine. appengine doesn't support CDI so you use Weld?
Thx.
For JSF2 I use Mojarra, I don't use component libraries a lot (I prefer plain jQuery in many cases) but I have good experiences with PrimeFaces on GAE. I didn't try other component libraries, but according to this post RichFaces 4 has GAE support: http://mkblog.exadel.com/2010/10/richfaces-4-m3-gae-support-new-richfaces-4-book/
For clarity: CDI is just a specification. This specification is not supported out of the box by GAE because it's just a Servlet container. Weld is the reference implementation of CDI and works quite well on GAE because of its Servlet support. When you're talking about Seam, I guess you mean Seam 2. With JSF 2 and CDI (Weld) you won't really need Seam any more, because they are the evolution of Seam back into the Java EE platform. Seam 3 is built on top of CDI and I guess that some of the modules will play well on GAE too. Take a look at http://seamframework.org/Seam3 for currently available modules.
For JAX-RS I've used both RESTEasy and Jersey with success, both work well on GAE. RESTEasy has better CDI integration and slightly better startup time though on GAE, so I slightly prefer RESTEasy.
ok... your example about "DataStore usage" is good.
but if I have an entity with 1001 rows?
Great article!. Could you provide some real numbers:
- what's the minimal startup time you could achieve with a decent framework for development?
- 1000 books and 5 authors take 300ms in the best case. How does it scale - what would it take to do the same with 1M books and 50k authors? 10M books...
Thank you
Thanks for the feedback. I agree larger datasets would be a useful addition to the examples because there are some limitations in that area too. I will try to provide some examples and startup time numbers later this week.
I've also looked at this site: http://gaejava.appspot.com/
It gives me up to 5 seconds delay removing 100 records with JDO. Is that normal?
Another thing is that every time I run the test the results vary (especially in the JDO case). Is that likely to be the startup time?
Just to give you a heads up, I'm working on a new article about dealing with large datasets on AppEngine. I've to spread testing over a few days because those kind of numbers eat up my daily quota very very quickly, but I'll publish within a few days. Very positive results so far!
One thing to remember when working with large datasets is that doing anything that makes your user wait results in a crappy user experience. It always boggles my mind when people talk about not being able to fetch large data sets into memory, or mutate large numbers of models or entity groups in 30 seconds. These aren't things you should be doing while the user is waiting for the next page anyway.
Tasks took all of this stuff to the background, and that's where it should stay.