Executing Long Running Tasks in Google App Engine – How to do it?
Most of the times a question flashes into the mind of the developers especially those who work on the Google Cloud Platform:
What if I am using Google App Engine Platform as a Service and will be having long running tasks that should run in the background for hours or maybe even days, is it possible?
Yes, it is possible. The answer is: use the “Task Queues”, one of the most laudable features provided by Google App Engine. But when hosted in the production, many types of unexpected problems arise with respect to long running tasks in task queues, which if not addressed, there may be a surprising behavior of the application in the production. Towards the end, we will be quickly discussing the different configurations required for respective application-specific requirements which are necessary for a long running task to run in the task queue.
Before going towards the discussion, let me quickly brief what is Google App Engine and what are Task Queues?
Google App Engine:
Google App Engine (most commonly referred as GAE) is a platform for building scalable web applications and mobile backends. It offers different features for web applications such as Automated Security Scanning for detecting web vulnerabilities, supports popular development tools such as Eclipse, IntelliJ, Maven, Git, Jenkins and PyCharm which makes developing on GAE developer friendly.
Moreover, features like User Authentication using Google Accounts, NoSQL Datastore, Memcache and Task Queues make GAE incomparable.
Therefore, it is most commonly preferred for developing web applications hosted on Google Cloud Platform.
GAE Task Queues:
Sometimes, there might be a scenario where a user takes a particular action on the web application and that task could be run outside of the user’s request which can be executed later.
For example, if a user wants to upload an “online” file to the web application, the user can provide just the link of the file to be uploaded and instead of waiting for the file to upload and prevent itself from performing other tasks on the application, user can return anytime later to check the progress of the upload. Here, the upload task is assigned to a task in task queue which runs asynchronously outside the user’s request and completes the task. Thus, the user can perform other tasks on the web application while the upload job will still be in progress in the background.
In this way, Task queues can help us to carry out important tasks that can be executed in the background.
Coming to the topic, let us proceed towards the discussion on how to run long running tasks in task queues and what configurations prevent the tasks to do so in production?
Let’s understand some very important information for achieving this:
Often times, the application behaves exactly as expected in the development server, but when on the production server there are chances that most of the features of the application does not work or does not yield as expected. This is because when we deploy the application on the cloud, we actually use the cloud resources and configurations that might be different from the development server. These resources and configurations can be configured by understanding the different cloud instances that are offered by the cloud service provider which are generally well described in their documentation along with the respective costs.
GAE offers two types of instance configurations, Frontend instances, and Backend Instances. As the names describe, the frontend instances are used to compute the operations that are carried out at the end-user level. The backend instances perform the computations in the background.
By default, when we first deploy our application to the GAE without editing the app-engine.XML (The application configuration file), the instance allocated is the most basic Frontend Instance. Beyond doubt, each instance is associated with respective prices.
As discussed above, scaling is the most promising feature of Google App Engine. Thereby, each instance can be configured with appropriate scaling. There are three types of scaling offered by the GAE, namely, Automatic Scaling, Basic Scaling and Manual Scaling. The configuration of scaling options can considerably affect the cost of running the application.
As the default instance is a Frontend instance, the default scaling is Automatic Scaling as per the documentation. This is where the concern for running long task in the task queue rises. How?
As per the official documentation, in Automatic Scaling, a task in a task queue can run maximum for 10 minutes. Thus, if there are tasks that can execute within the 10-minute deadline, this type of scaling would not create many problems. But we are talking about the long-running tasks that exceed the 10-minute deadline. So, how to make it work?
Indeed, automatic scaling would not help, we can switch to basic scaling or manual scaling. But while using Frontend instance, only automatic scaling is allowed. Therefore, it is also necessary to use Backend instance instead of Frontend instance.
Now, when used Backend instance, configure the app to use basic or manual scaling. For more information on scaling types and other information, please visit the official GAE documentation. (Link)
With the configuration of Backend instance and a Manual or Basic scaling, there are no restrictions for tasks to execute in 10 minutes deadline, instead, a task can run in the background until it completes its execution. However, using basic scaling would be preferred to control costs if there is no need to complex initializations and relying on state of instance’s memory over time.
Final Words:
While using different features of the Google App Engine, it must be noted that the behavior of the application is different on the development server and production server. Thus, before making a release, the application should be well tested for every feature on the Production Server.