
App Autoscaling

In Getting started, we created a simple application. When the load on that application exceeds what its computing resources can sustain, its service responses slow down, time out, or even become unavailable. The EverAI platform provides an autoscaling mechanism that automatically expands your application under high load, so you do not have to deploy new compute nodes by hand, and your application can increase its load capacity in a short period of time.

The EverAI platform currently provides two autoscaling mechanisms: one based on a minimum number of free (idle) workers, the other based on a maximum queue size.

Min Free Workers

First, create a configmap with the following command. It holds the autoscaling policy parameters: in this example, the minimum number of workers is 1, the maximum number of workers is 5, the minimum number of free workers is 1, and the scale-up step is 1 (each scale-up adds one worker).

everai configmap create get-start-configmap \
--from-literal min_workers=1 \
--from-literal max_workers=5 \
--from-literal min_free_workers=1 \
--from-literal scale_up_step=1 \
--from-literal max_idle_time=60
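
The policy these parameters describe can be sketched as a small decision function. This is an illustrative approximation only, not the actual EverAI implementation: add `scale_up_step` workers whenever fewer than `min_free_workers` workers are idle, without exceeding `max_workers`.

```python
# Illustrative sketch of the min-free-workers scale-up rule
# (not the actual EverAI implementation).
def scale_up_delta(total_workers: int, free_workers: int,
                   min_free_workers: int = 1,
                   max_workers: int = 5,
                   scale_up_step: int = 1) -> int:
    """Return the number of workers to add; 0 means no scale-up."""
    if free_workers >= min_free_workers:
        return 0  # enough idle capacity, nothing to do
    # Add scale_up_step workers, but never grow past max_workers.
    return max(0, min(scale_up_step, max_workers - total_workers))
```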

Based on the app.py code in Getting started, add the autoscaler parameter when you define the App object.

from everai_autoscaler.builtin import FreeWorkerAutoScaler

CONFIGMAP_NAME = 'get-start-configmap'

app = App(
    '<your app name>',
    image=image,
    volume_requests=[
        VolumeRequest(name=VOLUME_NAME),
    ],
    secret_requests=[QUAY_IO_SECRET_NAME],
    configmap_requests=[CONFIGMAP_NAME],
    resource_requests=ResourceRequests(
        cpu_num=1,
        memory_mb=1024,
    ),
    autoscaler=FreeWorkerAutoScaler(
        min_workers=Placeholder(kind='ConfigMap', name=CONFIGMAP_NAME, key='min_workers'),
        max_workers=Placeholder(kind='ConfigMap', name=CONFIGMAP_NAME, key='max_workers'),
        min_free_workers=Placeholder(kind='ConfigMap', name=CONFIGMAP_NAME, key='min_free_workers'),
        max_idle_time=Placeholder(kind='ConfigMap', name=CONFIGMAP_NAME, key='max_idle_time'),
        scale_up_step=Placeholder(kind='ConfigMap', name=CONFIGMAP_NAME, key='scale_up_step'),
    ),
)
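
Each Placeholder defers the actual value to the configmap at runtime, so you can tune the scaling policy by changing the configmap rather than rebuilding the image. A rough sketch of how such a lookup could work (the function and data below are hypothetical, for illustration only):

```python
# Hypothetical sketch of placeholder resolution: look the key up in the
# named configmap at evaluation time. Configmap values arrive as strings.
def resolve_placeholder(configmaps: dict, kind: str, name: str, key: str) -> int:
    if kind != 'ConfigMap':
        raise ValueError(f'unsupported placeholder kind: {kind}')
    return int(configmaps[name][key])

# Example data mirroring the configmap created above.
configmaps = {'get-start-configmap': {'min_workers': '1', 'max_workers': '5'}}
```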

Secondly, run everai app run to test your code locally. Then open image_builder.py and update your image version. Run everai image build to build the image and push it to quay.io. After the image is built, run the following command to upgrade your app.

everai app update

Now, your app has the ability to autoscale.

Run everai worker list; under low load you can see a single worker. Note that CREATED_AT and DELETED_AT are displayed in UTC.

ID                      STATUS   DETAIL_STATUS   CREATED_AT                DELETED_AT
----------------------  -------  --------------  ------------------------  ----------
5LJtBqJsRYgT67ZEuMAt88  RUNNING  FREE            2024-07-29T02:42:50+0000
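
Since these timestamps are in UTC, a quick way to read them in your local timezone is a plain Python conversion (a generic snippet, unrelated to the everai CLI itself):

```python
from datetime import datetime

# Parse a worker-list timestamp such as 2024-07-29T02:42:50+0000
# and convert it to the local timezone for easier reading.
def to_local(ts: str) -> datetime:
    return datetime.strptime(ts, '%Y-%m-%dT%H:%M:%S%z').astimezone()
```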

In this step, you can use ab (the Apache HTTP server benchmarking tool) to load-test your app and drive up its workload. While the test runs, observe the changes in the number of workers.

ab -s 120 -t 120 -c 4 -n 300000 -H 'Authorization: Bearer <your_token>' https://everai.expvent.com/api/routes/v1/<your namespace>/<your app name>/sse

In the worker list you can now see four busy workers and one free worker, whereas only one worker was running before the performance test. This means the EverAI platform has scaled your app up automatically.

ID                      STATUS   DETAIL_STATUS   CREATED_AT                DELETED_AT
----------------------  -------  --------------  ------------------------  ----------
5LJtBqJsRYgT67ZEuMAt88  RUNNING  BUSY            2024-07-29T02:42:50+0000
VtU8mgrcneAqQzSh9GE5KQ  RUNNING  BUSY            2024-07-29T03:06:50+0000
iMrG9wGR6y8xYanE3Dpxna  RUNNING  BUSY            2024-07-29T03:07:10+0000
UtybEMGoZ4FtAF5Vjuddmu  RUNNING  BUSY            2024-07-29T03:07:30+0000
LurJwFLStbYoarbNjepGHV  RUNNING  FREE            2024-07-29T03:07:50+0000

When the ab performance test is over and the peak of the application's business load has passed, the system automatically evaluates the workers' load status. After max_idle_time, the workers created during scale-up are released and the worker count returns to min_workers. In this example, running everai worker list shows that the app is back to the single worker it had before expansion: the system has completed the automatic scale-down for your app.
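
The scale-down rule can likewise be sketched as a small function (again an illustrative approximation, not the platform's actual code): release workers that have been idle longer than max_idle_time, but never drop below min_workers.

```python
# Illustrative sketch of the idle-based scale-down rule
# (not the actual EverAI implementation).
def workers_to_release(idle_seconds: list, total_workers: int,
                       min_workers: int = 1,
                       max_idle_time: int = 60) -> int:
    """Return how many idle workers to release, keeping at least min_workers."""
    expired = sum(1 for s in idle_seconds if s > max_idle_time)
    return max(0, min(expired, total_workers - min_workers))
```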

ID                      STATUS   DETAIL_STATUS   CREATED_AT                DELETED_AT
----------------------  -------  --------------  ------------------------  ----------
5LJtBqJsRYgT67ZEuMAt88  RUNNING  FREE            2024-07-29T02:42:50+0000

Max Queue Size

First, create a configmap with the following command. It holds the autoscaling policy parameters: in this example, the minimum number of workers is 1, the maximum number of workers is 5, the maximum queue size is 2, and the scale-up step is 1 (each scale-up adds one worker).

everai configmap create get-start-configmap \
--from-literal min_workers=1 \
--from-literal max_workers=5 \
--from-literal max_queue_size=2 \
--from-literal scale_up_step=1 \
--from-literal max_idle_time=60
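
As a sketch (illustrative only, not EverAI's implementation), the queue-based trigger differs from the free-worker one only in its condition: scale up once the number of queued requests reaches max_queue_size.

```python
# Illustrative sketch of the max-queue-size scale-up rule
# (not the actual EverAI implementation).
def queue_scale_up_delta(total_workers: int, queue_size: int,
                         max_queue_size: int = 2,
                         max_workers: int = 5,
                         scale_up_step: int = 1) -> int:
    """Return the number of workers to add; 0 means no scale-up."""
    if queue_size < max_queue_size:
        return 0  # queue still short enough, no action
    return max(0, min(scale_up_step, max_workers - total_workers))
```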

Based on the app.py code in Getting started, add the autoscaler parameter when you define the App object.

from everai_autoscaler.builtin import SimpleAutoScaler

CONFIGMAP_NAME = 'get-start-configmap'

app = App(
    '<your app name>',
    image=image,
    volume_requests=[
        VolumeRequest(name=VOLUME_NAME),
    ],
    secret_requests=[QUAY_IO_SECRET_NAME],
    configmap_requests=[CONFIGMAP_NAME],
    resource_requests=ResourceRequests(
        cpu_num=1,
        memory_mb=1024,
    ),
    autoscaler=SimpleAutoScaler(
        min_workers=Placeholder(kind='ConfigMap', name=CONFIGMAP_NAME, key='min_workers'),
        max_workers=Placeholder(kind='ConfigMap', name=CONFIGMAP_NAME, key='max_workers'),
        max_queue_size=Placeholder(kind='ConfigMap', name=CONFIGMAP_NAME, key='max_queue_size'),
        max_idle_time=Placeholder(kind='ConfigMap', name=CONFIGMAP_NAME, key='max_idle_time'),
        scale_up_step=Placeholder(kind='ConfigMap', name=CONFIGMAP_NAME, key='scale_up_step'),
    ),
)

Secondly, run everai app run to test your code locally. Then open image_builder.py and update your image version. Run everai image build to build the image and push it to quay.io. After the image is built, run the following command to upgrade your app.

everai app update

Now, your app has the ability to autoscale.

Run everai worker list; under low load you can see a single worker. Note that CREATED_AT and DELETED_AT are displayed in UTC.

ID                      STATUS   DETAIL_STATUS   CREATED_AT                DELETED_AT
----------------------  -------  --------------  ------------------------  ----------
PWwUmUqNYuzzM5sa98ajJL  RUNNING  FREE            2024-07-01T09:47:31+0000

Run everai app queue; the queue list is empty, so the queue size is 0.

  QUEUE_INDEX  CREATE_AT                 QUEUE_REASON
------------- ------------------------ --------------

In this step, you can use ab to load-test your app and drive up its workload. While the test runs, observe the changes in the number of workers and the queue.

ab -s 120 -t 120 -c 4 -n 300000 -H 'Authorization: Bearer <your_token>' https://everai.expvent.com/api/routes/v1/<your namespace>/<your app name>/sse

During the performance test, run everai worker list and everai app queue again to see the changes. The queue list now shows a queue size of 2.

  QUEUE_INDEX  CREATE_AT                 QUEUE_REASON
-------------  ------------------------  --------------
            0  2024-07-03T22:24:07+0000  WorkerBusy
            1  2024-07-03T22:24:07+0000  WorkerBusy

In the worker list you can see two workers running, whereas only one worker was running before the performance test. This means the EverAI platform has scaled your app up automatically.

ID                      STATUS   DETAIL_STATUS   CREATED_AT                DELETED_AT
----------------------  -------  --------------  ------------------------  ----------
PWwUmUqNYuzzM5sa98ajJL  RUNNING  BUSY            2024-07-01T09:47:31+0000
dweBRSPD395BvtBDsZYum8  RUNNING  BUSY            2024-07-03T22:24:08+0000

When the ab performance test is over and the peak of the application's business load has passed, the system automatically evaluates the workers' load status. After max_idle_time, the workers created during scale-up are released and the worker count returns to min_workers. In this example, running everai worker list shows that the app is back to the single worker it had before expansion: the system has completed the automatic scale-down for your app.

ID                      STATUS   DETAIL_STATUS   CREATED_AT                DELETED_AT
----------------------  -------  --------------  ------------------------  ----------
PWwUmUqNYuzzM5sa98ajJL  RUNNING  FREE            2024-07-01T09:47:31+0000