Coherence*Web Session Reaper

Introduction

As part of Coherence*Web, the Session Reaper eventually cleans up HTTP sessions and frees the associated memory. In this respect, the Session Reaper provides a service much like the JVM's own Garbage Collection (GC) capability: it is responsible for destroying any session that is no longer used, which it determines by checking whether that session has timed out.

Each HTTP session contains two pieces of information that determine when it has timed out. The first is the LastAccessedTime property of the session, which is the timestamp of the most recent activity involving the session. The second is the MaxInactiveInterval property of the session, which specifies how long the session will be kept alive without any activity; a typical value for the MaxInactiveInterval property is 30 minutes. (The MaxInactiveInterval property defaults to the value specified for the coherence-session-expire-seconds configuration option, but it can be modified on a session-by-session basis.)

Each time that an HTTP request is received by the server, if there is an HTTP session associated with that request, then the LastAccessedTime property of the session is automatically updated to the current time. As long as requests related to that session continue to arrive, it will be kept alive, but once a period of inactivity longer than that specified by the MaxInactiveInterval property occurs, the session expires. Session expiration is passive: it occurs only through the passage of time. The Coherence*Web Session Reaper scans for sessions that have expired, and when it finds expired sessions it cleans them up.
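The expiry rule described above can be sketched in a few lines of Java. This is only an illustration under stated assumptions: the class SessionExpiry and its isExpired method are hypothetical names and not part of the Coherence*Web API, and the interval is assumed to be positive.

```java
import java.time.Duration;
import java.time.Instant;

/** Hypothetical helper for illustration; not part of the Coherence*Web API. */
final class SessionExpiry {
    private SessionExpiry() {}

    /**
     * Returns true once the period of inactivity is strictly longer than
     * maxInactiveInterval (assumed to be positive).
     */
    static boolean isExpired(Instant lastAccessedTime, Duration maxInactiveInterval, Instant now) {
        Duration idle = Duration.between(lastAccessedTime, now);
        return idle.compareTo(maxInactiveInterval) > 0;
    }

    public static void main(String[] args) {
        Instant lastAccess = Instant.parse("2024-01-01T12:00:00Z");
        Duration maxInactive = Duration.ofMinutes(30); // the typical value mentioned above

        // 29 minutes idle: still alive. 31 minutes idle: expired.
        System.out.println(isExpired(lastAccess, maxInactive, lastAccess.plus(Duration.ofMinutes(29)))); // false
        System.out.println(isExpired(lastAccess, maxInactive, lastAccess.plus(Duration.ofMinutes(31)))); // true
    }
}
```

Because every request resets LastAccessedTime, the check only ever becomes true through the passage of time, which is why something like the Session Reaper has to go looking for expired sessions.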

Configuration and Operation

There are three basic questions that the Session Reaper configuration answers:

  • On which servers will it run?
  • How frequently will it run?
  • When it runs, on which servers will it look for expired sessions?

The Session Reaper runs as part of the application server. That means that if Coherence is configured to provide a separate cache tier (made up of "cache servers"), then the Session Reaper does not run on those cache servers.

Consider the three different deployment topologies used with Coherence*Web:

  • In-Process: The application servers that run Coherence*Web are storage-enabled, so that the HTTP session storage is co-located with the application servers; there are no separate cache servers used for HTTP session storage.

  • Out-Of-Process: The application servers that run Coherence*Web are storage-disabled members of the Coherence cluster; there are separate cache servers used for HTTP session storage.

  • Out-Of-Process with Coherence*Extend: The application servers that run Coherence*Web are not part of a Coherence cluster; the application servers use Coherence*Extend to attach to a Coherence cluster which contains cache servers used for HTTP session storage.

Every application server running Coherence*Web runs the Session Reaper. By default, the Session Reaper runs concurrently on all of the application servers, so that all of the servers share the workload of identifying and cleaning up expired sessions. (The coherence-reaperdaemon-cluster-coordinated configuration option causes the cluster to coordinate reaping so that only one server at a time performs the actual reaping; use of this option is not recommended, and it cannot be used with the Out-Of-Process with Coherence*Extend topology.)

The Session Reaper is configured to scan the entire set of sessions over a certain period of time, called a reaping cycle, which defaults to five minutes. (The length of the reaping cycle is specified by the coherence-reaperdaemon-cycle-seconds option.) Since the Session Reaper is expected to scan all of the sessions that it is responsible for, and to clean up any expired sessions, within the reaping cycle, this setting tells the Session Reaper how aggressively it must work. If the cycle length is configured too short, the Session Reaper will use additional resources without providing additional benefit. If the cycle length is configured too long, expired sessions will linger longer before they are cleaned up. In most situations, reducing resource usage is far preferable to ensuring that sessions are cleaned up quickly after they expire, so the default cycle of five minutes is a good balance between promptness of cleanup and minimal resource usage.

During the reaping cycle, the Session Reaper scans for expired sessions. In most cases, the Session Reaper is responsible for scanning all of the HTTP sessions across the entire cluster, but there is an optimization available for the In-Process (single-tier) topology. In that topology, all of the sessions are managed by storage-enabled Coherence cluster members that are also running the application server, so the session storage is co-located with the application server, and as a result the Session Reaper on each application server can scan only the sessions that are stored locally. (This behavior can be enabled by setting the coherence-reaperdaemon-assume-locality configuration option to true.)

Regardless of whether the Session Reaper scans only co-located sessions or all sessions, it does so in a very efficient manner by using the following advanced capabilities of the Coherence data grid:

  • First, starting with Coherence version 3.5, the Session Reaper does not actually look at each session; instead, it delegates the search for expired sessions to the data grid using a custom ValueExtractor implementation. This ValueExtractor takes advantage of the BinaryEntry interface, also introduced in Coherence version 3.5, so that it can determine whether a session has expired without even deserializing the session. As a result, the selection of expired sessions can be delegated to the data grid just like any other parallel query, and can be executed by storage-enabled Coherence members in a very efficient manner.

  • Second, instead of selecting all of the expired sessions at once using a parallel query, the Session Reaper queries only one member at a time; this allows the Session Reaper to divide the work of the query across the duration of the reaping cycle. Additionally, this eliminates the need for group communication when querying for expired sessions.

  • Lastly, since the work of cleaning up expired sessions is broken up across the entire reaping cycle, the selection of expired sessions is broken up across the reaping cycle as well, so that the selection occurs shortly before the clean-up of expired sessions, thus reducing the chance that multiple application servers would attempt to clean up the same expired sessions. (The Session Reaper uses the PartitionedIterator class to automatically query on a member-by-member basis, and to do so in a randomized order that avoids harmonics in large-scale clusters.)
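The member-by-member querying described in the list above can be pictured with a small, self-contained Java sketch. It is not the actual implementation: MemberQuerySchedule, plan, and the member names are hypothetical, and the even spacing of queries across the cycle is an assumption made for illustration. The sketch only shows the idea of visiting storage members one at a time, in a shuffled order, spread over the reaping cycle.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;
import java.util.Random;

/** Hypothetical illustration; the real reaper relies on Coherence internals. */
final class MemberQuerySchedule {
    private MemberQuerySchedule() {}

    /**
     * Maps each storage member to the offset (in seconds from the start of the
     * cycle) at which it would be queried. Members are visited in a shuffled
     * order, one at a time, at evenly spaced offsets (an assumed policy).
     */
    static Map<String, Long> plan(List<String> members, long cycleSeconds, Random random) {
        List<String> order = new ArrayList<>(members);
        Collections.shuffle(order, random); // randomized order, different on each server
        Map<String, Long> offsets = new LinkedHashMap<>();
        for (int i = 0; i < order.size(); i++) {
            offsets.put(order.get(i), i * cycleSeconds / order.size());
        }
        return offsets;
    }

    public static void main(String[] args) {
        List<String> members = List.of("storage-1", "storage-2", "storage-3", "storage-4");
        // With a 300-second cycle and four members, queries land at 0, 75, 150 and 225 seconds.
        plan(members, 300, new Random()).forEach(
                (member, offset) -> System.out.println("t+" + offset + "s: query " + member));
    }
}
```

Because each application server shuffles independently, two servers are unlikely to query the same member at the same moment, which is the effect the randomized order is meant to achieve.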

Each storage-enabled member can very efficiently scan for any expired sessions, and it will only have to scan one time per application server per reaping cycle. The result is an out-of-the-box Session Reaper configuration that will work well for application server clusters with only two servers, as well as application server clusters with several hundred servers. Furthermore, the configuration will work well for applications with several hundred concurrent sessions, as well as for applications with several million concurrent sessions.

To ensure that the Session Reaper does not impact the smooth operation of the application server, it breaks up its work into chunks and schedules that work in a manner that spreads the work across the entire reaping cycle. Since the Session Reaper has to know how much work it must schedule, it maintains statistics on the amount of work that it performed in previous cycles, and uses statistical weighting to ensure that statistics from recent reaping cycles count more heavily. There are several reasons why the Session Reaper breaks up the work in this manner:

  • If the Session Reaper consumed a large number of CPU cycles at one time, it could cause the application to be less responsive to users; by doing a small portion of the work at a time, the application remains responsive.

  • One of the key performance enablers for Coherence*Web is the near caching feature of Coherence; since the sessions that are expired are accessed through that same near cache in order to clean them up, expiring too many sessions too quickly could cause the cache to evict sessions that are being used on that application server, leading to performance loss.

In other words, the Session Reaper performs its job efficiently, even with the default out-of-the-box configuration, by:

  • Delegating as much work as possible to the data grid;
  • Delegating work to only one member at a time;
  • Avoiding group communication;
  • Enabling the data grid to find expired sessions without even deserializing them;
  • Restricting the usage of CPU cycles;
  • Avoiding cache-thrashing of the near caches that Coherence*Web relies on for performance.
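To make the chunking and statistical weighting more concrete, here is a minimal Java sketch. The exact weighting scheme that the Session Reaper uses is not described here, so the sketch assumes an exponentially weighted moving average; the class ReapWorkPlanner, its methods, the weight of 0.5, and the figure of 60 chunks per cycle are all hypothetical.

```java
/** Hypothetical sketch of weighted work estimation and chunking; not the actual reaper code. */
final class ReapWorkPlanner {
    private final double recentWeight; // share given to the newest cycle, in (0, 1]
    private double estimatedSessions;  // weighted estimate of sessions cleaned per cycle
    private boolean hasEstimate;

    ReapWorkPlanner(double recentWeight) {
        if (recentWeight <= 0.0 || recentWeight > 1.0) {
            throw new IllegalArgumentException("recentWeight must be in (0, 1]");
        }
        this.recentWeight = recentWeight;
    }

    /** Folds the work observed in the cycle that just finished into the estimate. */
    void recordCycle(long sessionsCleaned) {
        if (!hasEstimate) {
            estimatedSessions = sessionsCleaned;
            hasEstimate = true;
        } else {
            // Recent cycles count more heavily than older ones.
            estimatedSessions = recentWeight * sessionsCleaned
                    + (1.0 - recentWeight) * estimatedSessions;
        }
    }

    double estimate() {
        return estimatedSessions;
    }

    /** Sessions to handle per chunk so the estimated work spreads over chunksPerCycle chunks. */
    long chunkSize(int chunksPerCycle) {
        if (chunksPerCycle <= 0) {
            throw new IllegalArgumentException("chunksPerCycle must be positive");
        }
        return Math.max(1L, (long) Math.ceil(estimatedSessions / chunksPerCycle));
    }

    public static void main(String[] args) {
        ReapWorkPlanner planner = new ReapWorkPlanner(0.5);
        planner.recordCycle(1000);
        planner.recordCycle(2000);
        planner.recordCycle(4000);
        System.out.println(planner.estimate());    // 2750.0
        System.out.println(planner.chunkSize(60)); // 46
    }
}
```

Working through small chunks sized from such an estimate keeps CPU usage low at any one moment and limits how many expired sessions pass through the near cache at once.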

Tuning

In summary, the following list contains suggestions for tuning the out-of-the-box configuration of the Session Reaper:

  • If the application is deployed with the In-Process topology, then set the coherence-reaperdaemon-assume-locality configuration option to true.

  • Since all of the application servers are responsible for scanning for expired sessions, it is reasonable to increase the coherence-reaperdaemon-cycle-seconds configuration option if the cluster is larger than 10 application servers. The larger the number of application servers, the longer the cycle can be; for example, with 200 servers, it would be reasonable to set the length of the reaping cycle as high as 30 minutes (that is, setting the coherence-reaperdaemon-cycle-seconds configuration option to 1800).
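The reasoning behind the second suggestion can be checked with simple arithmetic. When each application server scans every storage-enabled member once per reaping cycle (that is, when locality is not assumed), a storage member is scanned on average once every cycle length divided by the number of application servers. The Java sketch below only performs that division; ReaperScanLoad and secondsBetweenScans are hypothetical names used for illustration.

```java
/** Hypothetical arithmetic helper; not part of Coherence*Web. */
final class ReaperScanLoad {
    private ReaperScanLoad() {}

    /**
     * Average number of seconds between scans seen by one storage-enabled member,
     * assuming every application server scans it once per reaping cycle.
     */
    static double secondsBetweenScans(int applicationServers, int cycleSeconds) {
        if (applicationServers <= 0 || cycleSeconds <= 0) {
            throw new IllegalArgumentException("both arguments must be positive");
        }
        return (double) cycleSeconds / applicationServers;
    }

    public static void main(String[] args) {
        System.out.println(secondsBetweenScans(10, 300));   // 30.0 (10 servers, default cycle)
        System.out.println(secondsBetweenScans(200, 300));  // 1.5  (200 servers, default cycle)
        System.out.println(secondsBetweenScans(200, 1800)); // 9.0  (200 servers, 30-minute cycle)
    }
}
```

With 200 servers and the default five-minute cycle, each storage member would be scanned roughly every 1.5 seconds; lengthening the cycle to 30 minutes brings that back to one scan every 9 seconds.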
