Clustering Services

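In the JBoss clustering topic you learned about the JBoss clustering architecture, JGroups and caching. This topic builds on that and discusses clustering services. The following will be discussed: HTTP load balancing, HTTP session replication, clustering session EJBs, clustering entities and clustering JNDI.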

The Java EE application components have the following clustering requirements

Object/Component | Requires load balancing | Requires state replication
-----------------|-------------------------|---------------------------
Servlet/JSP      | yes | yes
EJB SLSB         | yes | no
EJB SFSB         | yes (you need sticky session load balancing) | yes
EJB Entity       | no (EJB 2.x entity beans could be called remotely but in EJB3 they can't) | yes (if you use Hibernate 2nd level cache)
EJB MDB          | yes | no
JNDI Object      | yes | yes

HTTP Load Balancing

I have already discussed HTTP load balancing in the Tomcat clustering section; have a read to get an idea of what HTTP load balancing is and then pop back here. If you have an application that needs to be highly available, or needs to scale to a very large number of users, then you need to load balance requests across multiple nodes; however, you do not need a cluster to do this. If the application is completely stateless then there is no need for a cluster. If the application is stateful then you can use sticky sessions (I discussed sticky sessions in my Tomcat Apache server and Tomcat clustering topics), and again a cluster is not needed. If the application needs to be fault-tolerant then you need a cluster, so that you can perform state replication.

To perform HTTP load balancing you need to place a load balancer in front of your web servers; the load balancer then distributes the HTTP requests across the nodes. Typically in the real world you would use a hardware load balancer and use Apache web servers to service the HTTP requests. The Apache web servers handle any static web pages and pass on any servlet/JSP requests to an application server such as Tomcat or JBoss. These requests normally use the AJP connector and again can be load balanced across a number of nodes.

Because JBoss uses Tomcat, you can configure the AJP connector in the server/xxx/deploy/jbossweb.sar/server.xml file. You also need to add a jvmRoute attribute (just like in Tomcat) to the <Engine> definition.

jvmRoute:
<Engine name="jboss.web" defaultHost="localhost" jvmRoute="node1">

Remember that the jvmRoute is a logical name that must match the one defined in the load balancer configuration of the Apache web server. For example, if you are using mod_jk, the value of the jvmRoute attribute should be set to the name of the worker.
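To make the role of the jvmRoute more concrete, here is a minimal Java sketch of how a sticky-session load balancer could pick a worker from a session ID. It assumes the Tomcat convention of appending the jvmRoute to the session ID after a dot (for example A1B2C3.node2). The class and method names are hypothetical, and this is a toy model, not the code mod_jk actually runs.

```java
import java.util.List;
import java.util.Optional;

/** Toy model of sticky-session routing; names are illustrative, not mod_jk internals. */
final class JvmRouteRouter {

    /** Returns the jvmRoute suffix of a session ID such as "A1B2C3.node2", if present. */
    static Optional<String> routeOf(String sessionId) {
        if (sessionId == null) {
            return Optional.empty();
        }
        int dot = sessionId.lastIndexOf('.');
        if (dot < 0 || dot == sessionId.length() - 1) {
            return Optional.empty(); // no route suffix in this ID
        }
        return Optional.of(sessionId.substring(dot + 1));
    }

    /**
     * Picks the worker named by the session's jvmRoute. Falls back to the
     * first worker when there is no route or the named worker is unknown
     * (a real balancer would apply its normal balancing policy here).
     */
    static String chooseWorker(String sessionId, List<String> workers) {
        return routeOf(sessionId)
                .filter(workers::contains)
                .orElse(workers.get(0));
    }

    public static void main(String[] args) {
        List<String> workers = List.of("node1", "node2");
        System.out.println(chooseWorker("A1B2C3.node2", workers));
        System.out.println(chooseWorker(null, workers));
    }
}
```

This is why the jvmRoute value and the worker name have to match: the balancer only has the suffix of the session ID to go on.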

HTTP Session Replication

Web applications often keep state information, be it login details or presentation state. The all JBoss configuration is capable of replicating this state, but you must enable replication in each application, even if the distributable cache has been enabled. To do this, add the distributable element to your application's WEB-INF/web.xml file.

enable state replication:
<web-app>
  <distributable/>
  ...
</web-app>

The HTTP session cache is configured in the standard-session-cache cache configuration, found in server/all/deploy/cluster/jboss-cache-manager.sar/META-INF/jboss-cache-configs.xml, and in jboss-cache-manager-jboss-beans.xml in the same directory.

You can control what triggers replication of the cache during a session. There are a number of options

Replication-trigger options
SET - replication only occurs if an attribute is put into the session using a set call. If you get an attribute out of the session and modify its value, the change isn't replicated.
SET_AND_GET - all set and get calls on any attribute trigger state replication. This can affect performance.
SET_AND_NON_PRIMITIVE_GET - behaves the same as SET_AND_GET, except that it only replicates on a get if the attribute retrieved isn't a wrapper for a primitive data type. This is the default option.
ACCESS - triggers replication on every request that accesses the session. This is slow, but it ensures the session timestamp is synchronized between all the nodes in the cluster, preventing the session from being evicted in one cache while remaining in others.
Example (WEB-INF/jboss-web.xml):
<jboss-web>
  <replication-config>
    <replication-trigger>
      SET_AND_NON_PRIMITIVE_GET
    </replication-trigger>
  </replication-config>
</jboss-web>
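The difference between the triggers is easiest to see with a mutable attribute. The following Java sketch is a toy model of three of the triggers (ACCESS is left out); it is not JBoss's session implementation, and the class, enum and method names are all hypothetical. It treats only the eight primitive wrapper types as "primitive", following the description above.

```java
import java.util.HashMap;
import java.util.Map;

/** Toy model of the replication triggers; it is not JBoss's session implementation. */
final class TriggerModel {

    enum Trigger { SET, SET_AND_GET, SET_AND_NON_PRIMITIVE_GET }

    private final Trigger trigger;
    private final Map<String, Object> attributes = new HashMap<>();
    private boolean dirty; // true when the session would be replicated

    TriggerModel(Trigger trigger) {
        this.trigger = trigger;
    }

    void setAttribute(String name, Object value) {
        attributes.put(name, value);
        dirty = true; // every trigger replicates after a set call
    }

    Object getAttribute(String name) {
        Object value = attributes.get(name);
        if (trigger == Trigger.SET_AND_GET) {
            dirty = true;
        } else if (trigger == Trigger.SET_AND_NON_PRIMITIVE_GET
                && value != null && !isPrimitiveWrapper(value)) {
            dirty = true; // the caller may mutate the returned object
        }
        return value;
    }

    /** Simulates the end of a request: reports whether replication is needed, then resets. */
    boolean endRequest() {
        boolean replicate = dirty;
        dirty = false;
        return replicate;
    }

    private static boolean isPrimitiveWrapper(Object value) {
        return value instanceof Boolean || value instanceof Character
                || value instanceof Byte || value instanceof Short
                || value instanceof Integer || value instanceof Long
                || value instanceof Float || value instanceof Double;
    }
}
```

With Trigger.SET, a request that calls getAttribute and then mutates the returned object ends with endRequest() returning false, which is the "modified value isn't replicated" case from the table. With SET_AND_NON_PRIMITIVE_GET the same request is flagged for replication, while a get of an Integer is not.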

You can also set the granularity of what gets replicated; there are some good examples in the docs/dtd/jboss-web_5_0.dtd file

Replication-granularity options
SESSION - the entire session is replicated upon a replication trigger. This option is the default and is preferred when sessions are generally small in size.
ATTRIBUTE - only dirty session attributes are replicated, in addition to some session data.
FIELD - only the dirty fields of an object are replicated. This is the best-performing option, as it can cut down the amount of data replicated.
Example (WEB-INF/jboss-web.xml):
<jboss-web>
  <replication-config>
    <replication-granularity>
      SESSION
    </replication-granularity>
  </replication-config>
</jboss-web>

To enable field-level replication you need to edit the file server/all/deployers/jbossweb.deployer/META-INF/war-deployers-beans.xml and change the cacheName property from standard-session-cache to field-granularity-session-cache; then you can use the FIELD option in your WEB-INF/jboss-web.xml file. The replication-field-batch-mode element determines whether replication happens immediately or not: if it is set to true (the default), the session batches up all the changes and replicates them at the end of the request.

There is an annotation that you can use to mark the classes whose fields should be replicated

Replication annotation:
@Replicable
public class Employee {
  ...
}

Session passivation is configured in the server/all/deploy/cluster/jboss-cache-manager.sar/META-INF/jboss-cache-manager-jboss-beans.xml file. The passivation property controls how the cache interacts with the cache loader: if set to true, the cache uses the secondary storage only when a node is evicted from memory; if set to false, changes are written to the storage immediately. The shared property, when set to false, tells the cache that the cache loader isn't shared by multiple caches but is unique to this cache. The location property defines where the cache loader stores passivated sessions; by default this is the session directory under the configuration's data directory.

passivation settings:
<property name="cacheLoaderConfig">
  <bean class="org.jboss.cache.config.CacheLoaderConfig">
    <!-- Do not change these -->
    <property name="passivation">true</property>
    <property name="shared">false</property>

    <property name="individualCacheLoaderConfigs">
    <list>
      <bean class="org.jboss.cache.loader.FileCacheLoaderConfig">
        <!-- Where passivated sessions are stored -->
        <property name="location">${jboss.server.data.dir}${/}session</property>
        <!-- Do not change these -->
        <property name="async">false</property>
        <property name="fetchPersistentState">true</property>
        <property name="purgeOnStartup">true</property>
        <property name="ignoreModifications">false</property>
        <property name="checkCharacterPortability">false</property>
      </bean>
    </list>
    </property>
  </bean>
</property>

Clustering Session EJBs

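There are two reasons why you would want to cluster session EJBs: load balancing, and (for stateful beans) state replication, so that a session can survive the failure of a node.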

Both stateful and stateless beans can make use of load balancing: SFSBs need sticky-session load balancing to achieve server affinity, while SLSBs can make use of random or round-robin load balancing. To enable a session bean to be clustered you must run the all JBoss configuration and annotate the bean with the org.jboss.ejb3.annotation.Clustered annotation, which has a number of attributes

@Clustered annotation attributes

loadBalancingPolicy - the value must be a String naming a load-balancing policy, which is an implementation of the org.jboss.ha.framework.interfaces.LoadBalancePolicy interface. A number of different policies are available:

  • FirstAvailable - randomly selects a target node and sticks with that node for all calls on the proxy (known as sticky-session)
  • FirstAvailableIdenticalAllProxies - like FirstAvailable, but all proxies stick with the same randomly selected node; if it dies another is randomly selected
  • RandomRobin - every EJB request is directed to a random node
  • RoundRobin - cycles across the list of nodes in the cluster sequentially

partition - specifies the particular cluster that the bean participates in, the default is DefaultPartition

Example
Clustered annotation examples

@Stateless
@Clustered
public class ExampleBean implements SomeInterface {
  ...
}

@Stateless
@Clustered(loadBalancingPolicy="RoundRobin", partition="MyPartition")
public class ExampleBean implements SomeInterface {
  ...
}
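The behaviour of the policies can be illustrated with a small Java sketch. The classes below only mimic the descriptions given above; they do not implement the real org.jboss.ha.framework.interfaces.LoadBalancePolicy interface, and the Policy interface, its choose method and the node names are hypothetical.

```java
import java.util.List;
import java.util.Random;

/** Toy node selectors that mimic the described policies; not the JBoss classes. */
final class PolicySketch {

    interface Policy {
        String choose(List<String> nodes);
    }

    /** Cycles across the list of nodes sequentially. */
    static final class RoundRobin implements Policy {
        private int next;

        public String choose(List<String> nodes) {
            String node = nodes.get(next % nodes.size());
            next = (next + 1) % nodes.size();
            return node;
        }
    }

    /** Picks a random node once and sticks with it while it stays in the list. */
    static final class FirstAvailable implements Policy {
        private final Random random;
        private String elected;

        FirstAvailable(Random random) {
            this.random = random;
        }

        public String choose(List<String> nodes) {
            if (elected == null || !nodes.contains(elected)) {
                // first call, or the elected node has left the cluster
                elected = nodes.get(random.nextInt(nodes.size()));
            }
            return elected;
        }
    }

    /** Picks a fresh random node on every call. */
    static final class RandomRobin implements Policy {
        private final Random random;

        RandomRobin(Random random) {
            this.random = random;
        }

        public String choose(List<String> nodes) {
            return nodes.get(random.nextInt(nodes.size()));
        }
    }
}
```

In this model each client proxy would hold its own Policy instance, which is why FirstAvailable gives server affinity per proxy while RoundRobin spreads the calls of a single proxy across the nodes.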

Clustering SFSBs works the same way as above, but you can only specify FirstAvailable as the load balancing policy for stateful beans. The cache used to store and replicate the state of the SFSB can be configured in server/all/deploy/cluster/jboss-cache-manager.sar/META-INF/jboss-cache-configs.xml and jboss-cache-manager-jboss-beans.xml under the sfsb-cache cache configuration. The @CacheConfig annotation has a number of attributes that you can configure

@CacheConfig annotation attributes
maxSize - specifies the maximum number of beans that can be cached before the cache should start passivating beans using the LRU algorithm, default is 10000
idleTimeoutSeconds - specifies the maximum period of time (in seconds) a bean can go unused before the cache should passivate it, default is 300
removalTimeoutSeconds - specifies the maximum period of time (in seconds) a bean can go unused before the cache should remove it altogether, default is 0
replicationIsPassivation - specifies whether the cache should consider a replication as being equivalent to a passivation and invoke any @PrePassivate and @PostActivate callbacks on the bean. By default it is set to true, since replication involves serializing the bean, and preparing for and recovering from serialization is a common reason for implementing the callback methods.
Example
@Stateful
@Clustered
@CacheConfig(maxSize=5000, removalTimeoutSeconds=18000)
...
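The maxSize attribute refers to LRU (least recently used) passivation. As a rough illustration of the idea only, the Java sketch below uses a LinkedHashMap in access order and records which keys would have been passivated once the limit is exceeded. It is a toy model, not the JBoss SFSB cache, and the class and method names are hypothetical.

```java
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

/** Toy LRU bean cache: evicted entries stand in for passivated beans. */
final class LruBeanCache<K, V> extends LinkedHashMap<K, V> {

    private final int maxSize;
    private final List<K> passivated = new ArrayList<>();

    LruBeanCache(int maxSize) {
        super(16, 0.75f, true); // true = iterate in access order (least recent first)
        this.maxSize = maxSize;
    }

    /** Called by LinkedHashMap after each insertion; returning true evicts the eldest entry. */
    @Override
    protected boolean removeEldestEntry(Map.Entry<K, V> eldest) {
        boolean overLimit = size() > maxSize;
        if (overLimit) {
            passivated.add(eldest.getKey()); // a real cache would write the bean to storage here
        }
        return overLimit;
    }

    /** Keys evicted so far, oldest eviction first. */
    List<K> passivatedKeys() {
        return passivated;
    }
}
```

For example, with a maxSize of 2, putting beans a and b, reading a, and then putting c evicts b, because a was used more recently than b.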

Clustering Entities

In the Java Persistence API (JPA), entities are not remotely accessible, so you do not need to cluster entities for the purpose of load balancing. You do, however, still care about state replication and high availability.

The EJB3 specification says nothing about entity caching, but JBoss uses Hibernate as its JPA implementation, and Hibernate supports a pluggable second-level cache. This means that you can use different cache implementations, but JBoss uses JBoss Cache because it is distributed, transactional and already built into JBoss.

JBoss Cache has support for caching the following data types: entities, collections, queries and timestamps.

The data is either replicated or invalidated across the cluster, thus reducing the number of database queries your application has to make. Entities can use several cache configurations. You must configure both your JPA persistence context and your beans to use the entity cache, because you might not want all your entity types to participate in the cache. Queries generated by Hibernate can be cached as well.

First you must configure JPA to know about the cache, which you can do using the META-INF/persistence.xml file in your application

META-INF/persistence.xml

<persistence-unit name="tempdb" transaction-type="JTA">
  <jta-data-source>java:/DefaultDS</jta-data-source>
  <properties>
    <!-- region factory that looks up the JBoss Cache manager in JNDI -->
    <property name="hibernate.cache.region.factory_class"
              value="org.hibernate.cache.jbc2.JndiMultiplexedJBossCacheRegionFactory"/>
    <property name="hibernate.cache.region.jbc2.cachefactory"
              value="java:CacheManager"/>
    <!-- cache configuration to use for each data type -->
    <property name="hibernate.cache.region.jbc2.cfg.entity"
              value="optimistic-entity"/>
    <property name="hibernate.cache.region.jbc2.cfg.collection"
              value="optimistic-entity"/>
    <property name="hibernate.cache.region.jbc2.cfg.ts"
              value="timestamps-cache"/>
    <property name="hibernate.cache.region.jbc2.cfg.query"
              value="local-query"/>
  </properties>
</persistence-unit>

All the configurations for these caches are found in server/all/deploy/cluster/jboss-cache-manager.sar/META-INF/jboss-cache-configs.xml and jboss-cache-manager-jboss-beans.xml. The table below summarizes the cache configurations that are available with JBoss AS 5

Cache configuration name | Supported data types | Best for | Cache mode | Node locking | Initial state transfer
-------------------------|----------------------|----------|------------|--------------|-----------------------
optimistic-entity | Entities, collections | Entities, collections | Synchronous invalidation | Optimistic | No
pessimistic-entity | Entities, collections | Entities, collections | Synchronous invalidation | Pessimistic | No
local-query | Queries | Queries | Local | Optimistic | N/A
replicated-query | Queries | | Asynchronous replication | Optimistic | No
timestamps-cache | Timestamps | Timestamps | Asynchronous replication | Pessimistic | Yes
optimistic-shared | Entities, collections, queries, timestamps | | Synchronous replication | Optimistic | Yes
pessimistic-shared | Entities, collections, queries, timestamps | | Synchronous replication | Pessimistic | Yes

After you have selected the cache configuration that you want to use, you can tell an entity to participate in the cache by annotating the entity class with @Cache. Its usage attribute has several options

@Cache annotation usage options
READ_ONLY - use this if the data never changes. Once the entity is in the cache, it isn't retrieved again unless the cache is manually evicted.
READ_WRITE - use this if the data is occasionally updated, but where isolation is necessary to avoid stale data. This option is not available with JBoss Cache, so you must use a different cache implementation.
NONSTRICT_READ_WRITE - use this if the data is occasionally updated, but where some stale data can be tolerated. This option is not available with JBoss Cache, so you must use a different cache implementation.
TRANSACTIONAL - guarantees full transactional isolation up to repeatable read, and does not tolerate stale data.
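Example
@Cache annotation:
import java.io.Serializable;
import javax.persistence.Entity;
import javax.persistence.Table;
import org.hibernate.annotations.Cache;
import org.hibernate.annotations.CacheConcurrencyStrategy;

@Entity
@Table(name="CATEGORIES")
@Cache(usage=CacheConcurrencyStrategy.READ_ONLY)  // data never changes once cached
public class Category implements Serializable {
  private Long jpaId;
  private String categoryName;
  ...
}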

Clustering JNDI

In Java EE, accessing a server-side component almost always requires a JNDI lookup. JBoss provides a cluster-aware, high availability JNDI service called HA-JNDI that runs on top of the existing JNDI framework. The service gives client applications cluster-wide naming, including automatic discovery of the HA-JNDI servers and load balancing of lookups.

Each node has its own local JNDI service running, and each node also has an HA-JNDI service that uses JGroups to be aware of the HA-JNDI services running on the other nodes in the cluster (see the picture below). Any objects bound into the HA-JNDI service are replicated across the cluster; objects bound into local JNDI are not replicated.

The picture below represents two nodes, each running an HA-JNDI and a local JNDI service. When the client performs an EJB lookup on node A and node A's HA-JNDI service doesn't have the EJB, it calls the local JNDI service on node A. If that doesn't have the EJB either, the lookup moves on to the next node within the cluster. In the worst case, where the EJB is not found at all, an HA-JNDI and local JNDI lookup has to be performed on every node in the cluster.
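The lookup order described above can be sketched in Java as follows. This is a toy model under stated assumptions, not the real JBoss naming classes: the replicated HA-JNDI bindings are modelled as a single shared map (so it is consulted once rather than once per node), each node has its own local map, and every class and method name is hypothetical. The trace list records which services were consulted.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;
import java.util.Optional;

/** Toy model of the HA-JNDI lookup order; it does not use the JBoss naming classes. */
final class HaJndiLookupSketch {

    /** Bindings replicated to every node (what the HA-JNDI service holds). */
    private final Map<String, Object> haBindings = new HashMap<>();

    /** Local JNDI bindings per node, keyed by node name, in cluster order. */
    private final Map<String, Map<String, Object>> localBindings = new LinkedHashMap<>();

    void addNode(String node) {
        localBindings.put(node, new HashMap<>());
    }

    void bindHa(String name, Object value) {
        haBindings.put(name, value);
    }

    /** The node must have been registered with addNode first. */
    void bindLocal(String node, String name, Object value) {
        localBindings.get(node).put(name, value);
    }

    /** Looks a name up starting at entryNode, appending each step taken to trace. */
    Optional<Object> lookup(String entryNode, String name, List<String> trace) {
        // Step 1: the HA-JNDI service on the node the client contacted.
        trace.add(entryNode + ":HA-JNDI");
        Object replicated = haBindings.get(name);
        if (replicated != null) {
            return Optional.of(replicated);
        }
        // Step 2: local JNDI on the entry node, then on each remaining node.
        List<String> order = new ArrayList<>();
        order.add(entryNode);
        for (String node : localBindings.keySet()) {
            if (!node.equals(entryNode)) {
                order.add(node);
            }
        }
        for (String node : order) {
            trace.add(node + ":local");
            Object value = localBindings.get(node).get(name);
            if (value != null) {
                return Optional.of(value);
            }
        }
        return Optional.empty(); // worst case: every node was consulted
    }
}
```

A name bound only in node B's local JNDI and looked up through node A produces the trace A:HA-JNDI, A:local, B:local, which shows why lookups of non-replicated or missing names become more expensive as the cluster grows.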

To enable the HA-JNDI service you edit the server/all/deploy/cluster/hajndi-jboss-beans.xml file. The HAJNDI service depends on the cluster defined by the HAPartition bean in server/all/deploy/cluster/hapartition-jboss-beans.xml. There are a number of properties that the HAJNDI service uses

HAJNDI properties
bindAddress - the network address the JNDI service binds to while waiting for client requests
port - the port number the client uses to look up a naming-service dynamic proxy
rmiPort - after the client looks up a dynamic proxy using the port, the dynamic proxy uses the RMI port to communicate with the JNDI server to do naming lookups
backlog - defines how many queued requests are allowed before clients start getting "Connection Refused" errors
discoveryDisabled - used to disable automatic discovery
autoDiscoveryBindAddress - the network address to bind to for client auto discovery
autoDiscoveryAddress - the multicast address to listen to for automatic discovery
autoDiscoveryGroup - the multicast port to listen to for automatic discovery
autoDiscoveryTTL - the TTL (time to live) for automatic discovery multicast packets
loadBalancePolicy - the load-balancing policy to use inside the dynamic proxy downloaded by the client

In your jndi.properties file you can either list the HA-JNDI servers manually or use automatic discovery

manual:
java.naming.factory.initial=org.jnp.interfaces.NamingContextFactory
java.naming.provider.url=192.168.0.1:1100,192.168.0.2:1100

auto discovery:
java.naming.factory.initial=org.jnp.interfaces.NamingContextFactory
jnp.partitionName=MyCluster
jnp.discoveryGroup=230.0.0.4
jnp.discoveryPort=1102
jnp.disableDiscovery=false
jnp.discoveryTimeout=5000
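The same settings can also be supplied programmatically instead of through a jndi.properties file. The Java sketch below only builds the Properties objects, using the property names and values shown above, so it runs with a plain JDK. The class and method names are hypothetical, and creating the actual context is left as a comment because it needs the JBoss client libraries on the classpath.

```java
import java.util.Properties;
import javax.naming.Context;

/** Builds client-side JNDI settings; helper names are illustrative only. */
final class HaJndiClientProps {

    private static final String FACTORY = "org.jnp.interfaces.NamingContextFactory";

    /** Manual configuration: an explicit list of host:port HA-JNDI endpoints. */
    static Properties manual(String... hostPorts) {
        Properties props = new Properties();
        props.setProperty(Context.INITIAL_CONTEXT_FACTORY, FACTORY);
        props.setProperty(Context.PROVIDER_URL, String.join(",", hostPorts));
        return props;
    }

    /** Auto discovery: no provider URL, the client finds HA-JNDI by multicast. */
    static Properties autoDiscovery(String partition, String group, int port) {
        Properties props = new Properties();
        props.setProperty(Context.INITIAL_CONTEXT_FACTORY, FACTORY);
        props.setProperty("jnp.partitionName", partition);
        props.setProperty("jnp.discoveryGroup", group);
        props.setProperty("jnp.discoveryPort", Integer.toString(port));
        return props;
    }

    public static void main(String[] args) {
        Properties props = manual("192.168.0.1:1100", "192.168.0.2:1100");
        System.out.println(props.getProperty(Context.PROVIDER_URL));
        // With the JBoss client jars available you would then create the context:
        // Context ctx = new javax.naming.InitialContext(props);
    }
}
```

Listing more than one address in the manual form lets the client try the next server if the first one is unavailable.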