Seven Rules for Optimizing Entity Beans

By Akara Sucharitakul
May 2001

Entity beans provide a clear model for representing persistent business objects in applications and their design. In object models, simple Java™ objects are normally represented in a straightforward way, but they do not include the transactional persistence management usually required for business objects. Entity beans allow the same kind of modelling and thinking about business objects, and they also encapsulate the persistence mechanism, hiding its complexity behind the bean and container services. Applications can therefore manipulate entity beans as ordinary Java object references. Because the persistent form and the persistence mechanism are hidden from calling code, the container is free to apply creative persistence optimizations, and the choice of data store stays open until deployment time.

Through Enterprise JavaBeans™ (EJB™) development projects that make extensive use of object-oriented methodologies, and therefore heavy use of entity beans, Sun engineers have gained practical experience with entity beans in the real world. This article draws on that experience to:

Explore various optimizations
Provide rules and recommendations to achieve optimal performance and flexibility
Discuss how to avoid the known caveats

Use Container-Managed Persistence When You Can

Container-managed persistence (CMP) not only eases the job by requiring far less code, but also enables many optimizations within the container and the container-generated database access code. The container has access to the in-memory buffer of the bean, which allows it to monitor the buffer for changes. If the buffer has not changed, the container can skip storing it to the database before committing a transaction. This avoids unnecessary, expensive database calls.

Another instance of this optimization concerns find methods. Finding an entity provides a reference to that entity, which can be, and most likely will be, used right after the find operation. This usually involves two database accesses:

Finding the record in the database and retrieving the primary key
Retrieving the record data into the buffer

CMP allows for optimizing these two database accesses into a single database access whenever it makes sense to do so, allowing retrieval of both the key and the record data in one database call.

Write Code that Supports Both Bean- and Container-Managed Persistence

In many cases the EJB's author does not control how the EJBs are deployed or whether the container used for deployment supports CMP. The deployer may also choose to use bean-managed persistence (BMP) in the target container. You must therefore implement the beans so that they allow BMP deployment without disrupting support for the potentially more optimized CMP mechanism. One easy way to achieve this is to separate pure business logic from the persistence mechanism. Implement the business logic in your CMP class, which can be deployed alone if CMP is chosen. Then implement the persistence code in the BMP class, which inherits from the CMP class. This preserves all the business logic of the CMP superclass and adds database access code in the BMP subclass, as shown in Figure 1.

Figure 1: Separation between CMP and BMP
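
The split in Figure 1 might look roughly like the sketch below. It uses plain Java stand-ins rather than the real javax.ejb interfaces so that it compiles on its own. The class names, the balance field, and the in-memory map that plays the role of the database are all assumptions made for illustration; they do not come from the article's own source code.

```java
import java.util.HashMap;
import java.util.Map;

// Business logic only. In a real deployment this would be the CMP bean class
// (implementing javax.ejb.EntityBean); the container would handle persistence.
class AccountCmpSketch {
    protected String id;      // primary key (hypothetical field)
    protected long balance;   // persistent state (hypothetical field)

    public void deposit(long amount) {
        if (amount <= 0) {
            throw new IllegalArgumentException("amount must be positive");
        }
        balance += amount;
    }

    public long getBalance() {
        return balance;
    }

    // Empty persistence callbacks: under CMP the container does the work.
    public void ejbLoad() { }
    public void ejbStore() { }
}

// Adds only persistence code; the business logic is inherited unchanged.
class AccountBmpSketch extends AccountCmpSketch {
    // Stand-in for a database table. A real BMP bean would use JDBC here.
    static final Map<String, Long> TABLE = new HashMap<>();

    @Override
    public void ejbLoad() {
        balance = TABLE.getOrDefault(id, 0L);
    }

    @Override
    public void ejbStore() {
        TABLE.put(id, balance);
    }
}
```

The deployer would pick AccountCmpSketch or AccountBmpSketch as the bean class, depending on whether the target container manages persistence.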

This model sounds easy to implement but still leaves the following flexibility inhibitors:

It is impossible to extend the implementation classes.
Extending a bean means that the new BMP subclass must extend the existing BMP superclass and, at the same time, its own CMP implementation class. This requires multiple class inheritance, as shown in Figure 2, which the Java programming language does not allow.
Figure 2: Multiple inheritance not supported in Java
There is no easy way to support alternate persistence implementations.
This would be useful if, for instance, implementations contain database-specific code for different database vendors or different database types (for example relational, object, and other historic types).

To solve these problems, change the BMP class so that it delegates all the persistence code to a helper class and keeps only a skeleton of delegation code itself. This kind of helper class is called a Data Access Object (DAO). You can provide multiple DAO classes that implement a common DAO interface, which allows the correct DAO implementation to be instantiated at run time. This is shown in Figure 3. There are many ways to choose and instantiate the correct DAO class, for example, by reading environment entries or by determining the database type and choosing the most appropriate implementation.

Figure 3: Delegation and allowing for alternate DAO implementations
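
A minimal sketch of the delegation in Figure 3 follows, again with plain Java stand-ins instead of the javax.ejb and JDBC APIs. The interface and class names, the "vendor-x" database type, and the in-memory map are hypothetical. The dbType string stands in for a value that a real bean might read from an environment entry, and the business-logic superclass is omitted for brevity.

```java
import java.util.HashMap;
import java.util.Map;

// Common contract for all persistence helpers (hypothetical name).
interface AccountDao {
    long load(String id);
    void store(String id, long balance);
}

// Portable default. A real implementation would issue standard SQL via JDBC.
class GenericAccountDao implements AccountDao {
    protected final Map<String, Long> rows = new HashMap<>();

    public long load(String id) {
        return rows.getOrDefault(id, 0L);
    }

    public void store(String id, long balance) {
        rows.put(id, balance);
    }
}

// Vendor-specific variant. It extends the generic DAO and would override only
// the operations that benefit from vendor-specific statements.
class VendorTunedAccountDao extends GenericAccountDao { }

final class AccountDaoFactory {
    // Chooses a DAO from a database type string (assumed to come from
    // deployment configuration). Unknown or null types get the generic DAO.
    static AccountDao forDatabase(String dbType) {
        if ("vendor-x".equalsIgnoreCase(dbType)) {
            return new VendorTunedAccountDao();
        }
        return new GenericAccountDao();
    }
}

// BMP skeleton: nothing but delegation to the chosen DAO.
class AccountBmpSkeleton {
    private final AccountDao dao;
    private final String id;
    private long balance;

    AccountBmpSkeleton(String id, String dbType) {
        this.id = id;
        this.dao = AccountDaoFactory.forDatabase(dbType);
    }

    public long getBalance() { return balance; }
    public void setBalance(long balance) { this.balance = balance; }

    public void ejbLoad()  { balance = dao.load(id); }
    public void ejbStore() { dao.store(id, balance); }
}
```

Because the skeleton only forwards calls, it looks nearly identical from one bean to the next, while the DAO classes remain free to form their own inheritance hierarchy.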

Using this model, both the CMP implementation and the DAOs that contain the persistence implementations can be easily extended; everything but the BMP classes can be extended. Because the BMP classes contain only a skeleton of delegation code that looks almost the same for every entity bean, plus the instantiation code that chooses the right DAO object, you can easily copy them from one bean to another with very little modification. Eventually, you could also generate them automatically with a tool.

While an entity bean can extend another entity bean's implementation to reuse its logic, the EJB 1.1 specification does not support extending entity types at the level of the home interface. Finder and create calls are theoretically always remote calls, and the generated stubs never represent a subtype, even where that would be the logical result (for instance, when using them as factory methods). This prevents the use of many useful design patterns in EJB implementations that follow the EJB 1.1 specification.

Minimize Database Access in ejbStores

When using CMP, the bean has absolutely no control over ejbStore, and leaves all optimizations to the container. CMP optimizations provided by the container become a differentiating factor between a fast and a slow container.

Whenever the bean is deployed using BMP, it is very useful to maintain a dirty flag for the buffer. Every change to the buffer must set the dirty flag, which ejbStore then checks. If the dirty flag is not set, meaning the buffer has not changed, ejbStore simply skips all the expensive database access calls. This trick is especially useful for beans that are queried often but rarely updated. Such beans (for example, lookup tables) usually make up a rather large part of many applications.

There are caveats with this technique. Because the dirty flag exists only to optimize database access, it belongs with the BMP code rather than the CMP code. With the pattern explained previously, that leaves two candidates: the BMP skeleton or the DAOs. The DAOs are never invoked on calls to the business methods, so they cannot observe changes to the buffer and are not the right place for the flag; it has to live in the BMP skeleton. This inherently makes the BMP code more complicated, because the business methods that change the buffer must be overridden to set the flag and then delegate to the superclass.

Philosophically, the writer of an EJB should not have to deal with system-level issues like buffer handling. Unfortunately, EJB 1.1 with BMP does not provide any alternative for such an optimization, so the bean author still has to set this dirty flag by hand.
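
The following is a minimal sketch of a dirty flag kept in the BMP skeleton, written with plain Java stand-ins for the EJB callbacks. The class names, the name field, and the storeCount counter (which simulates database writes so the effect can be observed) are assumptions for illustration only.

```java
// Business logic; knows nothing about persistence or dirty tracking.
class ProductCmpSketch {
    protected String name;   // persistent state (hypothetical field)

    public void setName(String name) { this.name = name; }
    public String getName() { return name; }

    public void ejbLoad() { }
    public void ejbStore() { }
}

// BMP skeleton that overrides the mutating business method to track changes.
class ProductBmpSketch extends ProductCmpSketch {
    private boolean dirty;   // true when the buffer differs from the database
    int storeCount;          // counts simulated database writes (demo only)

    @Override
    public void setName(String name) {
        super.setName(name); // delegate to the business logic
        dirty = true;        // then record that the buffer changed
    }

    @Override
    public void ejbLoad() {
        // A real bean would delegate to its DAO here to read the record.
        dirty = false;       // buffer and database now agree
    }

    @Override
    public void ejbStore() {
        if (!dirty) {
            return;          // nothing changed: skip the database entirely
        }
        storeCount++;        // a real bean would delegate to its DAO here
        dirty = false;
    }
}
```

Read-only methods such as getName need no override, which is why beans that are mostly queried benefit the most.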

Always Cache References Obtained from lookups and find Calls

Reference caching is useful for both entity beans and session beans. JNDI lookups of EJB resources, such as DataSources, bean references, or even environment entries can be fairly costly, and it is simple to avoid redundant lookups. To solve this problem, always:

Define these references as instance variables.
Look them up in the setEntityContext method (setSessionContext for session beans).

The setEntityContext method is only called once for a bean instance, so looking up all required references at this time is not really costly. Avoid looking up references in any other method, especially the database access methods, ejbLoad and ejbStore (not even DataSource lookups). Such methods can be called frequently, resulting in a lot of time spent inefficiently in lookup calls.

Calls to the finders of other entity beans are also heavyweight calls. While these calls may or may not fit into bean initialization callbacks like setEntityContext, cache the references resulting from finds whenever applicable. The cost of making redundant calls to finder methods can be very high. If the reference is only valid for the current entity, you need to clear the references before the instance gets reactivated to represent other entities. This should be done inside the ejbActivate method.
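
The caching described above might be sketched as follows. The JNDI calls (javax.naming.InitialContext.lookup) and javax.sql.DataSource are real APIs, but everything else is an assumption: the class name, the JNDI name java:comp/env/jdbc/OrderDB, and the findCustomer stand-in for a finder call on another bean's home interface. The callbacks take no EntityContext argument so that the sketch compiles without the javax.ejb classes.

```java
import javax.naming.InitialContext;
import javax.naming.NamingException;
import javax.sql.DataSource;

class OrderBeanSketch {
    private DataSource dataSource;  // looked up once per bean instance
    private Object customerRef;     // finder result, valid for one entity only

    // Stand-in for setEntityContext(EntityContext): do all lookups here.
    public void setEntityContext() {
        try {
            dataSource = (DataSource) lookup("java:comp/env/jdbc/OrderDB");
        } catch (NamingException e) {
            // A real bean would typically throw javax.ejb.EJBException.
            throw new IllegalStateException("DataSource lookup failed", e);
        }
    }

    // Isolated so the lookup mechanism can be replaced, for example in tests.
    protected Object lookup(String name) throws NamingException {
        return new InitialContext().lookup(name);
    }

    // Stand-in for ejbActivate: the instance is about to represent another
    // entity, so entity-specific references must be dropped.
    public void ejbActivate() {
        customerRef = null;
    }

    // Returns the cached finder result, calling the finder only when needed.
    public Object getCustomer() {
        if (customerRef == null) {
            customerRef = findCustomer();
        }
        return customerRef;
    }

    // Hypothetical placeholder for an expensive finder call.
    protected Object findCustomer() {
        return new Object();
    }

    // ejbLoad and ejbStore would use this cached reference, never a lookup.
    protected DataSource cachedDataSource() {
        return dataSource;
    }
}
```

The DataSource survives activation because it is not tied to any particular entity, whereas the finder result is cleared in ejbActivate and fetched again lazily.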

Always Prepare Your SQL Statements

This optimization is useful for all code that uses SQL to access relational databases. Since most current EJB implementations use relational databases, the rule is equally useful for EJB development wherever the author of the bean has to write database access code.

For each SQL statement it processes, the database has to spend time compiling the statement before executing it. Good relational databases, however, cache each statement together with its compiled form and match new statements against the cache to retrieve their compiled forms. To benefit from this optimization, the new statements must exactly match the old statements.

Non-prepared statements
For non-prepared statements, the data and the statement itself are passed in the same string. Although the statement looks the same in subsequent calls, the data differs, so the strings do not match and the optimization is defeated.
Prepared statements
For prepared statements, only the statement, without the data, is passed to the database, and that form gets cached.

When you use the statement, the data is passed to it and the statement is executed. Normally, the statement is compiled at prepare time, but subsequent prepares of the same statement match the cache and are not recompiled. This technique promotes very high statement cache hit ratios (close to 100%), which minimizes the number of statement compilations. For small database accesses, this can decrease the statement execution time by up to 90%.
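
The difference between the two forms can be sketched with the standard java.sql API. The customer table, its columns, and the class and method names are hypothetical. The try-with-resources syntax requires Java 7 or later, which postdates this article.

```java
import java.sql.Connection;
import java.sql.PreparedStatement;
import java.sql.SQLException;

final class CustomerSqlSketch {
    // The SQL text is constant; only the bound values change between calls,
    // so every prepare presents the database with an identical string.
    static final String UPDATE_NAME =
            "UPDATE customer SET name = ? WHERE id = ?";

    // Prepared form: the data travels separately from the statement text.
    static int updateName(Connection con, int id, String name)
            throws SQLException {
        try (PreparedStatement ps = con.prepareStatement(UPDATE_NAME)) {
            ps.setString(1, name);
            ps.setInt(2, id);
            return ps.executeUpdate();
        }
    }

    // Non-prepared form, shown only for comparison: the data is embedded in
    // the text, so each call yields a different string that cannot match the
    // statement cache. Concatenating values this way also invites SQL
    // injection.
    static String unpreparedSql(int id, String name) {
        return "UPDATE customer SET name = '" + name + "' WHERE id = " + id;
    }
}
```

Calling updateName for two different customers sends the same statement text both times, while unpreparedSql produces two distinct strings for the same pair of calls.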

Close All Statements Properly

When dealing with database access code in BMP implementations, never leave statements open after database access calls. Each open statement corresponds to an open cursor in the database. (While the garbage collector eventually reclaims an open statement and closes it at garbage collection (GC) time, you have no control over when the GC kicks in. Do not try to force GC by calling the System.gc method.) Leaving statements open causes the database to hold excessive open cursors, which cost resources that the database could otherwise use to improve performance.

Also, be sure to catch all exceptions properly when closing the statements. An exception in closing one statement must not cause other statements to be ignored and left open.
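
One way to make sure a failure in one close call does not leave other statements open is a small helper like the sketch below. The class and method names are hypothetical; java.sql.Statement and SQLException are the standard JDBC types.

```java
import java.sql.SQLException;
import java.sql.Statement;

final class JdbcCleanupSketch {
    // Closes every non-null statement, even if closing an earlier one fails.
    // Returns the number of statements whose close() threw an exception.
    static int closeAll(Statement... statements) {
        int failures = 0;
        for (Statement statement : statements) {
            if (statement == null) {
                continue;           // never opened, nothing to close
            }
            try {
                statement.close();
            } catch (SQLException e) {
                failures++;         // note the failure, keep closing the rest
            }
        }
        return failures;
    }
}
```

A typical call site would invoke closeAll from a finally block so that it runs whether or not the database access succeeded. On Java 7 and later, a try-with-resources statement gives a similar guarantee, because Statement implements AutoCloseable.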

Avoid Deadlocks

Application code has no direct control over when ejbStore (or its equivalent for CMP) is called. The container decides when to make this call, which is usually at the end of a transaction.

If multiple entity beans or multiple entities are involved in a transaction, the sequence in which ejbStore gets called is not defined. This also means the user has no control over the access and locking sequence for the database records representing those entities. A mixed-up locking order is very likely to result in deadlocks when multiple tables or rows are involved.

Ideally, container-controlled access and locking in EJBs should allow the container to account for deadlocks and free developers and deployers from such worries. Unfortunately, few if any of the currently available commercial application servers do a good job of ordering database locks, which leaves deadlock problems in complex entity bean deployments to the deployer.

The applicable rule, at least at the time of this writing, is to assume that the container invokes the database access calls of the beans in the same sequence in which a transactional method of each bean is first accessed as part of the transaction. To make this clearer, consider the following example: Assume entity bean EB1 has transactional method m1 and entity bean EB2 has transactional method m2. If EB1.m1 is accessed before EB2.m2 as part of the same transaction, you can also assume that EB1.ejbLoad is called before EB2.ejbLoad and EB1.ejbStore is called before EB2.ejbStore. This means the entity or database record representing EB1 is locked before the one representing EB2. To avoid deadlocks, make sure that EB1 is always called before EB2 in any transaction throughout the whole application.

As application servers become more intelligent and learn how to order database accesses, authors and deployers can be less careful about the bean access sequence. However, code that strictly controls the invocation order, as in the example provided, will continue to run on future servers as well as on today's.
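
When a transaction touches several entities of the same type, one way to keep the access order consistent is to sort the participants by a stable key, such as the primary key, before calling any of them. The sketch below illustrates the idea with plain Java objects. The AccountSketch and TransferSketch classes and the lowest-id-first rule are assumptions for illustration, and the returned access log exists only to make the order visible.

```java
import java.util.ArrayList;
import java.util.List;

// Stand-in for an entity bean with a transactional business method.
class AccountSketch {
    final int id;     // primary key (hypothetical)
    long balance;

    AccountSketch(int id, long balance) {
        this.id = id;
        this.balance = balance;
    }

    // Business method; records the access so the order can be inspected.
    void adjust(long delta, List<Integer> accessLog) {
        accessLog.add(id);
        balance += delta;
    }
}

final class TransferSketch {
    // Moves amount from one account to another, always touching the account
    // with the lower id first, regardless of the transfer direction.
    // Returns the ids in the order they were accessed.
    static List<Integer> transfer(AccountSketch from, AccountSketch to,
                                  long amount) {
        if (from.id == to.id) {
            throw new IllegalArgumentException("accounts must differ");
        }
        List<Integer> accessLog = new ArrayList<>();
        AccountSketch first = (from.id < to.id) ? from : to;
        AccountSketch second = (first == from) ? to : from;
        first.adjust(first == from ? -amount : amount, accessLog);
        second.adjust(second == from ? -amount : amount, accessLog);
        return accessLog;
    }
}
```

Under the assumption stated above, that the container loads and stores beans in the order of first access, two concurrent transfers in opposite directions between the same pair of accounts would then request their record locks in the same order.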

The Vehicle and Car bean source code provided with this article illustrates practical use of most of these rules and can be used as a basis or template for future entity bean development.

Going Forward

These rules, when applied to entity-intensive developments and deployments, can significantly increase the performance and flexibility of the beans, allowing them to adapt to different persistent storage types while minimizing their load on the container and the underlying system. This gives the deployer the flexibility to choose the most suitable deployment infrastructure while allowing the beans to make efficient use of the infrastructure provided. In other words: write once, run anywhere, efficiently.

For More Information

Enterprise JavaBeans downloads and specifications
jGuru's Enterprise JavaBean Fundamental short course
Recent Java Developer articles on Enterprise JavaBeans
The ECperf workload

About the Author

Akara Sucharitakul was one of the developers of the ECperf benchmark and other J2EE server based applications inside Sun. He has more than 5 years of experience with Java technology and has been working on J2EE technology since its early days.

[ This page was updated: 25-Apr-2002 ]
