JDBC Drivers: How Do You Know What You Need? (cont.) JDBC Driver Features
Sun Microsystems maintains a listing of JDBC drivers at: http://industry.java.sun.com/products/jdbc/drivers, shown in Figure 3. This web page offers a driver selector tool that allows you to generate a list of drivers that support specific features, including driver types, JDBC version support, and access to one or multiple specific databases. As of the writing of this article (March 2002) there were 155 drivers listed on this page! Note: The driver selector tool shown in Figure 3 hasn't yet been updated to account for JDBC version 3.0, so expect to see some changes in this tool in the months to come. What you see in Figure 3 might not match what is on the Sun Microsystems web page at a later date.
Let's consider the feature choices listed on the Sun Microsystems JDBC driver web page in a little detail, because, as we said, many people use this tool to select one driver over another:
- JDBC API version number (Any, 1.x or 2.x (select one)).
- Certified for J2EE: J2EE 1.3 or J2EE 1.2 (select any or all).
- Driver Type: 1, 2, 3, 4, All, or Any (select any or all).
- Supported DBMS. There are 83 choices, mostly vendor products, but a few of which are industry standards like JDBC, LDAP, ODBC, OLE DB Provider, Text (TSV), SQL/DS, and XML. One or a group of selections are supported.
- Required Features: DataSource, Connection Pooling, Distributed Transactions, and RowSets (pick any or all).
- Returns per page (specifies the number of matches returned).
|
Figure 3. Sun's JDBC Driver home page offers a selection tool that helps a developer select a driver based on its type and a number of other factors. |
Let's take a look at each feature in turn, and see if we can make sense out of each feature and their options.
JDBC API
The JDBC API is important because it determines the Java functionality available to a developer. While older Java applications may not be able to take advantage of the more advanced features provided by JDBC 3.0, any high-transaction, distributed application certainly would. What developers get when they use later versions of the API is any new DBMS and operating-system security enhancements, and the latest performance improvements such as advancements in connection pooling, statement pooling, RowSet objects, and so forth.
Certification
Remember that JDBC is a specification, and not a standardized piece of software. Vendors are free to implement their JDBC drivers to that specification as they see fit, and while some JDBC drivers are fully J2EE compliant; many other drivers are not. When considering one driver over another, the Certified for J2EE logo, as specified by the Sun Certification Test Suite (CTS) and administered by Key Labs, is one yardstick a company can use to measure a driver's quality. Any company who undertakes the certification process is more likely to pay attention to quality control in their product. You will find a listing of vendors who have endorsed the JDBC standard and have products in this area at: http://java.sun.com/products/jdbc/industry.html.
Required Features
The "Required Features" section shown in Figure 3 lists some features, introduced in JDBC 2.0, but required for JDBC version 3.0.
A DataSource is an object containing the connection information to a database that is managed by a JDBC driver. Data sources work with a JNDI (the Java Naming Directory Interface) service, and a connection is instantiated and managed independently of the applications that use it. Connection information, such as path and port number, can be quickly changed in the properties of the DataSource object without requiring code changes in the applications that use the data source. Currently 50 of the 155 listings (not all are drivers) on the Sun Microsystems web page support this feature.
JDBC supports connection pooling, which essentially involves keeping open a cache of database connection objects and making them available for immediate use for any application that requests a connection. Instead of performing expensive network roundtrips to the database server, a connection attempt results in the re-assignment of a connection from the local cache to the application. When the application disconnects, the physical tie to the database server is not severed, but instead, the connection is placed back into the cache for immediate re-use, substantially improving data access performance.
Opening a connection is the most resource-expensive step in database transactions. When an application creates a connection, multiple separate network roundtrips must be performed to establish that connection (for an Oracle connection, that number is nine). However, once the connection object has been created, there is little penalty in leaving the connection object in place and reusing it for future connections. This feature offers significant potential to improve data transfer performance and scalability, especially for application servers. This feature is supported by 50 of the listings on the Sun Microsystems web page.
In a distributed system, applications often must retrieve data using multiple transactions, and often from multiple data sources. Any transaction that requires the coordination of independent cooperating transactional systems is referred to as a distributed transaction. Aside from the performance characteristics of JDBC drivers, distributed transaction support is probably the most requested feature by Java database developers.
Distributed transactions require additional resources to be reliably supported, primarily because of the differences in latencies when retrieving data from different data sources, and because of interoperability issues. For example, a failed transaction is not easily differentiated from a slow transaction, requiring resource managers in a DTS be both registered and coordinated to handle ROLLBACK or COMMIT operations correctly without a lot of code development.
Typically, distributed transactions use a transaction manager to provide the coordination that the different resource managers need. With a transaction manager in a Distributed Transaction Processing (DTP) architecture providing the mediation between applications and resources, it's possible to provide applications with ACID transactions across multiple data sources.
In a situation where multiple steps are required to produce the desired result, there needs to be a software module, which is often a transaction manager, coordinating the process. A product like iPlanet Trustbase Transaction Manager is one example of this type of software. It's used in global banking and B2B systems to provide the services required to secure processing and routing based on a key structure, and includes cryptographic support and identity checking. Only 34 of the drivers on the Sun Microsystems JDBC driver web page offer this feature.
A RowSets object is the result set of a query containing rows from a tabular data source. RowSets have properties and event notification similar to JavaBeans and are a JavaBean component that can be created and used programmatically in a development tool. There are connected and disconnected rowsets. Disconnected rowsets are connected to a data source that is populated and do not require a JDBC driver when disconnected. These types of rowsets are small, and often are used to send data to a thin client. Disconnected rowsets are stored in memory along with their metadata and connection and execution instructions. A connected rowset maintains an open connection while the rowset is being used. Only 18 of the drivers now listed support the rowsets feature, which as this small number suggests, isn't an often used feature.
A note worth mentioning: On the Sun JDBC Driver site, Sun defines "support" for RowSets to mean actually packaging and shipping the RowSets with the JDBC driver. Thus, you will find that there are drivers on this site that, while not showing a check mark next to "RowSets," in fact do support multiple types of RowSets.
Not on the current list, but sure to be added as a searchable feature because it's in JDBC version 3.0, is a feature called prepared statement pooling. Essentially, statement pooling caches SQL queries that have been previously optimized and run so that, should they be needed again, they do not have to go through optimization pre-processing again. Statement pooling can be a significant performance booster and is something to look for in any JDBC driver.
Prepared statements are particularly valuable when the same SQL statement is executed many times. A query that populates the current status of a particular category of inventory, run many times during the day, would be one example. To execute the statement a second time, only the values of the statement's parameters need to be passed to it. All the optimization steps, such as checking syntax, validating addresses, and optimizing access paths and execution plans, are already cached in memory. Statement caching is a JDBC driver feature.
What's even better from a developer's standpoint is that statement pooling operates without a developer having to code for it (as does connection pooling). You get the performance enhancement when you buy a driver that supports this feature.
Statement pooling and connection pooling can work together in your applications, as long as your driver supports them. When a J2EE Enterprise JavaBean and session bean makes use of a connection from the connection pool, any connection that was previously used to run a SQL statement will have its statement pool already defined. Therefore, even the application's first attempt to prepare a statement for a particular connection will not require the statement preparation process. Statement pooling can be used on any connection, regardless of whether the connection came from a non-pooled data source, or from an application server that uses a pooled data source or a JDBC driver's connection pool.