Team Process Data Warehouse

From its inception, the Process Dashboard has used a specialized data repository to manage project metrics. With the release of Team Dashboard 2.0 and the Team Process Data Warehouse, selected metrics can also be accessed in a relational database via SQL.

Database Design Philosophy

The relational database is the result of several years of analysis and design. It is designed as a data warehouse, with explicit goals for scalability, extensibility, queryability, and performance.

The relational database is explicitly designed to flexibly support several modes of operation. At the low end it can publish the data for a single individual or a single team; but at the high end it can capture massive amounts of information from many teams across one or more organizations. The exact same database schema is used for all of these deployments, enabling the creation of common reporting logic that can be leveraged at either the team or organizational level.

For more information about the requirements and analysis that have driven the database design, feel free to review the following high-level documents:

Database Contents

The database contains a wide range of team project data. The following documents describe the tables and contents of the database:

The database is an area of very active development. It currently publishes a significant subset of team project metrics, but more will be added in the future.

The database design strives to follow best practices for data warehousing. Please note that these best practices differ significantly from the third-normal-form designs that are common for OLTP databases. Individuals who are unfamiliar with data warehouse design are encouraged to review the literature on dimensional modeling for more information.

Data Flow

It is important to note that the data in the database is read-only. You can query data in the database (using SELECT statements) to generate your own custom reports, perform your own custom analyses, and export data into external systems. However, you should not attempt to INSERT, UPDATE, or DELETE the data in the database. Such changes will not flow back into the Process Dashboard. (If you need to modify Process Dashboard data, consider contacting the development team for more information about the REST APIs that are available for this purpose.)

Modes of Operation

The database can be deployed in a variety of scenarios to meet different analysis and reporting needs. A few of these different modes are described below.

Enterprise Data Warehouse

At the high end, it is possible for an organization to create a centralized data warehouse to capture the data from all of the team projects within their organization. This warehouse can be used as the basis for organizational reporting and for data mining and analysis. The Enterprise Data Warehouse is capable of capturing the current state of each team plan, as well as historical information about how the plans and data have changed over time. A variety of different database platforms are supported, from the open-source MySQL to commercial offerings like Microsoft SQL Server and Oracle.

Organizations who are interested in using the Enterprise Data Warehouse should contact Tuma Solutions for more information.

Lightweight Embedded Database

Many teams are interested in using SQL to query and analyze their internal project data. For example, they may wish to build custom charts or reports, or they may wish to build analysis tools that support their Planning Manager or Quality Manager. Such tools only need access to data from a single team (not the entire organization). To support these usage scenarios, and to make the data warehouse available to as many people as possible, a lightweight zero-configuration database is embedded in Process Dashboard 2.0 and higher.

Access via Custom JSPs and Servlets

The simplest way to access this embedded database is to write custom reports as JSPs or servlets, package these into a WAR file, and deploy that WAR as a Team Dashboard add-on. Your JSPs will be able to use simple, expressive HQL queries to access a great deal of team project data. See the custom JSP documentation and examples for more information.

External Access via JDBC

You can also access the embedded database from an external process using JDBC. To enable external access, choose the “Database” option on the “Tools” menu. A toggle button on that window will allow you to enable connections.

This lightweight embedded database accepts inbound connections via two different communications protocols:

  • If you are connecting to the database from Java code, you will use JDBC. You will need a copy of the H2 database driver; then, the JDBC URL is displayed on the preferences window.
  • If you are connecting from a non-Java client that does not support JDBC, you can also connect using the PostgreSQL communications protocol. Native PostgreSQL drivers are available for many programming languages, including C/C++, Java, .Net, Perl, Python, Ruby, Tcl, and ODBC. PostgreSQL connection information is displayed on the preferences window.

The lightweight embedded database has been integrated into the dashboard to provide simple, point-and-click access to the database without the need for complex configuration. This streamlined simplicity does come with a few tradeoffs. Most notably, the PostgreSQL connection does not publish metadata about the structure and relationships of tables, views, etc. As a result, if you need to explore the contents of the database or use visual query builder tools, it is recommended to select a database exploration tool that supports JDBC. Once you’ve designed your SQL queries in that tool, you can easily run those queries over the PostgreSQL interface if desired from a non-Java application or script. If you need to use a specific database design tool that does not support JDBC, you should consider using the Enterprise Data Warehouse instead of the lightweight embedded database.

Most teams will be interested in analyzing the data for a team dashboard. When database connections are opened for a particular team dashboard, the data in the database is the same data that the team dashboard uses to calculate rolled-up team reports.

Curious individuals can also publish a database from their personal dashboard. That database will only contain data from the team projects that you have joined; it will not contain data from standalone personal work (such as data from a PSP course). In addition, the database will only contain the data that you have exported for team rollups. To update the database with your most recent data, it will be necessary to perform an “Export my Data Now” operation.

Lightweight Standalone Database

The embedded database described above provides a very simple way to access the data for a particular team dashboard. But if an organization is looking to consume data from an automated batch process, it may not be desirable to launch a Team Dashboard GUI first and click on a button in the Preferences dialog.

To meet the different needs of a batch process or other automated analysis task, a command-line-driven database launcher is also provided. To start it, issue the following statement from a terminal window or command prompt:

java -jar [path]tpidw.jar [location] [options]

Where [path] is the complete path to the directory where the Process Dashboard is installed, and [location] is the name of a team dashboard directory or a data backup ZIP file. Running the command without a [location] argument will print information about other options that are available.

When run in this way, the standalone launcher will create a lightweight database and publish the data for the given team dashboard, all without starting any user interface. It will accept inbound connections (using the same communication protocols described above for the lightweight embedded database) until it receives a “quit” command on STDIN.

The standalone launcher can also host WAR files containing custom reports in JSPs or servlet form, just like the Team Dashboard. (Run the standalone app with no [location] argument for help on the command line options to enable this mode.) This makes it possible to run custom reports without launching a full Team Dashboard window. For batch processing, this also makes it possible to query the data using HQL, which is much simpler than SQL. If your custom report outputs the results in a format like XML, JSON, or CSV, your batch process can consume the data easily over HTTP, eliminating the need to negotiate SQL/JDBC wire protocols.

Feedback

The embedded database and broader data warehouse are areas of intense development by the Process Dashboard team. To shape this ongoing development, your feedback and suggestions are strongly welcomed. Please contact the development team with your ideas, concerns, criticism, and suggestions.