Bitten: A Python framework for collecting software metrics from automated builds
Today's automated builds generate a tremendous volume of information about the state of software development projects. This starts with basic status indicators such as compilation errors and test failures, but is increasingly extended to include advanced software metrics such as dependency analysis, code coverage analysis or style checking.
Traditionally, continuous integration systems such as Tinderbox, BuildBot, or Gump only record and display the data that the build system prints to the standard output and error streams. As a result, much of the information that a build generates about a code base cannot be used to its full extent.
To effectively provide value for the ongoing development and management of a project, data generated by builds needs to be collected in a central repository, and in a machine-readable format, to allow for analysis and presentation of the data even long after the actual build has been run. In addition to being able to adjust how data is analyzed and presented in retrospect, this approach is essential for historical reports that show how specific metrics are evolving over time -- which is often more valuable than the absolute values of these metrics at one specific point in time.
The goal of this work is to design and implement a distributed system for automated builds and continuous integration that allows the central collection and storage of software metrics generated during the build. The information collected this way needs to be structured and available in a machine-readable format, so that it can be analyzed, aggregated/correlated and presented after the build itself has completed.
The system is required to meet the constraint of neutrality towards programming languages and tool chains: at its core, it must not assume that any particular language or build tool is in use by a project. Rather, it should provide a generic framework for executing builds, collecting data from them, and persisting this information in a central location to make it available for various kinds of reports. The system needs to be extensible to support various specific languages and tool chains in a meaningful manner.
The presentation layer of this system will be built on top of Trac (http://projects.edgewall.com/trac/), a simple web-based application for managing software development projects, written in Python. Trac provides a view of the project's version control repository, a wiki for collaborative documentation and an issue tracker for managing defects and tasks. All of this is held together by a simple wiki syntax that can be used everywhere for linking to any kind of object (for example wiki pages, changesets and tickets), a “timeline” view that shows recent activity in all of those areas, and a generic search facility.
The design of the system will be based on distributed CI systems such as Tinderbox and BuildBot: a central build orchestrator (or build master) is responsible for the coordination of several build slaves that do the actual work of executing builds. The orchestrator is a daemon that knows what to build and how to build it; it provides this knowledge as a build recipe to the build slaves, which report their status and results back to the orchestrator after the build, or parts of the build, have completed.
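To make the idea of a build recipe more concrete, the following is a minimal sketch of how a slave might execute a recipe step by step and produce a status report for each step. The recipe layout (a list of dictionaries with "id" and "command" keys) and the function name execute_recipe are assumptions made for illustration only; the actual recipe format is not defined here.

```python
import subprocess
import sys

# Hypothetical recipe as it might be handed to a slave by the master.
# The placeholder commands invoke the Python interpreter so that the
# sketch runs on any machine; a real recipe would call the build tools.
RECIPE = [
    {"id": "compile", "command": [sys.executable, "-c", "print('compiling')"]},
    {"id": "test", "command": [sys.executable, "-c", "import sys; sys.exit(1)"]},
]


def execute_recipe(recipe):
    """Run each step in order and return one status report per step."""
    reports = []
    for step in recipe:
        proc = subprocess.run(step["command"], capture_output=True, text=True)
        status = "success" if proc.returncode == 0 else "failure"
        reports.append({
            "step": step["id"],
            "status": status,
            "output": proc.stdout,
        })
        if status == "failure":
            break  # later steps usually depend on earlier ones
    return reports
```

Reporting per step, rather than once per build, matches the requirement that slaves can report back after parts of the build have completed.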
The system will be composed of three layers: the build slave, the build master, and the Trac plugin. There are three core aspects that all three layers deal with: the build itself, the generated data, and the status of the build. The main emphasis of this work will be on the second of these aspects: the conversion, collection, and presentation of the data generated by builds.
- Data conversion
- There is a large variety of tools that generate data in different formats, including the build system itself as well as any additional tools integrated with the build, such as unit testing frameworks, code coverage analyzers or style checkers. The data produced by these tools needs to be parsed and converted so that it can be handled appropriately. This conversion is done both by the build slave (mainly to prepare the data for transmission to the master) and by the build master (to convert the data into a format suitable for storage and analysis).
- Data collection
- The build master collects all the data reported back by the individual build slaves and writes that data to a persistent store. The storage format needs to be oriented towards the requirements of providing flexible reporting capabilities. All collected data is tagged with the revision against which the build was made, so that it's possible to correlate the information with other data such as repository activity.
- Data presentation
- Presentation of the collected data is handled by a Trac plugin. This plugin has access to the database maintained by the build master and provides means to visualize the data or make it otherwise accessible through the web interface. Trac itself will be extended to expose additional extension points where necessary, for example to integrate software metrics and statistics in various places, such as the timeline and the repository browser.
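The conversion and collection steps can be illustrated with a small sketch. It assumes a hypothetical style checker that prints findings as "path:line: message", an XML report as the transmission format, and a plain list standing in for the persistent store; none of these choices are fixed by the design, and the function names convert_report and collect_report are invented for the example.

```python
import re
import xml.etree.ElementTree as ET

# Assumed (hypothetical) output format of a style checker: "path:line: message".
LINE_RE = re.compile(r"^(?P<file>[^:\s]+):(?P<line>\d+):\s*(?P<message>.+)$")


def convert_report(raw_output, revision, tool):
    """Slave side: turn free-form tool output into a structured XML report."""
    report = ET.Element("report", {"tool": tool, "revision": str(revision)})
    for text in raw_output.splitlines():
        match = LINE_RE.match(text.strip())
        if match is None:
            continue  # banner, summary or blank line, not a finding
        item = ET.SubElement(
            report, "item",
            {"file": match.group("file"), "line": match.group("line")},
        )
        item.text = match.group("message")
    return ET.tostring(report, encoding="unicode")


def collect_report(xml_text, store):
    """Master side: parse a transmitted report and append records to a store."""
    report = ET.fromstring(xml_text)
    for item in report.iter("item"):
        store.append({
            "revision": report.get("revision"),  # every record is tagged
            "tool": report.get("tool"),
            "file": item.get("file"),
            "line": int(item.get("line")),
            "message": item.text,
        })
    return store
```

Because each stored record carries the revision, a presentation layer could later group findings by revision to show how a metric evolves over time.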
The initial focus of tool support will be on Python projects: the standard modules unittest and trace.py can be used for unit tests and code coverage, and third-party scripts such as pychecker and PyLint may be used for style checking and other metrics. The basic infrastructure should, however, be capable of dealing with projects using other languages and tool chains, such as Java projects using Ant or Maven, or C/C++ projects using make.
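As an illustration of collecting machine-readable results from unittest rather than scraping its console output, the sketch below subclasses unittest.TestResult to record one entry per test. The names RecordingResult, run_and_collect and make_sample_case are hypothetical, and the sample test case merely stands in for a project's real tests.

```python
import unittest


def make_sample_case():
    """Build a stand-in test case; a real build would load the project's tests."""

    class SampleTests(unittest.TestCase):
        def test_passes(self):
            self.assertEqual(1 + 1, 2)

        def test_fails(self):
            self.assertEqual(1 + 1, 3)  # deliberately wrong

    return SampleTests


class RecordingResult(unittest.TestResult):
    """Collects one record per test instead of writing to a stream."""

    def __init__(self):
        super().__init__()
        self.records = []

    def addSuccess(self, test):
        super().addSuccess(test)
        self.records.append({"test": test.id(), "status": "success"})

    def addFailure(self, test, err):
        super().addFailure(test, err)
        self.records.append({"test": test.id(), "status": "failure"})

    def addError(self, test, err):
        super().addError(test, err)
        self.records.append({"test": test.id(), "status": "error"})


def run_and_collect(test_case_class):
    """Run all tests of a TestCase class and return the structured records."""
    suite = unittest.defaultTestLoader.loadTestsFromTestCase(test_case_class)
    result = RecordingResult()
    suite.run(result)
    return result.records
```

Calling run_and_collect(make_sample_case()) yields a list of dictionaries that could be serialized and sent to the master, independent of how the test runner formats its console output.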
The build slave script is intended to be light-weight and have a minimum number of dependencies. It should not depend on Trac or frameworks such as Twisted. In a nutshell, every machine that has the necessary tool chain installed to perform the build itself should be able to perform builds without requiring the installation of any additional software apart from the Bitten client. Furthermore, while the default Bitten client will be implemented in Python, master and slave should be sufficiently decoupled to allow the use of an alternate client implementation (for example one written in Java).
To decouple the master and slave, an application protocol will be defined on top of the meta-protocol BEEP (Blocks Extensible Exchange Protocol, RFC 3080). BEEP was chosen because it provides peer-to-peer communication (so that both the client and the server can send requests to the other) and because of its relative simplicity compared to protocols such as XMPP.
Another important implementation consideration is how the collected data is to be stored in the central repository. The data generated by automated builds can almost always be mapped to the physical and/or the logical view of the code base, where the physical view corresponds to files and line numbers, while the logical view is composed of units such as packages, classes and functions. Specific metrics annotate either view with the extracted information. The attributes of the annotation, however, depend entirely on the type of metric. As this does not map easily to relational databases, the use of alternatives such as an XML database (for example Sleepycat's DB XML) needs to be considered.
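The shape of such annotations can be sketched as a small data model. This is only one possible reading of the description above: the class name Annotation, its field names and the helper attribute_names are assumptions, and the open-ended attributes dictionary is what makes a fixed relational schema awkward.

```python
from dataclasses import dataclass, field
from typing import Dict, List, Optional


@dataclass
class Annotation:
    """One metric value attached to the physical or logical view."""
    revision: str
    metric: str                   # for example "coverage" or "style"
    file: Optional[str] = None    # physical view: file path
    line: Optional[int] = None    # physical view: line number
    unit: Optional[str] = None    # logical view: package, class or function
    attributes: Dict[str, str] = field(default_factory=dict)

    def view(self) -> str:
        """Report which view of the code base this annotation targets."""
        return "logical" if self.unit is not None else "physical"


def attribute_names(annotations: List[Annotation], metric: str) -> List[str]:
    """List the attribute names used by one metric, in sorted order."""
    names = set()
    for annotation in annotations:
        if annotation.metric == metric:
            names.update(annotation.attributes)
    return sorted(names)


# Two metrics with unrelated attribute sets, as described in the text.
EXAMPLES = [
    Annotation("1234", "coverage", file="src/foo.py", line=12,
               attributes={"hits": "3"}),
    Annotation("1234", "style", unit="foo.Bar.baz",
               attributes={"category": "naming", "severity": "warning"}),
]
```

Here the coverage annotation carries a "hits" attribute while the style annotation carries "category" and "severity"; a relational table would need either one column per possible attribute or a separate key/value table, whereas an XML document can hold each annotation with its own attribute set.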