Edgewall Software

Changes between Version 2 and Version 3 of Summer Of Code


Timestamp:
Jun 14, 2005, 10:40:35 PM (19 years ago)
Author:
cmlenz
Comment:

Focus on Python tool support

  • Summer Of Code

    v2 v3  
    99== Main objectives ==
    1010
    11 The goal of this work is to design and implementation of a distributed system for automated builds and continuous integration that allows the central collection and storage of software metrics generated during the build. The information collected this way needs to be structured and available in a machine-readable format, so that it can be analyzed, aggregated/correlated and presented after the build itself has completed.
     11The goal of this work is to design and implement a distributed system for automated builds and continuous integration that allows the central collection and storage of software metrics generated during the build. The information collected this way needs to be structured and available in a machine-readable format, so that it can be analyzed, aggregated/correlated and presented after the build itself has completed.
    1212
    1313The system is required to meet the constraint of neutrality towards programming languages and tool chains: at its core, it must not assume that any particular language or build tool is in use by a project. Rather, it should provide a generic framework for the execution of builds, and collection of data from builds, and for persisting this information in a central location to make it available for various kinds of reports. The system needs to be extensible to support various specific languages and tool chains in a meaningful manner.
    1414
    15 This system will be built on top of Trac (http://projects.edgewall.com/trac/), a simple web-based application for managing software development projects, written in Python. Trac provides a view of the project's version control repository, a wiki for collaborative documentation and an issue tracker for managing defects and tasks. All of this is held together by a simple wiki syntax that can be used everywhere for linking to any kind of object (for example wiki pages, changesets and tickets), a “timeline” view that shows recent activity in all of those areas, and a generic search facility.
     15The presentation layer of this system will be built on top of Trac (http://projects.edgewall.com/trac/), a simple web-based application for managing software development projects, written in Python. Trac provides a view of the project's version control repository, a wiki for collaborative documentation and an issue tracker for managing defects and tasks. All of this is held together by a simple wiki syntax that can be used everywhere for linking to any kind of object (for example wiki pages, changesets and tickets), a “timeline” view that shows recent activity in all of those areas, and a generic search facility.
    1616
    1717== Project scope ==
     
    2323 Data conversion:: There is a large variety of different tools that generate data in different formats, including the build system itself, as well as any additional tools integrated with the build, such as unit testing frameworks, code coverage analyzers or style checkers. The data produced by these tools needs to be parsed and converted so that it can be handled appropriately. This conversion is done by both the build slave (mainly to prepare the data for transmission to the master) and by the build master (to convert the data into a format suitable for storage and analysis).
    2424
    25  Data collection:: The build master collects all the data reported back by the individual build slaves and writes that data to some kind of persistent store, for example a relational database. The way it is stored needs to be oriented towards the requirements of providing flexible reporting capabilities. All collected data is always tagged with the revision against which the build was made so that it's possible to correlate the information with other data such as repository activity.
     25 Data collection:: The build master collects all the data reported back by the individual build slaves and writes that data to a persistent store. The way it is stored needs to be oriented towards the requirements of providing flexible reporting capabilities. All collected data is always tagged with the revision against which the build was made so that it's possible to correlate the information with other data such as repository activity.
    2626
    2727 Data presentation:: Presentation of the collected data is handled by a Trac plugin. This plugin has access to the database maintained by the build master and provides means to visualize the data or make it otherwise accessible through the web interface. Trac itself will be extended to expose additional extension points where necessary, for example to integrate software metrics and statistics in various places, such as the timeline and the repository browser.
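
As a rough illustration of the data collection step described above, the following sketch files every reported value under the revision it was built against, so that results can later be correlated per revision. The names `Measurement` and `MetricStore`, the slave name and the metric names are hypothetical, and an in-memory dictionary stands in for whatever persistent store is eventually chosen.

```python
from collections import defaultdict
from dataclasses import dataclass


@dataclass(frozen=True)
class Measurement:
    """One value reported by a build slave (hypothetical record layout)."""
    revision: int   # repository revision the build was made against
    slave: str      # name of the reporting build slave
    metric: str     # e.g. "tests.failed" or "coverage.percent"
    value: float


class MetricStore:
    """In-memory stand-in for the master's persistent store (assumption)."""

    def __init__(self):
        self._by_revision = defaultdict(list)

    def collect(self, measurement):
        # Every record is filed under the revision it belongs to.
        self._by_revision[measurement.revision].append(measurement)

    def for_revision(self, revision):
        """Return all measurements recorded for one revision."""
        return list(self._by_revision.get(revision, []))

    def history(self, metric):
        """Return (revision, value) pairs for one metric, ordered by revision."""
        return sorted(
            (m.revision, m.value)
            for records in self._by_revision.values()
            for m in records
            if m.metric == metric
        )


store = MetricStore()
store.collect(Measurement(101, "linux-py24", "coverage.percent", 71.5))
store.collect(Measurement(102, "linux-py24", "coverage.percent", 73.0))
store.collect(Measurement(102, "linux-py24", "tests.failed", 2))
```

Because the revision is part of every record, `store.history("coverage.percent")` yields a per-revision series that a report could line up against repository activity.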
    2828
    29 The focus of tool support will be on Java and Python projects and the predominant build systems used with those languages. For Python projects, the standard modules unittest and trace.py can be used for unit tests and code coverage, and third-party scripts such as pychecker and PyLint may be used for style checking and other metrics. For Java projects, the integration of tools such as JUnit, Clover/jcoverage, and JMetric will be examined.
     29The initial focus of tool support will be on Python projects: the standard modules unittest and trace.py can be used for unit tests and code coverage, and third-party scripts such as pychecker and PyLint may be used for style checking and other metrics. The basic infrastructure should however be capable of dealing with projects using other languages and tool chains, such as Java projects using Ant or Maven, or C/C++ projects using make.
    3030
    3131== Implementation notes ==
     
    3535To decouple the master and slave, an application protocol will be defined on top of the meta-protocol BEEP (Blocks Extensible Exchange Protocol, RFC 3080). BEEP was chosen because it provides peer-to-peer communication (so that both the client and the server can send requests to the other) and because of its relative simplicity compared to protocols such as XMPP.
    3636
    37 Another important implementation consideration is how the collected data is to be stored in the central repository. The data generated by automated builds can almost always be mapped to the physical and/or the logical view of the code base, where the physical view corresponds to files and line numbers, while the logical view is composed of units such as packages, classes and functions. Specific metrics basically annotate either view with the extracted information. The attributes of the annotation, however, depend entirely on the type of metric. As this does not easily map to relational databases, the use of an XML database such as Sleepycat's DBXML will be evaluated.
    38 
    39 Furthermore, the presentation of the data by a Trac plugin will require some changes to Trac itself. Such changes are in order anyway to make Trac more flexible, and a real-world use case will help to push them.
     37Another important implementation consideration is how the collected data is to be stored in the central repository. The data generated by automated builds can almost always be mapped to the physical and/or the logical view of the code base, where the physical view corresponds to files and line numbers, while the logical view is composed of units such as packages, classes and functions. Specific metrics basically annotate either view with the extracted information. The attributes of the annotation, however, depend entirely on the type of metric. As this does not easily map to relational databases, the use of alternatives such as an XML database (for example Sleepycat's DBXML) needs to be considered.
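
The distinction between the two views can be sketched as a small data model. This is only an illustration under assumptions: the class names, file paths, metric names and attribute keys below are hypothetical and not part of the proposal.

```python
from dataclasses import dataclass, field
from typing import Optional


@dataclass(frozen=True)
class PhysicalLocation:
    """A position in the physical view: a file and an optional line."""
    path: str
    line: Optional[int] = None


@dataclass(frozen=True)
class LogicalUnit:
    """A unit in the logical view, e.g. a package, class or function."""
    kind: str            # "package", "class" or "function"
    qualified_name: str


@dataclass
class Annotation:
    """A metric attached to one target in either view."""
    metric: str
    target: object       # PhysicalLocation or LogicalUnit
    attributes: dict = field(default_factory=dict)  # keys depend on the metric


annotations = [
    Annotation("coverage", PhysicalLocation("trac/core.py", 42), {"hits": 3}),
    Annotation("lint", PhysicalLocation("trac/core.py", 57),
               {"severity": "warning", "message": "unused variable"}),
    Annotation("unittest", LogicalUnit("function", "trac.tests.core.test_load"),
               {"status": "success", "duration": 0.02}),
]

# Select the annotations that belong to the physical view.
physical = [a for a in annotations if isinstance(a.target, PhysicalLocation)]
# The attribute keys differ from metric to metric.
attribute_sets = {a.metric: sorted(a.attributes) for a in annotations}
```

The varying keys in `attribute_sets` show why a single fixed table schema is awkward here: each metric type brings its own set of attributes.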