An Empirical Study on How Developers Reason about Module Cohesion
Several cohesion metrics have been proposed to support development and maintenance activities. The most traditional ones are the structural cohesion metrics, which rely on structural information in the source code. For instance, many of these metrics quantify cohesion based on how methods and attributes are related to each other within a given module. More recently, conceptual cohesion metrics have been proposed, which compute cohesion based on the responsibilities a given module realizes. Despite these different flavors of cohesion, there is a lack of empirical evidence about how developers actually perceive cohesion and which kind of cohesion measurement aligns with their perception. In this paper we fill this gap by empirically investigating developers' opinions through a web-based survey, which involved 80 participants from 9 countries with different levels of programming experience. We found that most of the developers are familiar with cohesion, and that they perceive cohesion in terms of class responsibilities, which aligns more closely with conceptual cohesion measurement. These results support the claim that conceptual cohesion seems to be more intuitive and closer to the human-oriented view of software cohesion. Moreover, the results showed that conceptual cohesion measurement captures the developers' notion of cohesion better than traditional structural cohesion measurement.
Research Questions
- How do developers perceive module cohesion? How do they reason about it?
- To what extent do structural cohesion and conceptual cohesion relate to how developers rate the cohesion of modules?
Pairs of Classes used for Participants' Analysis:
- DB_Backend.java vs. DB_InsertUpdate.java
- Main_Config2.java vs. DB_Helpers.java
- RelationSpouse.java vs. RelationParentChild.java
Survey demonstration:
Please follow this link (it is hosted on an external server).
Raw data collected:
Transformation of the ratings output to apply the Fleiss' kappa test:
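As a rough illustration of what such a transformation involves, the sketch below turns per-participant ratings into the subject-by-category count table that Fleiss' kappa expects, and then computes kappa with the standard formula. It is not the authors' script: the function names `to_count_table` and `fleiss_kappa`, the rating labels, and the toy ratings are all hypothetical and are not taken from the collected data. It assumes every participant rated every scenario.

```python
from collections import Counter


def to_count_table(ratings_by_rater, categories):
    """Convert raw ratings into a subjects x categories count table.

    ratings_by_rater: one list per rater, holding one rating per subject
    (hypothetical layout; here a "subject" would be a comparison scenario).
    """
    n_subjects = len(ratings_by_rater[0])
    table = []
    for s in range(n_subjects):
        counts = Counter(rater[s] for rater in ratings_by_rater)
        table.append([counts.get(c, 0) for c in categories])
    return table


def fleiss_kappa(table):
    """Fleiss' kappa for a count table with the same number of raters per row."""
    n_subjects = len(table)
    n_raters = sum(table[0])
    n_categories = len(table[0])
    # Proportion of all assignments that fell into each category.
    p_j = [
        sum(row[j] for row in table) / (n_subjects * n_raters)
        for j in range(n_categories)
    ]
    # Observed agreement for each subject.
    p_i = [
        (sum(c * c for c in row) - n_raters) / (n_raters * (n_raters - 1))
        for row in table
    ]
    p_bar = sum(p_i) / n_subjects   # mean observed agreement
    p_e = sum(p * p for p in p_j)   # agreement expected by chance
    return (p_bar - p_e) / (1 - p_e)


# Hypothetical toy input: 4 raters, 3 scenarios, made-up labels.
categories = ["first", "second", "same"]
raw = [
    ["first", "second", "same"],
    ["first", "second", "first"],
    ["first", "same", "same"],
    ["second", "second", "same"],
]
table = to_count_table(raw, categories)
kappa = fleiss_kappa(table)
```

Each row of `table` sums to the number of raters, which is the precondition for the formula used in `fleiss_kappa`.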
Participants' Explanations and Coded topics:
- Familiarity with cohesion - responses and topics.
- 1st scenario of comparison (DB_Backend vs. DB_InsertUpdate) - responses and topics.
- 2nd scenario of comparison (Main_Config2 vs. DB_Helpers) - responses and topics.
- 3rd scenario of comparison (RelationSpouse vs. RelationParentChild) - responses and topics.
Fisher's exact test:
- Cohesion familiarity vs. Cohesion ratings (of each scenario).
- Programming experience (years) vs. Cohesion ratings (of each scenario).
- Cohesion familiarity vs. Academic degree.
- Academic degree vs. Cohesion ratings (of each scenario).
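For readers unfamiliar with the test, the following is a minimal sketch of a two-sided Fisher's exact test on a 2x2 contingency table, using only the Python standard library. It is an illustration under assumptions, not the analysis used in the study: the function name `fisher_exact_2x2` and the counts are hypothetical, and the cross-tabulations listed above may have more than two levels per variable, in which case the r x c generalization of the test would be needed instead.

```python
from math import comb


def fisher_exact_2x2(a, b, c, d):
    """Two-sided Fisher's exact test p-value for the table [[a, b], [c, d]].

    Sums the hypergeometric probabilities of all tables with the same
    margins that are no more likely than the observed table.
    """
    row1, row2 = a + b, c + d
    col1 = a + c
    total = row1 + row2

    def prob(x):
        # Probability that the top-left cell equals x, with margins fixed.
        return comb(row1, x) * comb(row2, col1 - x) / comb(total, col1)

    p_observed = prob(a)
    lowest = max(0, col1 - row2)
    highest = min(row1, col1)
    # Small relative tolerance guards against floating-point ties.
    return sum(
        prob(x)
        for x in range(lowest, highest + 1)
        if prob(x) <= p_observed * (1 + 1e-7)
    )


# Hypothetical counts, e.g. familiar / not familiar with cohesion (rows)
# against which class of a pair was rated as more cohesive (columns).
p_value = fisher_exact_2x2(3, 1, 1, 3)
```

A small p-value would suggest that the two categorical variables are associated; with counts as small as those in the toy table, the test has little power.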
Contact:
- Bruno Carreiro da Silva -- brunocs at dcc dot ufba dot br
- Claudio Sant'Anna -- santanna at dcc dot ufba dot br
- Christina Chavez -- flach at dcc dot ufba dot br