Comment Cluster Analysis View


This view lists the comments that Symphony has identified as the centers of groups of comments that may share a common theme. Symphony identifies comment clusters by analyzing similarities between comments; those that are sufficiently alike are grouped into a cluster. This view is used primarily to uncover themes within your project. The results may lead you to update your coding structure to accommodate a new theme, or they may reinforce the validity of your current coding structure if most of the comments in a cluster are already assigned to the same code.

 

To identify comment clusters, Symphony analyzes the extent to which comments are alike. This analysis is based primarily on the percentage of words that comments have in common. As in the Word Analysis view, word "stems" are used so that variations of the same word are recognized as "hits".
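As a rough illustration of similarity based on shared word stems, the following Python sketch may help. It is not Symphony's actual algorithm, which is not documented here. The function names, the crude suffix-stripping rule, and the choice to measure overlap against the focal comment's stems are all assumptions made for this example.

```python
import re

def crude_stem(word):
    """Very rough suffix stripping; an assumed stand-in for a real stemmer."""
    for suffix in ("ing", "ed", "es", "s"):
        # Only strip when at least three letters remain.
        if word.endswith(suffix) and len(word) - len(suffix) >= 3:
            return word[: -len(suffix)]
    return word

def stems(comment):
    """Set of stems for the alphabetic words in a comment."""
    return {crude_stem(w) for w in re.findall(r"[a-z]+", comment.lower())}

def similarity(focal, other):
    """Percent of the focal comment's stems that also occur in the other comment."""
    focal_stems = stems(focal)
    if not focal_stems:
        return 0
    shared = focal_stems & stems(other)
    return round(100 * len(shared) / len(focal_stems))
```

Under these assumptions, "parking fees" and "fee for parked cars" share the stems "park" and "fee", so word variations count as hits even though the comments are worded differently.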

 

It is important to understand that Symphony does not perform any linguistic or context analysis. Accordingly, Symphony does not claim that the Comments in a cluster actually express the same idea, or even a related idea. Symphony does, however, identify the Comments with the greatest text similarities and group them together so that you can examine them further and decide whether some or all of the cluster does in fact express a related idea.

 

1. Filter Group


The Filter Group allows you to specify a subset of the project for the analysis. By default, the analysis is performed on the entire project.

 

2. Content Tree (Codes Only)


This Content Tree displays the current coding structure. It is provided on this view primarily to simplify drag-and-drop coding of clustered Comments.

 

3. Content Editor


The Content Editor displays the currently selected Comment in the Cluster Details list.

 

4. List of Clusters


This is the list of clusters. Each cluster displays the Comment that is considered to be at the center of the cluster. Clicking on a cluster updates the Cluster Details list to display the Comments making up the cluster.

 

5. Cluster Details


This list consists of the Comments that make up the cluster currently selected in the List of Clusters. Selecting a Comment here displays it in the Content Editor. The right-most column displays the strength of each Comment within the cluster, represented as a number between 1 and 100. Clicking this column twice sorts the list in descending order, bringing the Comments with the greatest similarity to the top of the list. The Comment that is the focus of the cluster always has a value of 100 because it is a perfect match with itself.

 

6. Cluster Score


The cluster score is a measure of the strength of the cluster. It combines the number of Comments in the cluster with the similarity of those Comments. Note that a high score can be produced when a large number of Comments have only moderate similarities. The score is not meaningful by itself. It matters only relative to the scores of other clusters: clusters with higher scores are more likely to contain a theme than those with lower scores.
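To see why a large, loosely related cluster can outscore a small, tightly related one, consider this Python sketch. Symphony's actual formula is not given in this documentation. The sketch assumes, purely for illustration, that the cluster score is the sum of the Comment scores, and the name cluster_score is hypothetical.

```python
def cluster_score(comment_scores):
    """Hypothetical cluster score: the sum of the member Comment scores (1-100).

    This is an assumed formula for illustration only, not Symphony's.
    """
    return sum(comment_scores)

# A small cluster whose Comments closely match the focal Comment (score 100).
tight_small = [100, 90, 88, 85]

# A large cluster whose Comments only moderately match the focal Comment.
loose_large = [100] + [40] * 12

print(cluster_score(tight_small))  # 363
print(cluster_score(loose_large))  # 580
```

Under this assumed formula, the larger cluster scores higher even though none of its related Comments scores above 40, which is why the individual Comment scores are worth checking alongside the cluster score.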

 

7. Comment Score


This value represents how similar each Comment in the cluster is to the focal Comment of the cluster. The focal Comment always has a score of 100, and the other Comments have lower scores according to how closely they match it. Clusters in which all the related Comments have low scores (typically less than 30) are less likely to share a common theme.
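If you record the Comment scores of a cluster outside Symphony, the rule of thumb above can be expressed as a small check. This Python sketch is an illustration only. The function name is_weak_cluster is hypothetical, and the default threshold of 30 comes from the "typically less than 30" guideline rather than from any Symphony setting.

```python
def is_weak_cluster(comment_scores, threshold=30):
    """Return True when every related Comment scores below the threshold.

    The highest score is dropped first, on the assumption that it belongs
    to the focal Comment (which always scores 100).
    """
    related = sorted(comment_scores, reverse=True)[1:]
    return bool(related) and all(score < threshold for score in related)

print(is_weak_cluster([100, 25, 18, 12]))  # True
print(is_weak_cluster([100, 72, 25]))      # False
```

A cluster flagged this way is not necessarily meaningless. It is simply less likely to contain a shared theme and may deserve lower priority for review.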

 

Copyright (c) 2008 Active Java, LLC. All Rights Reserved.