The Grouper Program

The grouping program groups the notes of a melody into phrases.

As input, the program requires a "note list" and a "beat list". To create the appropriate input format, we recommend first running a notefile through the meter program and using its output to generate both the note list and the beat list, which can then be piped into the grouping program. Notes and beats must be listed in the file in chronological order (the meter program will ensure this).

The grouping program operates on monophonic inputs only. If the input file contains notes that overlap, the program will move the offtime of the first note back to the ontime of the second. So this

        Note 0 600 60
        Note 500 1000 62

gets transformed to this

        Note 0 500 60
        Note 500 1000 62

If two notes have the same ontime, the program will output an error message and exit. (There may of course be empty space--rests--between notes.)
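The overlap rule above can be illustrated with a short Python sketch. This is not the program's own code; the function name `make_monophonic` and the tuple layout (ontime, offtime, pitch) are assumptions made for illustration, and the input is assumed to be in chronological order already.

```python
from typing import List, Tuple

# Hypothetical representation of one "Note" line: (ontime, offtime, pitch).
Note = Tuple[int, int, int]


def make_monophonic(notes: List[Note]) -> List[Note]:
    """Truncate overlapping notes and reject notes that share an ontime.

    Assumes the notes are already sorted by ontime.
    """
    result: List[Note] = []
    for i, (on, off, pitch) in enumerate(notes):
        if i + 1 < len(notes):
            next_on = notes[i + 1][0]
            if next_on == on:
                # The program described above prints an error and exits here.
                raise ValueError(f"two notes share ontime {on}")
            if off > next_on:
                # Move the offtime back to the ontime of the following note.
                off = next_on
        result.append((on, off, pitch))
    return result


if __name__ == "__main__":
    for on, off, pitch in make_monophonic([(0, 600, 60), (500, 1000, 62)]):
        print("Note", on, off, pitch)
```

Run on the two-note example above, the sketch shortens the first note so that it ends at 500, where the second note begins.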

A grouping analysis is simply a series of group boundaries between notes. (Specifically, each group boundary is located at the onset of the note following the phrase boundary.) The program considers all possible analyses, and uses three criteria to select the best one.

First, it calculates a "gap score" for each pair of adjacent notes. The gap score is the sum of the inter-onset interval and the offset-to-onset interval for the two notes. Phrases receive a bonus proportional to the gap score between the notes at the boundary.

Second, the program assigns a penalty to each group, depending on its length in terms of number of notes. Groups with a certain optimal number of notes (the default is 8) receive a penalty of zero; deviations from this value in either direction incur penalties. The penalty is logarithmic, so that the penalty for a 16-note group is the same as for a 4-note group, but it is also weighted by the length of the group.

Finally, each phrase gets a penalty depending on the metrical position of its beginning, relative to the metrical position of the previous phrase's beginning. (Metrical position is defined as the number of beats between the previous level 3 beat and the beat of the phrase beginning.) If the two phrases do not begin at the same metrical position, a penalty is assigned (the penalty is all-or-nothing). A similar (smaller) penalty is assigned for phrases that are out of phase at metrical level 4.
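As a rough illustration of the first two criteria, the following Python sketch computes a gap score and an unweighted logarithmic length deviation. The function names are hypothetical, and the base-2 logarithm is an assumption chosen so that doubling or halving the optimal length gives the same value. The sketch leaves out the weights, the additional weighting by group length, and the search over analyses, so it should not be read as the program's actual scoring code.

```python
import math
from typing import Tuple

# Hypothetical representation of one note: (ontime, offtime, pitch).
Note = Tuple[int, int, int]


def gap_score(prev: Note, nxt: Note) -> int:
    """Inter-onset interval plus offset-to-onset interval of two adjacent notes."""
    inter_onset = nxt[0] - prev[0]       # onset of prev to onset of nxt
    offset_to_onset = nxt[0] - prev[1]   # end of prev to onset of nxt (the rest)
    return inter_onset + offset_to_onset


def log_length_deviation(num_notes: int, optimal: float) -> float:
    """Unweighted logarithmic deviation of a group's length from the optimum.

    Base 2 is an assumption made for this sketch.
    """
    return abs(math.log2(num_notes / optimal))


if __name__ == "__main__":
    # A note followed by a rest of 100 time units before the next note.
    print(gap_score((0, 400, 60), (500, 900, 62)))
    # Groups of 16 and 4 notes deviate equally from an optimum of 8.
    print(log_length_deviation(16, 8), log_length_deviation(4, 8))
```

A long note followed by a rest gives a large gap score, which is why such points are favored as phrase boundaries.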

Parameters. Here are the user-settable parameters for the grouping program.
verbosity=1 With verbosity=0, the output of the program simply consists of the note list, with "Phrase" statements indicating the location of phrase boundaries. (Each phrase statement gives a timepoint: the ontime of the note following the phrase boundary.) For the first two phrases of "Yankee Doodle", the output looks like this:

        Note 0 245 67
        Note 245 490 67
        Note 490 735 69
        Note 735 1015 71
        Note 1015 1260 67
        Note 1260 1505 71
        Note 1505 1750 69
        Note 1750 1995 62
        Phrase 1995
        Note 1995 2240 67
        Note 2240 2485 67
        Note 2485 2765 69
        Note 2765 3010 71
        Note 3010 3500 67
        Note 3500 3990 66
        Phrase 3990

If verbosity=1, a graphic display will be printed, with time on the vertical axis and pitch on the horizontal. An example is shown below (condensed in width), for the first two phrases of "Yankee Doodle". The metrical grid is shown at left; phrase boundaries are indicated with horizontal dashed lines.
                C3          C4          C5          C6          C7
     0 x x x x x . - - - - - . - - -+- - . - - - - - . - - - - - . - - - - - .
   105 x         .           .      |    .           .           .           .
   245 x x       .           .      +    .           .           .           .
   350 x         .           .      |    .           .           .           .
   490 x x x     .           .        +  .           .           .           .
   595 x         .           .        |  .           .           .           .
   735 x x       .           .          +.           .           .           .
   875 x         .           .          |.           .           .           .
  1015 x x x x   .           .      +    .           .           .           .
  1120 x         .           .      |    .           .           .           .
  1260 x x       .           .          +.           .           .           .
  1365 x         .           .          |.           .           .           .
  1505 x x x     .           .        +  .           .           .           .
  1610 x         .           .        |  .           .           .           .
  1750 x x       .           . +         .           .           .           .
  1855 x         .           . |         .           .           .           .
  1995 x x x x x . - - - - - . - - -+- - . - - - - - . - - - - - . - - - - - .
  2100 x         .           .      |    .           .           .           .
  2240 x x       .           .      +    .           .           .           .
  2345 x         .           .      |    .           .           .           .
  2485 x x x     .           .        +  .           .           .           .
  2625 x         .           .        |  .           .           .           .
  2765 x x       .           .          +.           .           .           .
  2870 x         .           .          |.           .           .           .
  3010 x x x x   .           .      +    .           .           .           .
  3115 x         .           .      |    .           .           .           .
  3255 x x       .           .      |    .           .           .           .
  3360 x         .           .      |    .           .           .           .
  3500 x x x     .           .     +     .           .           .           .
  3605 x         .           .     |     .           .           .           .
  3745 x x       .           .     |     .           .           .           .
  3850 x         .           .     |     .           .           .           .
  3990 x x x x x . - - - - - . - - -+- - . - - - - - . - - - - - . - - - - - .
If verbosity=2, both the note list and the graphic display will be shown, as well as other information.
mode = 0 The grouping program contains a way of testing its output against phrase information from another source. This is controlled with the parameter "mode". If mode = 1, testing will be done; if mode = 0, the output will simply be as described above. The testing is done as follows. In the input data, phrase boundaries can be represented with a line inserted in the note list at the location of the boundary, containing only "|". The program reads in these externally-given phrase boundaries. In outputting the note list, the program prints out its own boundaries with "Phrase" statements as described above. In addition, however, if the program locates one of its own boundaries at a location where there is no external boundary, it prints out "FP" (false positive). If it omits a boundary where there is an external boundary, it prints "FN" (false negative). (The testing mode will only work when the note list is being printed out, i.e. with verbosity = 0 or 2.)
optimal_length=10.0 This is the optimal length for phrases, measured in number of notes. Phrases whose lengths are either more or less than this value will be penalized. The penalty is logarithmic: if OPL is the optimal phrase length, a phrase of length 2 * OPL will receive a penalty of 1, as will a phrase of length OPL / 2.
length_weight = 600 This is the weight assigned to the phrase length penalty just described.
gap_weight = 500 This is the weight assigned to the "gap" score between the two notes at a phrase boundary. (As described above, the gap score counts as a bonus for placing a boundary there.)
phase_3_penalty = 300 This is a penalty assigned for two adjacent phrases being "out of phase" at metrical level 3. Two phrases are in phase if they begin the same number of beats after a level 3 beat.
phase_4_penalty = 150 This is a penalty assigned for two adjacent phrases being "out of phase" at metrical level 4.
mark_first_phrase = 0 This parameter determines whether the beginning of the first phrase (which is always the beginning of the melody) will be indicated by the program.
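
For readers who want to post-process the verbosity=0 output, or to reproduce the FP/FN comparison outside the program, here is a minimal Python sketch. The function names `parse_output` and `compare_boundaries` are hypothetical and not part of the grouping program. The sketch assumes the output contains only "Note" and "Phrase" lines in the format shown above, and that reference boundaries are supplied as a list of timepoints.

```python
from typing import Iterable, List, Tuple

# Hypothetical representation of one note: (ontime, offtime, pitch).
Note = Tuple[int, int, int]


def parse_output(lines: Iterable[str]) -> Tuple[List[Note], List[int]]:
    """Collect notes and phrase-boundary timepoints from verbosity=0 output."""
    notes: List[Note] = []
    boundaries: List[int] = []
    for line in lines:
        fields = line.split()
        if not fields:
            continue  # skip blank lines
        if fields[0] == "Note":
            notes.append((int(fields[1]), int(fields[2]), int(fields[3])))
        elif fields[0] == "Phrase":
            boundaries.append(int(fields[1]))
    return notes, boundaries


def compare_boundaries(predicted: Iterable[int],
                       reference: Iterable[int]) -> Tuple[List[int], List[int]]:
    """Return (false positives, false negatives) as sorted timepoint lists."""
    predicted_set, reference_set = set(predicted), set(reference)
    false_positives = sorted(predicted_set - reference_set)
    false_negatives = sorted(reference_set - predicted_set)
    return false_positives, false_negatives


if __name__ == "__main__":
    sample = ["Note 1750 1995 62", "Phrase 1995 ", "Note 1995 2240 67"]
    notes, boundaries = parse_output(sample)
    print(notes, boundaries)
    print(compare_boundaries(boundaries, [1995, 3500]))
```

Comparing by exact timepoint is a simplifying choice for this sketch; it works here because each boundary is reported at the ontime of the note that follows it.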