The Key Program

The key program takes a list of notes and beats, and produces a key analysis, dividing the piece into sections labeled with key names. It can also take harmonic information and produce a "Roman numeral analysis", showing each chord's function relative to the key.

As input, the key program requires a list of beat statements, chord statements (these are optional), and note statements. The note statements may be of two forms. First, traditional note statements of the form "Note [ontime] [offtime] [pitch]" can be used, just as with the other programs. (A list of note statements plus beat statements can be generated using the meter program.) Alternatively, "TPCNote statements" can be used, of the form "TPCNote [ontime] [offtime] [pitch] [TPC]". (TPCNote and Note statements cannot be combined in a single input file; this will produce an error.) In TPCNote statements, the TPC value is a "tonal pitch class" - an integer representing a point on the line of fifths. The line of fifths represents pitch classes, but also respects spelling distinctions; different spellings of the same pitch (e.g. Ab and G#) are located at different points on the line. The following conventions are assumed (these are the same conventions assumed in the harmony program):

...-2 -1  0  1  2  3  4  5  6  7  8  9 10 11 12 ...
... Ab Eb Bb F  C  G  D  A  E  B  F# C# G# D# A#...
For example, if TPCNote input is used, the input file for the beginning of "Yankee Doodle" might look like this (only the first few statements of each type are shown):
        Beat      0  4
        Beat    125  0
        Beat    250  1
        Beat    375  0
        Beat    500  2
        Beat    625  0
        Beat    750  1
        Beat    875  0
        .
        .
        .
        Chord      0    250  3
        Chord    250    500  3
        Chord    500    750  3
        Chord    750   1000  3
        .
        .
        .
        TPCNote      0    250  67   3
        TPCNote    250    500  67   3
        TPCNote    500    750  69   5
        TPCNote    750   1000  71   7
        .
        .
        .

The format just described - with Beat, Chord, and TPCNote statements - is exactly the format that will be outputted by the harmony program with verbosity=0; thus appropriate input files can easily be generated in this way. Beats must be in chronological order in the input; if chords are provided, they must be in chronological order in terms of their ontimes, must be non-overlapping, and must completely cover the piece (from the first beat to the last beat). These requirements should be enforced by the harmony program.

From this input, the program then produces a key analysis, dividing the piece into sections labeled with keys. With verbosity=1, the output of the program for Gavotte from Bach's French Suite No. 5 will look like this:


G  G  G  G  G  D  D  D  D  G  
G  G  Em Em Em Em Em G  G  G  
G  G  G  G  G  

Each letter indicates a segment, with the chosen key of that segment. For convenience, a new line is begun after each tenth segment. (These segments will be explained below.) Major keys are simply indicated by the letter name, minor keys by the letter name plus "m".

The program is based on Carol Krumhansl and Mark Schmuckler's key-profile model (as explained in Krumhansl's book Cognitive Foundations of Musical Pitch [Oxford University Press, 1990]). Each key has a key-profile: a vector representing the optimal distribution of pitch-classes for that key. The input piece is divided into segments, and within each segment, an input vector is generated representing the pitch-class content of the segment; the input vector is compared to key-profiles for each key, and the key is chosen that yields the best match. The program also has a way of handling modulation: in considering a key for a segment, a penalty is assigned if the key differs from the key of the previous segment. In this way, it will prefer to remain in the same key, other things being equal, but will change keys if there is sufficient reason to do so. (The details of how the key-matching process is done can be adjusted using the parameters. See the following section on "Implementing Key-finding Models".)

For the segments that it requires, the program uses one level of the metrical structure of the inputted piece. For each beat at that level, a segment is formed which extends to the following beat at that level. It seems to work best to use segments of 1 to 2 seconds; most often this will mean using beats of level 3 (though also sometimes level 2 or 4).

The program also has the option of producing a "Roman numeral analysis". Here, it uses the chord statements in the input - showing the root at each point - and the key information to produce Roman numeral symbols, indicating the function of chords relative to the key, as well as other information such as mode and extension. This is largely a matter of simple lookup. If a chord has root C, and the key is C major, then it must be a I chord. Pitch information is used to determine the mode of the chord (major, minor, or diminished), the extension (triad or seventh), and the inversion (which note is on the bottom of the chord). One symbol is shown at each beat at a given metrical level (specified by the user); however, if a chord is the same as the previous chord shown, only a dot is shown. Somewhat complex rules are used here to label underspecified chords (for example, if neither the major nor minor third is present), and to decide whether chord symbols should be shown (if a chord is the same as the previous chord shown except for its inversion, it will not be shown). At the moment, the system is capable of handling most diatonic chords and secondary dominants, but cannot handle most chromatic chords.

For the first five measures of the Bach Gavotte, the program produces this. (Vertical bars indicate segments; letters followed by colons indicate keys):

     G: I  .  |V6 .  vi .  |iii6 .  IV ii6 |V  V43 .  .  |I  .  .  vi |D: V42 .  ii42 V64 |

The key program can also be run without any harmonic information as input. In this case, no Roman numeral analysis will be performed (even if romnum=1; see below).

To run the key program, enter the "key" directory, and type

	key-profile [input-file]

Like the other programs, the key program must be compiled by typing "make" from within the key directory.

Implementing Key-Finding Models. Using the Melisma key program, one can implement several different key-finding models using the user-settable parameters. This relates to the way the key-matching process is done, and also to the specific key-profiles that are used. Regarding the key-matching process, there are four possibilities (the desired method can be chosen using the parameter "mode"):

1. The Krumhansl-Schmuckler algorithm (mode=0) (as specified in Krumhansl 1990), hereafter called the K-S model. In this method, the input vector for a segment represents the total duration of each pitch-class in the segment. The match between the input vector and each key-profile is calculating using the standard correlation formula.

2. The algorithm specified in Temperley 2001, hereafter called the CBMS model (mode=1). In this method, the input vector for a segment simply has 1 for a pitch-class if it is present at all in the segment (the duration and number of occurrences of the pitch-class are ignored) and 0 if it is not; the score for a key is given by the sum of the products of key-profile values and corresponding input vector values (which amounts to summing the key-profile values for all pcs present in the segment).

3. A variant of the CBMS model (mode=2). Under this method, the input vector value is 1/N (where N is the number of pcs in the segment) for pcs present in the segment, 0 otherwise. (In effect, this is like the original CBMS model, except that the score for a key is given by the average key-profile value for all the pcs present in the segment. This method is not recommended and will not be discussed further.)

4. A "Bayesian" key-finding algorithm (mode=3) (see Temperley, "A Bayesian Key-Finding Algorithm", in Music and Artificial Intelligence, Springer, 2002). Under this method, the key-profile values (Kpc) are taken as probabilities, indicating the probability of each pitch-class occurring in a segment of the given key. Choosing a key for a segment is then a matter of choosing the key which is most likely given the pitch-class set that is observed. For a given key, the score is the product of Kpc values for all pitch-classes present in the segment, times (1-Kpc) for all pcs not present.

For the KS model and CBMS model, the score for a total analysis of a piece (a labeling of each segment with a key) is given by the sum of key-profile scores for all segments, plus change penalties for all key-changes from one segment to the next. (In the Bayesian model, the overall score represent the product of key-profile scores and change penalties.) In each case, the chosen analysis will be the one yielding the highest overall score.

Each of the three models was proposed with certain key-profile values (one profile for major keys, another for minor keys). All three key-profiles sets are provided as possibilities in the parameter file. Of course, any key-profile values could be used with any model. (For the Bayesian model, it is assumed that parameters values will be interpretable as probabilities - i.e. numbers between 0 and 1.) The default setting of the "mode" parameter is for the CBMS model, and other default parameter values are set for optimal performance on this model. For other important points about the various key-finding models, see the discussion of parameters below.

Parameters. The user-settable parameters for the program are as follows. (NOTE: In some cases, the parameter file lists more than one statement for a parameter. However, all but one statement should be commented out; otherwise, the program will use the last statement seen.)
verbosity=1 The amount of information outputted by the program. With verbosity=1, the program displays the key choices for each segment. With verbosity=2, other information is also shown.
npc_or_tpc_profile=1 The program can either be run with a "TPC profile", in which TPC distinctions are recognized (so that, for example, Ab and G# have different profile values), or an NPC profile, in which they are not. If npc_or_tpc_profile=1, a TPC profile will be used; if 0, an NPC profile. NOTE: If npc_or_tpc_profile=1, then TPCNote statements (rather than Note statements) must be provided in the input; otherwise the program will output an error and quit. (The program cannot be expected to use TPC information if it is not included in the input!) If npc_or_tpc_profile=0, then either kind of input may be used; if TPC input is used, TPC labels will simply be ignored.
mode=1 The program can perform the key-matching searches in several different ways (as described in "Implementing Key-Finding Models" above); the mode parameter controls this. 0=the K-S model, 1=the CBMS model, 2=a variant of the CBMS model, 3=the Bayesian model. NOTE: If mode=0 or 3, then npc_or_tpc_profile must be set to 0, otherwise the program will output an error message and quit. (Neither the K-S model nor the Bayesian model is capable of dealing with TPC distinctions.)
major_profile =
5.0 2.0 3.5 2.0 4.5 4.0 2.0 4.5 2.0 3.5 1.5 4.0
minor_profile =
5.0 2.0 3.5 4.5 2.0 4.0 2.0 4.5 3.5 2.0 1.5 4.0
These two parameters list the values for the major and minor key-profiles. The values are listed in chromatic scale order: 1, b2, 2, b3, 3, 4, #4, 5, b6, 6, b7, 7. (If a TPC profile is being used, values for TPC's other than these are controlled by "default_profile_value", described below.) The values shown at left are those of the CBMS model; values for the KS and Bayesian models are also provided in the parameter file.
default_profile_value=1.5 This is the default key-profile value for all TPCs which are outside the line-of-fifths range represented in the key-profile statements above. (This only applies when a TPC profile is being used.)
change_penalty = 12.0 The penalty for changing keys from one segment to the next; higher penalties will make key-changes less frequent. The default value of 12.0 is optimal for the CBMS model (mode=1); the parameter file also lists optimal values for the KS model (mode=0) and Bayesian model (mode=3). (When mode = 0, 1 or 2, this is a simple additive penalty. When mode = 3, the change penalty, cp, indicates the probability of switching from one key to another, so it must be a number between 0 and 1, and higher values will make key-changes more frequent; cp/23 is applied for each key-changing segment, and (1-cp) for each non-key-changing segment.)
segment_beat_level=3 This is the level of the metrical structure from which the key segments are generated.
running=0 A "running" analysis shows the program's complete analysis for the piece so far at each segment. If running=1 this analysis will be shown.
romnum=0 This determines whether or not the Roman numeral analysis will be performed and displayed. If romnum=1, it will be performed.
beat_printout_level=2 This relates to the Roman numeral analysis. For each beat at this level, a chord symbol is printed out (unless the chord is the same as the previous chord, in which case just a dot is printed). Any chord which begins after a beat at this level, and does not extend past the following beat at this level, will not be shown.
romnum_type=0 This determines the type of Roman numeral analysis done: the "Kostka-Payne" type (romnum_type=0), or the "Aldwell-Schachter" type (romnum_type=1). A "Kostka-Payne" analysis explicitly shows the mode of each triad: upper-case symbols ("IV") for a major triad, lower-case ("iv") for a minor triad. In an "Aldwell-Schachter" analysis, only upper-case symbols are used, and the mode of chords is indicated by the key. A "IV" is a major triad if the key is major, minor if the key is minor. (A major triad which would normally be minor is indicated as "IV#3".)