Here are the summary numbers comparing the first batch of revisions Justin gave me with Anne's (sections 2.1, 3.1, 4.1, 5.1, and 6.1). "Staff 1" is the upper staff; "Staff 2" is the lower. We are looking at the "kappa" score. A kappa score of 1.0 reflects perfect agreement. Anything over 0.8 is fantastic; no one is going to argue with agreement like that. But many people (like me and Barbara) will be suspicious of anything below 0.67.
Staff 1 kappa: 0.878523692967741
Agreement: 1255/1262 One: 31 Other: 28
Staff 2 kappa: 0.409898477157359
Agreement: 914/945 One: 33 Other: 21
Overall kappa: 0.655041358347147
Agreement: 2169/2207 One: 64 Other: 49
Anne is "One," and Justin is "Other." These are the total numbers of phrases each of you annotated. Note that Justin has 12 fewer phrases marked on the lower staff than Anne, but only 3 fewer on the upper staff.
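For anyone who wants to check the arithmetic, here is a minimal sketch that recomputes the kappa scores from the summary counts alone. It assumes the scores are Cohen's kappa over a two-way decision (phrase mark or no phrase mark) at each position compared, with chance agreement estimated from the "One" and "Other" totals. That assumption is not stated above, but it does reproduce all three reported scores to at least six decimal places. The function name `cohen_kappa` and the `summary` table are illustrative, not part of any existing tool.

```python
def cohen_kappa(agree, total, one_marked, other_marked):
    """Cohen's kappa for two annotators making a yes/no decision per position."""
    # Observed agreement: the fraction of positions where both made the same call.
    p_observed = agree / total
    # Chance agreement: both mark, or both leave unmarked, if each annotator
    # marked at his or her own overall rate independently of the other.
    p_chance = (one_marked * other_marked
                + (total - one_marked) * (total - other_marked)) / total ** 2
    return (p_observed - p_chance) / (1 - p_chance)


# (agreements, positions compared, Anne's count, Justin's count)
summary = {
    "Staff 1": (1255, 1262, 31, 28),
    "Staff 2": (914, 945, 33, 21),
    "Overall": (2169, 2207, 64, 49),
}

for name, counts in summary.items():
    print(f"{name} kappa: {cohen_kappa(*counts):.6f}")
```

This also shows why Staff 2 scores so low even though raw agreement is 914/945, or about 96.7%. Almost every position is unmarked by both annotators, so under this formula chance alone would account for about 94% agreement, and kappa only credits what is left over.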
It strikes me that all of this may be trying to tell us that phrasing in these left-hand (accompanying) voices is inherently more ambiguous than in the conventionally "melodic" voices--or that two distinct principles are being applied here. It seems quite plausible that a bass line may have its own agenda, over which an independent melody can form its own thoughts. Justin seems to see the situation more like that, while Anne seems to have a stronger sense that accompanying lines reflect the phrasing of the lines they are accompanying.
I am not saying either of you is right or wrong, but we need to see if we can find a way to get you to agree. I am also curious to see if there is any discussion in the (music theory) literature of the interplay between melodic and accompanying phrases.
Also, we did not see this disagreement in Sonatina 1. Why would that be?
I attach a zip file with the phrasing each of you originally submitted (except for 1.1 and 1.2, which have Anne's corrections). Please follow the "Comparing Annotations" procedure at the bottom of this blog post to review the data in abcDE.
I have removed the sub-phrase and motive marks, so we don't have to struggle to tell those little vertical lines apart. I would be interested in hearing Anne and Justin each offer a rationale for their own segmentation, and maybe for each other's, with reference to specific examples.
We can of course sidestep this issue for the time being by ignoring the lower staff, but I am not quite ready to do that yet.