Thursday, January 10, 2019

Clustering of editorial advice

A possible enhancement

Again, say we have the following fingerings in our gold standard:
  1. 1234 131 (1 annotator vote) .1
  2. 1xx4 1x1 (3 votes) .3
  3. 12xx x3x (2 votes) .2
  4. 1x1x 1xx (4 votes) .4
How much credit should a model get for suggesting 1234 131? Or 1214 131? That is, how likely is it that the user will accept the advice?

Do we just sum over all the matches? So 1234 131 would get .1 + .3 + .2 = .6? And 1214 131 would get .3 + .2 + .4 = .9? But don't we have more evidence that 1234 131 is a good fingering?
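To make the naive summing concrete, here is a minimal Python sketch. The names `GOLD`, `matches`, and `summed_credit` are mine, invented for illustration, and I am assuming a suggestion matches a gold fingering when it agrees with every non-wildcard element.

```python
# Gold-standard fingerings and their vote shares; "x" marks an unannotated note.
GOLD = {
    "1234 131": 0.1,
    "1xx4 1x1": 0.3,
    "12xx x3x": 0.2,
    "1x1x 1xx": 0.4,
}

def matches(suggestion, gold):
    """True if the suggestion agrees with every non-wildcard element of gold."""
    return len(suggestion) == len(gold) and all(
        g == "x" or g == s for s, g in zip(suggestion, gold)
    )

def summed_credit(suggestion, gold_weights):
    """Sum the vote shares of all gold fingerings the suggestion matches."""
    return sum(w for g, w in gold_weights.items() if matches(suggestion, g))

print(round(summed_credit("1234 131", GOLD), 3))  # 0.6
print(round(summed_credit("1214 131", GOLD), 3))  # 0.9
```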

We proposed amplifying each fingering sequence according to how many actual annotations it has:
  1. 1234 131 (7 notes x 1 annotator = 7 votes) .189
  2. 1xx4 1x1 (4 x 3 = 12 votes) .324
  3. 12xx x3x (3 x 2 = 6 votes) .162
  4. 1x1x 1xx (3 x 4 = 12 votes) .324
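The amplified weights in this list can be reproduced with a few lines of Python. This is only a sketch: `VOTES`, `annotated_count`, and `amplified_weights` are hypothetical names, and I am assuming every digit in a sequence counts as one annotated note.

```python
# Annotator votes per gold fingering; "x" marks an unannotated note.
VOTES = {"1234 131": 1, "1xx4 1x1": 3, "12xx x3x": 2, "1x1x 1xx": 4}

def annotated_count(fingering):
    """Number of notes that carry an actual finger annotation (a digit)."""
    return sum(ch.isdigit() for ch in fingering)

def amplified_weights(votes):
    """Weight each fingering by annotated notes x votes, then normalize."""
    raw = {f: annotated_count(f) * v for f, v in votes.items()}
    total = sum(raw.values())
    return {f: r / total for f, r in raw.items()}

for fingering, weight in amplified_weights(VOTES).items():
    print(fingering, round(weight, 3))
```

The raw vote counts come out as 7, 12, 6, and 12, for a total of 37.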
Or should we reduce the contribution of a sequence if it is shared with multiple sequence sets ("clusters")? Otherwise, we are over-representing the signal from less discriminating voices and in general over-stating the likelihood that a user will be satisfied.

Sequences 2 and 3 match both suggestions, so their weights are halved. So 1234 131 would get .189 + .324/2 + .162/2 = .189 + .162 + .081 = .432. And 1214 131 would have .324/2 + .162/2 + .324 = .162 + .081 + .324 = .567.
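Here is a rough Python sketch of that sharing rule. It assumes the only clusters in play are the two candidate suggestions discussed here, and it starts from the rounded amplified weights in the list; `CANDIDATES`, `matches`, and `shared_credit` are names I made up for the illustration.

```python
# Rounded amplified weights from the list above; "x" marks an unannotated note.
AMPLIFIED = {
    "1234 131": 0.189,
    "1xx4 1x1": 0.324,
    "12xx x3x": 0.162,
    "1x1x 1xx": 0.324,
}
# Assumption: each candidate suggestion defines one cluster of matching sequences.
CANDIDATES = ["1234 131", "1214 131"]

def matches(suggestion, gold):
    """True if the suggestion agrees with every non-wildcard element of gold."""
    return len(suggestion) == len(gold) and all(
        g == "x" or g == s for s, g in zip(suggestion, gold)
    )

def shared_credit(suggestion, candidates, weights):
    """Split each matching sequence's weight across the clusters it belongs to."""
    clusters = set(candidates) | {suggestion}
    total = 0.0
    for gold, weight in weights.items():
        if matches(suggestion, gold):
            n_clusters = sum(matches(c, gold) for c in clusters)
            total += weight / n_clusters
    return total

print(round(shared_credit("1234 131", CANDIDATES, AMPLIFIED), 3))  # 0.432
print(round(shared_credit("1214 131", CANDIDATES, AMPLIFIED), 3))  # 0.567
```

With exact fractions instead of the rounded weights, the second total is 21/37, which rounds to .568 rather than .567.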

My confidence in someone who seems to agree with me is diminished every time I see this person agree with someone I disagree with. So maybe we divide an agreeing editor's amplified weight by 1 plus the number of wrongsters (editors who disagree with the system suggestion) that this editor agrees with.

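So 1234 131 would get .189 + .324/2 + .162/2 = .189 + .162 + .081 = .432. And 1214 131 would have .324/2 + .162/2 + .324 = .567. These are the same totals as before, because in this example each shared sequence agrees with exactly one wrongster.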
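A small Python sketch of the wrongster discount, under two assumptions of mine: wrongsters are counted as distinct gold fingerings (not individual votes), and two fingerings agree when they never clash on a note both have annotated. The names `agree` and `wrongster_credit` are hypothetical.

```python
# Rounded amplified weights from the list above; "x" marks an unannotated note.
AMPLIFIED = {
    "1234 131": 0.189,
    "1xx4 1x1": 0.324,
    "12xx x3x": 0.162,
    "1x1x 1xx": 0.324,
}

def agree(a, b):
    """True if a and b never clash where both have an annotation."""
    return len(a) == len(b) and all(
        p == q or "x" in (p, q) for p, q in zip(a, b)
    )

def wrongster_credit(suggestion, weights):
    """Discount each supporter by 1 + the number of wrongsters it agrees with."""
    supporters = [g for g in weights if agree(suggestion, g)]
    wrongsters = [g for g in weights if not agree(suggestion, g)]
    total = 0.0
    for s in supporters:
        n_bad = sum(agree(s, w) for w in wrongsters)
        total += weights[s] / (1 + n_bad)
    return total

print(round(wrongster_credit("1234 131", AMPLIFIED), 3))  # 0.432
print(round(wrongster_credit("1214 131", AMPLIFIED), 3))  # 0.567
```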

Or actually, rather than just counting the wrongsters, we should consider the amplified weight of the offending fingerings.

Conversely(?), what if I agree with someone and then I see this person disagree with someone I also agree with? Is agreement transitive? Does A=B and B=C imply A=C? Not in general, since wildcards let B agree with both A and C while A and C clash (fingerings 1, 2, and 4 above). But the scenario in question is not possible, unless I need more coffee. The system suggestion never includes wildcards, and the only way to disagree is by failing to match a non-wildcard element. If I (the system) agree with two people, then wherever both have annotated a note they both match my value, so the two people must agree with each other.
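A quick Python check of this argument on the four gold fingerings. It is a sketch only: `agree` and `supporters_agree` are hypothetical names, and agreement is assumed to mean no clash on any note that both sequences annotate.

```python
from itertools import combinations

# The four gold fingerings; "x" marks an unannotated note.
GOLD = ["1234 131", "1xx4 1x1", "12xx x3x", "1x1x 1xx"]

def agree(a, b):
    """True if a and b never clash where both have an annotation."""
    return len(a) == len(b) and all(
        p == q or "x" in (p, q) for p, q in zip(a, b)
    )

def supporters_agree(suggestion, gold):
    """Check that everyone agreeing with the suggestion agrees pairwise."""
    supporters = [g for g in gold if agree(suggestion, g)]
    return all(agree(a, b) for a, b in combinations(supporters, 2))

# A wildcard-free suggestion forces its supporters to agree with each other.
print(supporters_agree("1234 131", GOLD))  # True
print(supporters_agree("1214 131", GOLD))  # True

# Among the editors themselves, agreement is not transitive.
print(agree("1234 131", "1xx4 1x1"))  # True
print(agree("1xx4 1x1", "1x1x 1xx"))  # True
print(agree("1234 131", "1x1x 1xx"))  # False
```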




