Wednesday, February 3, 2016

2016-02-03 status

Done

Data Collection

  • Moved abcDE editor to new release home on github.io.
  • Proved concept (mostly) of integrating new editor into Qualtrics survey.
  • Declared abcDE feature complete.
  • Wrote end-user and administrator documentation.

Data Analysis

  • Created new GitHub repo.
  • Played around with BibTeX.
  • Drafted some null hypotheses.
  • Spun wheels.

Doing

  1. Discussing each descriptive chart in detail: its purpose and the insight (if any) it offers.
  2. Determining the predictive value of one fingering sequence choice on another (a sketch in R follows this list).
  3. Performing chi-square analysis of the exploratory dataset to correlate abbreviated Parncutt fingerings with gender, reach, age, Hanon usage, technical practice, preparation actions, injury, etc.
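
Since items 2 and 3 come down to the same kind of test, here is a minimal R sketch of what I have in mind. Everything here is illustrative: the data frame exploratory (one row per subject in the exploratory split) and the columns exerciseA and exerciseB are hypothetical names, not the real schema.

# Sketch: does the fingering chosen for Exercise A predict the fingering
# chosen for Exercise B? Cross-tabulate the two choices and test for
# independence with a chi-square test.
choice_table <- table(exploratory$exerciseA, exploratory$exerciseB)
print(choice_table)                  # eyeball the contingency table first
test <- chisq.test(choice_table)     # may warn if expected counts are small
print(test)
# Effect size (Cramer's V), so "significant" is not mistaken for "strong":
cramers_v <- sqrt(test$statistic /
                  (sum(choice_table) * (min(dim(choice_table)) - 1)))
print(cramers_v)

The same pattern, with a profile column such as gender or reach in place of exerciseB, covers item 3.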

Struggling

  • New (?) Qualtrics and new abcDE editor are not getting along.

Sunday, January 24, 2016

Czerny über alles

I think we should develop a corpus of Czerny's published fingerings and use it as an initial source of ground truth. To assist in this effort, we will first flesh out the ideas of editorial conventions and full specifications, and use these principles to create an MDC5 (abcDE) with partial-input and auto-complete features. Then we will see how hard it is to annotate the WTC.

Then Op. 821 and The Little Pianist.

Null hypotheses

  1. There is no correlation between gender and fingering preferences.
  2. There is no correlation between hand size and fingering preferences.
  3. There is no correlation between scale fingering preferences and fingering preferences of non-scale musical fragments.
  4. There is no correlation between fingerings of one Parncutt fragment and another.
  5. Editorial fingering suggestions have no effect on fingering preferences.
  6. There is no correlation between fingering preferences and Hanon practice.
  7. There is no correlation between fingering preferences and technical study practice.
  8. There is no correlation between the age of the pianist and fingering preferences.
  9. There is no correlation between average pivot count in subject fingerings and fingering "difficulty" (variability).
  10. There is no correlation between pitch span within a fragment and fingering "difficulty" (variability). (A sketch of testing 9 and 10 follows this list.)
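
For hypotheses 9 and 10 the variables are roughly continuous, so a rank correlation seems like the natural first test. A minimal R sketch, assuming a data frame fragments with one row per fragment and hypothetical columns pivotCount, pitchSpan, and fingeringEntropy (Shannon entropy of the observed fingering distribution, one possible stand-in for "difficulty"):

# Sketch: Spearman rank correlations for H9 and H10. The fragments data
# frame and its columns are assumed names, not the real schema.
entropy <- function(counts) {
  p <- counts / sum(counts)
  p <- p[p > 0]
  -sum(p * log2(p))
}
# fragments$fingeringEntropy would be entropy(table of fingerings observed for that fragment).
cor.test(fragments$pivotCount, fragments$fingeringEntropy, method = "spearman")  # H9
cor.test(fragments$pitchSpan,  fragments$fingeringEntropy, method = "spearman")  # H10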

Wednesday, January 20, 2016

2016-01-20 status

Done

Data Collection

  • Created new feature-rich manual data collection tool (formerly MDC5 but now rechristened abcDE) to allow input of partial fingering information.
  • Created detailed help page for same.
  • Defined file format, abcD, and implemented its recognition by the new editor.
  • Defined abcDF grammar for fingering sequences.
  • Created parser for abcDF.
  • Defined public interface to abcDE JavaScript library.
  • Deployed abcDE to nlp.cs.uic.edu.
  • Completed initial descriptive statistics for survey data.
  • Wrote script to identify note-wise fingering "consensus" in Survey I data (sketched below).
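
The consensus idea, roughly re-sketched in R (this is the idea, not the script itself; the fingerings data frame with note and finger columns is an assumed layout):

# Sketch: for each note position, find the modal finger choice and the
# share of subjects who chose it. Data layout is assumed, not the real one.
consensus <- do.call(rbind, lapply(split(fingerings, fingerings$note), function(d) {
  counts <- table(d$finger)
  data.frame(note      = d$note[1],
             finger    = names(which.max(counts)),
             agreement = max(counts) / sum(counts))
}))
print(consensus)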

Doing

  1. Drafting paper for ISMIR 2016 to describe abcDE.
  2. Performing chi-square analysis of the exploratory dataset to correlate abbreviated Parncutt fingerings with gender, reach, age, Hanon usage, technical practice, preparation actions, injury, etc.
  3. Looking at how well selecting fingering a in Exercise A predicts selecting fingering b in Exercise B. That is, do people have common patterns of fingering preference?
  4. Evaluating Tableau for easier (and richer) data visualization. (SQLite support missing for OS X.)

Struggling

  • What? Me worry?

Wednesday, December 16, 2015

2015-12-17 status

Done

Data Collection

  • Closed out initial program advance in TEM.
  • Created a number of database views to make survey data easier to process.
  • Created 50 bar charts using R and LaTeX to describe the distribution of answers to 50 survey questions.
  • Calculated "coarse-grained" mean agreement of Parncutt sequences in both Parncutt et al. and in our survey. (Think Fleiss's kappa with each fingering representing a category, 191 annotators, one example to classify, and no need to worry about chance agreement. Anyway, it made sense to me while I was doing it. The arithmetic is sketched after this list.)
  • Split the 191 complete responses into two datasets ("exploratory" and "validation") for out-of-sample testing, in an effort to mitigate the effects of data dredging if/when our quest for correlation runs amok.
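
One way to make that precise (a sketch of the arithmetic; the actual script may differ in detail): with n = 191 annotators, a single item, and each distinct fingering sequence j treated as a category chosen by n_j annotators, the observed-agreement term of Fleiss's kappa is

\[ P = \frac{1}{n(n-1)} \sum_{j} n_j (n_j - 1), \]

i.e., the proportion of annotator pairs that agree. With one item and no chance correction, that proportion is the whole statistic.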

Doing

  1. Arranging with BDE for Alex to receive CS 398 credit next term for BowTIE work.
  2. Performing chi-square analysis of the exploratory dataset to correlate abbreviated Parncutt fingerings with gender, reach, age, Hanon usage, technical practice, preparation actions, injury, etc.
  3. Looking at how well selecting fingering a in Exercise A predicts selecting fingering b in Exercise B. That is, do people have common patterns of fingering preference?
  4. Analyzing "consensus" of finger choice that follows abbreviated Parncutt fingering. How arbitrary were these sequences? Can we identify more suitable sequences in our data?
  5. Doing more basic descriptive statistical analysis of survey data.
  6. Evaluating Tableau for easier (and richer) data visualization. (SQLite support missing for OS X.)

Struggling

  • Can we legitimately treat fingerings as categories?
  • How can we conflate the unpopular fingerings meaningfully?
  • Not sure how out-of-sample testing will complicate the contemplated ad hoc category definitions.

Data dredge

As I am about to embark on my quest to find correlation in the data, I am chastened by fears of data dredging. So I propose we do randomized out-of-sample testing. Toward this end, we will split the data into two subsets of approximately equal size.

The following query gives us the identifiers for subjects who completed both parts of the survey and for whom at least some fingering data were recorded:

select response_id
from well_known_subject s
inner join
(select distinct subject
from finger where fingers != '') f
on s.response_id = f.subject

We save this query as the "complete_response_id" view. There are 191 such response_ids.

So we load the "exploratory_response_id" table like so:

insert into exploratory_response_id
select response_id
from complete_response_id
order by random()
limit 96

The 95 response_ids not included in this table are stored in the "validation_response_id" view:

select c.response_id
from complete_response_id c
where not exists (select e.response_id
from exploratory_response_id e
where e.response_id = c.response_id)

The actual (scrubbed) profile data will remain in the "subject" table. We will create views to provide access to the appropriate data ("exploratory_subject" and "validation_subject"), which will leverage the subject_latexable view of the subject data to use camel-case column names. This makes it unnecessary to remap the column names in R.
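
For later, a minimal sketch of pulling the exploratory subset into R via RSQLite (the package choice and the fingering.db filename are assumptions; exploratory_subject is the view described above):

# Sketch: load the exploratory subjects into R, relying on the camel-case
# column names provided by the exploratory_subject view.
library(DBI)
library(RSQLite)
con <- dbConnect(RSQLite::SQLite(), "fingering.db")   # assumed filename
exploratory <- dbGetQuery(con, "select * from exploratory_subject")
str(exploratory)    # column names arrive camel-cased; no remapping needed
dbDisconnect(con)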

Wednesday, December 2, 2015

2015-12-02 status

Done

Data Collection

  • Paid lottery winner for Survey II and notified UPAY1099 through PEAR of two payments to winners. Received confirmation of receipt from Amazon.
  • Loaded data from Survey II to SQLite.
  • Developed workflow to create charts in R and render them in LaTeX (a sketch follows this list).
  • Did some manual cleanup for "well-known subjects" (people who completed both parts of the survey), on whom we will focus our analysis. (In the most conservative interpretation of the protocol, subjects who walk away at any point are discarded, except for two subjects from whom I obtained permission to retain their nearly complete submissions. I manually set the "finished_2" field in the database, so they would be included.) There are 199 such well-known subjects, of whom we have actual fingering data for 191.
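
A sketch of that workflow under assumed names (the handSpan column and the output filename are hypothetical): each chart is written from R as a small PDF, which the LaTeX report then pulls in with \includegraphics.

# Sketch: render one distribution as a PDF sized for a LaTeX figure.
pdf("hand_span_distribution.pdf", width = 5, height = 3.5)  # assumed filename
barplot(table(exploratory$handSpan),                        # assumed column
        main = "Hand span", xlab = "Reported span", ylab = "Respondents")
dev.off()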

BowTIE

  • Met with Jackson and helped get his Android "Hello World" working.
  • Arranged for Alex to come to my meeting with BDE this week to discuss CS 398 credit next term.

Doing

  • Relearning R and LaTeX.
  • Doing basic descriptive statistical analysis of survey data.

Struggling

  • Upgrade to El Capitan at your peril. Root is no longer root in Mac land; the upgrade broke my Perl and LaTeX environments.
  • Working with Sherice to close out the $200 cash advance.