Thursday, July 19, 2018

2018-07-19 status

Done

Administrivia

  1. Booked trip to ISMIR 2018 in Paris.

Model Building

  1. Finalized "Corrected Parncutt" implementation for everything but cyclic patterns.
  2. Confirmed inconsistencies in published Parncutt results:
    • Small-Span and Large-Span penalties are conflated.
    • Small-Span penalty definition is inconsistent.
    • Position-Change-Count and Position-Change-Size penalties are incorrect.
    • Penalty totals are incorrect.
    • Explanatory example has confusing/incorrect costs.
  3. Completed full regression test of code base.
  4. Met with Alex Demos and agreed to co-author paper for Musicae Scientiae on Parncutt, Corrected Parncutt, Improved Parncutt, and how to tell them apart. Will dry-run some of this material in a late-breaking paper at ISMIR.

    Doing

    1. Implementing support for, and clarifying definition of, cyclic pattern constraint in Parncutt. (Should also do this for Sayegh and Hart.)
    2. Double-checking pruning mechanism in Parncutt.
    3. Writing up findings on "Corrected Parncutt" model, initially for ISMIR submission.
    4. Adding mechanism to learn weights for "Improved Parncutt" rules from training data.

    Struggling

    1. How does one compare two ranked lists of sequences to a third and claim one of the two is more similar to the third in a statistically significant way? That is the big question.
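
One possible framing of the question above, offered only as a sketch: score each model's ranked list against the reference list piece by piece, then run a paired sign-flip permutation test on the per-piece score differences. The per-piece similarity scores and the function name `paired_permutation_test` are hypothetical, and this is not a method the project has settled on.

```python
import random


def paired_permutation_test(scores_a, scores_b, trials=10000, seed=0):
    """Two-sided sign-flip permutation test on paired per-piece scores.

    scores_a[i] and scores_b[i] are hypothetical similarity scores of
    model A and model B against the reference ranking for piece i.
    Returns the observed mean difference and an estimated p-value.
    """
    if len(scores_a) != len(scores_b):
        raise ValueError("score lists must be paired")
    diffs = [a - b for a, b in zip(scores_a, scores_b)]
    n = len(diffs)
    observed = sum(diffs) / n
    rng = random.Random(seed)
    extreme = 0
    for _ in range(trials):
        # Under the null hypothesis the sign of each difference is arbitrary.
        flipped = sum(d if rng.random() < 0.5 else -d for d in diffs) / n
        if abs(flipped) >= abs(observed):
            extreme += 1
    # Add-one smoothing keeps the estimate away from an impossible p = 0.
    return observed, (extreme + 1) / (trials + 1)
```

The test only needs one similarity number per piece per model, so the choice of rank-similarity measure stays open.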

    In Scope

    1. Implementing crude automatic segmenter.
    2. Developing staccato/legato classifier.
    3. Demonstrating improved Parncutt via #1 and #2.
    4. Debugging Sayegh model, which produces results inconsistent with training data.
    5. Developing better test cases for Sayegh.
    6. Updating abcDE to support manual segmentation.
    7. Completing and polishing abcD for entire Beringer corpus.
    8. Defining initial benchmark corpora and evaluation methodology.
    9. Implementing convenience methods for reporting benchmark results.
    10. Moving Beringer corpus to MySQL database.
    11. Enhancing Parncutt, following published techniques and pushing beyond them.
    12. Enhancing Hart and Sayegh to return top n solutions.
    13. Re-weighting Parncutt rules using machine learning and TensorFlow. (This seems like a good fit.)
    14. Adding support to abcDE for annotating phrase segmentation.
    15. Debugging Dactylize 88-key circuit.
    16. Collecting fingering data from JB performances in Elizabethtown.
    17. Completing Dactylize II circuit.
    18. Developing method to align performance data with symbolic data. I think this is going to be essential if we are to use Dactylize data moving forward and a key part of its proof of concept. I plan to have something for this at the ISMIR demo session (September 22 deadline).
    19. Defining procedure for sanity test of production automatic data collector (including Beringer data).
    20. Defining corpora for Dactylize data collection (WTC, Beringer, ??).
    21. Implementing end-to-end machine learning experiment, using Beringer abcD data.
    22. Submitting papers to TISMIR. Ideas: a follow-up demo paper describing Dactylize data collected; a full-length paper describing application of evaluation method to models developed; a full-length description of enhanced and/or novel models, demo of method to align collected performance data with symbolic score.

    Friday, June 22, 2018

    2018-06-22 status

    Done

    Model Building

    1. Re-implemented Parncutt cost functions for both hands.
    2. Identified possible inconsistencies in published Parncutt model description.
    3. Added feature to track more granular cost details in Dactyler models.
    4. Tracked individual rule costs to facilitate analysis of Parncutt results.
    5. Completed successful regression test for hacked-up Didactyl code. The APIs they are a-changin'.
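
A minimal sketch of what tracking per-rule costs might look like, assuming each rule contributes one raw cost per candidate fingering and the total is a weighted sum. The function name `score_fingering`, the rule keys, and the numbers are hypothetical and are not taken from the Dactyler code.

```python
def score_fingering(rule_costs, weights=None):
    """Return (total, breakdown) for one candidate fingering.

    rule_costs maps a rule name to its raw cost; weights maps a rule
    name to a multiplier (rules without an entry default to 1.0).
    """
    weights = weights or {}
    breakdown = {rule: cost * weights.get(rule, 1.0)
                 for rule, cost in rule_costs.items()}
    return sum(breakdown.values()), breakdown


# Hypothetical raw costs for a single fingering, keyed by rule name.
total, parts = score_fingering(
    {"small_span": 2, "large_span": 1, "position_change_count": 1},
    weights={"large_span": 3},
)
```

Keeping the breakdown next to the total makes it possible to see which rule is responsible when a result disagrees with a published figure.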

      Doing

      1. Testing Parncutt cost functions.
      2. Reproducing original Parncutt results.
      3. Adding mechanism to learn weights for Parncutt rules from training data.

      In Scope

      1. Implementing crude automatic segmenter.
      2. Developing staccato/legato classifier.
      3. Demonstrating improved Parncutt via #1 and #2.
      4. Debugging Sayegh model, which produces results inconsistent with training data.
      5. Developing better test cases for Sayegh.
      6. Updating abcDE to support manual segmentation.
      7. Completing and polishing abcD for entire Beringer corpus.
      8. Defining initial benchmark corpora and evaluation methodology.
      9. Implementing convenience methods for reporting benchmark results.
      10. Moving Beringer corpus to MySQL database.
      11. Enhancing Parncutt, following published techniques and pushing beyond them.
      12. Enhancing Hart and Sayegh to return top n solutions.
      13. Re-weighting Parncutt rules using machine learning and TensorFlow. (This seems like a good fit.)
      14. Adding support to abcDE for annotating phrase segmentation.
      15. Debugging Dactylize 88-key circuit.
      16. Collecting fingering data from JB performances in Elizabethtown.
      17. Completing Dactylize II circuit.
      18. Developing method to align performance data with symbolic data. I think this is going to be essential if we are to use Dactylize data moving forward and a key part of its proof of concept. I plan to have something for this at the ISMIR demo session (September 22 deadline).
      19. Defining procedure for sanity test of production automatic data collector (including Beringer data).
      20. Defining corpora for Dactylize data collection (WTC, Beringer, ??).
      21. Implementing end-to-end machine learning experiment, using Beringer abcD data.
      22. Submitting papers to TISMIR. Ideas: a follow-up demo paper describing Dactylize data collected; a full-length paper describing application of evaluation method to models developed; a full-length description of enhanced and/or novel models, demo of method to align collected performance data with symbolic score.

      Wednesday, June 13, 2018

      2018-06-13 status

      Done

      Methodology

      • Contemplated a survey to find ground truth for interchangeable digits. (Discussed briefly with AFL and JB over email.)

      Model Building

      • Implemented trigram nodes in networkx for revamped Parncutt.
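
For illustration, a minimal sketch of a trigram-node graph, assuming a node holds the fingers for three consecutive notes and that linked nodes must agree on the two notes they share. It uses plain dictionaries rather than networkx to stay self-contained; the function name `build_trigram_graph` and the node layout are assumptions, not the project's actual structure.

```python
from itertools import product

FINGERS = (1, 2, 3, 4, 5)


def build_trigram_graph(note_count, fingers=FINGERS):
    """Return {node: [successor, ...]} for a layered trigram-node graph.

    A node is (index, (f1, f2, f3)): the fingers assigned to notes
    index, index + 1, and index + 2. Two nodes are linked when they sit
    in adjacent layers and agree on the two notes they share.
    """
    if note_count < 3:
        raise ValueError("need at least three notes to form a trigram")
    graph = {}
    last_layer = note_count - 3
    for index in range(last_layer + 1):
        for trigram in product(fingers, repeat=3):
            successors = []
            if index < last_layer:
                # The next trigram must repeat this one's last two fingers.
                successors = [(index + 1, trigram[1:] + (f,)) for f in fingers]
            graph[(index, trigram)] = successors
    return graph
```

No pruning of implausible fingerings is applied here; every finger triple becomes a node, so a real model would likely filter nodes before attaching costs.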

        Doing

        1. Learning Cytoscape for graph visualization to help debug graph code.
        2. Fixing cost functions in Parncutt.
        3. Validating Parncutt cost functions for left hand.
        4. Adding mechanism to learn weights for Parncutt rules from training data.
        5. Reading some fingering pedagogy (C. P. E. Bach, Couperin, Rami Bar-Niv) to develop a vocabulary for talking to pianists.

        Struggling


        1. A body at rest tends to stay at rest.
        2. Does this topic make sense for my new career situation?


        In Scope

        1. Implementing crude automatic segmenter.
        2. Developing staccato/legato classifier.
        3. Demonstrating improved Parncutt via #1 and #2.
        4. Debugging Sayegh model, which produces results inconsistent with training data.
        5. Developing better test cases for Sayegh.
        6. Updating abcDE to support manual segmentation.
        7. Completing and polishing abcD for entire Beringer corpus.
        8. Defining initial benchmark corpora and evaluation methodology.
        9. Implementing convenience methods for reporting benchmark results.
        10. Moving Beringer corpus to MySQL database.
        11. Enhancing Parncutt, following published techniques and pushing beyond them.
        12. Enhancing Hart and Sayegh to return top n solutions.
        13. Re-weighting Parncutt rules using machine learning and TensorFlow. (This seems like a good fit.)
        14. Adding support to abcDE for annotating phrase segmentation.
        15. Debugging Dactylize 88-key circuit.
        16. Collecting fingering data from JB performances in Elizabethtown.
        17. Completing Dactylize II circuit.
        18. Developing method to align performance data with symbolic data. I think this is going to be essential if we are to use Dactylize data moving forward and a key part of its proof of concept. I plan to have something for this at the ISMIR demo session (September 22 deadline).
        19. Defining procedure for sanity test of production automatic data collector (including Beringer data).
        20. Defining corpora for Dactylize data collection (WTC, Beringer, ??).
        21. Implementing end-to-end machine learning experiment, using Beringer abcD data.
        22. Submitting papers to TISMIR. Ideas: a follow-up demo paper describing Dactylize data collected; a full-length paper describing application of evaluation method to models developed; a full-length description of enhanced and/or novel models, demo of method to align collected performance data with symbolic score.

        Friday, March 23, 2018

        2018-03-22 status

        Done

        Administrivia

        • Requested suspension of TAA support until next January (for tax purposes).
        • Answered interview questions for College of Engineering story on Fifty for the Future award.

        Model Building

        • Completed Sayegh implementation.
        • Learned how to modify TatSu AST. 
        • Implemented "pivot alignment" evaluation method.
        • Refactored code for better reuse in new models, especially for segmenting input.
        • Fixed how first and last fingering were being constrained to ensure model preferences for second and penultimate fingerings were not ignored.
        • Drafted ISMIR abstract.
        • Drew a diagram of the subproblems in the domain.

          Doing

          1. Writing up what we have done so far.
          2. Reimplementing Parncutt model in framework using networkx.
          3. Developing better test cases for Sayegh.
          4. Implementing crude automatic segmenter.
          5. Updating abcDE to support manual segmentation.
          6. Completing and polishing abcD for entire Beringer corpus.
          7. Defining initial benchmark corpora and evaluation methodology.
          8. Implementing convenience methods for reporting benchmark results.

          Struggling

          1. Sayegh model produces results that do not seem consistent with training data provided.

          In Scope

          1. Moving Beringer corpus to MySQL database.
          2. Enhancing Parncutt, following published techniques and pushing beyond them.
          3. Enhancing Hart and Sayegh to return top n solutions.
          4. Re-weighting Parncutt rules using machine learning and TensorFlow. (This seems like a good fit.)
          5. Adding support to abcDE for annotating phrase segmentation.
          6. Debugging Dactylize 88-key circuit.
          7. Collecting fingering data from JB performances in Elizabethtown.
          8. Completing Dactylize II circuit.
          9. Developing method to align performance data with symbolic data. I think this is going to be essential if we are to use Dactylize data moving forward and a key part of its proof of concept. I plan to have something for this at the ISMIR demo session (September 22 deadline).
          10. Defining procedure for sanity test of production automatic data collector (including Beringer data).
          11. Defining corpora for Dactylize data collection (WTC, Beringer, ??).
          12. Implementing end-to-end machine learning experiment, using Beringer abcD data.
          13. Submitting papers to TISMIR. Ideas: a follow-up demo paper describing Dactylize data collected; a full-length paper describing application of evaluation method to models developed; a full-length description of enhanced and/or novel models, demo of method to align collected performance data with symbolic score.

          Thursday, March 22, 2018

          2018-03-01 status

          Done

          Administrivia

          • Talked through a few challenges with the NLP Lab.
          • Did a little forum shopping. As a backup plan for ISMIR conference, the TISMIR journal is accepting submissions. 

          Model Building

          • Completed "reentry" evaluation method for strike fingers.
          • Tested edge cases for evaluation and advising methods.
          • Rejected TensorFlow for Sayegh implementation. We just need a trellis graph, a nasty for loop for training, and Viterbi.
          • Stubbed in support for phrase segmentation in modeling framework.
          • Implemented Sayegh training algorithm.
          • Implemented methods to store and recall trained models for reuse.
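
To illustrate the "trellis plus Viterbi" point above, here is a minimal sketch of minimum-cost Viterbi decoding over a fingering trellis. It covers decoding only, not training. The function names and the toy cost function are hypothetical; a Sayegh-style model would take its transition costs from training data instead.

```python
def viterbi(notes, fingers, transition_cost):
    """Return (total_cost, fingering) for the cheapest path through the trellis.

    transition_cost(prev_note, prev_finger, note, finger) is a
    caller-supplied function returning a non-negative cost.
    """
    if not notes:
        return 0, []
    # best[f] = (cost of the cheapest path ending on finger f, that path)
    best = {f: (0, [f]) for f in fingers}
    for prev_note, note in zip(notes, notes[1:]):
        step = {}
        for f in fingers:
            step[f] = min(
                ((cost + transition_cost(prev_note, pf, note, f), path + [f])
                 for pf, (cost, path) in best.items()),
                key=lambda item: item[0],
            )
        best = step
    return min(best.values(), key=lambda item: item[0])


def semitone_mismatch(prev_note, prev_finger, note, finger):
    """Toy cost (an assumption): pitch change should equal finger change."""
    return abs((note - prev_note) - (finger - prev_finger))


# Notes are MIDI pitch numbers; fingers are numbered 1 (thumb) to 5.
cost, fingering = viterbi([60, 62, 64], (1, 2, 3, 4, 5), semitone_mismatch)
```

Each layer keeps only the cheapest path per finger, so the work grows linearly with the number of notes.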

            Doing

            1. Implementing Sayegh trellis-graph model from scratch, using Python's networkx.
            2. Defining initial benchmark corpora and evaluation methodology.
            3. Implementing convenience methods for reporting benchmark results.
            4. Completing and polishing abcD for entire Beringer corpus.

            Struggling

            1. The otherwise slick parser module I am using (TatSu) produces an immutable AST. This is cramping my style and promises to get worse as we move along.
            2. The Parncutt code is a disaster under Python 3. Lot of rework needed here.

            In Scope

            1. Reimplementing Parncutt model in framework using networkx.
            2. Moving Beringer corpus to MySQL database.
            3. Enhancing Parncutt, following published techniques and pushing beyond them.
            4. Enhancing Hart and Sayegh to return top n solutions.
            5. Re-weighting Parncutt rules using machine learning and TensorFlow. (This seems like a good fit.)
            6. Adding support to abcDE for annotating phrase segmentation.
            7. Debugging Dactylize 88-key circuit.
            8. Collecting fingering data from JB performances in Elizabethtown.
            9. Completing Dactylize II circuit.
            10. Developing method to align performance data with symbolic data. I think this is going to be essential if we are to use Dactylize data moving forward and a key part of its proof of concept. I plan to have something for this at the ISMIR demo session (September 22 deadline).
            11. Defining procedure for sanity test of production automatic data collector (including Beringer data).
            12. Defining corpora for Dactylize data collection (WTC, Beringer, ??).
            13. Implementing end-to-end machine learning experiment, using Beringer abcD data.
            14. Submitting papers to ISMIR 2018. Abstracts due March 23. Papers due March 30. Ideas: a follow-up demo paper describing Dactylize data collected; a full-length paper describing application of evaluation method to models developed; a full-length description of enhanced and/or novel models, demo of method to align collected performance data with symbolic score.

            Friday, February 16, 2018

            2018-02-18 status

            Done

            Administrivia

            • Found ISMIR LaTeX template.

            Model Building

            • Modified music21 code base to ignore ornaments and had pull request accepted.
            • Moved to the alpha release of music21 (with some trepidation).
            • Set up grammar, parser, and convenient Abstract Syntax Tree (AST) for fingering language (abcDF) in Python.
            • Leveraged new parser to implement evaluation methods.
            • Implemented Hamming evaluation method for strike fingers.
            • Implemented "natural" evaluation method for strike fingers.
            • Implemented "pivot" evaluation method for strike fingers.
            • Implemented first (striking) finger constraint for Hart model.
            • Drafted "reentry" evaluation method for strike fingers.
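
As a point of reference for the Hamming method listed above, a minimal sketch assuming each fingering is reduced to one strike finger per note, so the two sequences have equal length. The function name `hamming_distance` and the sample data are hypothetical; the project's implementation works on parsed abcDF and may differ.

```python
def hamming_distance(predicted, gold):
    """Count the positions where two equal-length finger sequences differ."""
    if len(predicted) != len(gold):
        raise ValueError("sequences must cover the same notes")
    return sum(p != g for p, g in zip(predicted, gold))


# Hypothetical right-hand strike fingers for a five-note passage.
distance = hamming_distance([1, 2, 3, 4, 5], [1, 2, 3, 1, 2])
```

Hamming distance treats every mismatch as equally costly, which is presumably one reason to compare it against the "natural" and "pivot" methods.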

              Doing

              1. Debugging and testing "reentry" evaluation method. 
              2. Creating more test cases for "infrastructure" code.
              3. Studying TensorFlow paradigms for connectionist models.
              4. Implementing Sayegh model (via TensorFlow or from scratch).

              Struggling

              1. The otherwise slick parser module I am using (TatSu) produces an immutable AST. This is cramping my style and promises to get worse as we move along.
              2. The Parncutt code is a disaster under Python 3. Lot of rework needed here.
              3. Generally pulling hair out debugging the "reentry" code, which should be trivial. I am doing something stupid.

              In Scope

              1. Re-implementing Parncutt model in framework. (The graph class I was using does not seem to exist in Python 3, so this needs to be reworked.)
              2. Debugging Dactylize 88-key circuit.
              3. Collecting fingering data from JB performances in Elizabethtown.
              4. Completing Dactylize II circuit.
              5. Developing method to align performance data with symbolic data. I think this is going to be essential if we are to use Dactylize data moving forward and a key part of its proof of concept. I plan to have something for this at the ISMIR demo session (September 22 deadline).
              6. Creating abcD for complete Beringer corpus.
              7. Moving Beringer corpus to MySQL database.
              8. Enhancing Parncutt, following published techniques and pushing beyond them.
              9. Defining procedure for sanity test of production automatic data collector (including Beringer data).
              10. Defining corpora for Dactylize data collection (WTC, Beringer, ??).
              11. Implementing end-to-end machine learning experiment, using Beringer abcD data.
              12. Submitting papers to ISMIR 2018. Abstracts due March 23. Papers due March 30. Ideas: a follow-up demo paper describing Dactylize data collected; a full-length paper describing application of evaluation method to models developed; a full-length description of enhanced and/or novel models, demo of method to align collected performance data with symbolic score.

              Friday, January 26, 2018

              2018-01-26 status

              Done

              Model Building

              • Refactored Python algorithm ("Dactyler") architecture.
              • Implemented unit test framework.
              • Migrated Hart algorithm to new framework.
              • Enhanced Hart model to support repeated notes with same pitch.
              • Put announced ISMIR 2018 deadlines (March 23 and March 30) on NLP calendar.

                Doing

                1. Revamping my corpus module to deal with multiple voices and polyphony in abc/abcD input.
                2. Supporting first-note fingering constraint in Hart model to enable the "auto-correcting" or "re-entrant" evaluation method suggested by CR.
                3. Implementing DValuation base class to support evaluation methods.
                4. Implementing DHamming edit-distance evaluation method.
                5. Implementing DNatural edit-distance method.
                6. Implementing DPivot edit-distance method.
                7. Implementing DReEntry evaluation method.

                In Scope for Semester

                1. Re-implementing Parncutt model in framework. (The graph class I was using does not seem to exist in Python 3, so this needs to be reworked.)
                2. Debugging Dactylize 88-key circuit.
                3. Collecting fingering data from JB performances in Elizabethtown.
                4. Implementing Sayegh model.
                5. Completing Dactylize II circuit.
                6. Developing method to align performance data with symbolic data. I think this is going to be essential if we are to use Dactylize data moving forward and a key part of its proof of concept. I plan to have something for this at the ISMIR demo session (September 22 deadline).
                7. Creating abcD for complete Beringer corpus.
                8. Moving Beringer corpus to MySQL database.
                9. Enhancing Parncutt, following published techniques and pushing beyond them.
                10. Defining procedure for sanity test of production automatic data collector (including Beringer data).
                11. Defining corpora for Dactylize data collection (WTC, Beringer, ??).
                12. Implementing end-to-end machine learning experiment, using Beringer abcD data.
                13. Submitting papers to ISMIR 2018. Abstracts due March 23. Papers due March 30. Ideas: a follow-up demo paper describing Dactylize data collected; a full-length paper describing application of evaluation method to models developed; a full-length description of enhanced and/or novel models, demo of method to align collected performance data with symbolic score.

                Struggling

                1. music21 support for abc is either buggy, or I can't read. Having a hard time splitting an abc file into right- and left-hand parts, something that should be trivial.
                2. The Parncutt code is a disaster under Python 3. Lot of rework needed here.
                3. Contemplating using Hart as one of the two proof-of-concept models for the methodology paper. I might even be planning on it at this point.