Tuesday, September 6, 2011

Paper Reading #4: Gestalt: Integrated Support for Implementation and Analysis in Machine Learning

Reference Information
Gestalt: Integrated Support for Implementation and Analysis in Machine Learning
Kayur Patel, Naomi Bancroft, Steven M. Drucker, James Fogarty, Andrew J. Ko, James A. Landay
Presented at UIST'10, October 3-6, 2010, New York, New York, USA



Author Bios
  • Kayur Patel is a computer science PhD student at the University of Washington.
  • Naomi Bancroft was an undergraduate at the University of Washington and currently works for Google.
  • Steven M. Drucker is a principal researcher at Microsoft Research and an affiliate professor at the University of Washington.
  • James Fogarty is an Assistant Professor at the University of Washington and is a key member of the university's Human-Computer Interaction group.
  • Andrew J. Ko is an Assistant Professor at the University of Washington and directs the USE research group at that university. His PhD in Computer Science is from Carnegie Mellon.

Summary
Hypothesis
The authors believe Gestalt can improve users' ability to find and fix errors in machine learning code by supporting the implementation of a classification pipeline, the analysis of data as it moves through that pipeline, and easy transitions between implementation and analysis.

Methods
In the baseline condition, participants used a general-purpose development environment to create, edit, and execute scripts. They were given an API that could reproduce all of Gestalt's visualizations. The baseline and Gestalt used the same data table structure to store data, but in the baseline participants had to write code to connect data, attributes, and classifications. Participants worked on solutions to two problems: sentiment analysis and gesture recognition. The solutions had five bugs built into them, and participants were asked to locate and fix as many as possible within an hour.


Results
Users found and fixed far more bugs with Gestalt than with the MATLAB baseline. Some used Gestalt's visualization scripting feature. Participants enjoyed the connectedness of Gestalt. Most of their time was spent analyzing for errors.

Contents
Gestalt is a general-purpose tool for applying machine learning. It supports the implementation of a classification pipeline, analysis of the data in that pipeline, and transitions between implementation and analysis. Analysis currently requires extensive developer time, yet it is performed repeatedly throughout the learning process. Two possible applications are sentiment analysis and gesture recognition. Because hiding steps in the pipeline would limit generality, Gestalt favors flexibility and operates much like an IDE. Information is stored in a relational data table, which eliminates the need for data conversion. Gestalt also allows developers to write code to produce generalized visualizations.
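
To make the shared data table idea concrete, here is a minimal Python sketch of my own. It is not Gestalt's code or API: the function names, the word lists, and the toy examples are all assumptions made for illustration. Each pipeline stage writes its output into the same row as its input, so a misclassified example can be traced back through its attributes to the raw text without any conversion step.

```python
# Hypothetical sketch; none of these names come from Gestalt itself.

POSITIVE = {"good", "great", "love"}   # assumed toy word lists
NEGATIVE = {"bad", "awful", "hate"}


def load(examples):
    """Stage 1: put raw text and labels into one table (a list of row dicts)."""
    return [{"id": i, "text": text, "label": label}
            for i, (text, label) in enumerate(examples)]


def add_attributes(table):
    """Stage 2: compute attributes and store them next to the raw data."""
    for row in table:
        words = row["text"].lower().split()
        row["pos_count"] = sum(w in POSITIVE for w in words)
        row["neg_count"] = sum(w in NEGATIVE for w in words)
    return table


def classify(table):
    """Stage 3: store each prediction in the same row as its inputs."""
    for row in table:
        more_positive = row["pos_count"] > row["neg_count"]
        row["predicted"] = "positive" if more_positive else "negative"
    return table


def misclassified(table):
    """Analysis: select the rows where the prediction disagrees with the label."""
    return [row for row in table if row["predicted"] != row["label"]]


examples = [
    ("I love this great movie", "positive"),
    ("What an awful plot", "negative"),
    ("Not good at all", "negative"),
]
table = classify(add_attributes(load(examples)))
errors = misclassified(table)
for row in errors:
    # Raw text, attributes, and prediction are all available in one place.
    print(row["id"], row["text"], row["pos_count"], row["neg_count"], row["predicted"])
```

In this toy run the only error is the third example, and because the row still holds the original text alongside the counts, it is easy to see that the word-counting attributes ignore "not". That kind of hop from a wrong classification back to the data that caused it is, as I understand it, what the paper means by moving between implementation and analysis.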

Discussion
I feel that the paper presented an interesting idea and, by its own account, a completely successful resulting product. I completely agree that there is huge room for development in translating imprecise human input into something that a computer can understand. However, the limited user trials leave much to be desired, and I would very much like to read a follow-up paper detailing more extensive user studies and results.
