Synthesis lectures on information concepts, retrieval, and services,
Volume Designation
#66
ISSN of Series
1947-9468
GENERAL NOTES
Text of Note
Title from PDF title page (viewed on June 4, 2019).
INTERNAL BIBLIOGRAPHIES/INDEXES NOTE
Text of Note
Includes bibliographical references (pages 105-127).
CONTENTS NOTE
Text of Note
1. Introduction -- 1.1. Human computers -- 1.2. Basic concepts -- 1.3. Examples -- 1.4. Some generic observations -- 1.5. A note on platforms -- 1.6. The importance of labels -- 1.7. Scope and structure
3. Quality assurance -- 3.1. Quality framework -- 3.2. Quality control overview -- 3.3. Recommendations from platforms -- 3.4. Worker qualification -- 3.5. Reliability and validity -- 3.6. Hit debugging -- 3.7. Summary
Text of Note
4. Algorithms and techniques for quality control -- 4.1. Framework -- 4.2. Voting -- 4.3. Attention monitoring -- 4.4. Honey pots -- 4.5. Workers reviewing work -- 4.6. Justification -- 4.7. Aggregation methods -- 4.8. Behavioral data -- 4.9. Expertise and routing -- 4.10. Summary
Text of Note
5. The human side of human computation -- 5.1. Overview -- 5.2. Demographics -- 5.3. Incentives -- 5.4. Worker experience -- 5.5. Worker feedback -- 5.6. Legal and ethics -- 5.7. Summary
Text of Note
6. Putting all things together -- 6.1. The state of the practice -- 6.2. Wetware programming -- 6.3. Testing and debugging -- 6.4. Work quality control -- 6.5. Managing construction -- 6.6. Operational considerations -- 6.7. Summary of practices -- 6.8. Summary
Text of Note
7. Systems and data pipelines -- 7.1. Evaluation -- 7.2. Machine translation -- 7.3. Handwriting recognition and transcription -- 7.4. Taxonomy creation -- 7.5. Data analysis -- 7.6. News near-duplicate detection -- 7.7. Entity resolution -- 7.8. Classification -- 7.9. Image and speech -- 7.10. Information extraction -- 7.11. RABJ -- 7.12. Workflows -- 7.13. Summary
Text of Note
8. Looking ahead -- 8.1. Crowds and social networks -- 8.2. Interactive and real-time crowdsourcing -- 8.3. Programming languages -- 8.4. Databases and crowd-powered algorithms -- 8.5. Fairness, bias, and reproducibility -- 8.6. An incomplete list of requirements for infrastructure -- 8.7. Summary.
SUMMARY OR ABSTRACT
Text of Note
Many data-intensive applications that use machine learning or artificial intelligence techniques depend on humans to provide the initial dataset, enabling algorithms to process the rest, or on other humans to evaluate the performance of such algorithms. Not only can labeled data for training and evaluation be collected more quickly, cheaply, and easily than ever before, but we now see the emergence of hybrid human-machine software that combines computations performed by humans and machines. There are, however, real-world practical issues with the adoption of human computation and crowdsourcing. Building systems and data processing pipelines that require crowd computing remains difficult. In this book, we present practical considerations for designing and implementing tasks that require the use of humans and machines in combination, with the goal of producing high-quality labels.