UDC 004.02

QUALITY ASSURANCE METHODS IN CROWD COMPUTING SYSTEMS: AN ANALYTICAL SURVEY

A.V. Ponomarev

Abstract


Crowd computing — engaging a broad pool of people, interacting through information and communication technologies, in information processing tasks — is becoming increasingly widespread. Nevertheless, practical adoption of this technology is largely held back by the uncertain quality of the results it produces. Under these conditions, systematizing what is known about currently used quality assurance methods and identifying promising directions for their development is especially relevant. The paper discusses the results of a systematic review of journal publications from the ScienceDirect and IEEE Xplore full-text databases published after 2012. It identifies the currently most widespread directions in quality assurance, the models employed and the assumptions made, and outlines the limits of applicability of the methods. In particular, it is noted that the most common approaches are those based on reconciling assessments obtained from different participants (consensus methods) and those based on game-theoretic models.
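To make the consensus idea mentioned above concrete, the following is a minimal sketch (not taken from any of the surveyed papers; all names and data here are hypothetical) of reliability-weighted voting: answers are first aggregated by plain majority vote, each worker's reliability is estimated against the majority answers, and the votes are then re-aggregated with reliability weights — a lightweight relative of the Dawid–Skene EM scheme.

```python
from collections import Counter, defaultdict

def consensus(votes):
    """Aggregate crowd answers by a one-round reliability-weighted vote.

    votes: list of (worker_id, task_id, label) triples.
    Returns a dict task_id -> aggregated label.
    """
    by_task = defaultdict(list)
    for worker, task, label in votes:
        by_task[task].append((worker, label))

    # Step 1: plain majority vote per task.
    majority = {task: Counter(l for _, l in wl).most_common(1)[0][0]
                for task, wl in by_task.items()}

    # Step 2: estimate each worker's accuracy against the majority answers.
    correct, total = Counter(), Counter()
    for worker, task, label in votes:
        total[worker] += 1
        if label == majority[task]:
            correct[worker] += 1
    accuracy = {w: correct[w] / total[w] for w in total}

    # Step 3: re-aggregate with accuracy-weighted votes.
    result = {}
    for task, wl in by_task.items():
        scores = defaultdict(float)
        for worker, label in wl:
            scores[label] += accuracy[worker]
        result[task] = max(scores, key=scores.get)
    return result

# Hypothetical example: three workers label two images.
votes = [
    ("w1", "t1", "cat"), ("w2", "t1", "cat"), ("w3", "t1", "dog"),
    ("w1", "t2", "dog"), ("w2", "t2", "dog"), ("w3", "t2", "dog"),
]
print(consensus(votes))  # {'t1': 'cat', 't2': 'dog'}
```

In this toy run, worker w3 disagrees with the majority on one of two tasks, so its vote carries half the weight of the others in the final round; fully iterative schemes repeat steps 2–3 until the answers stabilize.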

Keywords


crowd computing; human computation; crowdsourcing; social computing; human-machine systems; human factors; systematic literature review


Andrey Vasilyevich Ponomarev — Ph.D. (Eng.), Senior Researcher of the Laboratory of Integrated Automation Systems, St. Petersburg Institute for Informatics and Automation of the Russian Academy of Sciences (SPIIRAN).
Research interests: collective intelligence systems, social computing, crowdsourcing, recommender systems, decision support technologies, discrete optimization.
Number of scientific publications: 46.

E-mail: ponomarev@iias.spb.su
Postal address: 39, 14th Line V.O., St. Petersburg, 199178, Russia
Phone: +7(812)328-8071
Fax: +7(812)328-4450




DOI: http://dx.doi.org/10.15622/sp.54.7