CSC8710: Research Seminar on Scientific Workflows

Big Data Workflow Papers


1.     Paper 1: (DC) Maria Alejandra Rodriguez, Rajkumar Buyya: A Responsive Knapsack-Based Algorithm for Resource Provisioning and Scheduling of Scientific Workflows in Clouds. ICPP 2015:839-848. (the WRPS algorithm) Download. Youtube video

2.     Paper 2: (BDC) Maciej Malawski, Gideon Juve, Ewa Deelman, Jarek Nabrzyski: Cost- and deadline-constrained provisioning for scientific workflow ensembles in IaaS clouds. SC 2012:22. (CloudSim and a workflow generator are used for experiments.) Download (the DPDS and SPSS algorithms). Youtube video. (Level-based deadline distribution. Objective function: maximizing the number of completed workflows from an ensemble under both budget and deadline constraints. Limitations: all VMs are the same (homogeneous resource model), so task placement decisions do not impact the runtime of the tasks (including data transfer time); data transfer time is fixed. Very interesting but strong assumption: these priorities are absolute in the sense that completing a workflow with a given priority is more valuable than completing all other workflows in the ensemble with lower priorities combined.)

3.     Paper 3: (OM) Cui Lin, Shiyong Lu: SCPOR: An elastic workflow scheduling algorithm for services computing. SOCA 2011:1-8. Download. (The SCPOR algorithm) Youtube video

4.     Paper 4: (OM) Cui Lin and Shiyong Lu, SHEFT: An Elastic Workflow Scheduling Algorithm for Cloud Computing, Technical Report TR-BIGDATA-12-2011-LL, Department of Computer Science, Wayne State University, May, 2011. Download. (The SHEFT algorithm) Youtube video

5.     Paper 5: (OM) Haluk Topcuoglu, Salim Hariri, Min-You Wu: Performance-Effective and Low-Complexity Task Scheduling for Heterogeneous Computing. IEEE Trans. Parallel Distrib. Syst. (TPDS) 13(3):260-274 (2002). Download. (The HEFT algorithm and the CPOP algorithm) CPOP youtube

6.     Paper 6: Nabeel Mohamed, Nabanita Maji, Jing Zhang, Nataliya Timoshevskaya, Wu-chun Feng: Aeromancer: A Workflow Manager for Large-Scale MapReduce-Based Scientific Workflows. TrustCom 2014: 739-746 Download. Youtube video

7.     Paper 7: Dong Yuan, Yun Yang, Xiao Liu, Jinjun Chen: A data placement strategy in scientific cloud workflows. Future Generation Comp. Syst. (FGCS) 26(8):1200-1214 (2010). Download.

8.     Paper 8: (BDC) Hamid Arabnejad, Jorge G. Barbosa, Radu Prodan: Low-time complexity budget-deadline constrained workflow scheduling on heterogeneous resources. Future Generation Comp. Syst. (FGCS) 55:29-40 (2016). Download. Youtube video. (The DBCS algorithm: no optimization; aims to quickly find a feasible solution that satisfies both budget and deadline constraints for a bounded number of heterogeneous resources. Advantage: low-complexity planning time, O(n^2*p).)

9.     Paper 9: Jianwu Wang, Daniel Crawl, Ilkay Altintas, Weizhong Li: Big Data Applications Using Workflows for Data Parallel Computing. Computing in Science and Engineering (CSE) 16(4):11-21 (2014). Download.

10.                        Paper 10: (DC) Saeid Abrishami, Mahmoud Naghibzadeh, Dick H. J. Epema: Deadline-constrained workflow scheduling algorithms for Infrastructure as a Service Clouds. Future Generation Comp. Syst. (FGCS) 29(1):158-169 (2013). Download. (the IC-PCP algorithm). Youtube video

11.                        Paper 11: Andrey Kashlev, Shiyong Lu: A System Architecture for Running Big Data Workflows in the Cloud. IEEE SCC 2014:51-58. Download.

12.                        Paper 12: Andrew Wylie, Wei Shi, Jean-Pierre Corriveau, Yang Wang: A Scheduling Algorithm for Hadoop MapReduce Workflows with Budget Constraints in the Heterogeneous Cloud. IPDPS Workshops 2016: 1433-1442. Download.

13.                        Paper 13: Andrey Kashlev, Shiyong Lu, and Artem Chebotko, Typetheoretic Approach to the Shimming Problem in Scientific Workflows, IEEE Transactions on Services Computing (TSC), 8(5), pp.795-809, 2015. Download.

14.                        Paper 14: Xubo Fei, Shiyong Lu, and Cui Lin, "A MapReduce-Enabled Scientific Workflow Composition Framework", IEEE International Conference on Web Services (ICWS), pp.663-670, Los Angeles, CA, 2009 Download.

15.                        Paper 15: Somayeh Kianpisheh, Nasrollah Moghadam Charkari, Mehdi Kargahi: Reliability-driven scheduling of time/cost-constrained grid workflows. Future Generation Comp. Syst. (FGCS) 55:1-16 (2016). Download. (reliability paper)

16.                        Paper 16: Goncalves, Carlos, Luis Assuncao, and Jose C. Cunha. "Data analytics in the cloud with flexible mapreduce workflows." In Cloud Computing Technology and Science (CloudCom), 2012 IEEE 4th International Conference on, pp. 427-434. IEEE, 2012. Download.

17.                        Paper 17: Liu, Li, Miao Zhang, Rajkumar Buyya, and Qi Fan. "Deadline-constrained coevolutionary genetic algorithm for scientific workflow scheduling in cloud computing." Concurrency and Computation: Practice and Experience 29, no. 5 (2017). Download.

18.                        Paper 18: (BC) Zheng, Wei, and Rizos Sakellariou. "Budget-deadline constrained workflow planning for admission control." Journal of grid computing 11, no. 4 (2013): 633-651. Download. (The BHEFT algorithm; considers existing workload allocation, L1) Youtube Video

19.                        Paper 19: (BC) Arabnejad, Hamid, and Jorge G. Barbosa. "A budget constrained scheduling algorithm for workflow applications." Journal of Grid Computing 12, no. 4 (2014): 665-679. Download. (The HBCS algorithm) (L1: a bounded number of heterogeneous resources). Youtube video

20.                        Paper 20: Prodan, Radu, and Marek Wieczorek. "Bi-criteria scheduling of scientific grid workflows." Automation Science and Engineering, IEEE Transactions on 7, no. 2 (2010): 364-376. Download. (The DCA algorithm)

21.                        Paper 21: (BDC) Yu, Jia, and Rajkumar Buyya. "Scheduling scientific workflow applications with deadline and budget constraints using genetic algorithms." Scientific Programming 14, no. 3-4 (2006): 217-230. Download. (The GA algorithm, Evolutionary approaches). Youtube video.

22.                        Paper 22: Tsai, Chun-Wei, and Joel JPC Rodrigues. "Metaheuristic scheduling for cloud: A survey." Systems Journal, IEEE 8, no. 1 (2014): 279-291. Download.

23.                        Paper 23: (BC) Sakellariou, Rizos, Henan Zhao, Eleni Tsiakkouri, and Marios D. Dikaiakos. "Scheduling workflows with budget constraints." In Integrated research in GRID computing, pp. 189-202. Springer US, 2007. Download. (The LOSS1 algorithm) Youtube video (L1: does not consider data transfer cost and data storage cost; not suitable for big data)

24.                        Paper 24: Kunal Agrawal, Anne Benoit, Loic Magnan, Yves Robert: Scheduling algorithms for linear workflow optimization. IPDPS 2010:1-12. Download.

25.                        Paper 25: Singh, Gurmeet, Carl Kesselman, and Ewa Deelman. "A provisioning model and its comparison with best-effort for performance-cost optimization in grids." In Proceedings of the 16th international symposium on High performance distributed computing, pp. 117-126. ACM, 2007. Download. (Evolutionary approaches)

26.                        Paper 26: Talukder, A. K. M., Michael Kirley, and Rajkumar Buyya. "Multiobjective differential evolution for scheduling workflow applications on global Grids." Concurrency and Computation: Practice and Experience 21, no. 13 (2009): 1742-1756. Download. (Evolutionary approaches)

27.                        Paper 27: Yu, Jia, Michael Kirley, and Rajkumar Buyya. "Multi-objective planning for workflow execution on grids." In Proceedings of the 8th IEEE/ACM International conference on Grid Computing, pp. 10-17. IEEE Computer Society, 2007. Download. (Evolutionary approaches)

28.                        Paper 28: De Oliveira, Daniel, Kary ACS Ocana, Fernanda Baiao, and Marta Mattoso. "A provenance-based adaptive scheduling heuristic for parallel scientific workflows in clouds." Journal of Grid Computing 10, no. 3 (2012): 521-552. Download.

29. Paper 29: Khalifa, Ahmed E., Iman Elghandour, and Nagwa El-Makky. "IncReStore: Incremental computation of mapreduce workflows." In Data Engineering Workshops (ICDEW), 2016 IEEE 32nd International Conference on, pp. 39-46. IEEE, 2016. Download

30.                        Paper 30: Song, Aibo, Zhiang Wu, Xu Ma, and Junzhou Luo. "CAT: A Cost-Aware Translator for SQL-query workflow to MapReduce jobflow." Data and Knowledge Engineering 102 (2016): 42-56. Download.

31.                        Paper 31: Marozzo, Fabrizio, Domenico Talia, and Paolo Trunfio. "A cloud framework for big data analytics workflows on azure." Cloud Computing and Big Data 23 (2013): 182. Download. Youtube Video

32.                        Paper 32: Vahi, Karan, Mats Rynge, Gideon Juve, Rajiv Mayani, and Ewa Deelman. "Rethinking data management for big data scientific workflows." In Big Data, 2013 IEEE International Conference on, pp. 27-35. IEEE, 2013.  Download. Youtube video

33.                        Paper 33: Juve, Gideon, Ewa Deelman, Karan Vahi, Gaurang Mehta, Bruce Berriman, Benjamin P. Berman, and Phil Maechling. "Scientific workflow applications on Amazon EC2." In 2009 5th IEEE International Conference on E-Science Workshops, pp. 59-66. IEEE, 2009. Download. Youtube video

34.                        Paper 34: Kranjc, Janez, Roman Orač, Vid Podpečan, Nada Lavrač, and Marko Robnik-Šikonja. "ClowdFlows: Online workflows for distributed big data mining." Future Generation Computer Systems (2016). Download. Youtube Video

35.                        Paper 35: Perovšek, Matic, Janez Kranjc, Tomaž Erjavec, Bojan Cestnik, and Nada Lavrač. "TextFlows: A visual programming platform for text mining and natural language processing." Science of Computer Programming 121 (2016): 128-152. Download. Youtube Video  

36.                        Paper 36: Rak, Rafal, Andrew Rowley, William Black, and Sophia Ananiadou. "Argo: an integrative, interactive, text mining-based workbench supporting curation." Database 2012 (2012): bas010. Download. Youtube video

37.                        Paper 37: Kano, Yoshinobu, Paul Dobson, Mio Nakanishi, Jun'ichi Tsujii, and Sophia Ananiadou. "Text mining meets workflow: linking U-Compare with Taverna." Bioinformatics 26, no. 19 (2010): 2486-2487. Download. Youtube Video

38.                        Paper 38: Kano, Yoshinobu, Makoto Miwa, K. Bretonnel Cohen, Lawrence E. Hunter, Sophia Ananiadou, and Jun’ichi Tsujii. "U-Compare: A modular NLP workflow construction and evaluation system." IBM Journal of Research and Development 55, no. 3 (2011): 11-1. Download. Youtube Video 

39.                        Paper 39: Xiao Fu, Riza Theresa Batista-Navarro, Rafal Rak, Sophia Ananiadou: Supporting the annotation of chronic obstructive pulmonary disease (COPD) phenotypes with text mining workflows. J. Biomedical Semantics 6: 8 (2015). Download. Youtube Video

40.                        Paper 40: Mehedi Hasan, Alexander Kotov, April Idalski Carcone, Ming Dong, Sylvie Naar, Kathryn Brogan Hartlieb, “A study of the effectiveness of machine learning methods for classification of clinical interview fragments into a large number of categories”. Journal of Biomedical Informatics 62: 21-31 (2016). Download.

41.                        Paper 41: Ishtiaq Ahmed, Rahman Ali, Donghai Guan, Young-Koo Lee, Sungyoung Lee, TaeChoong Chung: Semi-supervised learning using frequent itemset and ensemble learning for SMS classification. Expert Syst. Appl. 42(3): 1065-1073 (2015). Download.

42.                        Paper 42: (DC) Saeid Abrishami, Mahmoud Naghibzadeh, Dick H. J. Epema: Cost-Driven Scheduling of Grid Workflows Using Partial Critical Paths. IEEE Trans. Parallel Distrib. Syst. 23(8): 1400-1414 (2012). Download. (The PCP algorithm). Youtube video

43.                        Paper 43: Lin, Cui, Shiyong Lu, Xubo Fei, Darshan Pai, and Jing Hua. "A task abstraction and mapping approach to the shimming problem in scientific workflows." In IEEE International Conference on Services Computing, pp. 284-291, 2009. Download.

44.                        Paper 44: (Survey) Smanchat, Sucha, and Kanchana Viriyapant. "Taxonomies of workflow scheduling problem and techniques in the cloud." Future Generation Computer Systems 52 (2015): 1-12. Download.