Goal
The goal of this project is to develop a hierarchical
scientific workflow using the Taverna system to explore the notion of “provenance”.
Description
Provenance management is essential for scientific
workflows to support scientific discovery, reproducibility, result
interpretation, and problem diagnosis, while such a facility is usually not
necessary for business workflows. Provenance metadata captures the derivation
history of a data product, including the original data sources, intermediate
data products, and the steps that were applied to produce the data product. The
provenance management problem concerns the efficiency and effectiveness
of the collection, representation, storage, querying, and visualization of
provenance metadata.
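To make this concrete, the following minimal Python sketch models provenance
as a small derivation graph and traces one data product back to its original
sources, intermediate products, and applied steps. The names used here
(`Step`, `lineage`, and the sample file names) are hypothetical illustrations
and are not part of Taverna or of any required format.

```python
from dataclasses import dataclass


@dataclass
class Step:
    """One processing step: the inputs it consumed and the output it produced."""
    name: str
    inputs: list
    output: str


def lineage(product, steps):
    """Return (original_sources, intermediate_products, applied_steps) for a product."""
    produced_by = {s.output: s for s in steps}
    sources, intermediates, applied = set(), set(), []
    stack, seen = [product], set()
    while stack:
        item = stack.pop()
        if item in seen:
            continue
        seen.add(item)
        step = produced_by.get(item)
        if step is None:
            sources.add(item)  # no step produced it, so it is an original source
            continue
        if item != product:
            intermediates.add(item)
        applied.append(step.name)
        stack.extend(step.inputs)
    return sources, intermediates, applied


# Hypothetical three-step workflow, used only for illustration.
steps = [
    Step("clean", ["raw.csv"], "clean.csv"),
    Step("align", ["clean.csv", "reference.fa"], "aligned.bam"),
    Step("plot", ["aligned.bam"], "figure.png"),
]
sources, intermediates, applied = lineage("figure.png", steps)
print(sorted(sources))        # original data sources
print(sorted(intermediates))  # intermediate data products
print(applied)                # steps, from the final product backwards
```

Walking the graph backwards from the final product is one simple way to answer
the "where did this result come from?" question that provenance is meant to
support.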
In this project, you will use the Taverna system to extend the scientific
workflow you implemented for project 1 so that it meets the additional
requirements below. Alternatively, you may design and implement an entirely
new scientific workflow. In either case, the workflow must satisfy the
following requirements:
1) The workflow should contain at least 10 tasks and must be hierarchical
(i.e., contain composite tasks).
2) At least two of the workflow tasks must be transactions, and at least two
of the workflow tasks must be Web services.
It must be possible to run your workflow multiple times and then export a
provenance file called provenance.txt that contains the provenance collected
from all workflow runs. Please ask the TA for additional papers if you need
to understand more about provenance.
The format of provenance.txt is not fixed, but you need to define it clearly
so that other people can understand it (RDF or XML is preferred but not
mandatory).
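As one possible illustration of an XML layout, the Python sketch below writes
several workflow runs into a single provenance.txt file. The element and
attribute names (`provenance`, `run`, `task`, `input`, `output`) and the
sample run data are assumptions made for this example only; they are not a
required schema and are not what Taverna itself exports.

```python
import xml.etree.ElementTree as ET


def export_runs(runs, path="provenance.txt"):
    """Write all workflow runs into one XML document at `path`."""
    root = ET.Element("provenance")
    for run in runs:
        run_el = ET.SubElement(root, "run", id=run["id"])
        for task in run["tasks"]:
            task_el = ET.SubElement(run_el, "task", name=task["name"])
            for src in task["inputs"]:
                ET.SubElement(task_el, "input").text = src
            ET.SubElement(task_el, "output").text = task["output"]
    ET.ElementTree(root).write(path, encoding="utf-8", xml_declaration=True)
    return root


# Hypothetical data for two runs of the same workflow.
runs = [
    {"id": "run-1",
     "tasks": [{"name": "fetch", "inputs": ["query.txt"], "output": "hits.xml"}]},
    {"id": "run-2",
     "tasks": [{"name": "fetch", "inputs": ["query2.txt"], "output": "hits2.xml"}]},
]
root = export_runs(runs)
```

Whatever layout you choose, keeping one element per run makes it easy for a
reader to tell the runs apart in the combined file.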
Submission
Send a zip file to the TA via Digital Dropbox in Blackboard containing all
source code and necessary files, including provenance.txt. The zip file
should include “read.txt”, which explains how to compile and run your system
and gives any information the TA needs in order to grade, such as the names
of your teammates. Name the zip file after yourself: for project x, if your
name is “David Smith”, the file should be named “david_smith_projectx.zip”.
The name (subject) of your submission in Digital Dropbox should be
“Projectx_WSU access ID”, for example: Projectx_aq1111.