After you have set up your programming environment and created all the accounts needed to run the process, as described on this guide’s Prerequisites page, you are ready to begin the process outlined in the step-by-step tutorial. Below is a high-level breakdown of how each step in the process works. You can find a description of how the codebase works in the README on the wos-findbyexport GitHub page.
Figure 1: Web of Science Find by Export File jobs represented as a directed acyclic graph (DAG).
The file you submit to the computing cluster defines the overall process as a directed acyclic graph (DAG). It gives a macro-level overview of the entire process as it will be carried out on the CHTC servers. You can view the DAG file in the chtc-recipes GitHub repository: wos-findbywosexport.dag. The DAG file simply schedules the order in which the jobs are submitted to the CHTC servers.
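To make the idea of a DAG-based schedule more concrete, here is a minimal sketch in Python. It is not the contents of the actual DAG file and it does not use the CHTC scheduling software. The job names (step1, step2_a, and so on) are hypothetical, and the ordering is computed with Python's standard graphlib module (Python 3.9 or later) purely for illustration. Each job lists the jobs that must finish before it can start, and the sketch groups the jobs into "waves" that could be submitted together.

```python
from graphlib import TopologicalSorter

# Hypothetical job names for illustration only; the real jobs are
# defined in the DAG file in the chtc-recipes repository.
# Each key maps a job to the set of jobs that must finish before it starts.
dependencies = {
    "step1": set(),
    "step2_a": {"step1"},
    "step2_b": {"step1"},
    "step2_c": {"step1"},
    "step3": {"step2_a", "step2_b", "step2_c"},
    "step4": {"step3"},
}


def run_in_waves(deps):
    """Group jobs into waves; a wave starts only after the previous one ends."""
    sorter = TopologicalSorter(deps)
    sorter.prepare()  # also raises CycleError if the graph is not acyclic
    waves = []
    while sorter.is_active():
        ready = sorted(sorter.get_ready())  # jobs whose parents are all done
        waves.append(ready)
        sorter.done(*ready)  # mark the whole wave finished before continuing
    return waves


waves = run_in_waves(dependencies)
for number, wave in enumerate(waves, start=1):
    print(f"wave {number}: {', '.join(wave)}")
```

In this sketch, the three step2 jobs become ready together once step1 is done, and step3 does not become ready until all three of them have been marked as finished.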
In the course of running through the DAG's schedule, four main processes are completed (see Steps 1 through 4 in the diagram above). One important feature of this process is that every stage must be completed before the next one can begin. For example, even though some sub-jobs in Step 2 finish before others, the process waits until all Step 2 jobs are complete before moving on to Step 3. This is a crucial feature of the process, and it is what defines the unidirectional, discrete nature of the DAG file, which schedules the entire process: