Integrating intelligent methods for scheduling in grid computing systems

Date of Completion

January 2004


Artificial Intelligence|Computer Science




Grid computing systems aim to enable the sharing, selection, and aggregation of a wide variety of resources, including supercomputers, storage systems, and data sources, that are geographically distributed and owned by different organizations. These resources can collaborate to solve large-scale computational and data-intensive problems. Without good scheduling, however, the benefits of the Grid system may go unrealized.

Grid scheduling is the process of deciding how to allocate jobs to resources across multiple administrative domains. Scheduling an application in a Grid environment is complicated by the heterogeneous nature of a Grid system and by potential fluctuations in resources such as CPU load, memory usage, and available network bandwidth.

The thesis presents an integrated architecture for scheduling in Grid systems based on AI techniques, including association rule mining for knowledge discovery, a constraint representation of the scheduling problem, and two heuristic search methods: genetic algorithms and hill climbing with a tabu list. The architecture also employs methods from other fields, including relational database systems for backbone information and the statistical methods of weighted averaging and nonlinear regression. The goal of the integrated system is to use the information services of the Grid effectively to extract knowledge that helps the scheduler make better decisions, and to use good representation and heuristic methods to reduce the search space of the scheduling problem.

We define three modules within the architecture: information gathering and knowledge discovery, application run-time prediction, and intelligent scheduling. The functions of the three modules and their interactions within the integrated architecture are defined. Results of experiments that test the integrated scheduler and its modules in different environments and under different objectives are presented.
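As a rough illustration of the kind of heuristic search the abstract names, the following is a minimal sketch of hill climbing with a tabu list applied to a job-to-resource assignment. It is not the thesis's scheduler: the objective (makespan), the neighborhood (reassigning one job), the tabu tenure, and the names `makespan`, `tabu_hill_climb`, and `runtimes` are all assumptions made for this example.

```python
import random


def makespan(assignment, runtimes, n_resources):
    """Finish time of the busiest resource (assumed objective)."""
    loads = [0.0] * n_resources
    for job, res in enumerate(assignment):
        loads[res] += runtimes[job][res]
    return max(loads)


def tabu_hill_climb(runtimes, n_resources, iterations=200, tabu_size=10, seed=0):
    """Hill climbing with a tabu list over job-to-resource assignments.

    runtimes[j][r] is the (hypothetical) predicted run time of job j on
    resource r. Each step moves one job to another resource, picking the
    best non-tabu move even if it worsens the cost.
    """
    rng = random.Random(seed)
    n_jobs = len(runtimes)
    current = [rng.randrange(n_resources) for _ in range(n_jobs)]
    best, best_cost = current[:], makespan(current, runtimes, n_resources)
    tabu = []  # recently reversed (job, resource) pairs that may not be reused yet

    for _ in range(iterations):
        candidates = []
        for job in range(n_jobs):
            for res in range(n_resources):
                if res == current[job] or (job, res) in tabu:
                    continue
                neighbor = current[:]
                neighbor[job] = res
                candidates.append((makespan(neighbor, runtimes, n_resources), job, res))
        if not candidates:
            break
        cost, job, res = min(candidates)
        # Forbid moving this job straight back to the resource it just left.
        tabu.append((job, current[job]))
        if len(tabu) > tabu_size:
            tabu.pop(0)
        current[job] = res
        if cost < best_cost:
            best, best_cost = current[:], cost
    return best, best_cost


if __name__ == "__main__":
    # Four jobs, two resources; run times are made-up illustrative numbers.
    example = [[2, 4], [3, 1], [4, 4], [1, 3]]
    print(tabu_hill_climb(example, n_resources=2))
```

The tabu list lets the search accept a worsening move to escape a local optimum without immediately undoing it. In an architecture like the one described, the `runtimes` table would presumably come from the run-time prediction module, but that coupling is also an assumption here.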