Title of invention |
Technique for stateless distributed parallel crawling of interactive client-server applications |
Abstract |
A distributed computing system includes worker nodes and a master node including a processor coupled to a memory. Each worker node crawls a portion of an interactive client-server application. The memory includes a master state graph, which includes the results of crawling. The master node is configured to examine the master state graph to determine a number of reconverging traces, receive a result from a job from a worker node if the number of reconverging traces is below a threshold, and add the result to the master state graph without attempting to remove duplicate states or transitions. A trace includes states and transitions representing valid behavior of the interactive client-server application. A reconvergent trace is a trace that includes a reconvergent state, which is a state that can be reached through two or more distinct traces. The result, containing states and transitions, is associated with crawling a first portion of the interactive client-server application. |
Publication number |
US8880588(B2) |
Publication date |
2014.11.04 |
Application number |
US201012957381 |
Filing date |
2010.11.30 |
Applicant |
Fujitsu Limited |
Inventor |
Prasad Mukul Ranjan |
Classification |
G06F15/16;G06F11/36;G06F9/50;G06Q10/06 |
Main classification |
G06F15/16 |
Agency |
Baker Botts L.L.P. |
Agent |
Baker Botts L.L.P. |
Independent claim |
1. A distributed computing system, comprising:
a plurality of worker nodes, each configured to crawl a portion of an interactive client-server application comprising a dynamic web application; and a master node comprising a processor coupled to a memory, the memory comprising a master state graph, the master state graph comprising:
the results of at least one of the worker nodes crawling a portion of the interactive client-server application; and
a screen transition graph model of the interactive client-server application; wherein the master node is configured to:
examine the master state graph to determine a number of reconverging traces, wherein:
a trace comprises an alternating sequence of states and transitions representing valid behavior of the interactive client-server application;
a reconvergent trace comprises a trace comprising a reconvergent state; and
a reconvergent state is a state that can be reached through two or more distinct traces of the web application's behavior; and
determine that the number of reconverging traces is below a threshold; and
based on the determination that the number of reconverging traces is below a threshold:
receive a first result from the execution of a first job from a first worker node, the result containing states and transitions associated with crawling a first portion of the interactive client-server application; and
add the first result to the master state graph without attempting to remove duplicate states or transitions. |
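The merge policy recited in the claim can be illustrated with a minimal Python sketch. It is not the patented implementation. The names `StateGraph`, `count_reconvergent_states`, and `merge_result` are hypothetical, and the sketch assumes, as a simplification, that the "number of reconverging traces" can be approximated by counting states entered through two or more distinct transitions. What the master node does when the threshold is reached is not stated in the claim, so the sketch simply leaves the graph untouched in that case.

```python
from collections import defaultdict
from dataclasses import dataclass, field


@dataclass
class StateGraph:
    """Hypothetical container for crawl results: states plus labelled transitions."""
    states: list = field(default_factory=list)
    transitions: list = field(default_factory=list)  # (source, action, target)


def count_reconvergent_states(graph):
    """Proxy for the number of reconverging traces (an assumption of this sketch):
    count states entered by two or more distinct (source, action) pairs."""
    incoming = defaultdict(set)
    for source, action, target in graph.transitions:
        incoming[target].add((source, action))
    return sum(1 for preds in incoming.values() if len(preds) >= 2)


def merge_result(master, result, threshold):
    """Append a worker's result without deduplication when reconvergence is low.

    Returns True if the result was appended, False if the master graph was
    left unchanged because the count reached the threshold.
    """
    if count_reconvergent_states(master) >= threshold:
        return False
    # No duplicate removal: repeated states and transitions are kept as-is.
    master.states.extend(result.states)
    master.transitions.extend(result.transitions)
    return True


# Usage: the worker result repeats state "s1", and the duplicate is retained.
master = StateGraph(["s0", "s1"], [("s0", "click#a", "s1")])
result = StateGraph(["s1", "s2"], [("s1", "click#b", "s2")])
merged = merge_result(master, result, threshold=1)
```

Skipping deduplication avoids comparing states on every merge, which is a reasonable trade-off only while few traces reconverge; the threshold check is what guards that condition.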
Address |
Kawasaki-shi JP |