Title of invention: UNIFORM RESOURCE LOCATOR CANONICALIZATION
Abstract: A computer-implemented method includes receiving a plurality of uniform resource locators (URLs), where the URLs identify content files, and where the URLs include at least one parameter. Fingerprints of the content files are identified. A first entropy of values of the fingerprints conditional on values of a first parameter is determined, and a second entropy of values of the first parameter conditional on values of the fingerprints is determined. It is determined that the first parameter is irrelevant to the identification of a unique content file by the URLs based, at least in part, on the first and second entropy values.
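The two conditional entropies named in the abstract can be illustrated with a minimal sketch. It assumes each crawled URL has already been reduced to a (parameter value, fingerprint) pair. The function names, the sample data, and the thresholds `low` and `high` are hypothetical and are not taken from the patent; the abstract says only that the decision is based, at least in part, on the two entropy values.

```python
import math
from collections import Counter

def conditional_entropy(pairs):
    """Return H(Y | X) in bits for a list of (x, y) observations."""
    n = len(pairs)
    joint = Counter(pairs)                      # counts of (x, y)
    marginal_x = Counter(x for x, _ in pairs)   # counts of x
    h = 0.0
    for (x, _), count in joint.items():
        # p(x, y) * log2 p(y | x)
        h -= (count / n) * math.log2(count / marginal_x[x])
    return h

def entropies(observations):
    """observations: list of (parameter_value, fingerprint) pairs.

    Returns (H(fingerprint | parameter), H(parameter | fingerprint)).
    """
    swapped = [(f, p) for p, f in observations]
    return conditional_entropy(observations), conditional_entropy(swapped)

def looks_irrelevant(observations, low=0.1, high=1.0):
    """Hypothetical decision rule (thresholds are assumptions, not from the source).

    The parameter is flagged when the fingerprint is almost fully determined
    once the parameter is known, while the parameter still varies widely
    across URLs that share a fingerprint.
    """
    h_f_given_p, h_p_given_f = entropies(observations)
    return h_f_given_p <= low and h_p_given_f >= high

# Four session-id-like values that all lead to the same content.
session_like = [("a", "f1"), ("b", "f1"), ("c", "f1"), ("d", "f1")]
# Four page-id-like values that each lead to different content.
page_like = [("1", "f1"), ("2", "f2"), ("3", "f3"), ("4", "f4")]

print(entropies(session_like), looks_irrelevant(session_like))
print(entropies(page_like), looks_irrelevant(page_like))
```

In the first sample the fingerprint never changes, so the entropy of the parameter given the fingerprint is 2 bits and the parameter is flagged. In the second sample each value maps to its own fingerprint, both entropies are zero, and the parameter is kept.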
Publication number: US2013144834(A1); Publication date: 2013.06.06
Application number: US20080177111; Filing date: 2008.07.21
Applicant: LLOYD MATTHEW; BERGAN THOMAS; GOOGLE INC. Inventor: LLOYD MATTHEW; BERGAN THOMAS
Classification: G06F17/30; Main classification: G06F17/30
Agency: Agent:
Main claim:
Address: