Title of invention: METHOD FOR THE FULLY MODIFIABLE FRAMEWORK DISTRIBUTION OF DATA IN A DATA WAREHOUSE TAKING ACCOUNT OF THE PRELIMINARY ETYMOLOGICAL SEPARATION OF SAID DATA
Abstract: A method for the fully modifiable framework distribution of data in a data warehouse, taking account of the preliminary etymological separation of said data, is based on the framework model of data. The totality of entity-objects that relate to a particular abstract domain is distributed automatically into five groups: atomic, composite, and weak entity-objects; artifacts, i.e. entity-copies whose data are conventionally placed in the warehouse; and a group of indefinite entity-objects, whose semantics are subject to further specification. The method allows the groups of algorithms and the separation criteria to be replenished, each of which allows a particular entity-object to be classified more accurately into the above-mentioned groups. Applying them consecutively speeds up the process and shrinks the fifth group, the group of indefinite entity-objects, which have contradictory characteristics and can equally be assigned to different groups. Several algorithms are presented: an algorithm based on a dictionary of entity-objects, which is available in public networks and is constantly replenished, and on functional dependencies between the data of the entity-objects, which allows the entity-objects to be compared with each other; an algorithm for tracking repeating entity-objects in binary pairs; an algorithm for the statistical analysis of determinized or multi-valued dependencies; and algorithms for successive-approximation modifications on the framework-template of connections. This pre-separation of the set of entity-objects in the abstract domains makes it possible to use simultaneously both the relational properties and, for example, an object-oriented model of data distribution.
This provides the option to account for some artifacts, for which multiple domain masks are formed in the warehouse, each of which is assigned an identification key corresponding to its structure. By taking the Cartesian products of the masks with one another on an "each on each" principle, a complete set of composite entity-objects is obtained. Semantically incompatible results are then set aside from the obtained tables, for example the result of multiplying two weak entity-objects that have a common ancestor. Thus, logical and physical data schemas that are equivalent to each other are obtained. This enables the use of relational capabilities in a data warehouse that is physically distributed across different servers. The method also addresses the standardization of data warehouse schema creation.
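The "each on each" multiplication of masks and the subsequent removal of semantically incompatible results can be illustrated with a minimal Python sketch. It is not the patented implementation: the `Mask` class, its `key`, `kind`, and `ancestor` fields, and the sample entity names are hypothetical, and treating each pair as unordered (dropping self-pairs and mirrored duplicates) is an assumption the abstract does not state. The only incompatibility rule encoded is the one example the abstract gives, two weak entity-objects with a common ancestor.

```python
from dataclasses import dataclass
from itertools import product
from typing import List, Optional, Tuple


@dataclass(frozen=True)
class Mask:
    """Hypothetical domain mask with its identification key."""
    key: str                        # identification key (format is an assumption)
    kind: str                       # "atomic" or "weak" in this sketch
    ancestor: Optional[str] = None  # key of the parent entity-object, for weak masks


def incompatible(a: Mask, b: Mask) -> bool:
    """True for two weak entity-objects that share a common ancestor."""
    return (
        a.kind == "weak"
        and b.kind == "weak"
        and a.ancestor is not None
        and a.ancestor == b.ancestor
    )


def composite_candidates(masks: List[Mask]) -> List[Tuple[str, str]]:
    """Multiply masks "each on each" and set aside incompatible pairs."""
    pairs = []
    for a, b in product(masks, repeat=2):
        if a.key >= b.key:
            continue  # assumption: skip self-pairs and mirrored duplicates
        if incompatible(a, b):
            continue  # set aside the semantically incompatible product
        pairs.append((a.key, b.key))
    return pairs


# Hypothetical sample data, for illustration only.
masks = [
    Mask("CUSTOMER", "atomic"),
    Mask("PRODUCT", "atomic"),
    Mask("PHONE", "weak", ancestor="CUSTOMER"),
    Mask("EMAIL", "weak", ancestor="CUSTOMER"),
]
candidates = composite_candidates(masks)
```

With this sample data, the pair of `EMAIL` and `PHONE` is set aside because both are weak and descend from `CUSTOMER`, while pairs such as `CUSTOMER` with `PRODUCT` remain as composite candidates.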
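Publication number: US2011307440(A1) Publication date: 2011.12.15
Application number: US201113215250 Application date: 2011.08.23
Applicant: PANCHENKO BORYS EVGENIJOVICH Inventor: PANCHENKO BORYS EVGENIJOVICH
Classification: G06F17/30 Main classification: G06F17/30
Agency: Agent:
Principal claim:
Address: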