
FIDIS 

Future of Identity in the Information Society (No. 507512)

 

D2.3 

  

[Final], Version: 2.0 



File: fidis-wp2-del2.3.models.doc 

Page 18 

 

2.3.2  The extraction from data sources and from processes 

In this case, the values associated with the attributes originate from two different sources: (1) databases; and (2) processes.

In the first case, the databases may be governmental databases (such as police or tax records), human resource databases (enterprise resource planning and knowledge management systems holding, for example, payroll or training information) or health record databases (managed by hospitals or by social security units).

In the second case, the data originate from processes that capture them as they run (and that then store them in databases). Examples of such processes include e-commerce systems (such as Amazon) and loyalty programs, which can capture the history of the transactions associated with each customer, and virtual community systems, which can capture the history of each member’s activities (such as age in the community and number of postings).
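As a minimal sketch of how such process-captured attributes could be derived, the Python fragment below computes, for each member of a virtual community, the age in the community and the number of postings from a log of events. The log format, the member names and the function name `member_attributes` are assumptions made for illustration; they are not taken from any particular system.

```python
from collections import Counter
from datetime import date

# Hypothetical activity log of a virtual community: (member, event, day).
events = [
    ("alice", "join", date(2004, 1, 10)),
    ("bob", "join", date(2004, 6, 1)),
    ("alice", "post", date(2004, 2, 3)),
    ("alice", "post", date(2004, 7, 15)),
    ("bob", "post", date(2004, 7, 20)),
]

def member_attributes(events, today):
    """Derive per-member attribute values from the captured events."""
    joined = {member: day for member, kind, day in events if kind == "join"}
    postings = Counter(member for member, kind, _ in events if kind == "post")
    return {
        member: {
            "days_in_community": (today - day).days,
            "postings": postings[member],
        }
        for member, day in joined.items()
    }

attrs = member_attributes(events, today=date(2005, 1, 10))
print(attrs["alice"])
```

The member takes no explicit action to supply these values: they follow entirely from the recorded activity.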

The type 1 IMS (organisational function), presented previously, is a typical category of system that employs this method, although the method can also be used in the type 3 IMS (individual function).

The personal data that are present in databases or captured via a set of processes are mostly outside the user’s control (the possibilities of correction by the end user are often limited). These data are also often heavily regulated by legislation that specifies the type of data that can be recorded and the permitted uses of the data, including the combining of databases.

Even if this mode of collecting personal data appears to be more intrusive to people’s privacy, it is not without advantages, even for the people themselves. First, the data captured by this means can be considered much more reliable, since they directly reflect the activities of people, and not only the perception of these activities. Second, because this data collection is automatic, it can be considered less demanding for the end users.

Many of the attributes whose values can be recorded in this way are characteristics that have a certain level of permanence, while other categories of personal information can include all the transactions (commercial or not) in which the people have been engaged.

 

2.3.3  Data calculated and inferred from other attributes 

In this case, unknown values associated with particular attributes are calculated from other attributes (typically the ones that have been obtained by the previous two methods). This category is relatively similar to the category previously described; however, it differs in the level of sophistication of the systems that make use of it. Notably, calculated attributes are more frequently used in Type 3 IMS (individual function) applications, which use them to provide some level of adaptability (for instance in e-learning systems or e-commerce systems).

The reliability of these calculated attributes is generally lower than that of non-calculated attributes. For instance, in Amazon the assertion “a customer that has bought a book about children is interested in children and is likely to buy other books about children” is only correct on average, since the customer may have bought this book only once, as a present for somebody else.
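The limited reliability of such an inference can be illustrated with a short Python sketch. The function `inferred_interests`, the topic labels and the threshold `min_count` are hypothetical and do not describe how Amazon or any other real system works; the point is only that an interest inferred from purchases is a guess with a degree of confidence, and that a single purchase is weak evidence.

```python
from collections import Counter

def inferred_interests(purchases, min_count=2):
    """Infer topics of interest from the topics of purchased books.

    A topic is inferred only if it was bought at least `min_count` times;
    its share of all purchases is kept as a rough confidence value.
    """
    counts = Counter(purchases)
    total = len(purchases)
    return {topic: n / total for topic, n in counts.items() if n >= min_count}

# A customer who bought one book about children as a present.
gift_buyer = ["cooking", "cooking", "cooking", "children"]
print(inferred_interests(gift_buyer))                # {'cooking': 0.75}

# With min_count=1 the one-off present is read as an interest.
print(inferred_interests(gift_buyer, min_count=1))
```

Lowering the threshold makes the system more adaptive but also more likely to record a calculated attribute that is wrong for this particular person.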





 

The level of control over these calculated attributes is often limited by the simplicity of the algorithm used and by the way it was configured for the calculation. Thus, people who read the values of these attributes usually have, at best, only a vague idea of the underlying principles that have been used. For instance, a calculated attribute could be a level of risk that a bank calculates for a particular client, which results from a combination of the values of attributes such as the gross salary of the person, the assets (such as real estate) that the person may own, their family status, the postal code of their residence, or even their ethnic origin. Another example is found on certain e-commerce websites, where the preferences of a customer are determined automatically.
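The following Python sketch shows what such a calculated attribute might look like under stated assumptions. The `Client` fields, the thresholds and the function `risk_level` are invented for illustration and do not reflect any real bank’s scoring rules; only a subset of the attributes mentioned above is used.

```python
from dataclasses import dataclass

@dataclass
class Client:
    gross_salary: int       # yearly gross salary (hypothetical unit: euros)
    real_estate_value: int  # value of the real estate owned
    dependants: int         # crude stand-in for family status

def risk_level(client, loan):
    """Hypothetical scoring rule; the thresholds are illustrative only."""
    score = 0
    if loan > 3 * client.gross_salary:
        score += 2
    if client.real_estate_value < loan:
        score += 1
    if client.dependants >= 3:
        score += 1
    # Scores 0-1 map to "low", 2 to "medium", 3-4 to "high".
    return ("low", "low", "medium", "high", "high")[score]

print(risk_level(Client(30000, 150000, 0), loan=100000))  # medium
```

Someone who reads only the resulting value ("medium") sees neither the thresholds nor the attributes that contributed to it, which is the lack of transparency described above.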

 

2.3.4  Data extracted via mining the information 

The extraction of values via data mining techniques could appear similar to the calculation methods previously described. The two differ, however, in that data mining algorithms are applied globally to the data of (very large) groups of people, and not to the data set that is associated with a single person. The algorithms used are also of a more statistical and probability-based nature, and often rely on heuristics. Finally, these algorithms may also be used to help create the user model itself, and in particular to help determine the set of attributes required to “summarise” the problem (for instance, in a banking application, an algorithm may determine that knowledge of the age and of the postal code is sufficient to discriminate a reliable customer from an unreliable one, with a limited risk of error).
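As a rough illustration of how an algorithm applied to a whole group can indicate which attributes discriminate best, the Python sketch below ranks attributes in the spirit of a one-rule classifier: each value of an attribute predicts the majority outcome observed for that value, and the attribute is judged by the share of records it classifies correctly. The customer records, attribute names and the function `one_rule_accuracy` are all hypothetical, and real data mining systems use far larger data sets and more elaborate statistical methods.

```python
from collections import Counter, defaultdict

def one_rule_accuracy(records, attribute, label="reliable"):
    """Share of records classified correctly when each value of
    `attribute` predicts the majority label seen for that value."""
    by_value = defaultdict(Counter)
    for record in records:
        by_value[record[attribute]][record[label]] += 1
    correct = sum(c.most_common(1)[0][1] for c in by_value.values())
    return correct / len(records)

# Hypothetical, made-up customer records (not real data).
customers = [
    {"age_band": "18-25", "postal_code": "A", "gender": "f", "reliable": False},
    {"age_band": "18-25", "postal_code": "B", "gender": "m", "reliable": False},
    {"age_band": "26-60", "postal_code": "A", "gender": "f", "reliable": True},
    {"age_band": "26-60", "postal_code": "B", "gender": "m", "reliable": True},
    {"age_band": "26-60", "postal_code": "A", "gender": "m", "reliable": True},
    {"age_band": "18-25", "postal_code": "A", "gender": "f", "reliable": True},
]

attributes = ["age_band", "postal_code", "gender"]
ranking = sorted(
    attributes,
    key=lambda a: one_rule_accuracy(customers, a),
    reverse=True,
)
print(ranking[0])  # age_band
```

The result is a property of the group as a whole: the sixth record shows that the rule derived from the best-ranked attribute is still wrong for some individuals.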

Type 2 IMS (profiling function), presented previously, represents a typical category of systems that employ this method.

The types of attributes that are extracted via mining typically include people-related categories such as social categories or lifestyles. These attributes can be considered to be more abstract and less directly associated with the individuals.

At a more micro level, these attributes can represent user characteristics and behaviours that can be automatically extracted from the use of information systems. For instance, in the context of an e-commerce system such attributes can reflect reliability characteristics (likelihood of fraud), and in the context of a virtual community they can reflect the level of participation (such as the activity of people in SourceForge.net).

 

 



 

 

 



