Desperately seeking user models in information retrieval systems: benefits and limits of cognitivist and marketing approaches

Rosalba Palermiti and Yolla POLITY

Groupe RI3 (Recherche Intelligente et Interactive de l'Information)
Université Pierre Mendès France
IUT2 de Grenoble, Département Information-Communication
Rosalba.Palermiti@iut2.upmf-grenoble.fr
Yolla.Polity@iut2.upmf-grenoble.fr

 


This article was published in 1995 in the British journal The New Review of Information and Library Research, vol. 1, 1995, pp. 57-65.

 

 Abstract

This comparison examines the benefits and limits of two different approaches to the study of users of information retrieval systems. The cognitivist point of view relies on the concept of "user modelling" for man-machine communication, and the marketing point of view tends to enlarge the range of information products and services. Suggestions are made to go beyond the problem of accessibility and to focus on the analysis of the social and professional practices in which information retrieval tasks are embedded.

 

 

In recent years new technological developments have continually increased the range of information products available on the market. The first automated information systems were conceived in the fifties under a policy aimed at a more efficient organization of scientific and technical information. At first intended only for a public of academic, scientific and military experts, information systems were adopted by the private sector in the seventies and spread to all fields of knowledge and to cultural institutions. Information became the "gray oil", the new raw material of the eighties (BOURE and DARREON (1)), heralding a new society in which information plays a vital and forceful role, particularly for business "insiders". In similar fashion, microcomputing initiated an era in which greater volumes of information were made available to a variety of users, especially the mass market. The persisting myth of the democratization of information, available to all, was born at this time.

Reality, however, has not lived up to expectations, since too often users remain merely potential users. Several possible reasons can be found for this lack of interest in information retrieval systems on the part of those seeking information: poor knowledge of hardware, user-unfriendly interfaces, difficulties in formulating queries, little knowledge of users' requests and practices, etc. In order to provide users with more suitable information products it therefore seems vital to study them more thoroughly: to find out their needs, to characterize their groups, and to uncover their difficulties and errors.

Two differing approaches to the users of information systems reflect the interests of the research community: the cognitivist point of view, basically psychological, which tries to perfect man-machine interfaces by using the "user model" concept, and the marketing point of view, basically entrepreneurial and commercial. The latter has gradually encroached upon cultural institutions and, in the information field, is expressed by a wider range of products offered.

1. The cognitivist approach

Until the eighties, information retrieval systems required the user to master a vast amount of heterogeneous knowledge: command languages for requesting, displaying or printing, Boolean logic for developing a multi-criteria search strategy, database structuring, documentary languages and indexing terminology. Only the specialists, information officers and librarians, could, sometimes reluctantly, acquire proper training. Driven by the need to broaden their market, information system designers pursued their efforts to render these systems more accessible to the end users (cf. POLITY (2)). Early improvements were inspired by interface ergonomics and cognitive psychology, which conceived and suggested the integration of the "user model" into man-machine interface designs.

1.1. User models based on cognitive psychology

When applied to man-machine interaction, cognitive psychology aims to improve the performance and accessibility of the systems while taking into account the cognitive models. Possible applications include artificial intelligence for designing expert systems, computer assisted learning and even more generally all types of man-machine interfaces. According to DANIELS (3), it is possible to identify a number of types of cognitive models: mental models refer to the user's model of the system, conceptual models are those presented to the user by the system designer, and user models refer to the computer's model of the user. User models can then be split into two major categories: empirical quantitative models and analytical cognitive models.

These models can also be classified according to three main dimensions (cf RICH (4)): models of a single "typical" user versus collections of individual models; explicit models defined by the designer versus models inferred by the computer on the basis of the user's behaviour; long term user characteristics such as areas of interest and expertise versus short term characteristics such as the subject of the last sentence typed.

1.1.1. Empirical quantitative models

These models are based on the idea that users fit into homogeneous groups describing their performances when using a given system. Performances are detected through observation and assessment techniques for certain tasks and in various environments. These empirically detected performance data are to be used to construct abstract formalisations and to define skill groups such as "experts vs beginners". Performances covered include handling of controls, mastery of the subject covered, knowledge of conceptual models (structure of fields), etc. In the context of document retrieval systems, these concerns have produced a vast amount of literature describing the difficulty of connection to these systems, the use of commands or Boolean operators, the presentation and selection of data and search terms. These user models do not express beliefs, reasoning or cognitive processes, but rather external performances. The models created in this way are rigid. They play a role in the life cycle of systems when an existing device is being assessed for improvement. They also can be used to assist designers in producing new systems.

1.1.2. Analytical cognitive models

This approach attempts to construct a more qualitative behavioural modelling. It studies human reasoning according to the task the operator is trying to perform. The models aim to detect, implicitly and through linguistic features, the purposes, strategies, plans or beliefs of the user, so that the system may issue predictions and draw inferences. These models are therefore not rigid but dynamic, able to evolve according to different tasks or categories of users, and adaptive, since they use the cognitive features detected in the user's behaviour. Their role is to help the system cooperate, to indicate hypotheses, and to single out areas of interest. Although some effective models have been created in certain fields and for specific tasks which can be modelled easily, this is not the case for information systems, probably because information search activities are specific and user aims are different every time. In such systems, what types of knowledge, facts, ideas, heuristics, or procedures should the user model contain? Should it be the behaviour of the expert information scientist or that of the end user that is modelled? There is no consensus on this point at present. However, some systems have been built on such assumptions and have integrated cognitive models: THOMAS (cf. ODDY (5)), a system which creates its own image of the user's interests and suggests new documents according to this image and the user's reactions, or GRUNDY (cf. RICH (4)), which uses stereotypes as mechanisms for building models of individual users by explicitly asking every user to introduce himself (age, sex, education, etc.).
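
The stereotype mechanism and the feedback loop described above can be illustrated with a minimal sketch. It does not reproduce GRUNDY or THOMAS: the stereotype names, topics, confidence values and the functions `build_model`, `apply_stereotype` and `feedback` are all hypothetical, chosen only to show how a self-description might trigger stereotypes and how the user's reactions might then revise the resulting individual model.

```python
from dataclasses import dataclass, field

# Hypothetical stereotypes: each maps a self-description word to assumed
# interests with a confidence between 0 and 1 (values invented for illustration).
STEREOTYPES = {
    "student": {"textbooks": 0.8, "review articles": 0.6},
    "researcher": {"review articles": 0.7, "conference papers": 0.9},
    "librarian": {"bibliographies": 0.9, "review articles": 0.5},
}


@dataclass
class UserModel:
    """Individual model assembled from the stereotypes a user triggers."""
    interests: dict = field(default_factory=dict)

    def apply_stereotype(self, name: str) -> None:
        # Keep the strongest confidence when several stereotypes agree.
        for topic, confidence in STEREOTYPES.get(name, {}).items():
            self.interests[topic] = max(confidence, self.interests.get(topic, 0.0))

    def feedback(self, topic: str, liked: bool, step: float = 0.2) -> None:
        # Revise the model from the user's reaction, clamped to [0, 1].
        current = self.interests.get(topic, 0.5)
        delta = step if liked else -step
        self.interests[topic] = min(1.0, max(0.0, current + delta))

    def ranked_topics(self) -> list:
        # Highest confidence first; ties broken alphabetically.
        return sorted(self.interests, key=lambda t: (-self.interests[t], t))


def build_model(self_description: list) -> UserModel:
    """Build an individual model from the words a user applies to himself."""
    model = UserModel()
    for word in self_description:
        model.apply_stereotype(word.lower())
    return model


if __name__ == "__main__":
    model = build_model(["student", "researcher"])
    print(model.ranked_topics())
    model.feedback("textbooks", liked=False)  # the user rejects a suggestion
    print(model.ranked_topics())
```

In this sketch the model is explicit at the start (built from what the user declares) and then partly inferred (revised from reactions), which corresponds to two of the dimensions attributed to RICH (4) earlier in the text.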

1.2. The contributions of cognitive ergonomics

In the last decade interface ergonomics and cognitive psychology have undeniably contributed considerably to the evolution of user accessibility to information systems. The main idea was to avoid discouraging the user during connection operations or obliging him to learn many command languages. Search languages have also been unified (Common Command Language) along with the creation of new user interfaces. The principal servers have created simplified consultation modes (Knowledge Index for Dialog, After Dark for BRS), which are based on the principle of filling out forms or using menu trees. The form type interface makes it possible for the user to fill in certain fields (such as the author, title, date, etc.) without necessarily having to know the name of the field as required by command language interfaces. Although this method leads to fewer data entry errors, it seems unnecessarily lengthy once the user has acquired some familiarity with the system. The menu type interfaces, on the other hand, invite the user to select his desired option from a list. The numerous defects and limits of these interfaces have often been pointed out (too slow or inflexible), but they do provide some increased efficiency and considerable comfort for occasional, limited or targeted searches.

Automatic indexing by selecting words from titles, abstracts or full texts has standardized free language query procedures. But the user must then confront the problem of expressing his need for information, or in other words, how to formulate his question. Some commercial systems possess so-called advanced interfaces which enable the user to query in natural language so as not to restrict him to Boolean logic. However, this possibility has too often been provided to the detriment of search efficiency (reducing the three standard AND/OR/NOT operators to the single AND operator). The appearance of multi-windowing has made it possible to display indexes, consult indexing terms, and present the user with help tools to formulate his request. These tools are without a doubt great assets performing several functions: showing what is available in the database, matching the inquiry with existing articles, creating the desire to examine articles the user had not previously considered. They are now standard procedures in CD-ROM type applications.
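
The loss of search efficiency mentioned above can be made concrete with a minimal sketch. The four document titles and the helper names (`build_index`, `term`, `and_`, `or_`, `and_not`) are invented for illustration and do not describe any commercial system; the sketch only shows how the three Boolean operators behave on an index built from title words, and what is lost when only AND remains.

```python
# Toy collection; the titles are invented for illustration only.
DOCUMENTS = {
    1: "online catalogue search interfaces",
    2: "boolean search strategies for online databases",
    3: "marketing of library services",
    4: "user interfaces for library catalogue access",
}


def build_index(documents):
    """Automatic indexing: map each title word to the ids of documents containing it."""
    index = {}
    for doc_id, text in documents.items():
        for word in text.split():
            index.setdefault(word, set()).add(doc_id)
    return index


INDEX = build_index(DOCUMENTS)


def term(word):
    """Documents indexed under a single word (empty set if the word is unknown)."""
    return INDEX.get(word, set())


def and_(a, b):
    return a & b  # documents matching both operands


def or_(a, b):
    return a | b  # documents matching at least one operand


def and_not(a, b):
    return a - b  # documents matching a but not b


if __name__ == "__main__":
    # Full Boolean query: (catalogue OR library) NOT marketing
    full = and_not(or_(term("catalogue"), term("library")), term("marketing"))
    # The same need expressed with AND only: catalogue AND library
    and_only = and_(term("catalogue"), term("library"))
    print(sorted(full), sorted(and_only))
```

With this toy collection the full query retrieves documents 1 and 4, while the AND-only version retrieves document 4 alone: a relevant document is missed because the user can no longer widen the search with OR or exclude a theme with NOT.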

 

2. The marketing approach

The product of a society based on abundance and consumption, marketing has been directly linked from the start to a business world imbued with competitive practices. Its aim is to match the industrial supply with the demand for consumer goods by providing methods and analytical techniques that can direct production and its strategies according to the market and consumer needs. When applied to the information sector, marketing may be generally defined as a mode of thought giving precedence to the relationships between an organization and its environment. Marketing puts special emphasis on the analysis of the public's needs in order to provide a suitable supply of information products. Without going into great detail on the subject of marketing techniques and their penetration into cultural and public institutions (cf SALAÜN (6)) it would be useful to stress several points, such as the rapid development of the supply and demand for cultural goods, the importance given to the economic value of information as seen in the concepts of "information management", "technological or strategic watching" or "value analysis", and the extent to which the information economy and the cultural industries in general play a major role in our societies.

2.1. Target strategies and product ranges

The organisational, technological and professional aspects of information institutions have developed rapidly in recent years. An important shift has occurred from information management based primarily on a collection of written documents and centred on preservation to an information flow that is communicated, digitized, mediatized and targeted to the user. Computer science did not at first break away from the idea of preservation, since emphasis was placed more on the enormous storage capacities of computers than on access possibilities. Today, however, the vast range of information resources, the libraries and data banks, are deeply involved in efforts to open up their collections to the public. They have techniques and tools to perform studies on visitor frequency and on the categorization of the population according to socio-professional origins, age, sex, etc., in order to know this public better, and also to perform sociological studies on behaviour, use and practices, especially in public reading.

When exploited, fine-tuned and customized by marketing strategists, this body of knowledge about consumers can be applied to the actions necessary for strategic decision making. The definition of homogeneous groups in these studies provides market segmentation and the possibility of pinpointing targets. Marketing traditionally suggests three different targeting strategies (SALAÜN, op. cit.): undifferentiated marketing, which gives a wide coverage at low cost and is used for mass market products; differentiated marketing, which tries to match a different service to each group of demands identified; and focused marketing, which concentrates its efforts on one segment of the general public.

2.2. Contributions from the marketing approach

Knowledge about users of information retrieval systems is still in the embryonic stage, whereas rapid technological developments favour the ever widening range of supply. From the online database reserved for information scientists to the CD-ROMs or to the videotex service that may be interrogated by the end user on the Minitel system from his own home, the information industry is trying to amortize production costs by diversifying types of media. Today one can search FRANCIS, BN-Opale or ELECTRE online, by videotex or on CD-ROM and find the same contents everywhere. Encyclopedias are no longer simply paper documents, but also exist on CD-ROM. Library catalogues may be consulted online (OPACs) or on Minitel. The supply has been multiplied enormously and exists even before a demand is fully evident. Moreover, because these products have been made in advance of the market, they are still used relatively little. A good example is also provided by the Internet, which offers a wealth of products that cannot reasonably be consulted on a regular basis by the average user, if only for lack of time. The designers have room to maneuver by offering a broad range of products and thereby increasing their chances of reaching a larger public.

To whom is this offer addressed, who can be reached, who can really use it? Counting the users could be one of the first operations to perform, but it is not an easy task. Should the number of consultations be counted, or the time spent on consultations? Does the user have a professional or a personal need? Is he young, a student, male, an executive, a researcher, an expert, or an infrequent user? We possess very little quantified data, and even less qualitative data, on the present and potential users of information retrieval systems.
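
The difficulty of choosing what to count can be shown with a short sketch. The session log below is entirely invented, and the user labels and function names are hypothetical; the only point is that the two measures named above, number of consultations and time spent, can rank the same users differently.

```python
from collections import Counter

# Hypothetical session log: (user id, minutes connected). All values are invented.
SESSIONS = [
    ("librarian_a", 5), ("librarian_a", 4), ("librarian_a", 6),
    ("end_user_b", 40),
    ("end_user_c", 12), ("end_user_c", 10),
]


def consultations_per_user(sessions):
    """Count how many sessions each user opened."""
    return Counter(user for user, _ in sessions)


def minutes_per_user(sessions):
    """Sum the connection time of each user."""
    totals = Counter()
    for user, minutes in sessions:
        totals[user] += minutes
    return totals


if __name__ == "__main__":
    by_count = consultations_per_user(SESSIONS)
    by_time = minutes_per_user(SESSIONS)
    print("most consultations:", by_count.most_common(1)[0][0])
    print("most time spent:", by_time.most_common(1)[0][0])
```

In this invented log the professional opens the most sessions but spends the least time online, while an end user with a single long session dominates the time measure. Neither figure says anything about whether the need was professional or personal, which is the qualitative gap the text points to.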

3. Limits caused by the specificity of information research activities

The exceptional development of the information supply has not significantly changed the use of computer based systems, especially by end users. Despite efforts made by designers, we are obliged to note that the commercial hopes for increasing the use of these systems by a wider public have not been met. The few existing figures reveal that the number of customers assumed to be end users has not significantly increased in proportion to the professional librarians or information officers (for example, their consultations represent only 12% of DIALOG totals). Despite significant contributions from the cognitive and marketing approaches briefly examined here, both are lacking and both have an impact only at the level of access. In fact, these approaches consider the content of these systems to be unchangeable and established. Improving user access to information retrieval systems is a good development. Extending the range of products offered in order to increase distribution modes is also significant. Obtaining statistics about the public is also very useful. But the content of these information providers must also match the needs of the users.

To paraphrase the remarks of Martine POULAIN (7) on the subject of culture, "requiring an increase in numbers of those practising any cultural activity becomes central, to the detriment of the question of the effects (...) Requiring "ever more" becomes primordial in cultural policy matters. Reading provides a good example of this shift. The study of these issues refers very little to reading and its modes, even less to its content: since now only the number of readers is considered...". What the user desires, however, is not necessarily an increase in the supply, which also implies choosing among and assessing unknown contents, but rather the correct information, with the right features (general or specialized, highly technical or popularized, exhaustive or selective, up to date, etc.).

One telling example should be sufficient: a recent article published in Archimag (July-August 1993) comparing business files available on the Minitel shows that the data is not always fresh, or even worse, that some databases do not date their information, that the consultation is not consistently reliable, and that one database only updates its files once a year for the annual printout. Diversifying the media or consultation facilities cannot in itself make up for these deficiencies.

The information search paradox comes into play at another level: it is difficult for a user to state his needs in precise terms, since he is looking for an answer he does not yet know, so how can he state what he is looking for? In this sense, any information search must first involve a clarification of the need, a negotiation which is carried out in a traditional search by an information specialist acting as intermediary between the user and the sources. Some expert system type information retrieval systems have tried to simulate the skills of this intermediary, but these models are still a sort of "black box", sometimes confused with the activity models themselves. Interaction modes with an automated system and the new search languages offered after cognitivist ergonomic studies, including free language queries, are limited by the incapacity of the systems to transcend the question-reply unit and manage a man-machine dialogue in natural language. In fact, all the experts agree that only natural language dialogue is supple enough to permit this type of negotiation.

The obvious failure noted in the use of information retrieval systems shows us that taking users into account must go beyond the problems of access. The specific feature of information searching is that it fits into a socio-professional or personal activity, which requires that the use of an automated system be placed within a wider scope of information activities. Our knowledge of these practices in this context and of the user's aims is very limited. Yet the use of an information retrieval system is only a portion of the activity performed by an individual seeking information. It permits him to collect data of all different sorts (references, facts, full texts, etc.), but he uses these data to solve a larger problem, independent of and external to the system that he has used to obtain the information. It is certain that we must go beyond the mere structure of information retrieval systems towards an analysis of behaviour and social practices.

Therefore, user models, often merely in the prototype phase, only pertain to cognitive functions of individual users and do not deal with the social and organisational context in which tasks are performed. Some authors in the field of artificial intelligence, working on the transmission of knowledge for the development of expert systems, nevertheless deal with the context and environment of the expertise (POITOU (8)). Such contexts are taken into account by using methodological tools and techniques belonging to ethno-methodological sociology, work analysis and also discourse analysis. Investigations are carried out on the individual who is to be modelled and on various agents of an institution that, through its own activities, more or less determines those of the individual being studied.

Does cognitive science offer the best paradigm to cope with this crucial questioning in the information retrieval domain? Many authors are expressing their doubts.

MIEGE (9) states: "At first based on the isolated individual who is interacting with a technical device, cognitive sciences are turning towards the disclosure of organisational and general factors. This hegemonic objective of the cognitive sciences provides a disquieting prospect for Information and Communication Sciences. Since intellectual technologies are thought to play a major role in the cognitive processes, practically all communication activities are thereby reduced to cognitive processes depending on the cognitive sciences".

As FROHMANN (10) shows in his study of the cognitivist viewpoint in the field of information systems, it is important to be careful, because in the academic arena the cognitive viewpoint consolidates the power relationships which set up information as merchandise and people as information consumers in the context of a market economy.

It is therefore vital to restore the sociological dimensions to the public/product problem as related to information systems. We think that it is no longer a question of measuring progress to be made or developing strategies to combat resistance to technological devices, but of understanding what has meaning and value for each and every one. And as stated by POULAIN (op. cit.), "it is a social illusion to believe that suggestion means use, and opportunity makes the thief".

 

References

1. BOURE, R. & DARREON, J.L. Quand l'information était du pétrole gris... Variations autour d'une métaphore. Cahiers du LERASS, 29, 1993.

2. POLITY, Y. Evaluation des différents modes de recherche en langage naturel. Documentaliste-Sciences de l'information, 31 (3), 1994, 136-142.

3. DANIELS, P.J. Cognitive Models in Information Retrieval - An Evaluative Review. Journal of Documentation, 42 (4), 1986, 272-304.

4. RICH, E. User modeling via stereotypes. Cognitive Science, 3, 1979, 329-354.

5. ODDY, R.N. Information retrieval through man-machine dialogue. Journal of Documentation, 33, 1977, 1-14.

6. SALAÜN, J.M. Marketing des bibliothèques et des centres de documentation. Paris: Cercle de la Librairie, 1992.

7. POULAIN, M. Des lecteurs, des publics et des bibliothèques. In: Histoire des bibliothèques. Paris: Cercle de la Librairie, vol. 4, 1992, 528-543.

8. POITOU, J.P. Définition d'une méthodologie de recueil et d'extraction des connaissances au service des systèmes experts en amont de la formalisation des connaissances. Aix-en-Provence: Centre de recherche en psychologie cognitive (CRPC), 1991.

9. MIEGE, B. Les étapes de la pensée communicationnelle (3): Les interrogations actuelles. Cahiers du LERASS, 31, 1994, 185-196.

10. FROHMANN, B. The power of images: a discourse analysis of the cognitive viewpoint. Journal of Documentation, 48 (4), 1992, 365-386.