Managing learning content and digital formats
Bruno Bachimont, Stéphane Crozat, Romain Mallard
Université de Technologie de Compiègne
Innovation Unit ICS (Content and Knowledge Engineering) & lab Heudiasyc (UMR CNRS 6599)
This paper presents an original contribution to learning content management and reuse, based on a content-centred approach. This approach, founded on documentary engineering concepts, promotes the adaptation of content formats in order to manage content throughout its lifecycle. Our hypothesis has been implemented in software called SCENARI¹, which combines a publishing chain and a Learning Content Management System (LCMS).
In the first part of the paper we describe our pedagogical hypothesis, which situates content manipulation at the centre of the learning process. We then propose a reinterpretation of reuse in the context of format heterogeneity, and show both how e-learning standards allow content to be exchanged and how a documentary approach based on logical structuring makes content adaptation possible. Finally, we present an overall solution for learning content management through the use of the SCENARI software.
Content as a learning reference
Our main hypothesis is that learning is a social process consisting of sharing knowledge. As a consequence, the main issues are:
How to allow a learner to create a reference point for his or her own knowledge.
How to integrate the learner within a community in which he or she can share this knowledge with other people who have similar aims or interests.
The social community of learners and knowledge holders is based on knowledge that people agree to consider as building blocks of their identity as knowledge holders (for example textbooks, works of art, plays, experiments, etc.). Those “blocks” are materialised as content on physical media (such as books, graphics, sounds, hypermedia, etc.), which are the meeting point for learners, teachers and knowledge holders. Crucially, content must remain the same for all users if they are to have a shared reference point. Shared points of reference tell learners what they should learn and why, and teachers what should be taught and why.
As a consequence our approach is “content-centred” rather than “learner-centred.” This does not mean that cognitive and psychological aspects are neglected; but since learning is fundamentally about knowledge, learners should adapt themselves to knowledge rather than knowledge being adapted to the learner. Therefore, our approach is based on two main research questions:
How does one materialise knowledge in identifiable and referable layouts? This issue refers to modelling content - characterising what aspects of knowledge are invariant despite differences in writing or formulating knowledge. For example, a mathematical theorem can be written according to different formulations, but some properties remain the same in all these formulations: the theorem can be recognized as such and this gives it the status of reference knowledge.
How does one present knowledge in an adapted layout to the learner, so that he or she may consider knowledge in an appropriate and intelligible formulation? The main issue here is to adapt knowledge according to its invariants, so that it remains the same and can be perceived and recognised as the same.
1 Système de Conception des Enseignements Numériques, Adaptables, Réutilisables et Interactifs (design system for adaptable, reusable and interactive digital training contents).
Learning as a content manipulation
Starting from these theoretical hypotheses, we can explore them from a knowledge and document engineering point of view. Acquisition of knowledge can be modelled as a process of reading (in its widest sense, including listening, looking, etc.) and a process of writing (also in its widest sense, including speech, drawing, etc.). Learning is thus understood to be a particular way of manipulating documents (texts, but also live speeches, schemas, etc.) in order to appropriate the knowledge contained within these documents. Accordingly, the two main research axes become, from a technological point of view:
The modelling of content: Since digital content is calculated, it is possible to separate the content from its control, in the sense that a representation of the content can control the content automatically. A consequence of content modelling is its potential to represent the logical structure of the content, independently of its presentation on a specific medium. This independence provides several advantages, such as automatic publishing on various media, and for various contexts, from a single source, by modifying the presentation of the content according to the required pedagogical intentions. For example, it is possible to model content using a logical XML representation and to develop generators that can publish this content on paper and slides for face-to-face learning, and publish the same content on Web media for e-learning. Through the separation of content from its publishing format and context, content can be considered as an independent entity and becomes both the reference and the sharable object in the learning process [Bachimont 1994].
The connection between content and tools to manipulate content: Since digital content is dynamic, it can be modified during the whole pedagogical process, from formalisation by a teacher-author to manipulation by a learner-reader. Digital properties thus provide the manipulation tools (e.g. tools for answering, reformulating) along with the contents themselves in a single integrated digital medium. This integration is particularly interesting in a knowledge-sharing environment, since the process of learning involves not only reading, but also production activities (rewriting, reformulating, synthesising, answering, annotating, broadening, etc.) [Stiegler 1994]. A part of our R&D program is to imagine and then implement various tools that allow production activities within a variety of contexts. For example, we developed and implemented “Interaction Sheets” (IS), as an extension of “Style Sheets” (SS). Whereas SS allow content to be published with a particular layout that makes correct reading possible, IS allow content to be published with a layout suitable for reading together with the tools for content manipulation [Crozat & al. 2002].
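The distinction between Style Sheets and Interaction Sheets can be illustrated with a minimal sketch. It is not the SCENARI implementation: the element names (`exercise`, `question`), the two function names and the generated HTML are all hypothetical, chosen only to show one logical source being published either for reading alone or for reading plus a manipulation tool.

```python
import xml.etree.ElementTree as ET
from html import escape

# Hypothetical logical source; the element names are illustrative only.
SOURCE = """
<exercise>
  <question>State the theorem in your own words.</question>
</exercise>
"""


def style_sheet(exercise: ET.Element) -> str:
    """Publish the content with a layout meant for reading only."""
    question = exercise.findtext("question", default="").strip()
    return f"<p class='question'>{escape(question)}</p>"


def interaction_sheet(exercise: ET.Element) -> str:
    """Publish the same reading layout plus a tool for answering."""
    reading_part = style_sheet(exercise)
    answer_tool = "<textarea name='answer'></textarea>"
    return f"<form>{reading_part}{answer_tool}</form>"


exercise = ET.fromstring(SOURCE)
read_only = style_sheet(exercise)
interactive = interaction_sheet(exercise)
```

In this sketch the interaction sheet reuses the style sheet's output unchanged and adds the answering tool around it, so the reading layout stays identical in both publications.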
Our hypothesis is that reference knowledge is one of the foundations of learning. Reference knowledge is materialised as physical content. In order that a community of knowledge can be built on a set of contents, the contents must be understood as the materialisation of invariants of knowledge presented in an intelligible layout. This allows content to be read and appropriated collectively by the ‘knowledge community’.
Digital document engineering allows the invariants of content to be modelled using logical forms (such as XML). The intelligibility of content must be theorised from a global point of view, which should include activities based around the content (and not only around reading). The dynamic property of digital content allows the integration of activities and content within the same media, and therefore contributes to the structuring of the learning community within particular reference points.
A hypothesis of reuse
A definition of reuse
We propose to define the reuse of digital content as a shift of content across three axes: shifts in time, shifts in space, and shifts in pedagogical intention.
Shifts in time refer to evolution of content (maintenance, enhancement, etc.). Content should never be understood as static, since it is the starting point for reference knowledge. Knowledge constantly changes, either because invariants do change (through the scientific refutation of prior knowledge), or because the ways in which invariants are presented change (pedagogy is not a formal science). The less expensive it is to reuse content over time, the more durable that content is.
Shifts in space refer to the transfer of content to different contexts in which it can be exploited. A typical example is the sale of content from one organisation to another. It also refers to the problems encountered by organisations delivering distance teaching, since their learners are spread over several different contexts (homes, businesses, training centres, etc.). A shift in space implies that the same content can exist in several different environments at the same time, with different properties (in order to fit that particular environment). However, when dealing with digital content the problem is more complex, since the diversity of technical environments means that the content must have a range of properties in order to adapt to these environments. The less expensive it is to reuse content in a variety of spaces, the more widely it can be disseminated.
Shifts in pedagogical intention refer to the multiple use of content, meaning that the same reference knowledge can be used within several different pedagogical practices. For instance, the same content can be used for face-to-face learning and distance learning, for initial academic training and for continuous professional training, etc. This illustrates that the same content can be utilised for several different purposes and, as a consequence, that several different ways of utilising it exist in order to fulfil those purposes. The less expensive it is to reuse content within different pedagogical environments, the more numerous the contexts become.
Heterogeneity as a fundamental problem for reuse
A fundamental problem for the reuse of pedagogical digital content is heterogeneity of format. This heterogeneity is engendered on the one hand by the variety of formatting possibilities offered by digital media and, on the other hand, by the pedagogy which is intrinsically heterogeneous in order to be specifically adapted to the variety of contexts it deals with. We distinguish between three formats in order to characterise a pedagogical digital content:
The pedagogical format can be defined as the translation of a pedagogical intention onto a chosen medium so that the content will be understandable by the learner. This translation involves two main areas: the scripting, which prescribes the way the content will be read (or navigated), and the definition of activities, which prescribes the way the learner will interact with the content (or rewrite it).
The graphical format can be defined as the way the content is presented and organised on a medium, particularly regarding the invariants of absolute knowledge and intelligibility. Presentation has to be thought of within a multimedia context, including texts, images, audiovisuals, etc.
The technical format can be defined as the way in which the content is encoded on a digital support so that it can be managed by algorithms.
Heterogeneity happens at all three levels, since many technical formats can be used to represent the same content, many graphical formats can be calculated from the technical format, and many pedagogical formats can be applied to these graphical formats. The management of this combination of heterogeneous formats is one of the central points to consider when dealing with the reuse of content.
Reuse as a problem of exchange and adaptation
In order to manage heterogeneity we introduce the concepts of exchange and adaptation.
Exchange of content is the process which allows a shift of content from one technical system to another (e.g. shifting from a production environment to a user environment). Exchange is necessary in the context of reuse since evolving technical systems (shifts in time) and differing distributions of technical systems (shifts in space) require exchanging content. Technical heterogeneity makes the management of exchanges very complex. How, therefore, does one ensure that two different technical systems will be able to integrate and interpret the same content in the same way? The only solution is for both content and user platforms to speak a common language, making their integration possible. Exchange within a heterogeneous context implies a technical standardisation (our emphasis is not pedagogical standardisation).
Although the exchange of documents is necessary for reuse, it cannot be seen as sufficient, since it deals with neither graphical nor pedagogical heterogeneity. Consider two pieces of content in the same technical format (DHTML for instance): each contains both information and the way this information must be presented and interacted with. If these two pieces of content are to be integrated within the same pedagogical medium, they will be expected to have consistent graphical formats (because of ergonomic requirements [Scapin & al. 1997]) and consistent pedagogical formats (to ensure consistent pedagogical intentions).
In order to reuse content as we defined it, it is necessary to combine the concept of exchange with the concept of adaptation. Adaptation relates to the documentary concept of digital publication, which allows the calculation of not only one, but several materialisations from one single logical expression. This process allows the construction of media to be printed (e.g. PostScript), used as slides (e.g. PDF), as interactive software (e.g. HTML), and so on, from one single source (XML for instance). Adaptation is the solution to the need to shift from one technical, graphical or pedagogical format to another, depending on the context. It provides the means to maintain the heterogeneity of pedagogical content, whilst simultaneously being able to respond to the constant need to develop this content in the context of temporal, spatial or pedagogical shifts.
Standards and exchange
Since 1988, and AICC’s first work, questions around the standards of pedagogical digital content have been asked. Some results have been obtained (from projects such as SCORM, IMS, ARIADNE, etc.) and a large amount of work is still in progress. The purpose of such work is to manage technological problems in order to create a potential demand for these technologies within international markets. The aim is to provide generic solutions that fit user requirements as well as being realistic from an economic point of view. There are four main aspects of e-learning standards that deal with content:
The packaging of content specifies how content should be formatted so it can be recognised by any system (file names, file organisation, etc.). “IMS Packaging” is currently the main system in use.
The indexing of content specifies which meta-data must be associated with the content, so that it can be searched and found by any system: currently “IEEE Learning Object Metadata”.
The structuring of content specifies how the content must be scripted in order to be produced correctly by any system: currently “SCORM Content Aggregation Model” and “IMS Learning Design”.
The tracking of activities related to content specifies how the content must communicate with systems in order to inform them about the learner activity it records while used: currently “SCORM Run Time Environment”.
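To make the four aspects concrete, the sketch below groups them in one data structure. It is an assumption-laden illustration: the class `ContentPackage`, its field names and its methods are hypothetical and do not reproduce the actual IMS, IEEE LOM or SCORM schemas or interfaces.

```python
from dataclasses import dataclass, field


@dataclass
class ContentPackage:
    """Hypothetical container for the four standardised aspects."""

    # Packaging: which files make up the content.
    files: list[str]
    # Indexing: descriptive metadata used to search for the content.
    metadata: dict[str, str]
    # Structuring: the order in which items are presented.
    sequence: list[str]
    # Tracking: learner activity reported to the hosting system.
    events: list[tuple[str, str]] = field(default_factory=list)

    def matches(self, keyword: str) -> bool:
        """Indexing in use: search the metadata values for a keyword."""
        keyword = keyword.lower()
        return any(keyword in value.lower() for value in self.metadata.values())

    def track(self, item: str, status: str) -> None:
        """Tracking in use: record a status for an item of the sequence."""
        if item not in self.sequence:
            raise ValueError(f"unknown item: {item}")
        self.events.append((item, status))


package = ContentPackage(
    files=["intro.html", "quiz.html"],
    metadata={"title": "Introduction to XML", "language": "en"},
    sequence=["intro.html", "quiz.html"],
)
package.track("intro.html", "completed")
```

Nothing in this structure says how the content should look or which pedagogical intention it serves, which is consistent with the view that such standards address exchange rather than adaptation.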
If these aspects of content standardisation were complete, we would already have a solution for technical standardisation that allows exchange of content. However, it must be pointed out that even if these four systems were to offer a perfect solution for content exchange, they would not be able to adapt content, since adaptation is always a specific problem.
Content adaptation cannot be standardised without standardising pedagogical and graphical formatting. Graphical and pedagogical standardisation are, if not impossible, at least not appropriate, since the purpose of these formats is to be specific and context-dependent, and not standardised.
Document engineering and adaptation
How should problems of adaptation be solved if not through standardisation? The first element of the solution, well known in the field of documentary engineering, consists of dissociating the storage format from the publishing formats, and setting up software that automatically transforms content from its storage format to one or more publishing formats. Let us note that a storage format has very different properties (generic, logical, richly indexed, very structured, etc.) from the publishing ones (readable, executable, specific, etc.). The second element of the solution is therefore to find a good “pivot format” that is able to provide the storage format for all the publishing formats. Such a format has to promote a “logical” representation of the information, describing not how the information must be presented on a support device, but what the inbuilt characteristics of the information are. The main requirement is to remain neutral toward all publishing formats.
Publishing format example: Related to numbers.
A publishing format is dedicated to a support device, and provides information about the particular representation of that content. It is difficult to switch from one publishing format to another.
Logical storage format example: <definition> <notion>Digital</notion> <explication>Related to numbers</explication> </definition>
A logical storage format is independent from a support device, and provides general information about the structure of that content. It is easy to generate any publishing format from a logical storage format.
Figure 1: Publishing versus storage format
Research about content technologies has provided powerful solutions to the adaptation problem over the last five years, with the arrival of XML. XML provides a structure capable of representing content in a logical and neutral format. It is the best candidate for an (almost) universal storage format. It can only be “almost” universal, since it is not suited to representing binary content (images, videos, etc.). Despite this, it remains the best candidate, since it is able to index this binary content (and standard solutions already exist to encode binary content, such as JPEG, MPEG, etc.). Moreover, XML comes with powerful and generic software solutions for publishing XML-structured content in any public format (meaning any format whose encoding is public, such as HTML, RTF, PDF, etc.).
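The generation of several publishing formats from one logical storage format can be sketched as follows, taking the definition of Figure 1 as the stored content. This is only an illustration using Python's standard XML parser; the two generator functions and their output layouts are assumptions, not the transformations used in SCENARI.

```python
import xml.etree.ElementTree as ET
from html import escape

# Logical storage format: the definition shown in Figure 1.
STORED = (
    "<definition>"
    "<notion>Digital</notion>"
    "<explication>Related to numbers</explication>"
    "</definition>"
)


def to_html(definition: ET.Element) -> str:
    """Hypothetical generator for a Web publishing format."""
    notion = definition.findtext("notion", default="")
    explication = definition.findtext("explication", default="")
    return f"<p><b>{escape(notion)}</b>: {escape(explication)}</p>"


def to_text(definition: ET.Element) -> str:
    """Hypothetical generator for a plain-text (print-like) format."""
    notion = definition.findtext("notion", default="")
    explication = definition.findtext("explication", default="")
    return f"{notion.upper()}\n{explication}."


definition = ET.fromstring(STORED)
html_output = to_html(definition)
text_output = to_text(definition)
```

The stored content says only that "Digital" is a notion and that "Related to numbers" is its explanation; every presentation decision (bold type, capitals, punctuation) lives in the generators, so adding a new publishing format means adding a generator without touching the source.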
Managing pedagogical digital content implies managing its reuse. Content is in a dynamic form when it evolves over time, when it has to be deployed in several technical spaces and when it must be utilised in different pedagogical contexts. Content is written on media according to technical, graphical and pedagogical formats. The variety of these formats leads to a heterogeneity that makes the question of reuse complex. Total homogenisation is impossible, since graphical and pedagogical formats have to remain heterogeneous to fit the specificity of each context. On the other hand, random heterogeneity is not acceptable, since consistency of graphical and pedagogical formats has to be ensured within a defined context. The general solution we propose is the standardisation of technical formats in order to allow exchange of content, and the adoption of a documentary approach in order to allow adaptation of content. A combination of exchange and adaptation is a way to manage heterogeneity, allowing consistency to be maintained even in a context of reuse.
Learning content management
Learning Content Management Systems
Until recently an e-learning platform was more or less restricted to Learning Management Systems (LMS). Such a platform is in charge of managing the training, including the learners and teachers, their material and digital needs, planning and communication tools, any follow up to the training, etc. This system was based on a logistic-centred approach which considered content as a secondary problem.
The concept of Learning Content Management Systems (LCMS) arose in order to promote a content-centred approach beside the logistic aspects. The LCMS is in charge of managing content, and particularly content specification, production, publication, maintenance and reuse.
The objective of an e-learning platform is to allow massive (distance or otherwise) learning using digital technologies. This learning is “massive” in the sense that it deals with large numbers of students, and that large quantities of training are offered in terms of content. LMS manage student volume and LCMS manage volume in terms of content. LMS were considered more important at the beginning of e-learning developments, since they were needed to initiate experiments. However, now that LMS have been implemented widely, LCMS are as important, since they are a precondition to developing content in a realistic way and so to providing meaningful training programs.
There are two main reasons why LMS and LCMS do not need to be the same platform, i.e. a global digital platform that would assume both training and content aspects.
Firstly, they assume different functions:
LMS are expected to manage activities during training.
LCMS are expected to manage content production prior to training.
Since these functions have few common points, it is obvious that they could be different systems, and it is unlikely that a single system would be efficient at both aspects.
Secondly, they need different technological frameworks:
Actors utilising the systems are different (learners and teachers for LMS, authors and multimedia producers for LCMS). The number of actors using each system also differs (very few actors for LCMS, a dozen for instance, and a large number for LMS, up to several thousand learners).
Cycles of use vary (the length of a lecture program for LMS, such as a few weeks, and the longevity of an institution for LCMS, possibly many years).
Technological formats and constraints are dissimilar (durable and modifiable content for LCMS, executable and interactive content for LMS, high quality communication tools for LMS, high quality versioning and storage for LCMS, etc.).
Figure 2: e-Learning platform composed of an LMS (training management) and an LCMS (content management)
To adopt a documentary approach that allows the adaptation of content (and not only management) as we defined it previously, the general concept of LCMS is not sufficient. It is necessary to attach to it the concept of a publishing chain. A publishing chain is a technical and methodological system that manages production and publication of documents, based on a separation between production and publishing formats [Bachimont & al. 2002].
Figure 3: Publishing chain general principles
Publishing chain, LCMS & LMS
The following schema aims at summarising the global solution we propose for learning content management throughout the lifecycle of that content, from initial production to use in operational context.
Figure 4: Lifecycle of learning content in a global documentary approach
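Such a lifecycle can be sketched, under strong simplifying assumptions, as a chain whose parts are replaceable independently of one another. The class `Chain`, the reduction of the lifecycle to three parts (content, publisher, adapter) and the toy functions below are hypothetical; they only illustrate that changing one part leaves the others untouched while the remaining steps rerun automatically.

```python
from dataclasses import dataclass, replace
from typing import Callable


@dataclass(frozen=True)
class Chain:
    """Hypothetical three-part reduction of the content lifecycle."""

    content: dict[str, str]                     # written by authors
    publisher: Callable[[dict[str, str]], str]  # set up by publishing designers
    adapter: Callable[[str], str]               # set up by deployment experts

    def run(self) -> str:
        # Publication and deployment are computed, not redone by hand.
        return self.adapter(self.publisher(self.content))


def web_publisher(content: dict[str, str]) -> str:
    return f"<h1>{content['title']}</h1><p>{content['body']}</p>"


def slide_publisher(content: dict[str, str]) -> str:
    return f"{content['title'].upper()}\n- {content['body']}"


def lms_a(published: str) -> str:
    return f"[LMS-A]{published}"


def lms_b(published: str) -> str:
    return f"[LMS-B]{published}"


chain = Chain({"title": "XML", "body": "A logical format."}, web_publisher, lms_a)

# Three independent changes, each touching a single part of the chain.
updated_content = replace(chain, content={"title": "XML", "body": "A neutral format."})
new_platform = replace(chain, adapter=lms_b)
new_medium = replace(chain, publisher=slide_publisher)
```

Each call to `replace` stands for the work of one kind of actor only; calling `run()` on the resulting chain stands for the steps that are redone without human intervention.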
The benefit of such a process is that reuse could take place at any level of the lifecycle without reconstituting all the steps before and after this level. For example:
If content must be changed (because the domain evolved for instance), only the logical content needs to be modified at the production step. Modelling will not have to be repeated (if the content still fits the existing models), or scripting and activating (because the pedagogical principles have not changed), and moreover, all the publication and deployment steps will be automatically done with the new content without any human work.
If the deployment context changes (because of an LMS change for instance), and if the new LMS does not follow exactly the same standards (if it did, there would be no work at all), only the standardisation step has to be redone, by specifying new means of communication between the content and the LMS. No work is demanded of the content experts, the publishing designers, etc.
If the publishing must change (for instance because a new publication medium is needed: a paper version exists and slides are now required), only the publishing step, which transforms pedagogical content into usable content, needs to be reworked. Authors and deployment experts do not need to be mobilised.
SCENARIlcms
Whereas many LMS are available on the marketplace², the LCMS domain has been developing for only one or two years³, with historic actors such as ARIADNE (ariadne.unil.ch) and commercial actors such as learn eXact (www.giuntilabs.com). The existing learning content management solutions mainly address the problem of reusing content from an exchange perspective, without adaptation possibilities. The GenDoc project [David 2003], which aimed at integrating a publishing chain into the ARIADNE system, should be mentioned here.
We have been involved since 1997 in researching the introduction of documentary technologies within the field of education [Bachimont & al. 1997]. This research was developed in a thesis [Crozat 2002] and gave birth to a system called SCENARI [Bachimont & al 2002]. This system was initially a publishing chain dedicated to learning content, tested in many real use contexts, such as PSA (motor industry), Axa (insurance), SNCF (French National Railways), various SMEs and some French universities. The publishing chain was strengthened in order to become an LCMS with the help of a French Ministry of Industry R&D program (program name: PRIAMM, project name: CHAPERON, global budget: 1.2 M€). SCENARIlcms is today an operational experimental platform that combines the adaptation functions of a publishing chain with the management and exchange functions of an LCMS.
Figure 5: Overview of SCENARI software
2 Many studies about LMS have been carried out in France, Europe and the USA, including Préau (preau.ccip.fr), Algora (algora.org), Thot (thot.cursus.edu), Canal Eife-l (eife-l.org).
3 Some reports on the subject are beginning to appear, for instance from Brandon Hall (brandonhall.com).
In order to promote reuse, LCMS have been developed, alongside LMS, to help in managing content. Such systems essentially deal with exchange of content. On the other hand, publishing chains are documentary systems that make content adaptation possible by separating logical from publishing formats, and by providing the tools that allow automatic transformation from logical to publishing formats. Since no systems exist that combine publishing chain and LCMS functions, and since we had formerly developed a publishing chain for learning content, SCENARI, we decided two years ago to develop such a system on the basis of our existing platform. This paper is based on the results we obtained whilst building this system.
SCENARI has been deployed for three years in real-life environments. We plan to keep on enlarging user communities in order to validate our hypothesis more widely and to observe to what extent reuse activities are facilitated by providing adaptation functions along with exchange ones. This enlargement is oriented toward France and other French-speaking countries as a first step (Mediterranean countries in particular). In a second phase we plan to include linguistic and cultural aspects in order to be able to test the system in an international and multilingual environment. Intercultural adaptation for learning content localisation is certainly a major issue for reuse, and the documentary perspective we promote suggests interesting results in this field.
BACHIMONT B, “Herméneutique matérielle et artéfacture : des machines qui pensent aux machines qui donnent à penser”, PhD thesis in Epistemology, Ecole Polytechnique, France, 1996.
BACHIMONT B, CAILLEAU I, CROZAT I, MAJADA M, SPINELLI S, “Le procédé SCENARI : Une chaîne éditoriale pour la production de supports numériques de formation” (The SCENARI solution: a publishing chain to produce educational digital media), TICE’2002, France, 2002 (abstract available in English).
BACHIMONT B, CHARLET J, “PolyTex : un environnement pour l'édition structurée de polycopiés électroniques multisupports”, EuroTex'98, France, 1998.
CROZAT S, “Elément pour la conception industrialisé des supports pédagogiques numériques” (Elements for industrial design of instructional digital medias), PhD thesis in Computer Sciences, Université de Technologie de Compiègne, France, 2002.
CROZAT S, TRIGANO P, “Structuration et scénarisation de documents pédagogiques numériques dans une logique de massification”, Sciences et Techniques Educatives, vol.9, N°3, Ed° Hermès, 2002.
DAVID JP, “Modélisation et production d’objets pédagogiques”, Sciences et Techniques Educatives, Hors Série “Ressources numériques, XML et education”, Ed° Hermès, 2003.
SCAPIN D, BASTIEN C, “Ergonomic criteria for evaluating the ergonomic quality of interactive systems”, Behaviour & Information Technology, n°16, 1997.
STIEGLER B, “La technique et le temps ”, édition Galilée, 1994.