A Report on the International Learning Object Metadata Survey

Norm Friesen, PhD
September 3, 2004

Introduction

A wide variety of projects and organizations are currently making digital learning resources ("learning objects") available to instructors, students and designers via systematic, standards-based infrastructures. One standard that is central to many of these efforts and infrastructures is known as "Learning Object Metadata" --"IEEE 1484.12.1-2002," or "LOM" for short. "Metadata" refers to systematically created and formatted descriptions of resources --whether these be intended for learning, informational or other purposes. The LOM standard has become the most widely used solution for classifying and describing digital resources intended specifically for learning and education.

Characteristics of Learning Object Metadata Surveyed

Of course, the "LOM" standard is only one way of describing digital and online resources. Other metadata standards and methods have been developed for this purpose, including Dublin Core and RSS (or Rich Site Summary). What all of these standards and methods have in common is that each defines the function and structure of a number of "data elements." Examples of these elements include the "title," "author" or "location" of the resource. RSS, for example, focuses on three of these data elements --"title," "link," and "description"-- while Dublin Core specifies only 15 metadata elements. The Learning Object Metadata standard, on the other hand, includes 76 data elements, covering a wide variety of characteristics attributable to learning objects, including their "size," their level and type of "interactivity," and the "educational context" to which they are best suited.

The LOM defines all of its data elements in interrelationships that are both hierarchical and iterative. At the top of the hierarchy of LOM elements are nine broad "category" elements: General, Lifecycle, Meta-metadata, Technical, Educational, Rights, Relation, Annotation and Classification. These category elements contain sub-elements, which, in turn, often contain further sub-elements. Many of the category elements, their sub-elements, and the layers of elements beneath them can be repeated. This results in complex hierarchical and iterative structures, which allow for over 16,000 possible, concatenated element repetitions in total. Some of the sub-elements in the LOM can be assigned any alphanumeric value (e.g. the title element). Other elements are restricted to a limited set of pre-defined values (e.g. the element describing educational context, with values such as "school," "higher education" or "training"). In this last case, the set of values is often referred to as a "vocabulary" or a "controlled vocabulary." Still other elements in the LOM contain descriptions of persons (e.g. authors, editors, etc.) that are specially formulated and formatted using a specification known as "vCard."
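The combination of hierarchy and iteration described above can be pictured with a small sketch. The Python fragment below is illustrative only: the record is a hand-made, heavily simplified stand-in for a LOM instance (it is not the official LOM binding), the sample values are invented, and the helper name `element_paths` is an assumption introduced here. Dictionaries stand for containment, and lists stand for repeatable elements.

```python
# Hypothetical, simplified stand-in for a LOM record (not the official binding).
# Dicts model containment (hierarchy); lists model repeatable elements (iteration).
record = {
    "general": {
        "title": "Introduction to Fractions",       # free-text value
        "keyword": ["fractions", "arithmetic"],     # repeated element
    },
    "lifeCycle": {
        "contribute": [                             # repeatable container
            {
                "role": "author",                   # controlled vocabulary value
                "entity": [                         # repetition inside a repetition
                    "BEGIN:VCARD\nFN:A. Example\nEND:VCARD",  # toy vCard string
                ],
            },
            {
                "role": "publisher",
                "entity": ["BEGIN:VCARD\nFN:Example Press\nEND:VCARD"],
            },
        ],
    },
    "educational": [{"context": ["higher education"]}],
}


def element_paths(node, prefix=""):
    """Yield (path, value) for every leaf; [i] marks each repetition."""
    if isinstance(node, dict):
        for name, child in node.items():
            yield from element_paths(child, f"{prefix}/{name}")
    elif isinstance(node, list):
        for i, child in enumerate(node):
            yield from element_paths(child, f"{prefix}[{i}]")
    else:
        yield prefix, node


paths = dict(element_paths(record))
for path in paths:
    print(path)
```

Even in this toy record, a single value such as a contributor's vCard is only reachable through a path that mixes element names with repetition indexes (for example `/lifeCycle/contribute[1]/entity[0]`), which hints at how quickly the number of possible element repetitions can grow.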

Given its relative size and complexity --as well as the fact that it is the first technical e-learning standard to be widely adopted-- the implementation of the LOM presents an excellent opportunity for study and research. By looking at how it has been implemented in projects and in specific metadata records, it is possible to learn valuable lessons about e-learning standards implementation, and about how to further develop and refine standards to meet implementers' and educators' needs.

This paper presents the basic findings of an international survey of the implementation of the LOM standard. This survey was undertaken as a part of ongoing Canadian work in an international e-learning standardization forum, the ISO/IEC (International Organization for Standardization/International Electrotechnical Commission) subcommittee on "Information Technology for Learning, Education and Training."

This survey was carried out in two stages. The first stage entailed the manual analysis of very small sets of randomly selected metadata records from a variety of collections or projects. The second stage involved the statistical, aggregate analysis of much larger sets of sample records, taken from five large collections from widely varying regions, including the European Union, Canada, and China. The findings of both stages of the survey were consistent and mutually reinforcing. Only general findings and conclusions are reported in this paper; more detailed survey data and analysis are publicly available in the original survey reports, as submitted to the ISO/IEC committee (Friesen, Nirhamo, & Knoppers, 2003; Friesen, 2004).

Survey Questions

The survey of LOM implementation was guided by three specific questions. Each question relates to the data elements of the LOM, and to the way each element is understood and used (or alternatively, not used). These questions --along with contextualizing explanations-- are provided here:

  1. "Which elements are being designated for use in LOM implementations?" As a first step in implementing the LOM, organizations, projects, consortia and even national entities will frequently designate a particular set of LOM elements for use in their respective domains. Such "localized" sets of elements are called "application profiles," and these profiles are often created as a matter of policy, in a process separate from technical implementation: elements are explicitly recommended, required, or excluded from use. These policies are often applied to both e-learning content development and the creation of infrastructures to support the exchange of this content. An application profile can include custom elements or element "extensions," adding new elements to the 76 already in the LOM. More often, however, profiles select a subset of LOM elements, reducing the number of elements in use --often by as much as one-half.
  2. "Which elements are actually used in metadata records?" Regardless of the elements required, recommended or excluded in application profiles and policy documents, those that are actually used provide additional information about element utility and metadata requirements: Of those elements actually populated, some may be utilized consistently, and may be repeated in order to be assigned a range of appropriate values. Others may appear only once, and be assigned an apparently arbitrary value.
  3. "What values are assigned to these elements?" Finally, when elements are used, it is important to see how they are actually applied to the needs of individual projects and resources. Quantifying the "kinds" of values assigned to elements can be difficult in some instances; but those elements with "controlled vocabularies" and value sets that are otherwise constrained (e.g. through the use of vCard) can be analyzed quite readily.
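The first two questions amount to comparing what a profile designates with what records actually populate. The following Python fragment is a minimal sketch of that comparison, not the procedure used in the survey: the profile, the records, the dotted element names and the function names `usage_by_record` and `repetitions` are all hypothetical and chosen only for illustration.

```python
from collections import Counter

# Hypothetical application profile: element names designated for use.
profile = {"general.title", "general.description", "educational.context"}

# Hypothetical records, each reduced to the list of elements it populates
# (a repeated element appears more than once in the list).
records = [
    ["general.title", "general.keyword", "general.keyword"],
    ["general.title", "general.description"],
    ["general.title", "technical.format"],
]


def usage_by_record(records):
    """Count, per element, how many records populate it at least once."""
    counts = Counter()
    for populated in records:
        counts.update(set(populated))
    return counts


def repetitions(records):
    """Count total occurrences per element, including repeats within a record."""
    return Counter(name for populated in records for name in populated)


used = usage_by_record(records)
total = repetitions(records)

designated_but_unused = sorted(profile - set(used))
used_but_not_designated = sorted(set(used) - profile)

for name in sorted(used):
    share = used[name] / len(records)
    print(f"{name}: in {share:.0%} of records, {total[name]} occurrence(s)")
print("designated but never populated:", designated_but_unused)
print("populated but not designated:", used_but_not_designated)
```

Keeping the per-record count separate from the total occurrence count distinguishes an element that is used consistently from one that is merely repeated many times in a few records.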

Findings

The findings of the survey can take the form of answers to each of the three questions raised above:

  1. "Which elements are being designated for use in LOM implementations?" The findings of the survey show that the elements designated for use in application profiles in many cases overlap with those already designated in the smaller, simpler metadata element sets represented by Dublin Core and even RSS. In addition, the survey showed that educational elements in the LOM --those aspects of the datamodel that obviously add special value for educational applications-- are frequently not designated for mandatory use in application profiles. Given some of the findings discussed below, this raises the question as to whether the challenges and costs presented by LOM implementation are readily offset by the benefits that it is able to provide --especially compared to alternative metadata solutions such as Dublin Core.
  2. "Which elements are actually used in metadata records?" The answer is essentially the same as in the first question --with some qualifying details to be found in the survey data. The elements actually populated in the metadata records studied can be characterized as focusing on the intellectual content of the resource. Many of these elements have rough or exact equivalents in the Dublin Core Metadata element set. The same can be said for those elements which describe the resource in terms of its characteristics as a media and Internet file: they are well-utilized and also correspond to elements in the Dublin Core element set. Those elements which attempt to describe the resource as a software "object" or to associate with it an educational context or level are much less frequently utilized. This is reinforced by vocabulary values which are used to identify contributions to the creation of the resource: The roles of author and publisher were well-utilized (together constituting over 95% of the roles or values chosen); but roles associated with software, instructional design, or even media development (e.g. initiator, terminator, graphical designer, instructional designer) were ignored.
  3. "What values are assigned to these elements?" The answer is again in keeping with the answers to the first two questions: In many cases, elements with controlled vocabularies were assigned values that reflected traditional, even print-oriented understandings of the resource as published asset, rather than as modular software object. These include not only the roles of contributors to the object (as mentioned above), but also the many values which can be assigned to indicate the resource's technical format (45% of which were indicated as "text/html").
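Percentages such as those quoted for contributor roles and technical formats are shares of all values assigned to a given element. The sketch below shows one way such shares might be tallied in Python. The observed values are invented for illustration and do not reproduce the survey data, the vocabulary list contains only role values mentioned in this paper rather than the full LOM vocabulary, and `value_shares` is a hypothetical helper name.

```python
from collections import Counter

# Hypothetical role values harvested from a handful of records (not survey data).
observed_roles = ["author", "author", "publisher", "author", "editor"]

# A few role values mentioned in this paper; not the complete LOM vocabulary.
role_vocabulary = [
    "author", "publisher", "editor", "initiator",
    "terminator", "graphical designer", "instructional designer",
]


def value_shares(values):
    """Return each distinct value's share of all assigned values."""
    counts = Counter(values)
    n = sum(counts.values())
    return {value: count / n for value, count in counts.items()}


shares = value_shares(observed_roles)

# Vocabulary values that were available but never chosen.
never_used = [value for value in role_vocabulary if value not in shares]

for value, share in sorted(shares.items(), key=lambda item: -item[1]):
    print(f"{value}: {share:.0%}")
print("never used:", never_used)
```

Listing the vocabulary values that never occur is as informative as the shares themselves, since it makes visible which parts of a controlled vocabulary go unexploited.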

There were also a number of findings that pointed to issues other than those raised in the questions above:

  1. The process of combining metadata records from a wide variety of collections into a single "collection" for aggregate analysis itself produced a surprising result: It was discovered that it is very difficult or (given limited resources) impossible to readily import these various records into a single database and then to perform database queries to discover divergent and common characteristics. This seems to have been the case in other, more limited survey efforts as well (e.g. Najjar, Ternier, & Duval, 2003). LOM structures, with their hierarchical and iterative interrelationships, make data portability difficult to realize using conventional and low-cost technologies. Data portability and reuse are presumably the raison d'être of the LOM. The difficulties that the LOM presents to educational implementations in this regard are not at all positive indicators for the hope of increased sharing and reuse between implementations and across jurisdictions.
  2. Very little of the complexity and detail that vCard information can supply about contributors is actually exploited (almost 90% of the vCard fields are unused in all instances studied). Any advantage that the inclusion of vCard might present in LOM records is far outweighed by the difficulties of implementation, and the under-utilization of vCard fields.
  3. Finally, only a small number of the potential element iterations and vocabulary values were used overall. Given the difficulties that these nested iterations and vocabulary choices can present to systems developers and record creators, the fact that so few are used is cause for some concern.
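The portability problem noted in the first of these findings can be illustrated with a small sketch. The Python fragment below loads one simplified, hypothetical record into a single flat table using the standard `sqlite3` module. It is only one possible approach under stated assumptions, not the method used in the survey: the record, the table layout, the `position` column and the `flatten` helper are all introduced here for illustration.

```python
import sqlite3

# Hypothetical, simplified record: dicts are containers, lists are repetitions.
record = {
    "general": {"title": "Sample resource", "keyword": ["metadata", "survey"]},
    "lifeCycle": {"contribute": [
        {"role": "author", "entity": ["A. Example", "B. Example"]},
        {"role": "publisher", "entity": ["Example Press"]},
    ]},
}


def flatten(node, path="", position=()):
    """Yield (path, position, value); position records the repetition indexes."""
    if isinstance(node, dict):
        for name, child in node.items():
            yield from flatten(child, f"{path}/{name}", position)
    elif isinstance(node, list):
        for i, child in enumerate(node):
            yield from flatten(child, path, position + (i,))
    else:
        yield path, ".".join(map(str, position)), node


conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE element (record_id INTEGER, path TEXT, position TEXT, value TEXT)"
)
rows = [(1, path, pos, value) for path, pos, value in flatten(record)]
conn.executemany("INSERT INTO element VALUES (?, ?, ?, ?)", rows)

# Finding the entities that belong to the 'author' contribution requires a
# self-join on the position prefix, because nesting is lost in a flat table.
authors = [row[0] for row in conn.execute(
    """
    SELECT e.value
    FROM element AS r
    JOIN element AS e
      ON e.record_id = r.record_id AND e.position LIKE r.position || '.%'
    WHERE r.path = '/lifeCycle/contribute/role' AND r.value = 'author'
      AND e.path = '/lifeCycle/contribute/entity'
    ORDER BY e.position
    """
)]
print(authors)
```

Even a question as simple as "who are the authors?" needs a self-join and string matching on repetition indexes once the record has been flattened, which suggests why aggregate queries across heterogeneous collections are costly with conventional database tools.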

Conclusion

What does all of this mean for learning objects and for the many projects and initiatives where learning object metadata is being implemented? First, on a positive note, the survey findings show that there is considerable convergence between implementations in element choice and utilization. Implementers have consistently opted for the use of roughly the same sub-set of elements, focusing on the description of the intellectual content of the resource. However, the fact that these same elements are also included in other, simpler metadata solutions raises an important question: Namely, "what is the value added by the multiplicity and complexity of elements and element structures in the LOM?" The fact that a range of elements and many of the possible element iterations in the LOM remain unused means that their value is not being realized. At the same time, the price paid for this complexity and multiplicity, in terms of implementation work and data portability issues, is not a small one. All of this suggests that a very considerable return on investment will be required for profit ultimately to accrue to learners and end-users.


References

Friesen, N. (2004). Final Report on the "International LOM Survey." http://dlist.sir.arizona.edu/archive/00000403/

Friesen, N., Nirhamo, L., & Knoppers, J. (2003). International Survey of Learning Object Metadata Implementation for ISO/IEC. http://mdlet.jtc1sc36.org/doc/SC36_WG4_N0029.pdf

Najjar, J., Ternier, S., & Duval, E. (2003). The Actual Use of Metadata in ARIADNE: An Empirical Analysis. Proceedings of the 3rd Annual Ariadne Conference. http://www.cs.kuleuven.ac.be/~najjar/papers/EmpiricalAnalysis_ARIADNE2003.pdf