XML and bibliographic formats

Giovanna Granata

Abstract

The spread of the Internet and the growing availability of bibliographic information on the World Wide Web have recently sharpened the problem of exchanging bibliographic data. This has increased the popularity of the MARC format, which, having been designed in a different context and in a now distant era, has limits and contradictions alongside its undoubted advantages.

The main limits of the MARC format stem from the rigidity of the ISO 2709 standard, which it uses as its low-level carrier. First, the ISO 2709 structure is hard to decipher and manipulate. Second, it allows only two levels of logical structure, fields and subfields, which restricts how the data can be organised and forces the design of the high-level format to be enumerative. This second problem has deeply influenced the evolution of MARC, which over time has split into numerous national versions, many of which have grown chaotically to accommodate different cataloguing practices and different types of library material. The result has been to make the exchange of data increasingly difficult, to the point of requiring an intermediate structure that eases communication between the various formats of the same family. UNIMARC, created specifically for this purpose, has not managed to establish itself as an interchange medium, although its greater coherence has led it to replace some national variants and so to reduce the problem.
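The two-level structure described above can be made concrete with a minimal sketch of an ISO 2709-style record. The layout used here (a 24-character leader, 12-character directory entries, and the control characters 0x1F, 0x1E and 0x1D) follows the standard as commonly implemented for MARC; the function names, the sample data and the simplified leader contents are assumptions made for illustration, and the sketch handles ASCII only, so that character counts equal byte counts.

```python
# Control characters defined by ISO 2709.
SUBFIELD_DELIM = "\x1f"   # introduces each subfield code
FIELD_TERM = "\x1e"       # ends the directory and every field
RECORD_TERM = "\x1d"      # ends the record

LEADER_LEN = 24           # fixed-length leader
ENTRY_LEN = 12            # directory entry: tag(3) + field length(4) + start(5)


def build_record(fields):
    """Serialize [(tag, data), ...] into an ISO 2709-style string (ASCII only)."""
    directory, body, offset = "", "", 0
    for tag, data in fields:
        chunk = data + FIELD_TERM
        directory += f"{tag}{len(chunk):04d}{offset:05d}"
        body += chunk
        offset += len(chunk)
    base = LEADER_LEN + len(directory) + 1   # +1 for the directory terminator
    total = base + len(body) + 1             # +1 for the record terminator
    # Simplified leader: record length, illustrative status codes, indicator and
    # subfield-code lengths ("22"), base address of data, filler, entry map "4500".
    leader = f"{total:05d}" + "nam  " + "22" + f"{base:05d}" + "   " + "4500"
    return leader + directory + FIELD_TERM + body + RECORD_TERM


def parse_record(raw):
    """Return [(tag, indicators, [(code, value), ...]), ...] from a raw record."""
    base = int(raw[12:17])                        # base address of data
    directory = raw[LEADER_LEN:base - 1]          # exclude the directory terminator
    fields = []
    for i in range(0, len(directory), ENTRY_LEN):
        entry = directory[i:i + ENTRY_LEN]
        tag, length, start = entry[:3], int(entry[3:7]), int(entry[7:12])
        data = raw[base + start:base + start + length - 1]   # drop field terminator
        if SUBFIELD_DELIM in data:
            indicators, *parts = data.split(SUBFIELD_DELIM)
            subfields = [(part[0], part[1:]) for part in parts]
        else:
            # Control fields carry plain data, with no indicators or subfields.
            indicators, subfields = "", [("", data)]
        fields.append((tag, indicators, subfields))
    return fields


# Hypothetical sample: a control number and a UNIMARC-like title field.
raw = build_record([
    ("001", "rec0001"),
    ("200", "1 " + SUBFIELD_DELIM + "aXML and bibliographic formats"
            + SUBFIELD_DELIM + "fGiovanna Granata"),
])
parsed = parse_record(raw)
print(repr(raw))
print(parsed)
```

Printing `repr(raw)` shows why the structure is hard to read directly: content is located through numeric offsets in the directory and separated by non-printing characters, and nothing below the subfield level can be expressed.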

The inflexibility of the ISO 2709 standard, together with its poor suitability for the Web owing to its use of special characters, suggests replacing it with a carrier better suited to the new conditions in which information circulates and is exchanged.

Some proposals along these lines have already been put forward in the SGML sphere, though without great success, because the excessive generality of that standard makes it difficult to use. The recent spread and greater simplicity of XML suggest making similar attempts with this language, which is establishing itself as the reference for new browsers in place of HTML. Unlike HTML, XML is markedly "extensible", thanks to the possibility of creating specific DTDs, and more abstract, in that it confines itself to defining the logical structure of texts. Presentation, including print output, is delegated to a separate tool, XSL, which associates the information contained in a document, whose meaning is described in the DTD, with a particular form of display.
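The separation between logical structure and display can be sketched as follows. This is not XSL: two plain Python functions stand in for two style sheets, and the element names `record`, `title` and `author` are hypothetical, chosen only for the example.

```python
import xml.etree.ElementTree as ET
from html import escape

# The record carries logical structure only; nothing says how it should look.
RECORD_XML = """
<record>
  <title>XML and bibliographic formats</title>
  <author>Giovanna Granata</author>
</record>
"""


def render_citation(record):
    """First 'style sheet': a plain-text line with title and responsibility."""
    return f"{record.findtext('title')} / {record.findtext('author')}"


def render_html(record):
    """Second 'style sheet': an HTML fragment built from the same data."""
    return (f"<p><cite>{escape(record.findtext('title'))}</cite>, "
            f"{escape(record.findtext('author'))}</p>")


record = ET.fromstring(RECORD_XML)
print(render_citation(record))
print(render_html(record))
```

The same record feeds both renderings unchanged, which is the property the abstract attributes to the XML and XSL pairing.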

Converting a MARC record into XML brings considerable advantages: greater legibility, easier manipulation of the data and, above all, simpler procedures for converting from one format to another. Once a specific DTD has been defined for each format, XSL can be used as a powerful transformation tool for "rewriting" an XML-UNIMARC record as the corresponding XML-USMARC record, or in any other format whose structure is known. To avoid producing a style sheet for every possible combination of formats, however, conversion should pass through an intermediate structure: either UNIMARC, which was created for this purpose, or, better, the recent FRBR model used as an internal support, which by its structure and abstractness is better suited to integration with XML.
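The argument for an intermediate structure can be illustrated with a small sketch. It assumes a generic field and subfield XML layout and a deliberately tiny, hypothetical crosswalk (UNIMARC 200 $a and $f, 700 $a, mapped to USMARC 245 $a and $c, 100 $a); Python dictionaries stand in here for the XSL style sheets the abstract proposes, and the pivot is treated as a structure separate from every exchange format.

```python
import xml.etree.ElementTree as ET

# Hypothetical crosswalks: one into the pivot, one out of it.
UNIMARC_TO_PIVOT = {("200", "a"): "title",
                    ("200", "f"): "responsibility",
                    ("700", "a"): "author"}
PIVOT_TO_USMARC = {"title": ("245", "a"),
                   "responsibility": ("245", "c"),
                   "author": ("100", "a")}

UNIMARC_XML = """
<record>
  <field tag="200">
    <subfield code="a">XML and bibliographic formats</subfield>
    <subfield code="f">Giovanna Granata</subfield>
  </field>
  <field tag="700"><subfield code="a">Granata</subfield></field>
</record>
"""


def to_pivot(record_xml, crosswalk):
    """Read field/subfield XML into a list of (pivot element, value) pairs."""
    pivot = []
    for field in ET.fromstring(record_xml).iter("field"):
        for sub in field.iter("subfield"):
            key = (field.get("tag"), sub.get("code"))
            if key in crosswalk:              # unmapped subfields are skipped
                pivot.append((crosswalk[key], sub.text or ""))
    return pivot


def from_pivot(pivot, crosswalk):
    """Write pivot pairs out as field/subfield XML in the target format."""
    record = ET.Element("record")
    fields = {}
    for name, value in pivot:
        tag, code = crosswalk[name]
        if tag not in fields:
            fields[tag] = ET.SubElement(record, "field", {"tag": tag})
        ET.SubElement(fields[tag], "subfield", {"code": code}).text = value
    return ET.tostring(record, encoding="unicode")


def converters_needed(n_formats):
    """Directed pairwise converters versus converters to and from one pivot."""
    return {"pairwise": n_formats * (n_formats - 1), "via_pivot": 2 * n_formats}


usmarc_xml = from_pivot(to_pivot(UNIMARC_XML, UNIMARC_TO_PIVOT), PIVOT_TO_USMARC)
print(usmarc_xml)
print(converters_needed(5))
```

With n formats, direct conversion needs n(n-1) directed style sheets, while a separate pivot needs 2n. If the pivot is itself one of the exchange formats, as with UNIMARC, the count drops to 2(n-1).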
