State-of-the-art situation
User-generated content has become almost commonplace: more and more people, young and old, create video content and publish it on the Web. Tomorrow, people will communicate on the Web with video and audio just as they do today with text.
Professionals and non-professionals need each other. They have to share documents, and they increasingly need to learn how to work together.
These important changes are a source of creativity that has to be sustained and encouraged.
This section therefore surveys the state of the art for topics useful to the design of a processing model for audiovisual documents, a model that conveys the administration, description and rendering data of each document.
OAIS
The value of digital audiovisual documents also clearly depends on their persistence and their accessibility in evidence. The MediaMap project intends to obtain these properties “by initial construction” (ex ante) rather than “ex post”. The general model selected for the project is the OAIS model (see http://nost.gsfc.nasa.gov/isoas/).
The OAIS model recognises three key “packages”, i.e. data wrapped (logically or physically) together with the preservation data required for the “Digital Audio Visual Document” embedded in the package:
- The DIP: Dissemination Information Package
- The AIP: Archival Information Package
- The SIP: Submission Information Package
The OAIS general schema is presented below.

The OAIS Reference Model (ISO Standard 14721) was developed by the Consultative Committee for Space Data Systems (CCSDS) as a work item under the ISO Technical Committee 20, Sub-committee 13. It is a framework for understanding and applying concepts needed for long-term digital information preservation (where long-term is long enough to be concerned about changing technologies). It is also a starting point for a model addressing non-digital information.
MediaMap constructs SIPs and exploits DIPs. The model specifies packages in a very general and abstract way, which makes the “Ingest” and “Access” modules highly complex.
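The flow between the three packages can be sketched in a few lines of Python. This is a minimal illustration, not the OAIS specification or the MediaMap implementation: the class fields and the `ingest` and `access` functions are hypothetical names chosen for this example.

```python
from dataclasses import dataclass, field


@dataclass
class SIP:
    """Submission Information Package: what a producer hands to the archive."""
    content: bytes
    descriptive_info: dict


@dataclass
class AIP:
    """Archival Information Package: what the archive preserves."""
    content: bytes
    descriptive_info: dict
    preservation_info: dict = field(default_factory=dict)


@dataclass
class DIP:
    """Dissemination Information Package: what a consumer receives."""
    content: bytes
    descriptive_info: dict


def ingest(sip: SIP, archive_id: str) -> AIP:
    # Hypothetical "Ingest": wrap the submitted data with preservation data.
    preservation = {"archive_id": archive_id, "size_bytes": len(sip.content)}
    return AIP(sip.content, dict(sip.descriptive_info), preservation)


def access(aip: AIP, wanted_fields: list) -> DIP:
    # Hypothetical "Access": expose only the requested descriptive fields.
    selected = {k: v for k, v in aip.descriptive_info.items() if k in wanted_fields}
    return DIP(aip.content, selected)


sip = SIP(b"video-bytes", {"title": "Interview", "author": "A. Producer"})
aip = ingest(sip, archive_id="aip-0001")
dip = access(aip, ["title"])
```

In this sketch the AIP is the only package that carries preservation data, while the DIP is derived on demand and carries only what the consumer asked for.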
MEMNON (coordinator of the MEMORIES project FP6-035300) brings to MediaMap its experience of the AXIS dynamics, an implementation of OAIS which, in particular, constructs the SIPs in such a way that the complexity of the “Ingest” can be largely reduced and organised in a modular and tailorable way (the “Interoperability Wickets”). That complexity is illustrated by the analysis in http://public.ccsds.org/publications/archive/651x0b1.pdf
The construction of the SIP in AXIS is based on:
- A data model backbone (called AXIS-Backbone) including a facility for wrapping the Digital Audio-Visual Document and the Profiles defining the identification system(s), the entities, the structures, the formats, the ontology and the associated semantic definition.
- The construction of the SIPs, supported by three main orthogonalities:
- Data vs. Carrier: as soon as a carrier technology, applied in a profile used in an AIP, starts to become obsolete, a technology watch can raise a warning, so that the data can be transferred automatically to a different carrier without changing the data format.
- Substance vs. Data: as soon as a data representation technology, applied in a profile used in an AIP, starts to become obsolete, a technology watch can raise a warning, so that the data can be transcoded automatically to another format known to the profile, in such a way that the new substance appears at least as good as the old one. This transcoding is done without loss of subjective quality.
- Logical vs. Physical: the boundaries of the carriers do not interfere with the boundaries of the data representing a logical entity (opus, clip, …).
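The three orthogonalities can be pictured as three independent fields of one record, each of which can change without touching the others. The sketch below is an illustration only: the `StoredEntity` record, its field names and the two migration functions are hypothetical and do not describe the actual AXIS data model.

```python
from dataclasses import dataclass, replace


@dataclass(frozen=True)
class StoredEntity:
    logical_id: str      # logical entity (opus, clip, ...)
    data_format: str     # data representation, e.g. a codec or container
    carriers: tuple      # physical carriers holding the data


def migrate_carrier(entity, obsolete, new):
    # Data vs. Carrier: swap the carrier, keep the data format unchanged.
    carriers = tuple(new if c == obsolete else c for c in entity.carriers)
    return replace(entity, carriers=carriers)


def transcode(entity, new_format):
    # Substance vs. Data: change the format, keep the carriers and identity.
    return replace(entity, data_format=new_format)


# Logical vs. Physical: one logical clip may span several carriers.
clip = StoredEntity("clip-42", "format-A", ("tape-1", "tape-2"))
moved = migrate_carrier(clip, "tape-1", "disk-7")
converted = transcode(moved, "format-B")
```

Because the record is immutable, each migration returns a new value and the logical identifier survives both operations unchanged.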
The main new challenges of the MediaMap project with respect to persistence, tailorability and accessibility in evidence will be:
- To demonstrate that the main source of updates providing Descriptive Information for the AIPs is a Community of users and not an Administration.
- To construct the Dissemination Information Packages with the same properties as those that AXIS gives to the SIP. The DIP, however, includes the modelling of the access representation, which is not required in the SIP. This new dimension is a real challenge. Fortunately, Belgavox and (in part) MEMNON have access to the results of the aceMedia project http://www.acemedia.org/aceMedia (FP6-001765).
In this functional model, a ‘Collection Profile’ defines how to bundle several standards (e.g. MPEG-4, METS, RDF, Dublin Core, SMIL, ID3, XML, MPEG-21, UDF, PDF) to construct representations of the information with specific properties in mind.
The ‘Autonomous Assets Entities’ (AAE) are instances of ‘Collection Profiles’ defined by standards.
The AAE concept is central to managing persistence and flexibility in exploitation. AAEs are the elements, containing media, metadata and information on the context, that circulate along the bus.
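The relation between a Collection Profile and its AAE instances can be sketched as follows. This is a hedged illustration: the class names, the idea of assigning each standard a "role", and the `missing_roles` helper are assumptions made for this example and are not taken from the MediaMap design.

```python
from dataclasses import dataclass, field


@dataclass
class CollectionProfile:
    name: str
    standards: dict  # hypothetical: role in the bundle -> standard used


@dataclass
class AutonomousAssetEntity:
    profile: CollectionProfile
    media: bytes
    metadata: dict = field(default_factory=dict)  # role -> metadata record
    context: dict = field(default_factory=dict)

    def missing_roles(self):
        # Roles the profile declares (other than the media itself)
        # for which this entity carries no metadata yet.
        return sorted(set(self.profile.standards) - set(self.metadata) - {"media"})


profile = CollectionProfile(
    "interview-clip",
    {"media": "MPEG-4", "description": "Dublin Core", "presentation": "SMIL"},
)
aae = AutonomousAssetEntity(
    profile,
    media=b"...",
    metadata={"description": {"title": "Interview"}},
    context={"project": "MediaMap"},
)
missing = aae.missing_roles()
```

Here the entity carries its media, its metadata and its context together, and the profile tells a receiving component which parts are still expected.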
Open Semantic Bus

Technologies that originated with the WWW will increasingly influence how information is handled in computerised environments, across most aspects of modern economies and societies. Today, however, the main tools supporting information retrieval are still keyword-based search engines, which have serious limitations in terms of recall, precision, and the handling of content spread across several web pages. Machines capture and manipulate web content only at the syntactic level (HTML). The HyperText Markup Language relies on a set of predefined tags that control the appearance of a web page (font, line breaks, hyperlinks, etc.). Though computerised, it is still a “Gutenberg” tool!
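Recall and precision, the two measures mentioned above, are simple ratios. The sketch below computes them for one hypothetical keyword query; the document identifiers are invented for illustration.

```python
def precision_recall(retrieved, relevant):
    """Return (precision, recall) for one query.

    precision = relevant documents retrieved / all documents retrieved
    recall    = relevant documents retrieved / all relevant documents
    """
    retrieved, relevant = set(retrieved), set(relevant)
    hits = len(retrieved & relevant)
    precision = hits / len(retrieved) if retrieved else 0.0
    recall = hits / len(relevant) if relevant else 0.0
    return precision, recall


# A keyword engine returns four pages, only two of which are relevant,
# and misses three relevant pages that use different wording.
retrieved = ["d1", "d2", "d3", "d4"]
relevant = ["d1", "d2", "d5", "d6", "d7"]
precision, recall = precision_recall(retrieved, relevant)
```

In this example half of the returned pages are noise, and more than half of the relevant pages are never found.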
The central idea of the Semantic Web initiative is to make the meaning of web content machine-accessible and machine-processable. This enables the development of sophisticated tools that provide much greater functionality in supporting human activities on the Web. The Semantic Web relies on the combination of the following technologies:
- Explicit metadata: They allow web pages to carry their meaning on their sleeves,
- Ontologies: They describe the main concepts of a domain and their relationships,
- Logical reasoning: It makes it possible to draw conclusions by combining data with ontologies.
Semantic web

XML allows users to define the structure of a web page, which thus becomes machine-processable, and it separates content from formatting, a useful property for deriving different appearances and structures from the same content. In the design of the Semantic Web, XML provides the basic layer for syntactic manipulation.
RDF defines a layer residing on top of XML. It is a semantic language for describing resources, and it provides the means of handling the meaning of data. Its basic building block is a statement, a triple consisting of an Entity (called a resource), a Property, and a Value (which may be another resource). RDF statements are commonly serialised using an XML vocabulary (RDF/XML).
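The triple structure can be illustrated without any RDF library. In the minimal sketch below, statements are plain Python tuples and a small `match` helper (a hypothetical name, not part of any RDF toolkit) retrieves them by pattern; the `ex:` identifiers are invented for the example.

```python
# Each statement is a (resource, property, value) triple.
triples = [
    ("ex:clip42", "ex:title", "Interview with the mayor"),
    ("ex:clip42", "ex:creator", "ex:alice"),   # the value is another resource
    ("ex:alice", "ex:name", "Alice"),
]


def match(triples, s=None, p=None, o=None):
    """Return the triples matching a pattern; None acts as a wildcard."""
    return [
        t for t in triples
        if (s is None or t[0] == s)
        and (p is None or t[1] == p)
        and (o is None or t[2] == o)
    ]


# Follow a resource-valued property: find the name of the clip's creator.
creator = match(triples, s="ex:clip42", p="ex:creator")[0][2]
creator_name = match(triples, s=creator, p="ex:name")[0][2]
```

Because a value may itself be a resource, statements chain into a graph, which is what the two-step lookup at the end traverses.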
In RDF, users have to define their own terminology in a schema language called RDF Schema. In essence, RDF Schema is a primitive ontology language offering subclass and subproperty relations, together with domain and range restrictions on properties. RDF and RDF Schema provide the basic core languages for the Semantic Web.
However, the expressive power of RDF and RDF Schema is deliberately very limited: RDF is (roughly) limited to binary ground predicates, and RDF Schema is (roughly) limited to subclass and subproperty hierarchies, with domain and range restrictions on properties.
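What RDF Schema can express is enough for simple inferences. The sketch below derives class membership from domain, range and subclass declarations. It is a simplified illustration in plain Python, not a complete RDFS entailment procedure (subproperties are left out), and all `ex:` names are hypothetical.

```python
def superclasses(cls, subclass_of):
    """All classes reachable from cls through subclass links (transitive)."""
    found, todo = set(), [cls]
    while todo:
        for parent in subclass_of.get(todo.pop(), ()):
            if parent not in found:
                found.add(parent)
                todo.append(parent)
    return found


def infer_types(triples, subclass_of, domain, range_):
    """Derive class membership from domain, range and subclass declarations."""
    types = {}
    for s, p, o in triples:
        if p == "rdf:type":
            types.setdefault(s, set()).add(o)
        if p in domain:      # the subject belongs to the property's domain
            types.setdefault(s, set()).add(domain[p])
        if p in range_:      # the value belongs to the property's range
            types.setdefault(o, set()).add(range_[p])
    for classes in types.values():
        for cls in list(classes):
            classes |= superclasses(cls, subclass_of)
    return types


subclass_of = {"ex:Clip": ["ex:Video"], "ex:Video": ["ex:Document"]}
domain = {"ex:directedBy": "ex:Clip"}
range_ = {"ex:directedBy": "ex:Person"}
triples = [("ex:clip42", "ex:directedBy", "ex:alice")]
types = infer_types(triples, subclass_of, domain, range_)
```

A single statement is enough here to conclude that `ex:clip42` is a clip, a video and a document, and that `ex:alice` is a person.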
Assets require a number of more expressive features and extensions, such as:
- Disjointness of classes,
- Boolean combinations of classes,
- Cardinality restrictions,
- Special characteristics of properties,
- Local scope of properties.
The new W3C OWL standard (Web Ontology Language) is layered on top of RDF and RDF Schema (RDF/S) and seeks a balance between expressive power and efficient reasoning support. Reasoning is important because it allows one to (a) check the consistency of the ontology and the knowledge, (b) check for unintended relationships between classes, and (c) automatically classify instances into classes.
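Two of the features listed above, disjointness of classes and cardinality restrictions, lend themselves to a small consistency check. The sketch below is a closed-world simplification written in plain Python; it is not an OWL reasoner and ignores OWL's open-world semantics. The function name and all `ex:` identifiers are hypothetical.

```python
def check_consistency(types, disjoint_pairs, values, max_cardinality):
    """Return a list of human-readable violations (empty if consistent).

    types:           instance -> set of classes it belongs to
    disjoint_pairs:  pairs of classes that may share no instance
    values:          (instance, property) -> list of values
    max_cardinality: property -> maximum number of values per instance
    """
    problems = []
    for instance, classes in sorted(types.items()):
        for a, b in disjoint_pairs:
            if a in classes and b in classes:
                problems.append(f"{instance}: {a} and {b} are disjoint")
    for (instance, prop), vals in sorted(values.items()):
        limit = max_cardinality.get(prop)
        if limit is not None and len(vals) > limit:
            problems.append(f"{instance}: {prop} has {len(vals)} values, max {limit}")
    return problems


types = {"ex:item1": {"ex:Video", "ex:Audio"}, "ex:item2": {"ex:Video"}}
disjoint_pairs = [("ex:Video", "ex:Audio")]
values = {("ex:item2", "ex:hasTitle"): ["Title A", "Title B"]}
max_cardinality = {"ex:hasTitle": 1}
problems = check_consistency(types, disjoint_pairs, values, max_cardinality)
```

With this data the check reports one disjointness violation for `ex:item1` and one cardinality violation for `ex:item2`, the kind of inconsistency that item (a) above refers to.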
Semantic Web technology can be applied to data integration, web services, multimedia collection indexing, device interoperability, and tools for describing analogue objects. (“The semantic web vision” by Frank van Harmelen and Grigoris Antoniou, http://www.ics-forth.gr/isl/swprimer/index2.php)
Video capture tools
As for video capture tools, no product on the market offers at the same time all the features required by MediaMap’s camera 2.0 and Mate.
Leaving ergonomic aspects aside, camera 2.0 (Mate) proposes four major enhancements: wireless networking (WiFi) for a direct link between the camcorder and the servers; metadata capture and inclusion, with an adapted keyboard; true non-linear access to video and audio data; and GPS positioning. These features must coexist with the usual camcorder features, including lenses and zoom, anti-vibration, HDD storage, etc.
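To make the metadata capture and GPS positioning features more concrete, the sketch below shows what a per-clip record sent from the camcorder to a server might look like. The field names and the JSON layout are assumptions made for this example; no record format for camera 2.0 or Mate is defined here.

```python
import json
from dataclasses import dataclass, asdict, field


@dataclass
class ClipMetadata:
    clip_id: str
    title: str                      # typed on the adapted keyboard
    keywords: list = field(default_factory=list)
    latitude: float = 0.0           # from the GPS receiver, decimal degrees
    longitude: float = 0.0
    start_seconds: float = 0.0      # offsets allow non-linear access
    end_seconds: float = 0.0

    def to_json(self):
        # Serialise the record for transfer over the wireless link.
        return json.dumps(asdict(self), sort_keys=True)


record = ClipMetadata("clip-42", "Market square", ["market", "crowd"],
                      latitude=50.8467, longitude=4.3525,
                      start_seconds=12.0, end_seconds=47.5)
payload = record.to_json()
restored = ClipMetadata(**json.loads(payload))
```

The record travels as text, so the server can index the title, keywords and position without decoding the video itself.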
Networking aspects
A few manufacturers offer a Bluetooth interface built into the camcorder, and the great majority of digital camcorders offer at least a USB interface for downloading video to a PC. Unlike mobile phones, camcorders and webcams are not really regarded as network terminals.
Metadata management
Only some recent professional camcorders manage metadata in the stream. Amateur camcorders offer only the titling and dating of clips for authoring purposes. In both cases, no keyboard is built into the camera. Some manufacturers offer interaction through a PDA or mobile phone running dedicated software.
Positioning
Some camcorders allow direct capture of global positioning, as well as X, Y, Z movements, through an external device.
What about mobile phones?
Some manufacturers offer video capture and an open operating system able to support applications such as Mate or the camera 2.0 software, including processing, character capture and network transfer. Leaving aside current processing power requirements, the limitation is due to the “Swiss Army knife” effect: a mobile phone can perform many functions, but its processing power, memory size, optical quality, light weight and small size do not allow quality video production. Nevertheless, the mobile phone already is, and will remain, an indispensable Web 2.0 capture tool.