NewsCodes: G2 Knowledge Items by IPTC

The IPTC delivers all Controlled Vocabularies (CVs, or 'schemes' in NewsML-G2 terminology) under its authority, branded as NewsCodes, on the CV server at http://cv.iptc.org/newscodes. The server makes the data available in several formats, one of which is the NewsML-G2 Knowledge Item; this page describes that format. (Find out more about all the available formats on the CV Server Guidelines page.)

Overview:

  • NewsML-G2 Knowledge Item in a Nutshell
  • Presentation of a concept
  • Presentation of a CV

NewsML-G2 Knowledge Item in a Nutshell

The Knowledge Item is one of the Items defined by NewsML-G2, an XML-based exchange format. Its purpose is to deliver one or more concepts; a concept is a real-world entity or an abstract term used for categorization.

A Knowledge Item includes these major sections:

  • A GUID and a version for this Knowledge Item in attributes of the <knowledgeItem> root XML element.
  • Metadata about the item as a whole in the <itemMeta> XML element.
  • Metadata about the content of the item as a whole in the <contentMeta> XML element.
  • Metadata about specific concepts as parts of the content in <partMeta> XML elements.
  • The set of all delivered concepts in the <conceptSet> XML element.
  • Metadata about the CV in the <schemeMeta> XML element.
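As a rough orientation, the sections listed above can be located with a few lines of Python. This is only a sketch: the skeleton document is hypothetical and heavily abbreviated (its guid and version values are invented, and it would not validate against the NewsML-G2 schema), and it assumes the NewsML-G2 namespace http://iptc.org/std/nar/2006-10-01/.

```python
import xml.etree.ElementTree as ET

# NewsML-G2 namespace (assumed here; real Knowledge Items declare it on the root).
NS = {"nar": "http://iptc.org/std/nar/2006-10-01/"}

# Hypothetical skeleton: the guid and version values are invented for illustration.
SKELETON = """
<knowledgeItem xmlns="http://iptc.org/std/nar/2006-10-01/"
               guid="urn:example:ki:demo" version="3">
  <itemMeta/>
  <contentMeta/>
  <partMeta/>
  <conceptSet>
    <concept/>
    <concept/>
  </conceptSet>
  <schemeMeta/>
</knowledgeItem>
""".strip()


def summarize(xml_text):
    """Return the GUID, the version and the names of the top-level sections."""
    root = ET.fromstring(xml_text)
    # ElementTree prefixes tag names with "{namespace}"; keep the local name only.
    sections = [child.tag.split("}", 1)[-1] for child in root]
    concepts = root.findall("nar:conceptSet/nar:concept", NS)
    return {
        "guid": root.get("guid"),
        "version": root.get("version"),
        "sections": sections,
        "concept_count": len(concepts),
    }


summary = summarize(SKELETON)
print(summary)
```

The same function could be pointed at a downloaded Knowledge Item file by reading its text first; the element names are those listed above, everything else in the skeleton is a placeholder.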

The NewsML-G2 standard provides a framework for many different types and formats of content and flexible ways to cover syntactic variants. The sections below describe how the IPTC expresses concepts and their relationships to other concepts.
A deeper introduction to the Knowledge Item and CV management is provided below; additional information can be found in the NewsML-G2 Guidelines document (chapter 12).

Presentation of a concept

Each concept is presented by a <concept> XML element which has a similar structure to that shown below. The full concept definition can be downloaded as part of the Media Topics Knowledge Item.

<concept id="medtop20000006" modified="2010-12-14T21:53:19+00:00">
  <conceptId qcode="medtop:20000006" created="2009-10-22T02:00:00+00:00" />
  <type qcode="cpnat:abstract" />
  <name xml:lang="en-GB">film festival</name>
  <name xml:lang="de">Filmfestival</name>
  <name xml:lang="ar" dir="rtl">مهرجان افلام</name>
  <name xml:lang="es">Festival de cine</name>
  <name xml:lang="fr">Festival de cinéma</name>
  <definition xml:lang="en-GB">National and international motion pictures festivals, selections, festival juries, nominations, awards etc.</definition>
  <definition xml:lang="de">Artikel über nationale und internationale Festivals der bewegten Bilder, deren Auswahl, Nomierungen, Preise, Auszeichnungen und Juries</definition>
  <broader qcode="medtop:20000005" />
  <related qcode="medtop:20000005" rel="skos:broader" />
  <related qcode="subj:01005001" rel="skos:exactMatch" />
  <related uri="http://cv.iptc.org/newscodes/mediatopic/" rel="skos:inScheme" />
  <related uri="https://www.wikidata.org/entity/Q220505" rel="skos:exactMatch" />
</concept>

Explanation of the components describing the <concept> itself:

  • concept @id attribute: an identifier which is internal to the specific Knowledge Item document.
  • concept @modified attribute: indicates the date and time of the last modification of this concept.
  • conceptId @qcode attribute: defines the URI identifying this concept, using the short QCode format. This is the unique identifier for the concept.
  • conceptId @created attribute: defines the date and time when this concept was first defined by the authority of this CV. This timestamp does not relate to the creation of the entity or categorization term represented by this concept.
  • conceptId @retired attribute: defines the date and time when this concept was marked as retired by the authority of this CV, indicating that this concept should no longer be actively used.
  • type: defines the type of the concept; for most IPTC CVs this will be the Abstract Concept type.
  • name with @xml:lang attribute: provides a name for this concept in the language defined by the tag in the xml:lang attribute.
  • definition with @xml:lang attribute: provides a natural language definition for this concept in the language defined by the tag in the xml:lang attribute.
  • note with @xml:lang attribute: provides a note about this concept in the language defined by the tag in the xml:lang attribute.
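The components above can be read with Python's standard library, as in the following minimal sketch. It parses an abbreviated copy of the film festival concept exactly as printed on this page, i.e. without a namespace declaration; in a complete Knowledge Item the elements sit in the NewsML-G2 namespace, so the lookups would need to be namespace-qualified. The function name describe is made up for this illustration.

```python
import xml.etree.ElementTree as ET

# ElementTree exposes xml:lang under the predefined XML namespace.
XML_LANG = "{http://www.w3.org/XML/1998/namespace}lang"

# Abbreviated copy of the sample concept (no namespace, as printed on this page).
CONCEPT = """
<concept id="medtop20000006" modified="2010-12-14T21:53:19+00:00">
  <conceptId qcode="medtop:20000006" created="2009-10-22T02:00:00+00:00" />
  <type qcode="cpnat:abstract" />
  <name xml:lang="en-GB">film festival</name>
  <name xml:lang="de">Filmfestival</name>
  <name xml:lang="fr">Festival de cinéma</name>
</concept>
""".strip()


def describe(concept):
    """Collect the identifying data and the per-language names of a concept."""
    concept_id = concept.find("conceptId")
    return {
        "qcode": concept_id.get("qcode"),
        "created": concept_id.get("created"),
        # None unless the CV authority has retired the concept.
        "retired": concept_id.get("retired"),
        "type": concept.find("type").get("qcode"),
        "names": {n.get(XML_LANG): n.text for n in concept.findall("name")},
    }


info = describe(ET.fromstring(CONCEPT))
print(info["qcode"], info["names"]["en-GB"])  # prints: medtop:20000006 film festival
```

Keying the names by language tag makes it straightforward to fall back to the en-GB reference name when a translation is missing.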
Explanation of the components describing relationships of this concept to other concepts: 
  • broader: identifies a concept with broader semantics. The qcode attribute defines the URI identifying the broader concept, using the short QCode format.
  • narrower: identifies a concept with narrower semantics. The qcode attribute defines the URI identifying the narrower concept, using the short QCode format.
  • related: identifies a concept and the relationship of this concept with respect to the target concept. 
    The qcode attribute defines the URI identifying the related concept, using the short QCode format. Alternatively, the uri attribute can be used to define the URI, using the full URI format. 
    The rel attribute defines the URI identifying the relationship of this concept with respect to the target concept, using the short QCode format.
The following relationships are used by the IPTC; all of them are based on the W3C's SKOS standard:
QCode             Description of the relationship
skos:exactMatch   A SKOS relationship: the target concept's semantics and the semantics of this concept have an exact match
skos:closeMatch   A SKOS relationship: the target concept's semantics and the semantics of this concept have a close match
skos:broadMatch   A SKOS relationship: the target concept's semantics and the semantics of this concept have a broad match
skos:narrowMatch  A SKOS relationship: the target concept's semantics and the semantics of this concept have a narrow match
skos:broader      A SKOS relationship: the target concept's semantics are broader than the semantics of this concept
skos:narrower     A SKOS relationship: the target concept's semantics are narrower than the semantics of this concept
skos:inScheme     A SKOS relationship: this concept is a member of the CV (scheme) identified by the URI in the @uri attribute.
Notes on the use of some of these relationships:
  • skos:broadMatch vs. skos:broader, skos:narrowMatch vs. skos:narrower: the skos:*Match relationships are used if the target concept is from a different CV, whereas skos:broader and skos:narrower are used only with concepts from the same CV.
  • skos:broader vs. <broader>, skos:narrower vs. <narrower>: these SKOS relationships and the NewsML-G2 XML elements express exactly the same relationship. To make the concepts of a Knowledge Item more convenient to parse, both syntax variants are used, e.g.:
    <broader qcode="medtop:20000005" /> <related qcode="medtop:20000005" rel="skos:broader" />
  • skos:inScheme: for IPTC NewsCodes the target URI of this relationship is identical to the Scheme URI of the NewsCodes vocabulary.
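The relationship components can be collected along the same lines. The sketch below groups the <related> elements of the sample concept by their rel QCode and checks that the <broader> element and the skos:broader relationship name the same target, as the notes above say they should. The sample is again parsed without a namespace, as printed on this page, and the helper name relationships is an assumption of this example.

```python
import xml.etree.ElementTree as ET
from collections import defaultdict

# Relationship part of the sample concept (no namespace, as printed on this page).
CONCEPT = """
<concept id="medtop20000006">
  <conceptId qcode="medtop:20000006" />
  <broader qcode="medtop:20000005" />
  <related qcode="medtop:20000005" rel="skos:broader" />
  <related qcode="subj:01005001" rel="skos:exactMatch" />
  <related uri="http://cv.iptc.org/newscodes/mediatopic/" rel="skos:inScheme" />
  <related uri="https://www.wikidata.org/entity/Q220505" rel="skos:exactMatch" />
</concept>
""".strip()


def relationships(concept):
    """Map each rel QCode to the list of its targets (QCode or full URI)."""
    by_rel = defaultdict(list)
    for related in concept.findall("related"):
        # The target is given either as a QCode or as a full URI.
        target = related.get("qcode") or related.get("uri")
        by_rel[related.get("rel")].append(target)
    return dict(by_rel)


concept = ET.fromstring(CONCEPT)
rels = relationships(concept)
broader = [b.get("qcode") for b in concept.findall("broader")]
# Both syntax variants should name the same broader concepts.
variants_agree = sorted(broader) == sorted(rels.get("skos:broader", []))
print(rels)
print(variants_agree)
```

A consumer that only needs the hierarchy can read either variant; comparing the two, as done here, is merely a cheap consistency check.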

Presentation of facets of a concept

Faceting is a feature that was applied to concepts long before the Semantic Web: facets narrow down the semantics of a concept, and in many cases the same facet can be applied to many concepts, e.g. the age and colour of things, the gender of persons, the size of a team, or the distance of a sport competition.
The IPTC uses this feature for the sport competition disciplines of the Media Topics (http://cv.iptc.org/newscodes/mediatopic/).

This is how the facet of a concept is presented:

  • Prerequisite: one or more facets are defined; for sport competition disciplines see http://cv.iptc.org/newscodes/asportfacet
  • Any sport competition Media Topic may use one or more facets corresponding to its kind of competition; see, for example, "alpine skiing" and its ikos:hasFacet relationship at http://cv.iptc.org/newscodes/mediatopic/20001057
  • Facets related to a Media Topic are inherited by its narrower terms. See the "competition discipline" Media Topic (http://cv.iptc.org/newscodes/mediatopic/20000822) and its many facets: any of them can be used with any of its narrower terms, i.e. the individual disciplines.
  • A facet may be related to one or more concepts that serve as the possible values of the facet; see, for example, "alpine skiing type" and its ikos:hasObject relationship at http://cv.iptc.org/newscodes/asportfacet/alpineskiingtype. One of these values should be selected when faceting a Media Topic.
  • A facet may also have a literal value, e.g. for team size, the distance of the competition, or age class.

Example 1: A women's alpine skiing slalom competition is to be expressed.

Example 2: A men's 10,000m long-distance running competition is to be expressed.

The following relationships are used by the IPTC to present faceted concepts; see the IKOS vocabulary at http://cv.iptc.org/newscodes/ikos/
QCode            Description of the relationship
ikos:hasFacet    This concept has the target concept as a facet. (This relationship is used by the faceted concept, e.g. Media Topic)
ikos:isFacetOf   This concept is a facet of the target concept. (This relationship is used by a facet.)
ikos:hasObject   The target concept may be used as object of an RDF triple with this concept as predicate. This could be understood as this facet concept has the target concept as value. (This relationship is used by a facet.)
ikos:isObjectOf  This concept may be used as object of an RDF triple with the target concept as predicate. This could be understood as this concept is the value of the target facet concept. (This relationship is used by a facet object/value.)
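The inheritance rule for facets (a Media Topic may use its own ikos:hasFacet targets plus those of its broader terms) can be sketched as a walk up the hierarchy. Everything in the data below is an assumption made for illustration: the "asportfacet" alias, the gender facet code, and the idea that alpine skiing sits directly below competition discipline are not taken from the CV, and each concept is given at most one broader term for simplicity.

```python
# Assumed, simplified extract of the relationships described above.
BROADER = {
    # alpine skiing -> competition discipline (direct parent is an assumption)
    "medtop:20001057": "medtop:20000822",
}
HAS_FACET = {
    "medtop:20000822": ["asportfacet:gender"],            # hypothetical facet code
    "medtop:20001057": ["asportfacet:alpineskiingtype"],  # alias is an assumption
}


def applicable_facets(qcode):
    """Collect the facets of a concept and of all its broader concepts."""
    facets, seen = [], set()
    # Walk up the hierarchy; 'seen' guards against accidental cycles.
    while qcode is not None and qcode not in seen:
        seen.add(qcode)
        for facet in HAS_FACET.get(qcode, []):
            if facet not in facets:
                facets.append(facet)
        qcode = BROADER.get(qcode)
    return facets


facets = applicable_facets("medtop:20001057")
print(facets)  # prints: ['asportfacet:alpineskiingtype', 'asportfacet:gender']
```

In a real application the two dictionaries would be filled from the broader and ikos:hasFacet relationships found in the Knowledge Item rather than written by hand.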
 
 

Presentation of data about the CV

Data about the CV provided by this Knowledge Item is presented in a <schemeMeta> XML element like the following:

<schemeMeta uri="http://cv.iptc.org/newscodes/mediatopic/" authority="http://www.iptc.org" preferredalias="medtop">
  <definition xml:lang="en-GB">Indicates a subject of an item.</definition>
  <name xml:lang="en-GB">Media Topic</name>
  <note xml:lang="en-GB">The Media Topic NewsCodes is IPTC's new (as of December 2010) 1100-term taxonomy with a focus on text. The development started with the Subject Codes and extended the tree to 5 levels and reused the same 17 top level terms. The terms below the top level have been revised and rearranged. Each Media Topic provides a mapping back to one of the Subject Codes.</note>
  <related qcode="medtop:01000000" rel="skos:hasTopConcept" />
  <related qcode="medtop:02000000" rel="skos:hasTopConcept" />
  <!-- … and more … -->
  <related qcode="medtop:17000000" rel="skos:hasTopConcept" />
  <schemeMetaExtProperty rel="ikos:availLang" value="ar" />
  <schemeMetaExtProperty rel="ikos:availLang" value="fr" />
  <schemeMetaExtProperty rel="ikos:availLang" value="en-GB" />
  <schemeMetaExtProperty rel="ikos:availLang" value="es" />
  <schemeMetaExtProperty rel="ikos:availLang" value="de" />
</schemeMeta>

Explanation of the components of <schemeMeta>:

  • schemeMeta @uri attribute: a URI identifying this CV.
  • schemeMeta @authority attribute: a URI identifying the authority of this CV.
  • schemeMeta @preferredalias attribute: indicates the string value preferred by the CV authority for the scheme alias to be used for this CV.
  • name with @xml:lang attribute: provides a name for this CV in the language defined by the tag in the xml:lang attribute.
  • definition with @xml:lang attribute: provides a natural language definition for this CV in the language defined by the tag in the xml:lang attribute.
  • note with @xml:lang attribute: provides a note about this CV in the language defined by the tag in the xml:lang attribute.
  • related with a rel="skos:hasTopConcept" attribute: the qcode attribute identifies a concept residing at the top level of this CV's hierarchy.
  • schemeMetaExtProperty with a rel="ikos:availLang" attribute: each of these elements represents one of the languages into which the free-text names, definitions, notes etc. have been translated from the British English ("en-GB") reference version.
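As a minimal sketch of how these components might be used, the following Python reads an abbreviated copy of the <schemeMeta> sample, expands the top-concept QCodes into full URIs by appending the code to the scheme URI, and lists the available languages. It assumes that the QCodes use the preferred alias; in general the alias-to-URI mapping comes from the catalog of the item, and the helper name expand_qcode is made up for this example. The sample is parsed without a namespace, as printed on this page.

```python
import xml.etree.ElementTree as ET

# Abbreviated copy of the schemeMeta sample (no namespace, as printed on this page).
SCHEME_META = """
<schemeMeta uri="http://cv.iptc.org/newscodes/mediatopic/"
            authority="http://www.iptc.org" preferredalias="medtop">
  <name xml:lang="en-GB">Media Topic</name>
  <related qcode="medtop:01000000" rel="skos:hasTopConcept" />
  <related qcode="medtop:17000000" rel="skos:hasTopConcept" />
  <schemeMetaExtProperty rel="ikos:availLang" value="en-GB" />
  <schemeMetaExtProperty rel="ikos:availLang" value="de" />
</schemeMeta>
""".strip()


def expand_qcode(qcode, aliases):
    """Turn 'alias:code' into a full URI by appending the code to the scheme URI."""
    alias, code = qcode.split(":", 1)
    return aliases[alias] + code


scheme = ET.fromstring(SCHEME_META)
# Assumption: the preferred alias is the one actually used in the QCodes.
aliases = {scheme.get("preferredalias"): scheme.get("uri")}

top_concepts = [
    expand_qcode(r.get("qcode"), aliases)
    for r in scheme.findall("related")
    if r.get("rel") == "skos:hasTopConcept"
]
languages = [
    p.get("value")
    for p in scheme.findall("schemeMetaExtProperty")
    if p.get("rel") == "ikos:availLang"
]
print(top_concepts)
print(languages)
```

The resulting URIs have the same shape as the concept URIs cited elsewhere on this page, for example http://cv.iptc.org/newscodes/mediatopic/20001057.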
 