Wikivoyage:RDF

Wikivoyage's guides and other articles have great user-contributed text and pictures meant for human beings to read and understand. Our wiki technique lets us describe people, places, languages, attractions, and thousands of other kinds of things with free prose and pictures. But there are also descriptions and relationships between things in Wikivoyage that can be standardized so that computers can process them as well. For example, we can define whether an article is about a country or a city. If it is a city, we can define what country it is in. To make these standardized machine-readable descriptions, we use a tool called the Resource Description Framework, or RDF.

So what is RDF?

RDF is a way of recording standardised, machine-readable information and relationships directly in the text of an article. There is no need for any form of external database. To make the system more human-friendly, the RDF statements are usually contained within templates, which are then incorporated into Wikivoyage articles.

So, RDF is a framework for making statements about resources. In Wikivoyage a resource can in theory be anything: a destination, an attraction, a picture, an article, anything that can have a name.

Statements are in the form:

resource predicate object

A predicate is the name of a property of a resource, like its size, location, history, licence restrictions, or a relationship to other "resources". The object is the value.

So "car color red" would be a statement about a car; "Sydney isPartOf New_South_Wales" would be a statement about Sydney and its relationship with New South Wales.
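As a rough illustration of the "resource predicate object" form, the two statements above can be modelled as plain three-part records. This is only a sketch: the names Statement, statements and objects_of are invented for this example and are not part of RDF or of Wikivoyage.

```python
from typing import NamedTuple


class Statement(NamedTuple):
    """One RDF-style statement: a resource, a predicate, and an object."""
    resource: str
    predicate: str
    obj: str


# The two example statements from the text, using the same informal names.
statements = [
    Statement("car", "color", "red"),
    Statement("Sydney", "isPartOf", "New_South_Wales"),
]


def objects_of(resource: str, predicate: str) -> list[str]:
    """Return every object stated for the given resource and predicate."""
    return [
        s.obj
        for s in statements
        if s.resource == resource and s.predicate == predicate
    ]


print(objects_of("Sydney", "isPartOf"))  # ['New_South_Wales']
```

Because every statement has the same three-part shape, a program can answer questions such as "what is Sydney part of?" without understanding any prose.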

Resources and properties must be uniquely identified. Humans may be able to guess that we are referring to Sydney, Australia, but computers aren't so clever. In the Web world, each "resource" must be identified with a URI (usually a URL). So, we would identify Sydney as "http://en.wikipedia.org/wiki/Sydney". For more abstract resources this can be trickier, but usually it is just a matter of assigning them a unique identifier within the Wikivoyage namespace. For example, to identify User:(WT-en) Evan, you could use the URI for his Wikivoyage user page, "http://en.wikivoyage.org/wiki/User:Evan".

Schemas

People who need to agree on statements often create 'vocabularies' or 'schemas'. By agreeing on a schema, we can all agree about what a particular predicate means. Without an agreed schema, people may invent different predicates to mean the same thing, and it all becomes terribly confusing. Think of the schema as being a glossary or dictionary for the predicates that we use.

For example, the Dublin Core Metadata Initiative (DCMI) has a schema for very simple information, such as you'd find on a library card. Because we already know that we need a unique name for all of our resources and predicates, all predicates defined by the DCMI are prefixed with "dcterms:".

The DCMI defines an isPartOf predicate, which expresses the idea of something being part of something else.

So, returning to our above example, we now have the RDF relationship

http://en.wikivoyage.org/wiki/Sydney dcterms:isPartOf http://en.wikivoyage.org/wiki/New_South_Wales
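The "dcterms:" prefix is shorthand for a longer namespace URI, so that the predicate itself has a unique identifier just as the resources do. The sketch below shows one way the expansion could work. The namespace "http://purl.org/dc/terms/" is the one published for DCMI Metadata Terms; the PREFIXES table and the expand function are names made up for this example.

```python
# Namespace prefixes mapped to the URI each one abbreviates.
# "http://purl.org/dc/terms/" is the DCMI Metadata Terms namespace.
PREFIXES = {
    "dcterms": "http://purl.org/dc/terms/",
}


def expand(name: str) -> str:
    """Expand a prefixed name such as 'dcterms:isPartOf' into a full URI.

    Names whose prefix is not in PREFIXES (including full URIs such as
    'http://...') are returned unchanged.
    """
    prefix, sep, local = name.partition(":")
    if sep and prefix in PREFIXES:
        return PREFIXES[prefix] + local
    return name


triple = (
    "http://en.wikivoyage.org/wiki/Sydney",
    "dcterms:isPartOf",
    "http://en.wikivoyage.org/wiki/New_South_Wales",
)
expanded = tuple(expand(term) for term in triple)
print(expanded[1])  # http://purl.org/dc/terms/isPartOf
```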

isPartOf is only one of the many predicates available in the DCMI. Others that Wikivoyage uses include contributor and date.

The DCMI isn't the only schema useful for Wikivoyage. Schemas like the Creative Commons schema may also be useful for specifying licensing information.

We also have our own schema, which contains predicates for information particular to Wikivoyage. For example, the predicate wts:hasDocent records the fact that an article has a docent.

RDF statements can be encoded in a number of different ways, but on Wikivoyage we use a format called "Turtle RDF", which is a simple, human-readable way of writing RDF relationships.
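To give a feel for what Turtle looks like, the following sketch writes the Sydney statement out as a small Turtle document: a prefix declaration, then one statement ending in a full stop. The function to_turtle is invented for this illustration and handles only the simplest case, where the subject and object are full URIs and the predicate is a prefixed name. It is not how Wikivoyage itself produces Turtle.

```python
def to_turtle(triples, prefixes):
    """Render (subject, predicate, object) triples as a small Turtle document.

    Subjects and objects are full URIs and are written in angle brackets.
    Predicates are prefixed names whose prefixes are declared in `prefixes`.
    """
    lines = [f"@prefix {name}: <{uri}> ." for name, uri in prefixes.items()]
    lines.append("")  # blank line between declarations and statements
    for subject, predicate, obj in triples:
        lines.append(f"<{subject}> {predicate} <{obj}> .")
    return "\n".join(lines)


document = to_turtle(
    [(
        "http://en.wikivoyage.org/wiki/Sydney",
        "dcterms:isPartOf",
        "http://en.wikivoyage.org/wiki/New_South_Wales",
    )],
    {"dcterms": "http://purl.org/dc/terms/"},
)
print(document)
```

The printed result reads much like the plain statement: subject, predicate, object, full stop.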

What isn't RDF

RDF isn't a programming language. It isn't a way to make things happen, or to implement programming logic. It isn't something that requires experience in programming to understand. The more information we encode in RDF, the more automatic processing can be done with Wikivoyage information. RDF is ideally used to make information that is already standardised easily accessible, allowing applications and software developers to put our information to novel and interesting uses. However, overuse of RDF could make Wikivoyage too much like a database rather than a travel guide in free prose.

Getting RDF from Wikivoyage

Retrieving RDF statements about resources from Wikivoyage articles is simple. The Special:Rdf page lets you choose which article you're interested in, and what kinds of data to retrieve. Many interesting bits of information about pages -- history, contributors, licenses, links -- can be read in RDF-encoded format.

The RDF for a page is also linked (invisibly) from each page, in a <link> tag in the headers for the page. Although this is invisible to human readers, some browser tools and Web spiders can read and understand the encoded RDF information.
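As a sketch of how a tool might discover that link, the code below scans a page's HTML for <link> tags whose type mentions RDF. The sample HTML is made up: the rel value, the "application/rdf+xml" type and the href "/rdf/Sydney" are assumptions for illustration only, and the attributes Wikivoyage actually emits may differ. Only Python's standard html.parser module is used.

```python
from html.parser import HTMLParser


class RdfLinkFinder(HTMLParser):
    """Collect the href of every <link> tag whose type mentions RDF."""

    def __init__(self):
        super().__init__()
        self.hrefs = []

    def handle_starttag(self, tag, attrs):
        attributes = dict(attrs)
        link_type = (attributes.get("type") or "").lower()
        if tag == "link" and "rdf" in link_type:
            href = attributes.get("href")
            if href:
                self.hrefs.append(href)


# Hypothetical page header; the attribute values are illustrative only.
SAMPLE = """
<html><head>
<title>Sydney</title>
<link rel="stylesheet" type="text/css" href="/skins/main.css">
<link rel="meta" type="application/rdf+xml" href="/rdf/Sydney">
</head><body></body></html>
"""

finder = RdfLinkFinder()
finder.feed(SAMPLE)
print(finder.hrefs)  # ['/rdf/Sydney']
```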

Adding RDF to Wikivoyage pages

It is possible to add RDF statements to Wikivoyage pages through the regular editing process. RDF statements -- or groups of statements -- can just be written between <rdf> ... </rdf> tags in the source of the page. RDF statements in these blocks should be encoded in Turtle RDF format -- an easy-to-use format that mirrors English syntax.
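The sketch below shows how a script might pull those blocks back out of a page's source. The sample page text and the Venice URI in it are invented for the example, and the regular expression assumes plain, unnested <rdf> ... </rdf> tags without attributes. It is not the parser Wikivoyage itself uses.

```python
import re

# Matches the text between <rdf> and </rdf>, across multiple lines.
RDF_BLOCK = re.compile(r"<rdf>(.*?)</rdf>", re.DOTALL | re.IGNORECASE)


def rdf_blocks(wikitext: str) -> list[str]:
    """Return the contents of every <rdf> block, stripped of outer whitespace."""
    return [body.strip() for body in RDF_BLOCK.findall(wikitext)]


# Hypothetical page source with one embedded Turtle statement.
PAGE = """'''Venice''' is a city in [[Veneto]].

<rdf>
<http://en.wikivoyage.org/wiki/Venice> dcterms:isPartOf <http://en.wikivoyage.org/wiki/Veneto> .
</rdf>

==Understand==
"""

for block in rdf_blocks(PAGE):
    print(block)
```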

RDF that's stored in the page can be retrieved using the "in-page" model on the Special:Rdf page.

RDF and templates

It is better to put all RDF within MediaWiki templates, as this keeps the pages clearer and makes fixing errors easier.

The syntax for Turtle RDF might also be daunting for people who just want to tag an article with a little information, so putting the RDF within templates helps keep it simple for these people too.

  • Template:isPartOf contains RDF that says the destination described in the current guide isPartOf another place. For example, {{IsPartOf|Veneto}}, added to the Venice page, says that Venice isPartOf the Veneto region of Italy. This data is also used for breadcrumb navigation links.
  • Template:Geo contains RDF to say that the destination described by the current guide has given lat/long coordinates. This is used for geocoding.
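To illustrate the idea of a template standing in for a Turtle statement, here is a sketch that turns {{IsPartOf|...}} calls in a page's source into the statement they could represent. It is a guess at the shape of the mapping, not the real template's contents: the function name, the BASE URI pattern and the use of dcterms:isPartOf as the output predicate are assumptions based on the examples earlier on this page.

```python
import re

# Matches {{IsPartOf|Place}} and captures the place name.
TEMPLATE_CALL = re.compile(
    r"\{\{\s*IsPartOf\s*\|\s*([^}|]+?)\s*\}\}", re.IGNORECASE
)

# Assumed URI pattern for articles, following the Sydney example.
BASE = "http://en.wikivoyage.org/wiki/"


def is_part_of_statements(page_title: str, wikitext: str) -> list[str]:
    """Return one Turtle statement per IsPartOf template call on the page."""
    subject = BASE + page_title.replace(" ", "_")
    return [
        f"<{subject}> dcterms:isPartOf <{BASE}{parent.replace(' ', '_')}> ."
        for parent in TEMPLATE_CALL.findall(wikitext)
    ]


print(is_part_of_statements("Venice", "{{IsPartOf|Veneto}}"))
```

An editor only has to type the short template call; the longer, stricter Turtle syntax stays out of sight.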

See also: Project:RDF templates

Possible uses for RDF on Wikivoyage

Wikivoyage's use of RDF for in-page data is in its infancy, and uptake has been slow. There is a lot of room for experimentation. Some possible internal uses:

  • License info. Adding multi-licensing information using "cc:licence".
  • Noting that a Wikivoyage article is derived from another article on the Web (like the CIA Factbook, say) and information about who wrote it.
  • Listing related articles; articles about similar places, linking destinations with the phrasebook for the language spoken there, linking itineraries to a destination guide
  • Describing places (cities, countries, regions) and their geographical relationships to each other (nearby cities, bordering countries, ...)
  • Article status
  • Geo-spatial lat/long information, including GPS data (See Project:Geocoding)
  • Building "Bread-crumb" navigation, like "North America > Canada > Quebec > Montreal" (See Project:Breadcrumb navigation).
  • Special presentation for certain kinds of articles (travel topics, itineraries, destination guides) and destinations (cities, countries, regions)
  • Organizing "clusters" of articles automatically (say, if you wanted to download Italy and all cities and regions in that country)
  • Automatic interface with mapping sites like Google Maps or Yahoo Maps.
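Two of the uses listed above, breadcrumb navigation and article clusters, follow directly from isPartOf statements. The sketch below assumes the statements have already been collected into a simple lookup table. The table contents (including the Toronto and Ontario entries) and the function names are illustrative only.

```python
# Each key isPartOf its value. Illustrative data, not taken from live pages.
IS_PART_OF = {
    "Montreal": "Quebec",
    "Quebec": "Canada",
    "Canada": "North America",
    "Toronto": "Ontario",
    "Ontario": "Canada",
}


def breadcrumb(place: str) -> str:
    """Follow isPartOf links upwards and return a breadcrumb trail."""
    trail = [place]
    seen = {place}
    while trail[-1] in IS_PART_OF:
        parent = IS_PART_OF[trail[-1]]
        if parent in seen:  # stop if the data contains a loop
            break
        seen.add(parent)
        trail.append(parent)
    return " > ".join(reversed(trail))


def cluster(region: str) -> set[str]:
    """Return the region plus every place directly or indirectly part of it."""
    members = {region}
    changed = True
    while changed:
        changed = False
        for child, parent in IS_PART_OF.items():
            if parent in members and child not in members:
                members.add(child)
                changed = True
    return members


print(breadcrumb("Montreal"))  # North America > Canada > Quebec > Montreal
print(sorted(cluster("Canada")))
```

The same table serves both purposes: reading it upwards gives the breadcrumb, and reading it downwards gives the set of articles to download together.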

There is an RDF Expedition about the uses for RDF on Wikivoyage. Join the expedition if you are interested in more details.


See also

RDF and its possible uses is an extensive topic. These are some useful links for learning more.

