October 30, 2013, by Graham Kendall
What is the Semantic Web?
In this What is? series, we invite experienced researchers to write about their area of research, but in more general terms than they might normally do.
This article, entitled What is the Semantic Web?, is written by Muhammad Abba Lawan.
If you would like to contribute, then please let me know (Graham.Kendall@nottingham.edu.my) and we can discuss how best to go about it.
Introduction:
For most people, the current World Wide Web is more than enough as a tool, but others, the lazy ones among us, need something more from the Web. That ‘something more’ goes by the name of the Semantic Web: an extension to the current Web proposed a decade ago by Tim Berners-Lee [1], referred to by many as the father of the Web. Before we explain what it is, let us first discuss why the Semantic Web is worth having.
The current Web of linked documents can present a tremendous number of human-readable web pages from a single search, through lexical matching of the search keywords against similar or identical words in the documents available on the Web. The results are vast and often irrelevant to the user (especially after the first few pages of the search results). Moreover, users have to scan through each individual result to make sense of its content. The machines that store and present web documents know nothing about their contents; as a result, operations that could have been carried out entirely by machines require user intervention and supervision.
On the other hand, a Web that stores data in documents that are both human- and machine-readable will open up a whole new experience of human-machine interaction in our lives. If we allow computers to understand Web content, a user will only need to submit basic data or answer a single query before an entire transaction is carried out efficiently by a machine. For example, one could sell a printed book online by providing only its ISBN to a web application; with the help of the linked data in the Semantic Web, the application would automatically fill in the remaining details of the book and contact appropriate auction websites to negotiate and place an auction without further user intervention. Since data, not documents, is interlinked in the Semantic Web, imagine checking your schedule, account balance, emails, chats and so on from a single web application without having to open different web pages. Sounds interesting, doesn't it?
The Semantic Web
As an extension of the classical Web, the Semantic Web is envisioned by the W3C as a web of linked data, in contrast to the web of linked documents provided by the classical Web. Semantic Web technologies allow machines to walk through web content as humans do, identifying data objects and the relationships between them, thereby enabling informed decision-making and effective collaboration between application agents. In essence, the vision of the Semantic Web is to allow machines to do more with the data on the Web than is currently possible, through a global database of linked data stored in a machine-readable format (RDF graphs, where RDF stands for Resource Description Framework), using vocabularies of concepts called ontologies, together with rules that define how to handle the data.
The Semantic Web has been successfully implemented by different organizations, and the technology has demonstrated effective support for large-scale use on the Web. One example is the BBC World Cup 2010 portal (http://news.bbc.co.uk/sport2/hi/football/world_cup_2010/), which used Semantic Web technology to serve OWL-reasoned semantic RDF data in response to millions of user page requests per day [2].
More Semantic Web success stories can be found at http://www.semantic-web.at/success-stories and http://www.w3.org/2001/sw/sweo/public/UseCases/.
Semantic Web Layers
A glimpse of the technologies that make up the Semantic Web is presented in Figure 1 below, and each layer is briefly described in terms of its role in the overall picture of the Semantic Web.
The URI/IRI Layer
At the core of the Semantic Web is the underlying ‘Uniform Resource Identifier/Internationalized Resource Identifier’ (URI/IRI) layer. A URI is a string of text used to identify any resource on the Web, and URIs are a superset of URLs. This layer provides the basic building block for the Semantic Web on top of the existing World Wide Web; coupled with the XML layer, it provides the basis for data interchange on the Semantic Web through the Resource Description Framework (RDF) layer. RDF is a graph format for encoding descriptions of web resources [3].
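To make the idea of a URI as a superset of URL more concrete, here is a small Python sketch using only the standard library. The three identifiers are hypothetical and chosen purely for illustration: one is an ordinary URL, one names a mailbox, and one names a thing that has no network location at all.

```python
from urllib.parse import urlsplit

# Hypothetical identifiers, chosen only for illustration.
identifiers = [
    "http://example.org/people/alice",  # a URL: a URI that also says where to fetch the resource
    "mailto:alice@example.org",         # a URI naming a mailbox
    "urn:example:book-42",              # a URI naming a thing with no network location
]


def describe(uri: str) -> dict:
    """Split a URI into generic parts shared by every URI scheme."""
    parts = urlsplit(uri)
    return {"scheme": parts.scheme, "authority": parts.netloc, "path": parts.path}


for uri in identifiers:
    print(uri, "->", describe(uri))
```

Only the first identifier carries an authority (a host to contact); the other two still identify something, which is all the Semantic Web needs from this layer.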
RDF Layer
Although XML has been a great success in providing a structural format for web documents, it describes how a document is structured rather than what its content means, so programs cannot interpret that content without help from humans. To allow machine-readable content, a new framework was proposed: RDF. RDF employs Uniform Resource Identifiers (URIs, also called web identifiers, a superset of URLs) to identify resources (the subjects and objects of statements) and describes them in terms of their properties (called predicates) and the values of those properties.
RDF is able to represent metadata explicitly by separating the content of web documents from their structure. The basic idea of RDF, as pointed out by the W3C editors [4], is to create machine-processable statements that represent information about web resources, particularly their metadata, so that applications can exchange this information without loss of meaning. Because it identifies things with URIs rather than Uniform Resource Locators (URLs), RDF can also be used to represent information about resources that cannot be located on or retrieved from the Web.
RDF Graphs
The subject-predicate-object form of representation used by RDF, also called a “triple”, is often drawn as a directed graph, as shown in Figure 2 below. The originating node represents the subject (the resource being described), the destination node represents the object (the value of the property), and the directed arc represents the predicate, that is, the relation between them.
The RDF graph above represents the statement that the person ‘abdur.rakib’ (the subject, identified by the URI in the subject node) has an email address (the predicate, identified by the URI shown on the arc) whose value is a mailbox (the object, identified by the mailto: URI in the object node). With this form of identification, RDF is able to identify ‘all things’ on the Web, both those that can be downloaded, such as web documents, and those that cannot be retrieved, such as a person.
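The same statement can be sketched in a few lines of plain Python. The three URIs below are hypothetical stand-ins (the identifiers used in the figure are not reproduced here), and the helper names to_graph and to_ntriples are my own. The sketch stores the statement as a triple, turns it into a directed graph, and writes it out in the line-based N-Triples syntax.

```python
from collections import defaultdict

# Hypothetical URIs: stand-ins for the identifiers shown in the figure.
PERSON = "http://example.org/staff/abdur.rakib"
HAS_EMAIL = "http://example.org/terms/email"
MAILBOX = "mailto:abdur.rakib@example.org"

# Each statement is one (subject, predicate, object) triple.
triples = [(PERSON, HAS_EMAIL, MAILBOX)]


def to_graph(statements):
    """Build a directed graph: subject node -> list of (predicate arc, object node)."""
    graph = defaultdict(list)
    for subject, predicate, obj in statements:
        graph[subject].append((predicate, obj))
    return dict(graph)


def to_ntriples(statements):
    """Serialize URI-only triples, one statement per line, in N-Triples syntax."""
    return "\n".join(f"<{s}> <{p}> <{o}> ." for s, p, o in statements)


graph = to_graph(triples)
print(graph[PERSON])
print(to_ntriples(triples))
```

Note that the subject here is a person, something that cannot be downloaded, yet the statement about that person is still perfectly well formed.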
XML and QUERY Layers:
Extensible Markup Language (XML) is a general-purpose markup language used to give classical web documents a common structure, and the Semantic Web uses it to guarantee a common syntax through its namespace and schema definitions. With all the data of the Semantic Web semantically interlinked in RDF format, a means of querying the data is necessary; this is provided by the Query layer in the form of the SPARQL Protocol and RDF Query Language (SPARQL). SPARQL is a declarative query language for Semantic Web data stored in RDF format.
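To give a feel for what ‘declarative’ means here, the sketch below imitates the simplest kind of SPARQL query, a single triple pattern, in plain Python. It is not a SPARQL engine: the match function, the data and the example.org URIs are all assumptions made for illustration. The equivalent SPARQL query is shown in the comment.

```python
# Roughly equivalent SPARQL (for comparison only, not executed here):
#   SELECT ?person ?mailbox
#   WHERE { ?person <http://example.org/terms/email> ?mailbox . }

EMAIL = "http://example.org/terms/email"  # hypothetical predicate URI
NAME = "http://example.org/terms/name"    # hypothetical predicate URI

triples = [
    ("http://example.org/staff/alice", EMAIL, "mailto:alice@example.org"),
    ("http://example.org/staff/alice", NAME, "Alice"),
    ("http://example.org/staff/bob", EMAIL, "mailto:bob@example.org"),
]


def match(pattern, data):
    """Yield one variable binding per triple that matches a single triple pattern.

    Terms starting with '?' are variables; every other term must match exactly.
    """
    for triple in data:
        binding = {}
        for term, value in zip(pattern, triple):
            if term.startswith("?"):
                # A variable used twice must take the same value both times.
                if binding.setdefault(term, value) != value:
                    break
            elif term != value:
                break
        else:
            yield binding


results = list(match(("?person", EMAIL, "?mailbox"), triples))
for row in results:
    print(row["?person"], row["?mailbox"])
```

The query states what is wanted (every person together with a mailbox) and leaves the question of how to find the matches to the engine, which is the essence of a declarative language.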
Ontology Layer
In contrast to a single-schema database, the Semantic Web, being a global database, may contain inconsistent, incomplete or even contradictory information. A framework is therefore needed to define a particular domain of discourse: the Ontology layer. This is made possible by extending RDF with a schema language, RDF Schema (RDFS).
An ontology provides an explicit specification of the concepts in an application domain, written in an ontology language such as the Web Ontology Language (OWL). OWL and RDFS provide constructs that help to define domain concepts (classes, individuals and their properties) in a hierarchical order, which improves the understanding of data by machines [5]. Moreover, OWL allows automated reasoning and inference over domain knowledge, deducing additional knowledge from existing facts.
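As a rough illustration of deducing additional knowledge from existing facts, the following Python sketch applies two RDFS-style rules to a toy class hierarchy: subclass relations are transitive, and an individual of a class is also an individual of every superclass. The ‘ex:’ names are hypothetical, and a real reasoner supports far more than these two rules.

```python
TYPE = "rdf:type"
SUBCLASS = "rdfs:subClassOf"

# Hypothetical toy ontology in an "ex:" namespace.
facts = {
    ("ex:Surgeon", SUBCLASS, "ex:Doctor"),
    ("ex:Doctor", SUBCLASS, "ex:Person"),
    ("ex:amina", TYPE, "ex:Surgeon"),
}


def infer(triples):
    """Apply the two rules repeatedly until no new triple appears."""
    known = set(triples)
    while True:
        new = set()
        for s, p, o in known:
            for s2, p2, o2 in known:
                # Only pair a statement with a subclass axiom about its object.
                if p2 != SUBCLASS or o != s2:
                    continue
                if p == SUBCLASS:
                    new.add((s, SUBCLASS, o2))  # subclass transitivity
                elif p == TYPE:
                    new.add((s, TYPE, o2))      # membership in the superclass
        if new <= known:
            return known
        known |= new


closure = infer(facts)
for triple in sorted(closure - facts):
    print("inferred:", triple)
```

Nobody stated that ex:amina is a person, yet the machine can conclude it from the hierarchy, which is exactly the kind of added understanding the Ontology layer is meant to provide.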
RIF and CRYPTO Layers
Alternatively, inference can be achieved via logic rules that transform data or enrich it with additional specifications. This is the role of the Rule Interchange Format (RIF) layer, which specifies the format for encoding and exchanging logical rules. A unifying logic, provided by the Unifying Logic layer, is necessary to ensure reliable data interchange between the three layers: Queries, Ontologies and Rules. The Crypto layer provides cryptographic techniques to verify the origin of data for reliable inputs, while the Proof and Trust layers ensure that all semantics and rules are adhered to and effectively executed before the results are passed to the User Interface & Applications layer.
As the Semantic Web is still evolving, the layers of this Semantic Web ‘cake’ are also expected to evolve and be continuously standardized. In conclusion, the continuing evolution of the Web is an active research interest, and since Tim Berners-Lee proposed the Semantic Web in 2001 there remain unanswered research questions about how much has been done and how much of the idea of the Semantic Web can be realized. Nevertheless, the use of ontologies as a model for knowledge representation has proven to be a success, especially in the medical sciences, and is becoming more and more widely accepted.
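The rule-based style of inference mentioned at the start of this section can be sketched as a single if-then rule applied to a set of triples. This is plain Python, not RIF syntax, and the rule, the data and the helper names unify and apply_rule are all assumptions made for illustration.

```python
def unify(pattern, triple, binding):
    """Extend a binding so that the pattern equals the triple, or return None."""
    result = dict(binding)
    for term, value in zip(pattern, triple):
        if term.startswith("?"):
            if result.setdefault(term, value) != value:
                return None
        elif term != value:
            return None
    return result


def apply_rule(premises, conclusion, facts):
    """Return the triples derived by one pass of a single if-then rule."""
    bindings = [{}]
    for pattern in premises:
        extended = []
        for binding in bindings:
            for fact in facts:
                candidate = unify(pattern, fact, binding)
                if candidate is not None:
                    extended.append(candidate)
        bindings = extended
    # Fill the variables of the conclusion from each complete binding.
    return {tuple(b.get(term, term) for term in conclusion) for b in bindings}


# Hypothetical data and rule: a parent of a parent is a grandparent.
facts = {
    ("ex:ann", "ex:parentOf", "ex:ben"),
    ("ex:ben", "ex:parentOf", "ex:cat"),
}
premises = [("?x", "ex:parentOf", "?y"), ("?y", "ex:parentOf", "?z")]
conclusion = ("?x", "ex:grandparentOf", "?z")

derived = apply_rule(premises, conclusion, facts)
print(derived)
```

Keeping the rule as data (a list of premises and a conclusion) rather than as code is what makes it possible to exchange rules between systems, which is the purpose of an interchange format.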
References:
[1] W3C, “Tim Berners-Lee,” World Wide Web Consortium, 2013. [Online]. Available: http://www.w3.org/People/Berners-Lee/. [Accessed: 02-Jul-2013].
[3] R. N. Sahay, “An Ontological Framework for Interoperability of Health Level Seven (HL7) Applications: the PPEPR Methodology and System,” National University of Ireland, Galway, 2012.
[4] F. Manola and E. Miller, “RDF Primer W3C Recommendation 10 February 2004,” W3C, 2004. [Online]. Available: http://www.w3.org/TR/rdf-primer/. [Accessed: 02-Jul-2013].
[5] T. Berners-Lee, J. Hendler, and O. Lassila, “The Semantic Web,” Scientific American, May 2001.