Wednesday, July 11, 2007

Week 6 : Semantic Web

Semantic Web: A Brief Summary

Introduction:

The Web was designed as an information space, with the goal that it should be useful not only for human-to-human communication but also as a place where machines could participate and help. One of the major obstacles to this has been that most information on the Web is designed for human consumption: even when it is derived from a database with well-defined meanings (in at least some terms) for its columns, the structure of the data is not evident to a robot browsing the Web. Leaving aside the artificial intelligence problem of training machines to behave like people, the Semantic Web approach instead develops languages for expressing information in a machine-processable form. The Semantic Web is not a separate Web but an extension of the current one, in which information is given well-defined meaning, better enabling computers and people to work in cooperation. The first steps in weaving the Semantic Web into the structure of the existing Web are already under way. In the near future, these developments are expected to usher in significant new functionality as machines become much better able to process and "understand" the data that they merely display at present.

Origin and Development:

The Semantic Web originates from the premise that the Web is incomplete, a premise posed by its inventor, Tim Berners-Lee. Even his first designs for the Web contained ideas that did not make it into the version of the Web we currently have, which can be called "Web 1.0". In 1999, together with other people interested in creating a new Web, Berners-Lee began a new attempt to realize a more complete picture of his initial Web dream. This attempt was called the Semantic Web, and it has created a new research community organized around the Semantic Web Interest Group at the World Wide Web Consortium. The term "Semantic Web", coined by Tim Berners-Lee, denotes the next evolutionary step of the Web. Associating meaning with content, or establishing a layer of machine-understandable data, would allow automated agents, sophisticated search engines and interoperable services, and would enable a higher degree of automation and more intelligent applications. The ultimate goal of the Semantic Web is to allow machines to share and exploit knowledge in the Web way, i.e. without central authority, with few basic rules, and in a scalable, adaptable, extensible manner.

With RDF as the basic platform for the Semantic Web, a multitude of tools, methods and systems have just appeared on the horizon. As a brief introduction, RDF is defined as a general framework for describing a Web site's metadata, that is, the information about the information on the site. It provides interoperability between applications that exchange machine-understandable information on the Web. RDF can describe information such as a site's sitemap, the dates when updates were made, keywords that search engines look for, and the Web page's intellectual property rights.

In simple terms, the Semantic Web can be defined as a Web that includes documents, or portions of documents, that describe explicit relationships between things and contain semantic information intended for automated processing by machines. Two important technologies for developing the Semantic Web are already in place: eXtensible Markup Language (XML) and the Resource Description Framework (RDF).

Semantic Web Architecture and Applications:

Semantic Web architecture and applications are the next generation in information architecture. The earlier ideas and principles for completing the Web are being put into practice under the guidance of the World Wide Web Consortium. To reduce the amount of standardization required and to increase reuse, the Semantic Web technologies have been arranged into a layer cake, as shown in the Figure below. The two base layers are inherited from the previous Web. The remaining layers build the Semantic Web itself, and the top one adds trust to complete a Semantic Web of trust. The layers are arranged in increasing order of complexity from bottom to top, and the functionality of higher layers depends on the lower ones. This design approach facilitates scalability and encourages using the simplest tool that fits the purpose at hand. All the layers are described in the sections that follow.

The current architecture for the Semantic Web is mainly split into three layers, from lowest to highest:

  1. Resource Description Framework (RDF): lets you assert facts
    e.g. person X is named "Drew".
  2. RDF Schema: lets you describe vocabularies and use them to describe things
    e.g. person X is a LivingPerson.
  3. Web Ontology Language (OWL): lets you describe relationships between vocabularies
    e.g. persons in schema A are the same thing as users in schema B.
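The three layers above can be sketched in a few lines of Python. This is only an illustration and not how any particular RDF toolkit works: the triples are plain tuples, and the prefixed names (`ex:personX`, `a:Person`, `b:User`) are made up for the example.

```python
# Triples are (subject, predicate, object). All names are hypothetical.
RDF_TYPE = "rdf:type"
OWL_EQUIVALENT_CLASS = "owl:equivalentClass"

triples = {
    # Layer 1 (RDF): assert a fact about person X.
    ("ex:personX", "ex:name", "Drew"),
    # Layer 2 (RDF Schema): classify person X with a vocabulary term.
    ("ex:personX", RDF_TYPE, "a:Person"),
    # Layer 3 (OWL): relate a term in schema A to a term in schema B.
    ("a:Person", OWL_EQUIVALENT_CLASS, "b:User"),
}


def instances_of(cls, triples):
    """Return subjects typed as cls or as a class declared equivalent to it.

    Only direct equivalences are followed; chains of equivalent classes
    are not resolved in this sketch.
    """
    equivalent = {cls}
    for s, p, o in triples:
        if p == OWL_EQUIVALENT_CLASS:
            if s == cls:
                equivalent.add(o)
            elif o == cls:
                equivalent.add(s)
    return {s for s, p, o in triples if p == RDF_TYPE and o in equivalent}


# Person X was only ever described with schema A, yet a query phrased in
# schema B's vocabulary still finds it.
print(instances_of("b:User", triples))
```

The point of the third layer shows up in the last line: the data never mentions `b:User` as a type, but the declared equivalence lets a consumer that only knows schema B still find person X.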

The Semantic Web has been developing a layered architecture, which is often represented using a diagram first proposed by Tim Berners-Lee, with many variations since.

Although the diagram is necessarily a simplification and has to be used with some caution, it gives a reasonable conceptualisation of the various components of the Semantic Web. The layers are described briefly below.

Unicode and URI: Unicode, the standard for computer character representation, and URIs, the standard for identifying and locating resources (such as pages on the Web), provide a baseline for representing characters used in most of the languages in the world, and for identifying resources.
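As a small illustration of this base layer, the sketch below uses only the Python standard library to split a URI into its parts and to measure a multilingual string in characters and in UTF-8 bytes. The URI and the sample text are made up for the example.

```python
from urllib.parse import urlsplit


def describe_uri(uri):
    """Split a URI into the components used to identify a resource."""
    parts = urlsplit(uri)
    return {
        "scheme": parts.scheme,
        "authority": parts.netloc,
        "path": parts.path,
        "fragment": parts.fragment,
    }


def utf8_size(text):
    """Return (number of Unicode code points, number of UTF-8 bytes)."""
    return len(text), len(text.encode("utf-8"))


# Hypothetical identifier for a person.
print(describe_uri("http://example.org/people/drew#me"))

# Latin, accented Latin and CJK characters in one string, written with
# escapes so the example does not depend on the file's own encoding.
sample = "\u017dofie \u6771\u4eac"
print(utf8_size(sample))
```

The same string has fewer code points than bytes because UTF-8 uses more than one byte for characters outside ASCII; Unicode is what lets both scripts live in one value.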

XML: XML and its related standards, such as Namespaces and Schemas, form a common means for structuring data on the Web, but they do not communicate the meaning of the data. These standards are already well established within the Web.
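To make the "structure without meaning" point concrete, here is a minimal sketch using Python's standard `xml.etree.ElementTree`. The document and the namespace URI `http://example.org/schema-a#` are invented for the example.

```python
import xml.etree.ElementTree as ET

# A tiny namespaced document; the namespace URI is hypothetical.
doc = """<p:person xmlns:p="http://example.org/schema-a#">
  <p:name>Drew</p:name>
</p:person>"""

root = ET.fromstring(doc)
namespaces = {"p": "http://example.org/schema-a#"}

# The parser recovers the structure: an element in a namespace that
# contains a child element with some text.
name = root.find("p:name", namespaces).text
print(root.tag)
print(name)
```

The namespace keeps `name` in schema A from colliding with a `name` element defined elsewhere, but nothing in the XML itself tells a program that this element is a person's name rather than, say, a product name. That gap is what the layers above XML try to fill.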

Resource Description Framework: RDF is the first layer of the Semantic Web proper. RDF is a simple metadata representation framework, using URIs to identify Web-based resources and a graph model for describing relationships between resources. Several syntactic representations are available, including a standard XML format.
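The graph model can be sketched as a list of (subject, predicate, object) triples whose nodes are URIs or literal values. The code below is an illustrative toy, not a real RDF library: the `http://example.org/` names are assumptions, and the serializer writes N-Triples-style lines while escaping only backslashes and double quotes, which is less than the full format requires.

```python
from typing import NamedTuple


class Literal(NamedTuple):
    """A plain string value, as opposed to a URI-identified resource."""
    value: str


EX = "http://example.org/"  # hypothetical namespace

graph = [
    (EX + "drew", EX + "name", Literal("Drew")),
    (EX + "drew", EX + "knows", EX + "sam"),
    (EX + "sam", EX + "name", Literal("Sam")),
]


def objects_of(subject, predicate, triples):
    """Follow one kind of edge out of a node in the graph."""
    return [o for s, p, o in triples if s == subject and p == predicate]


def to_ntriples(triples):
    """Serialize triples as N-Triples-style lines (simplified escaping)."""
    lines = []
    for s, p, o in triples:
        if isinstance(o, Literal):
            escaped = o.value.replace("\\", "\\\\").replace('"', '\\"')
            obj = f'"{escaped}"'
        else:
            obj = f"<{o}>"
        lines.append(f"<{s}> <{p}> {obj} .")
    return "\n".join(lines)


print(objects_of(EX + "drew", EX + "knows", graph))
print(to_ntriples(graph))
```

The line-per-triple output is one of the several syntactic representations mentioned above; the same graph could equally be written in the standard XML format.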

RDF Schema: a simple type modelling language for describing classes of resources and properties between them in the basic RDF model. It provides a simple reasoning framework for inferring types of resources.
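The kind of type inference RDF Schema allows can be sketched as follows. This is a simplified illustration that handles only `rdfs:subClassOf`; the class names are hypothetical, and a real RDFS reasoner applies more rules than this one.

```python
SUBCLASS_OF = "rdfs:subClassOf"
RDF_TYPE = "rdf:type"

# Hypothetical vocabulary and data.
triples = {
    ("ex:LivingPerson", SUBCLASS_OF, "ex:Person"),
    ("ex:Person", SUBCLASS_OF, "ex:Agent"),
    ("ex:personX", RDF_TYPE, "ex:LivingPerson"),
}


def infer_types(triples):
    """Add the rdf:type triples implied by rdfs:subClassOf.

    The loop repeats until no new triple appears, so membership is
    propagated up a chain of subclasses of any length.
    """
    inferred = set(triples)
    changed = True
    while changed:
        changed = False
        subclass_pairs = [(s, o) for s, p, o in inferred if p == SUBCLASS_OF]
        for s, p, o in list(inferred):
            if p != RDF_TYPE:
                continue
            for sub, sup in subclass_pairs:
                if o == sub and (s, RDF_TYPE, sup) not in inferred:
                    inferred.add((s, RDF_TYPE, sup))
                    changed = True
    return inferred


closure = infer_types(triples)
print(sorted(o for s, p, o in closure if s == "ex:personX" and p == RDF_TYPE))
```

Person X was only declared to be a `LivingPerson`, but after inference it is also known to be a `Person` and an `Agent`.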

Ontologies: a richer language for providing more complex constraints on the types of resources and their properties.

Logic and Proof: an (automatic) reasoning system provided on top of the ontology structure to make new inferences. Thus, using such a system, a software agent can make deductions as to whether a particular resource satisfies its requirements (and vice versa).
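A very small version of such a reasoning step is forward chaining: apply "if these patterns hold, then this also holds" rules until nothing new can be derived, and then let the agent check its requirements against the result. The sketch below assumes a made-up vocabulary (`ex:locatedIn`, `ex:partOf`, `ex:ColorPrinter`) and a single hand-written rule; it is not meant to represent any standard rule language.

```python
def match(pattern, fact, bindings):
    """Extend bindings so that pattern equals fact, or return None."""
    bindings = dict(bindings)
    for term, value in zip(pattern, fact):
        if term.startswith("?"):  # terms starting with "?" are variables
            if bindings.setdefault(term, value) != value:
                return None
        elif term != value:
            return None
    return bindings


def forward_chain(facts, rules):
    """Apply rules repeatedly until no new fact can be derived."""
    facts = set(facts)
    while True:
        new = set()
        for premises, conclusion in rules:
            candidates = [{}]
            for premise in premises:
                extended = []
                for bindings in candidates:
                    for fact in facts:
                        result = match(premise, fact, bindings)
                        if result is not None:
                            extended.append(result)
                candidates = extended
            for bindings in candidates:
                derived = tuple(bindings.get(t, t) for t in conclusion)
                if derived not in facts:
                    new.add(derived)
        if not new:
            return facts
        facts |= new


def satisfies(resource, requirements, facts):
    """Check that every required (predicate, object) pair holds."""
    return all((resource, p, o) in facts for p, o in requirements)


facts = {
    ("ex:printer1", "rdf:type", "ex:ColorPrinter"),
    ("ex:printer1", "ex:locatedIn", "ex:room42"),
    ("ex:room42", "ex:partOf", "ex:buildingA"),
}
# Rule: something located in a place is also located in whatever
# that place is part of.
rules = [
    ([("?x", "ex:locatedIn", "?y"), ("?y", "ex:partOf", "?z")],
     ("?x", "ex:locatedIn", "?z")),
]
requirements = [("rdf:type", "ex:ColorPrinter"),
                ("ex:locatedIn", "ex:buildingA")]

closure = forward_chain(facts, rules)
print(satisfies("ex:printer1", requirements, closure))
```

Nobody stated that the printer is in building A; the agent can conclude that it meets the requirement only because the rule derived the missing fact.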

Trust: The final layer of the stack addresses the issues of trust that the Semantic Web can support. This component has not progressed far beyond a vision of allowing people to ask questions about the trustworthiness of the information on the Web, in order to provide an assurance of its quality.

Future of Semantic Web:

The Semantic Web has great potential. However, it has been a long time in development and requires an investment of time, expertise and resources. Nevertheless, the time does seem right to start thinking about how best to use the simpler applications of the technology. Although Semantic Web applications are very new, I believe we are at the beginning of the next generation of the Internet, and you'll see some interesting services popping up in the near future. Companies are investing heavily in Semantic Web technologies. Adobe, for example, is reorganizing its software metadata around RDF and is using Web ontology-level power for managing documents. Because of this change, "the information in PDF files can be understood by other software even if the software doesn't know what a PDF document is or how to display it." With its recent creation of the Institute of Search and Text Analysis in California, IBM is making significant investments in Semantic Web research. Other companies, such as Germany's Ontoprise, are making a business out of ontologies, creating tools for knowledge modeling, knowledge retrieval, and knowledge integration. The building blocks are here, technologies and programs that support the Semantic Web are being developed, and companies are investing more money in bringing their organizations to the level where they can use these technologies for competitive and monetary advantage.
