Why Technical English

The Semantic Web – great expectations

October 31, 2011

By Galina Vitkova

The Semantic Web is a further development of the World Wide Web that aims to make the content of web pages interpretable as machine-readable information.

In the classical Web, based on HTML pages, information is contained in text or documents, which a browser reads and composes into web pages that humans can see or hear. The Semantic Web is intended to store information as a semantic network through the use of ontologies. A semantic network is usually a directed or undirected graph consisting of vertices, which represent concepts, and edges, which represent relations among the concepts. An ontology is simply a vocabulary that describes objects and how they relate to one another. A software agent can therefore extract facts directly from the Semantic Web and draw logical conclusions from them. The Semantic Web works alongside the existing Web and uses the HTTP protocol and Uniform Resource Identifiers (URIs).

The term Semantic Web was coined by Sir Tim Berners-Lee, the inventor of the World Wide Web and director of the World Wide Web Consortium (W3C), in May 2001 in the journal Scientific American. Tim Berners-Lee considers the Semantic Web the next step in the development of the World Wide Web. The W3C has adopted and promoted this concept.

Main idea

The Semantic Web is simply a hyper-structure built on top of the existing Web. It extends the network of hyperlinked human-readable web pages by inserting machine-readable metadata about pages and how they are related to each other. Its purpose is to help computers “read” and use the Web in a more sophisticated way. Metadata can allow more complex, focused Web searches with more accurate results. To paraphrase Tim Berners-Lee, the extension will let the Web, currently similar to a giant book, become a giant database. Machine processing of information in the Semantic Web is enabled by its two most important features.

  • First: the universal use of Uniform Resource Identifiers (URIs), commonly known as addresses. Traditionally on the Internet these identifiers serve as hyperlink targets that point to an addressed object (a web page, an e-mail address, etc.). In the Semantic Web, URIs are also used to name resources, i.e. a URI identifies exactly one object. Moreover, in the Semantic Web not only web pages or their parts have URIs; objects of the real world (e.g. people, towns, novel titles) may have URIs too. Furthermore, abstract resource attributes (e.g. name, position, colour) have their own URIs. Because URIs are globally unique, they make it possible to identify the same object in different places on the Web. At the same time, HTTP URIs (i.e. addresses beginning with http://) can be used as addresses of documents that contain a machine-readable description of these objects.

  • Second: the use of semantic networks and ontologies. Present-day methods of automatic information processing on the Internet are as a rule based on frequency and lexical analysis or parsing of text that is intended for human perception. The Semantic Web instead applies the RDF (Resource Description Framework) standard, which represents information as semantic networks (i.e. graphs whose vertices and edges have URIs). Statements encoded in RDF can be further interpreted by ontologies created in compliance with the RDF Schema and OWL (Web Ontology Language) standards in order to draw logical conclusions. Ontologies are built using so-called description logics. Ontologies and schemata help a computer to understand human vocabulary.
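
The two features above can be illustrated with a minimal sketch in Python. It is only an illustration under stated assumptions: the namespace http://example.org/ and all names in it (Prague, Capital, Town, Place) are hypothetical, while the rdf:type and rdfs:subClassOf URIs are the real ones. Statements are stored as (subject, property, object) tuples of URIs, and the function applies a single RDFS-style rule, not a full reasoner.

```python
# Real vocabulary URIs from RDF and RDF Schema.
RDF_TYPE = "http://www.w3.org/1999/02/22-rdf-syntax-ns#type"
RDFS_SUBCLASS_OF = "http://www.w3.org/2000/01/rdf-schema#subClassOf"

# Hypothetical namespace used only for this illustration.
EX = "http://example.org/"

# A tiny semantic network: every vertex and edge is named by a URI.
triples = {
    (EX + "Prague", RDF_TYPE, EX + "Capital"),
    (EX + "Capital", RDFS_SUBCLASS_OF, EX + "Town"),
    (EX + "Town", RDFS_SUBCLASS_OF, EX + "Place"),
}


def infer_types(statements):
    """Add (x, type, D) whenever (x, type, C) and (C, subClassOf, D) hold."""
    inferred = set(statements)
    changed = True
    while changed:  # repeat until no new statement can be derived
        changed = False
        for s, p, o in list(inferred):
            if p != RDF_TYPE:
                continue
            for c, q, d in list(inferred):
                if q == RDFS_SUBCLASS_OF and c == o:
                    new = (s, RDF_TYPE, d)
                    if new not in inferred:
                        inferred.add(new)
                        changed = True
    return inferred


closure = infer_types(triples)
# The statement "Prague is a Place" was never written down; it is derived.
print((EX + "Prague", RDF_TYPE, EX + "Place") in closure)
```

The program derives that Prague is a Town and a Place although only "Prague is a Capital" was stated, which is the kind of logical conclusion an ontology lets an agent draw.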

 

Semantic Web Technologies

The architecture of the Semantic Web can be represented by the Semantic Web Stack, also known as the Semantic Web Cake or Semantic Web Layer Cake. The Semantic Web Stack illustrates a hierarchy of languages in which each layer uses the capabilities of the layers below. It shows how the technologies standardized for the Semantic Web are organized to make the Semantic Web possible. It also shows that the Semantic Web is an extension (not a replacement) of the classical hypertext Web. The illustration was created by Tim Berners-Lee. The stack is still evolving as the layers are specified in more detail.

Semantic Web Stack

As shown in the Semantic Web Stack, the following languages and technologies are used to create the Semantic Web. The technologies from the bottom of the stack up to OWL (Web Ontology Language) are currently standardized and accepted for building Semantic Web applications. It is still not clear how the top of the stack is going to be implemented. All layers of the stack need to be implemented to achieve the full vision of the Semantic Web.

  • XML (eXtensible Markup Language) is a set of rules for encoding documents in machine-readable form. It is a markup language like HTML. XML complements (but does not replace) HTML by adding tags that describe data.
  • XML Schema, published as a W3C recommendation in May 2001, is one of several XML schema languages. It can be used to express a set of rules to which an XML document must conform in order to be considered ‘valid’.
  • RDF (Resource Description Framework) is a family of W3C specifications originally designed as a metadata data model. It has come to be used as a general method for the conceptual description of information held in web resources. RDF does exactly what its name indicates: using XML tags, it provides a framework to describe resources. In RDF terms, everything in the world is a resource. The framework pairs each resource with a specific location on the Web, so the computer knows exactly what the resource is. To do this, RDF uses triples, written as XML tags, to express the information as a graph. Each triple consists of a subject, a property and an object, which are like the subject, verb and direct object of an English sentence.
  • RDFS (RDF Schema, the RDF Vocabulary Description Language) provides a basic vocabulary for RDF. It adds classes, subclasses and properties to resources, creating a basic language framework.
  • OWL (Web Ontology Language) is a family of knowledge representation languages for creating ontologies. It extends RDFS and is the most complex layer: it formalizes ontologies, describes relationships between classes and uses logic to make deductions.
  • SPARQL (SPARQL Protocol and RDF Query Language) is an RDF query language that can be used to query any RDF-based data. It makes it possible to retrieve information for Semantic Web applications.
  • Microdata (HTML) is an HTML specification used to nest semantics within existing content on web pages. Search engines, web crawlers, and browsers can extract and process Microdata from a web page to provide better search results.
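
The idea of triples and of querying them can be sketched in a few lines of Python. This is a simplified illustration, not SPARQL itself: the function name match, the namespace http://example.org/ and the sample statements are all hypothetical. The function returns every triple that fits a pattern, with None standing for an unknown part, much as a variable does in a query.

```python
def match(statements, subject=None, prop=None, obj=None):
    """Return the triples that fit the pattern; None acts as a wildcard."""
    return [
        (s, p, o)
        for (s, p, o) in statements
        if (subject is None or s == subject)
        and (prop is None or p == prop)
        and (obj is None or o == obj)
    ]


# Hypothetical namespace and data, used only for this illustration.
EX = "http://example.org/"
statements = [
    (EX + "Hamlet", EX + "author", EX + "Shakespeare"),
    (EX + "Macbeth", EX + "author", EX + "Shakespeare"),
    (EX + "Shakespeare", EX + "name", "William Shakespeare"),
]

# "Which resources have Shakespeare as their author?"
works = [
    s for (s, _, _) in match(statements, prop=EX + "author", obj=EX + "Shakespeare")
]
print(works)
```

Each tuple reads like a short sentence (subject, property, object), and the query leaves the subject open so that the program fills it in from the stored statements.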

As mentioned, the top layers contain technologies that are not yet standardized or that exist only as ideas. The Cryptography and Trust layers are perhaps the least familiar of them. Cryptography verifies that web statements originate from a trusted source, by means of digital signatures on RDF statements. Trust in derived statements means that the premises come from a trusted source and that the formal logic used to derive new information is reliable.

World Wide Web

September 3, 2011

Dear friends of Technical English,

I have just started publishing materials for my planned e-book devoted to Internet English, i.e. English around the Internet. This means that for a certain period of time I will publish posts that will form the basic technical texts of the units of this e-book, which has the working title Internet English. The draft contents of the e-book have already been published on my blog http://traintechenglish.wordpress.com in newsletter Number 33 – WWW, Part 1 / August 2011. One topic in the list corresponds to one unit in the e-book.

Below you will find the first post of a series dealing with Internet English. I hope these texts will help you develop your professional English and at the same time bring you topical information about the Internet.    Galina Vitkova

 

World Wide Web

 Composed by Galina Vitkova

The World Wide Web (WWW or simply the Web) is a system of interlinked hypertext documents that runs over the Internet. A Web browser enables a user to view Web pages that may contain text, images, and other multimedia. Moreover, the browser provides navigation between the pages by means of hyperlinks. The Web was created around 1990 by the Englishman Tim Berners-Lee and the Belgian Robert Cailliau, working at CERN in Geneva, Switzerland.

Today, the Web and the Internet allow connecti...

The term Web is often mistakenly used as a synonym for the Internet itself, but the Web is a service that operates over the Internet, as e-mail, for example, does. The history of the Internet dates back significantly further than that of the Web.

Basic terms

The World Wide Web is the combination of four basic ideas:

  • Hypertext: a format of information that, in a computer environment, allows one to move from one part of a document to another, or from one document to another, through internal connections (called hyperlinks) among these documents;
  • Resource Identifiers: unique identifiers used to locate a particular resource (computer file, document or other resource) on the network – this is commonly known as a URL (Uniform Resource Locator) or URI (Uniform Resource Identifier), although the two have subtle technical differences;
  • The Client-server model of computing: a system in which client software or a client computer makes requests of server software or a server computer that provides the client with resources or services, such as data or files;
  • Markup language: characters or codes embedded in a text, which indicate structure, semantic meaning, or advice on presentation.

 

How the Web works

Viewing a Web page or other resource on the World Wide Web normally begins either by typing the URL of the page into a Web browser, or by following a hypertext link to that page or resource. The act of following hyperlinks from one Web site to another is referred to as browsing or sometimes as surfing the Web. The first step is to resolve the server-name part of the URL into an Internet Protocol address (IP address) using the global, distributed Internet database known as the Domain Name System (DNS). The browser then establishes a Transmission Control Protocol (TCP) connection with the server at that IP address.
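
The first step described above can be sketched in Python using only the standard library. This is a rough illustration, not how any particular browser works: the helper names server_name and resolve and the address www.example.org are assumptions made for the example. The lookup itself needs network access, so the sketch defines it but does not call it.

```python
import socket
from urllib.parse import urlsplit


def server_name(url):
    """Return the server-name (host) part of a URL."""
    return urlsplit(url).hostname


def resolve(host, port=80):
    """Ask the system resolver (which uses DNS) for the host's IP addresses."""
    infos = socket.getaddrinfo(host, port, proto=socket.IPPROTO_TCP)
    # Each entry ends with a socket address whose first item is the IP address.
    return sorted({info[4][0] for info in infos})


host = server_name("http://www.example.org/index.html")
print(host)
# resolve(host) would return the IP addresses, but it requires a network
# connection, so it is not called in this sketch.
```

Once an IP address is known, the browser can open a TCP connection to it, which is the step that follows in the text.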

TCP state diagram

The next step is to send a HyperText Transfer Protocol (HTTP) request to the Web server in order to request the resource. In the case of a typical Web page, the HyperText Markup Language (HTML) text is requested first and parsed by the browser (parsing means syntactic analysis), which then makes additional requests, in quick succession, for graphics and any other files that form part of the page. After that the Web browser renders the page (see the notes below this paragraph) as described by the HTML, the Cascading Style Sheets (CSS) and the other files received, incorporating the images and other resources as necessary. This produces the on-screen page that the viewer sees.
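
The request and parsing steps above can be sketched as follows. This is a simplified illustration built on assumptions: the function build_get_request, the class ResourceCollector, the host www.example.org and the sample page are all hypothetical, and a real browser sends many more headers and handles far more cases.

```python
from html.parser import HTMLParser


def build_get_request(host, path="/"):
    """Compose the text of a minimal HTTP/1.1 GET request."""
    return (
        f"GET {path} HTTP/1.1\r\n"
        f"Host: {host}\r\n"
        "Connection: close\r\n"
        "\r\n"
    )


class ResourceCollector(HTMLParser):
    """Collect URLs of images, scripts and style sheets named in the HTML."""

    def __init__(self):
        super().__init__()
        self.resources = []

    def handle_starttag(self, tag, attrs):
        attrs = dict(attrs)
        if tag in ("img", "script") and attrs.get("src"):
            self.resources.append(attrs["src"])
        elif tag == "link" and attrs.get("rel") == "stylesheet" and attrs.get("href"):
            self.resources.append(attrs["href"])


# Hypothetical page used only for illustration.
page = """<html><head><link rel="stylesheet" href="style.css"></head>
<body><img src="logo.png"><script src="app.js"></script></body></html>"""

collector = ResourceCollector()
collector.feed(page)
print(build_get_request("www.example.org", "/index.html"))
print(collector.resources)
```

The list collected by the parser corresponds to the additional requests the browser makes before it can render the complete page.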

Notes:

  • Rendering is the process of generating an image from a model by means of computer programs.
  • Cascading Style Sheets (CSS) is a style sheet language used to describe the look and formatting of a document written in a markup language.

 

Web standards

At its core, the Web is made up of three standards:

  • the Uniform Resource Identifier (URI), which is a string of characters used to identify a name or a resource on the Internet;
  • the HyperText Transfer Protocol (HTTP), which is a networking protocol for distributed, collaborative, hypermedia information systems and the foundation of data communication on the Web;
  • the HyperText Markup Language (HTML), which is the predominant markup language for web pages. A markup language is a system for annotating a text in a way that is syntactically distinguishable from that text.

 

