Composed by G. Vitkova
The following text about XML is a refresher on the basics of this tool for integration over the Internet. Our aim is to prepare a platform for discussing the further improvements in user comfort that were introduced and implemented in recent versions of Windows on the basis of XML. Enjoy the text and discuss. Galina Vitkova
XML (Extensible Markup Language) is a set of rules for encoding documents electronically. XML design goals emphasize simplicity, generality, and usability over the Internet. It is derived from SGML (Standard Generalized Markup Language, ISO 8879).
By the mid-1990s some practitioners of SGML had gained experience with the then-new World Wide Web, and believed that SGML offered solutions to some of the problems the Web was likely to face as it grew. So an XML Working Group of eleven members, supported by an Interest Group of approximately 150 members, was established. Technical debates took place on the Interest Group mailing list, and issues were resolved by consensus or, when that failed, by majority vote of the Working Group.
The members of the XML Working Group never met face-to-face; the design was accomplished using a combination of emails and weekly teleconferences. The major design decisions were reached in twenty weeks of intense work between July and November 1996, when the first Working Draft of an XML specification was published. Further design work continued through 1997, and XML 1.0 became a W3C Recommendation on February 10, 1998.
Most of XML comes from SGML unchanged. For example, the separation of logical and physical structures (elements and entities), the availability of grammar-based validation (DTDs, Document Type Definitions), the separation of data and metadata (elements and attributes), mixed content, the separation of processing from representation (processing instructions), and the default angle-bracket syntax all come from SGML. Unlike SGML, XML has a fixed delimiter set and adopts Unicode as the document character set.
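To make these inherited concepts more tangible, here is a minimal sketch in Python using the standard library module xml.dom.minidom. The sample document, the names SAMPLE and summarize, and the render processing instruction are invented for illustration only; they do not come from the XML specification.

```python
from xml.dom import minidom

# Hypothetical sample: a processing instruction, one attribute (metadata),
# and mixed content (text interleaved with a child element).
SAMPLE = (
    '<?xml version="1.0"?>'
    '<?render style="plain"?>'
    '<note lang="en">Remember <b>XML</b> basics.</note>'
)


def summarize(xml_text):
    """Return the root name, attributes, processing instructions and
    the sequence of mixed-content children of the root element."""
    doc = minidom.parseString(xml_text)
    root = doc.documentElement

    # Processing instructions sit beside the root element in the document.
    pis = [
        (node.target, node.data)
        for node in doc.childNodes
        if node.nodeType == node.PROCESSING_INSTRUCTION_NODE
    ]

    # Attributes carry metadata about the element.
    attributes = dict(root.attributes.items())

    # Mixed content: text nodes and element nodes side by side.
    mixed = [
        node.data if node.nodeType == node.TEXT_NODE else "<" + node.tagName + ">"
        for node in root.childNodes
    ]

    return {
        "root": root.tagName,
        "attributes": attributes,
        "processing_instructions": pis,
        "mixed_content": mixed,
    }


if __name__ == "__main__":
    for key, value in summarize(SAMPLE).items():
        print(key, "=", value)
```

For the sample above, the function reports the root element note, the attribute lang with the value en, one processing instruction aimed at a hypothetical render application, and mixed content in which text surrounds the child element b.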
Other sources of technology for XML were the Text Encoding Initiative (TEI), which defined a profile of SGML for use as a ‘transfer syntax’, and HTML, which contributed elements that were synchronous with their resource, the separation of the document character set from the resource encoding, and the HTTP notion that metadata accompanies the resource rather than being needed at the declaration of a link. The Extended Reference Concrete Syntax (ERCS) project of SPREAD (Standardization Project Regarding East Asian Documents) was a later source.
There are two current versions of XML. The first (XML 1.0) was initially defined in 1998. It has undergone minor revisions since then without being given a new version number. It is currently in its fifth edition, which was published on November 26, 2008. This version is widely implemented and is still recommended for general use.
The second (XML 1.1) was initially published on February 4, 2004, the same day as XML 1.0 Third Edition, and is currently in its second edition, as published on August 16, 2006. This version contains features (some contentious) that are intended to make XML easier to use in certain cases. The main changes are to enable the use of line-ending characters used on EBCDIC platforms, and the use of scripts and characters absent from Unicode 3.2. XML 1.1 is not very widely implemented and is recommended for use only by those who need its unique features.
Prior to its fifth edition, XML 1.0 differed from XML 1.1 in having stricter requirements for the characters available for use in element and attribute names and unique identifiers: in the first four editions of XML 1.0 the allowed characters were exclusively enumerated using a specific version of the Unicode standard (Unicode 2.0 to Unicode 3.2). The fifth edition substitutes the mechanism of XML 1.1, which is more future-proof but reduces redundancy. The approach taken in the fifth edition of XML 1.0 and in all editions of XML 1.1 is that only certain characters are forbidden in names and everything else is allowed, in order to accommodate suitable name characters in future versions of Unicode. In the fifth edition, XML names may contain characters in the Balinese, Cham, or Phoenician scripts, among many others that have been added to Unicode since Unicode 3.2.
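The name rule of the fifth edition can be sketched as a regular expression. The following Python snippet is only an illustration: the identifiers NAME_START, NAME_CHAR, NAME_RE and is_xml_name are my own, and the character ranges are transcribed from the NameStartChar and NameChar productions of XML 1.0 (Fifth Edition) as I understand them, so please check them against the Recommendation before relying on them.

```python
import re

# Characters allowed at the start of a name (NameStartChar), written as
# ranges inside a regular-expression character class.
NAME_START = (
    ":A-Z_a-z"
    "\u00C0-\u00D6\u00D8-\u00F6\u00F8-\u02FF"
    "\u0370-\u037D\u037F-\u1FFF\u200C-\u200D"
    "\u2070-\u218F\u2C00-\u2FEF\u3001-\uD7FF"
    "\uF900-\uFDCF\uFDF0-\uFFFD"
    "\U00010000-\U000EFFFF"
)

# Characters allowed after the first one (NameChar): everything above plus
# hyphen, full stop, digits, middle dot and two further ranges.
NAME_CHAR = NAME_START + r"\-.0-9" + "\u00B7\u0300-\u036F\u203F-\u2040"

NAME_RE = re.compile("[" + NAME_START + "][" + NAME_CHAR + "]*")


def is_xml_name(text):
    """Return True if text matches the Name production sketched above."""
    return NAME_RE.fullmatch(text) is not None


if __name__ == "__main__":
    samples = ["title", "a-b.c", "1abc", "-a", "\u1B05", "\U00010900x"]
    for sample in samples:
        print(repr(sample), is_xml_name(sample))
```

Here "\u1B05" is a Balinese letter and "\U00010900" a Phoenician letter; both fall inside the broad ranges, which is why such names are accepted without the rule ever listing those scripts explicitly. Names starting with a digit or a hyphen are rejected.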
Almost any Unicode code point can be used in the character data and attribute values of an XML 1.0 or XML 1.1 document, even if the character corresponding to the code point is not defined in the current version of Unicode. In character data and attribute values, XML 1.1 allows the use of more control characters than XML 1.0. But for “robustness” most of the control characters introduced in XML 1.1 must be expressed as numeric character references. Among the supported control characters in XML 1.1 are two line break codes that must be treated as whitespace. Whitespace characters are the only control codes that can be written directly.
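The difference between the two versions in handling control characters can be sketched as a small classification function. This Python snippet is an illustration under stated assumptions: the names char_status, XML10_CHAR, XML11_CHAR and XML11_RESTRICTED are my own, and the ranges are transcribed from the Char production of XML 1.0 and the Char and RestrictedChar productions of XML 1.1 as I understand them.

```python
# Code point ranges (inclusive) a document may contain at all.
XML10_CHAR = [(0x9, 0xA), (0xD, 0xD), (0x20, 0xD7FF),
              (0xE000, 0xFFFD), (0x10000, 0x10FFFF)]
XML11_CHAR = [(0x1, 0xD7FF), (0xE000, 0xFFFD), (0x10000, 0x10FFFF)]

# XML 1.1 only: characters that may appear solely as numeric
# character references such as &#x1;.
XML11_RESTRICTED = [(0x1, 0x8), (0xB, 0xC), (0xE, 0x1F),
                    (0x7F, 0x84), (0x86, 0x9F)]


def _in_ranges(code_point, ranges):
    """Return True if code_point lies in any of the inclusive ranges."""
    return any(low <= code_point <= high for low, high in ranges)


def char_status(code_point, version="1.0"):
    """Classify a code point as 'literal', 'reference-only' or 'forbidden'."""
    if version == "1.0":
        return "literal" if _in_ranges(code_point, XML10_CHAR) else "forbidden"
    if version == "1.1":
        if not _in_ranges(code_point, XML11_CHAR):
            return "forbidden"
        if _in_ranges(code_point, XML11_RESTRICTED):
            return "reference-only"
        return "literal"
    raise ValueError("unknown XML version: " + repr(version))


if __name__ == "__main__":
    for cp in (0x0, 0x1, 0x9, 0x85, 0x2028):
        print(hex(cp), char_status(cp, "1.0"), char_status(cp, "1.1"))
```

In this sketch U+0001 is forbidden in XML 1.0 but allowed as a numeric character reference in XML 1.1, while the whitespace controls (tab, line feed, carriage return, and the next-line character U+0085 in XML 1.1) may be written directly. U+0000 stays forbidden in both versions.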
There has been discussion of an XML 2.0, although no organization has announced plans for work on such a project. XML-SW, written by one of the original developers of XML, contains some proposals for what an XML 2.0 might look like: elimination of DTDs from the syntax, and integration of namespaces, XML Base and XML Information Set into the base standard.