XY Viewer D2.1. Plot data XML format

1. Introduction

Most data analyses produce outputs that are expected to be formatted as figures. In order to provide generic tools able to display or print those figures, we need a common data format. This is an attempt to describe figure data in a generic way as an XML document, dissociating the contents from the representation as much as possible.

1.1. Author

Javier Iglesias

1.2. Version

This document version is : $Id$

1.3. Changes

2004-11-17 : minor documentation mistakes
2003-05-05 : added section 1.1. "Author" ; subsequent sections renumbered
2002-10-23 : removed "legend", replaced by "description" and "details". Added section 1.1. "Version" and 1.2. "Changes".
2002-09-11 : original document (draft).

2. Detailed description

In order to handle this task, let's define some terms that will find their way into the DTD. As an example test case, we'll use the data generated by a weather measurement system, composed of point measurements of air pressure, temperature and humidity over some period of time.

2.1. Frame

A group of figures that are tightly linked. E.g. the figures for air pressure, temperature and humidity for the same location during the same period.

A frame always contains at least one figure.

2.1.1. Attributes

There are no attributes for frames.

2.1.2. Elements

2.1.2.1. title

An optional title of the frame.

2.1.2.2. description

An optional free text description of the frame contents.

2.1.2.3. details

An optional free text detailed and possibly lengthy description of the frame contents.

2.1.2.4. author

An optional free text containing the name of the author of the data.

2.1.2.5. date

An optional formatted date of creation of the file. The format used for the date is specified in the "format" attribute, which can take one of the two values "ANSI" or "ISO". ANSI format looks like 'Wed Sep 11 15:34:41 CEST 2002'. ISO format looks like '2002-09-11' and refers to the ISO 8601 standard.

2.1.2.6. application

An optional empty element containing two attributes, "name" and "version", in order to keep track of the application that generated the original file.

2.1.3. XML description

The "frame" is the document root element.

---------------------------------------------------------------
---------------------------------------------------------------

2.2. Figure

The information formatted as a graphic. E.g. the air temperature variations over time in three distinct locations during the same period.

A figure always contains at least one plot.
A frame always contains at least one figure.

2.2.1. Attributes

There are no attributes for figures.

2.2.2. Elements

2.2.2.1. title

An optional title of the figure.

2.2.2.2. description

An optional free text description of the figure contents.

2.2.2.3. details

An optional free text detailed and possibly lengthy description of the figure contents.

2.2.2.4. axis

A group of data meta-information that has at least one instance.

2.2.2.5. plot

A group of data values that has at least one instance.

2.2.3. XML description

---------------------------------------------------------------
---------------------------------------------------------------

2.3. Axis

An axis groups some meta-information on the plotted data. E.g. measurement unit. This information is not placed in the "Plot" description, as it will most certainly be shared by several plots of the same figure. "Plots" are therefore linked to as many "Axes" as there are dimensions in them. E.g. 3-dimensional data will be linked to 3 axes (without requiring them to be 3 different axes).

One plot is always linked to as many axes as there are dimensions.

2.3.1. Attributes

2.3.1.1. name

A required unique ID allowing the plots to refer to the axis.

2.3.1.2. scale

A description of the gradation of the data values. Can take one of the values "linear", "logarithmic" or "non_numeric", defaulting to "linear" when no specific value is provided.
The value "non_numeric" makes it possible to describe the data as a list of classes, like 'class1=23 class2=45 ...', typically for bar charts or poll results. The labels to use are placed in the optional "ticmarks" element (see below).

2.3.2. Elements

2.3.2.1. title

An optional title of the axis.

2.3.2.2. description

An optional free text description of the axis.

2.3.2.3. details

An optional free text detailed and possibly lengthy description of the axis.

2.3.2.4. unit

An optional unit for the values on the axis. E.g. meters, seconds.

2.3.2.5. range

An optional empty description of the range of the axis containing two optional attributes "min" and "max", respectively for the minimal and maximal values to display.

FIXME : does it really belong to the data description ?!

2.3.2.6. ticmarks

An optional list of labels to use instead of the numbers. This element is only really useful in conjunction with the "non_numeric" scale attribute value, to describe the data as free text. E.g. class1 is labeled as "yes", class2 as "no", etc.

2.3.3. XML description

---------------------------------------------------------------
---------------------------------------------------------------

2.4. Plot

The set of data values that are linked together. E.g. the air temperature measurements taken at one location during some period. In order to be generic, we will not restrict ourselves to two-dimensional sets of data, as will be explained later.

One figure always contains at least one plot.
One plot is always linked to as many axes as there are dimensions.

2.4.1. Attributes

There are no attributes for plots.

2.4.2. Elements

2.4.2.1. data

The data values, given as a series of tuples separated by spaces, tabs, line feeds or carriage returns.

X-Y data reads : 'x1 y1 x2 y2 x3 y3...'
X-Y-Z data reads : 'x1 y1 z1 x2 y2 z2 x3 y3 z3...'
etc.

This way, there is no limit to the number of dimensions that can be represented by this data structure.
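The flat layout described above can be regrouped into tuples once the number of dimensions is known. The following is only a minimal sketch in Python, not part of the format itself: the function name `group_tuples` and its `dims` parameter are hypothetical, and the values are kept as strings so that no assumption is made about how a reader converts them.

```python
from typing import List, Tuple


def group_tuples(text: str, dims: int) -> List[Tuple[str, ...]]:
    """Group whitespace-separated values into tuples of `dims` items.

    `dims` is the number of dimensions of the data (2 for X-Y, 3 for
    X-Y-Z, and so on). This helper is an illustration only.
    """
    if dims < 1:
        raise ValueError("dims must be at least 1")
    # str.split() without arguments splits on any run of whitespace,
    # so spaces, tabs, line feeds and carriage returns all work.
    values = text.split()
    if len(values) % dims != 0:
        raise ValueError(
            f"{len(values)} values cannot be grouped into tuples of {dims}"
        )
    return [tuple(values[i:i + dims]) for i in range(0, len(values), dims)]


if __name__ == "__main__":
    # X-Y data spread over several lines, mixing spaces and tabs.
    print(group_tuples("0.0 0.0\n1.0\t1.9\r\n2.0   6.1", 2))
```

Raising an error on an incomplete trailing tuple is a design choice of this sketch; the format description does not say how such input should be handled.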
Any run of whitespace characters (spaces, tabs, line feeds and carriage returns) is treated as one space.

To be able to read the data correctly, the "data" element has a 'tuple' attribute. E.g. X-Y data will have tuple="2", X-Y-Z data will have tuple="3", etc.

To describe the data completely, the information located in the axis elements is required. To make this link explicit, the list of unique IDs defined for each axis instance must be placed in the 'axes' attribute. E.g. a "data" element whose 'axes' attribute lists the IDs "1" and "3" and whose contents are 'x1 y1 x2 y2 ...' describes bi-dimensional data, with the x's on the axis with ID="1" and the y's on the axis with ID="3".

This structure may look a bit odd, but it opens the door to complex data representations while placing only a small burden on the format.

The "data" element can contain portions of data enclosed in .... They mark specific portions of the data to highlight in the graphic representation, without specifying the exact method.

2.4.2.2. title

An optional title of the plot.

2.4.2.3. description

An optional free text description of the plot.

2.4.2.4. details

An optional free text detailed and possibly lengthy description of the plot.

2.4.3. XML description

---------------------------------------------------------------
---------------------------------------------------------------

3. Complete DTD

---------------------------------------------------------------
---------------------------------------------------------------

4. Complete example

Here is a complete example of how the data format could be used to represent a frame of three figures.

The first figure is composed of two curves on the same graphic, built from data acquired in two separate experimental situations. The x and y axes are shared (speed against time) between the two plots.

The second figure is composed of two curves on the same graphic, where the x axis is shared (time), but each plot has its own y axis (one uses speed, the other acceleration).
This kind of figure can be drawn with time going from left to right, the first y axis on the left from bottom to top, and the second y axis on the right from bottom to top. Any other representation is possible, though.

The last figure is the result of a poll where students had to tell whether they liked taking the measurements or not.

---------------------------------------------------------------
The Uniformly Accelerated Linear Movement Some figures on the way uniform acceleration modifies the speed John Smith 2002-09-11
Speed against time The car is accelerated in one direction, speed is measured Time s Speed m/s first trial weight=1 kg, acceleration=2 m/s^2 0.0 0.0 1.0 1.9 2.0 6.1 second trial weight=2 kg, acceleration=2 m/s^2 0.0 0.0 1.0 1.8 2.0 6.0
Speed and acceleration against time The car is accelerated in one direction, speed and acceleration are measured Time s Speed m/s Acceleration m/s^2 Speed weight=1 kg, acceleration=2 m/s^2 0.0 0.0 1.0 1.9 2.0 6.1 Acceleration acceleration is supposed to be 2 m/s^2 but... 0.0 1.9 1.0 1.8 2.0 2.1
Interest in the data acquisition At the end of the work, students were asked if they liked to measure speed and accelerations Answer Loved Hate 'No comment' Percentage % Percentage of satisfaction Number of pupils who answered to that proposition over the total number of pupils 1 70.2 2 18.7 3 12.1
---------------------------------------------------------------

5. Caveats

I wonder if the "range" element of the "axis" element really belongs to the data description. It is certainly useful, but one might consider that it is already part of the representation...

It is important to estimate whether this data format has the potential to represent a large set of different graphics. This is the only reason to justify some odd notation tricks like 'tuple' and the like.

The separation between axis and plot is not completely clear. Axis is here to promote the reuse of some information... but maybe it is not the right way to treat the problem.
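To make the axis/plot link discussed above more concrete, here is a minimal sketch in Python of how a reader might resolve the axes of each plot. It is an illustration under stated assumptions, not a reference implementation: the element and attribute names ("frame", "figure", "axis", "plot", "data", "name", "tuple", "axes") follow section 2, but the axis IDs "t" and "v", the whitespace-separated form of the 'axes' list, the conversion of values to floats and the helper name `read_plots` are all assumptions made for this example. The sample values are taken from the first plot of section 4.

```python
import xml.etree.ElementTree as ET

# Hypothetical document: the axis IDs and the whitespace-separated
# 'axes' list are assumptions made for this sketch.
SAMPLE = """
<frame>
  <figure>
    <axis name="t"><title>Time</title><unit>s</unit></axis>
    <axis name="v"><title>Speed</title><unit>m/s</unit></axis>
    <plot>
      <title>first trial</title>
      <data tuple="2" axes="t v">0.0 0.0 1.0 1.9 2.0 6.1</data>
    </plot>
  </figure>
</frame>
"""


def read_plots(xml_text):
    """Return [(plot title, [(axis title, unit, values), ...]), ...]."""
    root = ET.fromstring(xml_text)
    plots = []
    for figure in root.iter("figure"):
        # Index the axes of this figure by their unique "name" ID.
        axes = {axis.get("name"): axis for axis in figure.findall("axis")}
        for plot in figure.findall("plot"):
            data = plot.find("data")
            dims = int(data.get("tuple"))
            axis_ids = data.get("axes").split()
            if len(axis_ids) != dims:
                raise ValueError("'axes' must list one ID per dimension")
            # itertext() also collects text held in nested elements.
            values = [float(v) for v in "".join(data.itertext()).split()]
            if len(values) % dims != 0:
                raise ValueError("incomplete trailing tuple")
            columns = []
            for k, axis_id in enumerate(axis_ids):
                axis = axes[axis_id]
                columns.append((
                    axis.findtext("title", default=axis_id),
                    axis.findtext("unit", default=""),
                    values[k::dims],  # every dims-th value, from offset k
                ))
            plots.append((plot.findtext("title", default=""), columns))
    return plots


if __name__ == "__main__":
    for title, columns in read_plots(SAMPLE):
        print(title, columns)
```

The columns are kept in a list instead of a dictionary keyed by axis ID, because section 2.3 allows the same axis to be used for more than one dimension of a plot.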