source: TI03-DataExtractor/branches/old_stuff/latest_dx/dx/dxc/manuals/index.html @ 793

Subversion URL: http://proj.badc.rl.ac.uk/svn/ndg/TI03-DataExtractor/branches/old_stuff/latest_dx/dx/dxc/manuals/index.html@793
Revision 793, 7.6 KB checked in by astephen, 13 years ago (diff)

Put all the old code in the old_stuff branch.

1<!DOCTYPE HTML PUBLIC "-//W3C//DTD HTML 4.0 Transitional//EN">
2<HTML>
3<HEAD>
4        <META HTTP-EQUIV="CONTENT-TYPE" CONTENT="text/html; charset=utf-8">
5        <TITLE>Home Page</TITLE>
6        <META NAME="GENERATOR" CONTENT="OpenOffice.org 1.1.1  (Linux)">
7        <META NAME="CREATED" CONTENT="20060317;12440700">
8        <META NAME="CHANGED" CONTENT="20060317;12445300">
9        <META NAME="ProgId" CONTENT="FrontPage.Editor.Document">
10</HEAD>
11<BODY LANG="en-US" DIR="LTR">
12<H1>An overview of the Data Extractor (DX)</H1>
13<H2>Introduction</H2>
14<P>The Data Extractor (DX) is a Python-based tool that allows users
15to access subsets of large geospatial datasets via a common
16interface. This is typically the DX Browser Client, which is
17accessible as a set of web pages. However, users can also interact
18programmatically with the DX-Server which presents a functional
19interface as a Web Service. This document provides an overview of the
20key components of the DX. More detail is, or will soon be, available
21in the following guides:</P>
22<UL>
23        <LI><P STYLE="margin-bottom: 0cm"><B>DX Installation Guide</B> 
24        </P>
25        <LI><P STYLE="margin-bottom: 0cm"><B>DX Data Ingestion Guide</B> 
26        </P>
27        <LI><P STYLE="margin-bottom: 0cm"><B>DX Administrator's Guide</B> 
28        </P>
29        <LI><P STYLE="margin-bottom: 0cm"><B>DX User Guide</B> 
30        </P>
31        <LI><P><B>Guide to Securing the DX</B> 
32        </P>
33</UL>
34<H2>Architecture</H2>
35<P>The following diagram provides an overview of the DX architecture,
36highlighting the main components involved in managing and interacting
37with the package.</P>
38<P><IMG SRC="images/dx_arch.gif" NAME="Graphic1" ALIGN=BOTTOM WIDTH=767 HEIGHT=383 BORDER=0></P>
39<P>Each component is described in more detail below.</P>
40<H3>DX-Server</H3>
41<P>This is the part of the system that does the core processing such
42as file I/O, subsetting and writing of data files. It provides:</P>
43<UL>
44        <LI><P STYLE="margin-bottom: 0cm">a functional interface that can be
45        interrogated by the client applications (represented by the
46        <B>Web Service interface</B> in the above diagram).
47        </P>
48        <LI><P STYLE="margin-bottom: 0cm">a metadata store describing
49        datasets located in a local archive.
50        </P>
51        <LI><P>an I/O layer that extracts requested data (and metadata).
52        </P>
53</UL>
54<P>The DX-Server is controlled by the <B>Administrator</B>.</P>
55<P>Installation requires knowledge of the local file system and
56access to various locations such as the webserver CGI area. The
57<B>Server Configuration</B> module (typically called <I>serverConfig.py</I>)
58is used to set up the correct paths to local resources which can then
59be accessed by the DX-Server. These issues are dealt with further in
60the DX Installation and Administrator Guides.</P>
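<P>As an illustration only, a configuration module of this kind could be
little more than a set of path constants plus a sanity check. The names
below (<I>ARCHIVE_ROOT</I>, <I>METADATA_FILE</I>, <I>OUTPUT_DIR</I>,
<I>CGI_DIR</I>, <I>missing_paths</I>) and all path values are hypothetical
and are not taken from the real <I>serverConfig.py</I>.</P>

```python
# Hypothetical sketch of a server configuration module.
# None of these names or paths come from the real serverConfig.py.
import os

ARCHIVE_ROOT = "/data/archive"                     # assumed location of the local archive
METADATA_FILE = "/usr/local/dx/inputDatasets.xml"  # assumed location of the dataset metadata
OUTPUT_DIR = "/tmp/dx_output"                      # assumed directory for extracted files
CGI_DIR = "/var/www/cgi-bin"                       # assumed webserver CGI area


def missing_paths(paths):
    """Return the entries in `paths` that do not exist on this machine."""
    return [p for p in paths if not os.path.exists(p)]


if __name__ == "__main__":
    # Report any configured location that is not present locally.
    for path in missing_paths([ARCHIVE_ROOT, METADATA_FILE, OUTPUT_DIR, CGI_DIR]):
        print("Missing path:", path)
```

<P>A check like this, run once after installation, would catch mistyped
paths before the server is started.</P>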
61<P>Both the DX-Server and the DX-Clients are Python packages (i.e.
62collections of Python modules). The DX is written using Object
63Oriented Programming in order to make the code straightforward and
64simple for the developer to build upon and modify where required. The
65DX-Server builds upon the Climate Data Analysis Tools (CDAT) package
66which provides the underlying I/O, selection and subsetting tools.
67CDAT is not distributed with the DX.</P>
68<H3>DX-Client (Browser)</H3>
69<P>The DX Browser Client is the main method by which users will
70access the DX. It provides a CGI front-end that a user can access via
71any standard web browser. In a secure configuration users must log in
72to the DX client, but you can also configure the DX to provide open
73access where users can see all datasets. Access can be limited by
74user and/or by roles associated with datasets.</P>
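<P>The role-based part of this access control can be pictured roughly as
follows. This is a sketch under assumptions, not the DX implementation: the
function <I>visible_datasets</I>, its arguments and the convention that an
empty role set means "unrestricted" are all hypothetical.</P>

```python
# Hypothetical sketch: decide which datasets a user may see.
def visible_datasets(user_roles, dataset_roles, secure=True):
    """Return the sorted names of datasets the user may see.

    dataset_roles maps a dataset name to the set of roles allowed to
    access it; an empty set is taken to mean "no restriction".
    With secure=False every dataset is visible (open access).
    """
    if not secure:
        return sorted(dataset_roles)
    held = set(user_roles)
    return sorted(
        name
        for name, required in dataset_roles.items()
        if not required or required & held
    )
```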
75<P>The Administrator will install the DX-Client, which may exist on
76the same machine as the DX-Server or remotely. The client and server
77communicate using SOAP (Simple Object Access Protocol) messages, which
78require the installation of the Python ZSI library (not supplied with
79the DX).</P>
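<P>To give a feel for what travels between client and server, the sketch
below builds a SOAP 1.1 envelope using only the Python standard library
rather than ZSI. The envelope namespace is the standard SOAP 1.1 one; the
operation name <I>listDatasets</I>, the parameter <I>datasetGroup</I> and
the namespace <I>urn:example:dx</I> are hypothetical and do not describe
the real DX interface.</P>

```python
import xml.etree.ElementTree as ET

SOAP_ENV = "http://schemas.xmlsoap.org/soap/envelope/"  # SOAP 1.1 envelope namespace
DX_NS = "urn:example:dx"  # hypothetical namespace for DX operations


def build_request(operation, params):
    """Build a SOAP 1.1 envelope that calls `operation` with string parameters."""
    envelope = ET.Element("{%s}Envelope" % SOAP_ENV)
    body = ET.SubElement(envelope, "{%s}Body" % SOAP_ENV)
    call = ET.SubElement(body, "{%s}%s" % (DX_NS, operation))
    for name, value in params.items():
        # Each parameter becomes an unqualified child element of the call.
        ET.SubElement(call, name).text = str(value)
    return ET.tostring(envelope, encoding="unicode")


# Hypothetical operation and parameter names, for illustration only.
message = build_request("listDatasets", {"datasetGroup": "VFGS Model Output"})
```

<P>In practice a SOAP library generates and parses such messages, so
neither users nor the Administrator would normally write them by hand.</P>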
80<P>The <B>Client Configuration</B> module (normally called
81<I>clientConfig.py</I>) is controlled by the Administrator who
82configures the client for the local system.</P>
83<H3>DX-Client (Command Line)</H3>
84<P>The command line client for the DX allows users to interact
85programmatically with the DX-Server. This is a relatively untested
86feature but has the potential to allow users to embed calls to the
87DX-Server in their programmes and scripts so that data can be
88extracted seamlessly as and when the user needs it.</P>
89<H3>Archive</H3>
90<P>The data archive must currently sit on the same network as the
91DX-Server and be visible via local path names. The archive must
92contain data held in files formatted as NetCDF or GRIB. There is
93also some support available in non-standard versions for pp-format
94(UK Met Office).
95</P>
96<P>The metadata inside the files should adhere (to some degree) to
97the CF-Metadata Convention for NetCDF although some variation will
98normally work. Such data will be easy to ingest without manual
99intervention.</P>
100<H3>Dataset Metadata</H3>
101<P>The DX understands the concept of a &quot;Dataset&quot; as a
102collection of one or more data files containing variables with a
103repeated structure. Typically these are 2D or 3D model fields with
104one time step per file.</P>
105<P>The DX also has the concept of a &quot;Dataset Group&quot;. This
106is a logical collection of &quot;Datasets&quot;. For example:</P>
107<TABLE WIDTH=100% BORDER=1 CELLPADDING=2 CELLSPACING=2>
108        <TR>
109                <TD WIDTH=33%>
110                        <P><B>Dataset Group</B></P>
111                </TD>
112                <TD WIDTH=33%>
113                        <P>VFGS Model Output</P>
114                </TD>
115                <TD WIDTH=34%>
116                        <P>VFGS Model Output</P>
117                </TD>
118        </TR>
119        <TR>
120                <TD WIDTH=33%>
121                        <P><B>Datasets</B></P>
122                </TD>
123                <TD WIDTH=33%>
124                        <P>VFGS Ocean Model Output</P>
125                </TD>
126                <TD WIDTH=34%>
127                        <P>VFGS Atmospheric Model Output</P>
128                </TD>
129        </TR>
130        <TR>
131                <TD WIDTH=33%>
132                        <P><B>Variables</B></P>
133                </TD>
134                <TD WIDTH=33%>
135                        <P>Salinity, SST...</P>
136                </TD>
137                <TD WIDTH=34%>
138                        <P>u-wind, v-wind...</P>
139                </TD>
140        </TR>
141</TABLE>
142<P>By default the DX requires the Administrator to ingest new
143Datasets into the DX-Server before they can be accessed by users. The
144Administrator can also create new Dataset Groups to put Datasets
145under.</P>
146<P>When interacting with the DX (via the Browser Client or Command
147Line Client) the user will make selections in the following
148order:</P>
149<OL>
150        <LI><P STYLE="margin-bottom: 0cm">Dataset Group
151        </P>
152        <LI><P STYLE="margin-bottom: 0cm">Dataset
153        </P>
154        <LI><P STYLE="margin-bottom: 0cm">Variable
155        </P>
156        <LI><P STYLE="margin-bottom: 0cm">Spatial (Horizontal and Vertical)
157        axes
158        </P>
159        <LI><P STYLE="margin-bottom: 0cm">Temporal axes
160        </P>
161        <LI><P>Output file format
162        </P>
163</OL>
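<P>The ordering above can be sketched as a small helper that reports which
selection is still outstanding for a partially built request. The key names
in <I>SELECTION_ORDER</I> and the function <I>next_selection</I> are
hypothetical; they are not the identifiers used by the DX.</P>

```python
# Hypothetical key names mirroring the six selection steps, in order.
SELECTION_ORDER = [
    "datasetGroup",
    "dataset",
    "variable",
    "spatialAxes",
    "temporalAxes",
    "outputFormat",
]


def next_selection(request):
    """Return the first step not yet present in `request`, or None if complete."""
    for step in SELECTION_ORDER:
        if step not in request:
            return step
    return None
```

<P>A client could call such a helper after each user action to decide
which choice to present next.</P>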
164<P>If the user selects two variables the DX will try to subtract
165variable 2 from variable 1 by interpolating variable 2 to the grid of
166variable 1.</P>
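<P>The idea of differencing two variables on the grid of the first can be
shown with a deliberately simplified one-dimensional example in pure Python.
This is not how the DX does it (the DX relies on CDAT for regridding); the
functions <I>interpolate</I> and <I>difference</I> are hypothetical, and
linear interpolation is an assumption made for the sketch.</P>

```python
def interpolate(x_src, y_src, x_new):
    """Linearly interpolate (x_src, y_src) onto the points x_new.

    x_src must be strictly increasing and contain at least two points.
    """
    result = []
    for x in x_new:
        if x < x_src[0] or x > x_src[-1]:
            raise ValueError("target point outside source grid")
        # Find the source interval containing x and interpolate within it.
        for i in range(len(x_src) - 1):
            x0, x1 = x_src[i], x_src[i + 1]
            if x0 <= x <= x1:
                weight = (x - x0) / (x1 - x0)
                result.append(y_src[i] + weight * (y_src[i + 1] - y_src[i]))
                break
    return result


def difference(grid1, values1, grid2, values2):
    """Return variable 1 minus variable 2, evaluated on the grid of variable 1."""
    regridded = interpolate(grid2, values2, grid1)
    return [a - b for a, b in zip(values1, regridded)]


# Variable 1 on a three-point grid, variable 2 on a coarser two-point grid.
diff = difference([0.0, 1.0, 2.0], [10.0, 11.0, 12.0], [0.0, 2.0], [0.0, 4.0])
# diff is [10.0, 9.0, 8.0]
```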
167<P>The Dataset Metadata is stored in an XML file (normally called
168<I>inputDatasets.xml</I>). Ingestion of datasets is described in
169detail in the DX Data Ingestion Guide.</P>
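<P>As a rough picture of how Dataset Groups, Datasets and Variables might
be laid out in such a file, the sketch below parses a small XML fragment
into nested dictionaries. The element and attribute names
(<I>inputDatasets</I>, <I>datasetGroup</I>, <I>dataset</I>,
<I>variable</I>, <I>name</I>) are invented for this example; the real
schema is defined in the ingestion guide, not here.</P>

```python
import xml.etree.ElementTree as ET

# Hypothetical layout; the real inputDatasets.xml schema may differ entirely.
SAMPLE = """
<inputDatasets>
  <datasetGroup name="VFGS Model Output">
    <dataset name="VFGS Ocean Model Output">
      <variable name="Salinity"/>
      <variable name="SST"/>
    </dataset>
    <dataset name="VFGS Atmospheric Model Output">
      <variable name="u-wind"/>
      <variable name="v-wind"/>
    </dataset>
  </datasetGroup>
</inputDatasets>
"""


def load_catalogue(xml_text):
    """Return {group name: {dataset name: [variable names]}} from the XML text."""
    root = ET.fromstring(xml_text.strip())
    catalogue = {}
    for group in root.findall("datasetGroup"):
        datasets = {}
        for dataset in group.findall("dataset"):
            datasets[dataset.get("name")] = [
                variable.get("name") for variable in dataset.findall("variable")
            ]
        catalogue[group.get("name")] = datasets
    return catalogue


catalogue = load_catalogue(SAMPLE)
```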
170<H3>Web Service Interface</H3>
171<P>The Web Service Interface to the DX-Server is a Python script with
172a number of functions that are presented as a Web Service when the
173script is run. This server script then waits for calls from client
174applications. Clients can only access the DX-Server when this script
175is running on the DX-Server machine.</P>
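<P>The pattern described here, a script exposing a fixed set of functions
to remote callers, can be sketched without any network layer as a table of
functions plus a dispatcher. The SOAP transport is deliberately left out,
and the function and operation names (<I>list_dataset_groups</I>,
<I>list_datasets</I>, <I>handle_call</I>) are hypothetical.</P>

```python
# Hypothetical sketch: functions a service script might expose.
def list_dataset_groups():
    """Return the names of the available Dataset Groups (hard-coded here)."""
    return ["VFGS Model Output"]


def list_datasets(dataset_group):
    """Return the Datasets in a Dataset Group (hard-coded here)."""
    known = {
        "VFGS Model Output": [
            "VFGS Ocean Model Output",
            "VFGS Atmospheric Model Output",
        ],
    }
    return known.get(dataset_group, [])


# Only functions listed in this table are reachable by clients.
EXPOSED_FUNCTIONS = {
    "listDatasetGroups": list_dataset_groups,
    "listDatasets": list_datasets,
}


def handle_call(name, *args):
    """Look up an exposed function by name and call it with the given arguments."""
    if name not in EXPOSED_FUNCTIONS:
        raise LookupError("unknown operation: " + name)
    return EXPOSED_FUNCTIONS[name](*args)
```

<P>Keeping the exposed functions in an explicit table makes it clear which
operations a client can reach while the script is running.</P>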
176<H3>Security</H3>
177<P>The DX can be run in secure or non-secure mode. This is
178controlled in the <B>Server</B> and <B>Client Configuration</B> modules.
179The DX provides a set of programmatic hooks that an Administrator can
180plug into the local security system. The DX allows secure tokens to
181be exchanged between client and server, so these can be modified to
182provide an interface to the local security implementation in your
183system.</P>
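<P>One common way to implement a token that a server can verify is to sign
the user name with a shared secret using HMAC. The sketch below shows that
general technique only; it is not the DX token format, and the names
<I>SECRET</I>, <I>issue_token</I> and <I>check_token</I> are
hypothetical.</P>

```python
import hashlib
import hmac

SECRET = b"change-me"  # hypothetical shared secret known to client and server


def _sign(username, secret):
    """Return the hex HMAC-SHA256 signature of the user name."""
    return hmac.new(secret, username.encode("utf-8"), hashlib.sha256).hexdigest()


def issue_token(username, secret=SECRET):
    """Return a token of the form 'username:signature'."""
    return username + ":" + _sign(username, secret)


def check_token(token, secret=SECRET):
    """Return the user name if the token's signature is valid, otherwise None."""
    username, _, signature = token.rpartition(":")
    if not username:
        return None
    # compare_digest avoids leaking information through comparison timing.
    if hmac.compare_digest(signature, _sign(username, secret)):
        return username
    return None
```

<P>A local security hook could replace <I>check_token</I> with a call into
whatever authentication system the site already runs.</P>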
184<P>More details are provided in the <B>Guide to Securing the DX</B>.</P>
190</BODY>
191</HTML>