Reading Data into Data Explorer
To read and properly use a dataset,
DX has to know how to interpret a set of numbers in a file. Specifically, DX
needs to know:
- Whether the data is gridded or scattered.
- The array size/dimension if gridded, or the number of points if scattered.
- The positions of the points in space. If gridded, the positions may be
specified by grid origin and grid deltas for each dimension.
- The data format, text or binary.
- The data type of each variable, such as float, integer, double, or byte.
- Whether the data is scalar or vector.
- How the data is grouped. If, for instance, you measure two variables
in a volume, perhaps temperature and pressure, you could arrange the data
as
x, y, z, temp, p
repeated for each x,y,z point, or as
five lists: all the x values, followed by all the y values, etc.
- The order in which the data was written. For instance, C programs tend
to write the last index of an array changing fastest, while FORTRAN programs
tend to write the first index of an array changing fastest.
- Whether there are comments in the file that must be skipped.
- A name for a dataset, for reference in a program.
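The two groupings described in the list above can be sketched in a few lines of Python. This is only an illustration of the layouts, not DX code: the function names parse_records and parse_lists and the sample values are hypothetical.

```python
def parse_records(values, nfields=5):
    """Record grouping: x, y, z, temp, p repeated for each point."""
    rows = [values[i:i + nfields] for i in range(0, len(values), nfields)]
    # Transpose the per-point rows into one list per variable.
    return [list(col) for col in zip(*rows)]

def parse_lists(values, nfields=5):
    """List grouping: all x values, then all y values, and so on."""
    npoints = len(values) // nfields
    return [values[i * npoints:(i + 1) * npoints] for i in range(nfields)]

# The same three sample points written in both layouts (made-up numbers).
record_file = [0, 0, 0, 20.5, 1.0,
               1, 0, 0, 21.0, 1.1,
               0, 1, 0, 19.5, 0.9]
list_file = [0, 1, 0,            # x
             0, 0, 1,            # y
             0, 0, 0,            # z
             20.5, 21.0, 19.5,   # temp
             1.0, 1.1, 0.9]      # p

x, y, z, temp, p = parse_records(record_file)
same = parse_lists(list_file) == [x, y, z, temp, p]
```

Both parsers recover the same five variables, which is why DX must be told which grouping a file uses: the numbers alone do not say.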
A set of keywords is used to specify the items in the list above. Very
often you only need to specify a few items, since most have a reasonable
default. Keywords and their values may be specified by typing them directly
into the format field of an import module, but they are a bit easier to read
and edit if a Format module is also used. First we will look at three examples,
then at the details of the input specifications.
- Example 1 shows a fragment of
DX code which will read a data file named data.file in which there are 100
scattered values, each located at a 3D point. The data file is assumed to
have the form
x,y,z,data repeated for each point. Referring
to the image, there are two dialog boxes open in a DX VPE window.
The Import module dialog box shows that the first input sets the data filename.
The Format dialog box shows six inputs:
- A specification that makes one comma-delimited string out of the next
five strings.
- A keyword which tells DX that a certain class of keywords follows.
- Tells DX there are 100 points in the file.
- Defines the structure for each variable: location is a 3D vector and the
data is a scalar.
- Defines names for each variable. "Locations" is a reserved word which makes
the variable the positions for any other data in the file.
- Indicates that the positions and data are grouped together.
Given these specifications, DX will default to text format, with data type of
float when reading a file.
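To make the assumed file layout of Example 1 concrete, here is a minimal Python sketch of what reading such a file involves. It is not how DX itself is implemented; the function name read_scattered and the three sample records are hypothetical, and a real file would hold 100 records.

```python
def read_scattered(text, npoints):
    """Parse 'x, y, z, data' text records into positions and data values."""
    # Accept commas or whitespace between values.
    tokens = text.replace(",", " ").split()
    values = [float(t) for t in tokens]
    if len(values) < 4 * npoints:
        raise ValueError("file holds fewer than npoints records")
    positions, data = [], []
    for i in range(npoints):
        x, y, z, d = values[4 * i:4 * i + 4]
        positions.append((x, y, z))   # the 3D location of the point
        data.append(d)                # the scalar measured there
    return positions, data

# Hypothetical stand-in for data.file with 3 points instead of 100.
sample = "0,0,0,1.5\n1,0,0,2.5\n0,2,0,3.5\n"
positions, data = read_scattered(sample, npoints=3)
```

The point count, the 3-vector-plus-scalar structure, and the record grouping are exactly the pieces of information the Format inputs supply to DX.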
- The second example we will do by just specifying the keywords. Assume
that you have a 128 by 128 by 64 grid of scalar values stored in a file in
binary format and that the data was written in array-default order from a
FORTRAN program. (The actual grid positions, however, are not in the file.)
Also assume that the grid origin is [0,0,0] and the
grid deltas are 1, 1, 0.3 in the x, y and z directions respectively.
The required specifications (which would be typed into
the fields of a Format module) are:
- %s,%s,%s,%s,%s (concatenates the next strings)
- general
- grid=128x128x64
- majority=column (FORTRAN order--first index changes fastest)
- format=binary
- positions=0, 1, 0, 1, 0, .3
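As a rough illustration of what the column majority and positions settings mean, the Python sketch below unpacks a small binary array written with the first index changing fastest and computes node positions from the origin and deltas. The function names are hypothetical, a 2x3x4 grid stands in for the 128x128x64 file, and 4-byte floats in the machine's native byte order are an assumption about the file.

```python
import struct

def read_column_major(raw, shape, fmt="f"):
    """Unpack a flat binary array written with the first index changing fastest."""
    nx, ny, nz = shape
    count = nx * ny * nz
    # "=" selects native byte order with standard sizes (an assumption here).
    flat = struct.unpack("=%d%s" % (count, fmt), raw)
    # Element (i, j, k) sits at flat offset i + nx * (j + ny * k).
    return lambda i, j, k: flat[i + nx * (j + ny * k)]

def grid_position(index, origin=(0.0, 0.0, 0.0), deltas=(1.0, 1.0, 0.3)):
    """Position of a grid node computed from the origin and per-axis deltas."""
    return tuple(o + n * d for o, n, d in zip(origin, index, deltas))

# Small stand-in grid whose values encode their own (i, j, k) indices.
shape = (2, 3, 4)
values = [100 * i + 10 * j + k
          for k in range(shape[2])      # slowest index
          for j in range(shape[1])
          for i in range(shape[0])]     # fastest index, as FORTRAN writes it
raw = struct.pack("=%df" % len(values), *values)

value_at = read_column_major(raw, shape)
node = grid_position((1, 2, 3))
```

Here value_at(1, 2, 3) returns the value stored for that node, and node is its location in space. Because the positions are regular, they never need to be stored in the file.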
- The third example will be a data set consisting of an array of 3D
vectors sampled on a 2D grid of size 10 by 20, written in C default order in
a text file. The file also has a three-line descriptive header.
The actual grid positions are not in the file.
The required specifications (which would be typed into
the fields of a Format module) are:
- %s,%s,%s,%s (concatenates the next strings)
- general
- grid=10x20
- structure=3-vector
- header=lines 3
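The third example can be sketched the same way. The Python below skips three header lines, then reads 3-vectors stored with the last grid index changing fastest, and finally shows how a printf-style template joins the four keyword strings. The function name read_vector_grid and the generated stand-in file are hypothetical, and the sketch assumes the three components of each vector are stored together.

```python
def read_vector_grid(text, shape=(10, 20), header_lines=3, ncomp=3):
    """Skip the header, then read 3-vectors stored with the last index fastest."""
    nx, ny = shape
    body = text.split("\n")[header_lines:]
    values = [float(t) for line in body for t in line.replace(",", " ").split()]
    if len(values) != nx * ny * ncomp:
        raise ValueError("unexpected number of values")

    def vector_at(i, j):
        # C order: grid element (i, j) is vector number i * ny + j.
        start = (i * ny + j) * ncomp
        return tuple(values[start:start + ncomp])

    return vector_at

# Build a stand-in file: three header lines, then one vector per line.
lines = ["# header line 1", "# header line 2", "# header line 3"]
for i in range(10):
    for j in range(20):          # last index changes fastest
        lines.append("%d %d %d" % (i, j, i + j))
vector_at = read_vector_grid("\n".join(lines))

# The template joins the four keyword strings with commas.
spec = "%s,%s,%s,%s" % ("general", "grid=10x20",
                        "structure=3-vector", "header=lines 3")
```

The resulting spec string has the same form as the one typed directly into the format field of an import module.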
Now we need to systematically show the keyword syntax.