# CGNS/Python Tree

The CGNS/Python mapping defines a **tree** structure composed of
**nodes**, implemented for the *Python* programming language.
A special **links** structure is also defined to map the management of
files on disk correctly.
The mapping presented here is *NOT* a library [6]; it is the lowest-level
correspondence between a CGNS/SIDS structure and a Python
representation.
This specification is public and can be used as the basis for
interoperability between Python-based CGNS applications.
*Python* is an interpreted language with a textual representation of
its objects; this representation can be used for CGNS/Python trees as well.

## Commitment with CGNS standard

The mapping of the SIDS into a CGNS/Python structure uses the node as its
atomic structure. Compared to CGNS/ADF or CGNS/HDF5, the contents
of a node are
unchanged in CGNS/Python. The way we represent data is different, but all
node attributes found in section 6 of
the *SIDS-to-ADF File Mapping Manual* [1] are applicable to the
CGNS/Python mapping.

The **data type** mapping is changed compared to CGNS/ADF or CGNS/HDF5: the
actual representations of basic types such as integers, floats and strings
are closely mapped to the Python data types.
See the table Data types.

Other elements of the node description are the same as in the CGNS/ADF or
CGNS/HDF5 mappings, in particular the **dimensions** and the order of these
dimensions. The CGNS/SIDS section 3.1 states that the order of dimensions
should follow the so-called *Fortran indexing convention*, in which the first
index varies fastest (column-major order). The CGNS/Python nodes should
respect this requirement.

Warning

A *numpy* array can be created with either a *C* or a *Fortran* flag;
this flag sets, or reports, the order used for the internal
storage of the array. It has **no effect** on the dimensions of a *numpy*
array, only on its internal memory layout.
It is up to the user to manage this flag and its impact on the use of an
array, in particular for reads and writes on disk through the C API.

For example, section 6.1.2.2 describes the `DimensionalUnits_t` node
with dimension values `(32,5)`.
This should be understood as *Fortran* order values, and thus `(32,5)`
should be found unchanged in the shape of the *numpy* array [2],
whatever the status of the *Fortran* flag.

A *numpy* array with the *C* flag set should also have a shape of
`(32,5)`; again, the internal representation of this *C* array has to
be taken into account during read/write operations.

See the *C API* and *Examples and Tips* sections about this requirement and
its impact on *numpy* array use.
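The following minimal sketch illustrates the warning above: the order flag leaves the shape untouched and only changes the memory layout. The variable names are illustrative only.

```python
import numpy

# Same logical shape (32, 5), two different memory layouts.
c_arr = numpy.zeros((32, 5), dtype='S1', order='C')
f_arr = numpy.zeros((32, 5), dtype='S1', order='F')

# The order flag never changes the shape...
same_shape = c_arr.shape == f_arr.shape == (32, 5)

# ...it only changes how the elements are laid out in memory.
c_is_fortran = numpy.isfortran(c_arr)
f_is_fortran = numpy.isfortran(f_arr)

# Byte strides make the difference visible: with 1-byte items, the C
# layout steps 5 bytes along the first axis, the Fortran layout 1 byte.
c_strides = c_arr.strides
f_strides = f_arr.strides
```

An application writing such an array to disk can therefore rely on `shape` for the SIDS dimensions, but has to look at the flags (or strides) to know how the bytes are ordered.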

## The node structure

The structure of a CGNS data set is held in a so-called **CGNS/Python tree**.
The tree is composed of nodes; each node may have children, which are nodes
too. The node structure is a Python sequence (i.e. list or tuple), composed
of four entries: the name, the value, the list of children and the type.

Attribute | Type |
---|---|
Name | string |
Value | numpy array |
Children | list of CGNS/Python nodes |
Type | string |

The CGNS/Python mapping requires that:

- The **name** is a Python string and should not be empty. It should not have more than 32 characters and should not contain `/`. The names `.` (a single dot) and `..` (dot dot) are forbidden, but names combining a dot with other characters are allowed; for example, `Zone.001` is fine.
- The representation of **values** uses the numpy library array. It makes it possible to share an already existing memory zone with the Python object. The numpy mapping of the values is detailed hereafter. An empty value should be represented as `None`; any other representation of an empty value is forbidden.
- The **children list** can be a list or a tuple. The use of a list is strongly recommended but not mandatory. A read-only tree can be declared as a tuple. It is the responsibility of the applications to parse sequences whether these are lists or tuples. A node without children has the empty list `[]` as its children list.
- The **type** is the Python string representation of the CGNS/SIDS type [9] (i.e. it is the same as for CGNS/ADF or CGNS/HDF5). A type string cannot be empty.

We now have a typical CGNS/Python node, which can be represented with the pattern [8]:

```
node = [ <name:string>, <value:numpy.array>, [ <child:node>* ], <cgns-type:string> ]
```

We use here the textual representation of a Python object. All the Python types used in this CGNS/Python mapping have a full textual representation. This is detailed in the next section.

The order of the entries is significant; for example, `node[0]` should always
be the name of the node (Python indexing starts at zero).
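The requirements listed above can be checked mechanically. The sketch below is a hypothetical helper, `check_node`, which is not part of any CGNS library; it only tests the rules stated in this section for a single node and does not descend into the children.

```python
import numpy

def check_node(node):
    """Return a list of problems found in a single CGNS/Python node."""
    if not isinstance(node, (list, tuple)) or len(node) != 4:
        return ['node is not a sequence of four entries']
    problems = []
    name, value, children, cgns_type = node
    if not isinstance(name, str) or not name:
        problems.append('name must be a non-empty string')
    else:
        if len(name) > 32:
            problems.append('name has more than 32 chars')
        if '/' in name:
            problems.append("name contains '/'")
        if name in ('.', '..'):
            problems.append("name is '.' or '..'")
    if value is not None and not isinstance(value, numpy.ndarray):
        problems.append('value must be None or a numpy array')
    if not isinstance(children, (list, tuple)):
        problems.append('children must be a list or a tuple')
    if not isinstance(cgns_type, str) or not cgns_type:
        problems.append('type must be a non-empty string')
    return problems

good = ['Zone.001', None, [], 'Zone_t']
bad = ['..', 3.14, None, '']
good_problems = check_node(good)
bad_problems = check_node(bad)
```

Here `good` passes every check, while `bad` breaks one rule per entry.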

We now see that a CGNS/Python tree is a node. This node has children,
which have children, and so on. Any node can be held as a subpart of a
complete tree: we say each node is a *sub-tree*.
Our CGNS/Python tree has a *root* node, which is its first node.
There is no clear definition of a *root* node in the CGNS/SIDS or in the
SIDS mappings.
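Because every node is itself a sub-tree, a traversal is a short recursive function. The sketch below uses a hypothetical `walk` generator; the nodes carry `None` values only to keep the example short, so they are not meant to be SIDS-compliant.

```python
def walk(node, parent_path=''):
    """Yield (path, node) for a node and all of its descendants."""
    path = parent_path + '/' + node[0]
    yield path, node
    for child in node[2]:
        yield from walk(child, path)

grid = ['GridCoordinates', None, [], 'GridCoordinates_t']
zone = ['Zone1', None, [grid], 'Zone_t']
base = ['Base', None, [zone], 'CGNSBase_t']

# Any node can be the starting point, since each node is a sub-tree.
paths = [path for path, _ in walk(base)]
zone_paths = [path for path, _ in walk(zone)]
```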

In the case of a CGNSBase_t level node, the CGNS/ADF or CGNS/HDF5 mapping defines a sound node which can be mapped to CGNS/Python. However, the CGNS/SIDS states that several bases can be found in a CGNS tree. The parent node of a base would have the pattern:

```
root = [ <CGNSLibraryVersion:node>, <CGNSBase:node>* ]
```

This is not consistent with a *normal* node: it does not follow the node
pattern, so applications would need a specific
way to manage this first node. This lack of a root node is not that important
when you use the CGNS/MLL, because its functions hide the actual node
implementation. With CGNS/Python, the user can manage the nodes as true
Python objects, and we have to provide a sound interface, or at
least one as sound as possible. To remove this exception, and for consistency,
the CGNS/Python mapping defines a new type for the root node, so that the
tree root, or first node, is itself a compliant CGNS/Python node;
see the CGNSTree_t type section.

## Textual representation

It is possible to declare a CGNS/Python node as a textual
representation. Here is an example taken from a zone connectivity sub-tree,
written with CGNS/Python in textual mode: a simple `PointRange` node with
two 3D indices:

```
import numpy
pr=['PointRange',
    numpy.array([[1,25],[1,9],[1,1]],dtype=numpy.int32,order='F'),
    [],
    'IndexRange_t']
```

The `PointRange` node has no child: the children list is an empty
list. The values of the array are initialized with a list whose element
order matches the Fortran indexing: in this
example the first point indices are `[1,1,1]` and the second point
indices are `[25,9,1]`.

The evaluation of this string by the Python interpreter creates a CGNS/Python
compliant node as a Python list. Please note the types in this
`pr` node: there are only native Python types (list, string,
integer) and *numpy* types or enumerates. You need a variable to
hold the node or the CGNS sub-tree; if you have no reference to the
created Python objects, they become unreachable and are garbage-collected.

The textual representation can be imported like any Python source file, with every use Python allows.

Warning

Python lists are objects. When you refer to a list you do not
copy it unless you explicitly ask for a copy. This is important because
if you modify an existing list you modify an object that could be used
by others. In the CGNS/Python mapping the children of a node are a list of
nodes. If you refer to such a list without a copy, any modification of this
children list will impact all nodes using it. This is detailed in the
section *Examples and Tips*.
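As a short illustration of this warning, the sketch below shares one children list between two nodes and then uses `copy.deepcopy` from the standard library to obtain an independent sub-tree. The node names are illustrative only.

```python
import copy

shared_children = []
node_a = ['FlowSolution#A', None, shared_children, 'FlowSolution_t']
node_b = ['FlowSolution#B', None, shared_children, 'FlowSolution_t']

# Both nodes refer to the same list object: appending through one
# node changes the other as well.
node_a[2].append(['Pressure', None, [], 'DataArray_t'])
alias_len = len(node_b[2])

# A deep copy gives node_c its own children list (and its own copies
# of the child nodes), so later changes stay local to node_c.
node_c = copy.deepcopy(node_a)
node_c[2].append(['Density', None, [], 'DataArray_t'])
lengths = (len(node_a[2]), len(node_c[2]))
```

Note that `copy.deepcopy` also duplicates the *numpy* arrays held as values, which may be costly for large data.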

## Numpy array mapping

A CGNS/Python node value is a *numpy* array; this Python object contains
the **number of dimensions**, the **dimensions**, the **data type** and
the actual
**data** array. This information is therefore implicit and is not a separate
part of the *node* structure.
As we want the node to be as generic as possible, we require that
even single values be stored as a *numpy* array: a single
integer, a single float or a single string should be embedded into a *numpy*
array.

As mentioned before, an empty value has to be represented by `None`,
which is a native Python value, not a *numpy* value:

```
gc=['Grid#002',None,[cx,cy,cz],'GridCoordinates_t']
```

Here `cx`, `cy` and `cz` are nodes, not arrays.

The *numpy* end-user interface makes it possible to deduce
some of this required information from the creation
parameters. The number of dimensions is the length of the so-called
shape. The dimensions can be forced for empty values or can be deduced
from the data itself:

```
a=numpy.array([1.4])
b=numpy.ones((5,7,3),'i')
```

The first declaration has dimensions `(1,)`, number of dimensions 1 and data
type `float64`, all deduced from the data declaration; the second has
dimensions `(5,7,3)`, number of dimensions 3 and data type set to
`int32`.

A *numpy* array can be declared as *C order* or *Fortran order*.
This mapping has no requirement on whether the internal layout of the
memory should be *C* or *Fortran*. However, an array should have a shape
with the same order of dimensions as described in
the *SIDS-to-ADF File Mapping Manual* ([11]).

Warning

If you use the Python C API, it is the responsibility of the
application to check the *numpy* ordering flag and to manage
the arrays with respect to memory layout. See the *C API* section.

The way to get the node data information regarding the [11]
data type and dimension requirements is to access the *numpy* object
attributes:

```
pr=numpy.array([[1,2,3],[4,5,6]])
dims=pr.shape
ndims=len(pr.shape)
datatype=pr.dtype
fortranorder=numpy.isfortran(pr)
corder=not numpy.isfortran(pr)
```

## Data types

A value is a numpy array; the contents of an array are homogeneous and have a single data type. The data types of your CGNS/Python arrays depend on the data type defined in [11].

The type of the data can be set at creation time; the numpy type is associated with the ADF type required by the CGNS/SIDS. A wrong data type, even if the result silently looks like what you want, leads to a non-compliant CGNS tree. The required mapping for the end-user interface uses the following types:

ADF type | Numpy type(s) | Remarks |
---|---|---|
I4 | `'i'` `int32` | (1) |
I8 | `'l'` `int64` | (2) |
R4 | `'f'` `float32` | (3) |
R8 | `'d'` `float64` | (4) |
C1 | `'c'` `'\|S1'` | (5) |

All other ADF or numpy types are ignored. The string type is a bit special, see the remark (5) about the strings used in numpy arrays.

**Remarks:**

1. The 32-bit precision has to be forced, because the default integer data type is `int64` (on most platforms). To create an I4 array, you can use: `numpy.array([1,2,3],'i',order='F')`
2. The 64-bit precision is the default integer precision. To create an I8 array, you can use: `numpy.array([1,2,3],order='F')`
3. The 32-bit precision has to be forced, because the default float data type is `float64`. To create an R4 array, you can use: `numpy.array([1.4],'f',order='F')`
4. The 64-bit precision is the default float precision. To create an R8 array, you can use: `numpy.array([1.4],order='F')`
5. The array has to be created as a multi-dimensional char array. An incorrect creation with a simple statement such as `numpy.array('GoverningEquations')` produces a *wrong* zero-dimension array. A correct creation for a single value could be `numpy.array(tuple('GoverningEquations'),'|S1')`, where the shape (i.e. the dimensions of the array) is `(18,)`.
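The table above can be turned into a small lookup, sketched here under the assumption that an application wants to find the ADF type of an existing value. The `adf_type` helper and the `ADF_TYPES` list are hypothetical names; types outside the table return `None`, in line with the statement that all other types are ignored.

```python
import numpy

# (ADF type, numpy dtype name) pairs, following the table in this section.
ADF_TYPES = [('I4', 'int32'), ('I8', 'int64'),
             ('R4', 'float32'), ('R8', 'float64'),
             ('C1', 'S1')]

def adf_type(value):
    """Return the ADF type of a numpy array, or None if it is not mapped."""
    for adf_name, numpy_name in ADF_TYPES:
        if value.dtype == numpy.dtype(numpy_name):
            return adf_name
    return None

i4 = numpy.array([1, 2, 3], dtype='int32', order='F')
r8 = numpy.array([1.4], order='F')
c1 = numpy.array(tuple('GoverningEquations'), '|S1')

# The incorrect string creation from remark (5): a zero-dimension
# array whose data type is not in the table.
wrong = numpy.array('GoverningEquations')
```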

# Specific CGNS/Python topics

## The CGNSTree_t type

The tree structure of a CGNS data set is broken by the exception of the root node. We take the opportunity of this new CGNS/Python mapping to add a consistent root node for the CGNS tree [7].

The CGNSTree_t type is a node with the pattern:

```
root= [ <name:string>, None, [ <CGNSLibraryVersion:node>, <CGNSBase:node>* ], 'CGNSTree_t' ]
```

The children list is the CGNS/ADF-like root node. The CGNSTree node has a user-defined name, no value and a fixed CGNSTree_t type.
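A concrete instance of this pattern could look like the sketch below. The version value `3.1`, the node names and the use of a single `float32` value for `CGNSLibraryVersion` are assumptions made for the illustration, not requirements stated in this document.

```python
import numpy

# Assumption: CGNSLibraryVersion holds a single R4 value; 3.1 is only
# an illustrative version number.
version = ['CGNSLibraryVersion',
           numpy.array([3.1], dtype=numpy.float32),
           [],
           'CGNSLibraryVersion_t']
base = ['Base#1', numpy.array([3, 3], dtype=numpy.int32), [], 'CGNSBase_t']

# The root has a user-defined name, no value and the fixed CGNSTree_t type.
root = ['CGNSTree', None, [version, base], 'CGNSTree_t']

# The root now follows the same four-entry pattern as any other node.
child_types = [child[3] for child in root[2]]
```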

## Legacy CGNS types alternative

The CGNS/SIDS defines all CGNS types and has a rule to suffix them with `_t`.
There are some exceptions, where CGNS/SIDS types have been translated
into strings with a special syntax.

The CGNS/Python mapping allows the use of alternate types for these: the user can either use the legacy type or the alternate CGNS/Python type. The alternate types are:

CGNS/SIDS type | CGNS/Python optional type |
---|---|
`"int[1+...+IndexDimension]"` | `DiffusionModel_t` |
`"int[IndexDimension]"` | `Transform_t` |
`"int[IndexDimension]"` | `InwardNormalIndex_t` |
`"int"` | `EquationDimension_t` |

Please note the `"` character, which is part of the CGNS legacy type.

Warning

This CGNS/Python feature adds *NON-SIDS* types; the user application should
remove them when writing to disk, and may add them back when reading, in
order to stay compliant with CGNS/ADF or CGNS/HDF5.
The CGNS.MAP module has an option to check and remove these alternate
types. As long as your application only
interoperates with other CGNS/Python applications, there should be
no problem.
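The removal of alternate types before a write can be sketched as follows. This is a hypothetical helper, not the CGNS.MAP option mentioned above; only the alternate-to-legacy direction is shown, because the reverse is ambiguous (`"int[IndexDimension]"` corresponds to two alternate types).

```python
# Alternate CGNS/Python type -> legacy CGNS/SIDS type string.
# The double quotes are part of the legacy type.
LEGACY_TYPES = {
    'DiffusionModel_t': '"int[1+...+IndexDimension]"',
    'Transform_t': '"int[IndexDimension]"',
    'InwardNormalIndex_t': '"int[IndexDimension]"',
    'EquationDimension_t': '"int"',
}

def to_legacy_types(node):
    """Return a new sub-tree with alternate types replaced by legacy ones.

    The node lists are rebuilt; the values are shared, not copied.
    """
    name, value, children, cgns_type = node
    return [name,
            value,
            [to_legacy_types(child) for child in children],
            LEGACY_TYPES.get(cgns_type, cgns_type)]

eqs = ['FlowEquationSet', None,
       [['EquationDimension', None, [], 'EquationDimension_t']],
       'FlowEquationSet_t']
legacy = to_legacy_types(eqs)
```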

## Links

The **links** are used to set and get CGNS symbolic link information.
This information is relevant only during read/write operations on disk.
A CGNS/Python tree cannot have embedded links: as this tree is a list of
lists, making a link to another list makes no sense in Python [3].
The **links** list is extra information, not embedded into the CGNS/Python
tree, and only used for disk-related operations.

Warning

If a CGNS/Python application does not want to follow a
link, and therefore accepts some *missing* data in its CGNS tree, the
so-called *linked-from* node has to be removed from its parent's
children list.

This **links** list is an unsorted list of *link-entries*, with only one
entry per link. A *link-entry* is an ordered list of Python string values:

1. The **target directory name** is the linked-to directory name, as it would be used to open it. It should be a valid absolute/relative file path as a plain Python string, or `None`.
2. The **target file name** is the linked-to file name, as it would be used to open it. It should be a valid absolute/relative path as a plain Python string. Its path-prefix part and its file extension part can be empty, but the filename itself cannot.
3. The **target node name** is the linked-to node name as a plain Python string. It should be the *absolute* path of the node in the linked-to file. This value cannot be empty.
4. The **local node name** is the *absolute path* of the node in the source CGNS/Python tree. This plain Python string cannot be empty.

Links found in a second-level file, in other words the links in a file you
are parsing after following a first link, are **always** expressed as if you
were in the *target file*. A list of links can then be reused from one
parse to another, because the `links` list is relative to the target file.
The example hereafter can be an input as well as an output links list:
an application would set it for a save or get it from a load:

```
[ ['/tools/CFD/ref#M6','M6_A.cgns' ,'/Base#1/ReferenceState',
'/Base/ReferenceState'],
['/tmp/restart' ,'M6#001.cgns','/Base#1/Zone1/FlowSolution#EndOfRun',
'/Base/Zone1/FlowSolution#Init']
]
```

The target directory name is kept distinct from the file name, because you can have different actual target files depending on the search paths you set. This information is relevant as output from the read of an actual file; it should be set to `None` or ignored for a write. During a write, the only information taken into account should be the target file name, the target node name and the local node name.

In the example above, the entries are interpreted differently depending
on whether they are the result of a read or directives for a write. In the
case of a read, the first entry means that the file we have read has a node
`/Base/ReferenceState` which is a link to the node `/Base#1/ReferenceState`
in the file `M6_A.cgns`. The first directory of the file search path
in which the file `M6_A.cgns` has been found is `/tools/CFD/ref#M6`.
In the case of a write, the same entry means that the application should
create a link for the node `/Base/ReferenceState` when it reaches it.
This link would have `M6_A.cgns` as target file and
`/Base#1/ReferenceState` as target node. The `/tools/CFD/ref#M6` value
is ignored.

Warning

The links list is relative to the current tree. If you want to track links of links your application has to manage this by itself, setting or getting links list during the different tree traversals.
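The four positions of a *link-entry* can be given names with a `namedtuple`, as in the sketch below. `LinkEntry` and `links_for_write` are hypothetical helpers: the mapping itself only requires plain lists of strings.

```python
from collections import namedtuple

# Hypothetical helper type naming the four ordered entries.
LinkEntry = namedtuple('LinkEntry',
                       'target_dir target_file target_node local_node')

links = [
    ['/tools/CFD/ref#M6', 'M6_A.cgns', '/Base#1/ReferenceState',
     '/Base/ReferenceState'],
    ['/tmp/restart', 'M6#001.cgns', '/Base#1/Zone1/FlowSolution#EndOfRun',
     '/Base/Zone1/FlowSolution#Init'],
]

def links_for_write(links):
    """Set the target directory to None, since it is ignored for a write."""
    return [[None] + list(entry[1:]) for entry in links]

first = LinkEntry(*links[0])

# Index the entries by local node path, e.g. to test during a tree
# traversal whether the node being written should become a link.
by_local_node = {LinkEntry(*entry).local_node: entry for entry in links}
write_links = links_for_write(links)
```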

## C API

There is no requirement on the way you would create or manage a numpy array at the C API level. But you have to remember that the definition of the node contents is SIDS-to-ADF which states that data arrays and index ordering use the Fortran convention.

You can manage all your numpy arrays with the C order in memory, but you have
to be sure that the storage on disk, i.e. using ADF or HDF5, has the
correct Fortran order. The storage also has to be contiguous in memory.
When you create or obtain a copy of a numpy array you can set a flag to
force a C or Fortran ordering:
one of the `NPY_C_CONTIGUOUS` or `NPY_F_CONTIGUOUS` flags can be set
(named `NPY_ARRAY_C_CONTIGUOUS` and `NPY_ARRAY_F_CONTIGUOUS` in recent
numpy releases).
When the C-contiguous flag is set, it is up to the application to
produce a Fortran memory layout and a Fortran index ordering while
reading/writing data to/from a CGNS/ADF or CGNS/HDF5 file [5].

The *numpy* *C API* allows memory zones to be shared. In other words, an
existing *Fortran* or *C* array can be used directly as the data of a *numpy*
array, without duplication. You can reduce memory use when your application
can handle this; in that case, make sure the `NPY_OWNDATA` flag is not set,
so that *numpy* does not release the array memory when the *numpy*
array object is garbage-collected.
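The same layout check can be sketched at the Python level, which is often enough before handing an array to a C extension. This example does not use the C API itself; it relies on `numpy.asfortranarray` and on the `flags` attribute of the array.

```python
import numpy

# A C-ordered array with the SIDS shape (IndexDimension, 2).
a = numpy.array([[1, 25], [1, 9], [1, 1]], dtype=numpy.int32)

# asfortranarray returns a Fortran-contiguous array with the same shape
# and values; here it has to copy, because `a` is C-ordered.
f = numpy.asfortranarray(a)

same_shape = a.shape == f.shape
c_contiguous_before = a.flags['C_CONTIGUOUS']
f_contiguous_after = f.flags['F_CONTIGUOUS']

# Values read with the first index varying fastest, i.e. the order
# expected for storage on disk.
fortran_sequence = f.ravel(order='F').tolist()
```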

# Examples and tips

Python comes from the C world, as does the numpy library. This means that many behaviors assume C order for dimensions. The CGNS/Python mapping states that arrays should have a Fortran indexing for their actual data and that the dimension order of the data is the one detailed in the [11] and [12] documents.

We give here some known issues and tips to handle this Fortran indexing in CGNS/Python. We use specific CGNS/SIDS structures to illustrate our examples.

## IndexRange_t

The IndexRange_t is an integer array of dimensions (IndexDimension,2),
as detailed in [1]. The node data, in the example here, is two points with
three indices. The *Python-ish* way to define them is to have a list of
two lists of integers, which leads to problems if you forget the Fortran
order. We want to set a node with the following Python code:

```
node=['PointRange', a, [], 'IndexRange_t']
```

Now we see how to declare a correct `a` variable as a *numpy* array.
If you do not specify an order to *numpy*, the default is the C order:

```
>>> a=numpy.array([[1,2,3],[4,5,6]],dtype=numpy.int32)
>>> numpy.isfortran(a)
False
>>> a[0]
array([1,2,3], dtype=int32)
>>> a.shape
(2,3)
```

This *numpy* array is correct, but you would have to transpose its dimensions
and memory layout before storing it on disk.
Or you can enter the list itself in an explicit Fortran order:

```
>>> a=numpy.array([[1,4],[2,5],[3,6]],dtype=numpy.int32)
>>> numpy.isfortran(a)
False
>>> a[0]
array([1,4], dtype=int32)
>>> a.shape
(3,2)
```

In that case, the shape is correct, but the user has no means to know whether your convention is C or Fortran. You can set the Fortran flag for this. A possible creation of the array above is then:

```
>>> a=numpy.array([[1,4],[2,5],[3,6]],dtype=numpy.int32,order='F')
>>> numpy.isfortran(a)
True
>>> a[0]
array([1,4], dtype=int32)
>>> a.shape
(3,2)
```

Then an application can detect that your array has *Fortran* order and should
be stored as found, without any transpose.
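One way to keep the *Python-ish* input while still producing the expected array is to transpose once, inside a small helper. The function name `index_range` is hypothetical.

```python
import numpy

def index_range(first_point, last_point):
    """Build an (IndexDimension, 2) int32 array in Fortran order."""
    points = numpy.array([first_point, last_point], dtype=numpy.int32)
    # points has shape (2, IndexDimension); its transpose has the
    # (IndexDimension, 2) shape required for an IndexRange_t.
    return numpy.asfortranarray(points.T)

a = index_range([1, 1, 1], [25, 9, 1])
node = ['PointRange', a, [], 'IndexRange_t']
```

The caller writes the two points as plain lists, and the node value ends up with shape `(3,2)` and the *Fortran* flag set.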

## IndexArray_t

Here is another example, switching from one order to another; this is used to add a point to a list in an easier way:

```
node=['PointList', a, [], 'IndexArray_t']
```

A possible creation of the array `a` above is then:

```
>>> a=numpy.array([[1,4],[2,5],[3,6]],dtype=numpy.int32,order='F')
>>> a
array([[1, 4],
       [2, 5],
       [3, 6]], dtype=int32)
>>> a=numpy.array(a.T.tolist()+[[7,8,9]],dtype=numpy.int32,order='F').T
>>> a
array([[1, 4, 7],
       [2, 5, 8],
       [3, 6, 9]], dtype=int32)
```

You see that the syntax is hard to read: we use the *numpy* transpose
attribute `T` to switch from *Fortran* to *C* order and back.
If you start with the *C* order, the Python syntax is clear:

```
>>> a=numpy.array([[1,2,3],[4,5,6]],dtype=numpy.int32)
>>> a
array([[1, 2, 3],
       [4, 5, 6]], dtype=int32)
>>> a=numpy.array(a.tolist()+[[7,8,9]],dtype=numpy.int32)
>>> a
array([[1, 2, 3],
       [4, 5, 6],
       [7, 8, 9]], dtype=int32)

The application in charge of writing to disk would then
detect the absence of the *Fortran* flag and transpose the array and its
dimensions.
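As an alternative to the list round trip, a point can be appended as a new column with `numpy.concatenate`, which keeps the `(IndexDimension, number of points)` shape throughout. This is only a sketch of one possible approach, not the method prescribed by the mapping.

```python
import numpy

# Point list with shape (IndexDimension, number of points).
a = numpy.array([[1, 4], [2, 5], [3, 6]], dtype=numpy.int32, order='F')

# The new point (7, 8, 9) is a column of shape (IndexDimension, 1).
new_point = numpy.array([[7], [8], [9]], dtype=numpy.int32)

# Concatenate along the second axis, then make sure the result is
# Fortran-contiguous again.
a = numpy.asfortranarray(numpy.concatenate((a, new_point), axis=1))
node = ['PointList', a, [], 'IndexArray_t']
```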

## DimensionalUnits_t

This node contains strings. Strings are an issue in CGNS/Python because
we want to use the raw *numpy* array level (instead of the module *numpy*
provides for string manipulation). We want to keep a common interface for
all nodes and we do not want an exception for strings. The
DimensionalUnits_t node can be defined as:

```
node=['DimensionalUnits', a, [], 'DimensionalUnits_t']
```

Now we see how we can define the *numpy* array in the variable `a`.
The DimensionalUnits_t states that we need a (32,5) array of chars.
In the case of a fixed-size multi-dimensional string array, each entry
should be split into a sequence with a fixed maximum size (usually 32
chars):

```
a=numpy.array([
    tuple('%-32s'%'Kilogram'),
    tuple('%-32s'%'Meter'),
    tuple('%-32s'%'Second'),
    tuple('%-32s'%'Kelvin'),
    tuple('%-32s'%'Radian'),
    ],'|S32',order='F').T
```

The shape of the resulting array is `(32,5)`; again, note the `T`
at the end of the command, which produces the transpose.
You can use an `S32`, `|S1` or `c` type directive.
An important point with strings stored as arrays is the trailing
spaces needed to fill the array cells. You have to strip them (for example
with `str.strip`) before any string operation, unless your Python
application is aware of this *forced* size.
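The padding and stripping steps can be wrapped in two small helpers, sketched below with the `|S1` type from the Data types table. `strings_to_array` and `array_to_strings` are hypothetical names, and the sketch assumes ASCII strings no longer than the chosen width.

```python
import numpy

def strings_to_array(strings, width=32):
    """Pack strings into a (width, len(strings)) array of single chars.

    Assumes every string is ASCII and fits in `width` characters.
    """
    rows = [tuple(s.ljust(width)) for s in strings]
    return numpy.asfortranarray(numpy.array(rows, dtype='|S1').T)

def array_to_strings(array):
    """Rebuild the Python strings, dropping the trailing spaces."""
    return [column.tobytes().decode('ascii').rstrip()
            for column in array.T]

units = ['Kilogram', 'Meter', 'Second', 'Kelvin', 'Radian']
a = strings_to_array(units)
node = ['DimensionalUnits', a, [], 'DimensionalUnits_t']
decoded = array_to_strings(a)
```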

## Zone_t

Here we have an interesting example of the use of the data of a node.
The Zone_t node holds the dimensions of the *zone*. These dimensions are
data, and these data values should be used as the *dimension* attribute
of the children nodes. In other words, the user takes the Zone_t
dimensions and creates a *numpy* array with them:

```
zonenode=['Zone001',zonedims,zonechildrenlist,'Zone_t']
```

The `zonedims` *numpy* array can be set as:

```
zonedims=numpy.array([[3,2,0],[5,4,0],[7,6,0]],dtype=numpy.int32,order='F')
```

in the case of a 3D structured zone with `(ni,nj,nk)=(3,5,7)`.
If you want to create a solution array with these dimensions, you can
use the following syntax:

```
zonevertexsize=zonedims[:,0]
zonecellsize=zonedims[:,1]
zonevertexboundarysize=zonedims[:,2]
```

This numpy syntax allows the user to take the whole column as a
so-called *slice*.
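These slices can then be used directly as shapes when allocating the children data. The sketch below is illustrative: the node name `Pressure` and the choice of `float64` are assumptions, not requirements of the mapping.

```python
import numpy

zonedims = numpy.array([[3, 2, 0], [5, 4, 0], [7, 6, 0]],
                       dtype=numpy.int32, order='F')
zonevertexsize = zonedims[:, 0]
zonecellsize = zonedims[:, 1]

# Cell-centered data uses the cell sizes, vertex data the vertex sizes.
pressure = numpy.zeros(tuple(zonecellsize), dtype=numpy.float64, order='F')
coordinate_x = numpy.zeros(tuple(zonevertexsize), dtype=numpy.float64,
                           order='F')

pressure_node = ['Pressure', pressure, [], 'DataArray_t']
```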

## Sub-tree imports

For example, the following snippet, meant to be imported, defines a truncated ReferenceState:

```
import numpy
refvalues=[
    ['Mach',numpy.array([0.2]),[],'DataArray_t'],
    ['Reynolds',numpy.array([23300000.0]),[],'DataArray_t'],
    ['LengthReference',numpy.array([0.5]),[],'DataArray_t'],
    ['Density',numpy.array([1.22524863848]),[],'DataArray_t'],
]
data=['ReferenceState',None,refvalues,'ReferenceState_t']
```

Once imported, this node can be inserted by your Python code into its own
structure (here our previous code snippet is in the file `refstate.py`):

```
import numpy
import refstate
tree=['CGNSTree',None,[],'CGNSTree_t']
base=['Fuselage',numpy.array([3,3],dtype=numpy.int32),[],'CGNSBase_t']
tree[2].append(base)
base[2].append(refstate.data)
```