From 9b832b7088326016278c411f110899413869daa2 Mon Sep 17 00:00:00 2001 From: Eivind Jahren Date: Wed, 25 Oct 2023 14:30:55 +0200 Subject: [PATCH] Remove outdated documentation --- python/README | 7 - python/txt-doc/README.txt | 4 - python/txt-doc/devel.txt | 429 ------------------------- python/txt-doc/import.txt | 231 -------------- python/txt-doc/install.txt | 88 ------ python/txt-doc/tips.txt | 618 ------------------------------------- 6 files changed, 1377 deletions(-) delete mode 100644 python/README delete mode 100644 python/txt-doc/README.txt delete mode 100644 python/txt-doc/devel.txt delete mode 100644 python/txt-doc/import.txt delete mode 100644 python/txt-doc/install.txt delete mode 100644 python/txt-doc/tips.txt diff --git a/python/README b/python/README deleted file mode 100644 index 10154c0b78..0000000000 --- a/python/README +++ /dev/null @@ -1,7 +0,0 @@ -Some of the code in the ERT codebase has been wrapped with -Python. - - - -More detailed documentation of the wrapping can be found in -doc/devel.txt. diff --git a/python/txt-doc/README.txt b/python/txt-doc/README.txt deleted file mode 100644 index 7a7a6b294a..0000000000 --- a/python/txt-doc/README.txt +++ /dev/null @@ -1,4 +0,0 @@ -The files 'devel.txt', 'import.txt', 'tips.txt' and 'install.txt' in the 'txt-doc/' -directory are old text files with documentation. The content of these files is - -to a varying degree - relevant and valuable, but it should be integrated into -the rst files, and then deleted. 
diff --git a/python/txt-doc/devel.txt b/python/txt-doc/devel.txt deleted file mode 100644 index 4ac0fdd086..0000000000 --- a/python/txt-doc/devel.txt +++ /dev/null @@ -1,429 +0,0 @@ -Developer documentation ------------------------ - - 1 Summary - 2 About ctypes - 2.1 Loading the shared library - 2.2 Type mappings - 2.2.1 Fundamental types - 2.2.2 User defined types and classes - 3 The C code - 4 Common structure of the wrapping - 4.1 Loading the shared library - 4.2 Prototyping - 4.3 Python classes - 5 Garbage collection and 'data_owner' - 6 Installation in Equinor - - - -1 Summary - -The ert Python classes are based on wrapping (some) of the C libraries -which constitute the ERT application. The wrapping is based on the -python ctypes module which does the wrapping dynamically. The python -classes are quite thin; most of the actual code is in C. However, a -user of the ert Python bindings should NOT need to know that there is some C -code deep down. - - -2 About ctypes - -ctypes is a standard Python module for dynamic wrapping of C code. The -C code should be compiled into a shared library. The ctypes library works -by loading the shared library with dlopen(). In addition there is a -quite good system for transparently mapping between Python types and C -types. More extensive documentation of ctypes is available at: -http://www.python.org/org/library/ctypes.html - -2.1 Loading the shared library - -In the ert Python wrapping, loading the shared library is handled by -the ert.util.clib function load(). When the shared library has been -loaded, all the symbols of the shared library are available as -attributes of the ctypes load handle. 
At the lowest level the loading -of shared libraries is done through the ctypes.CDLL() call like: - - lib_handle = ctypes.CDLL( "libname" ) - -however in the ert py library this is wrapped by the function load() -in the ert.util.clib module: - - import ert.util.clib as clib - lib_handle = clib.load("name1" , "name2" , "name3") - -The clib.load() function will only load one library, but it can take -several names for that library and will try loading until one of them -succeeds. This is quite useful because several of the standard -libraries are identified with different names on the different RedHat -releases, for instance zlib is loaded like this: - - zlib_handle = clib.load("libz", "libz.so.1") - -When the library has been loaded, all functions of the shared library -will be available as python function attributes of the library -handle. I.e. if loading the standard library like: - - clib_handle = clib.load( "libc" ) - -The attribute clib_handle.getenv will be a pointer to a Python -function object which wraps the standard library getenv() -function. Before this is really usable we have to "tell" the function -object which input arguments and return value to expect; this is done with -the restype attribute and the argtypes attribute of the function -object. More details of the prototype process can be found in section -4.2. - - -2.2 Type mappings - -The ctypes library automagically handles conversion between common -C-types and Python types, so for the example above we would be able to -write: - - PATH = clib_handle.getenv( "PATH" ) - -And the conversion between Python strings and NULL terminated char* in C is -handled transparently. In the ert wrapping, setting the type of return -values and input values is handled by the function prototype() in -ert.cwrap.cwrap. - -The type mappings necessary to get the prototype() function to work are -maintained in the CWrapper class in ert.cwrap.cwrap.py. 
During -initialization of the ert Python code there are many calls -associating the name of a C type you wish to use in the prototype -process (as a string) and the corresponding Python type: - - CWrapper.registerType( "int" , ctypes.c_int ) - CWrapper.registerType( "double" , ctypes.c_double ) - CWrapper.registerType( "char*" , ctypes.c_char_p ) - CWrapper.registerType( "int*" , ctypes.POINTER( ctypes.c_int )) - - - - -2.2.1 Fundamental types - -All the fundamental C types are known to the ctypes library, and -transparently converted <-> the correct Python type. - - C type ctypes Python type - -------------------------------------------------------- - int c_int int - float c_float float - double c_double float - ..... - char * (NULL terminated) c_char_p string - -------------------------------------------------------- - -All these types can be created by calling them with an optional -initializer of the correct type and value: - - cint = ctypes.c_int( 42 ) - -With the function ctypes.POINTER() you can create pointers to the -fundamental types - but this is approaching dangerous territory! - - -2.2.2 User defined types and classes - -Let us say you have a C function which expects a pointer to an -abstract data type as input: - - void abstract_method( abstract_type * at ); - -And that you have implemented the Python class AT which wraps a -pointer to an at instance in a "c_ptr" attribute. If the AT class -contains a method: - - def from_param( self ): - return ctypes.c_void_p( self.c_ptr ) - -The AT class can then be used as a Python type in the argtypes -attributes. To make it available in the prototyping as well you must -register the type: - - CWrapper.registerType( "at" , AT ) - -Observe that the AT class should NOT be used as restype, i.e. 
for the -return value: - - /-- Warning: ----------------------------------------------\ - | The mapping of types generally works quite well, however | - | observe that the return type of C functions which return | - | an instance pointer is NOT the current class, but | - | c_void_p, that is because this return value is | - | internalized in the c_ptr attribute of the object. | - \----------------------------------------------------------/ - - -3 The C code - -The C libraries are to a large extent developed with abstract types -like: - - typedef struct { - ..... - ..... - } abs_type; - -and "methods" like: - - abs_type * abs_type_alloc( ) { } - void abs_type_free( abs_type * at ) {} - int abs_type_get_an_int( const abs_type * at) {} - void abs_type_set_an_int( abs_type * at , int value) {} - -it has therefore been relatively easy to map this onto Python classes -and give a pythonic feel to the whole thing. As a ground rule each C -file implements one struct; this struct is wrapped in a Python module -with the same name and a Python class with CamelCaps naming: - - C Python - ecl_kw.c implements ecl_kw_type ecl_kw.py implements EclKW - - - - -4 Common structure of the wrapping - -The wrapping of the libraries (currently only libecl is anything close -to complete) follow roughly the same structure: - - 1. Load the shared library - - 2. For each C "class" you wish to wrap { - - 2.1. Prototype the C functions you wish to use by creating Python - function objects with correct restype and argtypes - attributes for the functions you are interested in, using - the CWrapper.prototype() function. - - 2.2. Create the Python class - based on the prototype functions - from 2.1 above. 
- - } - -The following excerpt from ecl_kw.py illustrates this: - - ------------------------------------------------------------ - # Load the shared library - import libecl <--- Step 1 - - class EclKW: <-------------+ - # Pure Python code to implement the EclKW class, | - # based on the functions prototyped below. | Step 3 - .... <-------------+ - - # Create a wrapper instance which wraps the libecl library. <--------+ - cwrapper = CWrapper( libecl.lib ) | - | - # Register the type mapping "ecl_kw" <-> EclKW | Step 2 - cwrapper.registerType( "ecl_kw" , EclKW ) | - | - # Prototype the functions needed to implement the Python class | - cfunc.alloc_new = cwrapper.prototype("long ecl_kw_alloc( char* , int , int )") - cfunc.free = cwrapper.prototype("void ecl_kw_free( ecl_kw )") | - .... <--------+ - ------------------------------------------------------------ - -These three steps are described in more detail in 4.1 - 4.3 below. - - -4.1 Loading the shared library - -For loading the shared library 'libecl.so' there is a module -'libecl.py' which calls the actual loading function, load() in -ert.cwrap.clib.py. Subsequently the library instance will be a 'lib' -attribute of the libecl module, so that subsequent functions which -need access to the function objects pointing to C functions, like -e.g. the prototype function in the CWrapper class, will use this -attribute. - -If you take a detailed look at some of the "import libxxx" statements -they might seem superfluous, in the sense that the active scope does -not reference the imported library; however, in many cases the "import -libxxx" statements are for side-effects only. - - -4.2 Prototyping - -As mentioned in section 2.1 we must "tell" the function pointers the -type of the return value and the input arguments. This is done by -setting the restype and argtypes attributes. 
For the getenv() function -which takes one (char *) input and returns a (char *) pointer this -would be: - - clib_handle.getenv.restype = ctypes.c_char_p - clib_handle.getenv.argtypes = [ ctypes.c_char_p ] - -In the ert python code this is achieved with the helper class CWrapper -implemented in the module ert.cwrap.cwrap: - - from ert.cwrap.cwrap import CWrapper - - lib = clib.load( "libc" ) - wrapper = CWrapper( lib ) - getenv = wrapper.prototype("char* getenv( char* )") - -The prototype() method of the CWrapper class is essentially one -massive regular expression which parses the "char* getenv( char* )" -string and assigns the correct restype and argtypes attributes. - - -All the calls to the prototype() method are typically assembled at the -bottom of the file implementing a class definition. E.g. the bottom of -the ecl_kw.py file looks like this: - - # 1: Load the shared library - cwrapper = CWrapper( libecl.lib ) - - # 2: Register type mapping ecl_kw <--> EclKW - cwrapper.registerType( "ecl_kw" , EclKW ) - - # 3: Prototype the C functions we will use from Python - cfunc = CWrapperNameSpace("ecl_kw") - cfunc.get_size = cwrapper.prototype("int ecl_kw_get_size( ecl_kw )") - cfunc.get_type = cwrapper.prototype("int ecl_kw_get_type( ecl_kw )") - cfunc.iget_char_ptr = cwrapper.prototype("char* ecl_kw_iget_char_ptr( ecl_kw , int )") - cfunc.iset_char_ptr = cwrapper.prototype("void ecl_kw_iset_char_ptr( ecl_kw , int , char*)") - cfunc.iget_bool = cwrapper.prototype("bool ecl_kw_iget_bool( ecl_kw , int)") - cfunc.iset_bool = cwrapper.prototype("bool ecl_kw_iset_bool( ecl_kw , int, bool)") - cfunc.alloc_new = cwrapper.prototype("c_void_p ecl_kw_alloc( char* , int , int )") - .... - .... - -This makes the functions cfunc.get_size, cfunc.get_type, -cfunc.iget_char_ptr, ... available as first-class Python functions, -and the Python classes can be implemented based on these -functions. 
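What such a prototype boils down to can be reproduced with plain ctypes against the standard C library; the following is a minimal, runnable sketch (the library lookup via ctypes.util.find_library and the ERT_DEMO_VAR variable name are illustrative, not part of the ert code, and this assumes a Unix-like system):

```python
import ctypes
import ctypes.util
import os

# Locate and load the standard C library; find_library("c") resolves the
# platform specific name (e.g. libc.so.6 on Linux).
libc = ctypes.CDLL(ctypes.util.find_library("c"))

# Hand-written equivalent of: cwrapper.prototype("char* getenv( char* )")
libc.getenv.restype = ctypes.c_char_p
libc.getenv.argtypes = [ctypes.c_char_p]

# Setting os.environ calls putenv(), so the C runtime sees the value.
os.environ["ERT_DEMO_VAR"] = "hello"
print(libc.getenv(b"ERT_DEMO_VAR"))
```

Without the restype assignment ctypes would have returned the raw pointer as an int; with it, the char* result comes back as a Python (byte) string, which is exactly the convenience the prototype() helper automates.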
Observe that the return value from "ecl_kw_alloc()" is -'c_void_p' and not 'ecl_kw' - that is because this return value is -internalized in the c_ptr of the object (this is not particularly -elegant, and could probably be improved upon..??). - -The prototyped functions are quite low-level, and they should __NOT__ -be used outside the file scope where they are defined. - - -4.3 Python classes - -All the main Python classes wrap a C based structure, and contain -roughly the same structure: - - * The underlying C structure is 'held' with the attribute c_ptr; - c_ptr is just a c_void_p instance which holds the pointer value of - the underlying C structure. The name c_ptr is just convention, and - could be anything. The type of c_ptr is 'c_void_p'. - - Mapping between a class instance and the C pointer held by this - instance is handled with the 'from_param' function: - - def from_param( self ): - return self.c_ptr - - The name 'from_param' is governed by ctypes; the from_param - method is called automatically by the ctypes runtime. - - * The __init__() function, or alternatively a classmethod - constructor, calls the C based constructor, - e.g. ecl_grid_alloc(), and internalizes the return value in the - attribute 'c_ptr'; alternatively the c_ptr might come as a shared - reference from another function. - - * The __del__ function is called when the Python object is garbage - collected; at that point the corresponding C free function should be - called, e.g. ecl_grid_free(). - - - - - -5 Garbage collection and 'data_owner' - -The Python language has garbage collection, and as a user of the ert -python code you should just assume that the garbage collection works -for the ert py types. However, if you want to wrap new types (or maybe -fix a bug ...) you should understand how the ert py classes -interact with garbage collection. 
- -When the Python interpreter decides that an object will be collected, -the __del__() method of that object is called, and any non-standard -cleanup code can be called here. In the case of ert py, this is where -the C based destructor can be called. The important point is that it -differs from case to case whether the C based destructor should -indeed be called. - -The EclKW class illustrates this quite well. Broadly speaking you can -instantiate an EclKW instance in two ways: - - * Through the EclKW.new() function - * As a reference to an existing ecl_kw through an EclFile instance. - -In the first case the c_ptr of the EclKW instance will point to fresh -storage dedicated to this EclKW, and when the EclKW instance goes out -of scope the memory should be freed with a call to the C function -ecl_kw_free(). - - EclKW (Python) ecl_kw (C) - c_ptr - | --------- - \--------------->| PORO | - | REAL | - | 10000 | - |-------| - | 0.15 | - | 0.19 | - | ... | - --------- - - -In the second case an EclFile instance has been created, and this -object points to an ecl_file C struct which is essentially a container -containing many ecl_kw instances. We can then query the EclFile -instance to get references to the various ecl_kw keywords, i.e. for a -small restart file: - - -EclFile (Python) ecl_file (C) EclKW (Python) - c_ptr c_ptr - | ------------ | - \---------------->| PRESSURE | | - | SGAS <---+------------------/ - | SWAT | - | .... | - ------------ - -The situation above (which is quite typical) could for instance come -about from: - - restart_file = ecl.EclFile( "ECLIPSE.X0057" ) - sgas_kw = restart_file.iget_named_kw( "SGAS" , 0 ) - -Now, when the sgas_kw object goes out of scope it is important that -the SGAS ecl_kw in the ecl_file file container is _NOT_ deleted; that -keyword is owned by the ecl_file container and the whole thing will -crash and burn if the ecl_kw is destroyed bypassing the container. 
The -sgas keyword will be destroyed when the restart_file object goes out -of scope at a later stage. - -To facilitate this difference in behaviour when an EclKW goes out of -scope the object contains a field 'data_owner', and the __del__() -method looks like this: - - def __del__( self ): - if self.data_owner: - cfunc.free( self ) - -I.e. the C based free() function is only called if the object is the -owner of the underlying C structure. This technique is implemented in -many of the objects. In the current code the data_owner field is set -alongside the c_ptr field, and not modified during the lifetime -of the object (i.e. the implementation does not support 'hostile -takeover' or 'orphaning' of objects). - - - -6 Installation in Equinor - -In Equinor the ert Python libraries are installed in the /project/res -hierarchy. diff --git a/python/txt-doc/import.txt b/python/txt-doc/import.txt deleted file mode 100644 index 677b017880..0000000000 --- a/python/txt-doc/import.txt +++ /dev/null @@ -1,231 +0,0 @@ -Python has a system for packages and sub packages which maps to the -filesystem; there are many different ways to import symbols into the -Python interpreter context. These things are exploited in the ert python -bindings; here we try to document the python approach to packages and -importing in general, and the conventions used in ert python in particular. - -Example module (file: example.py) --------------- - - def example_sum(a,b): - return a+b - - example_variable = 100 - - class ExampleClass: - def __init__(self , **kwargs): - .... - - def method( self , arg1 , arg2 ): - .... - -We will use this example module in the following. - - - - - -1. Modules and symbols ----------------------- - -A module is a python source file (or alternatively a shared library with -a prescribed structure). 
When importing a module all the symbols in the -module will become available in the calling scope, with the module name as -namespace prefix: - - import example - - var = example.example_variable - sum = example.example_sum( 100 , 100) - instance = example.ExampleClass() - -Observe that even though we have imported the 'example' module, we -still need to use the 'example' prefix when referring to the symbols -found in the example module. It is important to remember that the -module object is only a collection of symbols - you can not use the -module itself, only the symbols contained in the module can be put to -useful work. - -It is also possible to import the symbols from 'example' directly into -the current namespace, without creating the intermediate namespace -'example'; that is done using the 'from <module> import <symbol>' -statement: - - from example import example_variable,example_sum,ExampleClass - - var = example_variable - sum = example_sum - inst = ExampleClass() - -With this approach the symbols from 'example' are imported directly into -the current namespace, and the 'example' prefix should not be used when -referring to the 'example_variable','example_sum' and 'ExampleClass' -symbols. In the example above we have explicitly named the symbols we are -after; it is also possible to import all symbols indiscriminately: - - from example import * - - .... - .... - -Finally it is possible to rename a symbol during the -import: - - from example import example_variable as EXAMPLE_VARIABLE - - print "100 = %s??" % EXAMPLE_VARIABLE - -Here we have imported the example_variable symbol from the 'example' -module and renamed it to 'EXAMPLE_VARIABLE' in the calling scope. - - - -2. Packages ------------ - -It is convenient to group related modules together in a package -structure; a module is just a python source file. In the same manner a -package is just a directory - WITH THE MAGICAL FILE __init__.py. 
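The "a package is just a directory with the magic __init__.py file" rule can be demonstrated end-to-end by building a throwaway package on disk and importing the example module from it. This is an illustrative sketch (Python 3 syntax; the 'demo_pkg' name and the temporary directory are made up):

```python
import os
import sys
import tempfile

# Build a throwaway package: a directory plus the magic __init__.py.
root = tempfile.mkdtemp()
pkg = os.path.join(root, "demo_pkg")
os.mkdir(pkg)
open(os.path.join(pkg, "__init__.py"), "w").close()

# Put the example module from above inside the package.
with open(os.path.join(pkg, "example.py"), "w") as f:
    f.write("example_variable = 100\n")

# The directory *containing* the package must be importable
# (this is what PYTHONPATH manipulation achieves).
sys.path.insert(0, root)

from demo_pkg import example
print(example.example_variable)  # prints 100
```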
- -For instance for ert the top level package is just the 'ert' -directory. Looking on the filesystem for an ert installation it will -look something like this: - - ert/ <-- Top level package / directory. - ert/__init__.py <-- Magic file signalling that 'ert' is a package. - ert/ecl/ <-- ecl is a subpackage. - ert/ecl/__init__.py <-- Magic file signalling that 'ecl' is a package. - ert/ecl/ecl_grid.py <-- Normal module in the ecl package. - ert/ecl/ecl_sum.py <-- Normal module in the ecl package. - .... - ert/util/ <-- The util subpackage. - ert/util/__init__.py <-- Magic file signalling that 'util' is a package. - ert/util/stringlist.py <-- Normal module in the util package. - ert/util/tvector.py <-- Normal module in the util package. - -The important thing about packages is that they are normal filesystem -directories, with the magic __init__.py file which signals that this is -indeed a package. When importing a package the python code in the -corresponding __init__.py file will be loaded and imported; this can be -used as a means to perform package initialisation of various kinds. - -The statement: - - import ert - -works, but as long as ert is a package (i.e. only a directory) this is -not immediately very interesting[1]; we have to continue importing until -we have actually imported a module with real content: - - import ert.ecl - -Observe how the directories/subdirectories of the filesystem are -translated to a dotted notation in package/module space. The statement -'import ert.ecl' will import the ert.ecl subpackage, but this still -gives us no symbols usable for anything. Observe that the 'import ert.ecl' -statement will evaluate the '__init__.py' files found along the -way. ecl_grid.py is a module in the ecl package, so this statement will -finally give us something usable: - - import ert.ecl.ecl_grid - -The module ecl_grid contains the implementation of the EclGrid class -which can actually be used for something. 
To instantiate an EclGrid -instance we have to go through the following hoops: - - import ert.ecl.ecl_grid - .... - .... - grid = ert.ecl.ecl_grid.EclGrid( "ECLIPSE.EGRID" ) - | | | | - | | | | - Package--/ | | \-------- Symbol (Class definition) - | | - Subpackage ---/ \---- Module - -The 'ert.ecl.ecl_grid' namespace is quite massive. It is easy to -simplify this using the 'from <module> import <symbol>' statement: - - from ert.ecl.ecl_grid import EclGrid <-- Explicitly import symbol - .... into current namespace. - .... - grid = EclGrid( "ECLIPSE.EGRID" ) - -By convention the ert python distribution offers some limited -simplifications of this procedure. - - -3. Interaction with PYTHONPATH - -When issuing the import statement: - - import ert.ecl.ecl_grid - -The directory 'ert/' must be on PYTHONPATH. In principle it is possible -to circumvent the lowest level package, and go directly to the 'ecl' -subpackage: - - import ecl.ecl_grid - -but then the 'ecl/' directory must be on PYTHONPATH. For the default -ert-python distribution only the 'ert/' directory will be on the import -path, i.e. 'import ecl.ecl_grid' will not work directly. - - -4. Conventions applied in ert - -In the ert-python distribution some conventions have been adhered to; -they can of course be changed: - - 1. All the __init__.py files are empty, i.e. no automagic namespace - manipulations. It is tempting to fill up the __init__.py files - with further import statements in an attempt to shorten and - simplify the imports. Some of that was done in the initial - distributions of ert, but it is very easy to be too smart and - shoot oneself in the foot. - - 2. All subpackages have a module with the same name as the package, - i.e. the ecl package has a module ecl - this module imports all - the symbols from all the modules in the package with no namespace - prefix, i.e. the ecl module looks like: - - from ecl_grid import EclGrid - from ecl_region import EclRegion - from ecl_kw import EclKW - ..... 
- - The point of this is a minor simplification; you can then issue - e.g. - - from ert.ecl.ecl import * - .... - grid = EclGrid("ECLIPSE.EGRID") - - to get all the symbols from the ecl package directly into the - current namespace, or alternatively you can use - - import ert.ecl.ecl as ecl - .... - grid = ecl.EclGrid("ECLIPSE.EGRID") - - and then use all the symbols from the ecl package under the - common 'ecl' prefix. The functionality of the xxx module in the - xxx package could have been achieved with the __init__.py - but - then you would get it unconditionally. - - 3. The ert python distribution is organized with one subpackage per - library in the ERT C distribution. In each subpackage there is a - module with the same name as the library which this package - wraps, i.e. the ecl package, which wraps the libecl library, - contains a module libecl.py. This module again contains the - ctypes trickery to actually load the libecl.so shared library. - - -[1]: The big disclaimer here is that you can use the '__init__.py' to - perform additional actions when importing the 'ert' package. In - particular all symbols defined in the '__init__.py' module/file - will be part of the ert package's namespace. As described in the - chapter '4. Conventions applied in ert' no such trickery is applied - in the ert distribution; all the __init__.py files are empty. diff --git a/python/txt-doc/install.txt b/python/txt-doc/install.txt deleted file mode 100644 index c93fa2e78d..0000000000 --- a/python/txt-doc/install.txt +++ /dev/null @@ -1,88 +0,0 @@ -Installing the ert-python code -============================== -There is currently no functionality provided to install the ert python -code in a way which ensures that it automagically works; however the -required manual process is quite simple. 
Essentially you need to -install the python package ert/ in a location where the python -interpreter can find it, and the shared libraries must be installed so -that the runtime linker can find them. - -Installing the python modules ----------------------------- -The package directory ert/ and all its subdirectories must be -installed somewhere where your python interpreter will find it. That -is either in one of the default locations[1], or you can put it -in an arbitrary location and subsequently update your PYTHONPATH -environment variable to contain the directory where you put the ert/ -directory. - -[1]: To find out what your default locations are you can run the - following code in an interactive python shell: - - >>> import sys - >>> print sys.path - -All the python modules should be installed as world readable; for a -minor performance gain it is possible to bytecompile during the -install with a small python script based on the compile function in -the py_compile module. - -Installing the shared libraries ------------------------------- -The ctypes module is based on the dlopen() system call to open the shared -library libecl.so. The final dlopen() call is not passed any path information, i.e. the -runtime linker must be able to locate the shared libraries based solely on the -normal runtime linking conventions. This means that you have two alternatives -when installing the shared libraries: - - 1. Install all the libraries in one of the standard library locations - like /usr/local/lib. - - 2. Install the libraries in an arbitrary directory, and then - subsequently update the LD_LIBRARY_PATH variable to point to this - directory. - -The shared library you will need is 'libecl.so'. - -The test environment -------------------- -In the test/ directory there are two small shell scripts local_bash -and local_csh which will modify the current shell (i.e. local_xxx -should be sourced) so that the tests in the test/ directory can -be run on the source distribution. 
These two shell scripts illustrate -the necessary modifications to PYTHONPATH and LD_LIBRARY_PATH when -installing in non-standard locations. - -Problems -------- -Apart from normal bugs there are two possible problems when installing -ert-python: - - 1. The python interpreter can not find the ert package. In this case - the error will be something like this: - - >>> import ert - Traceback (most recent call last): - File "<stdin>", line 1, in <module> - ImportError: No module named ert - - The ert/ directory is not located in the right place, or the - PYTHONPATH variable has not been updated correctly. - - 2. The runtime linker can not find one of the shared libraries: - - import ert.ecl.ecl_grid - Traceback (most recent call last): - File "<stdin>", line 1, in <module> - File "ert/ecl/ecl_grid.py", line 30, in <module> - from ert.util.tvector import DoubleVector # Requires merging of typemaps .... - File "ert/util/tvector.py", line 46, in <module> - import libutil - File "ert/util/libutil.py", line 31, in <module> - clib.load("libg2c.so.0") - File "ert/cwrap/clib.py", line 49, in load - raise ImportError( error_msg ) - ImportError: Sorry - failed to load shared library:libg2c.so.0 - - Tried in: - You might need to update your LD_LIBRARY_PATH variable diff --git a/python/txt-doc/tips.txt b/python/txt-doc/tips.txt deleted file mode 100644 index bc01b31a7c..0000000000 --- a/python/txt-doc/tips.txt +++ /dev/null @@ -1,618 +0,0 @@ -Table of contents: - - 1. Static and shared libraries - 2. About gen_data / gen_param and gen_obs. - 3. Some tips for implementing an obs_script - 4. About the ERT filesystem - 5. Installing ERT software in Equinor - -********************************************************************** - -1. Static and shared libraries ------------------------------- - -The ert application is based on the internal libraries -libutil, libecl, librms, libsched, libconfig, libplot, libjob_queue and -libenkf. When creating a final ert executable this is done by linking -in static versions of all these libraries. 
The consistent use of -static libraries makes the system much more robust against updates, and -also makes it easier to have several versions of ERT installed side-by-side. - -The gcc linker will by default link with the shared version of a -library, so if both libXXX.a and libXXX.so are found the shared -version libXXX.so will be used. When linking ert the linker is told -where to locate the internal libraries, but no further -dynamic / --static options are given. Since only static versions of the internal -libraries can be found the resulting linking will be: - - * All the internal libraries are linked statically. - - * All standard libraries like libz, libpthread and liblapack are - linked dynamically. - -This has worked quite OK for a long time, but the advent of Python -bindings, both for the Python wrapper and for the gui, has increased -the complexity; the Python bindings require shared -libraries. Currently the shared libraries are just installed in a -slib/ subdirectory beside the lib/ directory, i.e. for e.g. libecl we -have: - - libecl/src/... - libecl/include/... - libecl/lib/libecl.a - libecl/slib/libecl.so - -The normal unix way is to have the shared and static libraries located -in the same place, but that will not work with the current ert link -procedure: - - * Just putting libXXX.so and libXXX.a in the same location without - any updates to the link routine will result in the linker using - the shared versions. - - * Passing the -static link option to gcc will result in a fully - static ert, i.e. also the standard libraries will be linked in - statically. - -Both of these solutions are unsatisfactory. Currently the shared -libraries are installed globally as: - - /project/res/x86_64_RH_X/lib/python/lib/libXXX.so - -This location is used both by the Python wrappers and the gui. ---- - -It is not entirely clear to me how to achieve the goals: - - * The main ert application links with the static version of the - internal libraries. 
- - * The shared and static versions of the internal libraries can - coexist in the same location. - -One solution might be to pass the library to link with explicitly to -the linker, i.e. instead of the normal link command: - - gcc -o exe object1.o object2.o -L/path/to/lib1 -L/path/to/lib2 -l1 -l2 - -where you tell gcc where to search and which libraries to use, you can -alternatively specify the library files fully on the link command like: - - gcc -o exe object1.o object2.o /path/to/lib1/lib1.a /path/to/lib2/lib2.a - -But how to tell SCons this? - -********************************************************************** - -2. About gen_data / gen_param and gen_obs. ------------------------------------------ - -The most general datatype in ert is the GEN_DATA type. ERT will just -treat this as a vector of numbers, with no structure assigned to -it. In the configuration file both GEN_DATA and GEN_PARAM can be used: - - GEN_PARAM: For parameters which are not changed by the forward - model, e.g. porosity and permeability. - - GEN_DATA: Data which is changed by the forward model, and therefore - must be loaded at the end of each timestep. The archetypal - example of a GEN_DATA instance would be seismic data. - -Internally in ERT everything is implemented as gen_data. The -flexibility of the gen_data implementation is a good thing, however -there are some significant disadvantages: - - * Since the gen_data_config object contains very limited meta-data - information it is difficult to catch user error. Typically what - happens is that: - - - The user error is not discovered until long into the - simulation, and when discovered possibly only as a - util_abort(). - - - User error is not discovered at all - the user just gets other - results than anticipated. - - * The implementation is quite complex, and "different" from the - other datatypes. 
This has led to numerous bugs in the past, and
   there are probably still bugs and inconsistencies buried in the
   gen_data implementation.

When configuring a gen_data instance you tell ERT which file to look
for when loading the results from the forward model. When the forward
model is complete and loading of results starts, the following
happens:

  1. The gen_data instance will look for the specified filename; if
     the file is found it will be loaded.

  2. If the file is not found, we will just assume size == 0 and
     continue. That the file is not found is perfectly OK.

  3. When the size of the gen_data instance has been verified,
     gen_data will call the function gen_data_config_assert_size(),
     which will assert that all ensemble members have the same size -
     if not, things will go belly up with util_abort().

Potential problems with this (the strict mapping between size and
report_step can be out of sync):

  1. If you have problems with your forward model and are "trying
     again", old files left lying around can create problems.

  2. If your forward model is a multi step model, where several steps
     have gen_data content, there will be a conflict.

Both of these problems can be reduced by using a gen_data result file
with an embedded %d format specifier; this will be replaced with the
report_step when looking for a result file.


The final complexity twist is the ability of the forward model to
signal that some datapoints are missing - for whatever reason. If for
instance the forward model should produce the file "DATA", it can
optionally also produce the file "DATA_active", which should be a
formatted file with 0 and 1 denoting inactive and active elements
respectively.
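A minimal sketch of the lookup rules above; the helper names here are
made up for illustration and are not the actual ERT functions:

```python
import os

def resolve_gen_data_file(template, report_step):
    """Resolve the result file for a report step.

    If the configured file name contains a %d format specifier it is
    replaced with the report step, which reduces the risk of picking
    up stale files from earlier runs or multi step forward models.
    """
    if "%d" in template:
        return template % report_step
    return template

def load_gen_data(template, report_step):
    """Mimic the loading rules above: a missing file is perfectly OK
    and simply means size == 0."""
    path = resolve_gen_data_file(template, report_step)
    if not os.path.exists(path):
        return []   # size == 0 - not an error
    with open(path) as f:
        return [float(line) for line in f if line.strip()]
```

The size consistency check across ensemble members
(gen_data_config_assert_size()) would then run after all members have
been loaded this way.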
Before the gen_data instance can be used in EnKF
updating, the active/inactive status must be the same for all ensemble
members; this is achieved by calling the function
gen_data_config_update_active(), which will collect active/inactive
statistics according to AND:

   active[index] = AND( active[index,iens=0] , active[index,iens=1] , ... )

The final active mask is stored with an enkf_fs_case_tstep() call, so
that it can be recovered for a later manual analysis step. This code
is from September/October 2010 - and it is rumored to contain a bug or
two; my first suspect is the save/restore functionality in the
functions gen_data_config_update_active() and
gen_data_config_load_active().

**********************************************************************

3. Some tips for implementing an obs_script
-------------------------------------------

There are two different configuration systems present in the ERT
code. libconfig/src/config.c implements the "config" system, whereas
the "conf" system is implemented in libconfig/src/conf.c. The "conf"
system is only used in the observation system, whereas the "config"
system is used for the main configuration file and also some other
small areas of the code. The existence of two different config systems
is of course a major embarrassment; it should all have been xml :-(

Since the observation file is implemented with the "conf" system, that
is what applies in this case.

Concrete tips:

  1. Modify the "enkf_conf_class" instance which is created in the
     enkf_obs_get_obs_conf_class() function to allow for two
     additional arguments, for instance OBS_SCRIPT and SCRIPT_ARG.
     It is probably also necessary to relax some of the constraints in
     the gen_obs_class definition??

     Observe that the "conf" system is strongly key=value oriented;
     that means it is difficult to set a list of arguments with one
     key. To solve this I suggest using quotes and a util function to
     split on " ".
     .....
     OBS_SCRIPT = /path/to/some/script/make_obs.py
     SCRIPT_ARG = "ARG1 ARG2 ARG3 ...ARG17"
     ....

  2. I suggest that the only "rule" for the script is that it should
     produce a stdout stream like:

        value1
        value2
        value3
        ....
        error1
        error2
        error3
        ....

     This can then easily be captured to a temporary file by setting
     the stdout redirection of the util_spawn() function, and
     subsequently the 100% normal way of creating a gen_obs instance
     can be used. Input arguments / input files / etc. for the
     OBS_SCRIPT should be given in the SCRIPT_ARG option - fully
     specified by the user.

**********************************************************************

4. About the ERT filesystem
---------------------------

The system for storing information in ert is quite large and
complex. The top level system is implemented in the file enkf_fs.c;
seen from ERT, everything should be accessible as enkf_fs_xxxx()
functions.

4.1 The different data types

The storage system in ERT operates with three different types of
data/keywords:

   static: These are static fields from the ECLIPSE restart files
      which are needed to be able to restart an ECLIPSE
      simulation. These keywords are only interesting for the ability
      to restart, and are never inspected by ERT itself. They
      correspond to the enkf_var_type (see enkf_types.h) value
      STATIC_STATE.

   parameter: These are the parameters which we are updating with
      EnKF, e.g. the permx field or a MULTFLT multiplier. They
      correspond to an enkf_var_type value of PARAMETER.

   dynamic: These represent data which are updated by the forward
      model; this covers both state data, e.g. the pressure
      (enkf_var_type == DYNAMIC_STATE), and result data, e.g. the
      watercut in a well (enkf_var_type == DYNAMIC_RESULT).
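The three keyword categories above can be summarised in a small
sketch; the names mirror the enkf_var_type values from enkf_types.h,
but the mapping function itself is purely illustrative:

```python
# Values mirroring the enkf_var_type categories described in 4.1
STATIC_STATE   = "STATIC_STATE"    # ECLIPSE restart keywords, restart only
PARAMETER      = "PARAMETER"       # e.g. a permx field, a MULTFLT multiplier
DYNAMIC_STATE  = "DYNAMIC_STATE"   # e.g. pressure, updated by the forward model
DYNAMIC_RESULT = "DYNAMIC_RESULT"  # e.g. well watercut WWCT:P-5

def storage_category(var_type):
    """Map an enkf_var_type value to the storage category of 4.1."""
    if var_type == STATIC_STATE:
        return "static"
    if var_type == PARAMETER:
        return "parameter"
    if var_type in (DYNAMIC_STATE, DYNAMIC_RESULT):
        return "dynamic"
    raise ValueError("unknown var_type: %s" % var_type)
```

As section 4.5 describes, each of these categories ends up with its
own private storage driver inside enkf_fs.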
4.2 The coordinates: (kw, iens, tstep, state)

To uniquely specify an enkf_node instance we need three/four
coordinates, and all the enkf_fs functions take these parameters as
input. The required coordinates are:

   kw: This is the string the keyword is given in the config file,
      i.e. for the pressure this is "PRESSURE" and for the watercut in
      well P-5 it is WWCT:P-5. Many of the enkf_fs functions take an
      enkf_node instance as argument, and then the kw is obviously
      read off directly from the node and not passed explicitly as a
      parameter.

   iens: This is just the member number we are after; observe that
      the counting starts with offset zero.

   tstep: This is the timestep we are after; this numbering follows
      the ECLIPSE report steps (ECLIPSE is unfortunately not just any
      FORWARD model, but still has severe influence on the structure
      of ERT :-( ).

Observe that the state "coordinate" is not treated as a first class
coordinate in the same manner as iens and tstep. Exactly how the state
coordinate is handled differs between the data types:

   static: Ignores the state flag completely.

   dynamic: For the dynamic data the enkf_fs layer will select either
      the dynamic_forecast or the dynamic_analyzed driver, depending
      on the value of state.

   parameter: For parameters, which do not have an intrinsic internal
      dynamics, the enkf_fs layer will use the enkf identity:

         Forecast( X(t) ) = Analyzed( X(t-1) )

      so if you ask for the analyzed at step 't' the enkf_fs layer
      will query the database for (iens , 't'), whereas if you ask for
      the forecast at tstep 't' the enkf_fs layer will go looking for
      (iens , 't - 1'). When it comes to parameters the enkf_fs layer
      will continue looking for (t , t-1 , t-2 , ... , 0), all the way
      back to the initially sampled values.


4.3 Reading and writing enkf_node state

The most important pieces of information to read and write for ERT are
the enkf_node instances, i.e.
the parameters we are sampling/updating
and the data we are loading from the forward model. Saving an
enkf_node goes roughly like this:

  1. The function enkf_node_store() is called. The enkf_node_store()
     function will use the store() function pointer and invoke one of
     the type specific store functions: field_store() /
     summary_store() / ... which does the actual work.

  2. The enkf_node_store() function gets a buffer_type instance
     (buffer_type is implemented in libutil/src/buffer.c) as input
     argument, and everything stored by enkf_node_store() and the
     store() function pointer should be "written" as bytes into this
     buffer - NOT directly to the filesystem in any way.

  3. When the enkf_node_store() function has returned, the enkf_fs
     layer will take the now full buffer_type instance and pass it on
     to the fs driver, which will actually store the buffer in
     whatever way it implements.

Loading an enkf_node is essentially the reverse process, with store
<-> load.


4.4 Reading and writing the index - kw_list

The ERT filesystem implements something called alternately the
"index" or the "kw_list". This is a quite crufty and inelegant
consequence of too much ECLIPSE focus. The topic is related to
storing/reassembling ECLIPSE restart information. The story goes about
like this:

  1. A forward model has completed, and ERT loads an ECLIPSE restart
     file. The ECLIPSE restart file might contain e.g. the keywords
     (example only):

        SEQNUM , INTEHEAD, DOUBHEAD, SGRP, PRESSURE, SWAT, SOMAX, RPORV

     These keywords come in three categories:

       a) Keywords we are updating with ERT, i.e. typically "PRESSURE"
          and "SWAT" in the example above.

       b) Keywords which are needed to perform a restart,
          e.g. INTEHEAD, DOUBHEAD and SGRP.

       c) Keywords which can be ignored, e.g. SOMAX and RPORV.

     ERT uses the function ecl_config_include_static_kw() to
     differentiate between cases b) and c).

  2.
For the static keywords which ERT determines it needs to store,
     i.e. case b) above, ERT will automatically create the
     corresponding enkf_node instances (if they do not already exist),
     and then subsequently store the results. I.e. for the case above
     we get the pseudo code:

        enkf_node_store( "PRESSURE" )
        enkf_node_store( "SWAT" )

        if (!has_node( "INTEHEAD"))
           create_node( "INTEHEAD" )
        enkf_node_store( "INTEHEAD" )

        if (!has_node( "DOUBHEAD"))
           create_node( "DOUBHEAD" )
        enkf_node_store( "DOUBHEAD" )

        if (!has_node( "SGRP"))
           create_node( "SGRP" )
        enkf_node_store( "SGRP" )

  3. When we want to create a corresponding restart file, we must
     reassemble the keywords:

        INTEHEAD, DOUBHEAD, SGRP, PRESSURE, SWAT

     in the right order.

Now - the point is that when we store the nodes in the database we
lose track of the ordering of the keywords - and then ECLIPSE goes
belly up. I.e. we must keep track of the original order of the
keywords; that is what the index/kw_list is used for.


4.5 The different driver types

The enkf_fs layer does not directly store nodes to disk; instead that
task is passed on to a driver. When the enkf_fs layer has completed
filling up a buffer instance with the data to store, that buffer is
passed on to the appropriate driver. The driver is a structure with
function pointers for the very basic low-level read/write operations,
plus whatever state is needed by the driver. The use of these
pluggable drivers for read and write operations certainly increases
the complexity quite a lot, but it also gives good flexibility. At the
moment someone could implement BDB storage (a good idea) or Amazon S3
storage (a bad idea) with only very limited modifications to enkf_fs.c
and essentially no changes whatsoever to the rest of the ERT code.

The drivers are included in the enkf_fs structure as:

   ....
   fs_driver_type       * dynamic_forecast;
   fs_driver_type       * dynamic_analyzed;
   fs_driver_type       * parameter;
   fs_driver_type       * eclipse_static;
   fs_driver_index_type * index;
   ....

I.e. all the different variable types of section 4.1 have their own
private driver. The index variable is used for storing the kw_list
(see section 4.4). fs_driver_type is an abstract type without any
implementation (I think!).


4.5.1 The plain driver

The first driver was the plain driver; this driver creates a deep
directory structure and stores every node in a separate file. This has
the following characteristics:

 * Surprisingly good write performance.
 * Catastrophic read performance.
 * The excessive number of small files (~ 10^7 for a large
   simulation) is *totally* unacceptable.

All in all the plain driver should not be used for serious work. The
plain driver was obviously bad design already when written, but as
long as the enkf_fs layer was written with abstract drivers, the
design of a more suitable fs driver could easily be postponed!


4.5.2 The block_fs driver

The block_fs driver is based on creating block_fs instances
(block_fs_type is implemented in libutil/src/block_fs.c). The block_fs
instances are binary files which are kept open for the duration of the
program; the block_fs system then has an api for reading and writing
(key,value) pairs to this file.

The block_fs_driver/block_fs combination is quite complex, but it has
not had any hiccups in about 1.5 years of extensive use in
Equinor. Observe that if you pull the plug on ERT you might lose some
of the data which has been stored with the block_fs driver, but partly
written and malformed data will be detected and discarded at the next
boot. You are therefore guaranteed (add as many quotes as you like to
the guarantee - but this has at least worked 100% up until now) that
no bogus data will be loaded after an unclean shutdown.
When shut down
cleanly the block_fs will create an index file which can be used for
faster startup next time; in the case of an unclean shutdown the index
must be rebuilt from the data file. In the case of large files this
can take some time (<= 30 seconds?).

If you have set the storage root to "Storage", the case "Prior" will
create the following directory structure with the block_fs driver:

   Storage/Prior/mod_0
                /mod_1
                /mod_2
                 ....
                /mod_31

Each of the mod_xx directories will contain block_fs mount/data/index
files for each of the drivers. The reason for using the mod_xxx
directories is mainly to increase multithreaded performance during
both read and write; in addition the resulting files do not get that
large (the 2GB limit and so on). The mod_xxx directories are used as
follows:

   Ensemble member iens is stored in directory: mod_(mod(iens,32))

I.e. ensemble members 0,32,64,96,... are stored in directory mod_0,
whereas ensemble members 7,39,71 and 103 are stored in mod_7.


The resulting files are quite opaque, and impossible to work with
using normal filesystem/editor/shell commands. In
libutil/applications/block_fs/ there are some utilities for working
with these files, which can be used for various forms of crash
recovery, problem inspection and so on.

In addition an sqlite based driver has been written; it worked OK, but
performance turned out to be quite poor.


4.6 Filesystem metadata

The metadata required to instantiate (i.e. "mount") an enkf_fs
filesystem is contained in the file "enkf_mount_info"; this is a
binary file with the following content:

   FS_MAGIC_ID            /* A magic number. */
   CURRENT_FS_VERSION     /* The version of the filesystem;
                             has been 104 since September 2009.
                             */
   -------
   / DRIVER_CATEGORY        /* Element from fs_driver_enum in fs_types.h. */
   | DRIVER_IMPLEMENTATION  /* Element from fs_driver_impl in fs_types.h. */
   \ Extra driver info      /* Whatever information is needed by the driver
   -------                     implementation given by DRIVER_IMPLEMENTATION. */

   CASES

The block [DRIVER_CATEGORY, DRIVER_IMPLEMENTATION, Extra ...] is
repeated five times to cover all the driver categories
DRIVER_PARAMETER, DRIVER_STATIC, DRIVER_DYNAMIC_FORECAST,
DRIVER_DYNAMIC_ANALYZED and DRIVER_INDEX.

Unfortunately there have been some hiccups with the enkf_mount_info
file in the past; it should probably have been an ASCII file. If there
are problems with it, it is "quite safe" to just delete the mount info
file and restart ERT. When ERT finds old data files (in the case of
block_fs) it will just load these and continue. However, if you do
this you must use the ERT menu option to recreate all your cases; ERT
will then find the existing data and continue (more or less) happily.


4.7 Reading and writing "other" files

The main part (i.e. more than 99%) of the enkf_fs implementation is
devoted to reading and writing enkf_node instances. However there are
typically some other situations during a simulation where it is
interesting to store files with per member or per timestep
information - this is available through the functions at the bottom of
enkf_fs.c. Not very important functions, but convenient enough.

**********************************************************************

5. Installing ERT software in Equinor
-------------------------------------

Installation of research software in Equinor follows the general
guideline:

  1. Log in to a computer in Trondheim with the correct version of
     RedHat installed; the files will be automagically distributed to
     the other locations within a couple of minutes.

  2. Copy the files you want to install into the
     /project/res/x86_64_RH_??/ directory.

  3.
Update the metadata on the files you copy:

       a) chgrp res
       b) chmod a+r,g+w
       c) For executables: chmod a+x

For simple programs like the ECLIPSE programs summary.x & convert.x
this can easily be achieved with the SDP_INSTALL target of the
SConstruct files:

   bash% scons SDP_INSTALL

Unfortunately it has been hard to get SCons to behave according to the
requirements listed above; for the more extensive installation
procedures there are therefore simple Python scripts "install.py"
which should be invoked after the SCons build is complete.


5.1 Installing ert - the tui

In the directory libenkf/applications/ert_tui there is an install.py
script. This script will do the following:

  1. Copy the "ert" binary found in the current directory to:

        /project/res/x86_64_RH_???/bin/ert_release/ert_<version>

     where <version> is the current svn version number. Observe that
     the script will refuse to install if the svn version number is
     not "pure", i.e. "3370" is okay, "3370M" or "3100:3370" is not
     OK.

  2. Update the permissions & ownership of the target file.

  3. Update the symlink:

        /project/res/x86_64_RH_??/bin/ert_latest_and_greatest

     to point to the newly installed file.


5.2 Installing python code (including gert)

The approach to installing Python code is the same for both the gui
(which is mainly Python) and for ERT Python. The installation scripts
are:

   python/ctypes/install.py
   libenkf/applications/ert_gui/bin/install.py

These scripts will (should??):

  1. Install the ERT shared libraries like libecl.so and so on to:

        /project/res/x86_64_RH_???/lib/python/lib

     These shared libraries are the same for both ERT Python and gert.

  2. Install the python files (with directories) to:

        /project/res/x86_64_RH_???/lib/python

     All Python files are byte compiled, producing the pyc files -
     observe that this might induce permission problems.

  3. Update the modes and ownership on all installed files.
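The copy-and-fix-metadata steps above can be sketched as a small
helper in the style of the install.py scripts; the group name "res" is
taken from the text, but the function itself is only an illustration,
not the actual install script:

```python
import os
import shutil
import stat

def install_file(src, target_dir, group="res", executable=False):
    """Copy src into target_dir and update the metadata as described
    above: chgrp res, read access for all, write access for the
    group, and execute permission if the file is a program."""
    dst = os.path.join(target_dir, os.path.basename(src))
    shutil.copy2(src, dst)

    # a+r, g+w (plus owner read/write)
    mode = (stat.S_IRUSR | stat.S_IWUSR |
            stat.S_IRGRP | stat.S_IWGRP |
            stat.S_IROTH)
    if executable:
        mode |= stat.S_IXUSR | stat.S_IXGRP | stat.S_IXOTH   # a+x
    os.chmod(dst, mode)

    try:
        shutil.chown(dst, group=group)   # requires membership in 'res'
    except (LookupError, PermissionError, OSError):
        pass   # group missing or insufficient rights - skip quietly
    return dst
```

A real install script would refuse to continue rather than skip the
chgrp silently, but for a sketch the intent should be clear.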
5.3 Installing the etc files

In the directory etc/ there is an install.py script. This script will:

  1. Copy the full content of the etc/ERT directory to
     /project/res/etc/ERT.

  2. Update the modes and ownership on all installed files.