mdf_reader package

Submodules

mdf_reader.mdf_blocks module

This module is part of the mdf_parser.

The mdf_blocks module contains classes for all required blocks of an MDF file.

Objects of the different mdf format blocks are instantiated by the MDFParser or by the classes within mdf_blocks itself.

Short description of the block classes

MDFBlock

Is the base class for all other classes

Provides methods for the interpretation of the mdf standard

Provides methods to manipulate strings

Methods in this class are common to all other classes

MDHFileHeader

Name in MDF: Identification block

Identification of the file as MDF file and MDF version

class mdf_reader.mdf_blocks.DataFormat[source]

Bases: object

Definition of format names according to MDF manual.

Notes

The coding is one-index-based and cyclic on 256, meaning that 256 = 1, 257 = 2,…
To obtain the correct name, do : DataFormat[ data_format_nr % 256 - 1]

bytes = {'BOOL': 2, 'BOOL16': 2, 'BOOL32': 4, 'BOOL64': 8, 'BOOL8': 1, 'CHAR': 1, 'DOUBLE': 8, 'FLOAT': 4, 'INT16': 2, 'LINK': 4, 'LONG': 4, 'LONG DOUBLE': 10, 'LONGLONG': 8, 'REAL': 6, 'REAL48': 6, 'SHORT': 2, 'UCHAR': 1, 'UINT16': 2, 'UINT32': 4, 'UINT64': 8, 'UINT8': 1, 'ULONG': 4, 'ULONGLONG': 8, 'USHORT': 2, 'text': None, 'ymdhms': 4}

name = 'LINK'

np_dtypes = {'BOOL': dtype('bool'), 'BOOL16': dtype('bool'), 'BOOL32': dtype('bool'), 'BOOL64': dtype('bool'), 'BOOL8': dtype('bool'), 'CHAR': dtype('int64'), 'DOUBLE': dtype('float64'), 'FLOAT': dtype('float64'), 'INT16': dtype('int16'), 'LINK': dtype('int64'), 'LONG': dtype('int64'), 'LONG DOUBLE': dtype('int64'), 'LONGLONG': dtype('int64'), 'REAL': dtype('float64'), 'REAL48': dtype('float64'), 'SHORT': dtype('int16'), 'UCHAR': dtype('int64'), 'UINT16': dtype('uint16'), 'UINT32': dtype('uint32'), 'UINT64': dtype('uint64'), 'UINT8': dtype('int64'), 'ULONG': dtype('uint64'), 'ULONGLONG': dtype('uint64'), 'USHORT': dtype('uint16'), 'text': dtype('int64'), 'ymdhms': dtype('uint64')}

class mdf_reader.mdf_blocks.DataSetRecord(file_pointer, verbose=1)[source]

Bases: MDFBlock

DataSetRecord contains all information specified by MDF for the Dataset Record

Parameters:

file_pointer (object) – Point to the file stream
verbose (int) – verbosity level

type

type of the data record

Type:: int

size

size of the data record

Type:: int

version

verbosity level

Type:: int

frame_offset

offset of the current frame

Type:: int

byte_to_ndarray(byte_array: object, frame_size: int, n_frames_to_read: int) → ndarray[source]

Turn a byte array into a numpy array

Parameters:

byte_array (object) – Binary array containing all the records
frame_size (int) – Size of a single frame
n_frames_to_read (int) – Number of records to read

Returns:

The numpy array with the converted data

Return type:

ndarray
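
As a rough illustration of what turning a byte array into frames can look like, the sketch below reshapes a flat buffer into one row per frame with NumPy. The helper name frames_to_array, the example bytes, and the little-endian 2-byte channel at the start of each frame are assumptions made for this example; the actual layout of a frame is defined by the MDF header and is not reproduced here.

```python
import numpy as np


def frames_to_array(byte_array, frame_size, n_frames_to_read):
    """Hypothetical helper: view a flat byte buffer as (n_frames, frame_size) bytes."""
    flat = np.frombuffer(
        byte_array, dtype=np.uint8, count=frame_size * n_frames_to_read
    )
    return flat.reshape(n_frames_to_read, frame_size)


# Twelve example bytes, interpreted as 3 frames of 4 bytes each.
raw = bytes(range(12))
frames = frames_to_array(raw, frame_size=4, n_frames_to_read=3)

# Assumption for illustration: the first 2 bytes of every frame hold a
# little-endian unsigned 16-bit value.
first_channel = frames[:, :2].copy().view("<u2").ravel()
```

Slicing a column range and reinterpreting it with view keeps the per-channel decoding vectorised instead of looping over frames.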

class mdf_reader.mdf_blocks.MDFBlock[source]

Bases: object

Base class to define a block in the mdf file

static pretty(string)[source]

Removes trailing zero characters from a string

Parameters:: string (str) – The string to clean
Returns:: string with the trailing zero characters removed
Return type:: str
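
A minimal sketch of this kind of clean-up, assuming that the zero characters are NUL ("\x00") padding at the end of a fixed-width field. The function name strip_trailing_nulls is hypothetical and is not the method documented above.

```python
def strip_trailing_nulls(text: str) -> str:
    """Drop NUL padding from the end of a fixed-width string field."""
    return text.rstrip("\x00")


# A channel name padded with NUL characters to a fixed width.
padded = "MRU_Roll\x00\x00\x00\x00"
cleaned = strip_trailing_nulls(padded)
```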

read_format(fp, data_type, number=1)[source]

Parameters:

fp (file object) – pointer to the current file
data_type (dtype) – Type of the data to read
number (int, optional) – Number of items to read. Default value = 1

Returns:

Unpacked data

Return type:

dtype
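
Reading a fixed number of typed items from a binary stream can be sketched with the standard struct module. Everything below is illustrative: the name read_values, the lookup by format name instead of dtype, the STRUCT_CODES table, and the little-endian byte order are assumptions, not the documented implementation.

```python
import io
import struct

# Hypothetical mapping from MDF format names to struct codes.
STRUCT_CODES = {"UINT16": "H", "LONG": "l", "DOUBLE": "d"}


def read_values(fp, data_type, number=1):
    """Read `number` items of `data_type` from the current position of fp."""
    fmt = "<" + str(number) + STRUCT_CODES[data_type]  # "<" = little-endian (assumed)
    raw = fp.read(struct.calcsize(fmt))
    values = struct.unpack(fmt, raw)
    # Return a bare value for a single item, a tuple otherwise.
    return values[0] if number == 1 else values


# Example stream: two UINT16 values followed by one LONG.
stream = io.BytesIO(struct.pack("<HHl", 2, 1, 72))
versions = read_values(stream, "UINT16", number=2)
header_size = read_values(stream, "LONG")
```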

read_string(fp, n_characters)[source]

Read a string consisting of n_characters starting at the current position of the file pointer

Parameters:

fp (IOStream) – file pointer to the current data
n_characters (int) – Number of characters to read

Returns:

The string read from the fp file pointer

Return type:

str

Notes

It is also ensured that the file pointer is positioned at the end of the n_characters after reading
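
The behaviour described in the note can be sketched as follows. This assumes one byte per character and an ASCII encoding; the name read_fixed_string is hypothetical.

```python
import io


def read_fixed_string(fp, n_characters, encoding="ascii"):
    """Read n_characters bytes and leave fp right after the field."""
    start = fp.tell()
    raw = fp.read(n_characters)
    # Reposition explicitly so the pointer ends up after the field
    # even if fewer bytes than requested were available.
    fp.seek(start + n_characters)
    return raw.decode(encoding, errors="replace")


# An 8-byte field with NUL padding, followed by unrelated data.
stream = io.BytesIO(b"ABCD\x00\x00\x00\x00XY")
text = read_fixed_string(stream, 8)
position = stream.tell()
```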

static unpack_byte_array(byte_array, data_type)[source]

Parameters:

byte_array (ndarray) – Array with the data to unpack
data_type (dtype) – Type of the data

Returns:

Unpacked data

Return type:

ndarray

class mdf_reader.mdf_blocks.MDHFileHeader(file_pointer)[source]

Bases: MDFBlock

MDHFileHeader contains all information specified by MDF for the ID BLOCK

Parameters:: file_pointer (IOStream) – reference to the MDF file

version

Version string, such as 2.1

Type:: str

version_minor

Minor version (1)

Type:: UINT16

version_major

Major version (2)

Type:: UINT16

status_record_position

Refers to status of type 10. Must be 0 if unused

Type:: LONG

created_by

Generated by: 0=User, 1=MLab

Type:: LONG

mdf_header_size

Size of the header (normally 72)

Type:: LONG

store_type

Storage method: 0=Multiplexed, 1=Block-wise (currently unused)

Type:: LONG

file_type

File type

Type:: LONG

frame_size

Size of the data frame in bytes

Type:: LONG

no_of_data_sets

Number of datasets in the file

Type:: LONG

day

Day of the date at which the MDF file was created

Type:: LONG

month

Month of the date at which the MDF file was created

Type:: LONG

year

Year of the date at which the MDF file was created

Type:: LONG

hour

Hour of the time at which the MDF file was created

Type:: LONG

minute

Minute of the time at which the MDF file was created

Type:: LONG

second

Second of the time at which the MDF file was created

Type:: LONG

mdf_reader.mdf_blocks.set_logging_level(logger, verbose=1)[source]

function to set the level of the logger

Parameters:

logger – handle to the logger
verbose – 0=silent, 1=info, 2=debug (Default value = 1)

Raises:

AssertionError – In case a non valid option is passed
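
One way to map such a verbosity value onto the standard logging levels is sketched below. The choice of logging.CRITICAL for the silent setting and the name apply_verbosity are assumptions; only the 0/1/2 meanings and the AssertionError come from the description above.

```python
import logging

# 0=silent (assumed here to mean CRITICAL only), 1=info, 2=debug
LEVELS = {0: logging.CRITICAL, 1: logging.INFO, 2: logging.DEBUG}


def apply_verbosity(logger, verbose=1):
    """Set the logger level from a 0/1/2 verbosity value."""
    assert verbose in LEVELS, f"verbose must be one of {sorted(LEVELS)}"
    logger.setLevel(LEVELS[verbose])


logger = logging.getLogger("mdf_sketch")
apply_verbosity(logger, verbose=2)
```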

mdf_reader.mdf_parser module

A module for reading microlab MDF files. Usage:

import mdf_reader.mdf_parser as mdf

mdf_object = mdf.MDFParser(file_name)

Author: Eelco van Vliet 29-2-2015

class mdf_reader.mdf_parser.MDFParser(mdf_file, import_data=True, include_columns=None, exclude_columns=None, verbose=1, convert_datetime=True, resample_data=False, constant_sample_rate=True, replace_record_names={}, log_level=30, date_time_label='DateTime', date_time_match_string='^_DateTime32$', load_date_time=True, set_relative_time_column=False, include_date_time=False)[source]

Bases: object

The MDFParser class contains methods for reading mdf files.

Parameters:

mdf_file (str) – Path to a binary file that follows the MDF 3.3 specification
import_data (bool , optional) – Flag to enable to import the data, default = True. If False, only the header information is read.
include_columns (list) – List with columns to import. Default value = []. If empty, all columns are included
exclude_columns (list) – List with columns to exclude. Default value = []. If empty, none are excluded
verbose (int) – Set the logging level. Obsolete. 0=silent, 1=normal info, 2=debugging
convert_datetime (int, optional) – Translate the ymdhms integer into a date time string
resample_data (bool, optional, False) – The sampled data is not completely uniformly sampled. To enforce an equidistant sampling, set this flag to true
constant_sample_rate (bool, optional) – If true, use the sample rate for the clock, otherwise, the ymdhms is leading. Defaults to True
replace_record_names (dict, optional) – A dictionary with record names to replace, mapping each old name (A1) to its new name (B1)
date_time_label (str, optional) – Default label to assign to the Date time string. Default = “DateTime”
date_time_match_string (str, optional) – The date time column is selected based on this match string. Default = “^_DateTime32$”
load_date_time (bool, optional) – Always read the date time information channel, even if it is not explicitly mentioned in the filter list. Defaults to True
set_relative_time_column (bool, optional) – If true, create a column time_r in seconds with the relative time starting at t=0 s. Defaults to False

Examples

Reading an MDF file is done by creating an MDFParser object with a file_name as first argument.

>>> file_name = "../data/AMS_BALDER_110225T233000_UTC222959.mdf"
>>> header_object = MDFParser(mdf_file=file_name, import_data=False)
>>> names = header_object.make_report()

If the import_data flag had been set to True, all MDF data would have been loaded into the data frame header_object.data. In this example, however, we read only the header information of the MDF file first. As a next step, we can select the data columns we want to import. This can reduce the reading time of an MDF data file significantly, as only the selected data needs to be imported. The data available in the MDF file can be explored with the make_report() method, which writes all channels to the screen. Now we select the MRU_Roll data first.

>>> from tabulate import tabulate
>>> names_labels_and_groups = header_object.set_column_selection(
...     filter_list=["MRU_Roll"], include_date_time=True)
>>> header_object.import_data()
>>> print(tabulate(header_object.data.head(5), headers="keys", tablefmt="psql"))
+----------------------------+------------+
| DateTime                   |   MRU_Roll |
|----------------------------+------------|
| 2011-02-25 23:30:00        |    0.01207 |
| 2011-02-25 23:30:00.040000 |    0.01207 |
| 2011-02-25 23:30:00.080000 |    0.01207 |
| 2011-02-25 23:30:00.120000 |    0.01204 |
| 2011-02-25 23:30:00.160000 |    0.01204 |
+----------------------------+------------+

The names_labels_and_groups now contains 3 lists, but we don’t use it now. For more information about the return values, look at the docstring of the set_column_selection method.

Because we have set the include_date_time flag, the DateTime column is read as well and set as the index of the DataFrame. You can repeat this step to add more columns. The include_date_time flag does not have to be given again, as the DateTime column has already been imported. Next, we import the MRU roll, pitch, and heave channels as well. We do this with a regular expression matching all the channel names starting with MRU_R, MRU_P, or MRU_H.

>>> names_labels_and_groups = header_object.set_column_selection(
...    filter_list=["MRU_[RPH]"])
>>> header_object.import_data()
>>> print(tabulate(header_object.data.head(5), headers="keys", tablefmt="psql"))
+----------------------------+------------+-------------+-------------+
| DateTime                   |   MRU_Roll |   MRU_Heave |   MRU_Pitch |
|----------------------------+------------+-------------+-------------|
| 2011-02-25 23:30:00        |    0.01207 |     -0.1051 |  -0.0001869 |
| 2011-02-25 23:30:00.040000 |    0.01207 |     -0.1051 |  -0.0001869 |
| 2011-02-25 23:30:00.080000 |    0.01207 |     -0.1051 |  -0.0001869 |
| 2011-02-25 23:30:00.120000 |    0.01204 |     -0.1078 |  -0.0002593 |
| 2011-02-25 23:30:00.160000 |    0.01204 |     -0.1078 |  -0.0002593 |
+----------------------------+------------+-------------+-------------+

Since all data is stored in the pandas DataFrame header_object.data, we can plot the data using all pandas/matplotlib plotting capabilities. This is demonstrated in the example notebook.

import_data(set_relative_time_column=None)[source]

Import the binary data from the dta file

Parameters:: set_relative_time_column (bool or None, optional) – If true, store the relative time in the time_r column. Default is None, which means that the value as stored during initialization of the class is taken. This is False by default, but can also be passed through the constructor arguments.

import_header(mdf_file)[source]

Read the header data from the mdf file

Parameters:: mdf_file – the name of the mdf header file
Returns:: None
Return type:: None

make_report(show_loaded_data_only=False)[source]

Make a report of the records available in the mdf file

Parameters:: show_loaded_data_only (bool, optional) – If True, only show the data columns that have been loaded. Default = False, which means that all channels are shown
Returns:: List of the reported columns. We can use this list to obtain the channel name by the index
Return type:: list

set_column_selection(filter_list, set_on_exclude_list=False, include_date_time=None)[source]

Select the data to import based on a list of regular expressions

Parameters:

filter_list (list) – A list of regular expressions in which the first filter is always applied to the name field and the next filters are all applied to the label field of the record.
set_on_exclude_list (bool, optional) – By default, the selected columns are added to the include list. If this value is true, set the selection on the excluded list. Defaults to False
include_date_time (bool, optional) – Include the date time field by default (without specification in the filter_list). Handy for the examples as you don’t have to specify the DateTime explicitly. Defaults to None, implying that the setting is taken from the constructor and is set to False.

Returns:

The selected names and labels, along with the list of selected groups

Return type:

tuple (name_list, label_list, group_list)

Notes

The data reader allows passing a list of exclude_columns and include_columns by which you can select which columns are actually read. With this routine, these lists can be created from regular expression filters.
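
The idea of building a column list from a regular expression can be sketched with the standard re module. The function select_names, the use of re.match (anchored at the start of the name), and the channel names MRU_Yaw and GPS_Lat are assumptions made for this example; the real method also filters on labels and returns groups, which is not shown here.

```python
import re


def select_names(channel_names, pattern):
    """Return the channel names that match pattern from their first character."""
    regex = re.compile(pattern)
    return [name for name in channel_names if regex.match(name)]


# Example channel list; the last two names are hypothetical.
channels = ["DateTime", "MRU_Roll", "MRU_Pitch", "MRU_Heave", "MRU_Yaw", "GPS_Lat"]
selected = select_names(channels, "MRU_[RPH]")
```

A list produced this way could then be passed on as an include or exclude list.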

mdf_reader.mdf_parser.convert_ymdhms_to_data_time(ymdhrs_array, sample_rate=1, constant_sample_rate=True)[source]

Convert the binary year month day hour minutes seconds representation into a readable date/time string

Parameters:

ymdhrs_array (binary) – array with the ymdhrs datetime integers
sample_rate (float, optional) – the sampling rate of the signal (Default value = 1)
constant_sample_rate (bool, optional) – If True, assume that the sample rate is leading (Default value = True)

Returns:

DateTime pandas index array

Return type:

type

Notes

In the first version of the script, the ymdhrs was taken as leading and the number of samples per second was corrected to take care of missing or surplus samples in a second. It appears that the sample rate is really constant and that the clock time may vary. Setting this flag to True makes the sample rate leading.
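
When the sample rate is leading, time stamps follow from the start time and the sample index alone. The sketch below illustrates that idea with the standard library only; the function constant_rate_times is hypothetical, and the rate of 25 samples per second is inferred from the 0.04 s spacing in the example tables rather than stated by the source.

```python
from datetime import datetime, timedelta


def constant_rate_times(start, n_samples, sample_rate):
    """Build equidistant time stamps from a start time and a sample rate in Hz."""
    step_us = 1_000_000 / sample_rate  # spacing between samples in microseconds
    return [
        start + timedelta(microseconds=round(i * step_us))
        for i in range(n_samples)
    ]


start = datetime(2011, 2, 25, 23, 30, 0)
stamps = constant_rate_times(start, n_samples=5, sample_rate=25)
```

With this approach a jittery clock value in the file does not affect the spacing of the resulting index.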

mdf_reader.mdf_parser.decode_ymdhms(ymdhms)[source]

The year, month, day, hour, minute, and seconds are stored in a 4-byte integer

Parameters:: ymdhms (int) – A 4-byte integer containing the date time according to the MDF manual
Returns:: The ISO date time string
Return type:: type

mdf_reader.mdf_parser.main(args)[source]

The main routine for testing purposes

Parameters:: args (list) – Command line arguments

mdf_reader.mdf_parser.parse_args(args)[source]

Parse command line parameters

Parameters:: args (list) – Command line parameters as a list of strings
Returns:: command line parameters
Return type:: argparse.Namespace

mdf_reader.mdf_parser.run()[source]
