mdf_reader package
Submodules
mdf_reader.mdf_blocks module
This module is part of the mdf_parser.
The mdf_blocks module contains classes for all required blocks of an MDF file.
Objects of the different mdf format blocks are instantiated by the MDFParser or by the classes within mdf_blocks itself.
Short description of the block classes
MDFBlock
Is the base class for all other classes
Provides methods for the interpretation of the mdf standard
Provides methods to manipulate strings
Methods in this class are common to all other classes
MDHFileHeader
Name in MDF: Identification block
Identification of the file as MDF file and MDF version
- class mdf_reader.mdf_blocks.DataFormat
Bases: object
Definition of format names according to MDF manual.
Notes
The coding is one-index-based and cyclic on 256, meaning that 257 = 1, 258 = 2,…
To obtain the correct name, use: DataFormat[data_format_nr % 256 - 1]
- bytes = {'BOOL': 2, 'BOOL16': 2, 'BOOL32': 4, 'BOOL64': 8, 'BOOL8': 1, 'CHAR': 1, 'DOUBLE': 8, 'FLOAT': 4, 'INT16': 2, 'LINK': 4, 'LONG': 4, 'LONG DOUBLE': 10, 'LONGLONG': 8, 'REAL': 6, 'REAL48': 6, 'SHORT': 2, 'UCHAR': 1, 'UINT16': 2, 'UINT32': 4, 'UINT64': 8, 'UINT8': 1, 'ULONG': 4, 'ULONGLONG': 8, 'USHORT': 2, 'text': None, 'ymdhms': 4}
- name = 'LINK'
- np_dtypes = {'BOOL': dtype('bool'), 'BOOL16': dtype('bool'), 'BOOL32': dtype('bool'), 'BOOL64': dtype('bool'), 'BOOL8': dtype('bool'), 'CHAR': dtype('int64'), 'DOUBLE': dtype('float64'), 'FLOAT': dtype('float64'), 'INT16': dtype('int16'), 'LINK': dtype('int64'), 'LONG': dtype('int64'), 'LONG DOUBLE': dtype('int64'), 'LONGLONG': dtype('int64'), 'REAL': dtype('float64'), 'REAL48': dtype('float64'), 'SHORT': dtype('int16'), 'UCHAR': dtype('int64'), 'UINT16': dtype('uint16'), 'UINT32': dtype('uint32'), 'UINT64': dtype('uint64'), 'UINT8': dtype('int64'), 'ULONG': dtype('uint64'), 'ULONGLONG': dtype('uint64'), 'USHORT': dtype('uint16'), 'text': dtype('int64'), 'ymdhms': dtype('uint64')}
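The index formula and the byte widths above can be tried out in isolation. The following is a minimal sketch, not the library's implementation: format_index simply evaluates the formula from the Notes, BYTES is a small subset copied from the bytes table, and record_size is a hypothetical helper introduced only for illustration.

```python
# Subset of the byte widths listed in the DataFormat.bytes table.
BYTES = {"DOUBLE": 8, "FLOAT": 4, "INT16": 2, "LONG": 4, "UINT8": 1}


def format_index(data_format_nr):
    """Zero-based index for a one-based format number (formula from the Notes)."""
    return data_format_nr % 256 - 1


def record_size(format_names):
    """Total byte width of a record built from the given formats (hypothetical helper)."""
    return sum(BYTES[name] for name in format_names)


# Format numbers that differ by 256 map to the same index.
print(format_index(1), format_index(257))
# Width in bytes of an illustrative record layout.
print(record_size(["DOUBLE", "FLOAT", "INT16"]))
```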
- class mdf_reader.mdf_blocks.DataSetRecord(file_pointer, verbose=1)
Bases: MDFBlock
DataSetRecord contains all information specified by MDF for the Dataset Record
- class mdf_reader.mdf_blocks.MDFBlock
Bases: object
Base class to define a block in the mdf file
- read_format(fp, data_type, number=1)
- Parameters:
fp (file object) – pointer to the current file
data_type (dtype) – Type of the data to read
number (int, optional) – Number of items to read. Default value = 1
- Returns:
Unpacked data
- Return type:
dtype
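Reading typed values from a binary stream can be sketched with the standard struct module. This is an illustration under stated assumptions, not the actual read_format: the STRUCT_CODES mapping, the little-endian byte order, and the name read_format_sketch are all hypothetical; only the byte widths agree with the DataFormat.bytes table.

```python
import io
import struct

# Hypothetical mapping from MDF format names to struct codes (assumption).
STRUCT_CODES = {"SHORT": "h", "UINT16": "H", "LONG": "i", "FLOAT": "f", "DOUBLE": "d"}


def read_format_sketch(fp, data_type, number=1, byte_order="<"):
    """Read `number` items of `data_type` from fp and return them as a tuple."""
    # Little-endian without padding is an assumption, not taken from the MDF manual.
    fmt = f"{byte_order}{number}{STRUCT_CODES[data_type]}"
    raw = fp.read(struct.calcsize(fmt))
    return struct.unpack(fmt, raw)


# Build a small in-memory stream: two UINT16 values followed by one LONG.
stream = io.BytesIO(struct.pack("<2Hi", 1, 2, 72))
version = read_format_sketch(stream, "UINT16", number=2)
size = read_format_sketch(stream, "LONG")
print(version, size)
```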
- read_string(fp, n_characters)
Read a string consisting of n_characters starting at the current position of the file pointer
- Parameters:
fp (IOStream) – file pointer to the current data
n_characters (int) – Number of characters to read
- Returns:
The string read from the fp file pointer
- Return type:
str
Notes
After reading, the file pointer is guaranteed to be positioned at the end of the n_characters.
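The behaviour described for read_string can be sketched as follows. This is a hedged illustration, not the library's code: the function name read_string_sketch, the ASCII encoding, and the treatment of NUL bytes as padding are assumptions.

```python
import io


def read_string_sketch(fp, n_characters, encoding="ascii"):
    """Read a fixed-width string and leave fp right after the field."""
    start = fp.tell()
    raw = fp.read(n_characters)
    # Keep only the part before the first NUL byte (assumed padding).
    text = raw.split(b"\x00", 1)[0].decode(encoding, errors="replace")
    # Explicitly position the pointer at the end of the field.
    fp.seek(start + n_characters)
    return text.strip()


stream = io.BytesIO(b"MRU_Roll\x00\x00\x00\x00rest")
name = read_string_sketch(stream, 12)
print(name, stream.tell())
```

The explicit seek keeps the pointer position independent of how many bytes the read call actually returned.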
- class mdf_reader.mdf_blocks.MDHFileHeader(file_pointer)
Bases: MDFBlock
MDHFileHeader contains all information specified by MDF for the ID BLOCK
- Parameters:
mdf_stream (IOStream) – reference to mdf file
- version_minor
Minor version (1)
- Type:
UINT16
- version_major
Major version (2)
- Type:
UINT16
- status_record_position
Refers to status of type 10. Must be 0 if unused
- Type:
LONG
- created_by
Generated by: 0=User, 1=MLab
- Type:
LONG
- mdf_header_size
Size of the header (normally 72)
- Type:
LONG
- store_type
Storage method: 0 = multiplexed, 1 = block-wise (currently unused)
- Type:
LONG
- file_type
File type
- Type:
LONG
- frame_size
Size of the data frame in bytes
- Type:
LONG
- no_of_data_sets
Number of datasets in the file
- Type:
LONG
- day
Day of the date at which the MDF file was created
- Type:
LONG
- month
Month of the date at which the MDF file was created
- Type:
LONG
- year
Year of the date at which the MDF file was created
- Type:
LONG
- hour
Hour of the time at which the MDF file was created
- Type:
LONG
- minute
Minute of the time at which the MDF file was created
- Type:
LONG
- second
Second of the time at which the MDF file was created
- Type:
LONG
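The date and time attributes of the header can be combined into a single timestamp. The sketch below uses a SimpleNamespace as a stand-in for an MDHFileHeader instance with illustrative values; it assumes the year is stored as a full four-digit number, which the documentation above does not state.

```python
from datetime import datetime
from types import SimpleNamespace

# Stand-in for a parsed header; the values are illustrative only.
header = SimpleNamespace(year=2011, month=2, day=25, hour=23, minute=30, second=0)


def creation_time(h):
    """Combine the separate header fields into one datetime object."""
    return datetime(h.year, h.month, h.day, h.hour, h.minute, h.second)


created = creation_time(header)
print(created.isoformat())
```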
- mdf_reader.mdf_blocks.set_logging_level(logger, verbose=1)
Set the level of the logger
- Parameters:
logger – handle to the logger
verbose – 0=silent, 1=info, 2=debug (Default value = 1)
- Raises:
AssertionError – If an invalid option is passed
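A mapping from the verbose value to a logging level could look like the sketch below. It is an assumption-laden illustration rather than the library's code: in particular, mapping 0 (silent) to logging.WARNING is a guess, and set_logging_level_sketch is a hypothetical name.

```python
import logging

# Assumed mapping; the level used for "silent" is a guess.
LEVELS = {0: logging.WARNING, 1: logging.INFO, 2: logging.DEBUG}


def set_logging_level_sketch(logger, verbose=1):
    """Set the logger level from a verbose value of 0, 1 or 2."""
    assert verbose in LEVELS, "verbose must be 0, 1 or 2"
    logger.setLevel(LEVELS[verbose])


log = logging.getLogger("mdf_sketch")
set_logging_level_sketch(log, verbose=2)
print(logging.getLevelName(log.level))
```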
mdf_reader.mdf_parser module
A module for reading microlab MDF files. Usage:
import mdf_reader.mdf_parser as mdf
mdf_object = mdf.MDFParser(file_name)
Author: Eelco van Vliet 29-2-2015
- class mdf_reader.mdf_parser.MDFParser(mdf_file, import_data=True, include_columns=None, exclude_columns=None, verbose=1, convert_datetime=True, resample_data=False, constant_sample_rate=True, replace_record_names={}, log_level=30, date_time_label='DateTime', date_time_match_string='^_DateTime32$', load_date_time=True, set_relative_time_column=False, include_date_time=False)
Bases: object
The MDFParser class contains methods for reading mdf files.
- Parameters:
mdf_file (str) – Path to a binary file that follows the MDF 3.3 specification
import_data (bool , optional) – Flag to enable to import the data, default = True. If False, only the header information is read.
include_columns (list) – List with columns to import. Default value = []. If empty, all columns are included
exclude_columns (list) – List with columns to exclude. Default value = []. If empty, none are excluded
verbose (int) – Set the logging level. Obsolete. 0 = silent, 1 = normal info, 2 = debugging
convert_datetime (int, optional) – Translate the ymdhms integer into a date time string
resample_data (bool, optional, False) – The sampled data is not completely uniformly sampled. To enforce an equidistant sampling, set this flag to true
constant_sample_rate (bool, optional) – If true, use the sample rate for the clock, otherwise, the ymdhms is leading. Defaults to True
replace_record_names (dict, optional) – A dictionary mapping record names to their replacements, e.g. to rename A1 to B1
date_time_label (str, optional) – Default label to assign to the Date time string. Default = “DateTime”
date_time_match_string (str, optional) – The date time column is selected based on this match string. Default = “^_DateTime32$”
load_date_time (bool, optional) – Always read the date time information channel, even if it is not explicitly mentioned in the filter list. Defaults to True
set_relative_time_column (bool, optional) – If true, create a column time_r in seconds with the relative time starting at t=0 s. Defaults to False
Examples
Reading an MDF file is done by creating an MDFParser object with a file_name as first argument.
>>> file_name = "../data/AMS_BALDER_110225T233000_UTC222959.mdf"
>>> header_object = MDFParser(mdf_file=file_name, import_data=False)
>>> names = header_object.make_report()
If the import_data flag had been set to True, all MDF data would have been read into the data frame header_object.data when the object was created. In this example, however, we read only the header information of the MDF file first. As a next step, we select the data columns we want to import. This can reduce the reading time of an MDF data file significantly, as only the selected data needs to be imported. The data available in the MDF file can be explored with the make_report() method, which writes all channels to screen. Now we select the MRU_Roll data first.
>>> from tabulate import tabulate
>>> names_labels_and_groups = header_object.set_column_selection(
...     filter_list=["MRU_Roll"], include_date_time=True)
>>> header_object.import_data()
>>> print(tabulate(header_object.data.head(5), headers="keys", tablefmt="psql"))
+----------------------------+------------+
| DateTime                   |   MRU_Roll |
|----------------------------+------------|
| 2011-02-25 23:30:00        |    0.01207 |
| 2011-02-25 23:30:00.040000 |    0.01207 |
| 2011-02-25 23:30:00.080000 |    0.01207 |
| 2011-02-25 23:30:00.120000 |    0.01204 |
| 2011-02-25 23:30:00.160000 |    0.01204 |
+----------------------------+------------+
The names_labels_and_groups now contains 3 lists, but we don’t use it now. For more information about the return values, look at the docstring of the set_column_selection method.
Because we have set the include_date_time flag, the DateTime column is read as well and set as the index of the DataFrame. You can repeat this step to add more columns. The include_date_time flag does not have to be given again, as the DateTime has already been imported. So let’s import the MRU roll, pitch and heave channels as well. We do this with a regular expression matching all channel names starting with MRU_R, MRU_P, or MRU_H.
>>> names_labels_and_groups = header_object.set_column_selection(
...     filter_list=["MRU_[RPH]"])
>>> header_object.import_data()
>>> print(tabulate(header_object.data.head(5), headers="keys", tablefmt="psql"))
+----------------------------+------------+-------------+-------------+
| DateTime                   |   MRU_Roll |   MRU_Heave |   MRU_Pitch |
|----------------------------+------------+-------------+-------------|
| 2011-02-25 23:30:00        |    0.01207 |     -0.1051 |  -0.0001869 |
| 2011-02-25 23:30:00.040000 |    0.01207 |     -0.1051 |  -0.0001869 |
| 2011-02-25 23:30:00.080000 |    0.01207 |     -0.1051 |  -0.0001869 |
| 2011-02-25 23:30:00.120000 |    0.01204 |     -0.1078 |  -0.0002593 |
| 2011-02-25 23:30:00.160000 |    0.01204 |     -0.1078 |  -0.0002593 |
+----------------------------+------------+-------------+-------------+
Since all data is stored in the pandas DataFrame header_object.data, we can plot the data using all pandas/matplotlib plotting capabilities. This is demonstrated in the example notebook.
- import_data(set_relative_time_column=None)
Import the binary data from the dta file
- Parameters:
set_relative_time_column (bool or None, optional) – If true, store the relative time in the time_r column. Default is None, which means that the value as stored during initialization of the class is taken. This is False by default, but can also be passed through the constructor arguments.
- import_header(mdf_file)
Read the header data from the mdf file
- Parameters:
mdf_file – the name of the mdf header file
- Returns:
nothing
- Return type:
- make_report(show_loaded_data_only=False)
Make a report of the records available in the mdf file
- set_column_selection(filter_list, set_on_exclude_list=False, include_date_time=None)
Select the data to import based on a list of regular expressions
- Parameters:
filter_list (list) – A list of regular expressions, in which the first filter is always applied to the name field and all following filters are applied to the label field of the record.
set_on_exclude_list (bool, optional) – By default, the selected columns are added to the include list. If this value is true, set the selection on the excluded list. Defaults to False
include_date_time (bool, optional) – Include the date time field by default (without specification in the filter_list). Handy for the examples as you don’t have to specify the DateTime explicitly. Defaults to None, in which case the setting is taken from the constructor, where it is False by default.
- Returns:
Selection of name columns along with lists of the corresponding label and group selection
- Return type:
tuple (name_list, label_list, group_list)
Notes
The data reader allows passing a list of exclude_columns and include_columns by which you can select which columns are actually read. With this routine, these lists can be created from a regular expression filter.
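The regular expression selection on the name field can be illustrated with the standard re module. This sketch is not the library's implementation: select_columns is a hypothetical helper, the channel list is invented for the example (MRU_Yaw and GPS_Lat do not come from the MDF file above), and anchoring the pattern at the start of the name with re.match is an assumption.

```python
import re


def select_columns(names, pattern):
    """Return the names that match the regular expression from their start."""
    regex = re.compile(pattern)
    return [name for name in names if regex.match(name)]


# Invented channel names for illustration.
channels = ["MRU_Roll", "MRU_Pitch", "MRU_Heave", "MRU_Yaw", "GPS_Lat"]
selected = select_columns(channels, "MRU_[RPH]")
print(selected)
```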
- mdf_reader.mdf_parser.convert_ymdhms_to_data_time(ymdhrs_array, sample_rate=1, constant_sample_rate=True)
Convert the binary year month day hour minutes seconds representation into a readable date/time string
- Parameters:
- Returns:
DateTime pandas index array
- Return type:
Notes
In the first version of the script, the ymdhrs was taken as leading and the number of samples per second was corrected to take care of missing samples or too many samples in a second. It appears that the sample rate is really constant and that the clock time may vary. Setting the constant_sample_rate flag to True makes the sample rate leading.
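Taking the sample rate as leading amounts to building the time axis from a start time and a fixed step. The following is a minimal sketch of that idea using only the standard library, not the actual conversion routine: constant_rate_times is a hypothetical helper, and the rate of 25 samples per second is chosen to match the 0.04 s spacing seen in the example output of MDFParser.

```python
from datetime import datetime, timedelta


def constant_rate_times(start, n_samples, sample_rate):
    """Equidistant time stamps: start + i / sample_rate for each sample i."""
    step_us = 1_000_000 / sample_rate  # step size in microseconds
    return [start + timedelta(microseconds=round(i * step_us))
            for i in range(n_samples)]


times = constant_rate_times(datetime(2011, 2, 25, 23, 30, 0), 5, 25)
for t in times:
    print(t)
```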
- mdf_reader.mdf_parser.decode_ymdhms(ymdhms)
The year, month, day, hour, minute and second are stored in a 4-byte integer
- mdf_reader.mdf_parser.main(args)
The main routine for testing purposes
- Parameters:
args (list) – Command line arguments