Implementing new file formats
If you are interested in adding support for a new file format, please create a new issue to start a discussion. Please also attach a zip file with example data that can later be used for testing.
If you are familiar with GitHub, please create a pull request and make sure that

- the file format reader is located in afmformats.formats.fmt_NAME (it may be a directory or a file, depending on the complexity),
- the file format displays correctly in the docs and the docs compile without errors:

      cd docs
      pip install -r requirements.txt
      sphinx-build . _build
      # and open _build/index.html in a browser

- you updated the CHANGELOG,
- your code is fully tested (create test functions in tests/test_fmt_NAME.py) and all other tests pass (there are a few general tests that all file format readers must pass):

      pip install pytest
      pytest tests

- the data files for examples are named according to fmt-NAME-MOD_filename.suffix, where MOD can be e.g. fd for force-distance data.
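The naming convention for example data files can be checked mechanically. The following is a minimal sketch and not part of afmformats: the pattern EXAMPLE_NAME and the helper parse_example_name are hypothetical, and the sketch assumes that NAME may contain hyphens while MOD contains neither hyphens nor underscores.

```python
import re

# Hypothetical pattern for "fmt-NAME-MOD_filename.suffix". The character
# rules for NAME and MOD are assumptions made for this sketch.
EXAMPLE_NAME = re.compile(
    r"^fmt-(?P<name>[^_]+)-(?P<mod>[^_-]+)_(?P<filename>.+)\.(?P<suffix>[^.]+)$"
)


def parse_example_name(filename):
    """Return the parts of an example file name, or None if it does not match."""
    match = EXAMPLE_NAME.match(filename)
    return match.groupdict() if match else None


# "myf" is the placeholder format name, "fd" stands for force-distance data.
parts = parse_example_name("fmt-myf-fd_single-curve.myf")
invalid = parse_example_name("single-curve.myf")
```

Here parts holds the four components of the name, while invalid is None because the file name lacks the fmt-NAME-MOD prefix.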
If you cannot or will not work with GitHub, you may paste your code in the corresponding issue. If the file format is not too complicated, let’s just hope that things don’t get messy.
Basic file format reader structure
The best way to understand how file formats work in afmformats is to take
a look at the file formats implemented already.
For the sake of clarity, here is a file format reader template:
import pathlib

import numpy as np


__all__ = ["load_my_format"]


def load_my_format(path, callback=None, meta_override=None):
    """Loads AFM data from my format

    This is the main function for loading your file format. Please
    add a description here.

    Parameters
    ----------
    path: str or pathlib.Path or io.TextIOBase
        path to a .tab file
    callback: callable
        function for progress tracking; must accept a float in
        [0, 1] as an argument.
    meta_override: dict
        if specified, contains key-value pairs of metadata that
        are used when loading the files
        (see :data:`afmformats.meta.META_FIELDS`)
    """
    if meta_override is None:
        meta_override = {}

    path = pathlib.Path(path)
    # Here you would start parsing your data and metadata from `path`
    # You should specify as many metadata keys as possible. See
    # afmformats.meta.DEF_ALL for a list of valid keys.
    metadata = {"path": path}
    # Valid column names are defined in afmformats.afm_data.known_columns.
    data = {"force": np.linspace(1e-9, 5e-9, 100),
            "height (measured)": np.linspace(2e-6, -1e-6, 100)}

    metadata.update(meta_override)
    dd = {"data": data,
          "metadata": metadata}

    if callback is not None:
        callback(1)

    # You may also return a list with more items in case the file format
    # contains more than one curve.
    return [dd]


recipe_myf = {
    "descr": "A short description",
    "loader": load_my_format,
    "suffix": ".myf",
    "modality": "force-distance",
    "maker": "designer of file format",
}
A few notes:
- The recipe_myf dictionary contains the recipe for loading the file format into afmformats. It must be registered in afmformats/formats/__init__.py.
- You may call the callback function with a floating point value between 0 and 1 (progress tracking) between your loading steps if you expect your file format reader to be slow (e.g. several curves have to be loaded). This gives users of e.g. PyJibe visual feedback on how long they will have to wait.
- The meta_override dictionary is useful if your file format does not contain essential metadata such as spring constant or sensitivity. In such cases, you can raise an afmformats.errors.MissingMetaDataError to signal to PyJibe that it should ask the user for the missing metadata. For an example, please see the AFM workshop file format.
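The interplay of callback and meta_override can be sketched for a file that holds several curves. This is an illustration under stated assumptions, not the afmformats implementation: the exception class below is a local stand-in for afmformats.errors.MissingMetaDataError (the real class may take different arguments), RAW_CURVES and load_multi_curve are hypothetical, and "spring constant" is assumed to be a valid metadata key.

```python
class MissingMetaDataError(Exception):
    """Local stand-in for afmformats.errors.MissingMetaDataError.

    The real class may have a different signature; this one only
    stores the list of missing metadata keys.
    """

    def __init__(self, meta_keys):
        super().__init__("Missing metadata: " + ", ".join(meta_keys))
        self.meta_keys = meta_keys


# Hypothetical parsed content of a file with three curves, none of which
# stores the spring constant.
RAW_CURVES = [
    {"force": [1e-9, 2e-9]},
    {"force": [2e-9, 3e-9]},
    {"force": [3e-9, 4e-9]},
]


def load_multi_curve(path, callback=None, meta_override=None):
    """Load all curves, report progress, and require a spring constant."""
    if meta_override is None:
        meta_override = {}
    # The key name "spring constant" is an assumption for this sketch.
    if "spring constant" not in meta_override:
        raise MissingMetaDataError(["spring constant"])
    dataset = []
    for index, data in enumerate(RAW_CURVES):
        metadata = {"path": path}
        metadata.update(meta_override)
        dataset.append({"data": data, "metadata": metadata})
        if callback is not None:
            # Fraction of curves loaded so far; the last call passes 1.0.
            callback((index + 1) / len(RAW_CURVES))
    return dataset


progress = []
try:
    load_multi_curve("example.myf")
except MissingMetaDataError as exc:
    # A front end such as PyJibe could now ask the user for these keys.
    missing = exc.meta_keys

curves = load_multi_curve("example.myf", callback=progress.append,
                          meta_override={"spring constant": 0.05})
```

The first call fails because the spring constant is unknown. The second call supplies it via meta_override and records one progress value per curve.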
Optimizing data import
In most cases, it is not necessary to actually load the data from disk in the
load_my_format function, especially if you have to parse large binary blobs or
text files. In such cases, you can make use of the lazy loaders implemented in afmformats.
For metadata, you can use afmformats.meta.LazyMetaValue, and for
data columns, you can use afmformats.lazy_loader.LazyData.
The JPK file reader makes heavy use of these classes.
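The idea behind lazy loading can be sketched with a small generic wrapper. The class LazyValue and the function parse_force_column below are hypothetical and only illustrate deferred evaluation. They do not reproduce the interface of afmformats.meta.LazyMetaValue or afmformats.lazy_loader.LazyData, whose signatures should be taken from the afmformats source.

```python
class LazyValue:
    """Defer a computation until its result is first requested.

    Illustrative stand-in only; not the afmformats lazy loader API.
    """

    def __init__(self, func, *args):
        self.func = func
        self.args = args
        self._value = None
        self._loaded = False

    def get(self):
        # Run the wrapped function once and cache its result.
        if not self._loaded:
            self._value = self.func(*self.args)
            self._loaded = True
        return self._value


calls = []


def parse_force_column(path):
    """Hypothetical expensive parser; records each call for demonstration."""
    calls.append(path)
    return [1e-9, 2e-9, 3e-9]


# Creating the wrapper does not parse anything yet.
force = LazyValue(parse_force_column, "example.myf")
# Parsing happens on the first get() only; the second call uses the cache.
first = force.get()
second = force.get()
```

With this pattern, a loader can return quickly and the cost of parsing a column is paid only when that column is actually accessed.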