Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Add FitFileEncoder for writing FIT files #58

Open
wants to merge 1 commit into
base: master
Choose a base branch
from

Conversation

xmedeko
Copy link
Contributor

@xmedeko xmedeko commented Mar 9, 2018

Fix #8

I have started with FitFileEncoder, see the test_encoder.py. I will rebase this PR while fixing the FitFileEncoder and edit this initial message.

The messages are created by DataMessageCreator just to keep quite a lot of code outside of the core DataMessage.

Cannot write:

  • compound fields,
  • accumulated fields,
  • developer fields.

But these fields are not necessary for an application generated FIT files. The compound and accumulated fields are designed to save a few bits of the resulting FIT file, which is not necessary for a server generated files and, furthermore, I need to use simple FIT features only to make some weak FIT parsers also working.

@dtcooper @pR0Ps Please review the code.
I would appreciate to move some code changes out of this PR and merge it to master to make this PR smaller and simpler to review:

  • fileish_open, see RFCT utils.fileish_open with a write possibility #54
  • RFTC Crc, see the commit "RFCT move crc computation to records.Crc, add test". Also speeds up the FitFile when CRC is ignored.
  • Move apply_scale_offset to the records.py, what about to merge it to the field.render code?
  • Solve the optimisations Remove RecordBase to speedup processing #57
  • Rename FitFileDataProcessor.process_xxx methods to parse_xxx
  • Add BaseTyp.invalid_value, so as the parse does not need to be defined in most of times.

Thank you

0x07: BaseType(name='string', identifier=0x07, fmt='s', parse=parse_string, unparse=unparse_string, in_range=lambda x: x),
0x88: BaseType(name='float32', identifier=0x88, fmt='f', invalid_value=_FLOAT32_INVALID_VALUE,
parse=lambda x: None if math.isnan(x) else x,
in_range=lambda x: x if -3.4028235e+38 < x < 3.4028235e+38 else _FLOAT32_INVALID_VALUE),
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Would prefer if this constant was named for readability along with _FLOAT32_INVALID_VALUE:

_FLOAT32_RANGE = (-3.4028235e+38, 3.4028235e+38)

in_range=lambda x: x if _FLOAT32_RANGE[0] < x < _FLOAT32_RANGE[1] else _FLOAT32_INVALID_VALUE

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Maybe should polish this part, maybe subclass BaseType and use merge in_range into unparse.

:rtype bool"""
if isinstance(obj, (str, bytes)):
return False
try:
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Can simplify by using the Iterable type from the collections module:

from collections import Iterable

def is_iterable(obj):
    return not isinstance(obj, (str, bytes)) and isinstance(obj, Iterable)

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

See https://stackoverflow.com/questions/1952464
isinstance(obj, Iterable) does not cover all use cases. However, for most of situations, isinstance is OK. Maybe should do it to speed up the code a bit.

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Huh, I actually had no idea that defining __getitem__ on a class would cause it to become iterable. Seems like a legacy feature that was kept in for backwards compatibility after PEP234 was implemented. I did some tests in 2.7/3.6 and it seems that the basic types all implement __iter__ now so as long as custom classes do the same it should be fine.

scripts/fitdump Outdated
@@ -98,7 +98,7 @@ def main(args=None):

fitfile = fitparse.FitFile(
options.infile,
data_processor=fitparse.StandardUnitsDataProcessor(),
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The fitdump script is user-facing and should therefore convert units to SI by default. Getting the raw values from the file could be useful in some cases, but should be put behind a flag (--raw-values?) if it's going to be added.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This change is a mistake, thanks.

# Python 2 compat
try:
num_types = (int, float, long)
str = basestring
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

str is still used as a typecheck in get_messages in the following code:

names = set([
    int(n) if (isinstance(n, str) and n.isdigit()) else n
    for n in names
])

However, looking at it now, this code can probably be refactored to:

def try_int(obj):
    try:
        return int(obj)
    except ValueError:
        return obj

names = set(try_int(n) for n in names)

This would remove the typecheck so the shim wouldn't be needed.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

IMO this code should be removed - the caller should make sure a message number is int, not str if wants to use the number.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@pR0Ps Should I remove this from the FitFile?

:param profile_version: profile version.
:param data_processor: custom data processor.
"""
self.protocol_version = 1.0 if protocol_version is None else float(protocol_version)
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Any reason to not put these defaults in the function definition? Would be more self-documenting.

Copy link
Contributor Author

@xmedeko xmedeko Apr 6, 2018

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Not any reason, just a mistake, thanks.

self._crc.update(data)

def _write_struct(self, data, fmt, endian='<'):
if fmt.startswith('<') or fmt.startswith('>'):
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

In general, allowing multiple ways to specify a parameter can get confusing (how do they interact?). In this case I would either raise an error if the endianness is specified in both parameters or just not allow it in the fmt at all since (as far as I can tell) it's only used for the CRC.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

IMO it's a design flaw in the fitparse base, already. E.g. see test.py func generate_fitfile() - there is copy&paste of the struct.pack spec for the FIT header and Crc. I've tried to fix that for Crc at least. But I can copy&paste the endianness for the Crc, too.

from .utils import fileish_open, FitParseError


class FitFileEncoder(object):
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It doesn't look like this class allows for creating chained FIT files (basically multiple FIT files concatenated together, see the commit that added support for parsing them). Not saying it has to, just letting you know that it's a possibility since it's a bit of an obscure part of the spec.

In fact, if you want to support chained FIT files, it should probably be taken care of by higher-level code that basically just does an itertools.chain(*fileishs) for the fileishs written by this class.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Exactly, the code itself allows to append to a file, too. Just the file has to be opened by the user code. So, chaining is the task for the user code. Should not burden the core code.

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I get that it shouldn't be done as part of the normal encoding since it's at a higher level than messing around with the bytes of the file, but that doesn't mean it can't be included in the library.

If it's a common operation, pushing it into user code just means that everyone is going to have to write their own implementation. A better way would be to provide all the functionality the user would need, but let them decide when to use it. For example, one way would be to have a small method chain on the class that does something like (untested):

def chain(self, *fits):
    if not self.completed:
        raise Exception("Can't chain more files onto an incomplete FIT file")

    for fit in fits:
        if isinstance(fit, FitFile):
            data = fit._file
        else:
            data = fileish_open(fit, 'rb')

        while True:
            block = data.read(BLOCK_SIZE)
            if not block:
                break
            self._write(block)

This way the library can check for issues (like appending data to an incomplete fit datastream) as well as handle compatibility with library objects (the FitFile shim), but it doesn't complicate the rest of the encoding process in any way.

Copy link
Contributor Author

@xmedeko xmedeko Apr 10, 2018

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

You can already chain with this code:

with FitFileEncoder(open(fitname, 'ab') as fit:
   ...
with FitFileEncoder(open(fitname, 'ab') as fit:
   ...

I have not seen any particular support for chaining in the FIT SDK, I think they support it the same way.

@xmedeko
Copy link
Contributor Author

xmedeko commented Apr 12, 2018

@pR0Ps I've addressed most of the comments by amend. See the rest.
Do not want to merge it into the master until #57 is resolved.

@vlcvboyer
Copy link

+1 writing fit files will be useful !
Please see my PR on @xmedeko fork which includes some scripts to fix FIT files or combine them into multi-sport ativities:
[pull request on xmedeko fork] (xmedeko#1)
Unfortunately there are conflicts with upstream.... I've not changed anything yet as I can see some different opinions in this stream...

@23chrischen
Copy link

+1 would love to see this merged as well!

@Kypaz
Copy link

Kypaz commented Jun 5, 2019

+1, would love to have this merged in master. Will have to clone this manually to get access to this new feature for now ! Thanks for this !!

@xmedeko
Copy link
Contributor Author

xmedeko commented Jun 5, 2019

I am sorry but I do not plan to maintain this PR and resolve conflicts due to the development in the master. Personally, I have decided do abandon FIT writing (and parsing, too) in pure Python since the FIT format is loosely specified and complex. I recommend to make some FIT<-->JSON tool from the original FIT SDK (e.g. C#, Java, C++) and just call this tool from Python.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

Feature request: creating FIT file from any readable format (CSV, XML, JSON or any other)
5 participants