New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
fix: building on windows #12
Conversation
b4c1f70
to
d546f24
Compare
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Reviewable status: 0 of 6 files reviewed, 5 unresolved discussions (waiting on @dopiera and @kboroszko)
bld.sh, line 1 at r1 (raw file):
#!/usr/bin/env bash
I'm guessing this file got here by accident, right?
WORKSPACE, line 112 at r1 (raw file):
https://github.com/protocolbuffers/protobuf/issues/7076
I think we need more explanation that that.
How about this:
Because of a bug in protocol buffers (protocolbuffers/protobuf#7076), new versions of this project fail to compile on Windows. The problem hinges on OPTIONAL
being defined as an empty string under Windows. This makes the preprocessor remove every mention of OPTIONAL
from the code, which causes compilation failures. This temporary workaround renames the name of the protobuf value OPTIONAL
to OPIONAL
. This should be safe as it does not affect the generated protobufs.
WORKSPACE, line 118 at r1 (raw file):
-i.bak
what do we need those *.bak
s for?
tensorflow_io/core/kernels/bigtable/serialization.h, line 28 at r1 (raw file):
values as byte buffers
I think it should be "byte buffers as values".
tensorflow_io/core/kernels/bigtable/serialization.h, line 32 at r1 (raw file):
XDR seems to match what HBase does.
I think we need to elaborate on why we need two implementations.
I think we should say the following:
HBase stores integers as big-endian and floats as IEEE754 (also big-endian). Given that integer endianness does not always match float endianness, and the fact that there are architectures where it is neither little nor big (BE-32), implementing this properly is non-trivial. Ideally, we would use a library to do that. XDR matches what HBase does, but it is not easily available on Windows, so we decided to go with a hybrid approach. On Windows we assume that integer endianness matches float endianness and implement the deserialization ourselves and everywhere else we use XDR. For that reason we provide two implementations
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Reviewable status: 0 of 6 files reviewed, 5 unresolved discussions (waiting on @dopiera)
WORKSPACE, line 112 at r1 (raw file):
Previously, dopiera (Marek Dopiera) wrote…
https://github.com/protocolbuffers/protobuf/issues/7076
I think we need more explanation that that.
How about this:
Because of a bug in protocol buffers (protocolbuffers/protobuf#7076), new versions of this project fail to compile on Windows. The problem hinges onOPTIONAL
being defined as an empty string under Windows. This makes the preprocessor remove every mention ofOPTIONAL
from the code, which causes compilation failures. This temporary workaround renames the name of the protobuf valueOPTIONAL
toOPIONAL
. This should be safe as it does not affect the generated protobufs.
Done.
WORKSPACE, line 118 at r1 (raw file):
Previously, dopiera (Marek Dopiera) wrote…
-i.bak
what do we need those
*.bak
s for?
Cargo cult.
tensorflow_io/core/kernels/bigtable/serialization.h, line 28 at r1 (raw file):
Previously, dopiera (Marek Dopiera) wrote…
values as byte buffers
I think it should be "byte buffers as values".
Actually, I don't get why. Each cell has a value, and this value is represented as bytes, no?
tensorflow_io/core/kernels/bigtable/serialization.h, line 32 at r1 (raw file):
Previously, dopiera (Marek Dopiera) wrote…
XDR seems to match what HBase does.
I think we need to elaborate on why we need two implementations.
I think we should say the following:
HBase stores integers as big-endian and floats as IEEE754 (also big-endian). Given that integer endianness does not always match float endianness, and the fact that there are architectures where it is neither little nor big (BE-32), implementing this properly is non-trivial. Ideally, we would use a library to do that. XDR matches what HBase does, but it is not easily available on Windows, so we decided to go with a hybrid approach. On Windows we assume that integer endianness matches float endianness and implement the deserialization ourselves and everywhere else we use XDR. For that reason we provide two implementations
Done.
bld.sh, line 1 at r1 (raw file):
Previously, dopiera (Marek Dopiera) wrote…
I'm guessing this file got here by accident, right?
yup
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Reviewable status: 0 of 6 files reviewed, 3 unresolved discussions (waiting on @dopiera)
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Reviewable status: 0 of 6 files reviewed, 3 unresolved discussions (waiting on @dopiera)
WORKSPACE, line 118 at r1 (raw file):
Previously, kboroszko (Kajetan Boroszko) wrote…
Cargo cult.
It turns out that we have this because without it the builiding on MacOS mysteriously fails. It's some problem with sed
on OSX, but i didn't spend much time debugging it. All I know is it breaks, when I remove the .bak
. Other sed's in this file use this option as well, I assume for the same reason. I don't think it's worth investigating.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Reviewable status: 0 of 6 files reviewed, 3 unresolved discussions (waiting on @dopiera and @kboroszko)
tensorflow_io/core/kernels/bigtable/serialization.h, line 28 at r1 (raw file):
Previously, kboroszko (Kajetan Boroszko) wrote…
Actually, I don't get why. Each cell has a value, and this value is represented as bytes, no?
I think it's a tautology to say that data is stored as byte buffers - I don't think there's an alternative. After all, ints, floats, protobufs or images are all stored as byte buffers.
I think that when you say "byte buffers as values", you indicate that the only type of the value is a byte buffer, which is what we want to indicate here.
Either way, I think the reader will get what we mean, so fix it only if you like.
tests/test_bigtable/test_serialization.py, line 40 at r8 (raw file):
) ): test_case.assertEqual(values[i].numpy(), r.numpy()[0])
For floats and doubles you want assertAlmostEqual
.
Also, why isn't this a method in BigtableReadTest
?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Reviewable status: 0 of 6 files reviewed, 3 unresolved discussions (waiting on @dopiera)
tests/test_bigtable/test_serialization.py, line 40 at r8 (raw file):
Previously, dopiera (Marek Dopiera) wrote…
For floats and doubles you want
assertAlmostEqual
.Also, why isn't this a method in
BigtableReadTest
?
Done
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Reviewable status: 0 of 6 files reviewed, all discussions resolved (waiting on @dopiera)
This PR fixes building on Windows by:
This change is