Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

MSVC 2017 build fails with x86 architecture (v2.0.0-rc2) #91

Open
taxen opened this issue Jun 6, 2018 · 17 comments · Fixed by #104
Open

MSVC 2017 build fails with x86 architecture (v2.0.0-rc2) #91

taxen opened this issue Jun 6, 2018 · 17 comments · Fixed by #104

Comments

@taxen
Copy link

taxen commented Jun 6, 2018

Using MSVC 2017 Community on Windows 10 and CMake 3.11.3, I am trying to build Windows binary from jumanpp-2.0.0-rc2.tar.xz. However, I have some trouble to ask the developers for help. (On Linux, I got working binary from the same source. Many thanks!)

Procedure

  1. Install MSVC 2017 Comunnity version (chose default set for Desktop development with C++) and CMake 3.11.3 (chose Windows win64-x64 ZIP as the installer gave an error).
    jumanpp01
    jumanpp02
    jumanpp05
    jumanpp03
  2. Download jumanpp-2.0.0-rc2.tar.xz and extract into C:\data\jumanpp-2.0.0-rc2 (I used 7-Zip 18.05).
  3. As instructed https://github.com/ku-nlp/jumanpp#building-from-a-package , open a terminal and do the following with the result.
    jumanpp04
C:\data\jumanpp-2.0.0-rc2>mkdir bld

C:\data\jumanpp-2.0.0-rc2>cd bld

C:\data\jumanpp-2.0.0-rc2\bld>cmake ..  -DCMAKE_BUILD_TYPE=Release    -DCMAKE_INSTALL_PREFIX="C:\Program Files\jumanpp"
-- Building for: Visual Studio 15 2017
-- Selecting Windows SDK version 10.0.17134.0 to target Windows 10.0.16299.
-- The C compiler identification is MSVC 19.14.26430.0
-- The CXX compiler identification is MSVC 19.14.26430.0
-- Check for working C compiler: C:/Program Files (x86)/Microsoft Visual Studio/2017/Community/VC/Tools/MSVC/14.14.26428/bin/Hostx86/x86/cl.exe
-- Check for working C compiler: C:/Program Files (x86)/Microsoft Visual Studio/2017/Community/VC/Tools/MSVC/14.14.26428/bin/Hostx86/x86/cl.exe -- works
-- Detecting C compiler ABI info
-- Detecting C compiler ABI info - done
-- Check for working CXX compiler: C:/Program Files (x86)/Microsoft Visual Studio/2017/Community/VC/Tools/MSVC/14.14.26428/bin/Hostx86/x86/cl.exe
-- Check for working CXX compiler: C:/Program Files (x86)/Microsoft Visual Studio/2017/Community/VC/Tools/MSVC/14.14.26428/bin/Hostx86/x86/cl.exe -- works
-- Detecting CXX compiler ABI info
-- Detecting CXX compiler ABI info - done
-- Detecting CXX compile features
-- Detecting CXX compile features - done
-- Looking for pthread.h
-- Looking for pthread.h - not found
-- Found Threads: TRUE
-- Could NOT find Protobuf (missing: Protobuf_LIBRARIES Protobuf_INCLUDE_DIR)
Not using Protobuf-based components
-- Eigen3... OK

    Static feature generation:
        COMMAND cgtest02_codegen_binary cgtest02 Test02 C:/data/jumanpp-2.0.0-rc2/bld/src/core/codegen/gen
        DEPENDS cgtest02_codegen_binary cg_2_spec.h
        OUTPUT C:/data/jumanpp-2.0.0-rc2/bld/src/core/codegen/gen/cgtest02.cc C:/data/jumanpp-2.0.0-rc2/bld/src/core/codegen/gen/cgtest02.h
        LIBS jpp_core


    Static feature generation:
        COMMAND jpp_jumandic_cg_codegen_binary jpp_jumandic_cg JumandicStatic C:/data/jumanpp-2.0.0-rc2/bld/src/jumandic/gen
        DEPENDS jpp_jumandic_cg_codegen_binary shared/jumandic_spec.cc
        OUTPUT C:/data/jumanpp-2.0.0-rc2/bld/src/jumandic/gen/jpp_jumandic_cg.cc C:/data/jumanpp-2.0.0-rc2/bld/src/jumandic/gen/jpp_jumandic_cg.h
        LIBS jpp_jumandic_spec

C:/data/jumanpp-2.0.0-rc2/bld/src/jumandic/gen
-- Configuring done
-- Generating done
-- Build files have been written to: C:/data/jumanpp-2.0.0-rc2/bld
  1. Doubleclick "C:\data\jumanpp-2.0.0-rc2\bld\jumanpp.sln" and launch MSVC 2017.
  2. After some messages and project loading is done, choose ALL_BUILD from Solution Explorer and build with solution configuration of Release (changed from Debug) and Win32 (as default).
    jumanpp06
  3. After a while, I get 19 errors and many warnings and the build fails. Attached gz file is the exact output of the build (1210 lines long).
    jumanpp07
    jumanpp-build-log.txt.gz

Since Linux build is fine and it actually works pretty well, something on my Windows is not right. Some advice would be really appreciated.

@eiennohito
Copy link
Contributor

I am using the same procedure (the only difference is that I was using mingw-msys64 cmake and not the Windows installer) and it works (with AppVeyor CI as well).
I'll try to dig into it.

A question: does this script work for you? (A build out of MSVS GUI, but still using MSVC toolkit).

mkdir build-dir
cd build-dir
cmake -G "Visual Studio 15 2017 Win64" ..
cmake --build . --config Release
ctest -C Release --output-on-failure

@taxen
Copy link
Author

taxen commented Jun 7, 2018

Wow! I did exactly you suggested me to do and voila!
Only ctest output is enough, I think, but in case you need I attach the whole log file (6272 line long).

C:\data\jumanpp-2.0.0-rc2\build-dir>ctest -C Release --output-on-failure
Test project C:/data/jumanpp-2.0.0-rc2/build-dir
      Start  1: jpp_util_test
 1/10 Test  #1: jpp_util_test ....................   Passed    0.07 sec
      Start  2: testing_infra_test
 2/10 Test  #2: testing_infra_test ...............   Passed    0.05 sec
      Start  3: jpp_core_tests
 3/10 Test  #3: jpp_core_tests ...................   Passed    0.27 sec
      Start  4: jpp_core_analysis_tests
 4/10 Test  #4: jpp_core_analysis_tests ..........   Passed    0.72 sec
      Start  5: jpp_codegen_tests
 5/10 Test  #5: jpp_codegen_tests ................   Passed    0.09 sec
      Start  6: jpp_core_spec_tests
 6/10 Test  #6: jpp_core_spec_tests ..............   Passed    0.06 sec
      Start  7: jpp_core_train_tests
 7/10 Test  #7: jpp_core_train_tests .............   Passed    0.20 sec
      Start  8: jpp_jumandic_tests
 8/10 Test  #8: jpp_jumandic_tests ...............   Passed    0.33 sec
      Start  9: jpp_bug_tests
 9/10 Test  #9: jpp_bug_tests ....................   Passed    0.14 sec
      Start 10: jpp_rnn_tests
10/10 Test #10: jpp_rnn_tests ....................   Passed    0.22 sec

100% tests passed, 0 tests failed out of 10

Total Test time (real) =   2.24 sec

jumanpp-build-log2.txt.gz
jumanpp08

Some further questions:

  1. After the build is done, I manually grab "{where source is located}\jumanpp-2.0.0-rc2\build-dir\src\jumandic\Release\jumanpp_v2.exe" and "{where source is located}\jumanpp-2.0.0-rc2\model\jumandic.jppmdl" and put them into "C:\Program Files\jumanpp" and "C:\Program Files\jumanpp\libexec\jumanpp" respectively (as config file "{where source is located}\jumanpp-2.0.0-rc2\model\jumandic.conf.in" suggested to put the dic file like that). Is there a command to install after build?

  2. I also copy the config file and put it in the binary directory, rename it to jumandic.conf and try to run, but fails (tried with original file name as well). I set a model option in command line right now. How can I direct the binary to use the config file automatically without -c option? Using -c option to specify the config file fails because I put it under C:\Program Files\jumanpp which include a space. Escaping a space by \ does not seem to work for me. Strange escape is not even needed in command line when double quoted.

C:\Program Files\jumanpp>jumanpp_v2.exe -c jumandic.conf sample.txt
failed to load model from disk: InvalidState: could not access file: "C:/Program errcode=123
backtrace:
    jumanpp::util::MappedFile::open at c:\data\jumanpp-2.0.0-rc2\src\util\mmap_impl_win32.h:66
    jumanpp::core::model::FilesystemModel::open at c:\data\jumanpp-2.0.0-rc2\src\core\impl\model_io.cc:111
    jumanpp::core::JumanppEnv::loadModel at c:\data\jumanpp-2.0.0-rc2\src\core\env.cc:11
    jumanpp::jumandic::JumanppExec::init at c:\data\jumanpp-2.0.0-rc2\src\jumandic\shared\jumandic_env.cc:26

Here is my content of jumandoc.conf

--model="C:/Program Files/jumanpp/libexec/jumanpp/jumandic.jppmdl"
--rnn-nce-bias=5.62844432562
--rnn-unk-constant=-3.4748115191
--rnn-unk-length=-2.92994951022
--feature-weight-perceptron=1
--feature-weight-rnn=0.0176

  1. Is there any way to control output character code? Above terminal is mintty. Windows default terminal won't display the output correctly.

Thank you!

@eiennohito
Copy link
Contributor

Nice to know that the build have passed. That means that at least MSBuild files generated by CMake are sane. I wonder that is the problem with the VS Solution.

  1. We want to build an installer for Windows. Build an Windows installer #81. Installing from the CMake source build... is a bit advanced thing on Windows.
  2. Thats a known bug :P. Please don't use spaces in the path names for the time being. I will fix it together with Resolve model paths from config relative to config location #83.
  3. I do not plan to add any character code convertion functionality in the binary itself. You can use (or write) a wrapper script if you want.

@taxen
Copy link
Author

taxen commented Jun 7, 2018

Thank you for clearing those things out. Totally understandable especially about output code 3.
This issue is now solved, I think.

@eiennohito
Copy link
Contributor

eiennohito commented Jun 7, 2018

I have looked over your build log once more.
It seems that by default CMake tries to generate build files for 32-bit.
I haven't tested that configuration.

It probably should work, but a lot of Juman++ internal optimizations assume that we are using a 64-bit platform and have a sane number of 64-bit registers, which is obviously false on x86.

@taxen
Copy link
Author

taxen commented Jun 7, 2018

I once tried
cmake -G "Visual Studio 15 2017" ..
but not
cmake -G "Visual Studio 15 2017 Win64" ..
So... Win64 makes the difference, I see.

@eiennohito eiennohito changed the title MSVC 2017 build fails (v2.0.0-rc2) MSVC 2017 build fails with x86 architecture (v2.0.0-rc2) Jun 7, 2018
@DoumanAsh
Copy link
Contributor

DoumanAsh commented Nov 30, 2018

Visual Studio 15 2017 is always 32 bit project and you need append Win64 for 64-bit builds

It should be possible to add warning to CMake to generate at least warning in such cases

@eiennohito
Copy link
Contributor

Win32 build should not fail. It is a bug.
It should be significantly slower (my guess is ~10x) than Win64 version though.

@DoumanAsh
Copy link
Contributor

Oh ok, I can check it out on my windows laptop since I have toolchain

@eiennohito
Copy link
Contributor

eiennohito commented Dec 1, 2018

Tests still fail, so I reopen this issue.

Still, I don't know if this is a good idea to support 32 bit architecture at all. The problem is: Juman++ memory maps analysis model and it can be large: 300m. When embedding this thing into other applications you can run out of virtual memory very fast on 32 bits.

Also, hashing algorithm I use assumes that we have a lot of 64-bit integer registers. That will cause slowdown on x32 (I don't know how much though). Hashing is the cornerstone of the analysis speed by the way.

@DoumanAsh
Copy link
Contributor

DoumanAsh commented Dec 1, 2018

@eiennohito Yeah, sorry, I suspect yesterday I tested it incorrectly.

Looking at tests:

TEST_CASE("murmur hash works with strings") {
  auto string = "this is a test string";
  auto ptr = reinterpret_cast<const jumanpp::u8*>(string);
  auto hv = murmurhash3_memory(ptr, ptr + sizeof(string), 10);
  CHECK(hv == 0x2e62c68031e4659dULL);
}

Even tests themself assume 64 bit integers, but internally murmur_hash uses size_t so you can easily lose accuracy

@eiennohito
Copy link
Contributor

eiennohito commented Dec 1, 2018

That's a bug, it should use uint64_t.
Otherwise models will not be portable.

@DoumanAsh
Copy link
Contributor

There are actually lots of warnings about implicit casts between size_t and uint64_t so to be sure I'd suggest to go through all of them... but it is lots to take on, so might take a while

@eiennohito
Copy link
Contributor

At least MurmurHash internal hashing representation should use fixed 64 bit integers.

But sized integer hygiene is... not good, yep.

@DoumanAsh
Copy link
Contributor

@eiennohito Just letting you know I for now give up with murmurhash3_memory tests, I just do not see where narrowing could happen that starts failing test.
I doubt it is critical issue since x64 is recommended to be used, but well maybe someone else will try to tackle it. Or me at later time.

@eiennohito
Copy link
Contributor

Thanks for digging. I'm busy with a paper and upcoming PhD thesis right now, so I can't guarantee when I look into it, but I will try.

@eiennohito
Copy link
Contributor

If your project is sorta largish, I strongly recommend you to use Juman++ via gRPC. You will be protected from crashes and other weird things that way.

https://github.com/eiennohito/jumanpp-grpc

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging a pull request may close this issue.

3 participants