Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

feat: C++ runtime on Windows #2806

Open
wants to merge 28 commits into
base: main
Choose a base branch
from
Open

Conversation

HolyWu
Copy link
Contributor

@HolyWu HolyWu commented May 2, 2024

Description

Fix and enable C++ runtime on Windows.

Fixes #2247
Fixes #2371
Fixes #2484

Type of change

  • New feature (non-breaking change which adds functionality)

Checklist:

  • My code follows the style guidelines of this project (You can use the linters)
  • I have performed a self-review of my own code
  • I have commented my code, particularly in hard-to-understand areas and hacks
  • I have made corresponding changes to the documentation
  • I have added tests to verify my fix or my feature
  • New and existing unit tests pass locally with my changes
  • I have added the relevant labels to my PR in so that relevant reviewers are notified

@HolyWu
Copy link
Contributor Author

HolyWu commented May 4, 2024

Some observations about the failed tests.

@github-actions github-actions bot added the component: tests Issues re: Tests label May 4, 2024
Copy link
Collaborator

@gs-olive gs-olive left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This looks great and thank you very much for the contribution. Added a few questions/comments. The comments on the test cases + added fixes look good and switching np.dtype("i") to np.int32 for that case is fine to include in the converter.

.github/workflows/build-test-windows.yml Outdated Show resolved Hide resolved
.github/workflows/build_wheels_windows.yml Outdated Show resolved Hide resolved
@github-actions github-actions bot added component: conversion Issues re: Conversion stage component: converters Issues re: Specific op converters component: dynamo Issues relating to the `torch.compile` or `torch._dynamo.export` paths labels May 7, 2024
@github-actions github-actions bot removed component: conversion Issues re: Conversion stage component: converters Issues re: Specific op converters component: dynamo Issues relating to the `torch.compile` or `torch._dynamo.export` paths labels May 8, 2024
@gs-olive gs-olive added the ciflow/binaries/all Build for all Python Versions label May 17, 2024
Copy link

pytorch-bot bot commented May 17, 2024

No ciflow labels are configured for this repo.
For information on how to enable CIFlow bot see this wiki

@gs-olive
Copy link
Collaborator

gs-olive commented May 22, 2024

Hi @HolyWu - I wanted to share a few notes on the C++ runtime on Windows after some local testing:

  • When compiling locally, I had to upgrade the Bazel version to 6.3.2 so it could find cl and other C++ WIndows dependencies. Still, it fails during the ninja build. Have you seen the following error before?
../core/conversion/conversion.h(5): fatal error C1083: Cannot open include file: 'NvInfer.h': No such file or directory
  • I then tried downloading the artifact from the GHA job and installing, which worked.

Based on the GHA-installed package, I ran the CI-failing tests, for which I have the following results:

  • test_bert_base_uncased for both export and compile fails with transformers==4.41.0, but passes with transformers==4.39.3. Could you try fixing this version in the .yml?
  • test_linear_3_multi_dim_matrix + test_linear_4_multi_dim_matrix - these pass/fail nondeterministically for me on Windows. They fail more often and with larger variance than they do on Linux, but failures on this test have been seen on both systems. Needs further investigation on my end.
  • test_bert_base_uncased in the Torchscript path causes a Windows fatal exception on my machine as well - this should be skipped on Windows only while it is investigated.
    • The above is also resolved by using transformers==4.39.3 on my machine

HolyWu added 27 commits June 4, 2024 20:59
Linux workflow also does this way so the cuda version of TensorRT doesn't necessarily have to match up PyTorch's cuda version.
They don't always get installed automatically on CI and hence lead to "Can not find nvinfer" error.
Installing torchvision with legacy resolver could cause the installed version of torch conflict with the required version from torchvision, leading to errors like `RuntimeError: operator torchvision::nms does not exist` or `AttributeError: partially initialized module 'torchvision' has no attribute 'extension'`
@narendasan
Copy link
Collaborator

Fixes: #2645

@narendasan narendasan linked an issue Jun 7, 2024 that may be closed by this pull request
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
ciflow/binaries/all Build for all Python Versions cla signed component: api [Python] Issues re: Python API component: build system Issues re: Build system component: tests Issues re: Tests
Projects
None yet
4 participants