Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

[Flang] LLVM_ENABLE_RUNTIMES=flang-rt #110217

Open
wants to merge 1,655 commits into
base: users/meinersbur/flang_runtime_move-files
Choose a base branch
from

Conversation

Meinersbur
Copy link
Member

@Meinersbur Meinersbur commented Sep 27, 2024

Extract Flang's runtime library to use the LLVM_ENABLE_RUNTIME mechanism.

Motivation:

  • Consistency with LLVM's other runtime libraries (compiler-rt, libc, libcxx, openmp offload, ...)
  • Allows compiling the runtime for multiple targets at once using the LLVM_RUNTIME_TARGETS configuration options
  • Installs the runtime into the compiler's per-target resource directory so it can be automatically found even when cross-compiling

Potential future directions:

  • Uses CMake's support for compiling Fortran files, including dependency resolution of Fortran modules
  • Improve robustness of compiling libomp.mod when openmp is available
  • Remove Flang's dependency from flang-rt's RTNAME function declarations (tblgen?)
  • Reduce Flang's build-time dependency from flang-rt's REAL(16) support

See RFC discussion at https://discourse.llvm.org/t/rfc-use-llvm-enable-runtimes-for-flangs-runtime/80826

Patch series:

Patch for lab.llvm.org buildbots:

@Meinersbur Meinersbur changed the base branch from main to users/meinersbur/flang_runtime_move-files September 27, 2024 18:58
Copy link

github-actions bot commented Oct 1, 2024

⚠️ Python code formatter, darker found issues in your code. ⚠️

You can test this locally with the following command:
darker --check --diff -r d65a4bcf0d76932e2f6078a7fc7db6ab922fd3fb...02b4731a1b8e02ed600cfb4ba523af836c98e3e8 flang-rt/test/NonGtestUnit/lit.cfg.py flang-rt/test/Unit/lit.cfg.py flang-rt/test/lit.cfg.py flang/test/lit.cfg.py
View the diff from darker here.
--- flang-rt/test/lit.cfg.py	2025-01-02 15:54:24.000000 +0000
+++ flang-rt/test/lit.cfg.py	2025-01-02 18:07:01.586322 +0000
@@ -78,23 +78,21 @@
         "%clang",
         command=FindTool("clang"),
         extra_args=isysroot_flag,
         unresolved="fatal",
     ),
-    ToolSubst("%cc",
-        command=config.cc,
-        extra_args=isysroot_flag,
-        unresolved="fatal"
-    ),
+    ToolSubst("%cc", command=config.cc, extra_args=isysroot_flag, unresolved="fatal"),
 ]
 llvm_config.add_tool_substitutions(tools)
 
 # Let tests find LLVM's standard tools (FileCheck, split-file, not, ...)
 llvm_config.with_environment("PATH", config.llvm_tools_dir, append_path=True)
 
 # Include path for C headers that define Flang's Fortran ABI.
-config.substitutions.append(("%include", os.path.join(config.flang_source_dir, "include")))
+config.substitutions.append(
+    ("%include", os.path.join(config.flang_source_dir, "include"))
+)
 
 # Library path of libflang_rt.a (for lib search path when using non-Flang driver for linking)
 config.substitutions.append(("%libdir", config.flang_rt_output_resource_lib_dir))
 
 # Additional library depedendencies the that Flang driver does not add itself.

Copy link

github-actions bot commented Oct 1, 2024

✅ With the latest revision this PR passed the C/C++ code formatter.

@Meinersbur
Copy link
Member Author

Meinersbur commented Oct 1, 2024

I reverted the change that moved flang/module/*.f90 to FortranRuntime/module/*.f90. A lot of Flang tests depend on the *.mod files to be available. Moving the modules to FortranRuntime would either require moving those tests to FortranRuntime as well, or mark them as requiring the module_files (REQUIRES: module_files) and build those before check-flang:

flowchart TB
 subgraph Flang
   direction TB
   flang-new
   check-flang
 end
 subgraph FortranRuntime
   direction TB
   iso_fortran_env_impl("iso_fortran_env_impl.f90")
   lib("libFortranRuntime.a")
   module_files
   check-FortranRuntime
 end

 flang-new --> iso_fortran_env_impl --> lib --> check-FortranRuntime
 flang-new --> module_files --> check-flang
 flang-new --> check-flang
 flang-new --> check-FortranRuntime
Loading

Except the PPC modules, the mod files are not target-specific and hence would not profit from compiling multiple targets from a bootstrap build anyway. However, they are products of the Flang compiler and therefore IMHO logically fits better to the runtime. Also, it would have been nice to make use of CMake's dependency resolution rather than hardcoding Into the CMakeLists.txt. I think it is best to keep that decision out of the current patch.

Meinersbur added a commit that referenced this pull request Oct 7, 2024
Due to missing lit.cfg, will only be executing again after #110217.
@Meinersbur Meinersbur marked this pull request as ready for review October 7, 2024 20:57
@Meinersbur Meinersbur requested a review from a team as a code owner October 7, 2024 20:57
@Meinersbur Meinersbur changed the title [Flang][DRAFT] LLVM_ENABLE_RUNTIMES=FortranRuntime [Flang] LLVM_ENABLE_RUNTIMES=FortranRuntime Oct 7, 2024
)

enable_cuda_compilation(FortranRuntime "${supported_files}")
enable_omp_offload_compilation("${supported_files}")
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Is this just removed or moved somewhere?

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I had expected that I moved those to add_fortranruntime_library but apparently I didn't. I re-added it with the latest push, including some build fixes since it doesn't even work with the trunk version. Even flang-new doesn't compile with the error:

ld.lld: error: undefined symbol: __cudaRegisterLinkedBinary_9067bd07_21_binary_to_decimal_cpp_c17d0b68
>>> referenced by tmpxft_00002a72_00000000-6_binary-to-decimal.cudafe1.cpp
>>>               binary-to-decimal.cpp.o:(__sti____cudaRegisterAll()) in archive lib/libFortranDecimal.a

This is because libFortranDecimal is compiled as CUDA source (but not libFortranCommon) and needed for flang-new. libFortranDecimal for FortranRuntime and Flang should really be compiled independently of each other, as this PR does.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Where is this CUDA build actually used? I'm interested in removing it if possible and just building this on top of my other GPU libraries.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

We have plan to use it in CUDA Fortran implementation.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Is that going to be supported upstream? clang is a CUDA compiler so flang can be as well, but since offloading in flang is already using my driver code (AFAIK) it might be easier to just make it common.

Copy link
Contributor

@clementval clementval Nov 6, 2024

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yes it 's gonna be supported upstream. We are actively working on all the pieces.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Alright, any changes I make probably won't impact this since it'll be a separate build. But I would greatly prefer if we stop having many different ways of doing things in the LLVM offloading space.

FortranRuntime/CMakeLists.txt Outdated Show resolved Hide resolved
llvm/runtimes/CMakeLists.txt Outdated Show resolved Hide resolved
runtimes/CMakeLists.txt Outdated Show resolved Hide resolved
llvm/runtimes/CMakeLists.txt Outdated Show resolved Hide resolved
Copy link
Member

@ldionne ldionne left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think the global setting of flags is the last comment I have from the perspective of "integration with the runtimes build". I haven't reviewed anything else inside flang-rt/ though.

flang-rt/CMakeLists.txt Outdated Show resolved Hide resolved
@jplehr
Copy link
Contributor

jplehr commented Oct 31, 2024

I tested this locally and it appears that it requires are more modern CMake version than what was installed (3.22).
According to the LLVM docs (https://releases.llvm.org/12.0.0/docs/GettingStarted.html#id8) currently CMake 3.20 is the minimum required version.

@Meinersbur Meinersbur requested a review from jhuber6 November 6, 2024 14:36
flang-rt/CMakeLists.txt Outdated Show resolved Hide resolved
flang-rt/CMakeLists.txt Outdated Show resolved Hide resolved
llvm/runtimes/CMakeLists.txt Show resolved Hide resolved
llvm/cmake/modules/LLVMExternalProjectUtils.cmake Outdated Show resolved Hide resolved
@jhuber6 jhuber6 changed the title [Flang] LLVM_ENABLE_RUNTIMES=FortranRuntime [Flang] LLVM_ENABLE_RUNTIMES=flang-rt Nov 6, 2024
@Meinersbur Meinersbur requested a review from sscalpone November 7, 2024 15:48
@jhuber6
Copy link
Contributor

jhuber6 commented Nov 7, 2024

Overall I think the patch is fine pending some naming nits, my one concern is the -U__GLIBCXX stuff, because undefining internal vars seems really sketchy. Do we use -nostdlib++ to make sure we don't link the C++ library?

@jeanPerier
Copy link
Contributor

jeanPerier commented Nov 8, 2024

When building the runtime out of tree and running check-flang-rt, I am seeing an error about LLVM header not found

edit: My -DLLVM_DIR was incorrect, never mind.

clementval and others added 24 commits January 2, 2025 17:02
Introduce cuf.sync_descriptor to be used to sync device global
descriptor after pointer association.

Also move CUFCommon so it can be used in FIRBuilder lib as well.
Convert the op to a new entry point in the runtime
`CUFSyncGlobalDescriptor`
#121292)

This extension adds 12 instructions that conditionally load an immediate
value.

The current spec can be found at:
https://github.com/quic/riscv-unified-db/releases/latest

This patch adds assembler only support.
R_RISCV_{ADD*/SUB*} relocations are kept only when feature relax
enabled. So it is better to add relax to the test, so that relocs can be
reserved for processing by the jitlink. That's what this test really
wants to test.
Change the global variable reference to a member access of another
variable `ctx`. In the future, we may pass through `ctx` to functions to
eliminate global variables.

Pull Request: #119835
Fixes a regression introduced in commit
da00c60

This functionality was originally added in commit
5834996

Co-authored-by: Tomasz Kaminski <tomasz.kaminski@sonarsource.com>
The commit 22b7b84
made the symbols provided by shared libraries "defined",
and thus effectively made it impossible to generate non-pie
dynamically linked executables using
--unresolved-symbols=import-dynamic.

This commit, based on #109249,
fixes it by checking sym->isShared() explictly.
(as a bonus, you don't need to rely on
--unresolved-symbols=import-dynamic
anymore.)

Fixes #107387
Code sequence for tls-desc in large code model is not expected to be
scheduled according to psABI 2.30.

A later commit will fix it.
This ensures these classes are visible only to the appropriate
translation unit and allows for more optimizations.
The debug section map was using MachO section names (with the "__" prefix), but
DWARFContext expects section names with the object format prefix stripped off.
This was preventing DWARFContext from accessing the debug_str section,
resulting in bogus source name strings.
It wraps the body of namespace with additional newlines, turning this code:
```
namespace N {
int function();
}
```
into the following:
```
namespace N {

int function();

}
```

---------

Co-authored-by: Owen Pan <owenpiano@gmail.com>
This optimization was introduced by #70469.

Like AArch64, we allow tail expansions for 3 on RV32 and 3/5/6
on RV64.

This can simplify the comparison and reduce the number of blocks.
Similar to a9e75b1: During MachOPlatform bootstrap we need to defer actions
until essential platform functionality has been loaded, but the platform itself
may be loaded under a concurrent dispatcher so we have to guard against the
deferred actions vector being accessed concurrently.

This fixes a probablistic failure in the ORC runtime regression tests on
Darwin/x86-64 that was spotted after edca1d9 (which turned on concurrent
linking by default in llvm-jitlink).
The gcov version is set to 11.1 (compatible with gcov 9) even if
`-Xclang -coverage-version=` specified version is less than 11.1.

Therefore, we can drop producer support for version < 11.1.
`RegisterClassInfo::getRegPressureSetLimit` is a wrapper of
`TargetRegisterInfo::getRegPressureSetLimit` with some logics to
adjust the limit by removing reserved registers.

It seems that we shouldn't use
`TargetRegisterInfo::getRegPressureSetLimit`
directly, just like the comment "This limit must be adjusted
dynamically for reserved registers" said.

Separate from #118787
All the sources of `llvm-min-tblgen` are also used for `llvm-tblgen`,
with identical compilation flags. Reuse the object files of
`llvm-min-tblgen` for `llvm-tblgen` by applying the usual source
structure of an executable: One file per executable which named after
the executable name containing the (in this case trivial) main function,
which just calls the tblgen_main in TableGen.cpp. This should also clear
up any confusion (including mine) of where each executable's main
function is.

While this slightly reduces build time, the main motivation is ccache.
Using the hard_link
option, building the object files for `llvm-tblgen` will result in a
hard link to the same object file already used for `llvm-min-tblgen`. To
signal the build system that the file is new, ccache will update the
file's time stamp. Unfortunately, time stamps are shared between all
hard-linked files s.t. this will indirectly also update the time stamps
for the object files used for `llvm-tblgen`. At the next run, Ninja will
recognize this time stamp discrepancy to the expected stamp recorded in
`.ninja_log` and rebuild those object files for `llvm-min-tblgen`, which
again will also update the stamp for the `llvm-tblgen`... . This is
especially annoying for tablegen because it means Ninja will re-run all
tablegenning in every build.

I am using the hard_link option because it reduces the cost of having
multiple build-trees of the LLVM sources and reduces the wear to the SSD
they are stored on.
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.