Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Sycl web #14302

Closed
wants to merge 3,359 commits into from
Closed

Sycl web #14302

wants to merge 3,359 commits into from

Conversation

iagarwa
Copy link
Contributor

@iagarwa iagarwa commented Jun 26, 2024

Did get my changes here. so created draft

paperchalice and others added 30 commits June 22, 2024 17:34
…to an RAII class (#94854)

Modify MachineFunctionProperties in PassModel makes `PassT P;
P.run(...);` not work properly. This is a necessary compromise.
… (#95025)

In ContinuationIndenter::mustBreak, a break is required between a
template declaration and the function/class declaration it applies to,
if the template declaration spans multiple lines.

However, this also includes template template parameters, which can
cause extra erroneous line breaks in some declarations.

This patch makes template template parameters not be counted as template
declarations.

Fixes llvm/llvm-project#93793
Fixes llvm/llvm-project#48746
…(#96384)

Buildbot `clang-ppc64le-rhel` failed with:
```sh
error: 'MFPropsModifier' may not intend to support class template argument deduction [-Werror,-Wctad-maybe-unsupported]
note: add a deduction guide to suppress this warning
```
after #94854. This PR adds deduction guide explicitly to suppress
warning.
When unifying the ResolveExecutable implementations in #96256, I missed
that RemoteAwarePlatform was able to resolve executables more
aggressively. The host platform can rely on the current working
directory to make relative paths absolute and resolve things like home
directories.

This should fix command-target-create-resolve-exe.test.
This formatter doesn't currently provide much value. It only formats
`SourceLocation` and `QualType`. The only formatting it does for
`QualType` is call `getAsString()` on it.

The main motivator for the removal however is that the formatter
implementation can be very slow (since it uses the expression evaluator
in non-trivial ways).

Not infrequently do we get reports about LLDB being slow when debugging
Clang, and it turns out the user was loading `ClangDataFormat.py` in
their `.lldbinit` by default.

We should eventually develop proper formatters for Clang data-types, but
these are currently not ready. So this patch removes them in the
meantime to avoid users shooting themselves in the foot, and giving the
wrong impression of these being reference implementations.
Fold `mul (uitofp i1 X), Y` to `select i1 X, Y, 0.0` when the `mul` is
`nnan` and `nsz`

Proof: https://alive2.llvm.org/ce/z/_stiPm
We're ultimately expected to return an APValue simply pointing to
the CallExpr, not any useful value. Do that by creating a global
variable for the call.
…k. NFC

This was added by 507efbc
([MC] Fold A-B when A is a pending label or A/B are separated by a
MCFillFragment) to account for pending labels and is now unneeded after
the removal of pending labels (7500646).
The checks when building a thunk to decide if an arg needed to be cast
to/from an integer or redirected via a pointer didn't match how arg
types were changed in `canonicalizeThunkType`, this caused LLVM to ICE
when using vector types as args due to incorrect types in a call
instruction.

Instead of duplicating these checks, we should check if the arg type
differs between x64 and AArch64 and then cast or redirect as
appropriate.
… (#96396)

Reapply 4a7bf42
which was reverted in 34d44eb

Not sure why there are tests elsewhere in clang that rely on the output
of clang-format, but they were wrong
…at_provider (#95704)

The original implementation of HelperFunctions::consumeHexStyle always
sets Style when it returns true, but this is difficult for a compiler
to understand since it requires seeing that Str starts
with either an "x" or an "X" when starts_with_insensitive("x")
return true.
In particular, g++ 12 warns that HS may be used uninitialized
in the format_provider::format caller.

Change HelperFunctions::consumeHexStyle to return an optional
HexPrintStyle and to make the fact that Str necessarily starts
with an "X" when all other cases do not apply more explicit.
This helps both the compiler and the human reader of the code.

Co-authored-by: Sven Verdoolaege <[email protected]>
#95197 and 7500646 eliminated all raw
`new MCXXXFragment`. We can now place fragments in a bump allocator.
In addition, remove the dead `Kind == FragmentType(~0)` condition.

~CodeViewContext may call `StrTabFragment->destroy()` and need to be
reset before `FragmentAllocator.Reset()`.
Tested by llvm/test/MC/COFF/cv-compiler-info.ll using asan.

Pull Request: llvm/llvm-project#96402
https://reviews.llvm.org/D67249 added content hash (see
-fvalidate-ast-input-files-content) using llvm::hash_code (size_t).
The hash value is 32-bit on 32-bit systems, which was unintentional.

Fix #96379: #96136 switched the hash function to xxh3_64bit but did not
update the ContentHash type, leading to mismatch between ASTReader and
ASTWriter.
This change is part of this proposal:
https://discourse.llvm.org/t/rfc-all-the-math-intrinsics/78294

This is part 3 of 4 PRs. It sets the ground work for using the
intrinsics in HLSL.

Add HLSL frontend apis for `acos`, `asin`, `atan`, `cosh`, `sinh`, and
`tanh`
llvm/llvm-project#70079
llvm/llvm-project#70080
llvm/llvm-project#70081
llvm/llvm-project#70083
llvm/llvm-project#70084
llvm/llvm-project#95966
…n-constants

If f(Y) simplifies to Y, replace with Y. This requires Y to be
non-undef.

Closes #94719
Follow-up to 05ba5c0. uint32_t is
preferred over const MCExpr * in the section stack uses because it
should only be evaluated once. Change the paramter type to match.
Functions that have the `nvvm.kernel` attribute should have 0 results.
The `gpu.func` op lowering accounts for memref arguments/results (both
"normal" and bare-pointer supported), but the `gpu.return` op lowering
did not. The lowering produced invalid IR that did not verify.

This commit uses the same lowering strategy as for `func.return` in the
`gpu.return` lowering. (The C++ implementation is copied. We may want to
share some code between `func` and `gpu` lowerings in the future.)
Define subtarget features for atomic fmin/fmax support.

The flat/global support is a real messe. We had float/double support at
the beginning in gfx6 and gfx7. gfx8 removed these. gfx10 reintroduced them.
gfx11 removed the f64 versions again.

gfx9 partially reintroduced them, in gfx90a and gfx940 but only for f64.
labath and others added 27 commits June 25, 2024 10:52
…IEs (#96484)

If ParseStructureLikeDIE (or ParseEnum) encountered a declaration DIE,
it would call FindDefinitionTypeForDIE. This returned a fully formed
type, which it achieved by recursing back into ParseStructureLikeDIE
with the definition DIE.

This obscured the control flow and caused us to repeat some work (e.g.
the UniqueDWARFASTTypeMap lookup), but it mostly worked until we tried
to delay the definition search in #90663. After this patch, the two
ParseStructureLikeDIE calls were no longer recursive, but rather the
second call happened as a part of the CompleteType() call. This opened
the door to inconsistencies, as the second ParseStructureLikeDIE call
was not aware it was called to process a definition die for an existing
type.

To make that possible, this patch removes the recusive type resolution
from this function, and leaves just the "find definition die"
functionality. After finding the definition DIE, we just go back to the
original ParseStructureLikeDIE call, and have it finish the parsing
process with the new DIE.

While this patch is motivated by the work on delaying the definition
searching, I believe it is also useful on its own.
…#96514)

This PR fixes llvm/llvm-project#96513.

The way of creation of array type constant was incorrect: instead of
creating [1, 1, 1] or [1, 1, 1, 1, 1, ....] constants, the same [1]
constant was always created, substituting original composite constants.
This in its turn led to a situation when only one of constants might
exist in the code without emitting invalid code, the second constant
would be eventually rewritten to the first constant, because a key to
address both was an array of a single element (like [1]).

This PR fixes the issue and purges from the code unneeded copy/pasted
clone of the function that creates an array constant.
…nes (#95269)

This patch extends #73964 and adds optimisation of load SVE intrinsics
when predicate is zero.
The handling of `PointerType` is similar to `HeapType`. The only
difference is that allocated flag is generated for `HeapType` and
associated flag for `PointerType`. The tests for pointer to allocatable
strings are disabled for now. I will enable them once #95906 is merged.

The debugging in GDB looks like this:
    
      integer, pointer :: par2(:)
      integer, target, allocatable :: ar2(:) 
      integer, target :: sc
      integer, pointer :: psc
      allocate(ar2(4))
      par2 => ar2
      psc => sc
    
    19        par2 => ar2
    (gdb) p par2
    $3 = <not associated>
    (gdb) n
    20        do i=1,5
    (gdb) p par2
    $4 = (0, 0, 0, 0)
    (gdb) ptype par2
    type = integer (4)
    (gdb) p sc
    $5 = 3
    (gdb) p psc
    $6 = (PTR TO -> ( integer )) 0x7fffffffda24
    (gdb) p *psc
    $7 = 3
…ing for generic types (#89217)

This patch is intended to be the first of a series with end goal to
adapt atomic optimizer pass to support i64 and f64 operations (along
with removing all unnecessary bitcasts). This legalizes 64 bit readlane,
writelane and readfirstlane ops pre-ISel

---------

Co-authored-by: vikramRH <[email protected]>
This change adds methods like buildGetFPEnv and similar for opcodes that
represent manipulation on floating-point state.
This changes the behaviour in C++03 mode because we'll now use the
builtin on Clang, but I don't think that's much of a problem.
This header used three-space indentation in a number of places.
Reformat it completely.
This FIXME has already been addressed in #89358
Instead for iterating over all VFs when computing costs, simply iterate
over the VFs available in the created VPlans.

Split off from llvm/llvm-project#92555.

This also prepares for moving the check if any vector instructions will
be generated to be based on VPlan, to unblock recommitting
llvm/llvm-project#92555.
Without the store, the vector loop body is empty. Add a store to avoid
that, while not impacting the induction resume values that are created.
This patch implements lowering of the GlobalAddress, BlockAddress,
JumpTable and BR_JT. Also patch adds legal support of the BR_CC
operation for i32 type.
Some of these are just old, while others previously did not use
UTC due to missing features that have since been implemented
(such as signature matching).
Since we mark the pseudos as mayLoad but do not provide any MMOs,
isSafeToMove conservatively returns false, stopping MachineLICM from
hoisting the instructions. PseudoLA_TLS_{LD,GD} does not actually expand
to a load, so stop marking that as mayLoad to allow it to be hoisted,
and for the others make sure to add MMOs during lowering to indicate
they're GOT loads and thus can be freely moved.
… (#95061)

This patch augments the HIPAMD driver to allow it to target AMDGCN
flavoured SPIR-V compilation. It's mostly straightforward, as we re-use
some of the existing SPIRV infra, however there are a few notable
additions:

- we introduce an `amdgcnspirv` offload arch, rather than relying on
using `generic` (this is already fairly overloaded) or simply using
`spirv` or `spirv64` (we'll want to use these to denote unflavoured
SPIRV, once we bring up that capability)
- initially it is won't be possible to mix-in SPIR-V and concrete AMDGPU
targets, as it would require some relatively intrusive surgery in the
HIPAMD Toolchain and the Driver to deal with two triples
(`spirv64-amd-amdhsa` and `amdgcn-amd-amdhsa`, respectively)
- in order to retain user provided compiler flags and have them
available at JIT time, we rely on embedding the command line via
`-fembed-bitcode=marker`, which the bitcode writer had previously not
implemented for SPIRV; we only allow it conditionally for AMDGCN
flavoured SPIRV, and it is handled correctly by the Translator (it ends
up as a string literal)

Once the SPIRV BE is no longer experimental we'll switch to using that
rather than the translator. There's some additional work that'll come
via a separate PR around correctly piping through AMDGCN's
implementation of `printf`, for now we merely handle its flags
correctly.
  CONFLICT (content): Merge conflict in llvm/lib/Target/AMDGPU/AMDGPUTargetMachine.cpp
  CONFLICT (content): Merge conflict in clang/test/Driver/sycl-linker-wrapper-image.cpp
  CONFLICT (content): Merge conflict in clang/lib/Driver/Driver.cpp
  CONFLICT (content): Merge conflict in clang/lib/Driver/ToolChains/HIPAMD.cpp
Test needs update after cbf6e93 2024-05-28 [clang codegen] Delete
unnecessary GEP cleanup code. (#90303).

Change made by  @premanandrao
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.