Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Apply DataArray patch and update Bukkit adapters. #2181

Open
wants to merge 1 commit into
base: v3
Choose a base branch
from

Conversation

JayemCeekay
Copy link
Contributor

Overview

Fixes #1938

Description

This pull request implements the patch for the use of type ambiguous DataArrays created by @SirYwell. This is necessary for the upcoming implementation of updated Forge and Fabric adapters in order to allow the registration and use of more than 65535 blockstates.

Submitter Checklist

  • Make sure you are opening from a topic branch (/feature/fix/docs/ branch (right side)) and not your main branch.
  • Ensure that the pull request title represents the desired changelog entry.
  • New public fields and methods are annotated with @since TODO.
  • I read and followed the contribution guidelines.

@JayemCeekay JayemCeekay requested a review from a team as a code owner April 17, 2023 16:00
@github-actions github-actions bot added the Feature This PR adds a new feature label Apr 17, 2023
@SirYwell SirYwell requested a review from dordsor21 April 17, 2023 16:25
@SirYwell
Copy link
Member

SirYwell commented Jun 1, 2023

I updated this branch to reflect the latest changes on main

@SirYwell SirYwell changed the base branch from main to v3 August 18, 2023 18:34
@SirYwell SirYwell force-pushed the feature/DataArrays branch from f84b635 to 95bc238 Compare August 18, 2023 19:20
@SirYwell
Copy link
Member

Retargeted to v3 and rebased, though I need to re-review the changes myself because there were some ugly merge conflicts that probably broke something.

@github-actions
Copy link

Please take a moment and address the merge conflicts of your pull request. Thanks!

1 similar comment
@github-actions
Copy link

Please take a moment and address the merge conflicts of your pull request. Thanks!

Copy link
Member

@dordsor21 dordsor21 left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I kinda wonder if we should just move entirely to int based arrays. The amount of extra method calls for setting blocks required it quite substantial and will have a performance impact. I don't really see the memory impact being particularly large.

Also, clipboards will need to be investigated as they are very char based (and moving to an int-based system will halve the capacity of disk-based solutions)

@@ -894,9 +881,9 @@ public char[] update(int layer, char[] data, boolean aggressive) {
} else {
// The section's palette is the global block palette.
for (int i = 0; i < 4096; i++) {
char paletteVal = data[i];
char paletteVal = (char) data.getAt(i);
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think this (and the equivalent in all other adapters) should use int rather than char, else, the method breaks down when using an IntDataArray. Methods that return char should also be changed to int in adapters (e.g. adaptToChar, adaptToChar etc. The cached char[] array for ibdToStateOrdinal can also just be int[] - the memory impact is pretty negligible (in the 100s of KB)

@@ -236,7 +237,7 @@ public <T extends Future<T>> T call(IChunkSet set, Runnable finalize) {
public char get(int x, int y, int z) {
final int layer = (y >> 4) - getMinSectionPosition();
final int index = (y & 15) << 8 | z << 4 | x;
return blocks[layer][index];
return (char) blocks[layer].getAt(index);
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

likewise move to int for this method in all adapters too

CachedBukkitAdapter adapter,
short[] nonEmptyBlockCount
) {
short nonAir = 4096;
int num_palette = 0;
for (int i = 0; i < 4096; i++) {
char ordinal = set[i];
char ordinal = (char) set.getAt(i);
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This should all be moved to int as well

if (set == null || get == null) {
return false;
}
char ordinal;
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

use int

});

public final CleanableThreadLocal<int[]> PALETTE_TO_BLOCK_INT = new CleanableThreadLocal<>(
() -> new int[Character.MAX_VALUE + 1], a -> {
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This should probably be initialised to block types cache size

* Convert raw int array to palette
* @return palette
*/
public Palette toPalette(int layerOffset, int[] blocks) {
Copy link
Member

@dordsor21 dordsor21 Nov 18, 2023

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

These methods being removed should be deprecated for removal on main branch

protected int maxSectionPosition;
protected int sectionCount;

static BiomeType getBiomeType(
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'm not a particular fan of having a random biome method here, I'm not really sure that this class is needed and I think it ends up adding more clutter than it aims to remove

* Equivalent to {@link CharSetBlocks} without any attempt to make thread-safe for improved performance.
* This is currently only used as a "copy" of {@link CharSetBlocks} to provide to
* Equivalent to {@link DataArraySetBlocks} without any attempt to make thread-safe for improved performance.
* This is currently only used as a "copy" of {@link DataArraySetBlocks} to provide to
* {@link com.fastasyncworldedit.core.queue.IBatchProcessor} instances for processing without overlapping the continuing edit.
*
* @since 2.6.2
Copy link
Member

@dordsor21 dordsor21 Nov 18, 2023

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I suppose this should be renamed to ThreadUnsafeDataArrayBlocks

@@ -485,6 +477,23 @@ default IChunkSet processSet(IChunk chunk, IChunkGet get, IChunkSet set, boolean
}
}

private boolean isProcessExtra(IChunkSet set, boolean processExtra, int layer, DataArray arr) {
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

shouldProcessExtra?

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Should probably actually call the method sometihng that implies it's doing something to the data at all

Copy link

Please take a moment and address the merge conflicts of your pull request. Thanks!

@SirYwell
Copy link
Member

The casts to char should only be present in the bukkit module, but I'm fine with changing it to int there too.

I kinda wonder if we should just move entirely to int based arrays. The amount of extra method calls for setting blocks required it quite substantial and will have a performance impact. I don't really see the memory impact being particularly large.

I don't expect the methods to have measurable overhead over direct array access (we have one additional memory indirection until we have value classes, the methods are small, the JVM can inline them easily). Memory overhead of always using int[] might be fine, but that also mean more cache misses. It also might be interesting in future to explore Foreign Memory based approaches, in which case the DataArray abstraction would be useful too. So I think it's not worth to not have the DataArray abstraction.

Also, clipboards will need to be investigated as they are very char based (and moving to an int-based system will halve the capacity of disk-based solutions)

I started investigating that in https://github.com/IntellectualSites/FastAsyncWorldEdit/tree/feature/disk-based-clipboard, but there are more things to consider.

@dordsor21
Copy link
Member

dordsor21 commented Nov 20, 2023

The casts to char should only be present in the bukkit module, but I'm fine with changing it to int there too.

I'm not sure that's something we can 100% assume with datapacks, etc?

I don't expect the methods to have measurable overhead over direct array access (we have one additional memory indirection until we have value classes, the methods are small, the JVM can inline them easily). Memory overhead of always using int[] might be fine, but that also mean more cache misses. It also might be interesting in future to explore Foreign Memory based approaches, in which case the DataArray abstraction would be useful too. So I think it's not worth to not have the DataArray abstraction.

I'm not sure if an increased cache miss likelihood is really a large performance hit? I already assume that large edits that cannot be done per-chunk (large clipboard operations, any recursive operations) are already missing L1 (and very possibly L2 and L3 cache too) as the number of chunks being loaded and re-loaded is quite large. And then if a whole chunk's block arrays (max 200 KiB with char, 400 with int) are missing L3 cache we should be attempting to do something about that. Besides, I doubt that most servers have sole access to a single CPU to be able to be making full using the L1-3 cache anyway, and I would expect lots of cache misses due to running websites, other servers, services etc anyway. Switching to all int[] is more maintainable too

I started investigating that in feature/disk-based-clipboard, but there are more things to consider.

Ah yeah I'd forgotten about that, looks to make it work

@SirYwell
Copy link
Member

I'm not sure that's something we can 100% assume with datapacks, etc?

AFAIK it's still not possible to have additional block types/states. Custom biomes are possible though.

I'm not sure if an increased cache miss likelihood is really a large performance hit

Most likely not, my point was more that we'd need to measure to really know the impact here (and even then, there are many parts that probably aren't true anymore in the next Java release).
I'd really like to keep the DataArray abstraction, there are a few things that are far easier to understand with that approach (e.g. avoiding System.arraycopy in the middle of already complex code, filling arrays, etc). It generally is more future-proof and will potentially allow for a bunch of optimizations in future.

Copy link

Please take a moment and address the merge conflicts of your pull request. Thanks!

@SirYwell SirYwell force-pushed the v3 branch 2 times, most recently from a0de1d0 to 9fc47e9 Compare October 26, 2024 09:26
@SirYwell
Copy link
Member

I made this PR compatible with the Vector API changes now - somewhat hacky, but seems to work.

Copy link
Member

@dordsor21 dordsor21 left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Most of previous comments still apply

@@ -188,6 +188,7 @@ public synchronized void init(Extent extent, IChunkCache<IChunkGet> get, IChunkC
}
}


Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Remove newline


@Override
public <V> void processSet(final DataArray get, final BinaryOperator<? extends Vector<V>> op) {
// TODO is this sane???
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think the alternative would be to litter generics for vectors through a lot of classes (if that's even possible?)

@@ -184,18 +183,12 @@ public char get(int x, int y, int z) {
return defaultOrdinal;
}
final int index = (y & 15) << 8 | z << 4 | x;
return blocks[layer - minSectionPosition][index];
return (char) blocks[layer - minSectionPosition].getAt(index);
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Needs moving to int

@@ -485,6 +477,23 @@ default IChunkSet processSet(IChunk chunk, IChunkGet get, IChunkSet set, boolean
}
}

private boolean isProcessExtra(IChunkSet set, boolean processExtra, int layer, DataArray arr) {
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Should probably actually call the method sometihng that implies it's doing something to the data at all

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
Feature This PR adds a new feature v3
Projects
3 participants