Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Add some unsafe magic for minimal optimizations, add .idea to .gitignore and fix some typos #71

Closed
wants to merge 5 commits into from

Conversation

Kleinmarb
Copy link

Thanks for the amazing contribution to open source url parsing!

I am using this lib right now and thought that I should look into it a bit,
I ended up doing minor changes.

I understand if the unsafe code gets rejected, but I even left some safety comments.

Copy link
Member

@anonrig anonrig left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Nice job! Can you double check if we have tests covering the changed lines? Just in case.

}

// Safety: Prior to transmuting we checked if value is bigger than 2
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Can you add the mapping (number to enum) as a comment to here? Let's not also remove the knowledge we have since it might not be obvious at the first glance.

}

// Safety: Prior to transmuting we checked if value is bigger than 6
// and returned the default in that case.
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Can you add the mapping as a comment here as well?

@anonrig
Copy link
Member

anonrig commented Aug 2, 2024

Also can you fix clippy errors?

src/lib.rs Outdated

// Safety: Prior to transmuting we checked if value is bigger than 2
// and returned the default in that case.
unsafe { std::mem::transmute(value) }
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It appears that it is extremely unsafe. Ref: https://x.com/tenellous/status/1819506534675894296

Is there any benchmark result you could share with us, that makes this change/risk worthy to land?

@Kleinmarb
Copy link
Author

I added tests and comments that illustrate which number maps to which enum variant.

Regarding the clippy errors, I don't get any.

Now let's talk about performance:
If I'm honest there will be a performance difference but it will be negligible.
But the code I wrote is 100% safe, you can trust me on that one since I added a#[repr(u32)] on the enum which allows for safe transmution between value as u32 and the enum as long as value is not bigger than the number of variants of the enum -1, but we check for that.

If you still think its risky, just deny the commit, it's alright for me.

Copy link
Collaborator

@obi1kenobi obi1kenobi left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'm not the maintainer so this isn't my call, but for my taste this isn't worth it when the same exact benefit can be obtained without any unsafe.

Comment on lines +83 to +91
// Safety: Prior to transmuting we checked if value is bigger than 2
// and returned the default in that case.
//
// Mapping:
// - 0 => Domain
// - 1 => IPV4
// - 2 => IPV6
// - any other number maps to NotSpecial
unsafe { std::mem::transmute(value as u32) }
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Suggested change
// Safety: Prior to transmuting we checked if value is bigger than 2
// and returned the default in that case.
//
// Mapping:
// - 0 => Domain
// - 1 => IPV4
// - 2 => IPV6
// - any other number maps to NotSpecial
unsafe { std::mem::transmute(value as u32) }
match value {
0 => Self::Domain,
1 => Self::IPV4,
2 => Self::IPV6,
_ => Self::Domain,
}

If you replace the transmute with the same match block as before, the generated code (and therefore the performance) will be identical: https://x.com/tenellous/status/1819528511201464821

I'm all for performance improvement, but here there's only unsafe risk without any benefit.

Copy link
Member

@anonrig anonrig left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@Kleinmarb
Copy link
Author

Oh ok I should have done the assembly check myself. Thanks for pointing out that it produces the same exact instructions.

@Kleinmarb Kleinmarb closed this Sep 13, 2024
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

3 participants