Match all TLDs in TextComponent url Pattern #3719

Open
wants to merge 2 commits into
base: master

Outfluencer
Collaborator

A list of all TLDs can be found here (the longest is 24 characters): http://data.iana.org/TLD/tlds-alpha-by-domain.txt

Example of a domain that uses the longest TLD and has a live website (decoded, it is https://whois.nic.vermögensberatung/, containing the German letter ö):
https://whois.nic.xn--vermgensberatung-pwb/
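
The relationship between the decoded and encoded forms of that TLD can be checked with the JDK's `java.net.IDN` class. This is only an illustrative sketch: the class name `TldPunycodeSketch` and its helper methods are hypothetical and are not part of this pull request.

```java
import java.net.IDN;

public class TldPunycodeSketch {
    // Converts a Unicode domain label to its ASCII-compatible (punycode) form.
    static String toAscii(String label) {
        return IDN.toASCII(label);
    }

    // Converts an ASCII-compatible label back to its Unicode form.
    static String toUnicode(String label) {
        return IDN.toUnicode(label);
    }

    public static void main(String[] args) {
        // \u00f6 is the German letter "o with umlaut"; the escape avoids source-encoding issues.
        String unicodeTld = "verm\u00f6gensberatung";
        String asciiTld = toAscii(unicodeTld);
        System.out.println(asciiTld + " has " + asciiTld.length() + " characters");
        System.out.println(toUnicode(asciiTld));
    }
}
```

The encoded form is what a pattern restricted to ASCII letters and dashes would have to match, which is why the 24-character limit and the dash matter here.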

@md-5
Member

md-5 commented Aug 5, 2024

Is this necessary? It'll dramatically expand what is automatically linked.

@Janmm14
Contributor

Janmm14 commented Aug 5, 2024

I think we could support longer TLDs, but only when the URL starts with http(s) or www, since a missed space after a dot happens often in chat.
I do not think it is feasible to check the matched TLD against a set of known TLDs, as the number of TLDs keeps increasing (the list would need updating every now and then) and many complete words are TLDs in their own right nowadays.

Also, I do not think that regex matches umlauts or other special characters in TLDs.

@Outfluencer
Collaborator Author

Outfluencer commented Aug 5, 2024

I will change it so that, if the matcher has found http(s), the minimum second-level domain length is reduced to 1, so https://x.com would work. Currently the minimum is 2 characters.

The allowed top-level domain length will be increased to 2-24 characters, and dashes will be allowed, as some long TLDs contain them.

"^(?:https?://([-\w_\.]+\.[a-z-]{2,24})|([-\w_\.]{2,}\.[a-z]{2,4}))(/\S*)?$"

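A quick way to see what the proposed pattern would and would not link is to run it against a few sample strings. The sketch below is an assumption-laden illustration, not the code in this pull request: the class name `UrlPatternSketch` and the helper `isUrl` are hypothetical, and the pattern is assumed to be compiled without any flags.

```java
import java.util.regex.Pattern;

public class UrlPatternSketch {
    // The pattern quoted in the comment above, compiled without flags (an assumption).
    static final Pattern URL = Pattern.compile(
            "^(?:https?://([-\\w_\\.]+\\.[a-z-]{2,24})|([-\\w_\\.]{2,}\\.[a-z]{2,4}))(/\\S*)?$");

    // Returns true when the whole string is recognised as a URL by the pattern.
    static boolean isUrl(String text) {
        return URL.matcher(text).matches();
    }

    public static void main(String[] args) {
        String[] samples = {
            "https://x.com",                               // one-character domain, with scheme
            "x.com",                                       // one-character domain, no scheme
            "example.com",                                 // short TLD, no scheme
            "hello.world",                                 // five-letter "TLD", no scheme
            "https://whois.nic.xn--vermgensberatung-pwb/", // 24-character TLD with dashes
            "https://whois.nic.verm\u00f6gensberatung/"    // decoded form containing an umlaut
        };
        for (String sample : samples) {
            System.out.println(sample + " -> " + isUrl(sample));
        }
    }
}
```

With a scheme, any TLD of 2 to 24 lowercase letters or dashes is accepted; without one, only 2 to 4 letter TLDs are, so a missed space such as `hello.world` is not linked. The decoded umlaut form is still rejected because the TLD character class only covers `a-z` and `-`, which is consistent with the concern raised earlier in the thread.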

3 participants