Class InternetDomainName
- java.lang.Object
- com.google.common.net.InternetDomainName
@Beta @GwtCompatible public final class InternetDomainName extends java.lang.Object
An immutable well-formed internet domain name, such as com or foo.co.uk. Only syntactic analysis is performed; no DNS lookups or other network interactions take place. Thus there is no guarantee that the domain actually exists on the internet.
One common use of this class is to determine whether a given string is likely to represent an addressable domain on the web; that is, for a candidate string "xxx", might browsing to "http://xxx/" result in a webpage being displayed? In the past, this test was frequently done by determining whether the domain ended with a public suffix but was not itself a public suffix. However, this test is no longer accurate. There are many domains which are both public suffixes and addressable as hosts; "uk.com" is one example. As a result, the only useful test to determine if a domain is a plausible web host is hasPublicSuffix(). This will return true for many domains which (currently) are not hosts, such as "com", but given that any public suffix may become a host without warning, it is better to err on the side of permissiveness and thus avoid spurious rejection of valid sites.
During construction, names are normalized in two ways:
- ASCII uppercase characters are converted to lowercase.
- Unicode dot separators other than the ASCII period ('.') are converted to the ASCII period.
The normalized values will be returned from toString() and parts(), and will be reflected in the result of equals(Object).
Internationalized domain names such as 网络.cn are supported, as are the equivalent IDNA Punycode-encoded versions.
- Since:
- 5.0
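The two normalization steps described above can be sketched in plain Java. This is an illustration only, not the class's implementation: the class name DomainNormalizationSketch, the method normalize, and the particular set of Unicode dot separators (U+3002, U+FF0E, U+FF61) are assumptions made for the example.

```java
final class DomainNormalizationSketch {
    // Assumed set of non-ASCII dot separators: ideographic full stop,
    // fullwidth full stop, and halfwidth ideographic full stop.
    private static final String UNICODE_DOTS = "\u3002\uFF0E\uFF61";

    // Lowercases ASCII letters and maps Unicode dots to '.'; all other
    // characters (including non-ASCII letters) are copied unchanged.
    static String normalize(String name) {
        StringBuilder out = new StringBuilder(name.length());
        for (int i = 0; i < name.length(); i++) {
            char c = name.charAt(i);
            if (UNICODE_DOTS.indexOf(c) >= 0) {
                out.append('.');
            } else if (c >= 'A' && c <= 'Z') {
                out.append((char) (c - 'A' + 'a'));
            } else {
                out.append(c);
            }
        }
        return out.toString();
    }

    public static void main(String[] args) {
        System.out.println(normalize("WWW.Example.COM"));
        System.out.println(normalize("foo\u3002co\uFF0Euk"));
    }
}
```

Because only ASCII letters are lowercased, an internationalized name passes through this sketch with its non-ASCII characters untouched.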
-
-
Field Summary
Fields
Modifier and Type | Field | Description
private static CharMatcher | DASH_MATCHER |
private static Joiner | DOT_JOINER |
private static java.lang.String | DOT_REGEX |
private static Splitter | DOT_SPLITTER |
private static CharMatcher | DOTS_MATCHER |
private static int | MAX_DOMAIN_PART_LENGTH | Maximum size of a single part of a domain name.
private static int | MAX_LENGTH | Maximum length of a full domain name, including separators, and leaving room for the root label.
private static int | MAX_PARTS | Maximum parts (labels) in a domain name.
private java.lang.String | name | The full domain name, converted to lower case.
private static int | NO_PUBLIC_SUFFIX_FOUND | Value of publicSuffixIndex which indicates that no public suffix was found.
private static CharMatcher | PART_CHAR_MATCHER |
private ImmutableList<java.lang.String> | parts | The parts of the domain name, converted to lower case.
private int | publicSuffixIndex | The index in the parts() list at which the public suffix begins.
-
Constructor Summary
Constructors
Constructor | Description
InternetDomainName(java.lang.String name) | Constructor used to implement from(String), and from subclasses.
-
Method Summary
Modifier and Type | Method | Description
private InternetDomainName | ancestor(int levels) | Returns the ancestor of the current domain at the given number of levels "higher" (rightward) in the subdomain list.
InternetDomainName | child(java.lang.String leftParts) | Creates and returns a new InternetDomainName by prepending the argument and a dot to the current name.
boolean | equals(java.lang.Object object) | Equality testing is based on the text supplied by the caller, after normalization as described in the class documentation.
private int | findPublicSuffix() | Returns the index of the leftmost part of the public suffix, or -1 if not found.
static InternetDomainName | from(java.lang.String domain) | Returns an instance of InternetDomainName after lenient validation.
int | hashCode() |
boolean | hasParent() | Indicates whether this domain is composed of two or more parts.
boolean | hasPublicSuffix() | Indicates whether this domain name ends in a public suffix, including if it is a public suffix itself.
boolean | isPublicSuffix() | Indicates whether this domain name represents a public suffix, as defined by the Mozilla Foundation's Public Suffix List (PSL).
boolean | isTopPrivateDomain() | Indicates whether this domain name is composed of exactly one subdomain component followed by a public suffix.
boolean | isUnderPublicSuffix() | Indicates whether this domain name ends in a public suffix, while not being a public suffix itself.
static boolean | isValid(java.lang.String name) | Indicates whether the argument is a syntactically valid domain name using lenient validation.
private static boolean | matchesWildcardPublicSuffix(java.lang.String domain) | Does the domain name match one of the "wildcard" patterns (e.g. "*.ar")?
InternetDomainName | parent() | Returns an InternetDomainName that is the immediate ancestor of this one; that is, the current domain with the leftmost part removed.
ImmutableList<java.lang.String> | parts() | Returns the individual components of this domain name, normalized to all lower case.
InternetDomainName | publicSuffix() | Returns the public suffix portion of the domain name, or null if no public suffix is present.
InternetDomainName | topPrivateDomain() | Returns the portion of this domain name that is one level beneath the public suffix.
java.lang.String | toString() | Returns the domain name, normalized to all lower case.
private static boolean | validatePart(java.lang.String part, boolean isFinalPart) | Helper method for validateSyntax(List).
private static boolean | validateSyntax(java.util.List<java.lang.String> parts) | Validation method used by from(String) to ensure that the domain name is syntactically valid according to RFC 1035.
-
-
-
Field Detail
-
DOTS_MATCHER
private static final CharMatcher DOTS_MATCHER
-
DOT_SPLITTER
private static final Splitter DOT_SPLITTER
-
DOT_JOINER
private static final Joiner DOT_JOINER
-
NO_PUBLIC_SUFFIX_FOUND
private static final int NO_PUBLIC_SUFFIX_FOUND
Value of publicSuffixIndex which indicates that no public suffix was found.
- See Also:
- Constant Field Values
-
DOT_REGEX
private static final java.lang.String DOT_REGEX
- See Also:
- Constant Field Values
-
MAX_PARTS
private static final int MAX_PARTS
Maximum parts (labels) in a domain name. This value arises from the 255-octet limit described in RFC 2181 part 11, combined with the fact that the encoding of each part occupies at least two bytes (dot plus label externally, length byte plus label internally). Thus, if all labels have the minimum size of one byte, 127 of them will fit.
- See Also:
- Constant Field Values
-
MAX_LENGTH
private static final int MAX_LENGTH
Maximum length of a full domain name, including separators, and leaving room for the root label. See RFC 2181 part 11.
- See Also:
- Constant Field Values
-
MAX_DOMAIN_PART_LENGTH
private static final int MAX_DOMAIN_PART_LENGTH
Maximum size of a single part of a domain name. See RFC 2181 part 11.
- See Also:
- Constant Field Values
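The arithmetic behind these three limits can be checked with a small sketch. The concrete values used here (127 parts, 253 characters, 63 characters per part) are assumptions for illustration, since this page links to the constant values without printing them. Internally, 127 one-byte labels take 2 × 127 bytes plus one byte for the root label, which is 255; externally, the same name is 127 characters plus 126 dots, which is 253. The class and method names are hypothetical.

```java
final class DomainLengthLimitsSketch {
    // Assumed values; this page does not print the constants themselves.
    static final int MAX_PARTS = 127;
    static final int MAX_LENGTH = 253;
    static final int MAX_DOMAIN_PART_LENGTH = 63;

    // Checks only the three size limits, not the characters used.
    static boolean withinLimits(String name) {
        if (name.length() > MAX_LENGTH) {
            return false;
        }
        String[] parts = name.split("\\.", -1); // -1 keeps empty parts
        if (parts.length > MAX_PARTS) {
            return false;
        }
        for (String part : parts) {
            if (part.isEmpty() || part.length() > MAX_DOMAIN_PART_LENGTH) {
                return false;
            }
        }
        return true;
    }

    // Builds a name such as "a.a.a" with the given number of one-character labels.
    static String oneCharLabels(int count) {
        StringBuilder sb = new StringBuilder();
        for (int i = 0; i < count; i++) {
            if (i > 0) {
                sb.append('.');
            }
            sb.append('a');
        }
        return sb.toString();
    }

    public static void main(String[] args) {
        System.out.println(oneCharLabels(127).length());
        System.out.println(withinLimits(oneCharLabels(127)));
        System.out.println(withinLimits(oneCharLabels(128)));
    }
}
```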
-
name
private final java.lang.String name
The full domain name, converted to lower case.
-
parts
private final ImmutableList<java.lang.String> parts
The parts of the domain name, converted to lower case.
-
publicSuffixIndex
private final int publicSuffixIndex
The index in the parts() list at which the public suffix begins. For example, for the domain name www.google.co.uk, the value would be 2 (the index of the co part). The value is negative (specifically, NO_PUBLIC_SUFFIX_FOUND) if no public suffix was found.
-
DASH_MATCHER
private static final CharMatcher DASH_MATCHER
-
PART_CHAR_MATCHER
private static final CharMatcher PART_CHAR_MATCHER
-
-
Constructor Detail
-
InternetDomainName
InternetDomainName(java.lang.String name)
Constructor used to implement from(String), and from subclasses.
-
-
Method Detail
-
findPublicSuffix
private int findPublicSuffix()
Returns the index of the leftmost part of the public suffix, or -1 if not found. Note that the value defined as the "public suffix" may not be a public suffix according to isPublicSuffix() if the domain ends with an excluded domain pattern such as "nhs.uk".
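A minimal sketch of such a leftmost-first search follows, assuming a tiny hypothetical suffix table (EXACT and EXCLUDED are invented for the example and are not the real Public Suffix List data). It shows how an excluded pattern such as "nhs.uk" can yield a suffix ("uk") that is not itself listed as an exact public suffix. Wildcard patterns are left out to keep the example short.

```java
import java.util.Arrays;
import java.util.HashSet;
import java.util.List;
import java.util.Set;

final class PublicSuffixIndexSketch {
    static final int NO_PUBLIC_SUFFIX_FOUND = -1;

    // Hypothetical miniature tables; the real list is far larger.
    private static final Set<String> EXACT = new HashSet<>(Arrays.asList("com", "co.uk"));
    private static final Set<String> EXCLUDED = new HashSet<>(Arrays.asList("nhs.uk"));

    // Tries each right-hand slice of the parts, longest first, and returns
    // the index at which the public suffix begins.
    static int findPublicSuffix(List<String> parts) {
        int size = parts.size();
        for (int i = 0; i < size; i++) {
            String candidate = String.join(".", parts.subList(i, size));
            if (EXACT.contains(candidate)) {
                return i;
            }
            if (EXCLUDED.contains(candidate)) {
                // Assumption: an excluded name is treated as registrable,
                // so the suffix starts one part to its right.
                return i + 1;
            }
        }
        return NO_PUBLIC_SUFFIX_FOUND;
    }

    public static void main(String[] args) {
        System.out.println(findPublicSuffix(Arrays.asList("www", "google", "co", "uk")));
        System.out.println(findPublicSuffix(Arrays.asList("www", "nhs", "uk")));
    }
}
```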
-
from
public static InternetDomainName from(java.lang.String domain)
Returns an instance of InternetDomainName after lenient validation. Specifically, validation against RFC 3490 ("Internationalizing Domain Names in Applications") is skipped, while validation against RFC 1035 is relaxed in the following ways:
- Any part containing non-ASCII characters is considered valid.
- Underscores ('_') are permitted wherever dashes ('-') are permitted.
- Parts other than the final part may start with a digit, as mandated by RFC 1123.
- Parameters:
- domain - A domain name (not IP address)
- Throws:
- java.lang.IllegalArgumentException - if domain is not syntactically valid according to isValid(java.lang.String)
- Since:
- 10.0 (previously named fromLenient)
-
validateSyntax
private static boolean validateSyntax(java.util.List<java.lang.String> parts)
Validation method used by from(String) to ensure that the domain name is syntactically valid according to RFC 1035.
- Returns:
- Is the domain name syntactically valid?
-
validatePart
private static boolean validatePart(java.lang.String part, boolean isFinalPart)
Helper method for validateSyntax(List). Validates that one part of a domain name is valid.
- Parameters:
- part - The domain name part to be validated
- isFinalPart - Is this the final (rightmost) domain part?
- Returns:
- Whether the part is valid
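The lenient per-part rules documented for from(String) can be sketched as follows. This is a hedged illustration, not the actual private method: the class name, the 63-character cap, and the choice to skip non-ASCII characters while still checking the ASCII ones are assumptions of the sketch.

```java
final class LenientPartValidationSketch {
    private static final int MAX_PART_LENGTH = 63; // assumed per-label limit

    private static boolean isDashOrUnderscore(char c) {
        return c == '-' || c == '_';
    }

    private static boolean isAsciiLetterOrDigit(char c) {
        return (c >= 'a' && c <= 'z') || (c >= 'A' && c <= 'Z') || (c >= '0' && c <= '9');
    }

    static boolean validatePart(String part, boolean isFinalPart) {
        if (part.isEmpty() || part.length() > MAX_PART_LENGTH) {
            return false;
        }
        for (int i = 0; i < part.length(); i++) {
            char c = part.charAt(i);
            if (c > 127) {
                continue; // non-ASCII characters are accepted without inspection
            }
            if (!isAsciiLetterOrDigit(c) && !isDashOrUnderscore(c)) {
                return false;
            }
        }
        char first = part.charAt(0);
        char last = part.charAt(part.length() - 1);
        // Dashes, and underscores by extension, may not begin or end a part.
        if (isDashOrUnderscore(first) || isDashOrUnderscore(last)) {
            return false;
        }
        // Only the final (rightmost) part is barred from starting with a digit.
        if (isFinalPart && first >= '0' && first <= '9') {
            return false;
        }
        return true;
    }

    public static void main(String[] args) {
        System.out.println(validatePart("1abc", false));
        System.out.println(validatePart("1abc", true));
    }
}
```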
-
parts
public ImmutableList<java.lang.String> parts()
Returns the individual components of this domain name, normalized to all lower case. For example, for the domain name mail.google.com, this method returns the list ["mail", "google", "com"].
-
isPublicSuffix
public boolean isPublicSuffix()
Indicates whether this domain name represents a public suffix, as defined by the Mozilla Foundation's Public Suffix List (PSL). A public suffix is one under which Internet users can directly register names, such as com, co.uk or pvt.k12.wy.us. Examples of domain names that are not public suffixes include google, google.com and foo.co.uk.
- Returns:
- true if this domain name appears exactly on the public suffix list
- Since:
- 6.0
-
hasPublicSuffix
public boolean hasPublicSuffix()
Indicates whether this domain name ends in a public suffix, including if it is a public suffix itself. For example, returns true for www.google.com, foo.co.uk and com, but not for google or google.foo. This is the recommended method for determining whether a domain is potentially an addressable host.
- Since:
- 6.0
-
publicSuffix
public InternetDomainName publicSuffix()
Returns the public suffix portion of the domain name, or null if no public suffix is present.
- Since:
- 6.0
-
isUnderPublicSuffix
public boolean isUnderPublicSuffix()
Indicates whether this domain name ends in a public suffix, while not being a public suffix itself. For example, returns true for www.google.com, foo.co.uk and bar.ca.us, but not for google, com, or google.foo.
Warning: a false result from this method does not imply that the domain does not represent an addressable host, as many public suffixes are also addressable hosts. Use hasPublicSuffix() for that test.
This method can be used to determine whether it will probably be possible to set cookies on the domain, though even that depends on individual browsers' implementations of cookie controls. See RFC 2109 for details.
- Since:
- 6.0
-
isTopPrivateDomain
public boolean isTopPrivateDomain()
Indicates whether this domain name is composed of exactly one subdomain component followed by a public suffix. For example, returns true for google.com and foo.co.uk, but not for www.google.com or co.uk.
Warning: a true result from this method does not imply that the domain is at the highest level which is addressable as a host, as many public suffixes are also addressable hosts. For example, the domain bar.uk.com has a public suffix of uk.com, so it would return true from this method. But uk.com is itself an addressable host.
This method can be used to determine whether a domain is probably the highest level for which cookies may be set, though even that depends on individual browsers' implementations of cookie controls. See RFC 2109 for details.
- Since:
- 6.0
-
topPrivateDomain
public InternetDomainName topPrivateDomain()
Returns the portion of this domain name that is one level beneath the public suffix. For example, for x.adwords.google.co.uk it returns google.co.uk, since co.uk is a public suffix.
If isTopPrivateDomain() is true, the current domain name instance is returned.
This method should not be used to determine the topmost parent domain which is addressable as a host, as many public suffixes are also addressable hosts. For example, the domain foo.bar.uk.com has a public suffix of uk.com, so it would return bar.uk.com from this method. But uk.com is itself an addressable host.
This method can be used to determine the probable highest level parent domain for which cookies may be set, though even that depends on individual browsers' implementations of cookie controls.
- Throws:
- java.lang.IllegalStateException - if this domain does not end with a public suffix
- Since:
- 6.0
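The public-suffix queries documented above can all be read as simple comparisons against the index at which the public suffix begins. The sketch below illustrates that relationship under stated assumptions: it works on a plain list of parts plus a precomputed index, uses -1 for "not found", and the class and method names are hypothetical stand-ins rather than the class's own code.

```java
import java.util.Arrays;
import java.util.List;

final class SuffixIndexPredicatesSketch {
    static final int NO_PUBLIC_SUFFIX_FOUND = -1;

    static boolean isPublicSuffix(int suffixIndex) {
        return suffixIndex == 0; // the whole name is the suffix
    }

    static boolean hasPublicSuffix(int suffixIndex) {
        return suffixIndex != NO_PUBLIC_SUFFIX_FOUND;
    }

    static boolean isUnderPublicSuffix(int suffixIndex) {
        return suffixIndex > 0; // at least one part to the left of the suffix
    }

    static boolean isTopPrivateDomain(int suffixIndex) {
        return suffixIndex == 1; // exactly one part to the left of the suffix
    }

    // Returns the suffix as text, or null when there is none.
    static String publicSuffix(List<String> parts, int suffixIndex) {
        if (!hasPublicSuffix(suffixIndex)) {
            return null;
        }
        return String.join(".", parts.subList(suffixIndex, parts.size()));
    }

    // Keeps one part to the left of the suffix and drops everything further left.
    static String topPrivateDomain(List<String> parts, int suffixIndex) {
        if (!isUnderPublicSuffix(suffixIndex)) {
            throw new IllegalStateException("Not under a public suffix: " + String.join(".", parts));
        }
        return String.join(".", parts.subList(suffixIndex - 1, parts.size()));
    }

    public static void main(String[] args) {
        List<String> parts = Arrays.asList("x", "adwords", "google", "co", "uk");
        System.out.println(topPrivateDomain(parts, 3));
    }
}
```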
-
hasParent
public boolean hasParent()
Indicates whether this domain is composed of two or more parts.
-
parent
public InternetDomainName parent()
Returns an InternetDomainName that is the immediate ancestor of this one; that is, the current domain with the leftmost part removed. For example, the parent of www.google.com is google.com.
- Throws:
- java.lang.IllegalStateException - if the domain has no parent, as determined by hasParent()
-
ancestor
private InternetDomainName ancestor(int levels)
Returns the ancestor of the current domain at the given number of levels "higher" (rightward) in the subdomain list. The number of levels must be non-negative, and less than N-1, where N is the number of parts in the domain.
TODO: Reasonable candidate for addition to public API.
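Moving "rightward" in the part list amounts to dropping parts from the left. The sketch below illustrates this with plain lists and strings; the names are hypothetical, and the bound it enforces (at least one part must remain) is an assumption of the sketch rather than a restatement of the documented limit.

```java
import java.util.Arrays;
import java.util.List;

final class AncestorSketch {
    // Drops the given number of leftmost parts and joins the rest with dots.
    static String ancestor(List<String> parts, int levels) {
        if (levels < 0 || levels >= parts.size()) {
            throw new IllegalArgumentException("levels out of range: " + levels);
        }
        return String.join(".", parts.subList(levels, parts.size()));
    }

    static boolean hasParent(List<String> parts) {
        return parts.size() > 1;
    }

    // The parent is the ancestor exactly one level up.
    static String parent(List<String> parts) {
        if (!hasParent(parts)) {
            throw new IllegalStateException("Domain has no parent: " + String.join(".", parts));
        }
        return ancestor(parts, 1);
    }

    public static void main(String[] args) {
        System.out.println(parent(Arrays.asList("www", "google", "com")));
    }
}
```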
-
child
public InternetDomainName child(java.lang.String leftParts)
Creates and returns a new InternetDomainName by prepending the argument and a dot to the current name. For example, InternetDomainName.from("foo.com").child("www.bar") returns a new InternetDomainName with the value www.bar.foo.com. Only lenient validation is performed, as described in from(String).
- Throws:
- java.lang.NullPointerException - if leftParts is null
- java.lang.IllegalArgumentException - if the resulting name is not valid
-
isValid
public static boolean isValid(java.lang.String name)
Indicates whether the argument is a syntactically valid domain name using lenient validation. Specifically, validation against RFC 3490 ("Internationalizing Domain Names in Applications") is skipped.
The following two code snippets are equivalent:
    domainName = InternetDomainName.isValid(name)
        ? InternetDomainName.from(name)
        : DEFAULT_DOMAIN;

    try {
      domainName = InternetDomainName.from(name);
    } catch (IllegalArgumentException e) {
      domainName = DEFAULT_DOMAIN;
    }
- Since:
- 8.0 (previously named isValidLenient)
-
matchesWildcardPublicSuffix
private static boolean matchesWildcardPublicSuffix(java.lang.String domain)
Does the domain name match one of the "wildcard" patterns (e.g. "*.ar")?
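One plausible way to test a name against a pattern such as "*.ar" is to split off the leftmost part and look up the remainder in a table of wildcard parents. The sketch below assumes a hypothetical one-entry table and a hypothetical class name; it is not the actual private method.

```java
import java.util.Collections;
import java.util.Set;

final class WildcardSuffixSketch {
    // Hypothetical table: "ar" stands for the pattern "*.ar".
    private static final Set<String> WILDCARD_PARENTS = Collections.singleton("ar");

    // True when the name is exactly one part followed by a wildcard parent.
    static boolean matchesWildcardPublicSuffix(String domain) {
        String[] pieces = domain.split("\\.", 2); // leftmost part, then the rest
        return pieces.length == 2 && WILDCARD_PARENTS.contains(pieces[1]);
    }

    public static void main(String[] args) {
        System.out.println(matchesWildcardPublicSuffix("com.ar"));
        System.out.println(matchesWildcardPublicSuffix("www.com.ar"));
    }
}
```

Under this reading, "com.ar" matches the pattern while "ar" alone and "www.com.ar" do not, because the wildcard covers exactly one part.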
-
toString
public java.lang.String toString()
Returns the domain name, normalized to all lower case.
- Overrides:
- toString in class java.lang.Object
-
equals
public boolean equals(@Nullable java.lang.Object object)
Equality testing is based on the text supplied by the caller, after normalization as described in the class documentation. For example, a non-ASCII Unicode domain name and the Punycode version of the same domain name would not be considered equal.
- Overrides:
- equals in class java.lang.Object
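The consequence of text-based equality can be illustrated with the JDK's java.net.IDN class. In this sketch, textEquals mimics a comparison after ASCII lowercasing, and idnAwareEquals is a hypothetical caller-side helper that converts both names to their ASCII (Punycode) form first; neither is part of InternetDomainName, and the conversion step is only a suggestion for callers who need the two forms to compare equal.

```java
import java.net.IDN;

final class DomainEqualitySketch {
    // Lowercases ASCII letters only, leaving other characters unchanged.
    private static String asciiLowerCase(String s) {
        StringBuilder out = new StringBuilder(s.length());
        for (int i = 0; i < s.length(); i++) {
            char c = s.charAt(i);
            out.append(c >= 'A' && c <= 'Z' ? (char) (c - 'A' + 'a') : c);
        }
        return out.toString();
    }

    // Text comparison after lowercasing: Unicode and Punycode forms differ.
    static boolean textEquals(String a, String b) {
        return asciiLowerCase(a).equals(asciiLowerCase(b));
    }

    // Converts both names to ASCII-compatible form before comparing.
    static boolean idnAwareEquals(String a, String b) {
        return textEquals(IDN.toASCII(a), IDN.toASCII(b));
    }

    public static void main(String[] args) {
        String unicode = "\u7F51\u7EDC.cn"; // the internationalized name 网络.cn
        String punycode = IDN.toASCII(unicode);
        System.out.println(textEquals(unicode, punycode));
        System.out.println(idnAwareEquals(unicode, punycode));
    }
}
```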
-
hashCode
public int hashCode()
- Overrides:
- hashCode in class java.lang.Object
-
-