关于set想说的(一)之Set的实现类及必要的方法
最近看到了《 in Java》的第17章 容器深入探究,17.6 Set和存储顺序。自己写了写测试代码,加深下理解。主要设计()方法(主要是为了方便打印),()方法,()方法,()方法。
结论
首先明确Set接口有三种不同的实现,()、()、()。
() :快速的定位、读取,会根据hash值来存放,因此读取出来的顺序未必就是插入的顺序。
():存入set容器中的内容是按照一定的顺序排列的。
():既可以实现快速的定位读取,又可以满足读取出来的顺序就是插入顺序。
除了、等内部就已经实现了()、()、()、()方法,都实现了接口,用户自己定义的类的实例(对象),如果要放在set,则要实现(),()、()方法,实现接口。
在《 in java》中说:“如果不为你的键覆盖()和()方法,那么使用散列的数据结构(, , , )就无法正确处理你的键”。
因为()用来确保你的键是唯一的,而如果你想只覆盖()方法,而不覆盖()方法(你认为可以使用父类#()方法,按照对象的地址来进行hash),但是在#()的源码注释中,有说:如果()方法返回true,则每个对象调用()方法也必须返回相同的hash值。
1.()方法
用来实现Set中元素的不重复性,如果不覆盖()()方法,默认使用父类的方法,则只是比较对象的引用是否相同。下面是#()方法的实现。
/*** Indicates whether some other object is "equal to" this one.* * The equals
method implements an equivalence relation* on non-null object references:*
* - It is reflexive: for any non-null reference value*
x
, x.equals(x)
should return* true
.* - It is symmetric: for any non-null reference values*
x
and y
, x.equals(y)
* should return true
if and only if* y.equals(x)
returns true
.* - It is transitive: for any non-null reference values*
x
, y
, and z
, if* x.equals(y)
returns true
and* y.equals(z)
returns true
, then* x.equals(z)
should return true
.* - It is consistent: for any non-null reference values*
x
and y
, multiple invocations of* x.equals(y) consistently return true
* or consistently return false
, provided no* information used in equals
comparisons on the* objects is modified.* - For any non-null reference value
x
,* x.equals(null)
should return false
.*
* * The equals method for class Object
implements * the most discriminating possible equivalence relation on objects; * that is, for any non-null reference values x
and* y
, this method returns true
if and only* if x
and y
refer to the same object* (x == y
has the value true
).*
* Note that it is generally necessary to override the hashCode* method whenever this method is overridden, so as to maintain the* general contract for the hashCode method, which states* that equal objects must have equal hash codes. ** @param obj the reference object with which to compare.* @return true
if this object is the same as the obj* argument; false
otherwise.* @see #hashCode()* @see java.util.Hashtable*/public boolean equals(Object obj) {
return (this == obj);}
代码里的注释也挺重要的,所以也复制过来了。注释主要讲的是:
(1)对于非空对象x,y,x.(y)返回true,当且 仅当(if and only if) x,y指向了相同的对象。
(2)对于非空对象及null,x.(null)返回false。
(3)方法具有自反性(),即对于非空对象x,x.(x)返回true。
(4)方法具有对称性(),即对于非空对象x,y,x.(y)与y.(x)的返回值是一样的。
(5)方法具有传递性(),即对于非空对象x,y,z,如果x.(y) y.(z) 均返回true,则x.(z)也应该返回true。
(6)方法具有一致性(),即对于非空对象x,y,x.(y)的返回值经过多次调用之后应与第一次的返回值保持一致。
(7)一个不成文的规定:当你覆盖()()方法时,最好也覆盖()()方法,因为,“相等的对象,其hash code也相等”。 当然了这不是强制规定,只是一个建议而已。覆盖了()方法而不覆盖()方法也可以。
使用IDEA的alt+组合键,添加代码时,会发现()方法和()方法时在一起的,如下所示:
2.()
()方法时为了实现和而实现的。只有知道对象的hash值,才能根据这个hash值确定 存放在散列表的槽的index。同样,如果不覆盖()()方法,默认使用父类的()方法。下面是#()的实现源码:
/*** Returns a hash code value for the object. This method is * supported for the benefit of hashtables such as those provided by * java.util.Hashtable
. * * The general contract of hashCode
is: *
* - Whenever it is invoked on the same object more than once during * an execution of a Java application, the hashCode method * must consistently return the same integer, provided no information * used in equals comparisons on the object is modified.* This integer need not remain consistent from one execution of an* application to another execution of the same application. *
- If two objects are equal according to the equals(Object)* method, then calling the
hashCode
method on each of * the two objects must produce the same integer result. * - It is not required that if two objects are unequal * according to the {@link java.lang.Object#equals(java.lang.Object)} * method, then calling the hashCode method on each of the * two objects must produce distinct integer results. However, the * programmer should be aware that producing distinct integer results * for unequal objects may improve the performance of hashtables.*
* * As much as is reasonably practical, the hashCode method defined by * class Object does return distinct integers for distinct * objects. (This is typically implemented by converting the internal * address of the object into an integer, but this implementation * technique is not required by the * Java"-2">TM programming language.)** @return a hash code value for this object.* @see java.lang.Object#equals(java.lang.Object)* @see java.util.Hashtable*/public native int hashCode();
(1)()方法是个方法。
(2)对相同对象多次调用()方法,返回的整型值应该是一样的。
(3) 规定,如果两个对象调用()方法,返回值为true,则这两个对象各自调用()返回的整型值必须相等。(好像也没这么绝对,至少代码运行不报错,但是执行的结果可能并没有达到预期)
如:
我在类中覆盖了()方法,而没有覆盖()方法,看代码如下:
public class Person {private String name;public Person(String name) {this.name = name;}public String getName() {return name;}public void setName(String name) {this.name = name;}@Overridepublic boolean equals(Object o) {if (this == o) return true;if (o == null || getClass() != o.getClass()) return false;Person person = (Person) o;return name.equals(person.name);}// @Override
// public int hashCode() {
// return name.hashCode();
// }@Overridepublic String toString() {return "Person{" +"name='" + name + '\'' +'}';}
}
Person p1 = new Person("fan");Person p2 = new Person("hehe");Person p3 = new Person("fan");Person p4 = new Person("axyz");Set personHashSet = new HashSet();Collections.addAll(personHashSet, p1, p2, p3, p4);System.out.println(personHashSet);
执行结果如下:
[Person{name='fan'}, Person{name='axyz'}, Person{name='hehe'}, Person{name='fan'}]
结果是不对的。
结果之所以不对主要原因是:
放进去的p1,p2,p3,p4他们的地址是不同的,又没有覆盖方法,使用默认的对象的,所以hash值不同。
当hash值相同了之后,按照hash表处理冲突的策略,在list或者树结构上按照方法来查找需要(用于更新值,也就是说已经存在这个值)或者(用于插入新值,在冲突列表中还不存在这个值)的位置。
的实现其实是基于的,其add操作调用的是的put方法。
IDEA自动生成的()方法不是很好, in java中给的比较好。
@Overridepublic boolean equals(Object o) {if (this == o) return true;//if (o == null || getClass() != o.getClass()) return false;/*if (!(o instanceof Person)) return false;Person person = (Person) o;return name.equals(person.name);*/return o instanceof Person &&(name.equals(((Person) o).name));}
看上去这个方法没有判断其参数o 是否为空,其实 已经悄悄的检查了参数o是否为null,如果o为null,则返回false。当的参数不为null且类型正确,则按照的方法进行比较。
(4)两个不相等(方法返回false)的对象,产生的**不必必须** 相同。但是建议产生不同的,说是有效率的提升。
3.()方法
()方法在打印对象时会调用。如果不覆盖()()方法,默认使用父类的()方法,这时会打印该类的名字以及该对象散列码的无符号十六进制表示(通过生成的)。下面是#()的实现源码:
/*** Returns a string representation of the object. In general, the* {@code toString} method returns a string that* "textually represents" this object. The result should* be a concise but informative representation that is easy for a* person to read.* It is recommended that all subclasses override this method.* * The {@code toString} method for class {@code Object}* returns a string consisting of the name of the class of which the* object is an instance, the at-sign character `{@code @}', and* the unsigned hexadecimal representation of the hash code of the* object. In other words, this method returns a string equal to the* value of:*
* * getClass().getName() + '@' + Integer.toHexString(hashCode())*
** @ a of the .*/ () { ().() + "@" + .(());}
从上面的注释可以看到,推荐所有子类覆盖()方法。
()方法
用户类要实现接口。这个方法主要用于将对象存放在()时保证顺序的。由于是接口,所以用户类必须要实现这个方法。
下面是接口的源码:
public interface Comparable {/*** Compares this object with the specified object for order. Returns a* negative integer, zero, or a positive integer as this object is less* than, equal to, or greater than the specified object.** The implementor must ensure sgn(x.compareTo(y)) ==* -sgn(y.compareTo(x)) for all x and y. (This* implies that x.compareTo(y) must throw an exception iff* y.compareTo(x) throws an exception.)**
The implementor must also ensure that the relation is transitive:* (x.compareTo(y)>0 && y.compareTo(z)>0) implies* x.compareTo(z)>0.**
Finally, the implementor must ensure that x.compareTo(y)==0* implies that sgn(x.compareTo(z)) == sgn(y.compareTo(z)), for* all z.**
It is strongly recommended, but not strictly required that* (x.compareTo(y)==0) == (x.equals(y)). Generally speaking, any* class that implements the Comparable interface and violates* this condition should clearly indicate this fact. The recommended* language is "Note: this class has a natural ordering that is* inconsistent with equals."**
In the foregoing description, the notation* sgn(expression) designates the mathematical* signum function, which is defined to return one of -1,* 0, or 1 according to whether the value of* expression is negative, zero or positive.** @param o the object to be compared.* @return a negative integer, zero, or a positive integer as this object* is less than, equal to, or greater than the specified object.** @throws NullPointerException if the specified object is null* @throws ClassCastException if the specified object's type prevents it* from being compared to this object.*/public int compareTo(T o);
}
(1)(y),如果x小于y,则返回负数;如果x=y,则返回0;如果x>y,则返回正数。
(2)注释中的sgn是一个数学函数,当参数小于0时返回-1;当参数等于0时返回0;当参数大于0时返回1。
(3)要求sgn((y)) ==-sgn((x))
(4)要求传递性 ,即:如果(y)与(z)均返回0,则(z)也必须返回0。
(5)必须要求,如果(y)返回0,则sgn((z))=sgn((z))。
(6)强烈推荐,但是不是严格要求,((y)==0) == (x.(y))。也就是说,最好满足这个条件。
结束语
这篇博文总结了一下Set的实现类的特点,以及实现类需要的方法。下篇博文 关于set想说的(二)之Set Demo 写一个Demo。