ArrayList Source Code Analysis

ArrayList

Original article: Java 容器源碼分析之 ArrayList (Java Container Source Code Analysis: ArrayList)

Overview

ArrayList is one of the most frequently used collections; when a List is needed, it is usually the first choice. Let's start with its declaration:

public class ArrayList<E> extends AbstractList<E>
        implements List<E>, RandomAccess, Cloneable, java.io.Serializable

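Of the interfaces ArrayList implements, RandomAccess, Cloneable, and Serializable are all marker interfaces, so ArrayList is a very pure implementation of the List interface. Its sibling LinkedList, by contrast, also implements Deque and so doubles as a double-ended queue.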
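
As a small illustration of what a marker interface buys you, the sketch below checks for RandomAccess with instanceof. The class name MarkerInterfaceDemo and the helper supportsFastRandomAccess are made up for this example; only the JDK types are real.

```java
import java.util.ArrayList;
import java.util.LinkedList;
import java.util.List;
import java.util.RandomAccess;

// Hypothetical demo class, not part of the JDK.
class MarkerInterfaceDemo {
    // RandomAccess declares no methods; code can only test for its presence.
    static boolean supportsFastRandomAccess(List<?> list) {
        return list instanceof RandomAccess;
    }

    public static void main(String[] args) {
        System.out.println(supportsFastRandomAccess(new ArrayList<String>()));  // true
        System.out.println(supportsFastRandomAccess(new LinkedList<String>())); // false
    }
}
```

Generic algorithms can use such a check to choose between an index-based loop and an iterator-based one.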

Structure

transient Object[] elementData;

// Inherited from the parent class AbstractList
protected transient int modCount = 0;

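As the name suggests, ArrayList is a List backed by an array, or in other words a resizable array; the data is stored in the Object array elementData. Besides elementData, one other member variable deserves attention: modCount, which is inherited from the parent class AbstractList. modCount records the number of times the List has been structurally modified, where a structural modification is any operation that changes the size of the List. modCount is used mainly by iterators: if a List is structurally modified while it is being iterated, the iteration may yield incorrect results. So when the structure changes during iteration by any means other than the iterator's own methods, whether from another thread or from the iterating thread itself, the iterator throws ConcurrentModificationException. This is the iterator's fail-fast mechanism; it is a best-effort check rather than a guarantee.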
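
The fail-fast behavior does not need a second thread to show up. The following is a minimal sketch (the class name FailFastDemo and its method are invented for illustration) that structurally modifies a list inside a for-each loop on a single thread:

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.ConcurrentModificationException;
import java.util.List;

// Hypothetical demo class, not part of the JDK.
class FailFastDemo {
    // Returns true if adding to the list inside a for-each loop
    // makes the iterator throw ConcurrentModificationException.
    static boolean modifiesDuringIteration() {
        List<String> list = new ArrayList<>(Arrays.asList("a", "b", "c"));
        try {
            for (String s : list) {
                if (s.equals("a")) {
                    list.add("d"); // structural modification: modCount changes
                }
            }
            return false;
        } catch (ConcurrentModificationException e) {
            return true; // thrown by the next call to the iterator's next()
        }
    }

    public static void main(String[] args) {
        System.out.println(modifiesDuringIteration()); // true
    }
}
```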

Adding elements

/**
 * Appends the specified element to the end of this list.
 */
public boolean add(E e) {
    ensureCapacityInternal(size + 1);  // Increments modCount!!
    elementData[size++] = e;
    return true;
}

/**
 * Inserts the specified element at the specified position in this
 * list. Shifts the element currently at that position (if any) and
 * any subsequent elements to the right (adds one to their indices).
 */
public void add(int index, E element) {
    rangeCheckForAdd(index);

    ensureCapacityInternal(size + 1);  // Increments modCount!!
    System.arraycopy(elementData, index, elementData, index + 1,
                     size - index);
    elementData[index] = element;
    size++;
}

/**
 * Appends all of the elements in the specified collection to the end of
 * this list, in the order that they are returned by the
 * specified collection's Iterator.  The behavior of this operation is
 * undefined if the specified collection is modified while the operation
 * is in progress.  (This implies that the behavior of this call is
 * undefined if the specified collection is this list, and this
 * list is nonempty.)
 */
public boolean addAll(Collection<? extends E> c) {
    Object[] a = c.toArray();
    int numNew = a.length;
    ensureCapacityInternal(size + numNew);  // Increments modCount
    System.arraycopy(a, 0, elementData, size, numNew);
    size += numNew;
    return numNew != 0;
}

/**
 * Inserts all of the elements in the specified collection into this
 * list, starting at the specified position.  Shifts the element
 * currently at that position (if any) and any subsequent elements to
 * the right (increases their indices).  The new elements will appear
 * in the list in the order that they are returned by the
 * specified collection's iterator.
 */
public boolean addAll(int index, Collection<? extends E> c) {
    rangeCheckForAdd(index);

    Object[] a = c.toArray();
    int numNew = a.length;
    ensureCapacityInternal(size + numNew);  // Increments modCount

    int numMoved = size - index;
    if (numMoved > 0)
        System.arraycopy(elementData, index, elementData, index + numNew,
                         numMoved);

    System.arraycopy(a, 0, elementData, index, numNew);
    size += numNew;
    return numNew != 0;
}

private void rangeCheck(int index) {
    if (index >= size)
        throw new IndexOutOfBoundsException(outOfBoundsMsg(index));
}

private void rangeCheckForAdd(int index) {
    if (index > size || index < 0)
        throw new IndexOutOfBoundsException(outOfBoundsMsg(index));
}

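There are several methods for adding elements to an ArrayList: add(E e) appends to the end of the array, add(int index, E element) inserts at a given position, addAll(Collection<? extends E> c) appends a batch of elements to the end, and addAll(int index, Collection<? extends E> c) inserts a batch at a given position.

These methods all follow essentially the same pattern. The two index-based overloads first validate index with rangeCheckForAdd (rangeCheck, shown alongside it, is used by get, set, and remove rather than by the add methods). Then ensureCapacityInternal makes sure the array has enough capacity: it checks whether the current capacity suffices and grows the array if not, as described in the next section. Note that adding elements is a structural modification of the ArrayList, so modCount must be incremented. The source places that increment inside ensureCapacityInternal, which feels a little odd: judging by its name, the method exists to ensure capacity, yet it also modifies a member variable that has nothing to do with capacity, so I don't think the design is very sound. The JDK authors presumably felt the same, which would explain the comment calling it out: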

ensureCapacityInternal(size + 1); // Increments modCount!!

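Picking up where we left off: once the capacity is sufficient, the static method System.arraycopy() shifts the existing elements to the right to make room, and the new elements are written into the gap. Appending to the end needs no shifting; the element is simply placed at the end. Finally size is updated, and the add operation is complete.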
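
To make the shifting step concrete, here is a minimal sketch on a plain array. The class InsertShiftDemo and the helper insertAt are hypothetical, and the sketch assumes the array already has a free slot, so no capacity check is performed.

```java
import java.util.Arrays;

// Hypothetical demo class, not part of the JDK.
class InsertShiftDemo {
    // Inserts value at index among the first `size` slots of data.
    // Assumes data.length > size, i.e. capacity was already ensured.
    static void insertAt(Object[] data, int size, int index, Object value) {
        // Move elements [index, size) one slot to the right.
        System.arraycopy(data, index, data, index + 1, size - index);
        data[index] = value;
    }

    public static void main(String[] args) {
        Object[] data = new Object[5];
        data[0] = "a";
        data[1] = "b";
        data[2] = "d";
        insertAt(data, 3, 2, "c");
        System.out.println(Arrays.toString(data)); // [a, b, c, d, null]
    }
}
```

System.arraycopy handles overlapping source and destination ranges correctly, which is why the same array can be passed for both.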

Growing capacity

ArrayList is based on a resizable array: when the backing array runs out of capacity, it is grown. The code is as follows:

private void ensureCapacityInternal(int minCapacity) {
    if (elementData == DEFAULTCAPACITY_EMPTY_ELEMENTDATA) {
        minCapacity = Math.max(DEFAULT_CAPACITY, minCapacity);
    }

    ensureExplicitCapacity(minCapacity);
}

private void ensureExplicitCapacity(int minCapacity) {
    modCount++;

    // overflow-conscious code
    if (minCapacity - elementData.length > 0)
        grow(minCapacity);
}

/**
 * Increases the capacity to ensure that it can hold at least the
 * number of elements specified by the minimum capacity argument.
 */
private void grow(int minCapacity) {
    // overflow-conscious code
    int oldCapacity = elementData.length;
    int newCapacity = oldCapacity + (oldCapacity >> 1);
    if (newCapacity - minCapacity < 0)
        newCapacity = minCapacity;
    if (newCapacity - MAX_ARRAY_SIZE > 0)
        newCapacity = hugeCapacity(minCapacity);
    // minCapacity is usually close to size, so this is a win:
    elementData = Arrays.copyOf(elementData, newCapacity);
}

private static int hugeCapacity(int minCapacity) {
    if (minCapacity < 0) // overflow
        throw new OutOfMemoryError();
    return (minCapacity > MAX_ARRAY_SIZE) ?
        Integer.MAX_VALUE :
        MAX_ARRAY_SIZE;
}

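The methods whose names start with ensure check whether the current array can hold minCapacity elements; only when it cannot do they grow it by calling grow(int minCapacity). Let's go straight to grow().

grow() first computes a new capacity of roughly 1.5 times the old one, via int newCapacity = oldCapacity + (oldCapacity >> 1). It then checks whether the new capacity satisfies the required minimum minCapacity; if not, newCapacity is set to minCapacity. Next it checks whether newCapacity exceeds the maximum allowed array size MAX_ARRAY_SIZE. If so, hugeCapacity() decides the result: Integer.MAX_VALUE when minCapacity itself exceeds MAX_ARRAY_SIZE, otherwise MAX_ARRAY_SIZE (an OutOfMemoryError is thrown if minCapacity has overflowed to a negative value). Finally, Arrays.copyOf copies the contents of the old array into a new array of that capacity.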
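
The capacity arithmetic can be tried out in isolation. The sketch below copies the logic of grow() and hugeCapacity() into a hypothetical helper newCapacity (the class GrowthDemo is invented here); MAX_ARRAY_SIZE is set to Integer.MAX_VALUE - 8, the value JDK 8 uses, since the listing above does not show the constant.

```java
// Hypothetical demo class, not part of the JDK.
class GrowthDemo {
    // Value of ArrayList.MAX_ARRAY_SIZE in JDK 8.
    static final int MAX_ARRAY_SIZE = Integer.MAX_VALUE - 8;

    // Same arithmetic as grow() + hugeCapacity(), without allocating an array.
    static int newCapacity(int oldCapacity, int minCapacity) {
        int newCapacity = oldCapacity + (oldCapacity >> 1); // about 1.5x
        if (newCapacity - minCapacity < 0)
            newCapacity = minCapacity;
        if (newCapacity - MAX_ARRAY_SIZE > 0) {
            if (minCapacity < 0) // overflow
                throw new OutOfMemoryError();
            newCapacity = (minCapacity > MAX_ARRAY_SIZE)
                    ? Integer.MAX_VALUE
                    : MAX_ARRAY_SIZE;
        }
        return newCapacity;
    }

    public static void main(String[] args) {
        int capacity = 10; // DEFAULT_CAPACITY
        for (int i = 0; i < 4; i++) {
            capacity = newCapacity(capacity, capacity + 1);
            System.out.println(capacity); // 15, 22, 33, 49
        }
    }
}
```

The subtractions compared against zero (rather than direct comparisons) are what keep the checks correct when oldCapacity + (oldCapacity >> 1) overflows into a negative int.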

Removing elements

/**
 * Removes the element at the specified position in this list.
 * Shifts any subsequent elements to the left (subtracts one from their
 * indices).
 */
public E remove(int index) {
    rangeCheck(index);

    modCount++;
    E oldValue = elementData(index);

    int numMoved = size - index - 1;
    if (numMoved > 0)
        System.arraycopy(elementData, index+1, elementData, index,
                         numMoved);
    elementData[--size] = null; // clear to let GC do its work

    return oldValue;
}

/**
 * Removes the first occurrence of the specified element from this list,
 * if it is present.  If the list does not contain the element, it is
 * unchanged.  More formally, removes the element with the lowest index
 */
public boolean remove(Object o) {
    if (o == null) {
        for (int index = 0; index < size; index++)
            if (elementData[index] == null) {
                fastRemove(index);
                return true;
            }
    } else {
        for (int index = 0; index < size; index++)
            if (o.equals(elementData[index])) {
                fastRemove(index);
                return true;
            }
    }
    return false;
}

/*
 * Private remove method that skips bounds checking and does not
 * return the value removed.
 */
private void fastRemove(int index) {
    modCount++;
    int numMoved = size - index - 1;
    if (numMoved > 0)
        System.arraycopy(elementData, index+1, elementData, index,
                         numMoved);
    elementData[--size] = null; // clear to let GC do its work
}

/**
 * Removes all of the elements from this list.  The list will
 * be empty after this call returns.
 */
public void clear() {
    modCount++;

    // clear to let GC do its work
    for (int i = 0; i < size; i++)
        elementData[i] = null;

    size = 0;
}

/**
 * Removes from this list all of the elements whose index is between
 * {@code fromIndex}, inclusive, and {@code toIndex}, exclusive.
 * Shifts any succeeding elements to the left (reduces their index).
 * This call shortens the list by {@code (toIndex - fromIndex)} elements.
 * (If {@code toIndex==fromIndex}, this operation has no effect.)
 */
protected void removeRange(int fromIndex, int toIndex) {
    modCount++;
    int numMoved = size - toIndex;
    System.arraycopy(elementData, toIndex, elementData, fromIndex,
                     numMoved);

    // clear to let GC do its work
    int newSize = size - (toIndex-fromIndex);
    for (int i = newSize; i < size; i++) {
        elementData[i] = null;
    }
    size = newSize;
}

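The principle behind removal is simple: System.arraycopy moves the elements that must be kept into their correct positions, and then size is adjusted. Finally, to avoid a memory leak, the slots that are no longer in use are explicitly set to null. Simple as the principle is, there are many details to get right, mostly small matters of index arithmetic.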
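
One practical consequence of having both remove(int index) and remove(Object o) is that a List<Integer> resolves the call by the static type of the argument. The sketch below (class and method names are made up for illustration) shows the two overloads side by side:

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

// Hypothetical demo class, not part of the JDK.
class RemoveOverloadDemo {
    static List<Integer> removeByIndex() {
        List<Integer> list = new ArrayList<>(Arrays.asList(10, 20, 30));
        list.remove(1); // int argument: remove(int index), drops 20
        return list;
    }

    static List<Integer> removeByValue() {
        List<Integer> list = new ArrayList<>(Arrays.asList(10, 20, 30));
        list.remove(Integer.valueOf(10)); // Object argument: remove(Object o), drops 10
        return list;
    }

    public static void main(String[] args) {
        System.out.println(removeByIndex()); // [10, 30]
        System.out.println(removeByValue()); // [20, 30]
    }
}
```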

Next, let's look at the methods that remove or retain elements in bulk.

/**
 * Removes from this list all of its elements that are contained in the
 * specified collection.
 */
public boolean removeAll(Collection<?> c) {
    Objects.requireNonNull(c);
    return batchRemove(c, false);
}

/**
 * Retains only the elements in this list that are contained in the
 * specified collection.  In other words, removes from this list all
 * of its elements that are not contained in the specified collection.
 */
public boolean retainAll(Collection<?> c) {
    Objects.requireNonNull(c);
    return batchRemove(c, true);
}

private boolean batchRemove(Collection<?> c, boolean complement) {
    final Object[] elementData = this.elementData;
    int r = 0, w = 0;
    boolean modified = false;
    try {
        for (; r < size; r++)
            // 1) Removing the elements of c (complement == false):
            //    keep elementData[r] if it is not in c
            // 2) Retaining the elements of c (complement == true):
            //    keep elementData[r] if it is in c
            if (c.contains(elementData[r]) == complement)
                elementData[w++] = elementData[r];
    } finally {
        // Preserve behavioral compatibility with AbstractCollection,
        // even if c.contains() throws.
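        // 1) r == size: the loop completed normally
        // 2) r != size: c.contains() threw an exception,
        //    possibly because an element's type is incompatible with c,
        //    or because c does not permit null elements;
        //    copy the not-yet-examined elements forward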
        if (r != size) {
            System.arraycopy(elementData, r,
                             elementData, w,
                             size - r);
            w += size - r;
        }
        if (w != size) {
            // clear to let GC do its work
            for (int i = w; i < size; i++)
                elementData[i] = null;
            modCount += size - w;
            size = w;
            modified = true;
        }
    }
    return modified;
}

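Both the bulk-removal method removeAll() and the bulk-retention method retainAll() delegate to batchRemove, so let's look at that method directly.

First, the principle. The method traverses the whole array, picks out the elements to keep, and writes them back into elementData one after another starting at index 0. If the traversal finishes without an exception (that is, r == size), the slots that are no longer in use are explicitly set to null so that the GC can reclaim the objects. If an exception occurs during the traversal (r != size), the elements that have not yet been visited are copied to index w and onward, so the list is left in a consistent state. I am not yet sure whether this exception handling is the best design; according to the source comment, its purpose is to preserve behavioral compatibility with AbstractCollection.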
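
The read-pointer/write-pointer idea is easier to see without the exception handling. The following is a simplified sketch on a plain array; BatchRemoveDemo and compact are hypothetical names, and the finally block and modCount bookkeeping of the real method are deliberately left out.

```java
import java.util.Arrays;
import java.util.Collection;

// Hypothetical demo class, not part of the JDK.
class BatchRemoveDemo {
    // Keeps the elements whose membership in c equals `complement`
    // and returns the new logical size.
    static int compact(Object[] data, int size, Collection<?> c, boolean complement) {
        int w = 0; // write index: next slot for a kept element
        for (int r = 0; r < size; r++) // r is the read index
            if (c.contains(data[r]) == complement)
                data[w++] = data[r];
        for (int i = w; i < size; i++)
            data[i] = null; // clear the tail so the GC can reclaim it
        return w;
    }

    public static void main(String[] args) {
        Object[] data = {1, 2, 3, 4, 5};
        int newSize = compact(data, 5, Arrays.asList(2, 4), false); // removeAll-style
        System.out.println(newSize + " " + Arrays.toString(data)); // 3 [1, 3, 5, null, null]
    }
}
```

Because w never runs ahead of r, the compaction can be done in place in a single pass.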

Lookup and update

public boolean contains(Object o) {
    return indexOf(o) >= 0;
}

/**
 * Returns the index of the first occurrence of the specified element
 * in this list, or -1 if this list does not contain the element.
 * More formally, returns the lowest index <tt>i</tt> such that
 * <tt>(o==null&nbsp;?&nbsp;get(i)==null&nbsp;:&nbsp;o.equals(get(i)))</tt>,
 * or -1 if there is no such index.
 */
public int indexOf(Object o) {
    if (o == null) {
        for (int i = 0; i < size; i++)
            if (elementData[i]==null)
                return i;
    } else {
        for (int i = 0; i < size; i++)
            if (o.equals(elementData[i]))
                return i;
    }
    return -1;
}

/**
 * Returns the index of the last occurrence of the specified element
 * in this list, or -1 if this list does not contain the element.
 * More formally, returns the highest index <tt>i</tt> such that
 * <tt>(o==null&nbsp;?&nbsp;get(i)==null&nbsp;:&nbsp;o.equals(get(i)))</tt>,
 * or -1 if there is no such index.
 */
public int lastIndexOf(Object o) {
    if (o == null) {
        for (int i = size-1; i >= 0; i--)
            if (elementData[i]==null)
                return i;
    } else {
        for (int i = size-1; i >= 0; i--)
            if (o.equals(elementData[i]))
                return i;
    }
    return -1;
}

/**
 * Returns the element at the specified position in this list.
 */
public E get(int index) {
    rangeCheck(index);

    return elementData(index);
}

/**
 * Replaces the element at the specified position in this list with
 * the specified element.
 */
public E set(int index, E element) {
    rangeCheck(index);

    E oldValue = elementData(index);
    elementData[index] = element;
    return oldValue;
}

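Because the implementation is array-based, looking up and updating elements is straightforward. None of these methods changes the structure of the List, so none of them modifies modCount.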
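
Since set() leaves modCount untouched, replacing elements while iterating does not trip the fail-fast check. A minimal sketch, with the class SetDuringIterationDemo and its method invented for illustration:

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

// Hypothetical demo class, not part of the JDK.
class SetDuringIterationDemo {
    // Replaces every element with its upper-case form while iterating.
    static List<String> upperCaseInPlace(List<String> input) {
        List<String> list = new ArrayList<>(input);
        int i = 0;
        for (String s : list) {
            list.set(i++, s.toUpperCase()); // not a structural modification
        }
        return list;
    }

    public static void main(String[] args) {
        System.out.println(upperCaseInPlace(Arrays.asList("a", "b"))); // [A, B]
    }
}
```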

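Iteration

Iterating over a list is also very common in day-to-day development, especially with the for-each statement. Because the Collection interface extends Iterable and ArrayList indirectly implements Collection, it must implement Iterable's iterator() method. Let's take a look: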

public Iterator<E> iterator() {
    return new Itr();
}
/**
 * An optimized version of AbstractList.Itr
 */
private class Itr implements Iterator<E> {
    int cursor;       // index of next element to return
    int lastRet = -1; // index of last element returned; -1 if no such
    int expectedModCount = modCount;

    public boolean hasNext() {
        return cursor != size;
    }

    @SuppressWarnings("unchecked")
    public E next() {
        checkForComodification();
        int i = cursor;
        if (i >= size)
            throw new NoSuchElementException();
        Object[] elementData = ArrayList.this.elementData;
        if (i >= elementData.length)
            throw new ConcurrentModificationException();
        cursor = i + 1;
        return (E) elementData[lastRet = i];
    }

    public void remove() {
        if (lastRet < 0)
            throw new IllegalStateException();
        checkForComodification();

        try {
            ArrayList.this.remove(lastRet);
            cursor = lastRet;
            lastRet = -1;
            expectedModCount = modCount;
        } catch (IndexOutOfBoundsException ex) {
            throw new ConcurrentModificationException();
        }
    }

    final void checkForComodification() {
        if (modCount != expectedModCount)
            throw new ConcurrentModificationException();
    }
}

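The iterator uses cursor to mark the index of the next element to return and lastRet to mark the index of the element most recently returned. ArrayList is not thread-safe, and its fail-fast mechanism is built on modCount. Both next() and remove() call checkForComodification(), which compares the modCount recorded when the iterator was created (expectedModCount) with the current modCount. If the two differ, the ArrayList has been structurally modified since the iterator was created, and ConcurrentModificationException is thrown. The iterator's own remove() resynchronizes expectedModCount afterwards, which is why removing through the iterator is safe.

One point to note: remove() checks whether lastRet < 0 and throws IllegalStateException if so. lastRet is negative in only two situations: right after the iterator is created, before next() has been called, and right after a call to remove(), which resets lastRet to -1. Calling remove() twice in a row therefore throws an exception.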
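
Both behaviors can be observed through the public API. The sketch below (IteratorRemoveDemo and its methods are hypothetical names) removes elements via the iterator and then shows that a second consecutive remove() is rejected:

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Iterator;
import java.util.List;

// Hypothetical demo class, not part of the JDK.
class IteratorRemoveDemo {
    // Removes even numbers through the iterator, which keeps
    // expectedModCount in sync with modCount.
    static List<Integer> dropEvens(List<Integer> input) {
        List<Integer> list = new ArrayList<>(input);
        Iterator<Integer> it = list.iterator();
        while (it.hasNext()) {
            if (it.next() % 2 == 0) {
                it.remove();
            }
        }
        return list;
    }

    // Returns true if a second remove() without an intervening next() throws.
    static boolean secondRemoveThrows() {
        List<Integer> list = new ArrayList<>(Arrays.asList(1, 2, 3));
        Iterator<Integer> it = list.iterator();
        it.next();
        it.remove(); // fine: lastRet was 0, now reset to -1
        try {
            it.remove(); // lastRet < 0
            return false;
        } catch (IllegalStateException e) {
            return true;
        }
    }

    public static void main(String[] args) {
        System.out.println(dropEvens(Arrays.asList(1, 2, 3, 4))); // [1, 3]
        System.out.println(secondRemoveThrows());                 // true
    }
}
```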

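The List interface also supports another kind of iterator, ListIterator. Besides moving forward with next(), it can move backward with previous(); besides removing elements during iteration with remove(), it can also insert elements during iteration with add().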
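
A short usage sketch of those two extra capabilities follows; the class ListIteratorDemo and its methods are invented for this example, while listIterator(), hasPrevious(), previous(), and add() are the standard java.util.ListIterator API.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;
import java.util.ListIterator;

// Hypothetical demo class, not part of the JDK.
class ListIteratorDemo {
    // Inserts sep after every element while iterating forward.
    static List<String> insertAfterEach(List<String> input, String sep) {
        List<String> list = new ArrayList<>(input);
        ListIterator<String> it = list.listIterator();
        while (it.hasNext()) {
            it.next();
            it.add(sep); // inserted before the cursor, so next() will not return it
        }
        return list;
    }

    // Collects the elements in reverse order by iterating backward.
    static List<String> reversed(List<String> input) {
        List<String> list = new ArrayList<>(input);
        List<String> out = new ArrayList<>();
        ListIterator<String> it = list.listIterator(list.size()); // start at the end
        while (it.hasPrevious()) {
            out.add(it.previous());
        }
        return out;
    }

    public static void main(String[] args) {
        System.out.println(insertAfterEach(Arrays.asList("a", "b"), "-")); // [a, -, b, -]
        System.out.println(reversed(Arrays.asList("a", "b", "c")));       // [c, b, a]
    }
}
```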

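Summary

ArrayList is implemented on top of an array and therefore offers efficient random access. Inserting and removing elements, however, often requires copying parts of the array, which is relatively expensive. ArrayList is a good fit when a container is read heavily after creation but sees comparatively few insertions and removals.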
