Texture Feature Extraction for PET/CT Images

Author: Zongwei Zhou 周縱葦
Weibo: @MrGiovanni
Email: zongweiz@asu.edu


Please cite this paper if you found it useful. Thanks!
Wang H, Zhou Z, Li Y, et al. Comparison of machine learning methods for classifying mediastinal lymph node metastasis of non-small cell lung cancer from 18F-FDG PET/CT images[J]. 2017, 7.



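Purpose

To test how well texture features work for classifying 3D PET/CT images.

Introduction

With traditional classifiers, unlike deep learning, we have to define the image features by hand. The convolutions in a CNN are really just a bank of filters whose purpose is also to extract features, and once visualized those features often turn out to be textures and edges. So when we define features by hand, we define some texture features too. In this experiment we define the texture features of an image mathematically, compute each of them, and then feed them into four classic traditional classifiers (random forest, support vector machine, AdaBoost, and a BP artificial neural network) for classification.

Tools

I used MATLAB 2014b. A newer version is recommended, because newer releases keep updating the function libraries. The experiment is kept as simple as possible: the point is to test how well texture features classify PET/CT images, so for routine code the standard function libraries are enough.

Reference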

PORTS 3D Image Texture Metric Calculation Package


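1. Histogram

A histogram describes how the pixel values in an image are distributed; in other words, it is a statistical chart of the pixels.
For a grayscale image I, every pixel value lies in the range 0-255. We traverse the whole image and count how many times each of the pixel values 0, 1, 2, 3, ..., 255 occurs. That gives 256 counts, which we then convert to frequencies by dividing each count by the total count:

p(i) = P(i) / ∑_j P(j)

where P(i) is the count for pixel value i and the sum runs over all pixel values. With the pixel value on the horizontal axis and the corresponding frequency on the vertical axis, we get the histogram of the grayscale image I.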

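1.1 An example: the histogram of a CT image

Left: the original CT image. Right: the histogram of that image.

1. The pixel values of a CT image range from -1000 to 1000, so we would have to count the frequencies of roughly 2000 distinct pixel values. That is too fine-grained, so

2. Divide the interval from -1000 to 1000 into 20 equal parts, so that every pixel value is mapped to one of 20 values. The immediate consequence is that the image looks less rich, but it makes the computation easier.

3. Count how often each of these 20 values occurs, then divide by the total count to get frequencies. Each frequency lies in [0,1] and they sum to 1.

4. With the 20 pixel values on the horizontal axis and the corresponding frequencies on the vertical axis, we can draw the histogram of this CT image.

End of this example.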
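The four steps of this example can be sketched in a few lines of MATLAB. This is only a minimal illustration under stated assumptions: `ct_slice` holds made-up values, `lo`, `hi` and `img_q` are hypothetical names, and the bin-edge handling (the value 1000 goes into the last bin) is one possible choice. The names `num_img_values`, `vox_val_hist` and `vox_val_probs` are borrowed from the code in section 1.2.

```matlab
% Hypothetical 2D CT slice with values in [-1000, 1000] (made-up data).
ct_slice = [-1000  -950    -10;
                0    40    500;
              990  1000  -1000];

num_img_values = 20;       % number of bins, as in the example
lo = -1000;  hi = 1000;    % assumed intensity range

% Step 2: map every value to a bin index in 1..num_img_values.
bin_width = (hi - lo) / num_img_values;
img_q = floor((ct_slice - lo) / bin_width) + 1;
img_q(img_q > num_img_values) = num_img_values;  % value == hi goes into the last bin
img_q(img_q < 1) = 1;                            % guard against values below lo

% Step 3: count each bin, then turn the counts into frequencies.
vox_val_hist  = accumarray(img_q(:), 1, [num_img_values 1]);
vox_val_probs = vox_val_hist / numel(img_q);
```

For step 4, `bar(1:num_img_values, vox_val_probs)` would draw the histogram with the 20 bins on the horizontal axis.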

1.2 Histogram implementation

%%%%%
%%%%% Histogram-based computations:
%%%%%

% Compute the histogram of the ROI and probability of each voxel value:
vox_val_hist = zeros(num_img_values,1);
for this_vox_value = 1:num_img_values
    vox_val_hist(this_vox_value) = length(find((img_vol_subvol == this_vox_value) & (mask_vol_subvol == 1) ));
end

% Compute the relative probabilities from the histogram:
vox_val_probs = vox_val_hist / num_ROI_voxels;


% Compute the histogram_based metrics:
texture_metrics(1:6) = compute_histogram_metrics(vox_val_probs,num_img_values);

1.3 Histogram-based PET/CT texture features

There are six values:

(1) Mean

(2) Variance

(3) Skewness – set to 0 when σ=0

(4) Kurtosis – set to 0 when σ=0 (NOTE: “Kurtosis” and “Excess Kurtosis” differ in that Excess Kurtosis = Kurtosis – 3).

(5) Energy

(6) Entropy (NOTE: We will differentiate between the various entropy calculations in this document, specifying the distribution from which the entropy is computed)

1.4 Computing the texture features

%%% Overhead:

% The numerical values of each histogram bin:
vox_val_indices = (1:num_img_values)';

% The indices of non-empty histogram bins:
hist_nz_bin_indices = find(vox_val_probs);


%%% (1) Mean 
metrics_vect(1) = sum(vox_val_indices .* vox_val_probs);

%%% (2) Variance
metrics_vect(2) = sum( ((vox_val_indices - metrics_vect(1)).^2) .* vox_val_probs );

%%%%% If the variance is zero, skewness and kurtosis are set to zero:
if metrics_vect(2) > 0
    
    %%% (3) Skewness
    metrics_vect(3) = sum( ((vox_val_indices - metrics_vect(1)).^3) .* vox_val_probs ) / (metrics_vect(2)^(3/2));

    %%% (4) Kurtosis (3 is subtracted below, so the stored value is the excess kurtosis)
    metrics_vect(4) = sum( ((vox_val_indices - metrics_vect(1)).^4) .* vox_val_probs ) / (metrics_vect(2)^2);
    metrics_vect(4) = metrics_vect(4) - 3;
    
else
    
    %%% (3) Skewness
    metrics_vect(3) = 0;
    
    %%% (4) Kurtosis
    metrics_vect(4) = 0;
    
end

%%% (5) Energy
metrics_vect(5) = sum( vox_val_probs .^2 );
%%% (6) Entropy (NOTE: 0*log(0) = 0 for entropy calculations)
metrics_vect(6) = -sum( vox_val_probs(hist_nz_bin_indices) .* log(vox_val_probs(hist_nz_bin_indices)) );

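Note: vox_val_probs is the vector of probabilities in the histogram, and num_img_values is the number of equal parts the pixel values are divided into, i.e. the 20 in the example above.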
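As a quick sanity check of the six formulas, here is a small worked example on a three-bin histogram with one empty bin. The probability vector is made up, and the short variable names (`mu`, `v`, `skw`, `krt`, `enr`, `ent`) are hypothetical stand-ins for `metrics_vect(1)` to `metrics_vect(6)`.

```matlab
% Made-up histogram: bins 1 and 3 are equally likely, bin 2 is empty.
vox_val_probs   = [0.5; 0; 0.5];
num_img_values  = numel(vox_val_probs);
vox_val_indices = (1:num_img_values)';
nz = vox_val_probs > 0;                 % non-empty bins, used for the entropy

mu  = sum(vox_val_indices .* vox_val_probs);                                % (1) mean     = 2
v   = sum(((vox_val_indices - mu).^2) .* vox_val_probs);                    % (2) variance = 1
skw = sum(((vox_val_indices - mu).^3) .* vox_val_probs) / v^(3/2);          % (3) skewness = 0
krt = sum(((vox_val_indices - mu).^4) .* vox_val_probs) / v^2 - 3;          % (4) excess kurtosis = -2
enr = sum(vox_val_probs.^2);                                                % (5) energy   = 0.5
ent = -sum(vox_val_probs(nz) .* log(vox_val_probs(nz)));                    % (6) entropy  = log(2)
```

The empty bin contributes nothing to the entropy, which is why only the non-empty bins are passed to `log`.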


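2. Grey-level co-occurrence matrix (GLCM/GTSDM)

Now that we understand histograms, let's look at what the grey-level co-occurrence matrix (GLCM, also called the grey tone spatial dependence matrix, GTSDM) is. Put simply, if a histogram is a plain probability count over single pixels, giving a one-dimensional vector, then the GLCM is a count of the co-occurrence probability of pairs of pixels, giving a two-dimensional matrix.

Hmm, that probably wasn't clear.

For example, suppose a pixel with value x appears at position A in an image. If a pixel with value y appears at one specific offset from A, then the count at coordinate (x,y) of the GLCM is incremented by one. For a grayscale image the ranges of x and y are fixed (0-255), so the size of this count matrix is fixed too: 256×256. Its entries are the counts, and converting them to frequencies gives the GLCM.

In other words, the GLCM describes how pairs of pixel values are distributed across the image.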

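2.1 Not sure whether that was clear, so here is an example.

Left: the original CT image. Right: the grey-level co-occurrence matrix of that image.

1. The pixel values of a CT image range from -1000 to 1000, so we would have to count the frequencies of roughly 2000 distinct pixel values. That is too fine-grained, so

2. Divide the interval from -1000 to 1000 into 20 equal parts, so that every pixel value is mapped to one of 20 values. The immediate consequence is that the image looks less rich, but it makes the computation easier.

These two steps are the same as for the histogram.

3. Pick a point A in the CT image, at coordinates (i,j), with pixel value x. Point B lies del_i pixels to the right of A and del_j pixels below it, at coordinates (i+del_i, j+del_j), and has pixel value y. Then the count at position (x,y) of the GLCM is incremented by one. Note that x and y are pixel values of the CT image (after the quantization in step 2), whereas i, j, del_i and del_j are positions and offsets. Don't mix them up!

4. Traverse every point in the CT image, counting as in step 3. Note: the two offsets del_i and del_j are set in advance, so they can be treated as constants.

5. Divide each count in the finished matrix by the total count to get frequencies. Each frequency lies in [0,1] and they sum to 1.

6. With the 20 pixel values on both the horizontal and the vertical axis, and the corresponding frequency as the value of each cell, we get a visualization of this CT image's GLCM.

The resulting GLCM thus describes the co-occurrence probability distribution of pairs of pixel values in the original CT image at the fixed offset (del_i, del_j).

End of this example.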
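Steps 3 to 5 can be written out as a pair of loops. The sketch below is illustrative only: `img_q` is a made-up image that is assumed to be already quantized to levels 1..N_g, `glcm_counts` is a hypothetical name, and i is taken to index rows and j columns (MATLAB style), which is an assumption about how the coordinates in the text map onto array indices.

```matlab
% Made-up image, already quantized to grey levels 1..N_g.
img_q = [1 1 2;
         2 3 3;
         1 2 3];
N_g   = 3;
del_i = 0;    % row offset (assumed convention: i indexes rows)
del_j = 1;    % column offset: B is the pixel immediately to the right of A

glcm_counts = zeros(N_g, N_g);
[n_rows, n_cols] = size(img_q);

for i = 1:n_rows
    for j = 1:n_cols
        bi = i + del_i;
        bj = j + del_j;
        % Only count the pair when B lies inside the image.
        if bi >= 1 && bi <= n_rows && bj >= 1 && bj <= n_cols
            x = img_q(i, j);       % grey level at A
            y = img_q(bi, bj);     % grey level at B
            glcm_counts(x, y) = glcm_counts(x, y) + 1;
        end
    end
end

% Step 5: convert the counts into frequencies that sum to 1.
p = glcm_counts / sum(glcm_counts(:));
```

In this toy image there are six valid pairs: (1,1) once, (1,2) twice, (2,3) twice and (3,3) once, so for instance p(1,2) = 1/3.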

2.2 A simple GLCM implementation for 2D images

GLCM2 = graycomatrix(CTimage, 'Offset',[4,4], 'NumLevels',20,'GrayLimits',[]);
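Two details of `graycomatrix` (Image Processing Toolbox) are worth keeping in mind when comparing it with the steps in 2.1: it returns counts rather than frequencies, and its 'Offset' is given as [row_offset col_offset]. A minimal sketch, where `magic(8)` is just a stand-in for a real CT slice and `counts` and `p` are hypothetical names:

```matlab
% Requires the Image Processing Toolbox.
CTimage = magic(8);    % stand-in for a real CT slice (made-up data)

% Offset [0 1]: same row, one column to the right.
% 'GrayLimits', [] scales the grey levels using the image minimum and maximum.
counts = graycomatrix(CTimage, 'Offset', [0 1], 'NumLevels', 20, 'GrayLimits', []);

% graycomatrix returns counts; divide by the total to get frequencies.
p = counts / sum(counts(:));
```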

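2.3 Extending from 2D images to 3D images

For a 3D image the GLCM is computed much as for a 2D image, and the result is still a two-dimensional matrix, because the axes of the GLCM are pixel values and have nothing to do with the dimensionality of the original image. Even for a 4D image the GLCM would still be two-dimensional.

The steps are the same as for a 2D image; only step 3 of the example in 2.1 needs to change:

3. Pick a point A in the 3D CT image, at coordinates (i,j,k), with pixel value x. Point B lies del_i pixels to the right of A, del_j pixels below it and del_k pixels outward, at coordinates (i+del_i, j+del_j, k+del_k), and has pixel value y. Then the count at position (x,y) of the GLCM is incremented by one. Note that x and y are pixel values of the CT image, whereas i, j, k, del_i, del_j and del_k are positions and offsets. Don't mix them up!

As you may have noticed, for a fixed offset magnitude del, each component of the offset can be 0 or ±del, three values in all. For a 2D image that gives 3×3-1 = 8 cases, one for each neighbouring direction.

For a 3D image there are 3×3×3-1 = 26 cases.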
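Both counts, and the fact that a 3D image still yields a 2D GLCM, can be checked with a short sketch. Everything here is illustrative: `steps`, `offsets_2d`, `offsets_3d`, `d` and `glcm_counts` are hypothetical names, and the 3×3×3 volume is made up so that every voxel's grey level equals its row index.

```matlab
del   = 1;                 % assumed offset magnitude
steps = [-del 0 del];      % each offset component is 0 or +/-del

% 2D: every (del_i, del_j) combination except the zero offset.
[di2, dj2] = ndgrid(steps, steps);
offsets_2d = [di2(:) dj2(:)];
offsets_2d(all(offsets_2d == 0, 2), :) = [];

% 3D: every (del_i, del_j, del_k) combination except the zero offset.
[di3, dj3, dk3] = ndgrid(steps, steps, steps);
offsets_3d = [di3(:) dj3(:) dk3(:)];
offsets_3d(all(offsets_3d == 0, 2), :) = [];

% 3D GLCM for a single offset on a made-up quantized volume.
img_q = mod(reshape(0:26, [3 3 3]), 3) + 1;   % grey level = row index
N_g   = 3;
d     = [0 0 1];                              % one step along the third dimension
glcm_counts = zeros(N_g, N_g);
sz = size(img_q);
for i = 1:sz(1)
    for j = 1:sz(2)
        for k = 1:sz(3)
            b = [i j k] + d;                  % position of B
            if all(b >= 1) && all(b <= sz)    % B must lie inside the volume
                x = img_q(i, j, k);
                y = img_q(b(1), b(2), b(3));
                glcm_counts(x, y) = glcm_counts(x, y) + 1;
            end
        end
    end
end
p = glcm_counts / sum(glcm_counts(:));        % still an N_g-by-N_g matrix
```

Because the grey level in this toy volume does not change along the third dimension, all 18 valid pairs land on the diagonal of the GLCM.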

2.4 GLCM-based PET/CT texture features

The list below has 20 entries; entry (14) is not computed, which leaves 19 usable features:

% The first 14 entries in the output are from [Haralick, 1973]:
(1) Angular second moment (called "Energy" in Soh 1999)

(2) Contrast

(3) Correlation

(4) Sum of squares variance


(5) Inverse Difference moment (called "Homogeneity" in [Soh, 1999])

(6) Sum average

(7) Sum variance

(8) Sum Entropy

(9) Entropy

(10) Difference Variance

(11) Difference Entropy

(12) Information Correlation 1

(13) Information Correlation 2

(14) Maximal Correlation Coefficient (not computed; always 0)

% The next five entries in the output are from [Soh, 1999]:
(15) Autocorrelation

(16) Dissimilarity

(17) Cluster Shade

(18) Cluster Prominence

(19) Maximum Probability

% The next entries are from [Clausi, 2002]:
(20) Inverse Difference

Computing the intermediate quantities
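The intermediate quantities themselves are not listed in this text. The following is a minimal sketch of how they could be computed, following the definitions in [Haralick, 1973]; it is not necessarily how the PORTS package does it. The storage conventions are inferred from the code in section 2.5: `p_xpy` has length 2*N_g and is indexed by i+j (so `p_xpy(1)` stays 0), and `p_xmy` has length N_g with |i-j| = 0..N_g-1 stored at index |i-j|+1. The matrix `p` is made up, and `neg_plogp`, `pxpy_prod` and `nz` are hypothetical helper names.

```matlab
% Made-up normalized GLCM (entries sum to 1).
p = [0.2 0.1 0.0;
     0.1 0.3 0.1;
     0.0 0.1 0.1];
N_g = size(p, 1);

% Row and column index grids: ndr(i,j) = i, ndc(i,j) = j.
[ndr, ndc] = ndgrid(1:N_g, 1:N_g);

% Marginal distributions as column vectors.
p_x = sum(p, 2);
p_y = sum(p, 1)';

% Sum distribution: entry k collects all p(i,j) with i + j = k.
p_xpy = accumarray(ndr(:) + ndc(:), p(:), [2*N_g 1]);

% Difference distribution: entry k+1 collects all p(i,j) with |i - j| = k.
p_xmy = accumarray(abs(ndr(:) - ndc(:)) + 1, p(:), [N_g 1]);

% Entropies, using 0*log(0) = 0.
neg_plogp = @(q) -sum(q(q > 0) .* log(q(q > 0)));
HX  = neg_plogp(p_x);
HY  = neg_plogp(p_y);
HXY = neg_plogp(p(:));
SE  = neg_plogp(p_xpy);

% Entropies involving the product of the marginals.
pxpy_prod = p_x * p_y';            % pxpy_prod(i,j) = p_x(i) * p_y(j)
nz   = pxpy_prod > 0;
HXY1 = -sum(p(nz) .* log(pxpy_prod(nz)));
HXY2 = -sum(pxpy_prod(nz) .* log(pxpy_prod(nz)));
```

Two handy checks on such a computation: HXY2 equals HX + HY, and HXY1 is never smaller than HXY.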



2.5 Computing the texture features

% (1) Angular second moment
metrics_vect(1) = sum( p(:).^2 );

% (2) Contrast (for some reason, the paper does not explicitly state p_xmy
% here):
metrics_vect(2) = sum( ((0:(N_g-1))' .^2) .*  p_xmy  );

% (3) Correlation (there is mathematical ambiguity in the nature of the sum as
% stated in the paper ; this version has the means subtracted after the sum is 
% taken, which is the proper method for computation):
mu_x = sum( (1:N_g)' .* p_x );
mu_y = sum( (1:N_g)' .* p_y );
sg_x = sqrt( sum( ( ((1:N_g)' - mu_x).^2 ) .* p_x ) );
sg_y = sqrt( sum( ( ((1:N_g)' - mu_y).^2 ) .* p_y ) );

if (sg_x*sg_y) == 0
    metrics_vect(3) = Inf;
else
    metrics_vect(3) = ( sum(ndr(:) .* ndc(:) .* p(:) ) - (mu_x*mu_y)  ) ./ (sg_x*sg_y);
end

% (4) Sum of squares variance (NOTE: \mu is not defined in the paper, we will
% take it to describe the mean of the normalized GTSDM):
metrics_vect(4) = sum( (( ndr(:) - mean(p(:)) ) .^2) .* p(:) );

% (5) Inverse Difference moment
metrics_vect(5) = sum( ( 1 ./ (1 + ((ndr(:)-ndc(:)).^2) )  ) .* p(:) );

% (6) Sum average
metrics_vect(6) = sum( (1:(2*N_g))' .* p_xpy(:) ); % NOTE: p_xpy(1) = 0 , so adds nothing.

% (7) Sum variance
metrics_vect(7) = sum( (((1:(2*N_g))' - metrics_vect(6)) .^2) .* p_xpy(:));

% (8) Sum Entropy (computed above)
metrics_vect(8) = SE;

% (9) Entropy (computed above)
metrics_vect(9) = HXY;

% (10) Difference Variance
mu_xmy = sum( (0:(N_g-1))' .*  p_xmy );
metrics_vect(10) = sum( (((0:(N_g-1))' - mu_xmy) .^2) .*  p_xmy  );

% (11) Difference Entropy
metrics_vect(11) = -sum( p_xmy(p_xmy>0) .* log(p_xmy(p_xmy>0)) );

% (12) and (13) Information Correlations
if (max(HX,HY)== 0)
    metrics_vect(12) = Inf;
else
    metrics_vect(12) = (HXY - HXY1) / max(HX,HY);
end

metrics_vect(13) = sqrt(1-exp(-2*(HXY2-HXY)) );

% (14) Maximal Correlation Coefficient
%%% I don't think we use it, so I'll only code it up if needed.

%%%%%
%%%%% The following are from Soh (1999)
%%%%%

% (15) Autocorrelation
metrics_vect(15) = sum( (ndr(:) .* ndc(:)) .* p(:) );

% (16) Dissimilarity
metrics_vect(16) = sum( abs(ndr(:) - ndc(:)) .* p(:) );

% (17) Cluster Shade
metrics_vect(17) = sum( (ndr(:) + ndc(:) - mu_x - mu_y) .^3 .* p(:) );

% (18) Cluster Prominence
metrics_vect(18) = sum( (ndr(:) + ndc(:) - mu_x - mu_y) .^4 .* p(:) );

% (19) Maximum Probability
metrics_vect(19) = max( p(:) );

%%%%%
%%%%% The following are from Clausi (2002)
%%%%%

% (20) Inverse Difference:
metrics_vect(20) = sum( ( 1 ./ (1 + abs( ndr(:)-ndc(:) ) )  ) .* p(:) );

3. Neighborhood grey tone difference matrix (NGTDM)

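The NGTDM describes the relationship between a pixel and the values of its surrounding pixels.

3.1 A 2D example

1. The pixel values of a CT image range from -1000 to 1000, so we would have to count the frequencies of roughly 2000 distinct pixel values. That is too fine-grained, so

2. Divide the interval from -1000 to 1000 into 20 equal parts, so that every pixel value is mapped to one of 20 values. The immediate consequence is that the image looks less rich, but it makes the computation easier.

These two steps are the same as in the earlier examples.

3. Pick a point A in the CT image, at coordinates (i,j). In a 2D image A has 8 neighbours, at coordinates (i±1,j±1), (i,j±1) and (i±1,j), and their pixel values lie in the range 1~20 (because of step 2). Let A' be the mean of these 8 pixel values and p_A the pixel value of A. Then:
NGTDM(p_A) = NGTDM(p_A) + abs(p_A-A');
occur(p_A) = occur(p_A) + 1;

4. Traverse every point in the CT image, counting as in step 3. This gives two matrices, NGTDM and occur, both of size 20×1. NGTDM accumulates, for each pixel value, the absolute difference from the mean of its neighbourhood, and occur records how many times each pixel value occurs in the whole CT image.

5. Divide each count in occur by the total count to get frequencies. Each frequency lies in [0,1] and they sum to 1.

6. With the 20 pixel values on the horizontal axis and the corresponding NGTDM and occur values on the vertical axis, draw a bar chart to visualize NGTDM and occur.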
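Steps 3 to 5 translate into a short loop. The sketch below makes two assumptions that the steps do not spell out: border pixels are skipped because they do not have all 8 neighbours (mirroring the full-neighbourhood test in the 3D code of section 3.2), and the image `img_q` is made up and already quantized, here to 3 levels instead of 20. `A_mean` and `occur_probs` are hypothetical names.

```matlab
% Made-up image, already quantized to grey levels 1..num_img_values.
img_q = [1 1 2 2;
         1 2 2 3;
         2 2 3 3;
         2 3 3 3];
num_img_values = 3;

NGTDM = zeros(num_img_values, 1);
occur = zeros(num_img_values, 1);
[n_rows, n_cols] = size(img_q);

% Visit only interior pixels, which have all 8 neighbours (assumption).
for i = 2:(n_rows - 1)
    for j = 2:(n_cols - 1)
        p_A = img_q(i, j);
        ngd = img_q((i-1):(i+1), (j-1):(j+1));   % 3x3 block around A
        A_mean = (sum(ngd(:)) - p_A) / 8;        % mean of the 8 neighbours

        NGTDM(p_A) = NGTDM(p_A) + abs(p_A - A_mean);
        occur(p_A) = occur(p_A) + 1;
    end
end

% Step 5: convert the occurrence counts into frequencies.
occur_probs = occur / sum(occur);
```

Here the four interior pixels have values 2, 2, 2 and 3, so NGTDM ends up as [0; 0.75; 0.375] and occur as [0; 3; 1].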

3.2 3D NGTDM implementation

function [NGTDM,vox_occurances_NGD26] = compute_3D_NGTDM(ROI_vol,img_vol,binary_dir_connectivity,num_img_values)

% Placeholder for the NGTDM and the number of occurrences with full NGDs:
NGTDM = zeros(num_img_values,1);
vox_occurances_NGD26 = zeros(num_img_values,1);

% Record the indices of the voxels used in the ROI:
ROI_voxel_indices = find(ROI_vol);

% Loop over each voxel in the ROI sub-volume:
for this_ROI_voxel = 1:length(ROI_voxel_indices)
    
    % The index of this voxel in the sub-volume:
    this_voxel_index = ROI_voxel_indices(this_ROI_voxel);
    
    % This voxel must have 26 neighbors (plus itself) to be considered:
    if sum(binary_dir_connectivity{this_ROI_voxel}(:)) == 27
        
        % Determine the [r,c,s] of this voxel:
        [r,c,s] = ind2sub(size(ROI_vol),this_voxel_index);
        
        % Compute the mean value around this voxel:
        this_vox_val = img_vol(this_voxel_index);
        vox_ngd = img_vol((r-1):(r+1) , (c-1):(c+1) , (s-1):(s+1));
        vox_ngd_sum = sum(vox_ngd(:)) - this_vox_val;
        vox_ngd_mean = vox_ngd_sum / 26;
        
        % Add this value to the matrix:
        NGTDM(this_vox_val) = NGTDM(this_vox_val) + abs(this_vox_val-vox_ngd_mean);        
        
        % Increment the number of occurrences of this voxel value:
        vox_occurances_NGD26(this_vox_val) = vox_occurances_NGD26(this_vox_val) + 1;        
    
    end % Test for full neighborhood    
    
end % Loop over ROI voxels

3.3 NGTDM-based PET/CT texture features

(1) Coarseness

(2) Contrast

(3) Busyness

(4) Complexity

(5) Texture Strength

3.4 Computing the texture features

%%% (1) Coarseness
metrics_vect(1) = sum( vox_val_probs .* NGTDM );

% It's the reciprocal, so test for zero denominator:
if metrics_vect(1) == 0
    metrics_vect(1) = Inf;
else
    metrics_vect(1) = 1/metrics_vect(1);
end

%%% (2) Contrast
if N_g > 1 % There are voxel value differences, so perform the calculations as normal:
    % The first term in equation (4):
    first_term_mat = (vox_val_probs * vox_val_probs') .* ( (nd_r-nd_c).^2 );
    first_term_val = sum(first_term_mat(:)) / (N_g * (N_g-1) );

    % The second term in equation (4). Note that the 3D computation
    % necessitates normalization by the number of voxels instead of the n^2 that appears in
    % equation (4). 
    second_term_val = sum(NGTDM) / sum(vox_occurances_NGD26);

    % Record the value:
    metrics_vect(2) = first_term_val * second_term_val;
    
else % There is only a single color, so no contrast to compute, so set to negative:
        metrics_vect(2) = -1;
end 

%%% (3) Busyness
% NOTE: The denominator as written in [Amadasun, 1989] equals zero. The absolute
% value inside the double sum is used here, in accordance with
%
% Texture Analysis Methods - A Review
% Andrzej Materka and Michal Strzelecki (1998)
%
first_term = sum(vox_val_probs .* NGTDM);

second_term_mat = (nd_nz_r .* nd_nz_p_r) - (nd_nz_c .* nd_nz_p_c);
second_term = sum(abs(second_term_mat(:)));

if second_term == 0
    metrics_vect(3) = Inf;
else 
    metrics_vect(3) = first_term / second_term;
end

%%% (4) Complexity
first_term_num = abs(nd_nz_r - nd_nz_c);
first_term_den = nd_nz_p_r + nd_nz_p_c;

second_term = (nd_nz_p_r .* nd_nz_NGTDMop_r) + (nd_nz_p_c .* nd_nz_NGTDMop_c);

if second_term == 0
    metrics_vect(4) = Inf;
else
    tmp = first_term_num(:) .* second_term(:) ;
    tmp = sum(tmp ./ first_term_den(:)) ;
    metrics_vect(4) = tmp / sum(vox_occurances_NGD26) ;
end

%%% (5) Texture Strength
first_term_mat = (nd_nz_p_r+nd_nz_p_c) .* ( (nd_nz_r - nd_nz_c) .^2 );
first_term = sum(first_term_mat(:));
second_term = sum(NGTDM);

if second_term == 0
    metrics_vect(5) = Inf;
else
    metrics_vect(5) = first_term / second_term ;
end
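The code above uses several helper variables that are not defined in this text (`N_g`, `nd_r`, `nd_c` and the `nd_nz_*` grids). The sketch below shows one plausible way to build them. It is a guess based on the variable names and on how they are used, not the PORTS definitions: `N_g` is assumed to be the number of grey levels that actually occur, and the `nd_nz_*` arrays are assumed to be index, probability and NGTDM grids restricted to those levels. The input numbers are made up, and `nz_levels` and `coarseness` are hypothetical names.

```matlab
% Made-up NGTDM and occurrence counts for 3 grey levels.
NGTDM                = [0; 0.75; 0.375];
vox_occurances_NGD26 = [0; 3; 1];

num_img_values = numel(NGTDM);
vox_val_probs  = vox_occurances_NGD26 / sum(vox_occurances_NGD26);

% Assumption: N_g counts only the grey levels that occur in the ROI.
nz_levels = find(vox_val_probs > 0);
N_g = numel(nz_levels);

% Assumption: index grids over all grey levels (used by Contrast).
[nd_r, nd_c] = ndgrid(1:num_img_values, 1:num_img_values);

% Assumption: the same grids restricted to the occurring grey levels,
% together with the matching probabilities and NGTDM values.
[nd_nz_r, nd_nz_c] = ndgrid(nz_levels, nz_levels);
nd_nz_p_r       = vox_val_probs(nd_nz_r);
nd_nz_p_c       = vox_val_probs(nd_nz_c);
nd_nz_NGTDMop_r = NGTDM(nd_nz_r);
nd_nz_NGTDMop_c = NGTDM(nd_nz_c);

% With these in place, Coarseness follows directly from the code above.
coarseness = 1 / sum(vox_val_probs .* NGTDM);
```

Restricting the grids to occurring levels keeps the denominators in Complexity (`nd_nz_p_r + nd_nz_p_c`) strictly positive, which is presumably why the original code distinguishes them from `nd_r` and `nd_c`.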

4. Grey Level Zone Size Matrix (GLZSM)

4.1 GLZSM-based PET/CT texture features

(1) Small Zone Size Emphasis

(2) Large Zone Size Emphasis

(3) Low Gray-Level Zone Emphasis

(4) High Gray-Level Zone Emphasis

(5) Small Zone / Low Gray Emphasis

(6) Small Zone / High Gray Emphasis

(7) Large Zone / Low Gray Emphasis

(8) Large Zone / High Gray Emphasis

(9) Gray-Level Non-Uniformity

(10) Zone Size Non-Uniformity

(11) Zone Size Percentage
