Google's R Style Guide

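R is a high-level programming language used primarily for statistical computing and graphics. The goal of the R Programming Style Guide is to make our R code easier to read, share, and verify. The rules below were designed in collaboration with the entire R user community at Google.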

Notation and Naming

File names

File names should end in .R and, of course, be meaningful.
GOOD: predict_ad_revenue.R
BAD: foo.R

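Identifiers

Don't use underscores (_) or hyphens (-) in identifiers. Identifiers should be named according to the following conventions. The preferred form for variable names is all lower case letters with words separated by dots (variable.name), but variableName is also accepted. Function names have initial capital letters and no dots (FunctionName).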

variable.name is preferred, variableName is accepted
GOOD: avg.clicks
OK: avgClicks
BAD: avg_Clicks

FunctionName
GOOD: CalculateAvgClicks
BAD: calculate_avg_clicks, calculateAvgClicks
Make function names verbs.
Exception: When creating a classed object, the function name (constructor) and class should match (e.g., lm).
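
As a small illustration of the naming conventions above, the sketch below defines one variable and one function in the preferred style. The data values and the body of CalculateAvgClicks are assumptions made up for this example; the guide itself only gives the names.

```r
# Hypothetical data: daily click counts (illustrative values only).
clicks.per.day <- c(120, 95, 143, 110)

# Function name: initial capital letters, no dots, and a verb.
CalculateAvgClicks <- function(clicks) {
  # Computes the arithmetic mean of a numeric vector of click counts.
  return(mean(clicks))
}

# Variable name: lower case, words separated by dots.
avg.clicks <- CalculateAvgClicks(clicks.per.day)
```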

Syntax

Line Length

The maximum line length is 80 characters.

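Indentation

When indenting your code, use two spaces. Never use tabs or mix tabs and spaces. Exception: When a line break occurs inside parentheses, align the wrapped line with the first character inside the parentheses.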
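
The indentation rule and its exception can be sketched as follows. SummarizeSpend and its arguments are hypothetical names used only to show the layout: the body is indented by two spaces, while wrapped arguments line up with the first character inside the parentheses.

```r
# Hypothetical helper used only to illustrate indentation.
SummarizeSpend <- function(spend, budget,
                           digits = 2) {
  # Two-space indentation inside the function body.
  pct.spent <- spend / budget
  if (pct.spent > 1) {
    pct.spent <- 1
  }
  # Wrapped arguments align with the first character inside the parentheses.
  result <- round(pct.spent,
                  digits = digits)
  return(result)
}

fraction.spent <- SummarizeSpend(45, 60)
```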

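Spacing

Place spaces around all binary operators (=, +, -, <, etc.). Exception: Spaces around = are not required when passing parameters in a function call.
Do not place a space before a comma, but always place one after a comma.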
GOOD:

tab.prior <- table(df[df$days.from.opt < 0, "campaign.id"]) 
total <- sum(x[, 1]) 
total <- sum(x[1, ])

BAD:

tab.prior <- table(df[df$days.from.opt<0, "campaign.id"])  # Needs spaces around '<'
tab.prior <- table(df[df$days.from.opt < 0,"campaign.id"])  # Needs a space after the comma
tab.prior<- table(df[df$days.from.opt < 0, "campaign.id"])  # Needs a space before <-
tab.prior<-table(df[df$days.from.opt < 0, "campaign.id"])  # Needs spaces around <-
total <- sum(x[,1])  # Needs a space after the comma
total <- sum(x[ ,1])  # Needs a space after the comma, not before

Place a space before a left parenthesis, except in a function call.
GOOD:

if (debug)

BAD:

if(debug)

Extra spacing (i.e., more than one space in a row) is okay if it improves alignment of equals signs or arrows (<-).

plot(x    = x.coord,     
     y    = data.mat[, MakeColName(metric, ptiles[1], "roiOpt")],     
     xlab = "dates", 
     ylab = metric,
     main = (paste(metric, " for 3 samples ", sep = "")))

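Curly Braces

An opening curly brace should never go on its own line; a closing curly brace should always go on its own line. You may omit curly braces when a block consists of a single statement; however, you must consistently either use or not use curly braces for single-statement blocks.

if (is.null(ylim)) {
  ylim <- c(0, 0.06)
}

xor (but not both)

if (is.null(ylim))
  ylim <- c(0, 0.06)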

Always begin the body of a block on a new line.
BAD:

if (is.null(ylim)) ylim <- c(0, 0.06) 
if (is.null(ylim)) {ylim <- c(0, 0.06)}

Curly braces and else
An else statement should always be surrounded on the same line by curly braces.
Good:

if (condition) {  
  one or more lines 
} else {  
  one or more lines 
}

Bad:

if (condition) {  
  one or more lines 
} 
else {  
  one or more lines 
}

Assignment

Use <-, not =, for assignment.
GOOD:

x <- 5

BAD:

x = 5

Organization

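General Layout and Ordering

If everyone uses the same general ordering, we'll be able to read and understand each other's scripts faster and more easily. A file should generally be laid out in this order:

  1. Copyright statement comment
  2. Author comment
  3. File description comment, including the purpose of the program, its inputs, and its outputs
  4. source() and library() statements
  5. Function definitions
  6. Executed statements

Unit tests should go in a separate file named originalfilename_test.R.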
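
The ordering above might look like this in a small script. Everything in the sketch is a placeholder chosen for illustration: the copyright and author lines, the CalculateAvgClicks function, and the sample data are not from the guide.

```r
# Copyright (c) <year> <owner>  (placeholder copyright statement)
# Author: <author>  (placeholder author comment)
# Purpose: Reports the average daily clicks for a campaign.
# Input:   A numeric vector of daily click counts.
# Output:  A one-line summary printed to the console.

library(stats)  # source() and library() statements come before functions

# Function definitions
CalculateAvgClicks <- function(clicks) {
  # Computes the mean of a numeric vector of click counts.
  return(mean(clicks))
}

# Executed statements
daily.clicks <- c(10, 20, 30)
avg.clicks <- CalculateAvgClicks(daily.clicks)
cat("Average clicks = ", avg.clicks, "\n", sep = "")
```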

Commenting Guidelines

Short comments can be placed after the code, preceded by two spaces, #, and then one space; longer comments can go on their own line.

# Create histogram of frequency of campaigns by pct budget spent. 
hist(df$pct.spent,    
     breaks = "scott",  # method for choosing number of buckets     
     xlab   = "Fraction of budget spent",     
     ylab   = "Frequency (count of campaignids)")

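Function Definitions and Calls

Function definitions should first list arguments without default values, followed by those with default values.
In both function definitions and function calls, multiple arguments per line are allowed, but line breaks are only allowed between arguments.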
GOOD:

PredictCTR <- function(query, property, num.days,                                              
                       show.plot = TRUE)

BAD:

PredictCTR <- function(query, property, num.days, show.plot =     
                       TRUE)

Ideally, unit tests should serve as sample function calls (for shared library routines).
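
One way to read this advice is that each test should show a realistic call. The sketch below uses only base R's stopifnot() because the guide does not name a test framework; ClampFraction and TestClampFraction are hypothetical names, and the function is repeated inside the test file only to keep the example self-contained.

```r
# Contents of a hypothetical file clamp_fraction_test.R.
ClampFraction <- function(x, lower = 0, upper = 1) {
  # Restricts each element of a numeric vector to the range [lower, upper].
  return(pmin(pmax(x, lower), upper))
}

TestClampFraction <- function() {
  # Each check doubles as a sample call showing how to use ClampFraction.
  stopifnot(isTRUE(all.equal(ClampFraction(c(-0.5, 0.25, 1.5)),
                             c(0, 0.25, 1))))
  stopifnot(isTRUE(all.equal(ClampFraction(5, upper = 10), 5)))
  return(TRUE)
}

test.passed <- TestClampFraction()
```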

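Function Documentation

Functions should contain a comments section immediately below the function definition line. These comments should consist of a one-sentence description of the function, a list of the function's arguments with a description of each (including the data type), and a description of the return value. The comments should be descriptive enough that a caller can use the function by reading the comments alone.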

Example

CalculateSampleCovariance <- function(x, y, verbose = TRUE) {  
  # Computes the sample covariance between two vectors.  
  #  
  # Args:  
  #   x: One of two vectors whose sample covariance is to be calculated.  
  #   y: The other vector. x and y must have the same length, greater than one,  
  #      with no missing values.  
  #   verbose: If TRUE, prints sample covariance; if not, not. Default is TRUE.  
  # 
  # Returns:  
  #   The sample covariance between x and y.  

  n <- length(x)  
  # Error handling  
  if (n <= 1 || n != length(y)) {
    stop("Arguments x and y must have the same length, greater than one: ",
         length(x), " and ", length(y), ".")
  }
  if (any(is.na(x)) || any(is.na(y))) {
    stop("Arguments x and y must not have missing values.")
  }
  covariance <- var(x, y)  
  if (verbose)    
    cat("Covariance = ", round(covariance, 4), ".\n", sep = "")  
  return(covariance) 
}

Functions

Errors should be raised using stop().
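
A minimal sketch of raising an error with stop(), and of a caller intercepting it with tryCatch(). CalculateClickRate and its arguments are hypothetical and exist only for this example.

```r
CalculateClickRate <- function(clicks, impressions) {
  # Computes clicks divided by impressions.
  if (impressions <= 0) {
    stop("Argument impressions must be positive, got ", impressions, ".")
  }
  return(clicks / impressions)
}

# A caller can intercept the error and inspect its message.
result <- tryCatch(CalculateClickRate(5, 0),
                   error = function(e) conditionMessage(e))
```

Here result holds the error message as a character string instead of the script halting, which is one option when a caller wants to log a failure and continue.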

Parting Words

Use common sense and BE CONSISTENT.

If you are editing code, take a few minutes to look at the code around you and determine its style. If others use spaces around their if clauses, you should, too. If their comments have little boxes of stars around them, make your comments have little boxes of stars around them, too.

The point of having style guidelines is to have a common vocabulary of coding so people can concentrate on what you are saying, rather than on how you are saying it. We present global style rules here so people know the vocabulary. But local style is also important. If code you add to a file looks drastically different from the existing code around it, the discontinuity will throw readers out of their rhythm when they go to read it. Try to avoid this.

OK, enough writing about writing code; the code itself is much more interesting. Have fun!
