Biostar Study Notes (2)

  1. If you want to practice at your own pace, you can download related materials from the following website: Biostar data site.
  2. I have only skimmed the whole book so far. It is a good book for beginners: it explains what bioinformatics is and teaches some basic commands. For readers who want to go further, the authors list useful resources and links. The book guides you through bioinformatics systematically, and the best way to learn is by practicing.
  3. What is Bioinformatics?
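    Making sense of biological data by using computational methods. Most bioinformatics work falls into the following four categories:
    • Assembly: genome assembly, building a new genome
    • Resequencing: aligning sequences against a known genome to identify mutations and variation
    • Classification: determining the species composition of a population of organisms
    • Quantification: using DNA sequencing to measure the functional characteristics of a cell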
  4. "pwd": show the current working directory. To reuse the returned value, store it in a variable, for example: DATA_PATH=${PWD}
  5. "|": the pipe sign. Very useful when a simple goal takes several steps, because each step's output can be passed to the next one through a pipe.
  6. Keep folders well organized so that they are easy to remember and use.
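Items 4 to 6 can be tried together. The following is a minimal sketch, assuming a throwaway project directory created with mktemp; the folder names (data, results) and file names are hypothetical and only there for illustration.

```shell
# Hypothetical project layout inside a throwaway directory
PROJECT=$(mktemp -d)
mkdir -p "$PROJECT/data" "$PROJECT/results"
cd "$PROJECT/data"

# Store the current directory so later commands can refer to it
DATA_PATH=${PWD}

# Create placeholder files, then use a pipe to count only the FASTQ ones
touch sample_1.fq sample_2.fq notes.txt
N_FASTQ=$(ls "$DATA_PATH" | grep -c '\.fq$')
echo "$N_FASTQ FASTQ files in $DATA_PATH"
```

Keeping raw data and results in separate folders, as sketched here, makes it harder to overwrite input files by accident.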
  7. parallel: use multiple processes to finish similar tasks, e.g.:
mkdir -p ~/tmp/fastq && cd ~/tmp/fastq
touch GSE89245.txt
for i in $(seq -w 86 95); do echo "SRP0921$i" >> GSE89245.txt; done
# seq -w pads every number with leading zeros to the width of the widest one (compare "seq -w 1 10" with "seq 1 10")
# read the ID list written above; --outdir sets the output directory
cat GSE89245.txt | parallel fastq-dump --outdir sra --split-files {}
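Before starting real downloads, it may help to see what "seq -w" produces and which commands would be launched. The sketch below is a dry run only: it prints the commands instead of running them, and it uses xargs as a stand-in for parallel because xargs ships with most systems. The three IDs follow the same pattern as above.

```shell
# Plain seq: no padding
seq 8 10       # prints 8, 9, 10 on separate lines
# seq -w: pad with leading zeros to the width of the widest number
seq -w 8 10    # prints 08, 09, 10 on separate lines

# Build a short ID list with the same prefix as in the notes
IDS=$(for i in $(seq -w 86 88); do echo "SRP0921$i"; done)

# Dry run: echo one command per ID instead of executing fastq-dump
CMDS=$(echo "$IDS" | xargs -n 1 echo fastq-dump --split-files)
echo "$CMDS"
```

Once the printed commands look right, the echo can be dropped and xargs replaced by parallel to run the jobs concurrently.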
  8. View and combine files
# for regular files
cat file1 file2 file... >> bigfile
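# for gzipped files: concatenated gzip files are still valid gzip
cat file1.gz file2.gz filen.gz >> bigfile.gz
# or decompress while combining
zcat file1.gz file2.gz filen.gz >> bigfile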
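Gzip files can be joined byte for byte with cat, and the result still decompresses to the combined content. A small sketch, assuming two made-up one-line files in a temporary directory; "gzip -dc" is used here for decompression because it behaves the same on Linux and macOS.

```shell
WORK=$(mktemp -d)
cd "$WORK"
printf 'line from file1\n' > file1
printf 'line from file2\n' > file2
gzip file1 file2                  # produces file1.gz and file2.gz

# Join the compressed files directly, without decompressing them first
cat file1.gz file2.gz > bigfile.gz

# Decompress to standard output and keep the text for inspection
COMBINED=$(gzip -dc bigfile.gz)
echo "$COMBINED"
```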
  9. The $PATH environment variable
echo $PATH
# append the export line to ~/.bashrc so that it applies to every new shell
echo 'export PATH=/file/path/of/real/programs:$PATH' >> ~/.bashrc
source ~/.bashrc
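To see the effect of $PATH without touching ~/.bashrc, the change can be tried in the current shell session first. This is a sketch under assumptions: the directory is a temporary one and "mytool" is a hypothetical script written just for the demonstration.

```shell
# Hypothetical tool directory holding one tiny script
TOOLS=$(mktemp -d)
printf '#!/bin/sh\necho "hello from mytool"\n' > "$TOOLS/mytool"
chmod +x "$TOOLS/mytool"

# Prepend the directory to PATH for the current session only
export PATH="$TOOLS:$PATH"

# The shell can now find the script by name
FOUND=$(command -v mytool)
OUTPUT=$(mytool)
echo "$FOUND"
echo "$OUTPUT"
```

Putting the new directory first means its programs take priority over programs with the same name found later in $PATH.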
  10. "grep" command: usually combined with "cat" or "zcat", "|", and "cut -f" to extract certain columns and pass the values to downstream analysis
man grep
# count the lines whose columns 2-4 mention ORF but not Dubious
cat SGD_features.tab | cut -f 2,3,4 | grep ORF | grep -v Dubious | wc -l
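The same pipeline can be checked on a tiny table where the answer is easy to verify by eye. The four rows below are made up for illustration and are not taken from SGD_features.tab; only the column idea (ID, feature type, qualifier, name) is borrowed.

```shell
WORK=$(mktemp -d)
# Made-up, tab-separated rows: ID, feature type, qualifier, name
printf 'S1\tORF\tVerified\tgeneA\n'        >  "$WORK/features.tab"
printf 'S2\tORF\tDubious\tgeneB\n'         >> "$WORK/features.tab"
printf 'S3\ttRNA_gene\t\tgeneC\n'          >> "$WORK/features.tab"
printf 'S4\tORF\tUncharacterized\tgeneD\n' >> "$WORK/features.tab"

# Keep columns 2-4, keep ORF lines, drop Dubious ones, count what is left
N_ORF=$(cut -f 2,3,4 "$WORK/features.tab" | grep ORF | grep -v Dubious | wc -l | tr -d ' ')
echo "$N_ORF"
```

Note that "grep ORF" matches the text anywhere in the selected columns, so a gene name containing "ORF" would also be counted.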
  11. "sed": replace strings with new values. Very useful when renaming multiple files with similar patterns.
man sed
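As a sketch of the renaming use case, the loop below builds each new file name with sed and then moves the file. The file names and the ".fastq" to ".fq" change are hypothetical examples.

```shell
WORK=$(mktemp -d)
cd "$WORK"
touch sample_1.fastq sample_2.fastq sample_3.fastq

for old in sample_*.fastq; do
    # s/pattern/replacement/ rewrites the trailing .fastq as .fq
    new=$(echo "$old" | sed 's/\.fastq$/.fq/')
    mv "$old" "$new"
done

RENAMED=$(ls)
echo "$RENAMED"
```

Replacing "mv" with "echo mv" on a first pass prints the planned renames without changing anything.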
  12. "awk" command: it is a little complicated, so use online resources to learn more.
man awk
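A small taste of awk, as a sketch on made-up data: awk reads a table line by line, splits each line into fields ($1, $2, ...), and runs an action on the lines that satisfy a condition. The gene names and coordinates below are invented.

```shell
WORK=$(mktemp -d)
# Made-up, tab-separated table: name, start, end
printf 'geneA\t100\t400\n'  >  "$WORK/genes.tab"
printf 'geneB\t500\t650\n'  >> "$WORK/genes.tab"
printf 'geneC\t700\t1900\n' >> "$WORK/genes.tab"

# For rows where end minus start exceeds 200, print the name and that difference
LONG=$(awk -F '\t' '($3 - $2) > 200 { print $1, $3 - $2 }' "$WORK/genes.tab")
echo "$LONG"
```

The part before the braces is the condition and the part inside the braces is the action, which covers much of everyday awk use.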

Since I spent most of my time skimming through this book, I will write more about the grep/sed/awk commands in the future. I hope you find these notes useful.
