plink

# Convert a 23andMe-format SNP file to bed/bim/fam
plink --23file JPT-NA19001.snp JPT ID002 --out JPT-NA19001

# Remove problematic SNPs
plink --bfile JPT-NA19001 --exclude merge.missnp --make-bed --out new

# Merge a single fileset
plink --bfile source1 --bmerge source2_trial --make-bed --out merged_trial

# Merge multiple filesets
plink --merge-list merge_list --make-bed --out merge

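Programming philosophy

1. The essence of object-oriented programming is that each kind of data carries its own operations, so users do not need to understand how to manipulate a complex data structure; they only need to learn the interface of that data.

2. Generic programming lets a single algorithm work across many data types, so the function does not have to be overloaded again for each type.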
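As a minimal sketch of point 1, the C++ fragment below defines a hypothetical WordCounter class. The class name, the method names add() and count(), and the choice of std::map are all assumptions made for illustration; the point is only that callers use the interface while the container stays private.

```cpp
#include <cstddef>
#include <map>
#include <string>

// Hypothetical class for illustration: callers see only add() and count();
// the std::map that stores the tallies is a private detail.
class WordCounter {
  public:
    // Record one occurrence of word.
    void add(const std::string& word) { ++counts_[word]; }

    // Return how many times word has been added (0 if it was never added).
    std::size_t count(const std::string& word) const {
        std::map<std::string, std::size_t>::const_iterator it = counts_.find(word);
        return it == counts_.end() ? 0 : it->second;
    }

  private:
    std::map<std::string, std::size_t> counts_;  // hidden data structure
};
```

A caller writes wc.add("hox") and later wc.count("hox") without ever touching the map, so the container could be replaced with a different one without changing any calling code.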

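C++ templates and generic programming

"Generic programming aims to write code that is independent of data types." C++ Primer Plus (6th ed.)

Implement a method once so that it can be used with data of any type.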

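#include <iostream>

using namespace std;

// A fixed-capacity stack; Nott is the element type chosen by the caller.
template <class Nott>
class Stack{
  private:
    static const int CAPACITY = 20;
    Nott arr[CAPACITY];  // element storage
    int num;             // number of elements currently stored
  public:
    Stack();
    void push(const Nott& ele);
    void print() const;
};

// Start with an empty stack.
template <class Nott>
Stack<Nott>::Stack(){
  num = 0;
}

// Add ele on top; a push onto a full stack is ignored.
template <class Nott>
void Stack<Nott>::push(const Nott& ele){
  if(num >= CAPACITY) return;  // guard against writing past the array
  arr[num] = ele;
  num ++;
}

// Print the elements from bottom to top, separated by spaces.
template <class Nott>
void Stack<Nott>::print() const{
  for(int i = 0;i < num;i ++)
    cout << arr[i] << " ";
  cout << endl;
}

int main(){
  Stack<char> nott;   // the template instantiated for char
  nott.push('N');
  nott.push('O');
  nott.push('T');
  nott.push('T');
  nott.print();
  Stack<int> nottt;   // the same template instantiated for int
  nottt.push(6);
  nottt.push(6);
  nottt.push(6);
  nottt.push(6);
  nottt.print();
  return 0;
}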

Output:

N O T T
6 6 6 6
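
Templates apply to free functions as well as to classes. The sketch below assumes a hypothetical helper named largest, which is not from the book; it only requires that the element type supports operator<, and it illustrates writing one algorithm instead of one overload per type.

```cpp
#include <string>

// Hypothetical helper for illustration: returns the larger of two values
// of any type T that supports operator<. If neither is smaller, a is returned.
template <class T>
T largest(const T& a, const T& b) {
    return (a < b) ? b : a;
}
```

The same definition serves largest(3, 7), largest(2.5, 1.5), and a call with two std::string arguments; the compiler generates a separate instantiation for each type on demand.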

HOX gene

ref: 2013, The regulation of Hox gene expression during animal development

GATK caveat

1. Selecting / filtering

VariantFiltration: Filter variant calls based on INFO and/or FORMAT annotations
output: A filtered VCF in which passing variants are annotated as PASS and failing variants are annotated with the name(s) of the filter(s) they failed.
SelectVariants: Select a subset of variants from a VCF file.
output:
1. If a value is missing, VariantFiltration treats the record containing it as passing the check, whereas SelectVariants treats that record as failing the check.

2.foobar

 

Notes on ANNOVAR

1. Coordinate system: By default, a 1-based coordinate system is used.

2. Core program: annotate_variation.pl

3. Annotation types: gene-based (-geneanno), region-based (-regionanno) and filter-based (-filter) annotations.

4. Output:

a. The first file contains annotation for all variants, by adding two columns to the beginning of each input line.

b. The second output file contains the amino acid changes as a result of the exonic variant.

5. Key pointers:

What about a GFF3 file for a new species? (http://annovar.openbioinformatics.org/en/latest/user-guide/gene/)

gff3ToGenePred (http://hgdownload.soe.ucsc.edu/admin/exe/linux.x86_64/)

Saccharomyces cerevisiae (yeast)

The budding yeast Saccharomyces cerevisiae is one of the major model organisms for understanding cellular and molecular processes in eukaryotes. This single-celled organism is also important in industry, where it is used to make bread, beer, wine, enzymes, and pharmaceuticals. The Saccharomyces cerevisiae genome is approximately 12 Mb, organized in 16 chromosomes.

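Shell script path problem

How can a shell script find out where it is located? Some commands work when the script is run with sh but fail when the script is run with source. The cases are discussed in detail below.

1. Passing arguments

sh script.sh para1 or source script.sh para1

Inside the script, para1 is available as $1.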

2.