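GCE Figure Gallery

Figure 1

A. As the read error rate rises, more and more reads become invalid, so the error peak appears where there was none and then grows taller;

B. Because more and more reads turn into erroneous reads, the share of correct reads falls and the main peak gets lower;

C. The share of correct reads falls because their number decreases, which in turn lowers their coverage of the genome, so the main peak shifts left.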
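
Points A to C can be made concrete with a small calculation, if the figure is read at the k-mer level (an assumption; the figure itself speaks of reads). Under the simplifying assumption that errors hit each base independently at rate e, a k-mer is error-free with probability (1 - e)^k. The sketch below uses k = 17 and a few error rates chosen purely for illustration; they are not values taken from GCE.

```python
def error_free_fraction(error_rate: float, k: int) -> float:
    """Probability that a k-mer contains no sequencing error,
    assuming independent per-base errors (a simplification)."""
    if not 0.0 <= error_rate <= 1.0:
        raise ValueError("error_rate must be between 0 and 1")
    if k < 1:
        raise ValueError("k must be at least 1")
    return (1.0 - error_rate) ** k


if __name__ == "__main__":
    # Illustrative error rates only; k = 17 is a hypothetical choice.
    for e in (0.0, 0.01, 0.02, 0.05):
        share = error_free_fraction(e, 17)
        print(f"error rate {e:.2f}: {share:.3f} of 17-mers are error-free")
```

The share of error-free k-mers shrinks as the error rate grows, which is consistent with a lower main peak that sits at a lower depth.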

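Figure 2

A. The depth at the k-mer peak is slightly lower than the depth at which the data covers the genome, so the peak shifts left;

B. The k-mer peak is somewhat taller; what this means in practice is yet to be determined.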
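
The left shift in point A agrees with the usual relation between base depth and k-mer depth: a read of length L yields L - K + 1 k-mers, so the expected k-mer depth is the base depth multiplied by (L - K + 1) / L. The following is a minimal sketch of that relation; the read length of 100 and K = 17 are illustrative assumptions, not values from the figure.

```python
def kmer_depth(base_depth: float, read_len: int, k: int) -> float:
    """Expected k-mer depth for a given base (sequencing) depth.

    Each read of length read_len contributes read_len - k + 1 k-mers.
    """
    if k < 1 or k > read_len:
        raise ValueError("k must satisfy 1 <= k <= read_len")
    return base_depth * (read_len - k + 1) / read_len


if __name__ == "__main__":
    # Hypothetical numbers: 30x base depth, 100 bp reads, 17-mers.
    print(kmer_depth(30.0, 100, 17))
```

The larger K is relative to the read length, the further the k-mer peak sits to the left of the base depth.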

 

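Lock-Free Non-Blocking Algorithms: CAS (Compare-And-Swap)

Definition

CAS works as follows. When a thread wants to modify a variable that other threads can also access, and the new value depends on the old one, it first checks whether the variable's current value equals the old value the thread has stored. If they are equal, the variable is set to the new value. Otherwise the attempt fails: the thread updates its stored old value to the current value and then tries the modification again.

Any discussion of multithreading has to deal with locks, and CAS is a form of optimistic locking. Optimistic locking means not locking the data at all, and instead using a check to decide whether the value can be modified.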
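
The retry loop described above can be sketched in Python. Python has no built-in atomic CAS, so the AtomicInt class below is a hypothetical toy: its internal lock only stands in for the single hardware instruction that a real CAS would be. What the sketch is meant to show is the caller's side: compare, swap on a match, otherwise refresh the remembered old value and try again.

```python
import threading


class AtomicInt:
    """Toy atomic integer (hypothetical). The internal lock only models
    the atomicity that a hardware CAS instruction provides."""

    def __init__(self, value: int = 0):
        self._value = value
        self._guard = threading.Lock()

    def load(self) -> int:
        with self._guard:
            return self._value

    def compare_and_swap(self, expected: int, new: int):
        """Store `new` only if the current value equals `expected`.
        Returns (success, value_seen)."""
        with self._guard:
            seen = self._value
            if seen == expected:
                self._value = new
                return True, seen
            return False, seen


def cas_increment(counter: AtomicInt) -> int:
    """Increment without holding a lock across the read-modify-write."""
    expected = counter.load()
    while True:
        ok, seen = counter.compare_and_swap(expected, expected + 1)
        if ok:
            return expected + 1
        # Another thread got there first: refresh the old value and retry.
        expected = seen


def run_demo(n_threads: int = 4, n_increments: int = 1000) -> int:
    counter = AtomicInt(0)

    def worker():
        for _ in range(n_increments):
            cas_increment(counter)

    threads = [threading.Thread(target=worker) for _ in range(n_threads)]
    for t in threads:
        t.start()
    for t in threads:
        t.join()
    return counter.load()


if __name__ == "__main__":
    print(run_demo())
```

No increment is lost even though no thread ever blocks another for the duration of its update; a failed attempt simply costs one more trip around the loop.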

Applications:

Jellyfish

GFF3 Format

This section describes the representation of a protein-coding gene in GFF3. To illustrate how a canonical gene is represented, consider Figure 1 (figure1.png). This indicates a gene named EDEN extending from position 1000 to position 9000. It encodes three alternatively-spliced transcripts named EDEN.1, EDEN.2 and EDEN.3, the last of which has two alternative translational start sites leading to the generation of two protein coding sequences.

There is also an identified transcriptional factor binding site located 50 bp upstream from the transcriptional start site of EDEN.1 and EDEN.2.

Here is how this gene should be described using GFF3:

 
 0  ##gff-version 3.2.1
 1  ##sequence-region ctg123 1 1497228
 2  ctg123 . gene            1000  9000  .  +  .  ID=gene00001;Name=EDEN
 3  ctg123 . TF_binding_site 1000  1012  .  +  .  ID=tfbs00001;Parent=gene00001
 4  ctg123 . mRNA            1050  9000  .  +  .  ID=mRNA00001;Parent=gene00001;Name=EDEN.1
 5  ctg123 . mRNA            1050  9000  .  +  .  ID=mRNA00002;Parent=gene00001;Name=EDEN.2
 6  ctg123 . mRNA            1300  9000  .  +  .  ID=mRNA00003;Parent=gene00001;Name=EDEN.3
 7  ctg123 . exon            1300  1500  .  +  .  ID=exon00001;Parent=mRNA00003
 8  ctg123 . exon            1050  1500  .  +  .  ID=exon00002;Parent=mRNA00001,mRNA00002
 9  ctg123 . exon            3000  3902  .  +  .  ID=exon00003;Parent=mRNA00001,mRNA00003
10  ctg123 . exon            5000  5500  .  +  .  ID=exon00004;Parent=mRNA00001,mRNA00002,mRNA00003
11  ctg123 . exon            7000  9000  .  +  .  ID=exon00005;Parent=mRNA00001,mRNA00002,mRNA00003
12  ctg123 . CDS             1201  1500  .  +  0  ID=cds00001;Parent=mRNA00001;Name=edenprotein.1
13  ctg123 . CDS             3000  3902  .  +  0  ID=cds00001;Parent=mRNA00001;Name=edenprotein.1
14  ctg123 . CDS             5000  5500  .  +  0  ID=cds00001;Parent=mRNA00001;Name=edenprotein.1
15  ctg123 . CDS             7000  7600  .  +  0  ID=cds00001;Parent=mRNA00001;Name=edenprotein.1
16  ctg123 . CDS             1201  1500  .  +  0  ID=cds00002;Parent=mRNA00002;Name=edenprotein.2
17  ctg123 . CDS             5000  5500  .  +  0  ID=cds00002;Parent=mRNA00002;Name=edenprotein.2
18  ctg123 . CDS             7000  7600  .  +  0  ID=cds00002;Parent=mRNA00002;Name=edenprotein.2
19  ctg123 . CDS             3301  3902  .  +  0  ID=cds00003;Parent=mRNA00003;Name=edenprotein.3
20  ctg123 . CDS             5000  5500  .  +  1  ID=cds00003;Parent=mRNA00003;Name=edenprotein.3
21  ctg123 . CDS             7000  7600  .  +  1  ID=cds00003;Parent=mRNA00003;Name=edenprotein.3
22  ctg123 . CDS             3391  3902  .  +  0  ID=cds00004;Parent=mRNA00003;Name=edenprotein.4
23  ctg123 . CDS             5000  5500  .  +  1  ID=cds00004;Parent=mRNA00003;Name=edenprotein.4
24  ctg123 . CDS             7000  7600  .  +  1  ID=cds00004;Parent=mRNA00003;Name=edenprotein.4

Lines beginning with ‘##’ are directives (sometimes called pragmas or meta-data) and provide meta-information about the document as a whole. Blank lines should be ignored by parsers and lines beginning with a single ‘#’ are used for human-readable comments and can be ignored by parsers. End-of-line comments (comments preceded by # at the end of and on the same line as a feature or directive line) are not allowed.
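
As a rough illustration of these rules, a parser's first step could sort each line into one of four kinds. The function name classify_line and the returned labels are hypothetical, chosen only for this sketch.

```python
def classify_line(line: str) -> str:
    """Classify one GFF3 line as 'blank', 'directive', 'comment' or 'feature'."""
    text = line.rstrip("\r\n")
    if not text.strip():
        return "blank"        # blank lines are ignored by parsers
    if text.startswith("##"):
        return "directive"    # e.g. ##gff-version, ##sequence-region
    if text.startswith("#"):
        return "comment"      # human-readable, safe to skip
    return "feature"          # a nine-column feature line


if __name__ == "__main__":
    sample = [
        "##gff-version 3.2.1",
        "# a human-readable note",
        "",
        "ctg123\t.\tgene\t1000\t9000\t.\t+\t.\tID=gene00001;Name=EDEN",
    ]
    for row in sample:
        print(classify_line(row))
```

Because end-of-line comments are not allowed, the sketch never looks for a # later in a line; any # inside a feature line is treated as data.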

Line 0 gives the GFF version using the ##gff-version pragma. Line 1 indicates the boundaries of the region being annotated (a 1,497,228 bp region named “ctg123”) using the ##sequence-region pragma.

Line 2 defines the boundaries of the gene. Column 9 of this line assigns the gene an ID of gene00001, and a human-readable name of EDEN. Because the gene is not part of a larger feature, it has no Parent.

Line 3 annotates the transcriptional factor binding site. Since it is logically part of the gene, its Parent attribute is gene00001.

Lines 4–6 define this gene’s three spliced transcripts, one line for the full extent of each of the mRNAs. These features are necessary to act as parents for the four CDSs which derive from them, as well as the structural parents of the five exons in the alternative splicing set.

Lines 7–11 identify the five exons. The Parent attributes indicate which mRNAs the exons belong to. Notice that several of the exons share the same parents, using the comma symbol to indicate multiple parentage.
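
To show how column 9 and the comma-separated Parent values might be read in practice, here is a minimal sketch. The helper name parse_attributes is hypothetical; the percent-decoding step reflects GFF3's use of URL escaping for reserved characters.

```python
from typing import Dict, List
from urllib.parse import unquote


def parse_attributes(column9: str) -> Dict[str, List[str]]:
    """Split a GFF3 attributes column into {tag: [values]}.

    Pairs are separated by ';', tag and value by '=', and multiple
    values (such as several Parents) by ','.
    """
    attrs: Dict[str, List[str]] = {}
    for field in column9.strip().split(";"):
        if not field:
            continue  # tolerate a trailing ';'
        tag, _, value = field.partition("=")
        # Decode %XX escapes only after splitting on the separators.
        attrs[unquote(tag)] = [unquote(v) for v in value.split(",")]
    return attrs


if __name__ == "__main__":
    exon = parse_attributes("ID=exon00004;Parent=mRNA00001,mRNA00002,mRNA00003")
    print(exon["ID"], exon["Parent"])
```

Keeping every value as a list means single-parent and multi-parent features are handled by the same code path.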

Lines 12–24 denote this gene’s four CDSs. Each CDS belongs to one of the mRNAs. cds00003 and cds00004, which correspond to alternative start codons, belong to the same mRNA.
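
Since each CDS is written as several lines that share one ID, a reader of the file has to regroup them. The sketch below does this for the cds00003 and cds00004 rows from the example, with the line numbers dropped and the columns converted to tabs as a real GFF3 file would have them. The helper names to_tabbed and group_segments_by_id are hypothetical.

```python
from collections import defaultdict

# CDS rows copied from the example above (Name attributes omitted for brevity).
EXAMPLE = """\
ctg123 . CDS 3301 3902 . + 0 ID=cds00003;Parent=mRNA00003
ctg123 . CDS 5000 5500 . + 1 ID=cds00003;Parent=mRNA00003
ctg123 . CDS 7000 7600 . + 1 ID=cds00003;Parent=mRNA00003
ctg123 . CDS 3391 3902 . + 0 ID=cds00004;Parent=mRNA00003
ctg123 . CDS 5000 5500 . + 1 ID=cds00004;Parent=mRNA00003
ctg123 . CDS 7000 7600 . + 1 ID=cds00004;Parent=mRNA00003
"""


def to_tabbed(text):
    """Turn the space-aligned listing into tab-delimited GFF3 lines."""
    return ["\t".join(row.split()) for row in text.splitlines() if row.strip()]


def group_segments_by_id(lines):
    """Map each feature ID to the (start, end) pairs of all its lines."""
    segments = defaultdict(list)
    for line in lines:
        cols = line.split("\t")
        if len(cols) != 9:
            continue  # not a feature line
        attrs = dict(f.split("=", 1) for f in cols[8].split(";") if f)
        if "ID" in attrs:
            segments[attrs["ID"]].append((int(cols[3]), int(cols[4])))
    return dict(segments)


if __name__ == "__main__":
    for cds_id, parts in group_segments_by_id(to_tabbed(EXAMPLE)).items():
        print(cds_id, parts)
```

The two CDSs differ only in their first segment, which is where the alternative start codons fall.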

Note that several of the features, including the gene, its mRNAs and the CDSs, all have Name attributes. This attribute assigns those features a public name, but is not mandatory. The ID attributes are only mandatory for those features that have children (the gene and mRNAs), or for those that span multiple lines. The IDs do not have meaning outside the file in which they reside. Hence, a slightly simplified version of this file is possible; the specification linked below shows it.

Find more at this link: https://github.com/The-Sequence-Ontology/Specifications/blob/master/gff3.md

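The Correct Answer

At its core, Go (weiqi) is about logical reasoning. Whether an answer is correct does not depend on the published solution; we can reason it through ourselves and judge whether the given line of play holds up. Take the following problem from “围棋大全” (Weiqi Daquan):

Starting position: (the next move is move 25)

Standard solution (left diagram below): it assumes that after we play move 25, White will answer with move 26. In fact, White can instead play at the point where Black's move 27 is shown, which gives the position in the right diagram below.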

                 

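So I believe the right play is this: descend directly with move 25.

Which line is better can still be worked out precisely; I will add that analysis later.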