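GCE Figure Gallery

Figure 1

A. As the read error rate rises, more and more reads become invalid, so an error peak appears where there was none and keeps growing taller;

B. As more and more reads turn into erroneous reads, the proportion of correct reads falls and the main peak gets lower;

C. The proportion of correct reads falls because their number shrinks, which in turn lowers their coverage of the genome and shifts the peak to the left.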
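The trend in points A to C can be made concrete with a small calculation. The sketch below is only an illustration under a simplifying assumption that the notes do not state: every base is wrong independently with the same probability. The function names and the example values (depth 30, length 17) are hypothetical.

```python
def error_free_fraction(error_rate: float, length: int) -> float:
    """Probability that a sequence of `length` bases contains no error.

    Assumes each base is wrong independently with probability `error_rate`
    (a simplifying assumption, not something the figure itself states).
    """
    if not 0.0 <= error_rate <= 1.0:
        raise ValueError("error_rate must be between 0 and 1")
    if length < 0:
        raise ValueError("length must be non-negative")
    return (1.0 - error_rate) ** length


def correct_depth(total_depth: float, error_rate: float, length: int) -> float:
    """Depth contributed by error-free sequences only."""
    return total_depth * error_free_fraction(error_rate, length)


if __name__ == "__main__":
    # Hypothetical values: total depth 30, sequences of 17 bases.
    for e in (0.0, 0.01, 0.02, 0.05):
        print(f"error rate {e:.2f}: correct depth {correct_depth(30.0, e, 17):.2f}")
```

Under this model the depth of the correct sequences can only go down as the error rate goes up, which matches the left shift described in point C.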

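Figure 2

A. The peak depth in the k-mer analysis is slightly lower than the depth at which the data cover the genome, so the peak shifts to the left;

B. The peak in the k-mer analysis is somewhat taller; what this means in practice is still to be determined.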
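Point A follows from a standard relation: a read of length L yields only L - k + 1 k-mers, so k-mer depth is base depth scaled by (L - k + 1) / L. The sketch below applies that relation to error-free reads of uniform length; the function name and the example numbers are hypothetical and not taken from the figure.

```python
def kmer_depth(base_depth: float, read_length: int, k: int) -> float:
    """Expected k-mer depth for error-free reads of uniform length.

    Each read of `read_length` bases contributes read_length - k + 1 k-mers,
    so the k-mer depth is always a little below the base depth.
    """
    if k < 1 or k > read_length:
        raise ValueError("k must satisfy 1 <= k <= read_length")
    return base_depth * (read_length - k + 1) / read_length


if __name__ == "__main__":
    # Hypothetical example: 30x base coverage, 100 bp reads, k = 17.
    print(kmer_depth(30.0, 100, 17))
```

The longer the reads relative to k, the smaller the gap between the two depths.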

 

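Lock-free non-blocking algorithms: CAS (Compare-And-Swap)

Definition

CAS works as follows. When a thread wants to modify a variable that other threads can also access, and the new value depends on the old one, it first checks whether the variable's current value equals the old value the thread saved earlier. If they are equal, the variable is set to the new value. Otherwise the thread replaces its saved old value with the current value, this attempt fails, and the thread tries the modification again.

Any discussion of multithreading has to deal with locks, and CAS is a form of optimistic locking. Optimistic locking means that the data is not locked; instead, a check decides whether the value may be modified.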
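The retry loop described above can be sketched as follows. Python does not expose a hardware CAS instruction, so the hypothetical AtomicCell class below emulates the atomicity of that single instruction with a small internal lock. The part that illustrates the technique is the optimistic loop in increment: it never holds a lock while computing the new value. All names here are assumptions made for illustration.

```python
import threading


class AtomicCell:
    """Hypothetical stand-in for a machine word that supports CAS.

    The internal lock only emulates the atomicity of one CAS instruction;
    callers never hold it while running their own logic.
    """

    def __init__(self, value: int = 0) -> None:
        self._value = value
        self._guard = threading.Lock()

    def load(self) -> int:
        return self._value

    def compare_and_swap(self, expected: int, new: int) -> bool:
        """Set the value to `new` only if it still equals `expected`."""
        with self._guard:
            if self._value != expected:
                return False
            self._value = new
            return True


def increment(cell: AtomicCell) -> None:
    """Optimistic update: read, compute, attempt CAS, retry on failure."""
    old = cell.load()
    while not cell.compare_and_swap(old, old + 1):
        old = cell.load()  # another thread got there first; refresh and retry


def run(threads: int = 4, per_thread: int = 1000) -> int:
    """Increment one shared cell from several threads and return the total."""
    cell = AtomicCell()

    def worker() -> None:
        for _ in range(per_thread):
            increment(cell)

    workers = [threading.Thread(target=worker) for _ in range(threads)]
    for w in workers:
        w.start()
    for w in workers:
        w.join()
    return cell.load()
```

Because every successful CAS raises the value by exactly one, run(4, 1000) ends at 4000 no matter how the threads interleave. In C or C++ the same loop would sit on top of a real atomic compare-exchange operation rather than an emulated one.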

Applications:

Jellyfish

GFF3 Format

This section describes the representation of a protein-coding gene in GFF3. To illustrate how a canonical gene is represented, consider Figure 1 (figure1.png). This indicates a gene named EDEN extending from position 1000 to position 9000. It encodes three alternatively-spliced transcripts named EDEN.1, EDEN.2 and EDEN.3, the last of which has two alternative translational start sites leading to the generation of two protein coding sequences.

There is also an identified transcriptional factor binding site located 50 bp upstream from the transcriptional start site of EDEN.1 and EDEN.2.

Here is how this gene should be described using GFF3:

Lines beginning with ‘##’ are directives (sometimes called pragmas or meta-data) and provide meta-information about the document as a whole. Blank lines should be ignored by parsers and lines beginning with a single ‘#’ are used for human-readable comments and can be ignored by parsers. End-of-line comments (comments preceded by # at the end of and on the same line as a feature or directive line) are not allowed.

Line 0 gives the GFF version using the ##gff-version pragma. Line 1 indicates the boundaries of the region being annotated (a 1,497,228 bp region named “ctg123”) using the ##sequence-region pragma.

Line 2 defines the boundaries of the gene. Column 9 of this line assigns the gene an ID of gene00001, and a human-readable name of EDEN. Because the gene is not part of a larger feature, it has no Parent.

Line 3 annotates the transcriptional factor binding site. Since it is logically part of the gene, its Parent attribute is gene00001.

Lines 4-6 define this gene’s three spliced transcripts, one line for the full extent of each of the mRNAs. These features are necessary to act as parents for the four CDSs which derive from them, as well as the structural parents of the five exons in the alternative splicing set.

Lines 7-11 identify the five exons. The Parent attributes indicate which mRNAs the exons belong to. Notice that several of the exons share the same parents, using the comma symbol to indicate multiple parentage.

Lines 12-24 denote this gene’s four CDSs. Each CDS belongs to one of the mRNAs. cds00003 and cds00004, which correspond to alternative start codons, belong to the same mRNA.

Note that several of the features, including the gene, its mRNAs and the CDSs, all have Name attributes. This attribute assigns those features a public name, but is not mandatory. The ID attributes are only mandatory for those features that have children (the gene and mRNAs), or for those that span multiple lines. The IDs do not have meaning outside the file in which they reside. Hence, a slightly simplified version of this file would look like this:

Find more at this link: https://github.com/The-Sequence-Ontology/Specifications/blob/master/gff3.md
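The parsing rules described in this section (directives starting with ‘##’, comments starting with ‘#’, blank lines ignored, nine tab-separated columns, and comma-separated Parent values in column 9) can be sketched in a few lines of Python. This is a minimal illustration, not a complete GFF3 parser. The SAMPLE text is a hand-made fragment: only the region ctg123, the gene span 1000 to 9000, and the ID gene00001 come from the text above, while the mRNA and exon IDs and coordinates are illustrative placeholders. The function names are hypothetical.

```python
from urllib.parse import unquote

# Hand-made fragment; values not stated in the text are placeholders.
SAMPLE = """\
##gff-version 3
##sequence-region ctg123 1 1497228
# a human-readable comment, ignored by the parser
ctg123\t.\tgene\t1000\t9000\t.\t+\t.\tID=gene00001;Name=EDEN

ctg123\t.\tmRNA\t1050\t9000\t.\t+\t.\tID=mRNA00001;Parent=gene00001;Name=EDEN.1
ctg123\t.\tmRNA\t1050\t9000\t.\t+\t.\tID=mRNA00002;Parent=gene00001;Name=EDEN.2
ctg123\t.\texon\t1050\t1500\t.\t+\t.\tID=exon00001;Parent=mRNA00001,mRNA00002
"""


def parse_attributes(column9: str) -> dict:
    """Split column 9 into {tag: [values]}; commas separate multiple values."""
    attrs = {}
    for pair in column9.split(";"):
        if not pair:
            continue
        key, _, value = pair.partition("=")
        # GFF3 escapes reserved characters with URL-style percent encoding.
        attrs[unquote(key)] = [unquote(v) for v in value.split(",")]
    return attrs


def parse_gff3(text: str):
    """Return (directives, features) from GFF3 text."""
    directives, features = [], []
    for line in text.splitlines():
        if not line.strip():
            continue  # blank lines are ignored
        if line.startswith("##"):
            directives.append(line[2:].split())
        elif line.startswith("#"):
            continue  # single '#' marks a comment
        else:
            cols = line.split("\t")
            if len(cols) != 9:
                raise ValueError(f"expected 9 columns, got {len(cols)}")
            features.append({
                "seqid": cols[0],
                "type": cols[2],
                "start": int(cols[3]),
                "end": int(cols[4]),
                "strand": cols[6],
                "attributes": parse_attributes(cols[8]),
            })
    return directives, features


if __name__ == "__main__":
    directives, features = parse_gff3(SAMPLE)
    for f in features:
        print(f["type"], f["start"], f["end"], f["attributes"].get("Parent", []))
```

Storing every attribute as a list keeps the multi-parent case (an exon shared by several mRNAs) uniform with the single-value case.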

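The Correct Answer

The core of Go is logical reasoning. Whether an answer is correct is not decided by the standard answer; we can reason it through ourselves and judge whether the solution it gives is sound. Take the following problem from “围棋大全”:

Initial position: (the next move is move 25)

Standard solution (left figure below): the standard solution assumes that after we play move 25, White will answer with move 26. In fact, White can play at the point where Black plays move 27, which gives the position in the right figure below.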

                 

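So I think the right way to play is this: move 25 should descend directly.

Which line is better can still be settled by precise calculation; I will add that later.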