KEGG ORTHOLOGY (KO) Database

In KEGG, molecular-level functions are stored in the KO (KEGG Orthology) database and associated with ortholog groups, so that experimental evidence obtained in a specific organism can be extended to other organisms. Genome annotation in KEGG is ortholog annotation: KO identifiers (K numbers) are assigned to individual genes in the GENES database. Original data, such as gene names and descriptions given by RefSeq or GenBank, are not updated, even if they are inconsistent with the KO assignment.

Major efforts have been initiated to associate each KO entry with experimental evidence of functionally characterized sequence data, which is now shown in the SEQUENCE subfield of the REFERENCE field. Furthermore, the genome-based collection of KEGG GENES (http://www.genome.jp/kegg/genes.html) has been expanded to allow individual protein data to be included in the addendum category. Eventually the KO database will cover all knowledge on functionally characterized protein sequences (see also KEGG Enzyme, http://www.genome.jp/kegg/annotation/enzyme.html).
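To make the SEQUENCE subfield of the REFERENCE field more concrete, here is a minimal Python sketch that pulls the bracketed gene identifiers out of a KO entry in flat-file style. It assumes that top-level field labels start in the first column, that subfields are indented, and that SEQUENCE values are written as bracketed `org:gene` pairs. The sample entry, the K number, the PMIDs, the gene names and the helper name `sequence_evidence` are all placeholders made up for illustration, not real KEGG data or an official parser.

```python
import re
from collections import defaultdict

# Illustrative text only: every identifier below is a placeholder, not real KEGG data.
SAMPLE_ENTRY = """\
ENTRY       K99999                      KO
NAME        hypothetical example protein
REFERENCE   PMID:00000000
  AUTHORS   Doe J.
  TITLE     Placeholder title
  SEQUENCE  [org:GENE_A] [org:GENE_B]
REFERENCE   PMID:11111111
  SEQUENCE  [xyz:GENE_C]
///
"""


def sequence_evidence(entry_text):
    """Map each REFERENCE id to the gene identifiers in its SEQUENCE subfield."""
    evidence = defaultdict(list)
    current_ref = None
    for line in entry_text.splitlines():
        if line.startswith("REFERENCE"):
            # Start of a new REFERENCE block; remember its identifier.
            parts = line.split(None, 1)
            current_ref = parts[1].strip() if len(parts) > 1 else "(unnamed)"
        elif line and not line[0].isspace():
            # Any other top-level field (or the '///' terminator) ends the block.
            current_ref = None
        elif current_ref and line.strip().startswith("SEQUENCE"):
            # Collect everything written inside square brackets.
            evidence[current_ref].extend(re.findall(r"\[([^\]]+)\]", line))
    return dict(evidence)


if __name__ == "__main__":
    for ref, genes in sequence_evidence(SAMPLE_ENTRY).items():
        print(ref, genes)
```

A real entry may wrap long SEQUENCE values onto continuation lines, which this sketch does not handle.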

In general, the KO grouping of functional orthologs is defined in the context of KEGG molecular networks (KEGG pathway maps, BRITE hierarchies and KEGG modules), which are in fact represented as networks of nodes identified by K numbers. The relationships between KOs and the corresponding molecular networks are represented in the following KO system.

[Figure: KEGG Orthology (KO)]

The fact that functional information is associated with ortholog groups is a unique aspect of the KEGG resource. Sequence-similarity-based inference, as a generalization of a limited amount of experimental evidence, is predefined in KEGG. As implemented in BlastKOALA and other tools, a sequence similarity search against KEGG GENES is a search for the most appropriate K numbers. Once K numbers are assigned to genes in a genome, the KEGG pathway maps, BRITE hierarchies, and KEGG modules are automatically reconstructed, enabling biological interpretation of high-level functions.
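The flow from similarity hits to K numbers to reconstructed networks can be sketched in a few lines of Python. This is a toy illustration under stated assumptions, not how BlastKOALA actually scores or assigns K numbers: it simply keeps the highest-scoring hit per gene, and the gene names, K number labels, scores and the `ko_to_networks` table are all hypothetical.

```python
from collections import defaultdict


def assign_k_numbers(hits):
    """Pick, for each gene, the K number of its highest-scoring hit.

    `hits` is an iterable of (gene, k_number, score) tuples. Taking the
    best score is a simplification chosen for this sketch.
    """
    best = {}
    for gene, k_number, score in hits:
        if gene not in best or score > best[gene][1]:
            best[gene] = (k_number, score)
    return {gene: k for gene, (k, _) in best.items()}


def reconstruct(assignments, ko_to_networks):
    """Group genes under every network node set their K number belongs to."""
    networks = defaultdict(set)
    for gene, k_number in assignments.items():
        # K numbers with no known network membership contribute nothing.
        for network in ko_to_networks.get(k_number, ()):
            networks[network].add(gene)
    return {name: sorted(genes) for name, genes in networks.items()}


# Hypothetical similarity-search results and K-number-to-network table.
hits = [
    ("gene1", "K_A", 250.0),
    ("gene1", "K_B", 90.0),
    ("gene2", "K_B", 310.0),
    ("gene3", "K_C", 120.0),
]
ko_to_networks = {
    "K_A": ["pathway_1"],
    "K_B": ["pathway_1", "module_1"],
}

assignments = assign_k_numbers(hits)
reconstructed = reconstruct(assignments, ko_to_networks)

if __name__ == "__main__":
    print(assignments)
    print(reconstructed)
```

Because the networks are keyed by K numbers rather than by organism-specific gene names, the same `ko_to_networks` table can be reused for any genome once its genes carry K numbers.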
