<?xml version="1.0" encoding="UTF-8"?>

<rdf:RDF
   xmlns:rdf="http://www.w3.org/1999/02/22-rdf-syntax-ns#"
   xmlns:rdfs="http://www.w3.org/2000/01/rdf-schema#"
   xmlns="http://purl.org/rss/1.0/"
   xmlns:dc="http://purl.org/dc/elements/1.1/"
   xmlns:prism="http://prismstandard.org/namespaces/1.2/basic/"
   xmlns:dcterms="http://purl.org/dc/terms/"

>
<channel rdf:about="http://www.citeulike.org/about">
<pubDate>Sat, 26 Jul 2008 17:28:10 BST</pubDate>


	<title>CiteULike: xingxu Gerstein</title>
	<description>CiteULike: xingxu Gerstein</description>


	<link>http://www.citeulike.org/user/xingxu/author/Gerstein</link>
	<dc:publisher>CiteULike.org</dc:publisher>
	<dc:language>en-gb</dc:language>
	<dc:rights>Copyright &#169; 2004-2008 citeulike.org</dc:rights>
	<items>
    <rdf:Seq>
        <rdf:li rdf:resource="http://www.citeulike.org/user/xingxu/article/910292"/>
        <rdf:li rdf:resource="http://www.citeulike.org/user/xingxu/article/1572696"/>
        <rdf:li rdf:resource="http://www.citeulike.org/user/xingxu/article/1302950"/>
        <rdf:li rdf:resource="http://www.citeulike.org/user/xingxu/article/1277867"/>
        <rdf:li rdf:resource="http://www.citeulike.org/user/xingxu/article/1572693"/>

    </rdf:Seq>
	</items>
	</channel>


<item rdf:about="http://www.citeulike.org/user/xingxu/article/910292">
    <title>A supervised hidden Markov model framework for efficiently segmenting tiling array data in transcriptional and ChIP-chip experiments: systematically incorporating validated biological knowledge.</title>
    <link>http://www.citeulike.org/user/xingxu/article/910292</link>
    <description>&lt;i&gt;Bioinformatics (12 October 2006)&lt;/i&gt;&lt;br /&gt;&lt;br /&gt;MOTIVATION: Large-scale tiling array experiments are becoming increasingly common in genomics. In particular, the ENCODE project requires the consistent segmentation of many different tiling array data sets into &#34;active regions&#34; (e.g. finding transfrags from transcriptional data and putative binding sites from ChIP-chip experiments). Previously, such segmentation was done in an unsupervised fashion mainly based on characteristics of the signal distribution in the tiling array data itself. Here we propose a supervised framework for doing this. It has the advantage of explicitly incorporating validated biological knowledge into the model and allowing for formal training and testing. Methodology: In particular, we use a hidden Markov model (HMM) framework, which is capable of explicitly modeling the dependency between neighboring probes and whose extended version (the generalized HMM) also allows explicit description of state duration density. We introduce a formal definition of the tiling-array analysis problem, and explain how we can use this to describe sampling small genomic regions for experimental validation to build up a gold-standard set for training and testing. We then describe various ideal and practical sampling strategies (e.g. maximizing signal entropy within a selected region versus using gene annotation or known promoters as positives for transcription or ChIP-chip data, respectively). RESULTS: For the practical sampling and training strategies, we show how the size and noise in the validated training data affects the performance of an HMM applied to the ENCODE transcriptional and ChIP-chip experiments. In particular, we show that the HMM framework is able to efficiently process tiling array data as well as or better than previous approaches. 
For the idealized sampling strategies, we show how we can assess their performance in a simulation framework and how a maximum entropy approach, which samples sub-regions with very different signal intensities, gives the maximally performing gold-standard. This latter result has strong implications for the optimum way medium-scale validation experiments should be carried out to verify the results of the genome-scale tiling array experiments. SUPPLEMENTARY INFORMATION: The supplementary materials are available at http://tiling.gersteinlab.org/hmm/.</description>
    <dc:title>A supervised hidden Markov model framework for efficiently segmenting tiling array data in transcriptional and ChIP-chip experiments: systematically incorporating validated biological knowledge.</dc:title>

    <dc:creator>Jiang Du</dc:creator>
    <dc:creator>Joel S Rozowsky</dc:creator>
    <dc:creator>Jan O Korbel</dc:creator>
    <dc:creator>Zhengdong D Zhang</dc:creator>
    <dc:creator>Thomas E Royce</dc:creator>
    <dc:creator>Martin H Schultz</dc:creator>
    <dc:creator>Michael Snyder</dc:creator>
    <dc:creator>Mark Gerstein</dc:creator>
    <dc:identifier>doi:10.1093/bioinformatics/btl515</dc:identifier>
    <dc:source>Bioinformatics (12 October 2006)</dc:source>
    <dc:date>2006-10-23T15:32:54-00:00</dc:date>
    <prism:publicationYear>2006</prism:publicationYear>
    <prism:publicationName>Bioinformatics</prism:publicationName>
    <prism:issn>1460-2059</prism:issn>
    <prism:category>bioinformatics</prism:category>
    <prism:category>methods</prism:category>
    <prism:category>tiling</prism:category>
    <prism:category>transcriptome</prism:category>
</item>



<item rdf:about="http://www.citeulike.org/user/xingxu/article/1572696">
    <title>Large-scale analysis of pseudogenes in the human genome</title>
    <link>http://www.citeulike.org/user/xingxu/article/1572696</link>
    <description>&lt;i&gt;Current Opinion in Genetics &#38; Development, Vol. 14, No. 4. (August 2004), pp. 328-335.&lt;/i&gt;&lt;br /&gt;&lt;br /&gt;Pseudogenes are considered as genomic fossils: disabled copies of functional genes that were once active in the ancient genome. Recently, whole-genome computational approaches have revealed thousands of pseudogenes in the genomes of the human and other eukaryotes. Identification of these pseudogenes can improve the accuracy of gene annotation. It also offers new insight on the evolutionary history and the stability of the genome as a whole.</description>
    <dc:title>Large-scale analysis of pseudogenes in the human genome</dc:title>

    <dc:creator>Zhaolei Zhang</dc:creator>
    <dc:creator>Mark Gerstein</dc:creator>
    <dc:identifier>doi:10.1016/j.gde.2004.06.003</dc:identifier>
    <dc:source>Current Opinion in Genetics &#38; Development, Vol. 14, No. 4. (August 2004), pp. 328-335.</dc:source>
    <dc:date>2007-08-17T16:29:52-00:00</dc:date>
    <prism:publicationYear>2004</prism:publicationYear>
    <prism:publicationName>Current Opinion in Genetics &#38; Development</prism:publicationName>
    <prism:volume>14</prism:volume>
    <prism:number>4</prism:number>
    <prism:startingPage>328</prism:startingPage>
    <prism:endingPage>335</prism:endingPage>
    <prism:category>genomics</prism:category>
    <prism:category>pseudogene</prism:category>
</item>



<item rdf:about="http://www.citeulike.org/user/xingxu/article/1302950">
    <title>PseudoPipe: an automated pseudogene identification pipeline.</title>
    <link>http://www.citeulike.org/user/xingxu/article/1302950</link>
    <description>&lt;i&gt;Bioinformatics, Vol. 22, No. 12. (15 June 2006), pp. 1437-1439.&lt;/i&gt;&lt;br /&gt;&lt;br /&gt;MOTIVATION: Mammalian genomes contain many 'genomic fossils' i.e. pseudogenes. These are disabled copies of functional genes that have been retained in the genome by gene duplication or retrotransposition events. Pseudogenes are important resources in understanding the evolutionary history of genes and genomes. RESULTS: We have developed a homology-based computational pipeline ('PseudoPipe') that can search a mammalian genome and identify pseudogene sequences in a comprehensive and consistent manner. The key steps in the pipeline involve using BLAST to rapidly cross-reference potential &#34;parent&#34; proteins against the intergenic regions of the genome and then processing the resulting &#34;raw hits&#34; -- i.e. eliminating redundant ones, clustering together neighbors, and associating and aligning clusters with a unique parent. Finally, pseudogenes are classified based on a combination of criteria including homology, intron-exon structure, and existence of stop codons and frameshifts.</description>
    <dc:title>PseudoPipe: an automated pseudogene identification pipeline.</dc:title>

    <dc:creator>Z Zhang</dc:creator>
    <dc:creator>N Carriero</dc:creator>
    <dc:creator>D Zheng</dc:creator>
    <dc:creator>J Karro</dc:creator>
    <dc:creator>PM Harrison</dc:creator>
    <dc:creator>M Gerstein</dc:creator>
    <dc:identifier>doi:10.1093/bioinformatics/btl116</dc:identifier>
    <dc:source>Bioinformatics, Vol. 22, No. 12. (15 June 2006), pp. 1437-1439.</dc:source>
    <dc:date>2007-05-17T16:17:47-00:00</dc:date>
    <prism:publicationYear>2006</prism:publicationYear>
    <prism:publicationName>Bioinformatics</prism:publicationName>
    <prism:issn>1367-4803</prism:issn>
    <prism:volume>22</prism:volume>
    <prism:number>12</prism:number>
    <prism:startingPage>1437</prism:startingPage>
    <prism:endingPage>1439</prism:endingPage>
    <prism:category>bioinformatics</prism:category>
    <prism:category>pseudogene</prism:category>
</item>



<item rdf:about="http://www.citeulike.org/user/xingxu/article/1277867">
    <title>Pseudogene.org: a comprehensive database and comparison platform for pseudogene annotation.</title>
    <link>http://www.citeulike.org/user/xingxu/article/1277867</link>
    <description>&lt;i&gt;Nucleic Acids Res, Vol. 35, No. Database issue. (January 2007)&lt;/i&gt;&lt;br /&gt;&lt;br /&gt;The Pseudogene.org knowledgebase serves as a comprehensive repository for pseudogene annotation. The definition of a pseudogene varies within the literature, resulting in significantly different approaches to the problem of identification. Consequently, it is difficult to maintain a consistent collection of pseudogenes in detail necessary for their effective use. Our database is designed to address this issue. It integrates a variety of heterogeneous resources and supports a subset structure that highlights specific groups of pseudogenes that are of interest to the research community. Tools are provided for the comparison of sets and the creation of layered set unions, enabling researchers to derive a current 'consensus' set of pseudogenes. Additional features include versatile search, the capacity for robust interaction with other databases, the ability to reconstruct older versions of the database (accounting for changing genome builds) and an underlying object-oriented interface designed for researchers with a minimal knowledge of programming. At the present time, the database contains more than 100,000 pseudogenes spanning 64 prokaryote and 11 eukaryote genomes, including a collection of human annotations compiled from 16 sources.</description>
    <dc:title>Pseudogene.org: a comprehensive database and comparison platform for pseudogene annotation.</dc:title>

    <dc:creator>JE Karro</dc:creator>
    <dc:creator>Y Yan</dc:creator>
    <dc:creator>D Zheng</dc:creator>
    <dc:creator>Z Zhang</dc:creator>
    <dc:creator>N Carriero</dc:creator>
    <dc:creator>P Cayting</dc:creator>
    <dc:creator>P Harrison</dc:creator>
    <dc:creator>M Gerstein</dc:creator>
    <dc:source>Nucleic Acids Res, Vol. 35, No. Database issue. (January 2007)</dc:source>
    <dc:date>2007-05-04T19:10:08-00:00</dc:date>
    <prism:publicationYear>2007</prism:publicationYear>
    <prism:publicationName>Nucleic Acids Res</prism:publicationName>
    <prism:issn>1362-4962</prism:issn>
    <prism:volume>35</prism:volume>
    <prism:number>Database issue</prism:number>
    <prism:category>platform</prism:category>
    <prism:category>pseudogene</prism:category>
</item>



<item rdf:about="http://www.citeulike.org/user/xingxu/article/1572693">
    <title>The ambiguous boundary between genes and pseudogenes: the dead rise up, or do they?</title>
    <link>http://www.citeulike.org/user/xingxu/article/1572693</link>
    <description>&lt;i&gt;Trends Genet, Vol. 23, No. 5. (May 2007), pp. 219-224.&lt;/i&gt;&lt;br /&gt;&lt;br /&gt;Pseudogenes have long been considered to be 'dead', nonfunctional by-products of genome evolution. However, several lines of evidence now show that some pseudogenes are transcriptionally 'alive', and a few might even have biochemical roles. Therefore, the boundary between genes (often considered to be 'living') and pseudogenes (often considered to be 'dead') might be ambiguous and difficult to define. Here, we examine the evidence for and against pseudogene functionality, and we argue that the time is ripe for revising the definition of a pseudogene. Furthermore, we suggest a classification system to accommodate pseudogenes with various levels of functionality.</description>
    <dc:title>The ambiguous boundary between genes and pseudogenes: the dead rise up, or do they?</dc:title>

    <dc:creator>D Zheng</dc:creator>
    <dc:creator>MB Gerstein</dc:creator>
    <dc:identifier>doi:10.1016/j.tig.2007.03.003</dc:identifier>
    <dc:source>Trends Genet, Vol. 23, No. 5. (May 2007), pp. 219-224.</dc:source>
    <dc:date>2007-08-17T16:28:03-00:00</dc:date>
    <prism:publicationYear>2007</prism:publicationYear>
    <prism:publicationName>Trends Genet</prism:publicationName>
    <prism:issn>0168-9525</prism:issn>
    <prism:volume>23</prism:volume>
    <prism:number>5</prism:number>
    <prism:startingPage>219</prism:startingPage>
    <prism:endingPage>224</prism:endingPage>
    <prism:category>pseudogene</prism:category>
    <prism:category>review</prism:category>
</item>



</rdf:RDF>

