
ChaTo crawling [78 articles]

Recent papers added to ChaTo's library under the tag crawling.
  • IRLbot: scaling to 6 billion pages and beyond
    (2008), pp. 427-436.
    by Hsin-Tsang Lee, Derek Leonard, Xiaoming Wang, Dmitri Loguinov
    posted to crawling by ChaTo on 2008-08-13 13:38:22 as read along with 1 person agulli
  • On the peninsula phenomenon in web graph and its implications on web search
    Computer Networks, Vol. 51, No. 1. (January 2007), pp. 177-189.
    by Tao Meng, Hong-Fei Yan
    posted to crawling web-graph characterization by ChaTo on 2008-04-24 01:45:10 as read along with 1 person shashikant
  • Do not crawl in the DUST: different URLs with similar text
    (2006), pp. 1015-1016.
    by Uri Schonfeld, Ziv Bar-Yossef, Idit Keidar
    posted to crawling by ChaTo on 2008-04-08 18:34:24 as read along with 1 person fmccown
  • WIRE: an open-source Web information retrieval environment
    (September 2005), pp. 27-30.
    by Carlos Castillo, Ricardo Baeza-Yates
    edited by Michel Beigbeder, Wai G Yee
  • Decoding the structure of the WWW: A comparative analysis of Web crawls
    ACM Trans. Web, Vol. 1, No. 2. (August 2007)
    by M Ángeles Serrano, Ana Maguitman, Marián Boguñá, Santo Fortunato, Alessandro Vespignani
  • Effective Web Crawling
    (December 2004)
    by Carlos Castillo
    posted to crawling by ChaTo on 2007-06-05 15:25:00 as read along with 1 person tolosoft
  • A large-scale study of robots.txt
    (2007), pp. 1123-1124.
    by Yang Sun, Ziming Zhuang, C Lee Giles
    posted to crawling by ChaTo on 2007-05-10 23:16:29 as *****
  • Random web crawls
    (2007), pp. 451-460.
    by Toufik Bennouas, Fabien de Montgolfier
    posted to crawling characterization by ChaTo on 2007-05-09 22:56:53 as read
  • Do not crawl in the DUST: different URLs with similar text
    (2007), pp. 111-120.
    by Ziv Bar-Yossef, Idit Keidar, Uri Schonfeld
    posted to crawling similarity by ChaTo on 2007-05-09 21:05:01 as read along with 1 person mapio
  • The discoverability of the web
    (2007), pp. 421-430.
    by Anirban Dasgupta, Arpita Ghosh, Ravi Kumar, Christopher Olston, Sandeep Pandey, Andrew Tomkins
    posted to crawling time by ChaTo on 2007-05-09 20:54:56 as read along with 1 person ssn
  • A Memory-Efficient Strategy for Exploring the Web
    (December 2006), pp. 680-686.
    by Carlos Castillo, Alberto Nelli, Alessandro Panconesi
    posted to crawling search by ChaTo on 2006-12-21 05:58:03 as *****
  • Geographical partition for distributed web crawling
    (2005), pp. 55-60.
    by José Exposto, Joaquim Macedo, António Pina, Albano Alves, José Rufino
    posted to crawling by ChaTo on 2006-11-17 09:33:41 as **
  • Crawling the Infinite Web
    Journal of Web Engineering, Vol. 6, No. 1. (15 February 2007), pp. 49-72.
    by Ricardo Baeza-Yates, Carlos Castillo
    posted to clicks crawling by ChaTo on 2006-10-30 12:13:50 as read along with 1 person IP
  • Estimating the global pagerank of web communities
    (2006), pp. 116-125.
    by Jason V Davis, Inderjit S Dhillon
    posted to crawling ranking sampling web-graph by ChaTo on 2006-10-10 16:52:19 as read
  • Optimal crawling strategies for web search engines
    (2002), pp. 136-147.
    by JL Wolf, MS Squillante, PS Yu, J Sethuraman, L Ozsen
    posted to crawling by ChaTo on 2006-06-15 15:09:46 as **** along with 2 people ansobol jrw
  • Topic-specific crawling on the Web with the measurements of the relevancy context graph
    Information Systems, Vol. 31, No. 4-5. (2006), pp. 232-246.
    by Ching-Chi Hsu, Fan Wu
    posted to crawling by ChaTo on 2006-06-05 16:40:28 as read along with 1 person schaal
  • Approximating Aggregate Queries about Web Pages via Random Walks
    (2000), pp. 535-544.
    by Ziv Bar-Yossef, Alexander Berg, Steve Chien, Jittat Fakcharoenphol, Dror Weitz
    posted to crawling web-graph characterization by ChaTo on 2006-04-13 12:32:17 as read
  • Effective web crawling
    SIGIR Forum, Vol. 39, No. 1. (June 2005), pp. 55-56.
    by Carlos Castillo
    posted to crawling by ChaTo on 2006-04-12 15:33:09 as read along with 3 people ssn jrw taho
  • Growing and navigating the small world Web by local content
    Proc Natl Acad Sci U S A, Vol. 99, No. 22. (29 October 2002), pp. 14014-14019.
    by F Menczer
    posted to crawling by ChaTo on 2006-02-15 14:08:29 as read along with 1 person and 1 group camster dbk-lab
  • Controlling the Queue Size in Web Crawling
    (December 2006)
    by Carlos Castillo, Alberto Nelli, Alessandro Panconesi
    posted to crawling by ChaTo on 2006-02-04 11:52:41 as read
  • Link Contexts in Classifier-Guided Topical Crawlers
    IEEE Transactions on Knowledge and Data Engineering, Vol. 18, No. 1. (January 2006), pp. 107-122.
    by Gautam Pant, Padmini Srinivasan
    posted to crawling by ChaTo on 2006-01-20 11:52:13 as ** along with 1 person zhengzhong
  • WebBase: a repository of Web pages
    Computer Networks (Amsterdam, Netherlands: 1999), Vol. 33, No. 1-6. (2000), pp. 277-293.
    by Jun Hirai, Sriram Raghavan, Hector Garcia-Molina, Andreas Paepcke
  • Person resolution in person search results: WebHawk
    (2005), pp. 163-170.
    by Xiaojun Wan, Jianfeng Gao, Mu Li, Binggong Ding
    posted to crawling search by ChaTo on 2005-11-29 10:26:39 as read along with 2 people and 1 group oscar jiria onekin
  • WIRE: an open-source Web information retrieval environment
    (September 2005), pp. 27-30.
    by Carlos Castillo, Ricardo Baeza-Yates
    edited by Michel Beigbeder, Wai G Yee
    posted to crawling by ChaTo on 2005-08-10 12:49:46 as read
  • Viúva negra
    (2005)
    by Daniel Gomes
    posted to crawling by ChaTo on 2005-06-30 11:34:03 as read
  • Crawling a Country: Better Strategies than Breadth-First for Web Page Ordering
    (2005), pp. 864-872.
    by Ricardo Baeza-Yates, Carlos Castillo, Mauricio Marín, Andrea Rodríguez
    posted to crawling by ChaTo on 2005-06-30 11:34:02 as read
  • NontriSearch: search engine for campus network
    (1998)
    posted to crawling by ChaTo on 2005-06-30 11:33:59 as read
  • High performance crawling system
    (2004), pp. 299-306.
    by Younès Hafri, Chabane Djeraba
    posted to crawling by ChaTo on 2005-06-30 11:33:54 as read
  • Crawling the infinite Web: five levels are enough
    Vol. 3243 (2004), pp. 156-167.
    by Ricardo Baeza-Yates, Carlos Castillo
    posted to crawling web-graph by ChaTo on 2005-06-30 11:33:51 as read
  • Design of a crawler with bounded bandwidth
    (2004), pp. 292-293.
    by Michelangelo Diligenti, Marco Maggini, Filippo M Pucci, Franco Scarselli
    posted to crawling by ChaTo on 2005-06-30 11:33:51 as read
  • Scheduling algorithms for Web crawling
    (2004), pp. 10-17.
    by Carlos Castillo, Mauricio Marín, Andrea Rodríguez, Ricardo Baeza-Yates
    posted to crawling by ChaTo on 2005-06-30 11:33:50 as read
  • Keeping up with the changing web
    IEEE Computer, Vol. 33, No. 5. (May 2000), pp. 52-58.
    by Brian Brewington, George Cybenko
    posted to crawling time by ChaTo on 2005-06-30 11:33:50 as read
  • UbiCrawler: a scalable fully distributed Web crawler
    Software, Practice and Experience, Vol. 34, No. 8. (2004), pp. 711-726.
    by Paolo Boldi, Bruno Codenotti, Massimo Santini, Sebastiano Vigna
    posted to crawling by ChaTo on 2005-06-30 11:33:50 as read along with 1 person donade
  • Design and Implementation of a Distributed Crawler and Filtering Processor
    Vol. 2382 (June 2002), pp. 58-74.
    by Demetrios Zeinalipour-Yazti, Marios D Dikaiakos
    posted to crawling by ChaTo on 2005-06-30 11:33:50 as read
  • GENVL and WWWW: Tools for taming the Web
    (May 1994)
    by Oliver A McBryan
    posted to crawling by ChaTo on 2005-06-30 11:33:48 as read
  • Finding What People Want: Experiences with the WebCrawler
    (May 1994)
    by Brian Pinkerton
    posted to crawling by ChaTo on 2005-06-30 11:33:48 as read
  • The RBSE spider: balancing effective search against web load
    (May 1994)
    posted to crawling by ChaTo on 2005-06-30 11:33:48 as read
  • WebBase
    (2002)
    by Lois Dacharay
    posted to crawling by ChaTo on 2005-06-30 11:33:48 as read
  • WebSPHINX, a personal, customizable Web crawler
    (2004)
    by Rob Miller
    posted to crawling by ChaTo on 2005-06-30 11:33:48 as read
  • Larbin
    (2004)
    by Sebastien Ailleret
    posted to crawling by ChaTo on 2005-06-30 11:33:48 as read
  • Mercator: A Scalable, Extensible Web Crawler
    World Wide Web Conference, Vol. 2, No. 4. (April 1999), pp. 219-229.
    by Allan Heydon, Marc Najork
    posted to crawling by ChaTo on 2005-06-30 11:33:48 as read
  • Balancing volume, quality and freshness in Web crawling
    (2002), pp. 565-572.
    by Ricardo Baeza-Yates, Carlos Castillo
    posted to crawling by ChaTo on 2005-06-30 11:33:48 as read
  • Design and Implementation of a High-Performance Distributed Web Crawler
    (February 2002)
    by Vladislav Shkapenyuk, Torsten Suel
    posted to crawling by ChaTo on 2005-06-30 11:33:48 as read
  • A new crawling model
    (2002)
    by Carlos Castillo, Ricardo Baeza-Yates
    posted to crawling by ChaTo on 2005-06-30 11:33:48 as read
  • CoBWeb - A crawler for the Brazilian Web
    (1999), pp. 184-191.
    by Altigran S da Silva, Eveline A Veloso, Paulo B Golgher, Berthier Ribeiro-Neto, Alberto HF Laender, Nivio Ziviani
    posted to crawling by ChaTo on 2005-06-30 11:33:48 as read
  • An adaptive model for optimizing performance of an incremental web crawler
    (May 2001), pp. 106-113.
    by Jenny Edwards, Kevin S McCurley, John A Tomlin
    posted to crawling by ChaTo on 2005-06-30 11:33:48 as read
  • Estimating Frequency of Change
    ACM Transactions on Internet Technology, Vol. 3, No. 3. (2003)
    by Junghoo Cho, Hector Garcia-Molina
    posted to crawling time by ChaTo on 2005-06-30 11:33:47 as read
  • Effective page refresh policies for Web crawlers
    ACM Transactions on Database Systems, Vol. 28, No. 4. (2003)
    by Junghoo Cho, Hector Garcia-Molina
    posted to crawling by ChaTo on 2005-06-30 11:33:47 as read
  • Synchronizing a database to improve freshness
    (2000), pp. 117-128.
    by Junghoo Cho, Hector Garcia-Molina
    posted to crawling by ChaTo on 2005-06-30 11:33:47 as read along with 1 person mapio
  • Breadth-first crawling yields high-quality pages
    (May 2001), pp. 114-118.
    by Marc Najork, Janet L Wiener
    posted to crawling by ChaTo on 2005-06-30 11:33:47 as read along with 1 person mapio
You can link to this page at: http://www.citeulike.org/user/ChaTo/tag/crawling
