☆ 4.4 Article

Web robot detection in the scholarly information environment

JOURNAL OF INFORMATION SCIENCE (2008)

Journal

JOURNAL OF INFORMATION SCIENCE

Volume 34, Issue 5, Pages 726-741

Publisher

SAGE PUBLICATIONS LTD

DOI: 10.1177/0165551507087237

Keywords

electronic journals; robot detection; web crawlers; web log analysis

Categories

Computer Science, Information Systems; Information Science & Library Science

Abstract

An increasing number of robots harvest information on the world wide web for a wide variety of purposes. Protocols developed at the inception of the web laid out voluntary procedures in order to identify robot behaviour, and exclude it if necessary. Few robots now follow this protocol and it is now increasingly difficult to filter for this activity in reports of on-site activity. This paper seeks to demonstrate the issues involved in identifying robots and assessing their impact on usage in regard to a project which sought to establish the relative usage patterns of open access and non-open access articles in the Oxford University Press published journal Glycobiology, which offers in a single issue articles in both forms. A number of methods for identifying robots are compared and together these methods found that 40% of the raw logs of this journal could be attributed to robots.
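
The abstract does not list the specific detection methods that were compared, so the following is only a minimal sketch of two heuristics commonly used in web log analysis: flagging any host that requests /robots.txt (the file defined by the voluntary exclusion protocol) and flagging any host whose user-agent string contains a crawler keyword. The Combined Log Format pattern, the keyword list, the function names and the sample log lines are all assumptions made for illustration; none of them come from the paper.

```python
import re

# Assumption: logs are in Apache/NCSA Combined Log Format.
LOG_PATTERN = re.compile(
    r'^(?P<host>\S+) \S+ \S+ \[(?P<time>[^\]]+)\] '
    r'"(?P<method>\S+) (?P<path>\S+)[^"]*" (?P<status>\d{3}) \S+ '
    r'"(?P<referrer>[^"]*)" "(?P<agent>[^"]*)"$'
)

# Illustrative keyword list, not an authoritative catalogue of robots.
AGENT_KEYWORDS = ("bot", "crawler", "spider", "slurp")


def parse_line(line):
    """Return the fields of one log line as a dict, or None if it does not parse."""
    match = LOG_PATTERN.match(line.strip())
    return match.groupdict() if match else None


def find_robot_hosts(lines):
    """Collect hosts that fetched /robots.txt or sent a crawler-like user agent."""
    robot_hosts = set()
    for line in lines:
        entry = parse_line(line)
        if entry is None:
            continue
        agent = entry["agent"].lower()
        path_only = entry["path"].split("?", 1)[0]
        if path_only == "/robots.txt" or any(k in agent for k in AGENT_KEYWORDS):
            robot_hosts.add(entry["host"])
    return robot_hosts


def robot_share(lines):
    """Fraction of parseable log lines that come from hosts flagged as robots."""
    entries = [e for e in map(parse_line, lines) if e is not None]
    if not entries:
        return 0.0
    robots = find_robot_hosts(lines)
    flagged = sum(1 for e in entries if e["host"] in robots)
    return flagged / len(entries)


# Hypothetical log lines using documentation-reserved IP addresses.
SAMPLE_LOG = [
    '192.0.2.10 - - [01/Jan/2007:00:00:01 +0000] "GET /robots.txt HTTP/1.1" 200 120 "-" "Mozilla/5.0"',
    '192.0.2.10 - - [01/Jan/2007:00:00:02 +0000] "GET /cgi/content/full/1 HTTP/1.1" 200 5120 "-" "Mozilla/5.0"',
    '198.51.100.7 - - [01/Jan/2007:00:01:00 +0000] "GET /cgi/content/full/2 HTTP/1.1" 200 4096 "-" "ExampleBot/1.0"',
    '203.0.113.5 - - [01/Jan/2007:00:02:00 +0000] "GET /cgi/content/full/1 HTTP/1.1" 200 5120 "http://example.org/" "Mozilla/5.0"',
    '203.0.113.5 - - [01/Jan/2007:00:03:00 +0000] "GET /cgi/content/abstract/3 HTTP/1.1" 200 2048 "-" "Mozilla/5.0"',
]

if __name__ == "__main__":
    print(sorted(find_robot_hosts(SAMPLE_LOG)))
    print(robot_share(SAMPLE_LOG))
```

Flagging by host rather than by individual request means a robot that announces itself once, for example by fetching /robots.txt, has all of its other requests attributed to robot activity as well. Because few robots follow the protocol, as the abstract notes, heuristics like these would undercount on their own, which is presumably why several methods were combined.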
