Web Archive Collection Extractor

This method extracts event-centric collections from Web archives using focused crawling. The key idea is to adapt focused Web crawling to previously collected Web archives: starting from relevant documents, it iteratively follows their links and selects the linked documents that are also relevant.
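The link-following selection described above can be illustrated with a minimal sketch. It is not the author's implementation, which according to this record depends on Hadoop. The in-memory `Map` standing in for the archive, the keyword-based `relevance` score, the breadth-first frontier, the threshold, and all class and method names are assumptions made purely for illustration.

```java
import java.util.ArrayDeque;
import java.util.ArrayList;
import java.util.Deque;
import java.util.HashSet;
import java.util.List;
import java.util.Locale;
import java.util.Map;
import java.util.Set;

class ArchiveFocusedCrawl {

    /** Hypothetical stand-in for an archived document: its text and outgoing links. */
    static final class Page {
        final String text;
        final List<String> links;

        Page(String text, List<String> links) {
            this.text = text;
            this.links = links;
        }
    }

    /** Toy relevance score (an assumption): fraction of event keywords found in the text. */
    static double relevance(Page page, Set<String> keywords) {
        if (keywords.isEmpty()) {
            return 0.0;
        }
        String lower = page.text.toLowerCase(Locale.ROOT);
        long hits = keywords.stream()
                .filter(k -> lower.contains(k.toLowerCase(Locale.ROOT)))
                .count();
        return (double) hits / keywords.size();
    }

    /** Follows links only from relevant documents, looking pages up in the archive instead of the live Web. */
    static List<String> extract(Map<String, Page> archive, List<String> seeds,
                                Set<String> keywords, double threshold) {
        Deque<String> frontier = new ArrayDeque<>(seeds);
        Set<String> seen = new HashSet<>(seeds);
        List<String> collection = new ArrayList<>();
        while (!frontier.isEmpty()) {
            String url = frontier.poll();
            Page page = archive.get(url);
            if (page == null) {
                continue; // the URL was never captured in the archive
            }
            if (relevance(page, keywords) < threshold) {
                continue; // irrelevant documents are neither selected nor expanded
            }
            collection.add(url);
            for (String link : page.links) {
                if (seen.add(link)) { // enqueue each link at most once
                    frontier.add(link);
                }
            }
        }
        return collection;
    }

    public static void main(String[] args) {
        Map<String, Page> archive = Map.of(
                "a", new Page("Olympic games opening ceremony", List.of("b", "c", "missing")),
                "b", new Page("Olympic medal table", List.of("d")),
                "c", new Page("Weather forecast", List.of("e")),
                "d", new Page("Interview with Olympic athletes", List.of()),
                "e", new Page("Olympic closing ceremony", List.of()));
        System.out.println(extract(archive, List.of("a"), Set.of("olympic"), 0.5));
    }
}
```

In this toy archive, page `e` matches the keyword but is linked only from the irrelevant page `c`, so it is never reached. This is the trade-off of expanding only relevant documents: the collection stays focused, at the cost of missing relevant pages that sit behind irrelevant ones.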

Additional Info
Field Value
Accessibility Both
AccessibilityMode Download
Availability On-Line
Basic rights Download, Copying, Distribution, Modification, Communication, Making available to the public
CreationDate 2016-12-15
Creator Gossen, Gerhard, gossen@l3s.de, orcid.org/0000-0001-8492-1103
Dependencies on Other SW Hadoop
Field/Scope of use Any use
Owner Gossen, Gerhard, gossen@l3s.de, orcid.org/0000-0001-8492-1103
ProgrammingLanguage Java
RelatedPaper Gerhard Gossen, Elena Demidova, and Thomas Risse. 2016. Analyzing web archives through topic and event focused sub-collections. In Proceedings of the 8th ACM Conference on Web Science (WebSci '16). DOI: 10.1145/2908131.2908175
Sublicense rights No
Territory of use World Wide
ThematicCluster Web Analytics
UsageMode Download
system:type Method
Management Info
Field Value
Author Gossen, Gerhard
Maintainer Gossen, Gerhard
Version 1
Last Updated 5 January 2021, 19:49 (CET)
Created 26 September 2019, 12:28 (CEST)