2 Results for: Book/Issue: WIDM '12: Proceedings of the twelfth international workshop on Web information and data management

research-article
November 2012
TitleFinder: extracting the headline of news web pages based on cosine similarity and overlap scoring similarity
WIDM '12: Proceedings of the twelfth international workshop on Web information and data management, November 2012, pp 65–72, https://doi.org/10.1145/2389936.2389950

Automatically extracting the headline of online web articles has many applications in web mining and information retrieval. In this paper, we developed a content-based, domain- and language-independent approach, TitleFinder, for unsupervised extraction ...
Metrics
Total Citations: 3
Total Downloads: 275
Last 12 Months: 5
Last 6 weeks: 1

research-article
November 2012
Web crawler middleware for search engine digital libraries: a case study for citeseerX
WIDM '12: Proceedings of the twelfth international workshop on Web information and data management, November 2012, pp 57–64, https://doi.org/10.1145/2389936.2389949

Middleware is an important part of many search engine web crawling processes. We developed a middleware, the Crawl Document Importer (CDI), which selectively imports documents and the associated metadata to the digital library CiteSeerX crawl repository ...
Metrics
Total Citations: 4
Total Downloads: 255
Last 12 Months: 5
Last 6 weeks: 0
