Evaluation Metrics & Example Test Data for HP Web Content Identification Challenge – Multimedia Grand Challenge 2010

Evaluation Metrics & Example Test Data for HP Web Content Identification Challenge

The goal of algorithms for the HP Web Content Identification Challenge is to retrieve or label all the informative multimedia content in web pages. Performance is measured by comparing the automatically computed results with manually labeled ground truth. The ground truth is generated by manually labeling all the informative multimedia content (i.e., images, video, and Flash objects) in a pre-selected set of web pages in various languages. An algorithm is expected to retrieve nearly all the informative multimedia content in those web pages.

 

Precision is defined as the number of informative images correctly labeled by the algorithm divided by the total number of images the algorithm labeled as informative. In other words, precision is the number of true positives divided by the sum of true positives and false positives. Recall is defined as the number of informative images correctly labeled by the algorithm divided by the total number of informative images (i.e., those that should have been labeled as informative). In other words, recall is the number of true positives divided by the sum of true positives and false negatives. The final comparison between algorithms will be made by computing F-measures from the precision and recall.
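As an illustration of the definitions above, here is a minimal Python sketch that scores one page. It is not the organizers' evaluation code. The function name `precision_recall_f`, the use of item URLs as identifiers, and the sample file names are all hypothetical. The page does not say which F-measure variant is used, so the sketch assumes the balanced F1 score (the harmonic mean of precision and recall).

```python
def precision_recall_f(predicted, ground_truth):
    """Score one page. Both arguments are collections of item identifiers
    (e.g. image URLs); this identifier scheme is an assumption."""
    predicted = set(predicted)
    ground_truth = set(ground_truth)

    true_pos = len(predicted & ground_truth)    # informative and labeled informative
    false_pos = len(predicted - ground_truth)   # labeled informative, but not
    false_neg = len(ground_truth - predicted)   # informative, but missed

    # Guard against empty denominators (no predictions or no ground truth).
    precision = true_pos / (true_pos + false_pos) if true_pos + false_pos else 0.0
    recall = true_pos / (true_pos + false_neg) if true_pos + false_neg else 0.0

    # Assumed variant: balanced F1, the harmonic mean of precision and recall.
    if precision + recall:
        f_measure = 2 * precision * recall / (precision + recall)
    else:
        f_measure = 0.0
    return precision, recall, f_measure


# Hypothetical example: 3 correct labels, 1 false alarm, 2 misses.
predicted = {"a.jpg", "b.jpg", "c.jpg", "ad.gif"}
ground_truth = {"a.jpg", "b.jpg", "c.jpg", "d.swf", "e.jpg"}
p, r, f = precision_recall_f(predicted, ground_truth)
print(f"precision={p:.2f} recall={r:.2f} F={f:.2f}")
```

In this made-up example, precision is 3/4 and recall is 3/5. How scores would be aggregated across pages (per-page averaging versus pooling all counts) is not stated on the page, so the sketch leaves that open.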

 

Here are examples of the types of web pages that we plan to use to evaluate the submissions:

 

* English

http://www.mapquest.com/maps?1c=Palo+Alto&1s=CA&2c=San+Francisco+&2s=CA (Driving directions)

 

http://www.buy.com/specialty_store_1/promotions/33379.html (shopping)

(Note: images of similar products or other product recommendations are considered informative.)

 

http://edition.cnn.com/2009/TECH/05/27/ship.sinking.reef/index.html (news)

 

* Chinese

http://news.sina.com/oth/phoenixtv/502-104-103-108/2009-05-27/01323899156.html (entertainment articles)

 

http://www.china.travel/sym/lyhd/2009-05-14/274878.shtml (travel)

 

http://www.yahtour.com/destination/province.php?id=2238 (travel)

 

 

* Arabic

http://www.aljazeera.net/NR/exeres/5CC37A8B-39E7-4692-BCD9-2D8807ACE580.htm

 

http://www.marma.net/content.prt-CID=16303

 

* Korean

http://news.chosun.com/site/data/html_dir/2009/05/28/2009052800708.html (news)

 

http://blog.naver.com/honeykja/40045216645 (blog, recipe)

 
