The MERLIN project (Metadata Enrichment for Repositories in a London Institutional Network, 2009-2011) demonstrates and evaluates the use of off-the-shelf text-mining and thesaurus tools to derive descriptive subject classifications from repository deposits.
The MERLIN partners are UCL (University College London), the University of London Computer Centre (ULCC), and the University of Nottingham, in association with NaCTeM, the UK’s National Centre for Text Mining.
The testbed for MERLIN is the SHERPA-LEAP Consortium’s LASSO aggregation service. MERLIN uses terms extracted from texts stored in LASSO’s source repositories to demonstrate automated enhancements to discovery and navigation within the LASSO service. Other tools, for use with individual repositories, are also under development.