RSP event on thesis deposit

February 15, 2011 in Uncategorized by Peter Webster

I notice an interesting event on March 28th at the British Library. It brings together results from two projects: our own Josh’s JISC-funded “Influencing the deposit of E-Theses in Higher Education” study, and the BL’s own evaluation of EThOS.

Further details from the RSP: Supporting and Influencing the deposit of E-Theses in Higher Education

Overlay Journals

April 21, 2010 in Uncategorized by Josh Brown

It’s been a whirl of different projects over the last six months, with the working groups starting up, MERLIN moving forward and other new collaborations on the horizon. One of the research projects I have undertaken explored overlay journals, specifically in the context of the Repository Interface for Overlaid Journal Archives (RIOJA) project. Based on my original research findings, I wrote a briefing paper which gave a simple definition of the overlay journal, examined the history and evolution of overlay journals, and laid out some thoughts about their potential to contribute to scholarly communication.

The RSP have accepted the final version as one of their guides, and it gained a citation in Alma Swan’s report for the JISC, Modelling Scholarly Communication Options: Costs and Benefits for Universities. I will also be presenting a poster based on the paper at OR10 in Madrid in June (which will be the 4th distinct version of my original paper, excluding edits and amendments!).

Repository Manager Wish Lists

April 21, 2010 in Uncategorized by Richard Davis

At a previous meeting, I offered some thoughts on some of the ‘more interesting’ things that you can do with a repository, in the form of a presentation called “1001 Things To Do With A Live Repository (Part 1)”. This has been followed by many discussions with experts in Repoland, from EPrints developers to JISC consultants and project managers, about exactly what tools and guidance would be of most use to Repository Managers. One thing many people agree on is that EPrints in particular, and repository applications in general, would benefit from having a good manual, something, for example, in the O’Reilly style. There is a lot of material available online, from technical instruction at the Web sites of EPrints et al., to guidance on administration and advocacy – notably the outputs of the JISC RSP and SHERPA projects. But too often it can involve following out-of-date links or wading through online technical discussions, and many people agree that it would be beneficial to pull as much of this as possible together. One leading figure in Repoland agreed that an up-to-date, relevant, accessible manual would be a great idea, “but who has the time or money?”
With that in mind, we thought that SHERPA-LEAP might be able to dip its toe in the water: perhaps we can start to compile and share links and guidance for our members, if only we can work out where to start, and what’s really needed.

Funder Mandates – a new guide.

April 20, 2010 in Uncategorized by Josh Brown

A new guide to funder mandates has been prepared, and is available to anyone who would like to adapt it for their own institution. UCL have adopted it, and their version is available here.

The brief was to create a simple, easy guide for researchers to a) finding out if they are covered by a mandate and b) making their work compliant.

No page requires the researcher to scroll, all details are condensed to their absolute essentials and the design is based on three steps -

  1. Check your funder’s policy.
  2. Check your publisher complies with your funder’s policy.
  3. Make your work open access.

Even the busiest researcher should be able to find time for that!

If any other members of the consortium would like me to add any other funder policies to the list summarised, please let me know and I will prepare a summary for you to use. Otherwise, feel free to rebrand the pages for your own institution and let me know if you want any updates, changes or additions.

If you post any requests as a reply to this entry, I’ll get to it asap…

LEAP meeting latest

April 20, 2010 in Uncategorized by Josh Brown

The most recent consortium meeting was held at UCL on the 15th of April.

We began by welcoming some new members to the group – Ashley Cousins, from the Institute of Cancer Research and Briony Fane and Bethan Adams from St George’s University of London. ICR were members of LEAP a while ago, but fell out of the loop after staffing changes a few years ago. St George’s have just set up their CRIS, using Symplectic, and are in the process of starting their repository, using EPrints.

I reported on the work being done in the e-theses and media working groups – see the threads in the forum for more information, and reported on some of the other work I’ve been doing.

Rory McNicholl from the University of London Computing Centre gave us an update on the work that’s being done for the MERLIN project, adding text mining based search to LASSO. The project has used the TerMine tool (from the National Centre for Text Mining) to extract weighted terms from the full text items held in our repositories. These are then displayed as either a tag cloud or a list alongside ‘traditional’ search results. Clicking on a term brings up a variety of options for adding the term to your results, making the results screen more dynamic, and giving new ways to search. The new interface is under development, but the latest ‘draft’ is here, although it will ‘break’ periodically as Rory continues bashing its innards.
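For anyone curious how weighted terms might become a tag cloud, here is a minimal sketch in Python. It is not the MERLIN or LASSO code: the function name, the pixel range and the example weights are all made up for illustration, and it simply assumes each term arrives with a numeric weight.

```python
def tag_cloud_sizes(weighted_terms, min_px=12, max_px=36):
    """Map each term's weight to a font size between min_px and max_px.

    weighted_terms is assumed to be a dict of {term: weight}; the pixel
    range is an arbitrary choice for this illustration.
    """
    if not weighted_terms:
        return {}
    lowest = min(weighted_terms.values())
    highest = max(weighted_terms.values())
    span = highest - lowest
    sizes = {}
    for term, weight in weighted_terms.items():
        # If every weight is identical, give all terms the midpoint size.
        scale = (weight - lowest) / span if span else 0.5
        sizes[term] = round(min_px + scale * (max_px - min_px))
    return sizes


# Hypothetical weights, not real output from any text mining tool.
example_terms = {"open access": 9.0, "metadata": 6.0, "embargo": 3.0}
example_sizes = tag_cloud_sizes(example_terms)
```

Linear scaling is the simplest choice; a real interface might prefer a logarithmic scale so that one very heavy term does not dwarf the rest.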

Peter Webster, from SAS, presented an overview of some in-depth interviews he’s undertaken with users of the SAS Space repository. He posed some really interesting questions for discussion, such as where our time is best spent – branding our own repositories, or building content and feeding into subject resources? – and to what extent self-archiving is an obvious good, from a researcher’s perspective. His research feeds into the work being done by the e-theses working group, so we’ll be including more discussion of it in the notes of our next meeting.

Richard Davis took the final session, in which he explored some ideas about what kinds of things repository managers wanted to be able to do with their repositories, but might be held back from attempting by the technical demands of the task. He’ll be blogging his own report on his session, so that’s as much as I’ll say here!

Overall, it was a very positive meeting. It was great to get a sense of just how much work is being done in the consortium, and great to get a chance to share ideas. My thanks to all those who came, and special thanks to Rory, Peter and Richard for getting up and providing so much food for thought.

Happy new year!

January 4, 2010 in Uncategorized by Josh Brown

And welcome back to work – I hope you all had a great break.

It’s looking like this January’s going to be a busy one.

I’m finishing off a couple of reports and we have the media and the theses working group meetings next week. At the end of the month, there’s the MERLIN project steering group meeting to prepare for… Check out the demonstrator for a snapshot of where we’re taking the LASSO interface – it’s very much a work in progress, but we’re definitely heading for a really nice search tool. All the terms in the tag cloud come from the actual full text of items in LEAP repositories – we’ve got the text mining working! The next steps are to improve the functionality of the search (adding new ways of using those mined terms in the cloud for example) and developing the interface.

There will be new reports next week on what’s going on in the working groups, and some documents. Any feedback is much appreciated – let us know if you would like us to add anything to our agenda, or if you would like us to address anything on your behalf.

I’ll be looking to set up the next LEAP meeting soon as well – please drop me a line with any suggestions for items you’d like to discuss, speakers you’d like to invite or achievements you’d like to showcase…

That’s enough for now, but I’ll see you all soon. In the meantime, I’m just glad the sun’s out for the first day back!

Text Mining for Scholarly Communications and Repositories

November 3, 2009 in Uncategorized by Josh Brown

On the 28th and 29th of October, I attended a joint workshop, organised by the National Centre for Text Mining (NaCTeM) and UKOLN in Manchester, on Text Mining for Scholarly Communications and Repositories.

Text mining achieves something which I think is quite unusual. It manages to be both fiendishly technical (operating somewhere between computer science, linguistics and information retrieval) and absolutely fascinating to the non-specialist (in this case, me). When you see what it can do, and get a sense of what the future could have in store for the technology, it gets very interesting and exciting indeed.

The aim of this two-day workshop was to showcase a few of the clever ways that text mining is already being used, and to sketch a possible future in which text mining tools are deployed in all sorts of ways across scholarly communications. The talks were, taken individually, occasionally baffling but in combination gave some fascinating insights into what text mining can do for researchers, librarians and publishers.

There were demonstrations of the work being done at European Bioinformatics Institute/UKPMC and the Royal Society of Chemistry that showed the potential of text mining applications to find useful scientific information hidden or buried in the literature. For example, some of the most useful information for chemists is about what didn’t work – if you can be sure that a given set of reactions won’t work, you can save a lot of time and money by not bothering to repeat them. This information is often very difficult to find in the literature, but by letting a computer sift through the text of many thousands of reports and datasets for you, you can find it with a huge saving in time.

A repository-specific set of applications, UKOLN’s FixRep project and Intute Repository Search, demonstrated that text mining is set to change the way we handle our metadata. FixRep scans the full text and uses the information in it to complete metadata fields – an obvious benefit – while Intute Repository Search complements metadata by extracting keywords from the full text content of repositories and exposing them for retrieval alongside more formal metadata. If we add our own MERLIN project, which takes the idea of extracting key words from full text and using them to automatically create a subject tree, opening up new relations between items for searchers, then we get a vision of just how sophisticated text mining-based automatic metadata generation, usage and mapping could become, and how powerful the new tools it offers will be.
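To make the idea of complementing metadata with mined keywords a little more concrete, here is a toy sketch in Python. It is emphatically not how FixRep, Intute Repository Search or MERLIN work: it uses plain word counting in place of real text mining, and the stopword list, function names and record fields are all assumptions made for the example.

```python
import re
from collections import Counter

# A deliberately tiny stopword list, assumed for this illustration only.
STOPWORDS = {"the", "and", "for", "that", "with", "this", "from", "are"}


def extract_keywords(full_text, top_n=5):
    """Return the top_n most frequent non-stopword words in the text."""
    words = re.findall(r"[a-z]+", full_text.lower())
    counts = Counter(w for w in words if len(w) > 2 and w not in STOPWORDS)
    return [word for word, _ in counts.most_common(top_n)]


def complement_metadata(record, full_text, top_n=5):
    """Copy a metadata record and add mined keywords beside the formal ones.

    The 'keywords' and 'mined_keywords' field names are hypothetical.
    """
    enriched = dict(record)
    existing = {k.lower() for k in record.get("keywords", [])}
    mined = [k for k in extract_keywords(full_text, top_n) if k not in existing]
    enriched["mined_keywords"] = mined
    return enriched
```

Keeping the mined terms in a separate field, rather than merging them into the cataloguer’s keywords, means the formal metadata stays untouched while both can still be exposed for retrieval.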

Examples of text mining applications being used by real-life researchers provided a fascinating insight into the ways in which the technologies can enable researchers to efficiently exploit what is now literature overload. There were demonstrations, from the Institute of Education, of how text mining can make systematic reviews quicker and more effective by pre-digesting a huge number of papers and drawing out the relevant ones for reviewers to use, and from Cambridge of how text mining can expand citation analysis by enabling researchers to separate out positive and negative citations.

Tony Hey from Microsoft External Research and Sidi Rafael from Elsevier gave us some ideas about how the current technological landscape offers the potential for the tools and technologies that we had been shown to explode in a way that will change the landscape of scholarly communication dramatically. Tony Hey emphasised the raw computing power available via cloud computing services, enabling vastly power-hungry calculations and processes to be undertaken much more cheaply at a huge data farm. He argued that today’s research has so much data at its disposal that we are entering a new paradigm which is computationally intensive and involves sifting and combining huge data sets as a core activity.

Sidi Rafael emphasised the way in which more and more big companies are opening up their development process to create new products using the ingenuity of their users. Elsevier will be opening up platforms for opportunistic developers to create new tools, a development that will offer rapid evolution of the services available. Taken together, this vision of huge computational power and rapidly evolving services suggests that the power and potential of text mining tools are about to explode in all sorts of ways.

The workshop rounded up with what is usually politely termed a ‘lively discussion’, in which some of the legal issues that remain to be addressed (open access, copyright, re-use and so on) were clearly named as barriers to the effective exploitation of text mining tools. Those issues aside, the tone was overwhelmingly optimistic for the future of text mining as a new and emerging engine for scholarly communication.

Additional documents and embargoes

September 17, 2009 in Ars Technica by Josh Brown

If you want to:

  • Add a published version to a preprint in your archive
  • Set an embargo
  • Add or remove the ‘request a copy’ button and contact details

Then check out this screencast from Richard Davis showing you exactly how to do it.

LEAP meeting, thanks to you all!

September 10, 2009 in Uncategorized by Josh Brown

Just to say again what a pleasure it was to see so many of you yesterday. Thanks for all your input and suggestions, and a special thank you to the speakers for the afternoon – four very interesting presentations and a lot to think about. Thanks again to Beth at SOAS for providing the venue and the biscuits!

The themes for the day seemed to work well, and it looks like there are some great opportunities for collaboration. Dominic Tate’s presentation on the future plans of the RSP was great, and I for one am particularly interested in the possibility of RSP funding for LEAP advocacy events – if we can time them to coincide with RSP campaigns, so much the better. The idea of building our next wave of advocacy around funder mandates seemed popular, and would enable us to really publicise the work being done by the mandate working group as well.

Adrian Clark’s contribution to the meeting sparked quite a few ideas, and I think the various strands of research we’re undertaking, and Peter’s suggestions for looking at graduate students’ attitudes to theses mandates could lead to some useful work. I’m going to send some requests for information on proposed mandates to the UKCoRR list and continue to examine institutional and funder mandates – please add to the Mandates thread in the forum if you have any news or information to share.

Steve Grace’s presentation on R4R showed a potentially really interesting way of linking publications metadata to other kinds of research information using CERIF. I think the adaptation of CERIF for the REF could prove informative in all sorts of ways – and provides us all with a pathway to better integration with other campus information systems. The more systems we work with, the more visible we are, and if we can use the REF to push our repositories up the institutional agenda, well, it can’t be bad can it? If you’re using the IR for the REF, keep us posted on what you’re doing, how it’s going, what challenges you face and how you overcome them.

Richard Davis gave some great examples of ways we can customise our interfaces to make them more meaningful to users. The more we can incorporate the kind of interactions users want in our pages, the more attractive we can make our repositories. I also particularly like the idea of using folksonomies – on one level, and this is the librarian in me coming out, I hate the concept, but pragmatically speaking, more and more people use them as discovery tools, and we’d be mad to miss the boat. We have two sets of users, researchers and researchers. It’s all too easy to focus on researchers who have research to add to our repositories and forget about the ones who want to get research out of them again. Anything we can do to make it easier and friendlier for them to do it has to be a good thing.

It was great to see some agreement around the working groups – I’ll be adding to the forum threads later, and planning initial meetings in the coming weeks and we should have some progress to report by the next LEAP meet. I think the online library of policy documents about what to accept, copyright, anything you’re willing to share basically, will be a real help to members who may have to revisit their policies or may be starting from scratch. Please upload them to the forum, and if you encounter a tricky question, let us know!

Finally, if you have any particular success or good ideas for advocacy, share them! This is an ongoing concern for us all, and it would be great to get as many ideas as possible for us to take forward.

I’m looking forward to the next meeting in January!

OA Repositories in the Arts

July 20, 2009 in Uncategorized by Richard Davis

I thought it worth mentioning the RSP event at the British Academy last Tuesday (14th July) discussing Arts-based repository approaches. I talked about the little PRIMO repo project we did with IMR. Also on the bill was the rather more substantial KULTUR repo of UAL and Southampton.

Jacqueline was there too, and I’ve posted a very brief report on ULCC’s DA Blog so won’t rehash that. Worth noting though the growing pressure to move repositories beyond the PDF/journal-article model, without breaking the bank. I am also personally quite interested in whether it’s possible to devolve wizzy dissemination functions to services like YouTube and Flickr, while still keeping master copies and metadata in the IR.

Bill Hubbard agreed to take some of these themes forward for future RSP activities – I’ll report back anything I hear.