Thursday, 18 September 2008

Institutions hate repositories... one simple reason.

Open access is not enough!

People want to give Open Access to some of their materials at their institution; however, the IR software is seen as a means to manage all institutional content, not just the part which is Open Access and forms the external image of the institution.
The problem exists in the other direction as well: repository software is trying to solve these problems itself, so people are not likely to use it until that functionality is included.

So what do we end up with...

Lots of Repository Islands which aren't interoperable with each other!

So if we solve the access and copyright issue, will people use the software? Errrr, no. At this point the software is an all-in solution and not a service which can be utilised by current institutional practice... Give up...?


Focus on providing a service, e.g. something which can manage your digital resources and can plug into existing institutional services. Some software projects would argue they support this already. OK, good, so don't try to solve the problem again if it is just an integration issue.

To the repositories: Decouple! Build a set of services, build ways of plugging services together and allow the community to pick 'n' mix.

To the institution: You already have access control systems; ask your Information/Computer Systems department. You probably already have a Content Management System for educational resources for students (Blackboard? - it integrates with an LDAP server), and these use external services to manage access and authentication! Here are a few services for you... LDAP, RADIUS, eduroam, Domain Controller.
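To make the "decouple" point a bit more concrete, here is a minimal Python sketch of a repository service that never checks a password itself and simply asks whatever the institution already runs. Everything named here (`Authenticator`, `InMemoryDirectory`, `RepositoryService`) is hypothetical, and the in-memory class only stands in for a real LDAP or RADIUS client.

```python
from __future__ import annotations

from typing import Protocol


class Authenticator(Protocol):
    """Anything that can check credentials (hypothetical interface)."""

    def authenticate(self, username: str, password: str) -> bool: ...


class InMemoryDirectory:
    """Stand-in for an institutional directory such as LDAP.

    Assumption: a real deployment would wrap the existing service instead.
    """

    def __init__(self, users: dict[str, str]) -> None:
        self._users = users

    def authenticate(self, username: str, password: str) -> bool:
        return self._users.get(username) == password


class RepositoryService:
    """Manages digital resources only; access control is plugged in."""

    def __init__(self, auth: Authenticator) -> None:
        self._auth = auth
        self._objects: dict[str, bytes] = {}

    def deposit(self, username: str, password: str, name: str, data: bytes) -> bool:
        # Delegate the access decision to the external service.
        if not self._auth.authenticate(username, password):
            return False
        self._objects[name] = data
        return True

    def fetch(self, name: str) -> bytes | None:
        # Open Access read: no credentials needed in this sketch.
        return self._objects.get(name)
```

Swapping `InMemoryDirectory` for a wrapper around the institution's own directory would not touch `RepositoryService` at all, which is the whole point.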

Till next time!

Saturday, 19 July 2008

Is Winning in Casinos a Bad Thing?

Totally off my usual topics, but by playing short games in the casinos in Vegas and quitting while ahead I'm up by just over $100. It's not a lot, but considering I've only been playing 1c machines and $5 - $10 blackjack I think that's quite cool! However, since I haven't lost yet, does that mean I'll now want to continue... could it get addictive? Considering I'm going to Atlantic City next week, is this bad news?

As for the technical note, follow the rules of blackjack on Wikipedia and make sure you buy in enough for 10-12 hands at minimum bet, and never bet more unless Wikipedia says you should! Also, when you are up, 3-4 decks' worth of rounds... get out!

This may not be the longest game in the world but you take the money off the casino!
Thank you Venetian/Palazzo Las Vegas!

Wednesday, 16 July 2008

#crigshow - Conference 2 - Worldcomp

Agents and Web Services... Why no collaboration?

Out of all the presentations at Worldcomp, this one struck me as covering one of the most obvious yet least explored areas for research in computer science. Probably the best-known agent system is the one used by the travel industry, where companies have standard ways of interfacing with each other to find details of travel and hotels available on a global scale. This is no mean feat given the number of companies hooking into this network.

So why doesn't the same exist for web services or if there is such a system why isn't everyone in the open community using it?

Surely the point of web services is for people to discover and use them in their own scenarios, just like the agents in the travel industry do. OK, so maybe the problem lies in the fact that there are so many communities that there will never be one specific use case or framework, and thus hosting a generic web service network becomes extremely hard given the number of different APIs and implementations.

OK so if you are going to use Agents in Web Services what issues do you need to consider? Also what do you gain through doing this?

One of the key ideas which came out of a talk at Worldcomp is to use agents as the intelligent front to a web service. This enables an agent to keep track of a set of web services, including information about each specific web service such as availability, versions, changing cost and an offline copy if the service allows this. So the agent becomes a Rendezvous Point for a series of web services.
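As a rough illustration of the rendezvous idea (not anything shown in the talk), here is a small Python sketch of an agent that tracks the properties listed above and picks a service on the caller's behalf. The class names and the "cheapest available" policy are my own assumptions.

```python
from __future__ import annotations

from dataclasses import dataclass, field


@dataclass
class ServiceRecord:
    """What the agent knows about one web service."""
    name: str
    version: str
    available: bool
    cost: float
    offline_copy: bytes | None = None  # only if the service allows caching


@dataclass
class RendezvousAgent:
    """Hypothetical agent fronting several equivalent web services."""
    services: dict[str, ServiceRecord] = field(default_factory=dict)

    def register(self, record: ServiceRecord) -> None:
        self.services[record.name] = record

    def update(self, name: str, *, available: bool | None = None,
               cost: float | None = None) -> None:
        # Record a change in availability and/or cost for a known service.
        record = self.services[name]
        if available is not None:
            record.available = available
        if cost is not None:
            record.cost = cost

    def cheapest_available(self) -> ServiceRecord | None:
        # Assumed policy: prefer the cheapest service that is currently up.
        live = [s for s in self.services.values() if s.available]
        return min(live, key=lambda s: s.cost) if live else None
```

A client would only ever talk to the agent, so services can come, go or change price without the client noticing.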

So why aren't we seeing more collaboration between the Agent community and the Web Services community?

Monday, 14 July 2008

#crigshow - Conference 1 - Oscelot

This open source day (#osdiii) hosted by Oscelot was an unconference which soon became based heavily around the Blackboard platform. This was expected, as the majority of people attending were then going on to attend the BbWorld conference. With the title of the conference being Open Source and yet the main topic being a closed source product, this gave the CRIG team an opening to promote the wider Open Source community to those who are focused on Blackboard use cases.

The day was a success for the team as we promoted good practices in web development, standards, resource management and the fact that the people who manage an eLearning platform have a responsibility to the content they hold.

From our point of view, we discovered: if Blackboard is the industry leader in learning management systems, then the repository community has big problems when it comes to archiving these resources using the methodologies each community currently practises.

More Collaboration and Awareness please!

Friday, 27 June 2008

OAI-PMH + OAI-ORE (Atom) + Pronom Droid = Pretty

I've just finished writing a wrapper (very simple!) which takes an OAI-ORE Resource Map in Atom format and classifies the objects which are listed in the Aggregation using the National Archives (UK) technical registry (Pronom).

The wrapper provides a simple front end to the DROID tool. It takes an OAI-PMH URI, requests the latest resource maps in Atom format (ore-atom) and creates a list of the resources, which are passed to DROID to classify directly.

The wrapper requires OAI-PMH as it requests all records which have been modified since it last did a parse of the repository. This way the wrapper can be scheduled to run once a day/week/month etc.

A single DROID XML file comes back as the output.
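For the curious, the harvesting half of that flow can be sketched in a few lines of Python. This is not the wrapper's actual source, just an illustration under some assumptions: the repository exposes its resource maps under the metadata prefix `ore-atom`, aggregated resources appear as Atom links whose `rel` is the ORE `aggregates` term, and the function names are made up. Handing the resulting list to DROID is left out, since I don't want to guess at its command line here.

```python
from __future__ import annotations

import xml.etree.ElementTree as ET
from urllib.parse import urlencode

ATOM_NS = "http://www.w3.org/2005/Atom"
ORE_AGGREGATES = "http://www.openarchives.org/ore/terms/aggregates"


def build_list_records_url(base_url: str, since: str,
                           metadata_prefix: str = "ore-atom") -> str:
    """OAI-PMH ListRecords request for records changed since `since` (YYYY-MM-DD)."""
    query = urlencode({
        "verb": "ListRecords",
        "metadataPrefix": metadata_prefix,
        "from": since,
    })
    return f"{base_url}?{query}"


def aggregated_uris(xml_text: str) -> list[str]:
    """Collect the href of every Atom link marked as an aggregated resource."""
    root = ET.fromstring(xml_text)
    return [
        link.attrib["href"]
        for link in root.iter(f"{{{ATOM_NS}}}link")
        if link.get("rel") == ORE_AGGREGATES and "href" in link.attrib
    ]
```

Remembering the date of the last run and passing it as `since` is what lets the job be scheduled daily, weekly or monthly without re-fetching everything.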

This is all working with EPrints repository software currently.

The next stage is to do something useful with the output XML, in terms of providing useful data back to the repository manager.

Total lines of source code for the wrapper: 302 :)

Sunday, 8 June 2008

Repository Software is Dead

Repository software for digital collections as we know it supplies the complete solution to the client; thus, without the software you cannot access any of the data in your repository. This is a bad thing for object reuse and digital preservation!

Many people at conferences such as Open Repositories 2008 and from workgroups like CRIG have been talking for a long while about the importance of interoperability. However, if you get rid of the need for interoperability and use a standard specification for accessing simple data objects (PDFs and their metadata), then you don't need interoperability!

So this leads me to the fact that EPrints, Fedora and hopefully at some point DSpace are abstracting their database and storage layers to support use of any type of storage platform. Thanks go to the Sun Microsystems preservation action group and open storage group for pushing this work from a commercial perspective. But we need to go further than this to get rid of the need for interoperability.

At Open Repositories 2008, a colleague, Ben O'Steen from Oxford University, and I showed how OAI-ORE (the OAI specification for Object Reuse and Exchange) can be used to enable high level repository interoperability. This work won us $5000 but more importantly got the community thinking about the true power of a specification like OAI-ORE. Ben and I are now hoping to push this work down to the low level storage such that the objects within an ORE map (documents and metadata) can be directly referenced without the need for the current repository layer. For this to happen all objects need to be stored in their simplest form - NO WRAPPER FORMATS ALLOWED at the lowest level.

From recent talks with Sandy Payette and Les Carr (Fedora and EPrints respectively) I am envisaging that the current repository software becomes classified as repository service software which is able to manage low level objects but is not specifically required to access these objects. So current services which plug into the repository software can act directly on the objects.

A couple of problems remain to be solved: security and consistency of cached data. Both are especially applicable if you have more than one piece of repository service software modifying your objects.

CRIG / IEDemonstrator After Thoughts

IEDemonstrator is a really bad name for a project as it just says Microsoft to me, but I'm fairly sure it isn't anything to do with that most stable of web browsers.

From the workshop it has become clear to me that discussing a specification for service interaction globally is going to be impossible. This could be due to the fact that SOAP did such a good job of it and no one wants to use anything else (enough sarcasm??). I think many people left the workshop with a much better idea of how HTTP error codes (which have been around for years) already go most of the way to solving a web service model. We also realised quickly that any specification would have to be built specifically for pay services (e.g. make use of the 402 code); this would then encourage companies/institutions to supply reliable services which last more than 4 years (cough AHDS cough).
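To show how far plain HTTP codes already get you, here is a small Python sketch of a client-side interpretation step. The outcome labels are a made-up vocabulary for illustration, not anything agreed at the workshop; the codes themselves come from the standard library's `http.HTTPStatus`.

```python
from http import HTTPStatus


def interpret(status: int) -> str:
    """Map an HTTP status code to a coarse service-interaction outcome."""
    if status == HTTPStatus.PAYMENT_REQUIRED:  # 402: the pay-service case
        return "payment required"
    if status == HTTPStatus.NOT_FOUND:  # 404
        return "no such service or resource"
    if status == HTTPStatus.SERVICE_UNAVAILABLE:  # 503
        return "service temporarily unavailable"
    if 200 <= status < 300:
        return "ok"
    if 400 <= status < 500:
        return "client error"
    if 500 <= status < 600:
        return "server error"
    return "other"
```

An agent or repository calling a service could branch on these outcomes without any extra envelope format on top of HTTP.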

Friday, 6 June 2008

First Post - CRIG DRY Workshop

Well there's a surprise!

CRIG DRY Workshop in Bath is where I am now. So what's happening:

People have been talking about services and proposed projects to provide authoritative and complete services to users/agents/repositories. A couple of themes have come out of the morning session for me:

SKOS: A lot of projects (incl. Library of Congress) are using this RDF language to describe subjects and properties. Each provides access to this information in so many different ways that it is hard to see how to interact in a consistent manner.

Service Interaction (read on as the name is not that descriptive)

This moves us on from the Open Storage stuff I've been working on (again, more later in another blog post) into how we facilitate the use of services and discover how to interact with these services. We are pushing for the use of HTTP codes! CRIG it.

That's it for now....