The overall objective is to invert bibliographic data from its traditional format, where each record describes a document, into a CV-style format that has the author as the heading and the documents written by that author underneath. This allows navigation of the bibliographic space by author. It also prepares the ground for performance evaluation of authors.


There are two sources. One is a set of simple document data from the OKFN-sponsored 3lib project. These data are de facto open because they contain only factual descriptions of documents: titles, author names, identifiers. The document data describe scientific articles and preprints. The other source is a set of author profiles that are openly available from AuthorClaim.


Authors are referenced in bibliographic information by name, and names are ambiguous. There are many ways to write the name of a single person; we call these "name expressions". Conversely, several persons may share the same valid name expression.

Since names don't identify authors, AuthorProfile cannot do a reliable job from names alone. The AuthorClaim project allows authors to claim documents, but only a very small share of documents is subject to author claims at this time. The claiming authors are the people for whom authoritative publication lists are available. For the others we have to use name expressions. We look at bibliographic data records containing such author name expressions and create files, one for each author name expression. We call this process "auversion".
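The inversion step can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the record layout (`id`, `title`, `authors`), the document ids, the titles and the co-author name "A. Author" are all hypothetical, since the actual 3lib field names are not described here, and the result is kept in memory rather than written to one file per name expression.

```python
from collections import defaultdict

# Hypothetical records; the real 3lib field names may differ.
records = [
    {"id": "doc1", "title": "First example title", "authors": ["James Griffin", "A. Author"]},
    {"id": "doc2", "title": "Second example title", "authors": ["J. Griffin"]},
    {"id": "doc3", "title": "Third example title", "authors": ["James Griffin"]},
]


def auvert(records):
    """Invert document records into a mapping from each author name
    expression to the ids of the documents that carry it."""
    by_name = defaultdict(list)
    for record in records:
        for name in record["authors"]:
            by_name[name].append(record["id"])
    return dict(by_name)


auverted = auvert(records)
for name, doc_ids in sorted(auverted.items()):
    print(name, doc_ids)
```

Note that "James Griffin" and "J. Griffin" end up under separate headings: the sketch groups by the literal name expression and makes no attempt to decide whether two expressions refer to the same person.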

The system will have a list of author pages as its top-level navigation entry. Author pages can only be constructed for AuthorClaim registrants. However, most AuthorClaim registrants have co-authors, and most of these are not yet registered. These non-registered co-authors then provide entry points to author name expressions, whose own co-authors lead to further name expressions, and so on. Thus a substantial part of the "auverted" bibliographic data can be reached from the author pages.


In addition to navigating a set of authors (not implemented yet), we plan two navigational features. First, we want to link from an "auverted" author name page to the closest registered author. By "closest" we mean the one reached by the shortest path of intermediate author name expressions through co-authorship. This is partly implemented on our test system. We call this "vertical integration". Second, we want to provide links between related author name expressions. Assume, for example, that we have the author name J. Griffin but also James Griffin. We want to create a link from the J. Griffin author name expression to the James Griffin author name expression. We want to do a similar thing for diacritics, linking from expressions with diacritics to those without and back. We call these links between author name expressions that may refer to the same person "horizontal integration".
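Both features can be sketched roughly as follows. This is not the code running on the test system. It assumes that vertical integration can be treated as a breadth-first search over a co-authorship graph, and that horizontal integration can start from a simple "initial or same name after removing diacritics" test. All function names and all sample names other than J. Griffin and James Griffin are made up for illustration.

```python
import unicodedata
from collections import defaultdict, deque


def coauthor_graph(author_lists):
    """Connect every pair of name expressions that share a document."""
    graph = defaultdict(set)
    for authors in author_lists:
        for a in authors:
            graph[a].update(b for b in authors if b != a)
    return graph


def closest_registered(graph, start, registered):
    """Vertical integration: breadth-first search for the shortest
    co-authorship path from `start` to any registered name expression.
    Returns the path as a list, or None if no registered author is reachable."""
    queue = deque([[start]])
    seen = {start}
    while queue:
        path = queue.popleft()
        if path[-1] in registered:
            return path
        for neighbour in sorted(graph.get(path[-1], ())):
            if neighbour not in seen:
                seen.add(neighbour)
                queue.append(path + [neighbour])
    return None


def strip_diacritics(name):
    """Drop combining marks; letters without a Unicode decomposition
    (for example "ø") are left unchanged."""
    decomposed = unicodedata.normalize("NFKD", name)
    return "".join(c for c in decomposed if not unicodedata.combining(c))


def may_abbreviate(short, full):
    """Horizontal integration: True if `short` could be another way of
    writing `full` (same surname, given names equal or reduced to initials)."""
    s = strip_diacritics(short).lower().replace(".", "").split()
    f = strip_diacritics(full).lower().replace(".", "").split()
    if len(s) != len(f) or s[-1] != f[-1]:
        return False
    return all(a == b or (len(a) == 1 and b.startswith(a))
               for a, b in zip(s[:-1], f[:-1]))


# Hypothetical data: each inner list is the author list of one document.
graph = coauthor_graph([
    ["James Griffin", "A. Author"],
    ["A. Author", "B. Writer"],
    ["B. Writer", "C. Registered"],
])
path = closest_registered(graph, "James Griffin", {"C. Registered"})
```

A match from `may_abbreviate` only says that two expressions may refer to the same person, which is all that horizontal integration claims. The test is directional, so linking "back" means checking both argument orders.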

Current state

A debugging/testing demonstrator of the system is available here.


The 3lib dataset includes the IUCR data from the JISC-funded open bibliography project. However, since that dataset is very small, it is unlikely to be visible in the demonstrator that we have running.

submitted 17 Feb '11, 07:55

jrgriffiniii


edited 17 Feb '11, 16:14

Seems interesting, but I'm not sure I understand exactly what would be built (the demonstrator link is just an index directory). Could you summarize very briefly at the top what would be built?

(21 Apr '11, 09:23) rgrp ♦♦
