Tagger for hermeneutic markup

Friday, July 2nd, 2010

I’m currently interested in hermeneutic markup, hence my proposal: let’s try a web-based tagger that lets users define custom tags and apply them to passages of the current web page. As Wendell pointed out, tags should be able to overlap while remaining clearly visible. We could go for a Firefox plugin implementation and write out the tags and tagging locally. Or we could try a full-fledged Ajax tagger that talks to a server. We could also add complexity and save the tagging as TEI standoff markup and the tags as TEI feature structures. Or, going the other way, we could just prototype some GUIs and lay out the architecture for such a tagger, preferably one that fits both the plugin and the Ajax solution, to allow server-based as well as client-only usage.
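As a starting point for discussion, here is a minimal sketch of how overlapping tags might be kept apart from the page text as standoff ranges. Everything in it is an assumption made for illustration: the `Annotation` record, the helper names, and the use of plain character offsets. It does not attempt TEI serialisation or any browser integration.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Annotation:
    """One application of a user-defined tag to a character range (hypothetical model)."""
    tag: str
    start: int  # inclusive character offset into the page text
    end: int    # exclusive character offset


def annotate(text, tag, passage):
    """Build an annotation for the first occurrence of `passage` in `text`."""
    start = text.index(passage)
    return Annotation(tag, start, start + len(passage))


def tags_at(annotations, offset):
    """Return the tags covering a given character offset, sorted by name."""
    return sorted(a.tag for a in annotations if a.start <= offset < a.end)


def overlapping_pairs(annotations):
    """Return pairs of annotations whose ranges overlap, partially or fully."""
    ordered = sorted(annotations, key=lambda a: (a.start, a.end))
    pairs = []
    for i, first in enumerate(ordered):
        for second in ordered[i + 1:]:
            if second.start >= first.end:
                break  # everything later starts even further to the right
            pairs.append((first, second))
    return pairs


text = "In the beginning was the Word, and the Word was with God."
annotations = [
    annotate(text, "origin", "In the beginning was the Word"),
    annotate(text, "logos", "the Word, and the Word was"),
]
```

Because the text itself is never modified, the two ranges can overlap freely: `tags_at(annotations, 25)` reports both tags for the first "Word", and `overlapping_pairs(annotations)` lists the one overlapping pair that a GUI would need to render distinctly.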


Herding Archivists – and other ideas

Friday, July 2nd, 2010

I’m only able to attend the second day of THATCamp, so will probably end up fitting in with someone else’s session proposal, but here are a couple of ideas I’d be interested in exploring if possible:

  1. Crowdsourcing Archival Description.  Most archive organisations have significant cataloguing backlogs, which restricts access to collections.  But they also have experience of managing volunteer cataloguing projects.  How can this experience best be translated to an online world?  What would a crowdsourced cataloguing framework look like?  What similar projects already exist as examples of best practice?  Are there available guidelines or principles for participatory UI design which would be relevant here? What do we already know about what motivates people to contribute to crowdsourcing projects?  Is there a minimal level of cataloguing or structure which needs to be provided which the crowd can then add to?  What filters need to be provided to allow participants to choose what they catalogue and in what quantities?  What models of participant/user segmentation can be used to plan and evaluate crowdsourced cataloguing projects?  This could be either a discussion session or include an element of practical prototyping, if someone was up for it.
  2. Herding Archivists.  My background is in small to medium-sized archives in the UK, employing usually between 1 and 30 members of staff (almost all of whom will usually have a humanities background).   Although nominally ‘available’ to the human end-user, information about local collections is all too often inaccessible to developers, hidden inside proprietary database applications.  How can we engage archivists in opening up this data for the developer community, getting across the idea that ‘big computing’ can be as applicable to archives and the humanities as it is to science?  Can we identify some ‘quick wins’?  Can the developers at THATCamp articulate for a non-technical audience of archivists what formats and methods of making archival data available would be most useful to them (this links to one of the points in Kicking off the Developers’ Challenge)?  What barriers stand in the way of making archival data available (e.g. IPR constraints, the complexity of hierarchical archival descriptive practices & standards)?  How do we encourage young archivists to work in this space (spot the other archivists attending THATCamp…)?

Rethinking the Digital Scholarly Edition

Thursday, July 1st, 2010

I’d like to propose a brainstorming session in which we re-imagine, rethink, and reconceptualize the digital edition.  Some of the questions we could begin with include: What are current digital editions doing well?  What are they lacking?  How can we make them more useful for both professional readers and the general public?  What do we imagine the digital edition/text to look like in 5 or 10 years?  How will the nature of the digital edition change with advances in technology?  I imagine beginning with broad theoretical questions, but I’d also like to suggest that we spend some time getting our hands dirty with prototypes and wireframes.  Perhaps a DSE UI challenge?

Kicking off the Developers’ Challenge

Thursday, July 1st, 2010

THATCamp London is running a Developers’ Challenge (DC) in parallel with THATCamp London and DH2010. (see information about this at  As a part of the Challenge, one of the first sessions in the THATCamp will be a get-together session where some of the data providers meet some or all of the developer-contestants. In addition, this session will be the place where DC participants can get copies of data that we currently hold only in CD or DVD format.

We hope the exchange of information at this session will be in both directions!  The data providers can, of course, describe some aspects of their data to the developers, but we also think that the developers might want to share ideas about how that data could be made available in ways that make it more readily exploitable by developers such as themselves.

Although this session will be launching the Developers’ Challenge and there will be some business to do related to that, we hope the session will not be restricted to DC participants only!  Please join us if you have views on the broader topic of technical issues around the reuse of scholarly digital data.

… John Bradley and Gabriel Bodard

TEXTvre and GATE?

Thursday, July 1st, 2010

What about some work related to TEXTvre / TextGrid and GATE – e.g. integration of GATE services with TEXTvre?

Exploiting the ATAM Space

Wednesday, June 30th, 2010

THATCamp London/DH2010 is being held in King’s recently renovated facility called the Anatomy Theatre and Museum (ATAM) — a Grade 1 listed space which is described as a facility “for exploration and innovation in performance and e-research.”  The ATAM’s website describes the Museum space as equipped with facilities such as “white wall storyboarding with e-beams, software to support thinking, deliberation, and creativity, an access grid equipped studio space with multi-directional digital recording facility, sprung performance floor, and immersive screen and sound environment.”

It would be great if, as a part of the THATCamp experience, one or more groups proposed sessions that exploited some of the specialised facilities available.  The ATAM’s technical expert will be with us throughout the time there, so many things can be arranged on short notice.  We would welcome your suggestions.  If you would like to explore some ideas you might have in more detail, please contact me, John Bradley at

Participatory, Interdisciplinary and Digital

Saturday, June 26th, 2010

There’s much being said, written and created around the digital humanities at the moment. There’s less, from what I can see, around digital engagement with social research. I’m an interdisciplinary social researcher (researching biblical studies/theology and disability studies). I’m interested in ways in which social research can learn from the digital humanities project.

Participatory social research is interested in engaging individuals and communities, empowering them to use research in ways that benefit them. In my (probably quite limited!) view, it’s about the empowerment of research participants, while the digital humanities project is about a different kind of wider engagement – involving information, archives and texts. I think there are fascinating potential links to explore here.

There are also opportunities to forge interdisciplinary links between social and humanities research because of the possibilities offered by the digital humanities approach. I would like us to discuss ways in which these interdisciplinary opportunities can be exploited, without exploiting research participants.

This session may explore some, all or none of the following questions:
– Beyond Survey Monkey: How far can digital peer collaboration make space for more participatory approaches to social research?
– Ethics: Can digital engagement in social research happen effectively without raising major ethical issues for social researchers? (e.g. Do research blogs have to be locked if they mention a point that a participant raised? Do confidentiality and engagement conflict?)
– Interdisciplinary Co-operation: How can bringing digital artefacts from across disciplines to research participants encourage wider engagement in research? (e.g. I’m thinking about how I can meaningfully share examples of historical and current religious representations of disability with my participants, and whether they would find that useful.)

I’ve never been to an unconference, and I’m new to the digital humanities. So any help or advice anyone can offer would be very much appreciated, either here or at THATCamp London!

Where is your text going?

Friday, June 25th, 2010

I am a tool builder currently focused on building a tool that facilitates transcription of manuscripts housed in digital archives like Parker on the Web and e-codices by performing image analysis to locate each line in an image, and presenting each line to the user for transcription. It really makes transcription and proofreading easier. One of the issues I need to address in my tool is how to ensure users can get the transcriptions they create into the editing or text analysis environments they want to use such as Juxta and TextGrid.

In my experience working with such tools, I have found the data import process to be rather difficult, and documentation lacking.  I want to build a list of the quality editing and text analysis tools people are using and what makes each one stand out. I will then create detailed instructions for getting text into them, both from my tool and from a more generalized text source. The result will serve as a resource for other tool builders, as well as potential users of the tools.

Telling Stories to your Computer

Tuesday, June 22nd, 2010

A group of us have been working on a (semi-unofficial) project to describe the narrative content within and across media objects ( Recently we have begun looking at bootstrapping the system by playing with TEI marked up plays to identify basic character (and character-interaction) information, in addition to our ongoing discussion about ways to semi-automate or crowd-source the annotation process.
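To give a flavour of what bootstrapping from TEI plays could involve, here is a rough sketch that pulls speaker and scene information out of a tiny TEI fragment. It is not the project's own code. The sample play, the character identifiers and the function names are invented for illustration, and it assumes each `<sp>` carries a single pointer in its `who` attribute and that scenes are marked as `<div type="scene">`. Counting who speaks in the same scene is only a crude proxy for character interaction.

```python
import xml.etree.ElementTree as ET
from collections import Counter
from itertools import combinations

TEI_NS = {"tei": "http://www.tei-c.org/ns/1.0"}

# Invented sample data; a real play would be loaded from a TEI file.
SAMPLE = """<TEI xmlns="http://www.tei-c.org/ns/1.0">
  <text><body>
    <div type="scene" n="1">
      <sp who="#alice"><p>Good morrow.</p></sp>
      <sp who="#bob"><p>And to you.</p></sp>
      <sp who="#alice"><p>Shall we walk?</p></sp>
    </div>
    <div type="scene" n="2">
      <sp who="#bob"><p>She is gone.</p></sp>
      <sp who="#carol"><p>Then follow her.</p></sp>
    </div>
  </body></text>
</TEI>"""


def speech_counts(root):
    """Count speeches per character, keyed by the value of @who."""
    speeches = root.iterfind(".//tei:sp", TEI_NS)
    return Counter(sp.get("who") for sp in speeches if sp.get("who"))


def scene_cooccurrence(root):
    """Count, for each pair of characters, the scenes in which both speak."""
    pairs = Counter()
    for scene in root.iterfind(".//tei:div[@type='scene']", TEI_NS):
        speakers = sorted(
            {sp.get("who") for sp in scene.iterfind(".//tei:sp", TEI_NS) if sp.get("who")}
        )
        pairs.update(combinations(speakers, 2))
    return pairs


root = ET.fromstring(SAMPLE)
counts = speech_counts(root)
links = scene_cooccurrence(root)
```

The resulting `counts` and `links` could seed a character network that human annotators or a crowd then correct and enrich.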

We will be presenting some of our work at DH but I am interested in discussing topics related to semantically describing the narratives within texts (from actual textual data to multimedia sources), ranging from the applications that data could be used for to the issues of annotation, search/filtering, analysis, visualisation/presentation and how to make the data easily available/useful to researchers.

I’d also be very interested in talking to other people who had material/projects that they thought might benefit from this type of annotation as we are always looking for different use-cases to explore the possibilities of.

“Sum” THATCamp possibilities?

Thursday, June 17th, 2010

As a participant in the upcoming THATCamp I was asked to outline a session I’d like to have. Hmm… Well, I think I can brainstorm a few possibilities:

  • Exploiting R – R is an increasingly popular open source statistical tool/programming language. I’d like to get together with others to discuss how it can be used in the digital humanities.
  • Graphing texts – There are many ways texts can be “measured”. I can count the number of words, the parts of speech, and the reading level. I’ve begun to count the “greatness” of a book as described in a number of blog postings. Once these sorts of things are measured, I’d like to discuss with people ways these measurements can be illustrated through the use of charts and graphs. A picture is worth a thousand words.
  • Integrating digital humanities with libraries – As a librarian one of my ultimate goals is to figure out ways digital humanities computing techniques can be seamlessly integrated into library collections and services. Instead of a library “catalog” simply pointing a person to a text, I’d like it to offer services allowing the user to… use the text. Maybe we can create a prototype of such a thing.
  • Reducing ambiguity – In one of my “experiments” I wanted to assess a set of works’ use of the word “being”, as in the thing, but the analysis returned too many false-positives because the word was being used as a verb and not a noun. Such a problem is not uncommon, and I’m wondering how it can be resolved.
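On the ambiguity question, one direction worth discussing is looking at the word's immediate context. A trained part-of-speech tagger would be the more usual route. The sketch below is only a crude stand-in to make the idea concrete: it labels “being” as a noun when the preceding word is one of a few hand-picked cues. The cue list and the function name are assumptions for illustration, and the heuristic will certainly misclassify some cases.

```python
import re

# Hand-picked words that, directly before "being", suggest the noun sense.
# This list is an illustrative assumption, not a linguistic resource.
NOUN_CUES = {"a", "the", "human", "supreme", "living", "every", "each", "divine"}


def classify_being(text):
    """Return (preceding_word, label) for each occurrence of 'being' in text."""
    tokens = re.findall(r"[A-Za-z']+", text.lower())
    results = []
    for i, token in enumerate(tokens):
        if token != "being":
            continue
        previous = tokens[i - 1] if i > 0 else ""
        label = "noun" if previous in NOUN_CUES else "verb"
        results.append((previous, label))
    return results


sample = "Every human being wonders why the word was being used as a verb."
labels = classify_being(sample)
```

Here `labels` separates the “human being” occurrence from the “was being used” one, so only the former would be counted in an analysis of “being” as a thing.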

‘Just some ideas, and please be gentle with me. I’m a noob.

Eric Lease Morgan