It’s getting to the end of the year, and I’m feeling a little retrospective while (anxiously) looking forward to the future. We have enjoyed a great year with Open Context (see here).
More generally, it’s obviously been a big year for all things “open.” The White House has embraced Open Access and Open Data policies, and even recognized the work of some advocates of reform, and that has been hugely exciting. It seems that the arguments for greater openness have finally led to some meaningful changes. All of these are signs of real progress.
However, I’m increasingly convinced that advocating for openness in research (or government) isn’t nearly enough. There’s been too much of an instrumentalist justification for open data and open access. Many advocates talk about how it will cut costs and speed up research and innovation. They also argue that it will make research more “reproducible” and transparent so interpretations can be better vetted by the wider community. Advocates for openness, particularly in open government, also talk about the wonderful commercial opportunities that will come from freeing research.
This last justification boils down to creating a “research commons” in order to remove impediments for (text, data) mining of that commons in order to foster entrepreneurialism and create wealth. This is pretty explicit here in this announcement from Europeana, the EU’s major open culture system (now threatened with devastating cuts). I don’t have a problem with wealth creation as an outcome of greater openness in research. Who doesn’t want more wealth? However, we need to ask about wealth creation for whom and under what conditions. Will the lion’s share of the wealth created on newly freed research only go to a tiny elite class of investors? Will it simply mean a bit more profit for Google and a few other big aggregators? Will this wealth be taxed and redistributed enough to support and sustain the research commons exploited to feed it? The fact that the new OSTP embrace of “Open Data” in research is an unfunded mandate makes me worry about the prospect of “clear-cutting” the open data commons.
These are all very big policy issues, but they need to be asked if the “open movement” really stands for reform and not just a further expansion and entrenchment of Neoliberalism. I’m using the term “Neoliberalism” because it resonates as a convenient label for describing how and why so many things seem to suck in Academia. Exploding student debt, vanishing job security, increasing compensation for top administrators, expanding bureaucracy and committee work, corporate management methodologies (Taylorism), and intensified competition for ever-shrinking public funding all fall under the general rubric of Neoliberalism. Neoliberal universities primarily serve the needs of commerce. They need to churn out technically skilled human resources (made desperate for any work by high loads of debt) and easily monetized technical advancements.
This recent White House announcement about making universities “partner at the speed of business” could not be a clearer example of the Neoliberal mindset. It was written by Tom Kalil, one of the administration’s leading advocates for open science. The same White House that has embraced “open government,” “open science,” and “open data” has also ruthlessly fought whistle-blowers (Snowden), perpetuated ubiquitous surveillance (in conjunction with telecom and tech giants), hounded Aaron Swartz (my take here), and secretly negotiated the TPP, a far reaching expansion of intellectual property controls and punishments. All of these developments happened in a context of record corporate profits and exploding wealth inequality. And yes, I think these are all related trends.
How can something so wonderful and right as “openness” further promote Neoliberalism? After all, aren’t we the rebels blasting at the exhaust vents of Elsevier’s Death Star? But in selling openness to the heads of foundations, businesses, governments and universities, we often end up adopting the tropes of Neoliberalism. As a tactic, that’s perfectly reasonable. As a long-term strategy, I think it’s doomed.
The problem is not that the Open Movement is wrong. The problem is that the need for reform goes far deeper than simply making papers and data available under CC-By or CC-Zero. Exploitative publishing regimes are symptomatic of larger problems in the distribution of wealth and power. The concentration of wealth that warps so much of our political and economic life will inevitably warp the Open Movement toward unintended and unwanted outcomes.

Let them Eat Cake Open Data
Let’s face it. Most researchers that I know who are lucky enough to be employed are doing the work of 4 or 5 people (see also this paper by Rosalind Gill). Even some of my friends, lucky enough to have tenure or tenure-track positions, seem miserable. Maybe it’s survivor guilt, but they are stressed, distracted, and harried. Time and attention are precious and spent judiciously, usually in a manner where rewards are clear and certain. Data management plans, data sharing or collaboration on GitHub? Who has time for all that?! They don’t count for much in the academic rat-race, and so the normative reward structures in the Academy create perverse incentives for neglecting or outright hoarding data.
Data sharing advocates talk about how data should get rewarded just like other forms of publication. Data should “count” with measurable impacts. As a data sharing advocate, much of this really does appeal to me. Making data sharing and collaboration part of the mainstream would be fantastic. If we convince universities to monitor data citation metrics, they can “incentivize” more data sharing. We can also monitor participation in social media (Twitter), version control (GitHub), etc. All of these statistics can be compiled and collated to provide an even more totalizing picture of a researcher’s contributions.
But are more metrics (even Alt-metrics) really the solution to the perverse incentives embodied by our existing metrics? The much derided “Impact Factor” started out as a way for librarians to make more informed choices about journal subscriptions (at least according to this account). In that context, the Impact Factor was relatively benign, but it then became a tool for Taylorism and the (coercive) monitoring of research outputs by university bureaucracies (see this history). That metric helps shape who gets hired and fired. And while metrics can be useful tools, the Impact Factor case shows how metrics can be used by bureaucracies to reward and punish.
What does all of this have to do with the Open Movement?
One’s position as a subordinate in today’s power structures is partially defined by living under the microscope of workplace monitoring. Does such monitoring promote conformity? The freedom, innovation, and creativity we hope to unlock through openness requires greater toleration for risk. Real and meaningful openness means encouraging out-of-the-ordinary projects that step out of the mainstream. Here is where I’m skeptical about relying upon metrics-based incentives to share data or collaborate on GitHub.
By the time metrics get incorporated into administrative structures, the behaviors they measure aren’t really innovative any more!
Worse, as certain metrics grow in significance (meaning – they’re used in the allocation of money), entrenched constituencies build around them. Such constituencies become interested parties in promoting and perpetuating a given metric, again leading to conformity.
Metrics, even better Alt-metrics, won’t make researchers or research more creative and innovative. The crux of the problem centers on a Hunger Games-style “winner take all” dynamic that pervades both commerce and the Academy. A rapidly shrinking minority has any hope of gaining job security or the time and resources needed for autonomous research. In an employment environment where one slip means complete ejection from the academy, risk-taking becomes quasi-suicidal. With employment increasingly precarious, professional pressures balloon in ways that make risk taking and going outside of established norms unthinkable. Adding more or better metrics without addressing the underlying job security issues just adds to the ways people will be ejected from the research community.
Metrics, while valuable, need to carry fewer professional consequences. In other words, researchers need freedom to experiment and fail and not make every last article, grant proposal, or tweet “count.”

Equity and Openness
“Big Data,” “Data Science,” and “Open Data” are now hot topics at universities. Investments are flowing into dedicated centers and programs to establish institutional leadership in all things related to data. I welcome the new Data Science effort at UC Berkeley to explore how to make research data professionalism fit into the academic reward systems. That sounds great! But will these new data professionals have any real autonomy in shaping how they conduct their research and build their careers? Or will they simply be part of an expanding class of harried and contingent employees hired and fired through the whims of creative destruction fueled by the latest corporate-academic hype-cycle?
Researchers, including #AltAcs and “data professionals”, need a large measure of freedom. Miriam Posner’s discussion about the career and autonomy limits of Alt-academic-hood helps highlight these issues. Unfortunately, there’s only one area where innovation and failure seem survivable, and that’s the world of the start-up. I’ve noticed how the “Entrepreneurial Spirit” gets celebrated a lot in this space. I’m guilty of basking in it myself (10 years as a quasi-independent #altAc in a nonprofit I co-founded!).
But in the current Neoliberal setting, being an entrepreneur requires a singular focus on monetizing innovation. PeerJ and Figshare are nice, since they have business models that are less “evil” than Elsevier’s. But we need to stop fooling ourselves that the only institutions and programs that we can and should sustain are the ones that can turn a profit. For every PeerJ or Figshare (and these are ultimately just as dependent on continued public financing of research as any grant-driven project), we also need more innovative organizations like the Internet Archive, wholly dedicated to the public good and not the relentless pressure to commoditize everything (especially their patrons’ privacy). We need to be much more critical about the kinds of programs, organizations, and financing strategies we (as a society) can support. I raised the political economy of sustainability issue at a recent ThatCamp and hope to see more discussion.
In reality, so many of the Academy’s dysfunctions are driven by our new Gilded Age’s artificial scarcity of money. With wealth concentrated in so few hands, it is very hard to finance risk taking and entrepreneurialism in the scholarly community, especially any form of entrepreneurialism that does not turn a profit in a year or two.
Open Access and Open Data would make so much more of a difference if we had the same kind of dynamism in the academic and nonprofit sector as we have in the for-profit start-up sector. After all, Open Access and Open Data can be key enablers to allow much broader participation in research and education. However, broader participation still needs to be financed: you cannot eat an open access publication. We cannot gloss over this key issue.
We need more diverse institutional forms so that researchers can find (or found) the kinds of organizations that best channel their passions into contributions that enrich us all. We need more diverse sources of financing (new foundations, better financed Kickstarters) to connect innovative ideas with the capital needed to see them implemented. Such institutional reforms will make life in the research community much more livable, creative, and dynamic. It would give researchers more options for diverse and varied career trajectories (for-profit or not-for-profit) suited to their interests and contributions.
Making the case to reinvest in the public good will require a long, hard slog. It will be much harder than the campaign for Open Access and Open Data because it will mean contesting Neoliberal ideologies and constituencies that are deeply entrenched in our institutions. However, the constituencies harmed by Neoliberalism, particularly the student community now burdened by over $1 trillion in debt and the middle class more generally, are much larger and very much aware that something is badly amiss. As we celebrate the impressive strides made by the Open Movement in the past year, it’s time we broaden our goals to tackle the needs for wider reform in the financing and organization of research and education.
It has long been part of my philosophy that archaeology needs to communicate what it does and what it finds out to the widest possible audiences. It must do this to stay relevant in popular society so that people continue to care about the past, and so that there can still be an archaeological profession to help understand and record it. As archaeologists we have a moral duty to disseminate as widely as possible.
So it is very heartening to see that top-ranking technology blog Gizmodo have a feature on archaeology. In the post Lasers, Drones, and Future Tech on the Front Lines of Archaeology, Gizmodo ask archaeologist James Newhard about the use of technology in the course of his work. He discusses 3D capture, drones, and Reflectance Transformation Imaging (RTI), and does a bit of blue-sky thinking about where the technology could go in the future.
Earlier in 2012, the excellent Linked Ancient World Data Institute was held in New York at the Institute for the Study of the Ancient World (ISAW). During this symposium, Leif and Elton convinced many participants that they should contribute their data to the Pelagios project, and I was one of them.
I work for a project based at the British Museum called the Portable Antiquities Scheme, which encourages members of the public within England and Wales to voluntarily record objects that they discover whilst pursuing their hobbies (such as metal-detecting or gardening). The centrepiece of this project is a publicly accessible database which has been on-line in various guises for over 13 years, and the latest version is now in a position to produce interoperable data much more easily than previously.

The Portable Antiquities Scheme database
Within the database that I have designed and built (using Zend Framework, jQuery, Solr and Twitter Bootstrap), we now hold records for over 812,000 objects, with a high proportion of these being Roman coin records (175,000+ at the time of writing, some with more than 1 coin per record). Many of these coins have mints attached (over 51,000 are available to all access levels on our database, with a further 30,000 or so held back due to our workflow model). Aligning these mints with Pleiades place identifiers was straightforward due to the limited number of places involved, and required only the simple addition of columns to our database. Where possible, these mints have also been assigned identifiers from Nomisma, Geonames and Yahoo!’s WOEID system (although that might be on the way out with the recent BOSS news). However, some mints I haven’t been able to assign, for instance ‘mint moving with Republican issuer’ or the ‘C’ mint, which has an unknown location.
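The mint alignment described above can be pictured with a small Python sketch. This is not the PAS schema or code: the `Mint` class and its field names are hypothetical, and the Pleiades identifiers shown are illustrative and should be checked against Pleiades before reuse. It only shows the idea of an extra identifier column per mint, left empty where no place can be assigned.

```python
from dataclasses import dataclass
from typing import Optional

PLEIADES_BASE = "https://pleiades.stoa.org/places/"


@dataclass
class Mint:
    """One row of a hypothetical mints table with an added Pleiades column."""
    name: str
    pleiades_id: Optional[int] = None  # None when no fixed place can be assigned

    @property
    def pleiades_uri(self) -> Optional[str]:
        # Build the place URI only when an identifier has been recorded.
        if self.pleiades_id is None:
            return None
        return f"{PLEIADES_BASE}{self.pleiades_id}"


# Illustrative rows; the identifiers are assumptions to verify against Pleiades.
mints = [
    Mint("Rome", 423025),
    Mint("London", 79574),
    Mint("Mint moving with Republican issuer"),  # no single location
]

aligned = [m for m in mints if m.pleiades_uri is not None]
unaligned = [m.name for m in mints if m.pleiades_uri is None]
```

Keeping the identifier optional means unassignable mints stay in the table and are simply skipped when linked data is generated.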
Once these identifiers were assigned in the database, it allowed easy creation of RDF for use by the Pelagios project and it also facilitated use of their widgets to enhance our site further. To create the RDF for ingestion by Pelagios, a cron job runs every Sunday night and makes a cURL request to our Solr search index, which dumps XML. This XML is transformed by XSLT on our server, and s3sync then sends the dump to Amazon S3 (where we have incremental snapshots). These data grow at the rate of around 100 to 200 coins a week, depending on staff time, knowledge and whether the state of the coin allows one to attribute a mint (around 45% of the time). The PAS database also has the facility for error reporting and commenting on records, so if you use the attributions provided through Pelagios and find a mistake, do tell us!
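To make the dump-and-transform step more concrete, here is a rough Python sketch that stands in for the XSLT stage. It is an assumption-laden illustration, not our production transform: the Solr field names (`id`, `pleiadesID`) and the record base URL are hypothetical, and the W3C Open Annotation terms (`oa:hasBody`, `oa:hasTarget`) are used as a stand-in for whatever vocabulary Pelagios expects.

```python
import xml.etree.ElementTree as ET

OA = "http://www.w3.org/ns/oa#"
RDF_TYPE = "http://www.w3.org/1999/02/22-rdf-syntax-ns#type"
PLEIADES = "https://pleiades.stoa.org/places/"
RECORD_BASE = "https://example.org/record/"  # hypothetical record URL prefix

# A tiny sample in the shape of a Solr XML response; field names are assumed.
SAMPLE = """<response>
  <result name="response" numFound="2" start="0">
    <doc>
      <str name="id">1001</str>
      <int name="pleiadesID">423025</int>
    </doc>
    <doc>
      <str name="id">1002</str>
    </doc>
  </result>
</response>"""


def solr_xml_to_ntriples(xml_text):
    """Turn each Solr doc that has a Pleiades id into three N-Triples lines."""
    root = ET.fromstring(xml_text)
    lines = []
    for doc in root.iter("doc"):
        fields = {el.get("name"): el.text for el in doc}
        place = fields.get("pleiadesID")
        if not place:
            continue  # records without an attributed mint are skipped
        record = f"{RECORD_BASE}{fields['id']}"
        anno = f"{record}#annotation"
        lines.append(f"<{anno}> <{RDF_TYPE}> <{OA}Annotation> .")
        lines.append(f"<{anno}> <{OA}hasBody> <{PLEIADES}{place}> .")
        lines.append(f"<{anno}> <{OA}hasTarget> <{record}> .")
    return lines


triples = solr_xml_to_ntriples(SAMPLE)
```

Each annotation simply links one record (the target) to one Pleiades place (the body), which is all an aggregator needs in order to index the record by place.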
At some point in the future, I plan to try and match data extracted from natural language processing (using Yahoo geo tools and OpenCalais) against Pleiades identifiers and attempt to make more annotations available to researchers and Pelagios.
For example, this object, WMID-3FE965, the Staffordshire Moorlands patera or trulla, has the following inscription with place names:
This is a list of four forts located at the western end of Hadrian’s Wall: Bowness (MAIS), Drumburgh (COGGABATA), Stanwix (UXELODUNUM) and Castlesteads (CAMMOGLANNA). It incorporates the name of an individual, AELIUS DRACO, and a further place-name, RIGOREVALI. These can further be given Pleiades identifiers as such:
These emperor pages also pull in various resources from third party websites (such as Adrian Murdoch’s excellent talking head video biographies of Roman emperors), data from dbpedia, nomisma, viaf and the site’s internal search engine. The same approach is also used, but in a more pared down way for all other issuer periods on our website, for example: Cnut the Great.
Integrating Johan’s map tiles

Following on from Johan’s posting on the magnificent set of map tiles that he’s produced for the Pelagios project (and as seen in use over at the Pleiades site and OCRE), I’ve now integrated these into our mapping system. I’ve done it slightly differently to the examples that Johan gave; due to the volume of traffic that we serve up, it wasn’t fair to saddle the Pelagios team with extra bandwidth. Therefore, Johan provided zipped downloads of the map tiles and I store these on our server (if you’re a low traffic site, feel free to use our tile store).

Imperium map layer, with parish boundary. Zoom level 10.
The map zoom has been set to the level (10 for Great Britain) at which we decided site security was ensured for the discovery points (although Johan has made tiles available to level 11). This complements the other layers we use:
- Open Street Map
- soil map
- Stamen map watercolor
- Stamen map toner
- NLS historic OS maps
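The zoom cap mentioned above can be sketched in Python using the standard "slippy map" tile arithmetic. This is an illustration only: it assumes the tiles follow the common XYZ (z/x/y) layout, and the `tile_url` helper and base URL are hypothetical rather than the address of our tile store.

```python
import math

MAX_PUBLIC_ZOOM = 10  # the cap described in the post for Great Britain


def tile_for(lat, lon, zoom):
    """Return (z, x, y) for a point, never zooming past the public cap."""
    z = min(zoom, MAX_PUBLIC_ZOOM)
    n = 2 ** z
    x = int((lon + 180.0) / 360.0 * n)
    y = int((1.0 - math.asinh(math.tan(math.radians(lat))) / math.pi) / 2.0 * n)
    # Keep indices inside the grid for edge coordinates such as lon = 180.
    x = max(0, min(n - 1, x))
    y = max(0, min(n - 1, y))
    return z, x, y


def tile_url(base, lat, lon, zoom):
    """Build a tile URL under an assumed {base}/{z}/{x}/{y}.png layout."""
    z, x, y = tile_for(lat, lon, zoom)
    return f"{base}/{z}/{x}/{y}.png"
```

Clamping on the server side means that a client asking for zoom 14 still only receives level 10 detail, so find spots cannot be pinned down more precisely than intended.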
Each find spot is also reverse geocoded to produce a WOEID and a Geonames identifier and to obtain its elevation, and we subsequently link to Aaron Straup Cope’s excellent woedb for further enhancement of place data. We also serve up boundaries derived from the Ordnance Survey Opendata BoundaryLine dataset, split from shapefiles and converted to KML by ogr2ogr scripts. The incorporation of this layer allows researchers (over 300 projects currently use our data) to interpret the results that they get from searches on our database against the road network and settlement data much more easily, and it has already gathered many positive comments from our staff and research colleagues.
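As a rough idea of what the shapefile-to-KML step can look like, the Python sketch below builds an ogr2ogr command that extracts one boundary into its own KML file. The `-f KML` and `-where` options are standard ogr2ogr flags, but the file names and the `NAME` attribute field are assumptions for illustration and are not taken from our scripts.

```python
def ogr2ogr_kml_command(src_shapefile, dst_kml, name_field, name_value):
    """Build an argv list that writes one named boundary from a shapefile to KML."""
    # Double any single quotes so the attribute filter stays valid.
    escaped = name_value.replace("'", "''")
    where = f"{name_field} = '{escaped}'"
    # ogr2ogr takes the destination dataset first, then the source.
    return ["ogr2ogr", "-f", "KML", "-where", where, dst_kml, src_shapefile]


# Hypothetical file and field names; run with subprocess.run(cmd, check=True).
cmd = ogr2ogr_kml_command("parish_region.shp", "st_marys.kml", "NAME", "St Mary's")
```

Passing the arguments as a list, rather than as one shell string, avoids quoting problems with boundary names that contain spaces or apostrophes.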
By contributing to the Pelagios project, we hope that people will find our resources more easily and that we in turn can promote the efforts of all the fantastic projects that have been involved in this programme. What we’ve managed to implement from joining the Pelagios project already outweighs the time spent coding the changes to our system. If you run a database or website with ancient world references, you should join too!