Preserving the Creative Culture of the Web (SXSW)

Hello there! I’m at SXSW (South By Southwest) 2012 this weekend, going to the interactive conference. Anyway, I tend to take copious notes when I go to talks. I thought they might be useful to a wider audience, so here you go. You can see other SXSW 2012 posts that I’ve made as well.


We’re going to present a range of perspectives on archiving – how we go about it, means and methods. Discussion forums, games, user-generated content: we’ll give it a cultural context, beyond the technologies involved. It’s important to keep this, and save it. The internet community as a whole hasn’t done a great job of keeping this stuff (beyond the Internet Archive). We think it’s worth saving, and we’re here to tell you why.

Nick Hasty, Rhizome

We present internet art: any kind of production where the intent behind it was within an artistic context. Since the beginning of the web, since people could log on and connect to computers, people have been making art. Then virtual worlds and games: even if they ran on the PC, the web enabled them with mods and communities to discuss them. Then user-generated content, produced by communities online and hosted by other, larger entities (Yahoo or Google), discussion forums, BBSes... why they are important and often in jeopardy.
Rhizome is a non-profit dedicated to digital art and culture. It started as a mailing list in 1996. We’re based in the New Museum, NYC – independent of that museum as an organization, but we collaborate. ArtBase is our archive of digital art (1999). We do events and programming in the city, but I’ll focus on the ArtBase. We create records of artworks and we also preserve them: 2500 artwork records, 520+ artworks archived and running on our servers.
When we’re going to archive a work, we work with the artist to get the files, we work to see if they are compatible with contemporary browsers. We’ll take the original artwork, see if it still runs. Clone it, preserve the original, and bring it up to modern browser standards. I’ll show you some things from our archive:

  • Little Movies (1994-1997), Lev Manovich: I worked a bit on this over the summer, found that there were missing assets, and worked with Lev to get them. The content of any new medium is the old medium, McLuhan-like. These are little boxes you watch, 1.1 MB pixel boxes. In 1996 they would have taken a while to load, but the work was about the aesthetics of limited bandwidth: what does that do to the experience of watching movies? Today they load quickly and you can just watch them.
  • VVWEBCAM (2007), Petra Cortright: She takes video of herself and scrolls through the default effects that come with the cam… lightning, pizzas, etc. It was put on YouTube as a commentary on how we consume video online. It has over 200 tags, from mundane to obscene. YouTube took it down because it violated their spam levels. The artist used YouTube to present the work; when YouTube took it down, it was gone. We have it – the context is changed, but it’s still around now. It’s a work that was pretty well documented… taught in universities, a pretty major piece. So there’s an interesting tension there with YouTube.
  • The Deleted City (2011), Richard Vijgen: The artist made a Processing app to visualize the old GeoCities archive…. GeoCities was broken up into locations, places, cities, etc. It’s an interactive way to see them visualized on a touch screen. If a site had MIDI files, they’ll play. Presenting old data in a new interface.

Kari Kraus

Preserving Virtual Works

From the University of Maryland – I preserve virtual 3D worlds. The first project was funded by the Library of Congress. We adopted a case set approach: we were trying to figure out how to archive a set of 8 different video games. Colossal Cave Adventure – first documented interactive fiction. Mystery House – first interactive fiction with graphics. Second Life – 3D virtual world. We didn’t archive all of Second Life, just a few key islands. A huge disaster, incredibly complicated.
We have a book-length whitepaper, “Preserving virtual worlds final report” – it’s a free PDF you can download.
Our case set this time is mostly educational games (Doom is in it, not really educational): Typing of the Dead, Doom, Carmen Sandiego, Oregon Trail. Game Bits at Stanford University… University of Illinois at Urbana-Champaign. You can’t save all the features of every game or complex interactive software. You’re going to lose information. Can we try to determine the most salient features of these games, to ensure at least those features remain intact? This has proven to be difficult. Surface features, not underlying data structure: because the games are proprietary, we have no access to source code. This has proven to be enormously challenging. Jason Scott was on our advisory board for that.
In summer 2011, I wrote an OpEd in the NY Times on digital preservation. There were two types of responses. One: data as a singular noun instead of plural. The second, in large volumes, was far more positive: the attention I showed to the role of gamers in digital preservation. I use the phrase curatorial activism to describe that. I want to highlight them to show what they are doing. There was a recent OpEd in PC World on piracy and preservation. What they do goes way beyond that.
Jane McGonigal’s TED Talk – Games save bits. In Reality is Broken, Jane McGonigal’s thesis is that far from jeopardizing our future, video games and massively multiplayer online games have the potential to help us solve the world’s most pressing problems. It’s a 21st century way of collaborating, thinking, problem solving, and changing attitudes and behavior. She has developed a number of games: World Without Oil, Superstruct, and Evoke. They try to develop creative solutions for the world’s most pressing problems. None of them have saved the real world yet, or have they?
The KryoFlux – designed to read old data types – rescues 1980s-era games played on the Commodore 64, Amiga, and other vintage systems. Gamers created this crazy piece of hardware, and now professional archivists use it. It allows you to bypass / circumvent the original computer system or platform, so it’s an extraordinary tool. Created by gamers.
DIY/maker mentality can save bits. What they produce is instrumental and practical in nature: hardware, emulators. The way they do it is playful, exploratory, and experimental. The instruments they produce are also playful: a Commodore 64 re-imagined as a laptop, an Xbox 360 controller shoehorned into an Atari 2600 game controller.
These curatorial activists, from video game enthusiasts to dedicated digital preservationists, have spent the last 2 decades or more collecting, documenting, and rendering video games. There is no shortage of preservation challenges: bit rot, tech obsolescence, code libraries supplied by the OS or in other packages, digital collections on shifting foundations of rust and plastic. Emails, instant messages – how to archive? The internet is populated with amateur digital preservationists. Saving 8-bit worlds also saves real worlds – translating legacies from one generation to the next. Gamers are a precious human resource we can use to do real world work.

Jason Scott

A historian, activist, preservationist… he is here – when he was young, he discovered he was an archivist. He’s recording this whole thing because he doesn’t trust anybody. 🙂
He woke up in the middle of the night when he was 9 years old: his mother had decided she had to leave the house as soon as possible. A blanket, a dog, and a pillow is all he had left. He had to rearrange things in his life to accommodate a mobile lifestyle. He learned nothing is permanent or definite – people get locked up in the way things have been. Shaking up paradigms is the way to go about it.
We have all these things going on here in this convention center. There are 20 other tracks going on right now, and we can’t go to them because we made a choice to watch this guy. Maybe they’re just as good. We could pull in all of these materials at the same time, but we could easily lose them.

Act Two – come with me if you want to live

The Internet Archive is a beautiful place, not as crazy as I am… still great. Many people know about it because of the Wayback Machine. They have the past 10-15 years. They are a non-profit library. Check out what else they have: thousands of movies, books, podcasts, saved from all over the world, that they are trying to rescue. We always think about the now, not what was. There is a contributing set of people holding hands through time, agreeing not to destroy these things.
When I joined the Internet Archive, it was because of another group I worked with: Archive Team. I saw the death of Tripod and GeoCities. We need a team of people to save these items. It’s a rogue band of archivists; we’ve been downloading things whether they want it or not. GeoCities decided they were closing, so we downloaded it and put it up as a torrent on The Pirate Bay. It was a 15-year human anthropological study, people speaking with an audience they never had before in human history. The shutdown only happened because someone wanted to move something from column A to column B. Gowalla is gone now. FortuneCity is closing within the next 60 days. These are places that invited people to come and bring their things. Then you have to find a new place or just walk away from it. We feel like saving the data first and asking questions later works for us.

Act Three – Everything was just fine until you came along

When you look at Archive Team, it’s made up of volunteers who have a belief. If you’re involved in something that involves users’ data and content, then even if the laws don’t permit it, you are a caretaker of people’s lives. People are trusting you with data. The easiest thing you can do is add an export function as fast as you add the import function, to let people pull that data off. If you’re good, it doesn’t matter. If you’re not interesting, they weren’t going to stick around anyway. If you walk away with nothing else today, please add an export function, and think about shutdown. It happens. It happens to all of us.
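As a rough illustration of "add an export function as fast as you add the import function", here is a minimal Python sketch. Everything in it is hypothetical and made up for this example (the `UserRecord` shape, the in-memory `store`, and the function names); it is not how any particular service works, just the idea that whatever goes in should be able to come back out in a plain, portable format.

```python
import json
from dataclasses import dataclass, field, asdict


@dataclass
class UserRecord:
    """Hypothetical shape of what a service holds for one user."""
    username: str
    posts: list = field(default_factory=list)


def import_user_data(store: dict, username: str, posts: list) -> None:
    """The import side: take the user's content into the service."""
    store[username] = UserRecord(username=username, posts=list(posts))


def export_user_data(store: dict, username: str) -> str:
    """The export side: hand everything back as plain JSON text."""
    record = store[username]
    return json.dumps(asdict(record), indent=2, sort_keys=True)


if __name__ == "__main__":
    store = {}
    import_user_data(store, "alice", ["first post", "second post"])
    print(export_user_data(store, "alice"))
```

The design choice is simply symmetry: the export returns the same fields the import accepted, so a user can leave (or a shutdown can happen) without the data being stranded.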

Coda – everything you own is gone and everyone you love is dead

Think of your life as what you’ve learned and what others can learn from what you learned. Our data is becoming us. We are becoming a hive mind. If someone slaps someone in the line at the grocery, we all know about it. If somebody mistreats their partner, we hear about it. Hive mind: you’re a bee, it’s okay, being a bee is cool. Part of the advantage of being that is that Stack Exchange and Wikipedia can provide us a way to share information with each other. We get better, faster and quicker. We solve problems in minutes that previously would have dominated thought for years on end.
We need to preserve what we’ve done. We’re too focused on the start. We won’t be doing it forever.
I never shut up and you shouldn’t either.


Nick Hasty

For us, in each one of our archival practices, we’ve dealt with friction: creative content hosted online by a larger entity that controls the means of dissemination. A YouTube, a Yahoo, id Software… they have control over the long term life of this thing. They make decisions that can really affect it.
Rhizome, the Internet Archive, we’re trying to preserve this stuff, but we can only do so much ourselves.
What are other people doing in the gaming community, what’s the context of that, what are the implications of that, the instability and tension of these two things? What needs to change, and how can we circumvent it?


One of the things Jason does is get bits from GeoCities and other places that are shutting down and, with very little warning, tell users they are going to lose all of their stuff. If you’re in cultural heritage or a non-profit, you look at these for-profit companies as the villain. But when I published an op-ed about digital preservation, Bruce Sterling replied. He donated a lot of his papers to the University of Texas, but he didn’t include any electronic materials or papers. He believes digital preservation is a fool’s errand: inherently unstable storage and conditions. “I used to worry about floppy disks. Now I worry about the universities.” To outrun the short lifespans of the media we use to store our bits necessitates a constant budget. In the larger political climate, higher ed is under attack via funding and museums are receiving less funding, so he doesn’t think the project is sustainable. I think we are in this scary era where we’re aware of the precarious nature of our stuff on the cloud, but we do tend to think of museums / archives / universities as stable institutions and environments. I think Bruce is spot on when he questions that assumption. We’ll have to rely on innovative services like Kickstarter; Jason demonstrates that very well. We need a new set of skills to do that. You can’t be shy to use Kickstarter. We need to train the next generation of archivists.


In the internet art world, it’s a different situation. Take Petra’s video on YouTube: it’s bound into the context in which it’s shown. YouTube is part of the context of the work. The terrible comments, the bad obscene stuff – it’s a comment about how we look at YouTube and consume media. “Oh, we have a backup” doesn’t speak to the long term context of it. We can’t run YouTube on Rhizome’s servers. I don’t expect YouTube to be sensitive to this, but what these things are hosted on is part of what’s wrapped into them. We do what we can. Newer artists, younger artists… the internet is bound up into what they are doing. They are more savvy, they have abundant storage, they keep everything up, online, and available. It’s out there in the cloud, and we store it as best we can. We have trouble with things created 15-20 years ago: the hosting provider is gone, they didn’t back it up… maybe they moved and lost a hard drive. It’s the older stuff we work on now. We do archive new stuff, and we do bring in new stuff, but we have a special place for things that are older. Things that establish the genre itself… the – it comes from a garbled email between two artists. They were looking through, trying to make sense of it. The medium screwed it up. Some of the works of the guy who coined that term were lost. It’s an important time that should be findable and browseable… we emphasize access, with modifications if need be, to keep it so people can go back and look at it.


The Cray supercomputer: world’s most expensive love seat. You can now run it on a laptop. It was a big deal. Chris, who lives in New York, got his hands on a late-’70s, early-’80s disk pack. He wanted the data off of it. Everything associated with it is dead; there’s nowhere to read it. The archivists and institutionalists will say this is a tragedy, this is what we have to fight against. People dead inside don’t care about it. Technical people will say this is a problem we need to solve. Activists will say we have to find out what’s on here first, and tell people to help us. Someone built an Arduino-guided magnet to go over the disk and read its data. This resulted in a 20 GB image, a magnetic flux recording. But there was no place to host this. People contacted me, and I put it on the Internet Archive and our blog: a 20 GB magnetic flux image of a Cray disk. Can anyone help us get data off of it, read what the data is, and read its format?
48 hours later, a guy in Norway took a look at it. A guy in Australia had a disk pack from Cray in the 1980s that had all their localization files. He flew it to NY; someone sat with the disk pack in his lap and read it off to disk.
Negativity was not needed at any point in that. I want people to think about all these different forces that came in there. At no point did anyone say ‘this isn’t worth doing, let’s stop this now’ – I think that happens too much today.


I love what all you guys do. There are two different approaches here. Rhizome, university projects – you’re selecting specific works of art, selecting them for a curatorial process, an archival process… at the same time, the Internet Archive has a completely different approach, where you’re constantly downloading tons and tons of data. Are there any ideas to merge those processes together, to make them more easily readable for someone who doesn’t have time?

Jason can’t speak for the Internet Archive, but a lot of archivists are taxidermists. They’re looking at things for a certain degree of life: websites that were up for an average of 6 months with consistent changes. Speaking as an objective observer of the Internet Archive: they really interact with the Library of Congress and various universities, so what they produce is of use to other groups. There’s the WARC format, a web archive format that allows long term storage of webpages. We hammered on the GNU Wget project, and the next version of Wget will support that format. All sorts of things involving metadata and taxonomy – everything tries to be flexible in that way. It’s a technically-oriented group that doesn’t have enough funding. Post-grad students want to classify the stuff, but there’s no funding. We try to observe other classifications…
I just acquired 10,000 hours of 3,000 shows at the DNA Lounge in San Francisco. Many hours. I’m using their own event calendar as its taxonomy, so it’s all based on what their impression of the item is. It could be completely different than what is on the tape. I put that up there, and people are benefiting from it on an emotional basis; there are also ways to search what bands are on there. Leave it open and reduce friction for outside groups, so they can feel their work is getting an instantaneous reaction. As bad as it is, the IMDb is a good example of this. Semi-friction, but what you put in is very high quality. By spreading the word that this thing exists… there’s an out-of-work guy who does metadata for fun. I need 6000 of them. My hope is that they are waiting for the word to get out.
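For anyone who hasn’t seen the WARC format mentioned in this answer: a WARC file is a sequence of records, each one a block of plain-text headers followed by the captured bytes. The sketch below is my own simplified illustration based on the WARC 1.0 record layout, not code from Wget or the Internet Archive. The function name `build_warc_record` and the example URL are made up for this example, and a real crawl should rely on a tool with proper WARC support rather than on something hand-rolled like this.

```python
import uuid
from datetime import datetime, timezone


def build_warc_record(target_uri: str, payload: bytes,
                      content_type: str = "text/html") -> bytes:
    """Build one simplified WARC 1.0 'resource' record (illustration only)."""
    now = datetime.now(timezone.utc).strftime("%Y-%m-%dT%H:%M:%SZ")
    headers = [
        "WARC/1.0",                                   # version line
        "WARC-Type: resource",                        # kind of record
        f"WARC-Record-ID: <urn:uuid:{uuid.uuid4()}>", # unique id per record
        f"WARC-Date: {now}",                          # capture time in UTC
        f"WARC-Target-URI: {target_uri}",             # where the bytes came from
        f"Content-Type: {content_type}",
        f"Content-Length: {len(payload)}",            # size of the block below
    ]
    # Headers end with a blank line; the record ends with two CRLFs.
    head = "\r\n".join(headers).encode("utf-8") + b"\r\n\r\n"
    return head + payload + b"\r\n\r\n"


if __name__ == "__main__":
    record = build_warc_record("http://example.com/", b"<html>hi</html>")
    print(record.decode("utf-8"))
```

The point of the format is that the capture carries its own context (what it is, where it came from, when it was taken) alongside the bytes, which is what makes it useful to other groups later.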


For us it’s a fairly young field. There’s a curatorial aspect to what we’re doing. We have limited resources in what we can archive and what we can deal with to the best of our abilities. We have people who submit artwork to us, and a great team of people who are aware of what’s going on. We also have people who ask us to consider their art. Everyone’s a creator. We’re not trying to write a history of what’s good and bad. People specialize in art history, it’s a thing. Art scene have whole teams with art history backgrounds who are classifying work. Can we get people who can really describe this and make it something meaningful? We need someone with good historical and technical knowledge – it’s a rare set – to get in and describe the work. If we can archive it, we can archive it. Rhizome used to be more open, a listserv, chaotic; it has ossified into “the man”, or so we’re accused. It’s funny. It’s one of those things where we want to make this something that is accessible, part of a larger discussion about art and creativity. Internet art is one of these extremely niche things, but this is contemporary art. We want to push that forward and introduce these new aesthetics in the world.


We chose our case sets not because they have some intrinsic merit; Spacewar! isn’t better than Pac-Man or whatever. What does it even mean to say that you’ve preserved it at all? We were at the experimental and scoping phase. What are the boundary objects of some of these games? Representational information… the representational information that you need is going to be, among other things, the Apple II DOS manual. That becomes part of your information package. If that Apple II DOS manual is saved in PDF format, it becomes infinitely recursive, because then you have to save the PDF specification so the Apple II DOS manual is accessible in the future. Infinite regression means you’re ultimately relying on paper copies at some point. I love the question; if you have ideas, talk to us afterwards. I think that you need someone with Jason’s talents who can crowdsource a lot. Artchive, etc.
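The regress described here (game needs manual, manual needs PDF spec, and so on until paper) can be pictured as walking a dependency chain until it bottoms out. A small Python sketch, with the caveat that the dependency map and the names in it are invented for illustration and are not the project’s actual data model:

```python
def required_representation_info(item, depends_on,
                                 terminal=frozenset({"paper copy"})):
    """Collect everything needed to keep `item` readable.

    `depends_on` maps each object to the things required to interpret it.
    The walk stops at `terminal` objects, which are assumed to need
    nothing further (here, a paper copy).
    """
    needed = []
    seen = set()
    stack = [item]
    while stack:
        current = stack.pop()
        if current in seen:
            continue  # guard against circular dependencies
        seen.add(current)
        needed.append(current)
        if current in terminal:
            continue
        stack.extend(depends_on.get(current, []))
    return needed


if __name__ == "__main__":
    # Hypothetical chain based on the example in the talk.
    depends_on = {
        "Apple II game image": ["Apple II DOS manual (PDF)"],
        "Apple II DOS manual (PDF)": ["PDF specification"],
        "PDF specification": ["paper copy"],
    }
    for thing in required_representation_info("Apple II game image", depends_on):
        print(thing)
```

Every entry the walk returns is something that has to go into the information package, which is why the package keeps growing until something human-readable ends the chain.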

What is the most endangered species of media, the nitrate movies of digital?


The things we think are being recorded but are not. We have no idea if people made contingencies for long-term preservation. Is Facebook? Is SXSW? We believe an archive is an archive? Google is an archive like a supermarket is a food museum. There is this inherent trust. Big data means permanent data – does it? There’s risk in relying on the cloud or shared hosting.


Usually the majority of what we do is web-based. Was the artwork created using Flash or W3C standards? A 1994 artwork in HTML: plop it in and it works. If the JS is funky, we can clean it up. Fairly easy to update. An old version of Flash, QuickTime, something like that – open up the subject. Old ActionScript, ouch.

Saving the data first, asking questions later. There was a talk earlier today, the right to be forgotten. Things stick around when people don’t want them to. Chills us from speaking in the first place. What do you people think about that?


I’m not an archivist, I’m a scholar who preserves games. I don’t know if an archivist would answer this in the same way. We think of archivists as saving everything. But part of their job is disposing of materials, because not everything can be saved. I don’t think it’s that controversial. I think part of the challenge is that we have to work hard to ensure future generations care about what we save. In many instances, dumping bits in a dark archive that is inaccessible for decades is one of the first things we can do. A great book by an Austin native, Ernest Cline: Ready Player One. It’s all about the gamification of digital preservation. How do you motivate future generations to care about the same things you care about? Allowing our bits and content to be re-used and transformed can be valuable.


Our rule of thumb: we talk to the artist. We’ve gone to seminal people, and they don’t want us to archive it, so we don’t do it. Or people will tell us don’t fix it, it’s not meant to be fixed. I’ve been talking to people who were more active back in the earlier days. People have suggested Jason’s approach: just get it, tell them later. At the end of the day, is it better to have it or not have it? We may not be able to show it, but it’s still around. We haven’t done this, but it’s something that’s been going around.

I have a similar question. The Petra video that got erased reminded me of Robert Rauschenberg’s Erased de Kooning Drawing: a drawing by Willem de Kooning, erased to a blank piece of paper. The fact it got deleted by an algorithm is essential to the piece. What about GitHub? With the security exploit that happened recently, there’s a lot at stake there. Are you working with them at the Internet Archive?


We ask the question, what would happen if this was gone? We downloaded Thingiverse, Muammar Gaddafi’s site hours before it disappeared, an Ethiopian news agency that was going away – we keep asking the question. Stack Exchange made all their items available, and we grabbed a copy. Usenet – Google screwed it up. We constantly ask the question… it’s like life insurance trying to convince people with actuarial tables. You’ve got children… have you thought about what’s going to happen if you’re not part of the puzzle? Here’s a contingency. A lot of websites and functionality are shared. SourceForge concerns me, GitHub concerns me. Facebook’s utter domination of all human photographic expression is very dangerous. Very sociopathic company… a strange place to store your things, but people are doing it. Asking non-trivial, non-intuitive questions. We ask these questions. We have a tool, git-annex… it lets you check things into git without putting the items in there… an indexer, using git’s ability to check things in. It lets me examine them using git’s tools – a way to turn websites and things out there into indexed files you can then manipulate: I always want to ensure a copy of this is on my local drive over here. You join this hive, and you and five people guarantee the safety of this ongoing data. It takes geekery and skills to do this. Make sure everything you’re doing is in other places. “ANIL -” “think up” – providing you with the tools, though they’re a little clunky now. I’m going to take back the responsibility. I’m my own IT dept, without the pain. Parallel Flickr. The team is downloading all the CC images on Flickr.


We use Git and GitHub as a backup. We worry about these services, but if they didn’t exist we couldn’t do what we do. We depend on third party services. Dropbox helps us get stuff done so much better. You can’t reinvent the wheel for everything. Do what you can, keep what you can local.
