Tag Archives for best practices
By now everyone is up to their ears in tweets about the Library of Congress’s announcement that they will archive every Tweet. Here are my initial concerns and lauds.
- Cost. Library Journal has already questioned this. How much storage space is this going to require? How will it be sustainable? And how often are they planning on doing updates to the data stream? Will they begin collecting Tweets in real time? Monthly? Yearly?
- Content and archival quality. What about all those shortened bit.ly links? Or the old ones from services that have shut down, like Twurl? Or the really old ones that might be full URLs but that have rotted away? We can’t expect this to be perfect, but is LOC planning on trying to capture any of the external content that the tweets refer to? I got this idea from @dancohen. He suggests that LOC may need to take snapshots of the linked websites, and I think that sounds almost essential, albeit messy and difficult.
- Searchability. This could either be the greatest thing to happen to Twitter search, or a huge disappointment. Will LOC make their database of Tweets searchable? Right now, Twitter search is good for about two weeks. Library of Congress has a huge opportunity to blast that wide open, and we can only hope that they are able (infrastructure and $$$-wise) to do so.
- Privacy. A commenter posted on the LJ blog about this issue. Is there a privacy problem here? Yes, our tweets are public, but is it somehow unethical, even if it may not be a violation of copyright, to republish Tweets in what could become a public archive? Don’t ask me for an answer. Because I’ll say “no, it isn’t.”
- Metadata. How will the data about the tweets and their authors be captured and stored? Furthermore, Twitter is about to let us start adding annotations and other metadata to tweets in our stream. Will this sort of marginalia be lost?
All in all, I have a feeling that this project is going to set the tone for social media archiving practice: one of the most talked-about services is being archived by one of the world’s largest libraries. If they truly think this is important (and I am tempted to agree), I think there is an excellent opportunity here to demonstrate that importance publicly. Essentially, I think the LOC is about to create the standard and best practices for social media archiving with this project, for better or for worse. If it is not implemented well in the beginning, it has the potential to set the bar too low (both technically and in the public eye) for future endeavours seeking to capture online content.
In any case, this is a very exciting development to round off my library education. Two more days!
UPDATED Apr 15:
- ReadWriteWeb has some more good questions. Among them: “Will the archive include friend/follower connection data? Will it be usable for commercial purposes? Will there be a Web interface for searching it, and will that change the face of Twitter search for good? Is there any way that the much larger archive of Facebook data could be submitted to the same body for analysis of the same kind?” The answer to some of these is already known: no commercial use, there will [sounds like] be little web interface for searching–instead they will present a curated set for public use, while the entire archive will remain for serious research only.
- To address the problem of search, Google Replay was announced yesterday as well. This is Google’s attempt to capture what SearchEngineBLog calls a “vox populi” view of historical events. You can essentially search Google’s index of tweets by keyword for a specific date or date range to get a sense of what was said about topics such as health care reform. With Twitter handling a reported 19 billion searches a month on their junky index, it’s about time we got another option. Google Replay, just like Google’s real-time results display, resolves those shortened links, but I don’t know whether the full URL is saved within the index or resolved on the fly. My guess is the latter.
What I want, and have always wanted, is a way to search for specific tweets by specific users. Sometimes I can recall a fuzzy thing like, “I know @somebody tweeted something about ‘Topic A’ like a month ago.” With Google Replay, we’re getting closer, but it’s not perfect yet. It does effectively use Twitter handles as a search term. For example, “iphone @danhooker” brings up some (but not all) of the tweets that I have sent or that were RTd by me. I hope it will get better. Google has that habit, so I fully expect, and pray, that this will be a workable option for meaningful Twitter search in the future.
I think, hands down, I’d have to go with the Nintendo Wii on this one. No contest. PS3s are ridiculously expensive and have hardly any good games (except for Grand Theft Auto, which, of course, you can’t exactly promote within your library), and the Xbox just seems to me to be the type of gaming machine that encourages a long-term single-player experience. Of course, you can go online and frag your pals in Call of Duty (ahem, CoD, excuse me), but it isn’t very conducive to in-person team play.
Enter Nintendo Wii. The machine is worth buying for the library for several reasons. Not only is it comparatively inexpensive, but so many of its games are designed for in-person collaborative or competitive play. Where on Xbox Live you scream at people through a headset, with the Wii you interact in a way that is unusual for a video game experience. Because of its uniqueness, it also has an image that is parent-friendly. When mom and dad want to go play tennis with the neighbors, how can they say no to sending the kids off to the library (of all places) to do the same?
I guess I was supposed to talk about the research aspect, and consider the pros and cons of each system a little more, in this post. But it just seems to me to be a no-brainer here. The cross-demographic appeal and collaborative play elements of the Wii trump anything else a PlayStation might have to offer. And if you’re looking for that more traditional, video game-y, single-player experience, there’s still just no match for Mario.
Now that we have completed a couple of screencasts, I have noticed several things that I do as a computer user that don’t normally bother me because I am the only one looking at my screen. Audio in screencasts is another challenge, especially in Jing, where there is no redo.
Like I said, I noticed in the last screencast I made on MySpace that I scroll the page up and down sort of aimlessly a couple of times. I guess I just do this when I am trying to get a sense of something, or perhaps it is just a nervous tic. Another common thing I do while reading online is highlight text, sort of at random. When I noticed these things happening in my screencast, I was explaining a point, and there was nothing specific to do visually with the screen at that moment. In this situation, it is much better to resist the urge to move the cursor aimlessly or scroll the screen. Viewers are listening, so make your on-screen movements have a purpose.
Another thing I just noticed while making my screencast for Hapland is that you have to be careful about the audio. I ran into this problem when recording a series of instructional screencasts at my last job. Because you are speaking into a microphone while doing something else, it is easy to get lost or to not describe what you are doing to the fullest effect. This leads to unnecessary “um”s and even, God forbid, me sniffling because of this cold I seem to have developed. If Jing would let me edit that audio, boy would I.
The point is, practice your lines, or write a script to follow. If you can edit audio, take out your swallows, awkward missteps, and any other oddities. Writing mouse cues into the script can help with my first point as well. You have to remember that you’re not the only one watching anymore.
What are some advantages to an audio podcast over text-based presentation?
This post is in response to the above question, which was posed by the professor, Anselm Spoerri, in the course podcast here. My response, summarized after the jump, is available here.
What I see as the major advantage of the podcast is reaching users who can feel overwhelmed by overly hyperlinked and visually “noisy” web presentation. These types of sites are common, and can often turn readers off. Another possible advantage (not mentioned in my response) is reaching users who are quite mobile and like to have access to internet content without being tethered to a device that can display web content. For example, commuters who hate morning radio would perhaps prefer this style of delivery.
If you can’t tell (based on this rather lengthy explanation of my response), I am a very visual person, and express myself much better and more thoroughly through writing. I also prefer to digest information through reading and exploring web content myself, rather than having it explained aurally.
I suppose I would be remiss if, after my espousal of the importance of hyperlinking to content, I did not include the appropriate links here. The Sirsi/Dynix podcast can be found here (or the direct feed), and, for a good example of visual “noise,” maybe you should check out this good reason to stay away from certain blogs.

