
How to save this blog (or at least the posts)

Posted by: Stephen Baker on November 22, 2009

Heather and I both got the word on Thursday that we won’t be part of BusinessWeek once Bloomberg takes over, on Dec. 1. (We’re both pleased with this outcome, though it’s no picnic watching the staff get decimated, with good friends and colleagues heading off in every direction.) In the coming week, I think I’ll write a nice long eulogy for this blog.

But in the meantime, a question: Does anyone know how to preserve and store our four and a half years of blog posts and comments? Our colleague Arik Hesseldahl said something about turning each month into a PDF. I’ll look into that (as soon as I close my last story tomorrow). But if you have a specific how-to, I’m all ears. As I wrote in September on my Numerati blog, I’m not sure how committed Bloomberg will be to social media. There’s no telling when someone might pull the plug on a server housing the archives of a discontinued blog.


Reader Comments

Joe Remo

November 22, 2009 11:13 AM

Been there, but once you start your next chapter you will feel a breath of fresh air.

Hopefully, your blogs have been saved by the Internet Archive Org. Also I think there are paid services that can preserve your blogs. I have seen my website that I created fifteen years ago, so I can attest to the Internet Archive. But I can't recommend a paid service since I never thought about using one.

Thanks for all you have done and please continue your success elsewhere.

John Craft

November 22, 2009 12:03 PM

Stephen, from a technical perspective, someone on your IT support staff should be able to get you a copy of the database - that's easy. However, there may be copyright issues related to the posts themselves if you planned to re-post them (i.e. reconstruct the blog at another site).

Chris Amico

November 22, 2009 12:13 PM

It looks like you're on Movable Type. You should be able to export your posts as XML and move into Wordpress, or probably another MT blog (I've moved from Blogger to Wordpress myself, which wasn't difficult).

Here's a full explanation that will walk you through the process: http://codex.wordpress.org/Importing_from_Movable_Type_to_WordPress

Good luck.

Amyloo

November 22, 2009 12:27 PM

Have you thought about asking someone who uses something like Sharepoint Designer to import it as a site? It wouldn't preserve the database but would grab all the generated HTML as HTML. If I have time later I'll experiment.

Steve Wart

November 22, 2009 12:39 PM

Maybe you could export the whole thing as a large RSS file. It should be fairly easy then to find someone who could put together a script and re-import the articles into one of the free blog services.
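A minimal sketch of the first half of that idea, assuming the export is a standard RSS 2.0 file: it reads each item into a plain dictionary that a later script could feed to another blog service. The function name, the dictionary keys, and the sample feed are illustrative assumptions, not anything the blog platform provides.

```python
import xml.etree.ElementTree as ET


def parse_rss_items(rss_text):
    """Return a list of dicts (title, link, date, body) from an RSS 2.0 string."""
    root = ET.fromstring(rss_text)
    posts = []
    # Every post in an RSS 2.0 feed is an <item> element under <channel>.
    for item in root.iter("item"):
        posts.append({
            "title": item.findtext("title", default=""),
            "link": item.findtext("link", default=""),
            "date": item.findtext("pubDate", default=""),
            "body": item.findtext("description", default=""),
        })
    return posts


# Hypothetical one-post feed, only to show the shape of the input.
SAMPLE_RSS = """<?xml version="1.0"?>
<rss version="2.0"><channel><title>Example blog</title>
<item><title>First post</title><link>http://example.com/1</link>
<pubDate>Sun, 22 Nov 2009 11:00:00 GMT</pubDate>
<description>Hello, world.</description></item>
</channel></rss>"""

if __name__ == "__main__":
    for post in parse_rss_items(SAMPLE_RSS):
        print(post["date"], post["title"])
```

Note that a feed usually carries only the most recent posts and no reader comments, so this would only work if the platform can be configured to emit the full archive.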

Christian Huldt

November 22, 2009 12:40 PM

seems to be documented: http://www.movabletype.org/documentation/appendices/import-export-format.html
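For anyone who ends up with such an export file, here is a rough sketch of reading it. It assumes the layout as I understand the documented format: entries end with a line of eight hyphens, sections within an entry end with a line of five hyphens, the first section holds `KEY: value` metadata, and later sections start with a label such as `BODY:` or `COMMENT:`. The function names and sample text are made up for illustration, and a post body that itself contains a line of bare hyphens would confuse this simple splitter.

```python
ENTRY_SEP = "--------"   # assumed entry separator (eight hyphens)
SECTION_SEP = "-----"    # assumed section separator (five hyphens)


def split_on(lines, separator):
    """Split a list of lines into chunks at lines equal to `separator`."""
    chunks, current = [], []
    for line in lines:
        if line.strip() == separator:
            chunks.append(current)
            current = []
        else:
            current.append(line)
    if current:
        chunks.append(current)
    return chunks


def parse_mt_export(text):
    """Return a list of entries, each with 'meta' and 'sections' dicts."""
    entries = []
    for entry_lines in split_on(text.splitlines(), ENTRY_SEP):
        blocks = [b for b in split_on(entry_lines, SECTION_SEP)
                  if any(line.strip() for line in b)]
        if not blocks:
            continue
        # First block: one "KEY: value" pair per line.
        meta = {}
        for line in blocks[0]:
            key, sep, value = line.partition(":")
            if sep:
                meta[key.strip()] = value.strip()
        # Remaining blocks: a label line, then free text (kept as-is).
        sections = {}
        for block in blocks[1:]:
            while not block[0].strip():
                block = block[1:]
            name = block[0].strip().rstrip(":")
            sections.setdefault(name, []).append("\n".join(block[1:]).strip())
        entries.append({"meta": meta, "sections": sections})
    return entries


# Hypothetical export fragment with one post and one comment.
SAMPLE = """AUTHOR: Example Author
TITLE: A sample post
DATE: 11/22/2009 11:00:00 AM
-----
BODY:
Hello from the archive.
-----
COMMENT:
AUTHOR: A Reader
DATE: 11/22/2009 11:13:00 AM
Nice post.
-----
--------
"""

if __name__ == "__main__":
    for entry in parse_mt_export(SAMPLE):
        print(entry["meta"]["TITLE"], len(entry["sections"].get("COMMENT", [])))
```

Comments are stored under the `COMMENT` label as raw text here; splitting out each comment's author and date is left out to keep the sketch short.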

Steve Rubel

November 22, 2009 01:00 PM

Steve, was great seeing you. This might help - http://www.backupify.com/

Jeff Jarvis

November 22, 2009 01:07 PM

No PDFs, please!
You can export the content and put it in a new blog; just create a parallel universe for it.

Veit Irtenkauf

November 22, 2009 01:13 PM

There are two aspects to saving a blog:

1. Negotiate for the rights to the blog content. I'd think BW would be open to transferring them to you given the take-over and subsequent actions. Without the rights, though, it does not matter what the technical solution is.

2. How accessible do you want the past blog posts to be? Index-able by Google? Assuming you have the rights, do you even want to continue posting? If so, you might follow @Chris' suggestion above to convert from Movable Type to Wordpress (or approach Movable Type and negotiate for them to host the old blog). Otherwise, creating PDFs might work.

Good luck!

Robert Stewart

November 22, 2009 04:23 PM

You can find snapshots (through June 08, as of today) on the Wayback Machine at http://web.archive.org/web/*/http://www.businessweek.com/the_thread/blogspotting/

As others have commented, though, exporting the content directly from MT is the way to go, assuming you can reach an agreement on any copyright issues.

Since the main page for the blog has links to all the monthly archive pages, worst case you could get someone to write a simple crawler/scraper that extracts the content.
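The worst-case crawler could start out something like the sketch below: collect the links to the monthly archive pages from the main page, then fetch each one. The `/archives/` marker is a guess at what the archive URLs contain and would need to be checked against the real pages; the class and function names are likewise just for illustration.

```python
from html.parser import HTMLParser
from urllib.parse import urljoin
from urllib.request import urlopen


class LinkCollector(HTMLParser):
    """Collect the absolute URL of every <a href="..."> in a page."""

    def __init__(self, base_url):
        super().__init__()
        self.base_url = base_url
        self.links = []

    def handle_starttag(self, tag, attrs):
        if tag == "a":
            href = dict(attrs).get("href")
            if href:
                self.links.append(urljoin(self.base_url, href))


def archive_links(index_html, base_url, marker="/archives/"):
    """Return unique links containing `marker`, in page order."""
    collector = LinkCollector(base_url)
    collector.feed(index_html)
    unique = []
    for link in collector.links:
        if marker in link and link not in unique:
            unique.append(link)
    return unique


def fetch(url):
    """Download one page and return its raw bytes."""
    with urlopen(url) as response:
        return response.read()


if __name__ == "__main__":
    # Hypothetical index page; a real run would call fetch() on the blog's main page.
    page = '<a href="/blog/archives/2009/11/">November 2009</a> <a href="/about">About</a>'
    for link in archive_links(page, "http://example.com/"):
        print(link)
```

A real scraper would also need to pause between requests and save each fetched page to disk, and extracting the post text from the saved HTML is a separate step.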

francine hardaway

November 22, 2009 06:51 PM

I easily moved my blog from MT to WP; there's an automated way to do that on WP. I agree with the posters above, though, in that the rights to the IP are the thing to get straightened out.

Dane Hesseldahl

November 22, 2009 07:34 PM

Stephen,
If you guys need a hand on the technical end of things - I'd be happy to see if there is anything I can do - although you'll most likely need the cooperation of the BW IT dept.

Email me = dane@simler.com

Christopher Alden

November 22, 2009 07:56 PM

Stephen -- No need to "approach Movable Type" -- we'll approach you :). We (Six Apart, makers of Movable Type, TypePad, Vox) would be happy to help with a migration and get a new, independent site set up asap. It should be fairly straightforward to go from one MT site to another one and we'll do it for you. We are committed to helping every former or soon to be former journalist set up with a site so they can continue to do what they do best. I'll be in touch or feel free to email me at chris dot alden at sixapart dot com.

Panayotis Vryonis

November 23, 2009 01:37 AM

I created a static archive of this blog using httrack. It looks like it works fine!

You may download it from here: http://dl.dropbox.com/u/2437600/blogspotting.tgz (I'll delete it after a couple of days)

Lauren Young

November 23, 2009 07:50 AM

Fellow blogger Lauren Young here. I'm wondering what to do with the Working Parents archive, so please let me know. I'd hate for it to disappear...

Erik Scherz Andersen

November 23, 2009 10:09 AM

Way to go, Christopher!

Daria Steigman

November 23, 2009 02:18 PM

Stephen-- Your terrific blog has been on my required reading list, and I'm really going to miss you & Heather in this space. Good luck with your next venture, and please do let all your readers know how to find you.

Best,
Daria

Jeff Rutherford

November 23, 2009 04:04 PM

Despite Jeff Jarvis' comment, here's a service that you can use to store/convert your blog content into a PDF or eBook - http://www.zinepal.com/

For what it's worth, you may want to do both - move the archives to another server - and offer the archives as a downloadable PDF or eBook.

Lloyd Budd

November 23, 2009 06:41 PM

Come home to WordPress ;-)
Lloyd Budd here from Automattic's WordPress.com. We're also eager to help in any way we can.

Chuck Tanowitz

November 23, 2009 09:55 PM

This is a discussion worth having far beyond Blogspotting. Over drinks an acquaintance with a background in library science commented recently that despite the massive amount of data we're producing, the question of its longevity is still very much up in the air.

I wrote a family blog for a long time, but archiving it with pictures intact is something else entirely. Even the long-term existence of the massive amount of pictures we're taking is in question.

I can still pull out 50, 60 and 70 year old pictures (and even some negatives) that are stored in a shoebox, but will my grandchildren be able to see all the images I took?

Jeff Rutherford

November 24, 2009 10:22 AM

Chuck has a very good point. There's a ton of quality content being produced amidst all the user-generated content clutter.

But, will that content be available in 10-15-20 years? Also, we're all assuming that the http: - browser - URL underpinnings of the Web will remain as a standard. That's not a given.

What will the Web look like in 20 years when we're all using super-computer iPhones or other mobile gadgets?

Librarians and technology entrepreneurs definitely need to tackle the longevity issue of Web content.

Panayotis Vryonis

November 25, 2009 05:41 AM

Jeff, you are right. The answer is that the simplest form has the best chance of surviving over time.

Here is my test: I get "the content" in some format. How much effort do I need in order to read it?

For example, a Wordpress export is good. But I need to set up a MySQL server, then Apache, PHP, and other stuff. Will all of them be there in 10 years, and will the future versions be compatible with an export taken 10 years ago? Possibly, but I wouldn't bet on this.

If I had to choose, I'd go for plain HTML files. Not a safe bet, but the odds are much better.

Oh, and I'd put a tarball of all the HTML files on the web and let archive.org archive it and let ordinary users download and redistribute it.
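Packing a folder of saved HTML into a tarball takes only a few lines. This is a sketch that assumes the static copy already sits in a local directory; the directory and file names in the usage lines are placeholders.

```python
import tarfile
from pathlib import Path


def make_tarball(site_dir, output_path):
    """Pack `site_dir` and everything under it into a gzip-compressed tar file."""
    site_dir = Path(site_dir)
    with tarfile.open(output_path, "w:gz") as archive:
        # arcname keeps paths inside the archive relative to the folder name,
        # so unpacking recreates one tidy top-level directory.
        archive.add(site_dir, arcname=site_dir.name)
    return Path(output_path)


if __name__ == "__main__":
    # Placeholder names: a local "blogspotting" folder of HTML files.
    if Path("blogspotting").is_dir():
        print(make_tarball("blogspotting", "blogspotting.tgz"))
```

Keep the output file outside the folder being packed; otherwise the half-written archive can end up inside itself.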

Jonathan Brown

November 25, 2009 09:46 AM

Thanks a lot, Stephen, for such a nice post about how to save our blog posts and comments longer. I really appreciate your post. Keep blogging.

:-)
http://www.tvtubex.com


About

In Blogspotting, Senior Writer Stephen Baker and Associate Editor Heather Green take a look at how cutting-edge technologies are changing business and society. Whether it’s blogs or wikis, data crunching or data targeting, technology’s advances are reshaping the world that we live in.
