
A place for my programming projects and the occasional blog about technology related matters.

Export Haloscan comments

Author einar

A few weeks ago I was helping my sister change her blog. She has a Blogger account but uses Haloscan for comments, because Blogger didn’t offer comments as part of its service when she started blogging. That has changed now, so I thought it would be much more convenient to have the comments and the blog all in the same place. I just needed a way to export her Haloscan comments and import them into Blogger, so I wrote a small Python script to do the exporting for me.

The script logs into your Haloscan account, makes an HTTP request for each comment and writes them all to an XML file. If the script fails halfway through, you can just start it again and it will continue where it left off. An XML file named yourusername.xml will be created in the working folder where you start the script. It contains all the information about every comment: date, author, URL, email, IP, threadid, commentid and text. You can then parse the XML file to import the comments into other comment systems. Before you download the script, be sure to read the following disclaimer:

DISCLAIMER: This is provided “AS-IS”. I make no guarantees that this works; it’s based on screen scraping, so it could stop working whenever Haloscan changes their pages.

According to the Haloscan Terms of Service (http://www.haloscan.com/privacy/), screen scraping their site is not explicitly forbidden. HOWEVER, they say:

“We reserve the right to suspend, delete, or cancel any account/service at any time for any reason.”

This script makes a new HTTP request for every single comment, so if you have thousands of comments and Haloscan doesn’t approve of you pounding their server, suspends your account and you lose all your comments, I CANNOT BE HELD RESPONSIBLE. USE AT YOUR OWN RISK! You’ve been warned!

Now that you’ve read that, you can download the script here or view it syntax highlighted here.
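
If you are wondering what parsing the exported file might look like, here is a rough sketch. The XML layout below (a comments root with one comment element per entry, and one child tag per field) is purely my assumption for illustration, as are the sample values and the helper names parse_comments and group_by_thread. Open your own yourusername.xml and adjust the tag names to whatever the script actually wrote.

```python
import xml.etree.ElementTree as ET
from collections import defaultdict

# ASSUMPTION: this layout and these values are made up for illustration.
# Check the real yourusername.xml and change the tag names to match it.
SAMPLE_XML = """<comments>
  <comment>
    <commentid>101</commentid>
    <threadid>post-1</threadid>
    <date>2007-01-05</date>
    <author>Alice</author>
    <url>http://example.com</url>
    <email>alice@example.com</email>
    <ip>192.0.2.1</ip>
    <text>First!</text>
  </comment>
  <comment>
    <commentid>102</commentid>
    <threadid>post-1</threadid>
    <date>2007-01-06</date>
    <author>Bob</author>
    <url></url>
    <email>bob@example.com</email>
    <ip>192.0.2.2</ip>
    <text>Nice post.</text>
  </comment>
  <comment>
    <commentid>103</commentid>
    <threadid>post-2</threadid>
    <date>2007-02-01</date>
    <author>Carol</author>
    <url></url>
    <email></email>
    <ip>192.0.2.3</ip>
    <text>Thanks for sharing.</text>
  </comment>
</comments>"""

# The fields the export is described as containing.
FIELDS = ("date", "author", "url", "email", "ip", "threadid", "commentid", "text")


def parse_comments(xml_text):
    """Return a list of dicts, one per <comment> element (hypothetical layout)."""
    root = ET.fromstring(xml_text)
    comments = []
    for node in root.iter("comment"):
        # Missing or empty tags become empty strings.
        comments.append({field: (node.findtext(field) or "") for field in FIELDS})
    return comments


def group_by_thread(comments):
    """Group comments by threadid, keeping them in file order."""
    threads = defaultdict(list)
    for comment in comments:
        threads[comment["threadid"]].append(comment)
    return dict(threads)


if __name__ == "__main__":
    threads = group_by_thread(parse_comments(SAMPLE_XML))
    for threadid, items in threads.items():
        print(threadid, len(items), "comment(s)")
```

To run it against a real export, you would read the file contents and pass them to parse_comments instead of SAMPLE_XML. Grouping by threadid seemed like a sensible first step, since most blogging systems attach comments to a post, but how you map a threadid to a post in the new system is up to you.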

Reader's Comments

  1. Mamamiiia |

    Hello,

    My techie abilities have limits. Can you tell me what to do with this script? I would love to use it but don’t know how! Million thx

    C.

  2. einar |

    What you need is to have Python installed (see http://python.org), then open up a command prompt and type “python haloscan.py”. But you have to realize that this will only put all the comments in a file on your hard disk; if you then want to import them into some other blogging system, you’ll have to do it yourself. Also, if Haloscan doesn’t like this and closes your account, I cannot be held responsible. Use at your own risk!

  3. Geoff |

    Hi einar!

    I just wanted to thank you for your script… it made things so much easier for me!

    Also, I found that the following regex worked better than the one you used, because I had the postnames appear instead of the comment ids (line 89 for those of you playing along at home):

    comments = re.findall(r'target="_blank">(.*?)<a href="editpost.php\?post\=(\d+)"', html)
