Thoughts on Technology, Methodology and Programming.

scrAPI on Snow Leopard

Posted by Marcus Wyatt on 3 November 2009

Today I needed to scrape some data from a website and tried to use the trusted old scrAPI to do the job. Grrrr, it's not working. It throws an error:

Scraper::Reader::HTMLParseError: Scraper::Reader::HTMLParseError: Unable to load /Library/Ruby/Gems/1.8/gems/scrapi-1.2.0/lib/scraper/../tidy/libtidy.dylib

After some time on Google I didn’t find any fixes for the issue, so I decided to build from source…

I grabbed assaf’s GitHub repository:

  • git clone git://github.com/assaf/scrapi.git

Then I tried the tests by running:

  • rake test

117 tests, 346 assertions, 0 failures, 44 errors

Nope, errors all over the show… Looking at the original exception message, I checked whether libtidy.dylib exists in the lib/tidy directory. Nope, not there…

So where do I get this library file?

MacPorts to the rescue… Install tidy from MacPorts using the following command:

  • sudo port install tidy

Now we need to find where MacPorts installed the files using the following port command:

  • port contents tidy

The result:

Port tidy contains:
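The listing can be long, and the only entry that matters here is the shared library. Here is a minimal sketch of a filter, assuming `port contents` prints one file path per line. The function name `find_libtidy_paths` is made up for illustration:

```shell
# Hypothetical helper (not part of MacPorts or scrAPI):
# read a file listing on stdin and keep only libtidy .dylib paths,
# stripping any leading indentation.
find_libtidy_paths() {
  grep -E 'libtidy[^/]*\.dylib$' | sed 's/^[[:space:]]*//'
}

# Usage on a machine with MacPorts installed:
#   port contents tidy | find_libtidy_paths
```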

Now all we need to do is copy the library file to our scrAPI source directory:

  • cp /opt/local/lib/libtidy.dylib [your source location]/lib/tidy/libtidy.dylib
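If you want the copy step to fail loudly when the library is missing, something like the following may help. It is only a sketch: the function name `copy_libtidy` and its two arguments are my own invention, and it assumes the scrAPI checkout expects the file at lib/tidy/libtidy.dylib, as the error message suggests.

```shell
# Hypothetical helper: copy a libtidy build into a scrAPI checkout.
#   $1 = path to the library (e.g. /opt/local/lib/libtidy.dylib)
#   $2 = root of the scrAPI source tree
copy_libtidy() {
  lib_path="$1"
  scrapi_dir="$2"

  # Stop early if the library is not where we expect it.
  if [ ! -f "$lib_path" ]; then
    echo "libtidy not found at $lib_path" >&2
    return 1
  fi

  # Create lib/tidy if the checkout does not have it yet, then copy.
  mkdir -p "$scrapi_dir/lib/tidy" || return 1
  cp "$lib_path" "$scrapi_dir/lib/tidy/libtidy.dylib"
}
```

Called as `copy_libtidy /opt/local/lib/libtidy.dylib [your source location]`, it does the same thing as the plain `cp` above, just with a check in front.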

OK, before we speed ahead, let's just run those tests again to check that all is fine:

117 tests, 474 assertions, 0 failures, 0 errors
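If you would rather not eyeball that summary line, a small check along these lines could do it for you. This is a sketch, assuming the summary keeps the "N failures, N errors" wording shown above. The name `tests_clean` is hypothetical.

```shell
# Hypothetical helper: succeed only if the test summary read on stdin
# reports exactly zero failures and zero errors.
tests_clean() {
  grep -Eq '(^|[^0-9])0 failures, 0 errors'
}

# Usage from the scrAPI source directory:
#   rake test | tests_clean && echo "safe to build the gem"
```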

Awesome, we are almost there. Next we need to build the gem using rake:

  • rake package

Make sure you get a ‘Successfully built RubyGem’ message. Now we are ready to install the newly built gem and test scrAPI again.

  • sudo gem install pkg/scrapi-1.2.1.gem

And there you go, scrAPI working again.


5 Responses to “scrAPI on Snow Leopard”

  1. Mike said

    This worked like a charm, thanks man!

  2. Marcus said

    Great, thanks for sharing!

  3. Jeremy Weathers said

    I had to comment out line 3 of the scrapi Rakefile: “Gem::manage_gems” was deprecated and removed.

  4. mike said

    I also had to add a “require 'test/unit'” in test/selector_test.rb when I was testing scrapi (running ubuntu 10.04)

  5. Thank you, I have been seeking for info about this subject for ages and yours is the best I have found so far.
