As promised, MapReduce support is working once again! Neptune 0.0.8 fixes it: when you use an input job to put your data into the underlying datastore, the data is now also copied into HDFS in case you want to use it for MapReduce later. The test suite includes test cases for regular Hadoop MapReduce via the Java WordCount example, and for Hadoop MapReduce Streaming via a Ruby implementation of the Embarrassingly Parallel NAS benchmark.
Also, I forgot to mention back in the 0.0.7 release that Walrus support was fixed, so just like for Google Storage, you can run the following:
neptune( :type => "output", :storage => "walrus", :EC2_ACCESS_KEY => "your access key", :EC2_SECRET_KEY => "your secret key", :S3_URL => "http://ip of storage box/services/Walrus" )
We also changed it so that for all the S3-like storage backends, you need to specify the URL starting with http, so keep that in mind when deploying jobs.
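If you want to catch a missing http prefix before deploying, a small check like the one below can help. This is only a sketch: `storage_url_ok?` is a hypothetical helper and not part of the Neptune gem, and the IP address is a placeholder for your own storage box.

```ruby
require 'uri'

# Hypothetical helper, not part of the Neptune gem: returns true when the
# string parses as an http or https URL that includes a host.
def storage_url_ok?(url)
  uri = URI.parse(url.to_s)
  # URI::HTTPS is a subclass of URI::HTTP, so this accepts both schemes.
  uri.is_a?(URI::HTTP) && !uri.host.nil?
rescue URI::InvalidURIError
  false
end

# Placeholder parameters; 192.168.1.2 stands in for your storage box.
params = {
  :type => "output",
  :storage => "walrus",
  :S3_URL => "http://192.168.1.2/services/Walrus"
}

unless storage_url_ok?(params[:S3_URL])
  raise ArgumentError, ":S3_URL must start with http:// or https://"
end
```

A URL written without the scheme, such as "192.168.1.2/services/Walrus", fails this check, which is the mistake the new requirement is most likely to surface.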
Also, test coverage is up to almost 87%, as we now cover many more failure conditions.
So update your Neptune gem and get coding!
And we have a new version out! Neptune 0.0.7 adds quite a bit of stability compared to previous releases thanks to the use of automated testing via good old-fashioned Test::Unit. We also run rcov to automatically see how much code we're covering in our tests and which code in particular we're missing. Right now we're at a little less than 65% coverage.
Our fancy new automated testing also revealed a number of tiny bugs to fix (a few around the auto-generation of makefiles) and a major one - when we added input jobs in 0.0.6, we wanted to use them to make job input / output chaining easier, but as a side effect, they broke MapReduce jobs. These jobs need their input in HDFS when they start, and with all the different storage options we support, we weren't consistently putting the input in HDFS automatically. We're still working it out, but we will fix it for 0.0.8, so stay tuned for more updates from the world of Neptune!
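To make the HDFS requirement concrete, here is a rough sketch of staging a local input file into HDFS by hand with the standard `hadoop fs -put` command. It is not Neptune's actual code: `hdfs_put_command` is a hypothetical helper, and both paths are made-up placeholders.

```ruby
# Hypothetical helper, not Neptune's actual code: builds the argument list
# for copying a local file into HDFS with `hadoop fs -put`.
def hdfs_put_command(local_path, hdfs_path)
  ["hadoop", "fs", "-put", local_path, hdfs_path]
end

# Placeholder paths for illustration only.
cmd = hdfs_put_command("/tmp/input.txt", "/neptune/input.txt")

# Passing the arguments separately to system() avoids shell-quoting problems.
# Uncomment on a machine with Hadoop installed:
# system(*cmd) or raise "copy into HDFS failed"
```

Whatever storage backend holds the original data, the MapReduce job only starts cleanly once a copy like this has landed in HDFS.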
We're looking into writing some nice automated tests for Neptune for the next release - it's mostly done, but we're also messing around with rcov to make sure we're covering most of the interesting cases. Hopefully this will help us keep Neptune stable across releases, so stay tuned!