I showed this to a lawyer friend of mine who also has a background in technology. While he thought it was a nice tool for preliminary or casual research, he said that ultimately he will still need to go and purchase the documents so that he is absolutely certain they are the originals, since there doesn't seem to be a publicly available way to verify them.
A possible solution to this would be a published hash value from the original source.
That is what I was thinking. Publishing a hash alongside each document listing would cost essentially nothing in bandwidth, and bandwidth is supposedly the reason for the fee.
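For what it's worth, here is a rough sketch of what that check could look like on the user's side. It assumes the original source publishes a SHA-256 hex digest next to each document, which is purely an assumption on my part (nobody publishes such hashes today, as far as this thread knows), and the function names are made up for illustration.

```python
import hashlib


def sha256_of_file(path, chunk_size=65536):
    """Return the SHA-256 hex digest of the file at `path`, read in chunks."""
    digest = hashlib.sha256()
    with open(path, "rb") as f:
        # Read fixed-size chunks so large PDFs are not loaded into memory at once.
        for chunk in iter(lambda: f.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def matches_published_hash(path, published_hex):
    """Compare a local copy against a (hypothetical) officially published digest."""
    # Normalise whitespace and case, since hex digests are often pasted either way.
    return sha256_of_file(path) == published_hex.strip().lower()
```

If the digest of the copy you downloaded from the shared repository matches the one listed by the original source, the file is byte-for-byte identical to what the source hashed. That only helps, of course, if the published hash itself comes from a channel you trust.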
It's clever, but we'll have to wait and see how much it actually adds to the collection of documents freely available. The bulk of the Recap repository was seeded with documents Malamud collected. Of course, Recap appears to be a nice way to actually interact with that collection.
I don't think you should discount how much of a difference this makes. It slides right into users' existing workflow, requiring far less effort on their part.
I don't think I am discounting how much of a difference it makes. My point is simply that without the 1 million documents Malamud handed over, the tool would be largely useless for many months, if not years. What Recap does is make Malamud's documents accessible to lawyers (they are presently stuck in massive tar files at bulk.resource.org), and the combined efforts of Malamud and the Recap people complement each other nicely.
I applaud the people behind this.