MD5 hashing in CoffeeScript, Perl and Scala

As a follow-up to my posting on MD5 hashing in Python, Ruby and Groovy, here is a way of doing MD5 hashing in CoffeeScript, Perl and Scala.

CoffeeScript/Node.js 

Perl

Scala

I also moved the code for Python, Ruby and Groovy into the same Gist on GitHub. If you know a better way, feel free to fork and update!
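For reference, here is a minimal Ruby sketch of the same idea using the standard library's Digest::MD5. It is not necessarily the code that lives in the Gist, and the helper name md5_hex is just an illustration.

```ruby
require 'digest/md5'

# Return the hex-encoded MD5 digest of a string.
# md5_hex is a hypothetical helper name, not something from the Gist.
def md5_hex(text)
  Digest::MD5.hexdigest(text)
end

puts md5_hex('The quick brown fox jumps over the lazy dog')
```

Digest::MD5.hexdigest returns the digest as a lowercase hex string, which is the form most of the other language examples print.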

Sinatra with WebSockets

WebSockets are a hot topic nowadays with the HTML5 push, even though they are not officially part of the HTML5 spec. If WebSockets are new to you, they are a way of keeping a connection open between the client’s browser and the server. This lets you push data back and forth; think AJAX, but without the need to poll for new data over and over. The ability to push gives you very close to real-time updates on the client’s side. At this time, the newest versions of most popular browsers support WebSockets.

Due to the real-time factor of WebSockets, they have found their way into online games. One of my new favorite examples of this is http://browserquest.mozilla.org/, a multiplayer Zelda clone. You see a lot of examples using Node.js along with Socket.IO, which is a great pack of new technologies (example of my own), but what if you don’t feel like switching to a new platform to make use of WebSockets? Most languages have libraries for dealing with WebSockets. In Ruby there is the em-websocket gem, which is an EventMachine-based WebSocket server.

I have a small project on GitHub which shows how I set up a Sinatra application to make use of WebSockets for a chat application. Nothing too great, but it should give you an idea of where to start.

Here is the main part of the code:

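require 'rubygems'
require 'em-websocket'
require 'yajl'
require 'haml'
require 'sinatra/base'
require 'thin'

EventMachine.run do
  # Sinatra app that serves the chat page
  class App < Sinatra::Base
    get '/' do
      haml :index
    end
  end

  # Channel that fans each message out to every subscribed socket
  @channel = EM::Channel.new

  EventMachine::WebSocket.start(:host => '0.0.0.0', :port => 8080) do |ws|
    ws.onopen {
      # Subscribe this socket to the channel; sid identifies the subscription
      sid = @channel.subscribe { |msg| ws.send msg }
      @channel.push "#{sid} connected!"

      # Broadcast anything this client sends, tagged with its sid
      ws.onmessage { |msg|
        @channel.push "<#{sid}>: #{msg}"
      }

      # Stop delivering messages to this socket once it closes
      ws.onclose {
        @channel.unsubscribe(sid)
      }
    }
  end

  # Sinatra serves HTTP on port 3000; the WebSocket server listens on 8080
  App.run!({:port => 3000})
end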
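The chat works because every socket subscribes to one shared channel, and anything pushed to the channel is delivered to all subscribers. Here is a small plain-Ruby sketch of that idea. TinyChannel is a hypothetical stand-in I made up for illustration, not EventMachine's class; it delivers messages synchronously, whereas EM::Channel works through the reactor loop.

```ruby
# TinyChannel is a hypothetical, in-memory stand-in for EM::Channel.
# It only illustrates the subscribe / push / unsubscribe flow.
class TinyChannel
  def initialize
    @subscribers = {}
    @next_sid = 0
  end

  # Register a block to receive every pushed message; returns a subscriber id.
  def subscribe(&block)
    @next_sid += 1
    @subscribers[@next_sid] = block
    @next_sid
  end

  # Remove a subscriber so it stops receiving messages.
  def unsubscribe(sid)
    @subscribers.delete(sid)
  end

  # Deliver a message to every current subscriber.
  def push(msg)
    @subscribers.each_value { |subscriber| subscriber.call(msg) }
  end
end

channel = TinyChannel.new
first_inbox = []
second_inbox = []

first = channel.subscribe { |msg| first_inbox << msg }
second = channel.subscribe { |msg| second_inbox << msg }

channel.push "<#{first}>: hello"
channel.unsubscribe(second)   # like ws.onclose in the chat server
channel.push "<#{first}>: anyone still there?"

puts "first saw #{first_inbox.size} messages, second saw #{second_inbox.size}"
```

In the real server the block passed to subscribe calls ws.send, so each pushed message ends up in every connected browser.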

The full project is linked here on GitHub. If you have any questions or comments, please post them. Suggestions on improving this are also welcome.

Using Redis to sort World of Warcraft’s Auction House data

If you have read some of my other postings, you know by now that I enjoy using the data from World of Warcraft in my projects. To me it’s a good amount of data that I can easily understand, having played World of Warcraft for many years now. If you look at the Blizzard Community Platform API Documentation, you will see that there is an Auction Resource that returns a link to a downloadable JSON file listing all the auctions for a given server. I wanted to be able to take this file and pull out some useful data, or format it in a way that I could easily push into a database.

For the Lothar server, the file is about 4 MB, so I wanted to ensure that the JSON parser I was using is fast. Via Google I found yajl-ruby, which is “~1.9x faster than JSON.parse” and will help with reading the file. I also wanted to use Redis for this project. Redis isn’t something that I have talked about a lot on the blog before, but it’s one of my favorite database systems. At its core it’s a key-value data store, but it also comes with a lot of other great features, such as additional data types (sets, sorted sets and hashes) along with a pub/sub system. For this project I’m going to be using sets and sorted sets.

The idea of using Redis to sort data is one I got from Eric Redmond’s talk “A CouchDB, Neo4j, Redis, and Node.JS Circus” from NodePDX. It’s a great talk about using a nice blend of technologies (or a nice list of buzzwords)! :)

Let’s take a look at the code:

At the start, it reads the JSON response to get the URL for the main JSON file with all the auction data. Next we parse the auction data and start loading the auctions for the Alliance (I only load the Alliance data so I can easily check it). Redis’s sorted sets make use of three values: the item ID as the key, the bid as the score, and the auction ID as the unique member. Redis sorts the data by score, here the bid price, as it goes into the database. I also add the item ID to a set so I can easily iterate over all the items that were loaded from the JSON file.
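To make that layout a little more concrete, here is a plain-Ruby sketch that mimics it with a Hash instead of Redis, so it runs without a server. The sample auctions and the field names (:auc, :item, :bid) are assumptions for illustration only. In Redis itself the loading step would map to commands like ZADD for the sorted set and SADD for the set of item IDs.

```ruby
# Made-up sample data; the field names are assumptions, not the real API format.
SAMPLE_AUCTIONS = [
  { :auc => 1001, :item => 39998, :bid => 450_000 },
  { :auc => 1002, :item => 39998, :bid => 120_000 },
  { :auc => 1003, :item => 41782, :bid => 990_000 },
  { :auc => 1004, :item => 39998, :bid => 300_000 }
]

# Build item ID => { auction ID => bid }, mirroring one sorted set per item
# (key = item ID, member = auction ID, score = bid).
def load_auctions(auctions)
  bids_by_item = Hash.new { |hash, item_id| hash[item_id] = {} }
  auctions.each do |auction|
    bids_by_item[auction[:item]][auction[:auc]] = auction[:bid]
  end
  bids_by_item
end

# Return [auction ID, bid] pairs ordered by bid, lowest first.
# Redis keeps a sorted set in this order for you; here we sort on read.
def sorted_bids(bids_by_item, item_id)
  bids_by_item.fetch(item_id).sort_by { |_auction_id, bid| bid }
end

bids_by_item = load_auctions(SAMPLE_AUCTIONS)
bids_by_item.each_key do |item_id|
  puts "#{item_id}: #{sorted_bids(bids_by_item, item_id).inspect}"
end
```

The keys of the outer Hash play the role of the plain set of item IDs, which is what lets you iterate over everything that was loaded.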

The next part reads back the data. For each item in the set I find the top and bottom entries, that is, the lowest and highest price, and then return the score, which is the price. I divide by 10000 as all the prices are in copper and it’s easier to read them in gold. If you wanted the output in gold, silver and copper, you could use division with the modulo operation to do this.
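As a quick sketch of the divide-and-modulo idea, here is one way to split a copper amount into gold, silver and copper in Ruby. It assumes the usual conversion of 100 copper to a silver and 100 silver to a gold (which matches the 10000 above); the method names are my own.

```ruby
COPPER_PER_SILVER = 100
COPPER_PER_GOLD = 10_000

# Split a price in copper into [gold, silver, copper] using divmod,
# which returns the quotient and the remainder in one call.
def split_price(total_copper)
  gold, remainder = total_copper.divmod(COPPER_PER_GOLD)
  silver, copper = remainder.divmod(COPPER_PER_SILVER)
  [gold, silver, copper]
end

# Format a copper amount for display, e.g. in the script's output.
def format_price(total_copper)
  gold, silver, copper = split_price(total_copper)
  "#{gold}g #{silver}s #{copper}c"
end

puts format_price(1_234_567)   # prints "123g 45s 67c"
```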

I also wanted to display the item name, not just the ID. There is another API call for this, but doing a call per item takes a long time, so I use Redis to build up a local cache of the names. When you do a get on a key that is not in the database, Redis will return nil. If the name is nil, I make an API call to get the name and then save it back into the database. Next time around the key isn’t nil and I don’t need to make the API call. I did find that two items were missing from the API and would reply with an HTTP 404 error. I’m not sure why they are missing, but here are the ones I found: 39998 is Brilliant Scarlet Ruby and 41782 is Design: Lightning Forest Emerald. I just looked them up on Wowhead and added them to the database by hand. Keep this in mind when running the code.
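The name lookup described above is a plain check-the-cache-first pattern. Here is a minimal sketch of it with a Hash standing in for Redis and a lambda standing in for the item API, so it runs on its own. The key format "item:ID" and the names item_name and fake_api are assumptions for illustration.

```ruby
# Look up an item name, calling the slow fetcher only on a cache miss.
# `cache` is anything that responds to [] and []=; a Hash stands in for Redis.
def item_name(cache, item_id, &fetch_name)
  key = "item:#{item_id}"          # hypothetical key format
  name = cache[key]                # nil when the key is not cached yet
  if name.nil?
    name = fetch_name.call(item_id)
    cache[key] = name              # save it so the next lookup skips the API
  end
  name
end

# Stand-in for the real API call; counts how often it is used.
api_calls = 0
fake_api = lambda do |item_id|
  api_calls += 1
  { 39998 => 'Brilliant Scarlet Ruby' }.fetch(item_id)
end

cache = {}
2.times { puts item_name(cache, 39998, &fake_api) }
puts "API calls made: #{api_calls}"
```

Seeding the cache by hand, as I did for the two missing items, works the same way: once the key is present the fetcher is never called for it.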

The output will show the lowest and highest bids on an item along with how many auctions there are for it. The output is just there to show something from the code; I see this being used as a step to get data into another type of database.

If you have any questions or comments, please post them. Suggestions on improving this are also welcome.

No Fluff Just Stuff – Boston 2012

Last week was a busy week for me, and I’m just now getting time to sit down and write about my takeaways from No Fluff Just Stuff. For people who are not familiar with No Fluff Just Stuff, it’s a conference based around Java technologies. It takes place all over and happens on the weekend, which makes it an easier sell to your managers or higher-ups. I’ve been lucky enough to have gone to 3 of them and I always learn about great new things. I would highly recommend checking one out if you are in an area they come to! This time around I feel they really went all out. They removed the need for paper by lending people iPads for note taking and talk evaluations, and gave away a very nice backpack and a T-shirt.

My favorite talk was ‘Sonar workshop’ by Matthew McCullough of GitHub. Sonar is a grouping of great tools that help with code quality by pointing out things that may not have been done right in the code due to time constraints or other issues. It tries to apply best practices to your code and show you where you can do better. This falls in line with the idea of paying back technical debt, which is a concept that I recommend reading up on. Most of the time you hear about great software and tools but have a hard time getting them into your work environment, because you need to justify them to your managers or find time to get them up and running. In this case, while waiting in the workshop for other people to catch up, I was able to import my team’s main project into Sonar! Having my boss next to me in the class also helped with the justification part. Monday morning I had Sonar up and running on one of our servers and started to “pay back” some of the technical debt we had. I’m always concerned about code quality, so Sonar has become one of my favorite tools next to Jenkins!

Some other talks I enjoyed were ‘Connected Data with Neo4j’ and ‘Functional Thinking’. Neo4j is a type of NoSQL database that is based on graphs, which suits certain types of queries. It’s very fast, but your data really needs to be in the right format to make use of it. The talk about Neo4j was good, as Tim Berglund had us think about our own data needs in terms of graphs. For me this helped show some use cases for Neo4j that I hadn’t seen before. I also liked the Functional Thinking talk by Neal Ford, as it showed the ideas behind writing functional code and how to solve problems in different ways. He also went into why there is so much talk about it, comparing it to when object-oriented programming came about and how long that took to become mainstream.

As always there were a lot of talks about languages on the JVM like Groovy, Scala and Clojure and how they can really help your productivity. The BOF (Birds Of a Feather) talk I went to was about languages on the JVM. A good number of people talked about switching languages, or about tools that help them, like IDEs (or about not needing tools like IDEs).

Overall I had an excellent time and, as always, I learned a lot. I’m hoping I will be able to take what I learned and apply it to my work! Now I just need to work on convincing my manager to send me to ÜberConf! :)

MongoMapper with Sinatra Example

If you have worked with Ruby on Rails’ ActiveRecord before, you know how it makes working with databases very easy by giving you objects that deal with the communication with the database. MongoMapper is an Object-Document-Mapper (ODM); it takes a lot of ideas from ActiveRecord and in turn should feel very familiar. I found MongoMapper a good fit for people who want to use Sinatra, because you are able to simply define your model classes right within the application file. MongoMapper will also work within Rails if that’s more your style.

I coded up a small example of using MongoMapper with Sinatra to make a URL shortener. It’s very rough, but again it’s just an example.

Due to MongoDB being schema-less, there is no need for database migrations; just ensure you have your database up and running and you should be good to go. The MongoMapper model is up top, named ‘Shorten’, consisting of variables for url, shorten_id, created_at and a count. This becomes a collection within the database. Within the configure block, we tell MongoMapper what database to use, in this case ‘urls’.

If MongoMapper turns out not to be your cup of tea then you should take a look at Mongoid, which could be a better fit. Both MongoMapper and Mongoid are able to be used within Rails but people say that Mongoid has better Rails 3 support.

After reading some postings and working with MongoDB, you may be wondering if there are any hosted solutions. There are 2 that come to mind, MongoLab and MongoHQ. Both have a free tier to help you get started. Both are also available as add-ons from Heroku, with great documentation that makes it easy to set up your application.

Here are some resource links:

If you have any questions or comments, please post them. Suggestions on improving this are also welcome.

Loggly Archives into MongoDB

Loggly is a great cloud service for managing log files from one server or many. It’s also an add-on for your Heroku-hosted app. Loggly comes in different tiers, from free to a monthly service, based on how much data you store on Loggly’s servers. Being cheap, I have picked the free tier for amscotti.com, as it’s not a mission-critical app and I don’t have tons of logs. One of the coolest things about Loggly is that they do ‘Log Archiving’ on Amazon’s S3 for you. All you need to do is set up a bucket on S3 and update some settings in the Loggly UI.

The files they push out to S3 are JSON files that have been gzipped and are in folders based on year/month/day. There is an easy way to push these files into MongoDB using just some nice Unix tools and MongoDB’s importer.

Download all the files you want to push to MongoDB. You can leave the files in the folders they are in; there is no need to move them around, but there also isn’t any harm in it. There are many tools you can do this with. I use Cyberduck and just drag and drop the folder onto my desktop. Keep in mind that the files for the current date and time are still being written out and could give you errors when you try to copy them, so it’s best not to copy those.

Now we need to get the files into a format that MongoDB’s importer can read. Luckily the importer can read JSON along with other formats, so we don’t need to do too much work. I find this can be done with one command. Ensure that you’re in the root folder for all the files you want to import into MongoDB and run this:

find . -type f -name "*json.gz" | xargs gzcat >> output_file.json

You should now have a file of all the logs in JSON format. Depending on your system you may be able to use zcat instead of gzcat.
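If you would rather not rely on gzcat or zcat being available, the same step can be sketched in Ruby with the standard Zlib library. This is only a rough equivalent of the shell one-liner above, under a couple of assumptions: the method name is made up, and it processes files in sorted path order, which find does not guarantee.

```ruby
require 'zlib'

# Append the decompressed contents of every *json.gz file under root_dir
# to output_path, similar in spirit to:
#   find . -type f -name "*json.gz" | xargs gzcat >> output_file.json
# Returns the number of files processed.
def concat_gzipped_json(root_dir, output_path)
  # '**' makes the glob recurse through the year/month/day folders
  paths = Dir.glob(File.join(root_dir, '**', '*json.gz')).sort
  File.open(output_path, 'a') do |out|
    paths.each do |path|
      Zlib::GzipReader.open(path) { |gz| out.write(gz.read) }
    end
  end
  paths.size
end
```

From the root folder of the downloaded archives you would call it as concat_gzipped_json('.', 'output_file.json'). Like the >> in the shell version it appends, so remove an old output file before running it again.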

The next part is the import into MongoDB, again this can be done with one command.

mongoimport -d loggly -c amscotti output_file.json --jsonArray

The -d is the database you want to use and the -c is the collection you want the data to be imported into. Due to the format of the file you need the --jsonArray option. I don’t have many logs; I have about 8816 lines in the file and it took less than a second to import into the database.

After that, all your data is imported and you are good to go. You’re now able to run map reduce on your log files to your heart’s content!

If you have any questions or comments, please post them. Suggestions on improving this are also welcome.

MongoDB, added Ruby in to the mix

To follow up on my previous posting on MongoDB, I am going to show some Ruby code for connecting to your database and pushing data into it. If you take a look at the MongoDB driver page, you will see that a good number of programming languages are supported by MongoDB.org, along with tons that are supported by the community. My language of choice is Ruby. It’s a supported language from MongoDB.org, and you can install the drivers using gem.

There are 2 gems you are going to want to install, ‘mongo’ and ‘bson_ext’. You can get away with installing just ‘mongo’, but you will be warned that any performance-critical applications should have ‘bson_ext’ installed. At this point you should be all set to start using MongoDB from Ruby. To give a quick overview, here is my Ruby World of Warcraft Armory code that dumps the data into MongoDB.

Not too many changes: lines 5 and 6 make the connection to the database and pick the collection to be used. Line 7 isn’t needed, but if you keep running this code you will keep adding to the collection; dropping it removes all the data and lets you start fresh. Lines 24 and 25 build a document as a hash and pass it to the collection to be saved. You now have data in your MongoDB collection.

Now that we have data in the collection, we are able to pull some of it back out. Line 27 shows how many documents are in the collection. Line 32 is a find based on class type; it sits in a loop, so all the classes will be looked up. The rest of the code should be similar to before, or is just plain Ruby.

I hope to do another posting that shows off some code using MongoMapper which, as you can most likely guess from the name, is an Object-Document-Mapper (ODM) for MongoDB.

If you have any questions or comments, please post them. Suggestions on improving this are also welcome.

Getting started with MongoDB

I’m a big fan of CouchDB. I enjoy how they go about doing things and how you are able to use it without the need for drivers, as it’s all REST-based. As long as your language of choice has the ability to make RESTful calls and read JSON data, you’re all set. But before I really started to use CouchDB, I did start looking at MongoDB, another document data store. MongoDB is now being compared to MySQL as far as its use by projects goes. It has a great group of developers and is getting some great press from big companies moving over to MongoDB from some type of SQL-based system.

I wanted to take another look at MongoDB and eventually do some playing around with it in Ruby for another posting.

Installing

You can get MongoDB from their download page at http://www.mongodb.org/downloads. On MacOS, I installed it the same way I do for these types of apps.

ascotti$ wget http://fastdl.mongodb.org/osx/mongodb-osx-x86_64-2.0.2.tgz
ascotti$ tar -xzf mongodb-osx-x86_64-2.0.2.tgz
ascotti$ sudo mv mongodb-osx-x86_64-2.0.2 /usr/share/
ascotti$ cd /usr/share/
ascotti$ sudo ln -s mongodb-osx-x86_64-2.0.2/ mongodb
ascotti$ sudo mkdir -p /data/db

Don’t forget the last part; /data/db is the path where your databases are saved. After moving the files around, be sure to add the lines below to the .profile in your home folder.

MONGODB_HOME=/usr/share/mongodb; export MONGODB_HOME
PATH=$MONGODB_HOME/bin:$PATH; export PATH

Now let’s test it out. First we need to start the server and then run the client for it.

ascotti$ sudo mongod
mongod --help for help and startup options
Tue Jan 17 13:20:47 [initandlisten] MongoDB starting : pid=3072 port=27017 dbpath=/data/db/ 64-bit host=Lycan.local
Tue Jan 17 13:20:47 [initandlisten] db version v2.0.2, pdfile version 4.5
Tue Jan 17 13:20:47 [initandlisten] git version: 514b122d308928517f5841888ceaa4246a7f18e3
Tue Jan 17 13:20:47 [initandlisten] build info: Darwin erh2.10gen.cc 9.6.0 Darwin Kernel Version 9.6.0: Mon Nov 24 17:37:00 PST 2008; root:xnu-1228.9.59~1/RELEASE_I386 i386 BOOST_LIB_VERSION=1_40
...

And now, in a new shell, start the client:

ascotti$ mongo
MongoDB shell version: 2.0.2
connecting to: test
>

At this point you are all set to start playing around with the database.  There are also downloads for Linux, Windows and Solaris along with packages for Ubuntu and Fedora.

Some playing around

Now that we have everything all set, let’s play around in the client. This is a JavaScript-based client and documents are JSON data, so if you have worked with JSON before it should feel familiar. Sadly there is no web interface like the one that comes with CouchDB, but you can get a 3rd party one.

MongoDB shell version: 2.0.2
connecting to: test
> use playingaround
switched to db playingaround
> doc = {fname:"Anthony", lname:"Scotti", favcolor:"Blue"}
{ "fname" : "Anthony", "lname" : "Scotti", "favcolor" : "Blue" }
> db.mytest.save(doc)
> db.mytest.find()
{ "_id" : ObjectId("4f19f49c4bffc928bbeee802"), "fname" : "Anthony", "lname" : "Scotti", "favcolor" : "Blue" }
> db.mytest.save({fname:"Dave", lname:"Smith", favcolor:"Green"})
> db.mytest.find()
{ "_id" : ObjectId("4f19f49c4bffc928bbeee802"), "fname" : "Anthony", "lname" : "Scotti", "favcolor" : "Blue" }
{ "_id" : ObjectId("4f19f4f74bffc928bbeee803"), "fname" : "Dave", "lname" : "Smith", "favcolor" : "Green" }
> db.mytest.findOne()
{
	"_id" : ObjectId("4f19f49c4bffc928bbeee802"),
	"fname" : "Anthony",
	"lname" : "Scotti",
	"favcolor" : "Blue"
}
> db.mytest.findOne({fname:"Dave"})
{
	"_id" : ObjectId("4f19f4f74bffc928bbeee803"),
	"fname" : "Dave",
	"lname" : "Smith",
	"favcolor" : "Green"
}
>

So what’s happening here?

The first line, ‘connecting to: test’, tells you that you have connected to the test database. The ‘use playingaround’ command switches to (and makes) a new database for you. As for ‘mytest’, this is a collection. Think of a collection like a table; it’s what is going to hold the documents for you.

The next lines make a document and save it into a doc variable to be used later. As you can see it’s in JSON format. To save it into the collection all you need to do is run ‘db.mytest.save(doc)’, where you would replace ‘mytest’ with your collection name.

The .find() and .findOne() calls show you documents that are in the collection. .findOne() only shows you one document at a time, though you are able to pass criteria into the find like I did with .findOne({fname:"Dave"}).

This is a very quick overview of using the client; you can find much better info in MongoDB.org’s Tutorial and MongoDB.org’s Overview.

If you have any questions or comments, please post them. Suggestions on improving this are also welcome.

My experience with Heroku

I haven’t done too much publishing of web applications for my own stuff. At work we have an Ops team that deals with setup and deployment of our code. I have used Rackspace cloud and AWS for setting up servers before, but mostly just for testing and nothing for production. As a developer I want to focus on the code and not fine-tune an application server. I want to be able to write code, push it out and scale without putting much thought into it. Oh, and I want to do this without hiring an Ops team of my own! :P

Platform as a Service (or PaaS) is what I’m looking for. There are sites like Engine Yard, DotCloud and many others that give developers what I’m looking for: the ability to just focus on writing great code and not worry about hosting the application. I have been reading a lot about Heroku. They started out as just Ruby and Ruby on Rails hosting but are now able to host applications made in Python, Node.js, Java, Scala and Clojure along with Ruby. To me this makes Heroku a one-stop shop for hosting. Heroku will give you one free web node to get you up and running. This is a great way to test your application and see how it runs on their system without it costing you anything. Scaling up is as easy as running a command to add more web nodes or workers to your application. They also allow you to scale down to save money when traffic is low.

Heroku also has a large number of add-ons, for things ranging from MongoDB hosting to monitoring and tuning your application. All the add-ons are fully documented and give you clear examples of how to use them. Speaking of documentation, I do have to say that the layout of the documentation on the Heroku site is great and I was able to quickly find anything I needed to know.

Like others in the space Heroku uses the Git workflow. After getting things all set (which didn’t take long at all), you just need to do a ‘git push heroku master’ to push out to Heroku and they will take care of the rest for you (restarting, installing gems, updating files and starting the app again…. you know, all the fun stuff!). If you are already using Git for your version control system then this will feel very natural to you.

Right now amscotti.com, my personal site, is being hosted on Heroku for free, as it’s a low-traffic site without need for more than just one web node. With my experience with Heroku so far, I can safely say that the next web app I push out will be on Heroku, and I would highly recommend it to anyone looking for an easy way to host their web apps.

Anything you want to share about Heroku? Good or bad is welcome. Any other Platform as a Service that is worth a look? Post in the comments!

LinkedIn authentication with Sinatra

To take authentication with Sinatra a bit further, you may want to authenticate against another service. This is some sample code adapted from a Rails example. The code makes use of the linkedin gem from Wynn Netherland to do the authentication and also make some calls to the LinkedIn API. There are other gems that just do authentication for many services like Facebook and Twitter, but for this sample I wanted to be able to make additional calls to the LinkedIn API.

If you know of any way to make this code better please comment or fork the Gist.
