Turing Test for Clouds

December 11th, 2009

One of the ‘trends’ in programming is Monkey Patching, which (like its close cousin, duck typing) sidesteps fixed static types and is used in more dynamic languages. I internalize the technique as: “if it looks like a duck, walks like a duck and quacks like a duck… then who cares what it really is”.

Yeah, as a philosophy I know it lacks nuance, but it’s worked well historically so let that dog hunt.

Another important bit of geek-trivia is the famous Turing Test. If you’re here and don’t know what that is (or how to figure it out) then you should move along now; this isn’t the droid you’re looking for.

Simplified, Turing’s Test and Monkey Patching both suggest that explicit identifications aren’t practical; rather, implicit behaviors should define the use of something. It’s a very expedient supposition that anyone who’s dealt with contracts would envy.
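
In Python terms, the idea can be sketched roughly as follows. The duck test itself is usually called duck typing, and monkey patching is what lets something pass that test after the fact. The class and function names here are made up purely for illustration.

```python
class Duck:
    def quack(self):
        return "Quack"


class Robot:
    """Starts life with no quack() at all."""


def make_it_quack(thing):
    # Duck typing: no isinstance() check, we only care that quack() exists.
    return thing.quack()


# Monkey patching: bolt the missing behavior onto the class at runtime.
Robot.quack = lambda self: "Beep-quack"

duck_sound = make_it_quack(Duck())
robot_sound = make_it_quack(Robot())
```

After the patch, `make_it_quack` cannot tell the two apart, which is the whole point.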

What’s in this for Cloud, given that NIST has done a nice job of defining cloud in practical terms?

As a buzzword, cloud’s seen more than its fair share of hype:

[Figure: Google Trends for cloud computing]

So everyone’s been trying to claim the moniker, and today I was reading about a ‘cloud based product’ that really was simply a web portal much like Walmart. Though I’m sure it can accurately claim to be cloud under a number of definitions, my instinct was “No, definitely not”.

However, a colleague replied to my skepticism, saying: “it underlines that there are already commonplace applications in use that are legitimately ‘cloud’.”

Where do you stand on such a claim? That online shopping or market makers such as eBay are SaaS cloud services?

Underlying it all are deep philosophical questions as integral to humanity’s future as determining where the soul resides!

  • What if I have an amazingly dynamic and responsive application, run by monkeys behind the curtain?
  • Would I be cloud computing if I used twitter via snail mail?
  • Does my subdivision’s swimming pool classify as IaaS, with its broad network-wide (i.e. roads) access, and rapid elasticity (easy capacity management) and measured service (towel charge) if there’s no lifeguard (On-demand self-service)? Surely you don’t need me to explain “resource pooling”.

Strictly speaking I’m not sure where I stand, but I think Turing would tell me to go with the duck and even a million monkeys patching the pool shouldn’t change my mind.

couchdb coming back for more

October 30th, 2009

Not that long ago, JChris pointed out that not only was there a new version of couchdb out but that Janl had released a new version of his OSX package, CouchDBX!!

So I knew I needed to find a time to try both new versions out.

‘Thankfully’, I can’t really get to sleep right now so I thought I’d try to be productive and give them both a go again with my small performance test.

And here are the latest results.

Here’s a baseline, which if you recall loads the file from disk.

$ time ./finding_keywords.py
[('Hacker', 249160.0), ('Techcrunch', 249160.0)]

real    0m0.259s
user    0m0.216s
sys    0m0.041s

Now for couchdb’s results. Here’s the portion of time required for the database load:

$ time ./couchdb_finding_keywords.py
real	16m53.912s
user	2m57.409s
sys	1m35.209s

This is down quite substantially from the 28 minutes the last version tested took to load.

Rather than run the timing for the loading stage again (since it’s clearly way beyond the time required to analyze the text file), I thought I’d jump to an actual query.

Unfortunately in the process of running the real test I realized I hadn’t created the necessary views for the new database.

Then, in doing so, I made a typo in my map() function and had to wait through many, many error messages like:

OS Process :: function raised exception (ReferenceError: worse is not defined) with doc._id ############

This was certainly my fault, but it would be nice if couch could take a break from spitting out error messages and not bake my processor any further running a bad map()!

I finally was able to click off the temporary view page and found the “Stop” button.

I managed to get most of my view function squared away but then missed the quotes around the dictionary key “word”, so while it should have read:

"map": function(doc) {
    emit(doc[\"word\"], 1);
}

"reduce": function(key, value, rereduce) {
    if (rereduce) {
        return sum(value);
    }
    else {
      return value.length;
    }
}

It didn’t and the bad line came out as:

emit(doc[word], 1);

So as you can imagine, I had to do the dance all over again. This time, after I was able to stop it I went directly to the document for the design itself and edited the code there.
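
In hindsight, one way to avoid hand-escaping quotes is to build the design document in a script and let a JSON encoder do the escaping. This is only a sketch: the `_design/finding` id and `word_count` view name come from the URLs used in this post, while the helper name and the idea of generating the document from Python are my own assumptions.

```python
import json

# The view functions are plain JavaScript source held in Python strings.
MAP_FN = """function(doc) {
    emit(doc["word"], 1);
}"""

REDUCE_FN = """function(key, value, rereduce) {
    if (rereduce) {
        return sum(value);
    }
    else {
        return value.length;
    }
}"""


def build_design_doc():
    """Return the design document as a plain dict (hypothetical helper)."""
    return {
        "_id": "_design/finding",
        "language": "javascript",
        "views": {"word_count": {"map": MAP_FN, "reduce": REDUCE_FN}},
    }


# json.dumps turns doc["word"] into doc[\"word\"] inside the JSON string,
# so the quotes around the key cannot silently go missing.
body = json.dumps(build_design_doc(), indent=2)
```

The resulting `body` is what would be sent to the database as the design document; the actual upload step is left out here.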

I know I hit the green arrow to save, but when I went back to the design view to see the results it still had the same mistake. So I corrected it there, and quickly hit ‘Save’ and then couchdbx promptly crashed on me.

After I told OSX to restart it I got:

"The application beam.smp quit unexpectedly after it was relaunched"

So yes… sometimes software and I don’t get along. What can I say, but that it makes me a great tester!

I was able to restart couchdbx though, and it seemed to load fine, and I eventually got data in a browser once the view was built.

But I also got an interesting tidbit from the DBX console too:

1> [info] [<0.66.0>] 127.0.0.1 - - 'GET' /_config/native_query_servers/ 200
1> [info] [<0.86.0>] checkpointing view update at seq 92542 for keywords _design/finding
1> [error] [<0.69.0>] Uncaught error in HTTP request: {exit,normal}
1> [info] [<0.69.0>] Stacktrace: [{mochiweb_request,send,2},
             {couch_httpd,send_chunk,2},
             {couch_httpd_view,send_json_reduce_row,3},
             {couch_httpd_view,'-make_reduce_fold_funs/5-fun-1-',8},
             {couch_btree,reduce_stream_kv_node2,8},
             {couch_btree,reduce_stream_kp_node2,11},
             {couch_btree,fold_reduce,7},
             {couch_httpd_view,'-output_reduce_view/6-fun-0-',12}]
1> [info] [<0.81.0>] 127.0.0.1 - - 'GET' /keywords/_design/finding/_view/word_count?group=true 200
1>

Yep, I think I broke it yet again…

A subsequent query to:

http://localhost:5984/keywords/_design/finding/_view/word_count?group=true

Seems to show all’s well, so I thought I’d get fancy:

wget -O - 'http://localhost:5984/keywords/_design/finding/_view/word_count?group=true'

But when I hit Control-C to cancel the get (because I realized I hadn’t redirected output to /dev/null) I got yet another stack trace:

1> [info] [<0.126.0>] 127.0.0.1 - - 'GET' /keywords/_design/finding/_view/word_count?group=true 304
1> [error] [<0.387.0>] Uncaught error in HTTP request: {exit,normal}
1> [info] [<0.387.0>] Stacktrace: [{mochiweb_request,send,2},
             {couch_httpd,send_chunk,2},
             {couch_httpd_view,send_json_reduce_row,3},
             {couch_httpd_view,'-make_reduce_fold_funs/5-fun-1-',8},
             {couch_btree,reduce_stream_kv_node2,8},
             {couch_btree,reduce_stream_kp_node2,11},
             {couch_btree,fold_reduce,7},
             {couch_httpd_view,'-output_reduce_view/6-fun-0-',12}]

So let’s just get on with the performance test I guess…

After another DBX restart (more to be sure than anything, since couchdb seems to almost enjoy dumping stack traces while still merrily marching along), I changed my URLs:

old_url = "http://localhost:5984/%s/_view/finding/word_count" % (db_name)

new_url = "http://localhost:5984/%s/_design/finding/_view/word_count" % (db_name)

And can now officially tell you (after one more stack trace) that:

$ time ./couchdb_finding_keywords.py
[('Hacker', 249160.0), ('Techcrunch', 249160.0)]

real	0m31.559s
user	0m0.730s
sys	0m0.429s

It’s still an impressive bit of performance for the functionality, and I think I’ve clearly shown its fault resistance. I just wish it didn’t come at more than 100 times the cost of the flat file.
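
For anyone who wants to check the arithmetic, the ratio can be worked out from the two `real` timings quoted in this post. A small sketch; the helper name `parse_real` is mine, and it only handles the `XmY.YYYs` format that `time` printed here.

```python
import re


def parse_real(text):
    """Convert a `time`-style value such as '0m31.559s' to seconds."""
    match = re.fullmatch(r"(\d+)m([\d.]+)s", text.strip())
    if match is None:
        raise ValueError("unexpected time format: %r" % text)
    return int(match.group(1)) * 60 + float(match.group(2))


flat_file = parse_real("0m0.259s")    # baseline run against the file on disk
couch_query = parse_real("0m31.559s")  # query against the couchdb view

# How many times slower the couchdb query was than the flat file.
slowdown = couch_query / flat_file
```

That works out to a slowdown of roughly 122 times for the query alone, before counting the time spent loading the database.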

Where are the filters for Google Reader?

October 29th, 2009

If I can create filters for GMail, to push notes to certain folders or automatically star things, then why can’t I create similar rules for my RSS feeds?

RSS has quickly become at least as important to me as email, so I think it deserves at least as many tools.

Can Android Equal Apple?

October 26th, 2009

Here’s an idea for any aspiring hacker out there.

Find a way to make Android mimic an iPhone when it’s connected.

Users will gain the ability to use iTunes to sync music and podcasts (and possibly Apple Apps too if the emulation went that far).

However, more important than just leveraging a known user interface, it provides an obvious migration path off of Apple’s proprietary lock-in platform.

Skinned Programming Paradigms

October 19th, 2009

Here’s a free thought for you.

How much of people’s choice in programming languages is really syntax dependent?

For example, I dislike Java (though I hate it for other reasons too) simply because of the verbosity of ‘System.out.println’, and I don’t really understand why Scala would choose ‘println’ instead of Python’s terse ‘print’.

And I’m pretty sure, despite overt rationalizations like ‘saving myself keystrokes’, that’s just a petty reason.

However, what I learned in compiler construction is that the parser or tokenizer is really separate from the language itself.

So, for example, there’s no reason there couldn’t be a plugin for Java that allowed me to write with Python’s syntax, or vice versa. Such a technique might require a little bit of library support, but I suspect adding Python’s ‘map()’ even to C/C++ would be fairly trivial.

We should be able to ‘skin’ our languages with our syntax of choice regardless of the underlying compiler, JVM or bytecode.

If this were possible, then ‘language wars’ could be less about syntax and interface (a la emacs vs. vi) and more about the underlying value of the language itself.

If we can theme operating systems and user interfaces, then why not programming languages?
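
As a toy illustration of the skinning idea (and nothing more), Python’s own `tokenize` module can rewrite surface keywords before the real compiler ever sees them. The `SKIN` table and the `reskin` function are invented for this sketch; a real skin would need far more than renaming a couple of tokens.

```python
import io
import tokenize

# Hypothetical skin: surface spellings mapped to the Python names they stand for.
SKIN = {"println": "print", "fn": "def"}


def reskin(source):
    """Rename NAME tokens listed in SKIN and return ordinary Python source."""
    tokens = tokenize.generate_tokens(io.StringIO(source).readline)
    renamed = []
    for tok in tokens:
        text = tok.string
        if tok.type == tokenize.NAME and text in SKIN:
            text = SKIN[text]
        # Two-element tuples put untokenize() in its position-free mode,
        # so changing a token's length cannot upset the layout.
        renamed.append((tok.type, text))
    return tokenize.untokenize(renamed)


skinned = "fn double(x):\n    return x * 2\n"
namespace = {}
exec(reskin(skinned), namespace)  # the compiler only ever sees 'def'
answer = namespace["double"](21)
```

The spacing of the translated source comes out a little odd, but it is valid Python, and the underlying language never knows a skin was involved.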

Welcome to the White House State of Confusion

September 28th, 2009

I know it’s easy to sit on the sidelines and poke fun at people trying to actually do something. And we’ve been given many reasons to respect the technical proficiency of the recent administration’s IT personnel.

However, here’s an example of a drop-down box, from a section of the White House site, which seems frustratingly naive:

Does anyone see a problem?

For starters it’s not alphabetical, which makes finding anything atrocious!

However, beyond that I see three entries for “Departmental Administration”!

That’s what happens when you don’t sanitize your data: you can’t sanitize what you do with it!
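
A minimal sketch of the kind of clean-up I mean, assuming the duplicates differ only by stray whitespace or capitalization (I don’t know what the site’s data really looks like). Apart from “Departmental Administration”, the labels below are made up.

```python
def dropdown_options(raw_labels):
    """De-duplicate and alphabetize labels for a drop-down box."""
    seen = {}
    for label in raw_labels:
        cleaned = " ".join(label.split())  # collapse stray whitespace
        key = cleaned.casefold()           # compare case-insensitively
        if cleaned and key not in seen:
            seen[key] = cleaned            # keep the first spelling we saw
    return sorted(seen.values(), key=str.casefold)


raw = [
    "Departmental Administration",
    "departmental administration ",
    "Budget Office",                       # made-up entry
    "  Departmental   Administration",
    "Advance Office",                      # made-up entry
]
options = dropdown_options(raw)
```

Three “Departmental Administration” entries go in, one comes out, and the list is alphabetical.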

The Reciprocal World of IT and Business

September 21st, 2009

Working as an Enterprise Architect, you will frequently hear how technology must support a business need. It’s a cliché, yet accurate, reminder that technologists often deploy something that doesn’t best satisfy the problem.

Although play and creativity have a place, even in business, no IT environment can long survive without supplying the business with the means to meet its objectives. There is surely no better route to bankruptcy than wasting time and money, which is what happens when IT divorces itself from the business.

However, often overlooked is the reciprocal need for the business to support IT.

This doesn’t happen when a CIO or CTO is relegated to the back office and denied a seat at the executive table, and the effects are more insidious, though no less disastrous in the end. Or imagine being asked to run a massive IT department with the wrong skills, or responding to a mandate for change without the ability to make the proper investments.

For any organization to succeed it’s important to realize that all offices must be imbued with the same driving passion and resources for success.

Links as Code

July 14th, 2009

John Willis’ “Infrastructure as Code” should be a startling epiphany for anyone who has long neglected process and people in favor of technological solutions. Yet, I hope no one here needs convincing about the validity of institutionalizing the collective knowledge.

However, I wonder about a critical level of infrastructure maintenance that seems to be missed: document maintenance.

I’m sure everyone has experienced the frustration of reading a document with an invalid URL, but should this be accepted?

Should documents not be kept in repositories as well? Why not take the same proactive approach to maintaining links as we do “Not breaking the Build” when programming?

So why doesn’t your team have a utility to scan internal documents and links when they propose changing page structures, before they’re made live?
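
Such a utility doesn’t have to be elaborate. Here is a rough sketch that checks the links in a set of HTML documents against a proposed page structure before anything goes live. The function name, the `intranet.example` base URL and the document layout are all assumptions for illustration; it never touches the network and ignores external links.

```python
from html.parser import HTMLParser
from urllib.parse import urldefrag, urljoin


class LinkCollector(HTMLParser):
    """Collect the href of every <a> tag in a document."""

    def __init__(self):
        super().__init__()
        self.links = []

    def handle_starttag(self, tag, attrs):
        if tag == "a":
            for name, value in attrs:
                if name == "href" and value:
                    self.links.append(value)


def broken_links(documents, proposed_paths, base="http://intranet.example/"):
    """Return (document, href) pairs pointing at pages that won't exist.

    documents maps a page path to its HTML; proposed_paths is the set of
    page paths that will exist once the change goes live.
    """
    valid = {urljoin(base, path) for path in proposed_paths}
    broken = []
    for path, html in documents.items():
        collector = LinkCollector()
        collector.feed(html)
        for href in collector.links:
            # Resolve relative links against the page, drop any #fragment.
            target, _fragment = urldefrag(urljoin(urljoin(base, path), href))
            if target.startswith(base) and target not in valid:
                broken.append((path, href))
    return broken
```

Run as part of the build, a non-empty result would fail the change the same way a broken unit test would.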

Converging Google Services

July 10th, 2009

The always fabulous Louis Gray makes good points about using GMail in a corporate environment and got me thinking in a different direction.

I began to consider: Why can’t I share emails the same way I can share RSS entries?

Google Reader allows you to publish all entries you tagged with specific keywords, or you can share entries on an individual basis. Yet, despite the obvious analogue, it’s impossible for me to share email messages or threads in the same manner!

I realize there are some privacy concerns, since RSS & Atom explicitly make things public and email does not. However, there’s no reason I couldn’t use an email-to-RSS gateway and easily violate the expected convention.

I might also argue that opening up email to the same type of social collaboration we get via Google Reader has the potential to make things more secure.

For example, Google could add a default copy-left style license, a la Creative Commons, or a per-email “off the record” flag like Google Talk’s. There could even be “free to share” delivery options rather than keeping everything on an “honor system”.

Who’s Really Testing Chrome?

July 6th, 2009

Just a quick gripe to share with anyone using Chrome.

For all of Chrome’s new high performance design, there’s a very simple way to bring your tabbed experience to its knees: print something…

In my case it was a 100+ page PDF printed 2 pages per sheet, but I’m sure most anything of a decent size would work.

Disappointing to say the least, since printing is supposed to be a background task (that’s why we have spooling), not take front and center stage!