O’Reilly Make me an Offer!

I’m a big fan of O’Reilly books, as I’m sure most of you are. They’re great technical resources for me and have cute animals my wife can really enjoy!

A friend of mine got Programming Collective Intelligence and recommended it to me, so my mother-in-law gave it to me for my birthday (yay, I’m old!). I’m stoked to see O’Reilly focused on moving “up the stack” of technology in such an approachable way.

I finally got a chance to start last night, and from reading the preface it was immediately apparent that this book was going to challenge my newly developed Python skills.

e.g.

{xvii} //That’s the page #

string_list = ['a', 'b', 'c', 'd']

string_list[2] # the book says this returns 'b', which is wrong; it returns 'c'

You know when they’re teaching you incorrect Python that it’s going to be a fun way to learn. I worked my way up to page 11 last night and found at least 8 errata. This is the first time I’ve felt completely comfortable marking up a book (oh the sacrilege!) but I do focus better when I can’t simply skim…

I mentioned my recent activities on Twitter, and another friend asked if I was keeping a list. So, FJ, this post’s for you and for everyone else who doesn’t want to scratch the same groove in their head that I did.

O’Reilly’s great about leveraging the collective intelligence [pun intended] and you can Submit and Find errata (perhaps I should order by frequency and say “Find and Submit”) at O’Reilly’s website for the book.

    Unfortunately, the official list only has two and hasn’t been updated since the 18th of Feb!!!

I submitted mine there and there’s a ton more (but the user format is a little hard to scroll through).

So here’s my quick list so far (through p11) [I’ll try to add new ones as comments so you can track this post]. And if anyone from O’Reilly’s reading: I think I’d make a great editor, if only to actually update the official list with the good community feedback and help others out!

{xvii} string_list[2] = ‘c’

{xviii} /* first list comprehension should change v1>4 to v>4 */

{xix} // Chapter 2, 2nd to last line “move” should be “movie”

{9} critics[‘Toby’] #output is missing ‘Superman Returns’: 4.0

{10} //The results of both math functions are wrong as they use the wrong datapoints (5,4) & (4,1) which should be (1,4.5) and (2,4)

{11} //sim_distance() – the return statement should be: return 1/(1+sqrt(sum_of_squares))

{11} from recommendations import critics, sim_distance #reload(recommendations) didn’t work for me. You’ll have to change the subsequent function call as well (call sim_distance and critics directly, without the “recommendations.” prefix), and because of the previous erratum the returned value should be approximately 0.2942, not 0.1481

{11} This wasn’t my find; I learned it from the user-submitted errata. Someone mentioned using “si = set()” and then “si.add(item)” instead of “si[item]=1” … Both work, but the set seems cleaner and was a new idiom for me.
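If you want to check a few of these without the book's module in front of you, here is a minimal sketch. It makes some assumptions: it is written for Python 3 (the book uses Python 2), the sample list for the comprehension is my own, I'm assuming the page 11 call compares Lisa Rose and Gene Seymour, and their ratings are typed in by hand from the book's critics dictionary, so double-check them against your copy. The helper name shared_items and the use_sqrt flag are mine, not the book's.

```python
from math import sqrt

# {xvii}: lists are zero-indexed, so index 2 is the third element.
string_list = ['a', 'b', 'c', 'd']
third = string_list[2]

# {xviii}: the filter must use the loop variable v, not an undefined v1.
# This sample list is an assumption, not necessarily the book's.
numbers = [1, 2, 3, 4, 5, 6, 7, 8, 9]
big_times_ten = [v * 10 for v in numbers if v > 4]

# Two critics only, hand-copied from the book's critics dictionary (assumption).
critics = {
    'Lisa Rose': {'Lady in the Water': 2.5, 'Snakes on a Plane': 3.5,
                  'Just My Luck': 3.0, 'Superman Returns': 3.5,
                  'You, Me and Dupree': 2.5, 'The Night Listener': 3.0},
    'Gene Seymour': {'Lady in the Water': 3.0, 'Snakes on a Plane': 3.5,
                     'Just My Luck': 1.5, 'Superman Returns': 5.0,
                     'The Night Listener': 3.0, 'You, Me and Dupree': 3.5},
}


def shared_items(prefs, person1, person2):
    """Collect the items both people rated, using a set instead of a dict."""
    si = set()
    for item in prefs[person1]:
        if item in prefs[person2]:
            si.add(item)
    return si


def sim_distance(prefs, person1, person2, use_sqrt=True):
    """Distance-based similarity; use_sqrt=False reproduces the printed version."""
    si = shared_items(prefs, person1, person2)
    if not si:
        return 0
    sum_of_squares = sum((prefs[person1][item] - prefs[person2][item]) ** 2
                         for item in si)
    if use_sqrt:
        return 1 / (1 + sqrt(sum_of_squares))
    return 1 / (1 + sum_of_squares)


as_printed = sim_distance(critics, 'Lisa Rose', 'Gene Seymour', use_sqrt=False)
corrected = sim_distance(critics, 'Lisa Rose', 'Gene Seymour')
print(third, big_times_ten)
print(round(as_printed, 4), round(corrected, 4))
```

With these ratings the version without the square root comes out at about 0.148 and the version with it at about 0.294, which lines up with the two numbers in the erratum above.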

About jay

I'm trying to build something interactive where I can learn from others and hopefully share useful knowledge too. thecapacity@gmail.com

6 Responses to O’Reilly Make me an Offer!

  1. Pingback: thecapacity : What if stocks were movies?

  2. Pingback: thecapacity : My how long it’s been…

  3. jay says:

    P17 when doing the “getRecommendations()” call with “similarity=sim_distance” you will get slightly different values for the 3 movies than what’s listed (because of the previous function errors) but it’s a minimal error.

  4. jay says:

    p14 –
    Not a big bug, but in “Ranking the Critics…” there’s no need to reverse the full list.
    Here’s the code ‘as is’:

    scores.sort()
    scores.reverse()
    return scores[0:n]

    This could simply be:
    scores.sort()
    return scores[-1:-(n+1):-1]

    The syntax’s a little strange but it would save you reversing a big list.

    I think you could also probably sort the list then just slice the parts you need off and then just reverse that new list.
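Here is a small sketch comparing the options, assuming Python 3 and a made-up list of (score, name) pairs (not data from the book). The function names are mine. The last variant uses heapq.nlargest from the standard library, which avoids sorting the whole list at all.

```python
import heapq

# Hypothetical (similarity, critic) pairs, just for illustration.
scores = [(0.29, 'Gene'), (0.99, 'Lisa'), (0.38, 'Mick'),
          (0.89, 'Claudia'), (0.66, 'Jack')]


def top_as_printed(scores, n):
    """The book's approach: sort, reverse the whole list, take the first n."""
    ranked = list(scores)
    ranked.sort()
    ranked.reverse()
    return ranked[0:n]


def top_negative_slice(scores, n):
    """Sort ascending, then walk backwards from the end for n items."""
    ranked = sorted(scores)
    return ranked[-1:-(n + 1):-1]


def top_tail_then_reverse(scores, n):
    """Sort ascending, slice off the last n items, reverse only that slice."""
    ranked = sorted(scores)
    tail = ranked[max(len(ranked) - n, 0):]
    tail.reverse()
    return tail


def top_heap(scores, n):
    """Let heapq pick the n largest, already in descending order."""
    return heapq.nlargest(n, scores)


for pick in (top_as_printed, top_negative_slice, top_tail_then_reverse, top_heap):
    print(pick.__name__, pick(scores, 3))
```

All four should print the same three pairs, highest score first. For a list this small the difference doesn't matter; it only starts to count when the list is large and n is small.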

  5. jay says:

    As I mentioned I’ve found it instructive to hand type in the python code for this book. Also I didn’t really feel like signing up for Safari (O’Reilly’s online book library) but today I discovered that you can get a zip file of all the code!

    Thanks to this post
    http://blog.kiwitobes.com/?p=44

    Here’s the direct link
    http://kiwitobes.com/PCI_Code.zip

  6. jay says:

    p14 (3rd paragraph) – “sim_vecror” should be “sim_distance”.

    More interesting: I finished the two movie recommendation exercises (I haven’t done “Top Matches” yet) and I was really surprised at the variability of the results given by the two methods (Euclidean vs. Pearson):

    Jack Matthews and Mick LaSalle => D(0.286) P(0.211)
    Jack Matthews and Claudia Puig => D(0.320) P(0.029)
    Jack Matthews and Lisa Rose => D(0.341) P(0.747)
    Jack Matthews and Toby => D(0.267) P(0.663)
    Jack Matthews and Gene Seymour => D(0.667) P(0.964)
    Jack Matthews and Michael Phillips => D(0.320) P(0.135)

    Mick LaSalle and Jack Matthews => D(0.286) P(0.211)
    Mick LaSalle and Claudia Puig => D(0.315) P(0.567)
    Mick LaSalle and Lisa Rose => D(0.414) P(0.594)
    Mick LaSalle and Toby => D(0.400) P(0.924)
    Mick LaSalle and Gene Seymour => D(0.278) P(0.412)
    Mick LaSalle and Michael Phillips => D(0.387) P(-0.258)

    Claudia Puig and Jack Matthews => D(0.320) P(0.029)
    Claudia Puig and Mick LaSalle => D(0.315) P(0.567)
    Claudia Puig and Lisa Rose => D(0.387) P(0.567)
    Claudia Puig and Toby => D(0.357) P(0.893)
    Claudia Puig and Gene Seymour => D(0.282) P(0.315)
    Claudia Puig and Michael Phillips => D(0.536) P(1.000)

    Lisa Rose and Jack Matthews => D(0.341) P(0.747)
    Lisa Rose and Mick LaSalle => D(0.414) P(0.594)
    Lisa Rose and Claudia Puig => D(0.387) P(0.567)
    Lisa Rose and Toby => D(0.348) P(0.991)
    Lisa Rose and Gene Seymour => D(0.294) P(0.396)
    Lisa Rose and Michael Phillips => D(0.472) P(0.405)

    Toby and Jack Matthews => D(0.267) P(0.663)
    Toby and Mick LaSalle => D(0.400) P(0.924)
    Toby and Claudia Puig => D(0.357) P(0.893)
    Toby and Lisa Rose => D(0.348) P(0.991)
    Toby and Gene Seymour => D(0.258) P(0.381)
    Toby and Michael Phillips => D(0.387) P(-1.000)

    Gene Seymour and Jack Matthews => D(0.667) P(0.964)
    Gene Seymour and Mick LaSalle => D(0.278) P(0.412)
    Gene Seymour and Claudia Puig => D(0.282) P(0.315)
    Gene Seymour and Lisa Rose => D(0.294) P(0.396)
    Gene Seymour and Toby => D(0.258) P(0.381)
    Gene Seymour and Michael Phillips => D(0.341) P(0.205)

    Michael Phillips and Jack Matthews => D(0.320) P(0.135)
    Michael Phillips and Mick LaSalle => D(0.387) P(-0.258)
    Michael Phillips and Claudia Puig => D(0.536) P(1.000)
    Michael Phillips and Lisa Rose => D(0.472) P(0.405)
    Michael Phillips and Toby => D(0.387) P(-1.000)
    Michael Phillips and Gene Seymour => D(0.341) P(0.205)

    I assume some of it could be related to the relatively few data points (sometimes critics only share 2 of 5 movies).

    Any other ideas?
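One way to see how much the small samples matter: with exactly two shared ratings, any two points lie on a straight line, so Pearson can only come out as +1, -1, or undefined (treated as 0 here), no matter how far apart the ratings are. The 1.000 and -1.000 rows above look like that case. Below is a minimal sketch in Python 3; the functions take plain lists of ratings rather than the book's dictionaries, and the ratings are made up for illustration.

```python
from math import sqrt


def sim_pearson(xs, ys):
    """Pearson correlation of two equal-length rating lists (0 if undefined)."""
    n = len(xs)
    if n == 0:
        return 0
    sum_x, sum_y = sum(xs), sum(ys)
    sum_x_sq = sum(x * x for x in xs)
    sum_y_sq = sum(y * y for y in ys)
    p_sum = sum(x * y for x, y in zip(xs, ys))
    num = p_sum - sum_x * sum_y / n
    spread = (sum_x_sq - sum_x ** 2 / n) * (sum_y_sq - sum_y ** 2 / n)
    if spread <= 0:  # someone gave identical ratings, so correlation is undefined
        return 0
    return num / sqrt(spread)


def sim_distance(xs, ys):
    """Distance-based similarity with the square root included."""
    return 1 / (1 + sqrt(sum((x - y) ** 2 for x, y in zip(xs, ys))))


# Hypothetical pairs of critics who share only two movies.
same_direction = ([3.0, 3.5], [2.5, 3.0])      # both liked the second film more
opposite_direction = ([3.0, 3.5], [4.5, 1.0])  # they disagree on which is better

for a, b in (same_direction, opposite_direction):
    print(round(sim_distance(a, b), 3), round(sim_pearson(a, b), 3))
```

The first pair scores a Pearson of 1 and the second a Pearson of -1, while the distance score just shrinks as the ratings move apart. The two measures also answer different questions: distance asks whether the ratings are close in absolute terms, Pearson asks whether they rise and fall together, so some disagreement between them is expected even with more data.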
