tech
More Digg Technical Talks
We had another couple of great external speakers talking at Digg HQ. The open-source master Doug Cutting stopped by the Digg office to talk about the growth of the Hadoop platform. He also talked in detail about Avro, an RPC and serialization framework that has some interesting differences compared to the popular Thrift framework.
- john's blog
tokenizing twitter posts in lucene
solr lucene is good technology to use for searching over a corpus of tweets. if you take the content of a tweet and dump that into the default solr lucene "text" field, you'll do pretty well. however, if you look at your results closely, you'll find one subtle, but very annoying problem: searches on a hashtag term will match the non-hashtag term.
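to make the problem concrete, here is a minimal sketch in plain python. it does not use solr or lucene at all: `naive_tokenize` and `hashtag_aware_tokenize` are hypothetical stand-ins, the first imitating an analyzer that throws away punctuation such as `#`, the second keeping the `#` attached to its word. the fix described in the full post may well differ.

```python
import re

def naive_tokenize(text):
    """Lowercase, then split on anything that is not a letter or digit.
    The '#' is discarded, so '#lucene' and 'lucene' become the same token."""
    return [t for t in re.split(r"[^a-z0-9]+", text.lower()) if t]

def hashtag_aware_tokenize(text):
    """Lowercase, then keep an optional leading '#' attached to each word."""
    return re.findall(r"#?[a-z0-9]+", text.lower())

def matches(query, tweet, tokenize):
    """True when every query token appears among the tweet's tokens."""
    tweet_tokens = set(tokenize(tweet))
    return all(token in tweet_tokens for token in tokenize(query))

tweet = "learning lucene this week"  # no hashtag in this tweet

# With the naive tokenizer, a hashtag search hits the plain word.
print(matches("#lucene", tweet, naive_tokenize))

# With the hashtag-aware tokenizer, the same search no longer matches.
print(matches("#lucene", tweet, hashtag_aware_tokenize))
```

the trade-off in this sketch is that a plain search for "lucene" also stops matching tweets that only contain "#lucene", so a real setup would probably need to index both forms.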
- cailin's blog
Continuous Deployment at Digg.com
Digg's Andrew Bayer has just written a blog post describing how we use Git, Hudson, Selenium, Puppet and Gerrit to manage continuous deployment at Digg.
Andrew describes how we get developer commits to production quickly and safely using a combination of automated packaging and staging, web-based code review, and automated testing (unit and Selenium).
Read the full blog here.
- john's blog
Digg Technical Talks
- john's blog
Digg and Drupal
Right now we're looking for someone to lead the charge on our internal Drupal development and our contributions back to the project. If you're handy with Drupal and have a passion for open source development, take a look at our latest posting on the jobs page. Read more about our use of Drupal on the Digg Blog.
- john's blog
Dealing with the Data Deluge
- john's blog
A Geometric Progression of Effectiveness - The Agility of Interruptions
André Maurois (1885-1967) wrote that "The effectiveness of work increases according to geometric progression if there are no interruptions." At Digg we struggle between the clear benefits of uninterrupted work and the need to be agile in our communication.
- john's blog
Saying Yes to NoSQL; Going Steady with Cassandra
The last six months have been exciting for Digg's engineering team. We're working on a soup-to-nuts rewrite. Not only are we rewriting all our application code, but we're also rolling out a new client and server architecture. And if that doesn't sound like a big enough challenge, we're replacing most of our infrastructure components and moving away from LAMP.
- john's blog
log4drupal now available on github
both the 5.x and 6.x versions are now available for download on github. sorry, i just can't do CVS anymore. to download:
- start by going here: http://github.com/cailinanne/log4drupal
- then click the "all tags" drop-down and choose the appropriate version
- then click the download button
a full description of the module is available here
- cailin's blog
breadth-first graph search using an iterative map-reduce algorithm
i've noticed two trending topics in the tech world today: social graph manipulation and map-reduce algorithms. in the last blog post, i gave a quickie guide to setting up hadoop, an open-source map-reduce implementation, and an example of how to use hive, a sql-like database layer on top of it. while this is one reasonable use of map-reduce, this time we'll explore its more algorithmic uses, while taking a glimpse at both of these trendy topics!
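as a rough illustration of the idea, here is a single-process python sketch of one common way to phrase breadth-first search as repeated map and reduce passes. it is not the code from the full post and it does not run on hadoop: the function names, the `(distance, neighbors)` record layout, and the sample graph are all assumptions made for this example. each pass pushes the known distances one hop further, and the loop stops when a pass changes nothing.

```python
from collections import defaultdict

INF = float("inf")  # marks nodes not reached yet

def map_phase(node, record):
    """Re-emit the node's own record, plus a candidate distance per neighbor."""
    distance, neighbors = record
    yield node, (distance, neighbors)
    if distance != INF:
        for neighbor in neighbors:
            # None means "this value carries no adjacency list"
            yield neighbor, (distance + 1, None)

def reduce_phase(node, values):
    """Keep the adjacency list and the smallest distance seen for this node."""
    best = INF
    neighbors = []
    for distance, adjacency in values:
        best = min(best, distance)
        if adjacency is not None:
            neighbors = adjacency
    return node, (best, neighbors)

def bfs_map_reduce(graph, source):
    """Run map/reduce passes until the distances stop changing."""
    state = {
        node: (0 if node == source else INF, list(adjacent))
        for node, adjacent in graph.items()
    }
    while True:
        grouped = defaultdict(list)  # stands in for the shuffle step
        for node, record in state.items():
            for key, value in map_phase(node, record):
                grouped[key].append(value)
        new_state = dict(reduce_phase(n, vals) for n, vals in grouped.items())
        if new_state == state:
            return {node: dist for node, (dist, _) in new_state.items()}
        state = new_state

# hypothetical sample graph: "f" is not reachable from "a"
graph = {"a": ["b", "c"], "b": ["d"], "c": ["d"], "d": ["e"], "e": [], "f": []}
print(bfs_map_reduce(graph, "a"))
```

the number of passes grows with the depth of the graph, which is why this style of search is described as iterative: on a real cluster each pass would be a separate map-reduce job.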
- cailin's blog



