Data and Analytics Meetup

Earlier this month, we opened up our office to a group of developers for an evening. We talked about Big Data and what it means to fast-growing startups like Wish. The event was co-hosted by one of our partners, Treasure Data, a Silicon Valley Big-Data-as-a-Service startup.

To me, the questions every consumer startup has to answer are: can we use data to help us grow our reach, and do we have the right data architecture to sustain that growth? Wish has grown from zero to ~10 million users in a relatively brief period, so we’ve learned a few things about leveraging data to grow and engage an audience. Some of these insights have to do with combining algorithms and art, which I’ll gladly share with any developer shortly after they start full-time employment here at ContextLogic / Wish. But a huge part of our growth comes down to our metrics-focused product development process.

We talked in depth about our feature experiment framework. While A/B testing is a fairly common practice, we often have dozens of experiments live at the same time, cutting across different user segments and cohorts. There are simply no great tools out there that can help us analyze the data at the pace we need, none that are affordable anyway, so we built our own. We found that when everyone in the company has visibility into an experiment’s impact on user behavior, we can make decisions much faster and with less friction.

As we experience hypergrowth, we’ve been both fortunate and wise in deciding when to partner and when to build our own solutions. In previous blog entries, we talked about auto-scaling our serving clusters. For data systems involving distributed file systems such as Hadoop, however, things get more complicated. After some evaluation, we decided to leverage the scale and services of Treasure Data. Jeff Yuan from Treasure Data gave a great overview of their offering. Be sure to hit them up at the links below if you’d like to check them out. Treasure Data:
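We can’t publish the framework itself, but to make the experiment discussion above a bit more concrete, here’s a minimal sketch of the kind of deterministic bucketing such a framework can be built on. The function and experiment names are illustrative, not our production code: hashing the experiment name together with the user ID gives every experiment an independent assignment, which is what lets dozens of them run side by side without interfering.

```python
import hashlib

def assign_variant(user_id, experiment, variants):
    """Deterministically bucket a user for a single experiment.

    Hashing (experiment, user_id) together means each experiment
    assigns users independently, so concurrent experiments do not
    produce correlated buckets.
    """
    digest = hashlib.sha1(("%s:%s" % (experiment, user_id)).encode("utf-8")).hexdigest()
    return variants[int(digest, 16) % len(variants)]

# Illustrative experiments only: every live experiment is evaluated
# independently for the same user, with no per-user state stored.
experiments = {
    "feed_ranking_v2": ["control", "treatment"],
    "signup_flow": ["control", "short_form", "social_first"],
}
for name, variants in sorted(experiments.items()):
    print(name, "->", assign_variant("user_12345", name, variants))
```

Because the assignment is a pure function of the user and the experiment, a user lands in the same variant on every request without any lookup table, and launching or retiring one experiment never reshuffles the others.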

Choosing the right partner frees us to do what we do best: building algorithms and applications. We talked about how we leverage Treasure Data’s Hive interface, pushing aggregation and ordering into Hive to minimize data transport while still generating models and analytics at full throughput on our local Hadoop cluster; a sketch of that pattern follows below.
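Here’s a rough sketch of that pattern. The run_hive_query helper is a hypothetical stand-in for whatever client submits the query (Treasure Data exposes Hive, but this is not their actual API), and the table and column names are made up for illustration:

```python
# Aggregation pushdown: the GROUP BY / ORDER BY run inside Hive, so only
# a compact, pre-aggregated result set travels over the wire instead of
# the raw event rows.
DAILY_EVENT_ROLLUP = """
SELECT
    TO_DATE(event_time)     AS day,
    event_name,
    COUNT(*)                AS events,
    COUNT(DISTINCT user_id) AS unique_users
FROM events
GROUP BY TO_DATE(event_time), event_name
ORDER BY day ASC, events DESC
"""

def run_hive_query(sql):
    """Hypothetical stand-in for a real Hive client call; returns rows."""
    raise NotImplementedError("wire this up to your Hive client of choice")

# Only thousands of rollup rows come back, not billions of raw events;
# model building then happens locally against this small result set:
# rows = run_hive_query(DAILY_EVENT_ROLLUP)
```

The win is that the heavy lifting, scanning and grouping billions of events, happens where the data lives, and only the ordered aggregates ever cross the network.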

There were far too many nuggets of information and great conversations that evening to capture them all in one post. Be sure to follow us on LinkedIn and join us at our next open house event.

And we’re always hiring great engineers. Check out our careers page or send us your resume: