Wish Now Offers Easy Integration with ShipStation


Wish is proud to announce a cool new integration with ShipStation for our merchants. Over the last few months, the volume of orders that our merchants process daily has risen dramatically. To help our merchants fulfill their orders promptly and keep our users happy, we went in search of an easy-to-use shipping tool. ShipStation was a great fit for both us and our merchants.

[Chart: Transaction volume over the last few months]

Wish chose ShipStation because of its easy integration, its quality documentation, and its simple interface. The integration process was a snap due to the excellent documentation and simple API that ShipStation provides. Within 2 days we had integrated with their API and had one of our best merchants fulfilling orders via ShipStation. From both an engineering and user standpoint, our integration has gone very smoothly. We think the rest of our merchants will really enjoy this latest addition to the Wish platform.


Our merchants can now easily process their shipping labels using ShipStation’s robust, time-saving application to manage orders from over 40 marketplaces and shopping cart platforms, including Wish, eBay, Amazon, Magento, and Shopify. Within ShipStation, you can define automation rules, filters, and profiles that make the copy-and-paste, cut-out-with-scissors procedures you might use today look like the Stone Age of order and shipment management. So what about those hundreds of orders from Wish? ShipStation will help you sort, process, and ship them all in a flash!

If you’re interested in adding ShipStation to your merchant arsenal, you can sign up for their 30-day free trial today and see just how amazing your life could be. Those who already have ShipStation accounts can add their Wish merchant info with just a few clicks. Log in or sign up at www.shipstation.com.

Cheers,

Christine & Josh

Data and Analytics Meetup

Earlier this month, we opened up our office to a group of developers for an evening. We talked about Big Data and what it means to fast-growing startups like Wish. The event was co-hosted by one of our partners, Treasure Data, a Silicon Valley Big-Data-as-a-Service startup.

To me, the questions every consumer startup has to answer are: can we use data to help us grow our reach, and do we have the right data architecture to sustain that growth? Wish has grown from zero to ~10 million users in a relatively brief period, so we’ve learned a few things about leveraging data to grow and engage an audience. Some of these insights have to do with combining algorithms and art, which I’ll gladly share with any developers shortly after they start their full-time employment here at ContextLogic / Wish. But a huge part of our growth comes from our metrics-focused product development process.

We talked in depth about our feature experiment framework. While A/B testing is a fairly common practice, we often have dozens of experiments live at the same time, cross-cutting different user segments and cohorts. There are simply no great tools out there that can help us analyze the data at the pace we need, none that are affordable anyway, so we built our own. We found that when everyone in the company has visibility into the impact on user behavior, we are able to make decisions much faster with less friction.

As we experience hypergrowth, we’ve been both fortunate and wise to have made the right decisions on when to partner and when to build our own solutions. In previous blog entries, we talked about auto scaling our serving clusters. However, for data systems involving distributed file systems such as Hadoop, things get more complicated. After some evaluation, we decided to leverage the scale and services of Treasure Data. Jeff Yuan from Treasure Data gave a great overview of their offering. Be sure to hit them up at the link below if you’d like to check them out. Treasure Data: http://www.treasure-data.com

Choosing the right partner lets us do what we do best: building algorithms and applications. We talked about how we leverage Treasure Data’s Hive interface, performing aggregation and ordering there to minimize data transport while still generating models and analytics with maximum throughput on our local Hadoop cluster. There were far too many nuggets of information and great conversations that evening to capture them all in one post. Be sure to follow us on LinkedIn and join us at our next open house event.

-jack jack@contextlogic.com

And we’re always hiring great engineers. Check out our career page or send us your resume: careers@contextlogic.com

Pushing to Production

Code Release Management @ ContextLogic

At ContextLogic we like to iterate quickly and do a major code release every day. That means each and every day we push new features to Wish. Releasing code into production this often requires that we have a code release process that is efficient and stable.

Our Setup

We use Git for source control and deploy our services to AWS. For deployment we use Fabric scripts and AutoScale.

Each day an engineer spends time preparing, testing, pushing, and monitoring the release. Since this engineer has many other tasks to complete, it is paramount that each of these steps is fast and reliable.

Our Goals

When designing our release process we had a few goals in mind:

  • Easy to use, so any engineer can push
  • Catch bugs before they appear in production
  • Keep the amount of time pushers spend pushing to a minimum
  • Automate as many of the steps as possible

How We Do It

Each night, a cron job runs that cuts the release, pushes it to a staging area, and sends out summary emails. Automated and manual testing can then be run on the release. The next day the pusher will merge the production branch into the release. The release is then pushed to a testing environment for verification. After verification the release becomes the new production branch and is pushed live to wish.com.

Visual Overview

1. Cutting a new release

Every evening a TeamCity job runs and uses traditional Git branching and merging techniques to create a new release branch from the current master branch.
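As a rough illustration (not our exact TeamCity job), the nightly cut could be expressed as a Fabric task along these lines; the branch naming scheme here is hypothetical:

    # Sketch of a nightly "cut release" step as a Fabric task.
    # Branch names and the date-based naming are illustrative placeholders.
    from datetime import date
    from fabric.api import local, task

    @task
    def cut_release():
        release = "release-%s" % date.today().strftime("%Y%m%d")  # hypothetical naming
        local("git fetch origin")
        local("git checkout -b %s origin/master" % release)  # branch off current master
        local("git push origin %s" % release)                 # publish for staging and tests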

2. Pushing a New Release

To ensure that any hotfixes from the previous night are included in the release, the production branch is merged into the release branch.

Any necessary fixes or last minute commits are then cherry picked into the release branch.

The release is then pushed to a testing tier of front end machines. Functional and manual testing is then run on the tier. If no bugs are found, then the code is pushed live.
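A simplified sketch of that morning flow as a Fabric task might look like the following; the branch names, cherry-pick handling, and the follow-on deploy to the test tier are placeholders rather than our actual scripts:

    # Sketch of the push-day preparation described above; not our real tooling.
    from fabric.api import local, task

    @task
    def prepare_release(release_branch, fixes=""):
        local("git checkout %s" % release_branch)
        local("git merge production")                  # pick up last night's hotfixes
        for sha in fixes.split(","):                   # optional last-minute commits
            if sha:
                local("git cherry-pick %s" % sha)
        local("git push origin %s" % release_branch)   # then deploy this branch to the test tier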

The Benefits

  • Many steps are automated, saving the pusher’s valuable time
  • By cutting the release at the same time every night, developers are less compelled to rush code into releases
  • We can dogfood each release by switching our DNS for http://wish.com to the staging environment
  • The time between cutting the release and pushing it allows us to run many automated functional and unit tests
  • There is a very low probability of merge conflicts, making the pusher’s life easy

Conclusions

Like everything at ContextLogic this process is evolving all the time as we continue to optimize and improve. If you know a better process or have suggestions, hit me up josh@contextlogic.com.

~ Josh Kuntz


ps. We’re hiring. Careers @ ContextLogic

Advice and tools for AWS AutoScaling

AutoScale is a really cool feature of AWS. It lets you define what infrastructure you would like to be running, and then it automatically creates & destroys instances to maintain that state.

This is really great for a few reasons:

  • Scaling: If you get a surge of traffic, new capacity can come up automatically to handle it
  • Availability: If a bunch of instances die at once in an emergency, AutoScale will automatically replace them for you
  • Cost management: Automatically reduce capacity during off-peak load to save money

We started using it for a number of our services over the last few months and it’s been great. Some of our older services are not AutoScaled yet, but I’ll move those over at some point and, going forward, all new services will be architected for AutoScale from the start. This post discusses some of the considerations and tools we use for AutoScale here at Wish. We’re also releasing one of those tools as open source; it makes it easier to set up & manage AutoScaled deployments.

Making Instances Ephemeral

To use AutoScale effectively, you need to architect your instances to be ephemeral. That doesn’t necessarily mean using ephemeral storage (though that’s usually a good idea), but you need to be able to survive any AutoScaled instance being terminated or created at any time in an unattended way. (As an aside, a great tool for making sure things continue working like that is Netflix’s Chaos Monkey.) Getting to that point isn’t trivial, but it lets you sleep a lot better at night knowing that even if an entire availability zone goes down or you get hammered with traffic, you can trivially add the capacity you need.

For stateless services, this probably isn’t too bad. If your frontends are behind a load balancer, you can automatically add/remove instances from it, and if your backends all read from a common queue, adding or removing servers there should be easy. It’s obviously trickier for more stateful things like a database, where you need to do non-trivial I/O to go from a fresh DB server to something useful, and we don’t AutoScale those right now for that reason.

Tools like Chef and Puppet help a lot here. The learning curve can be a bit steep, but you should check them out if you haven’t already and you’re thinking of doing something like this. In a nutshell, they give you a nice way to describe the state of a server so you can automatically go from a fresh OS install to a configured server.

One tool we built to help with this process takes a Chef “role” (i.e. a type of server to configure) and a base Ubuntu AMI, then automatically creates a new AMI for that server type. This way, whenever we make a major infrastructure change, I can run a single script to get a new AMI with the change baked right in. That makes it easy to roll changes out across a lot of infrastructure.
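As a sketch of the idea (not the tool itself), the boto side of that process looks roughly like this; the base AMI, key pair name, and the Chef step are placeholders:

    # Minimal sketch of baking an AMI for a Chef role with boto (boto 2.x).
    # BASE_AMI, instance type, and key pair are assumptions for illustration.
    import time
    import boto.ec2

    def bake_ami(role, base_ami="ami-xxxxxxxx", region="us-east-1"):
        # Launch a throwaway instance from the base Ubuntu AMI.
        conn = boto.ec2.connect_to_region(region)
        instance = conn.run_instances(base_ami, instance_type="m1.small",
                                      key_name="ops").instances[0]  # placeholder key pair
        while instance.state != "running":   # wait for the instance to boot
            time.sleep(10)
            instance.update()

        # ...the real tool would now run Chef for `role` against the instance
        # (e.g. knife bootstrap over SSH); omitted here for brevity...

        # Snapshot the configured instance into a new AMI and clean up.
        ami_id = conn.create_image(instance.id, "%s-%d" % (role, int(time.time())))
        conn.terminate_instances([instance.id])
        return ami_id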

Instance Initialization

Once you can configure the OS and basic services the way you want in a reliable way, you probably need to deploy your latest code to the box and do other init tasks when it comes up.

Our AMI creation tool leaves a script in the AMI that we run, via Ubuntu’s cloud-init, when the new instance first boots.

This script does a few things:

  1. Register the new node with the Chef server
  2. Run Chef on the node to pick up any new changes since the AMI was created
  3. Deploy the latest production code
  4. Start our app

So far, that pattern of baking most infrastructure changes into the AMI and then having a script to grab the latest Chef changes and code has given us a good balance of low time-to-launch and ease of maintenance.
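A condensed sketch of that first-boot flow might look like the following; the S3 bucket, file paths, and service name are illustrative rather than our production script:

    # Sketch of a cloud-init first-boot script covering the four steps above.
    # Assumes chef-client and a validation key are already baked into the AMI.
    import subprocess
    import boto

    def first_boot():
        # Steps 1 & 2: register with the Chef server and converge, picking up any
        # changes made since the AMI was baked.
        subprocess.check_call(["chef-client", "--once"])

        # Step 3: deploy the latest production code (bucket and paths are hypothetical).
        key = boto.connect_s3().get_bucket("example-deploys").get_key("production.tar.gz")
        key.get_contents_to_filename("/tmp/production.tar.gz")
        subprocess.check_call(["tar", "-xzf", "/tmp/production.tar.gz", "-C", "/srv/app"])

        # Step 4: start the app (service name is a placeholder).
        subprocess.check_call(["service", "app", "start"])

    if __name__ == "__main__":
        first_boot()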

Deployment

Before AutoScale, our deployment process basically involved a tool to scp the new code to each host and run commands to restart services. The problem with this push-based deployment is that if a new server comes up, you need an operator to push the right version of the code to it. This is fine if bringing up new capacity is always attended and happens rarely, but AutoScale could create a new instance in the middle of the night and needs to be able to get the latest code itself.

So, we changed to a pull-based deployment. In a nutshell, our new deploy tool:

  1. Pushes the code to S3
  2. Gets hostnames of running instances from Chef
  3. On each one, runs a script that pulls the code from S3 and restarts the app (same as steps 3 & 4 from above)
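Roughly, a pull-based deploy along those lines could be sketched like this; the bucket name, Chef search query, and remote script path are assumptions, not the actual tool:

    # Sketch of a pull-based deploy: upload a build to S3, ask Chef for the
    # frontend hosts, then have each host pull the build and restart itself.
    import json
    import subprocess
    import boto

    def deploy(build_path):
        # 1. Push the code to S3 (bucket name is a placeholder).
        bucket = boto.connect_s3().get_bucket("example-deploys")
        bucket.new_key("production.tar.gz").set_contents_from_filename(build_path)

        # 2. Get hostnames of running frontends from Chef via knife search.
        out = subprocess.check_output(
            ["knife", "search", "node", "role:frontend", "-a", "fqdn", "-F", "json"])
        rows = json.loads(out).get("rows", [])
        # knife's JSON shape varies a bit by version; adjust this extraction as needed.
        hosts = [attrs["fqdn"] for row in rows for attrs in row.values()]

        # 3. On each host, pull the new build from S3 and restart the app
        #    (same pull-and-restart step the first-boot script runs; path is a placeholder).
        for host in hosts:
            subprocess.check_call(["ssh", host, "sudo", "/usr/local/bin/pull_and_restart"])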

Cleanup

One little gotcha with our approach is that if you’re churning through instances constantly, your Chef or Puppet server will end up with a ton of nodes that have already been terminated. Our solution was to run a script every few minutes that looks for EC2 instances in the “Terminated” state and removes them from Chef; so far that’s worked pretty reliably.
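That cleanup cron boils down to something like the sketch below; the region and the assumption that Chef node names match EC2 instance ids are ours, purely for illustration:

    # Sketch of a cleanup job that prunes terminated EC2 instances from Chef.
    import subprocess
    import boto.ec2

    def cleanup_terminated(region="us-east-1"):
        conn = boto.ec2.connect_to_region(region)
        for reservation in conn.get_all_instances():
            for instance in reservation.instances:
                if instance.state == "terminated":
                    # Assumes Chef node/client names match the EC2 instance id.
                    subprocess.call(["knife", "node", "delete", instance.id, "-y"])
                    subprocess.call(["knife", "client", "delete", instance.id, "-y"])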

Configuration

The other hurdle to using AutoScale (especially when you’re first starting out) is that there is no UI to change or see your configuration. You can use the AWS command-line tools to manage it, but that can become error-prone and doesn’t scale well to many different engineers managing many different AutoScaling groups.

To manage this at Wish, we built a tool, which I’m releasing as open source, that drives this whole process from a config file. We chose the config-file approach because it distills AutoScaling down to a simpler model that’s easy to work with, gives ops a complete picture of the current state, and can be tracked in source control.
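To give a flavor of what’s happening under the hood, here’s a heavily simplified sketch of turning one config entry into boto AutoScale calls; the config dict shape here is hypothetical, not the real autoscale.yaml schema, and the real tool supports much more:

    # Sketch: create a launch configuration and AutoScaling group from a config
    # entry using boto 2.x. AMI id, zones, and sizes below are placeholders.
    import boto.ec2.autoscale
    from boto.ec2.autoscale import LaunchConfiguration, AutoScalingGroup

    def apply_group(conn, name, cfg):
        lc = LaunchConfiguration(name="%s-lc" % name,
                                 image_id=cfg["ami"],
                                 instance_type=cfg["instance_type"])
        conn.create_launch_configuration(lc)
        group = AutoScalingGroup(group_name=name,
                                 launch_config=lc,
                                 availability_zones=cfg["zones"],
                                 min_size=cfg["min"],
                                 max_size=cfg["max"])
        conn.create_auto_scaling_group(group)

    conn = boto.ec2.autoscale.connect_to_region("us-east-1")
    apply_group(conn, "frontend", {"ami": "ami-xxxxxxxx", "instance_type": "m1.small",
                                   "zones": ["us-east-1a", "us-east-1b"],
                                   "min": 2, "max": 10})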

AutoScaleCTL

You can find this tool, AutoScaleCTL, on GitHub.

Installation

  1. Download code from GitHub
  2. pip install -r pip-requirements (you can probably skip this if you have a fairly recent boto installed)
  3. Run sudo python setup.py install (makes autoscalectl in /usr/local/bin, so you’ll probably need sudo here)

If you haven’t already, set up your boto config file at ~/.boto with:

[Credentials]
aws_access_key_id = <your access key>
aws_secret_access_key = <your secret key>

Usage

The first step is to copy and edit the sample autoscale.yaml to fit your configuration. A lot of sections are optional so you can start simple to play around and then build out more complexity as you go. It doesn’t support every feature of AutoScale, but it supports a pretty good set. If you want to add more, feel free to send a pull request.

When that’s done, run autoscalectl [/path/to/autoscale.yaml].

One important note is that it doesn’t support removing AutoScaling groups or alarms. So, if you delete a section from the config file, it won’t be deleted in AWS. You’ve gotta take care of that one manually.

Future Work

One big omission from this is support for spot instances. The trick with spot-instance AutoScaling is that you need to make sure your total capacity needs are always met while preferring spot instances over on-demand instances when possible. There’s no explicit support for this in AutoScale itself (you can create an AutoScaling group that will buy spot instances up to your bid price, but the trick is figuring out how to calibrate alarms so that on-demand capacity kicks in when needed but doesn’t replace available spot capacity). If you have experience with this that you’d like to share, or you want to try it out, I’d love to hear from you.

Another big area for future work here is around supporting a wider range of metrics. CPU, I/O, or memory stats can be a bit of a blunt measure of demand. Better support around custom metrics would make it easier to directly define scaling rules in terms of the things that matter (latency, queue sizes, request volume, etc).
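For example, publishing an application-level metric to CloudWatch is already easy with boto, and an alarm on that metric could then drive a scaling policy. This is just a sketch with made-up names, not something the tool does today:

    # Sketch: publish a custom application metric that a scaling alarm could watch.
    # The namespace, metric name, and queue-depth source are illustrative.
    import boto.ec2.cloudwatch

    def report_queue_depth(depth):
        cw = boto.ec2.cloudwatch.connect_to_region("us-east-1")
        cw.put_metric_data(namespace="Example/App",
                           name="QueueDepth",
                           value=depth,
                           unit="Count")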

There are also a bunch of fields that aren’t supported in the config file. I covered the required ones (and the ones we use at Wish), but things like attaching instances to an ELB aren’t supported right now. It’s pretty easy to add those things if you find you need them (feel free to ping me if you’re not sure how), and I’ll keep the tool updated as I improve it for Wish.

-adam, adam@wish.com

MongoMem: Memory usage by collection in MongoDB

Here at Wish, we’re big fans of MongoDB. It powers our site for 8 million users and has been a pretty good experience for us. To help keep everything running smoothly, we’ve built a handful of tools to help automate things and get more insights into what’s going on.

Today, we’re releasing the first of these tools, MongoMem. MongoMem solves the age-old problem of figuring out how much memory each collection is using. In MongoDB, keeping your working set in memory is pretty important for most apps. The problem is, there’s not really a way to get visibility into the working set or what’s in memory beyond looking at the resident set size or the page fault rate.

As engineers, we usually have a rough, intuitive sense of how the memory distribution breaks down by collection. But, without a good way to validate those assumptions, we found it was easy to look in the wrong places for problems. In our early days, we kept using a lot more memory than we thought we should be, but we were running blind when we tried to decide where the low-hanging fruit was to optimize. After plenty of frustrating optimizations that didn’t make much difference, we decided that we really needed better information, and MongoMem was born.

Usage

You can find MongoMem on GitHub.

Installation

  1. Download the code from GitHub
  2. pip install -r pip-requirements (it’s just argparse and pymongo; any version of either is probably fine)
  3. Run sudo python setup.py install (makes mongomem in /usr/local/bin, so you’ll probably need sudo here)

If you run into any trouble here, leave a comment or ping me at adam@wish.com. I’ve only tried installing this on a couple of machines here, so there could be problems I missed.

Usage

MongoMem is pretty simple to use. You have to run it on the same server as your mongod, since it needs to be able to read the mongo data files directly (so you may need to run it as root or as your mongodb user, depending on how your permissions are set up). It’s safe to run against a live production site (it just makes a few cheap syscalls and doesn’t actually touch data).

With that out of the way, usage is:

mongomem --dbpath DBPATH [--num NUM] [--directoryperdb] [--connection CONN]
  • DBPATH: path to your mongo data files (/var/lib/mongodb/ is mongo’s default location for this).
  • NUM: show stats for the top N collections (by current memory usage)
  • Add --directoryperdb if you’re using that option to start mongod.
  • CONN: pymongo connection string (“localhost” is the default, which should pretty much always work unless you’re running on a port other than 27017)

It’ll take up to a couple of minutes to run, depending on your data size, and then it’ll print a report of the top collections. Don’t worry if you see a few warnings about some lengths not being multiples of the page size; unless there are thousands of those warnings, they won’t really impact your results.

For each collection, it prints:

  • Number of MB in memory
  • Number of MB total
  • Percentage of the collection that’s in memory

How it Works

In theory, the problem isn’t that hard. MongoDB uses mmapped files, so to figure out what data is in memory (on Linux, anyway), a mincore call on each of the data files will tell you which pages are in the page cache. So, if you know which collection is in each page, you can easily count the number of pages in cache per collection. The only trick is figuring out which regions of each file map to which collections.

You can figure that out by parsing the namespace files or by traversing the data structures inside MongoDB, but both of those options are annoying if you want to stay in Python and not patch mongo itself. The validate command will give you the extent map, but it’s horribly impactful (and will touch the whole collection anyway), so it wasn’t an option. Thanks to a tip from Eliot over at 10gen, though, it turns out the collStats command has an undocumented option that gives us exactly what we need: if you add verbose: true to that command, it returns the full extent map for the collection. Armed with that, you can crank through and get all the data.
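To make that concrete, the core loop looks roughly like the sketch below; the in_core() helper stands in for a mincore wrapper and the extent field names are simplified, so this isn’t MongoMem’s actual code:

    # Rough sketch: pull the extent map from collStats with verbose=True, then
    # check which pages of each extent are resident in the page cache.
    import pymongo

    def collection_residency(db_name, coll_name, in_core):
        client = pymongo.MongoClient("localhost", 27017)
        stats = client[db_name].command("collstats", coll_name, verbose=True)

        resident_pages = total_pages = 0
        for extent in stats.get("extents", []):   # simplified field names
            # Each extent tells us which data file it lives in, plus its offset and
            # length; in_core(file, offset, length) returns per-page residency flags.
            flags = in_core(extent["file"], extent["offset"], extent["len"])
            resident_pages += sum(flags)
            total_pages += len(flags)
        return resident_pages, total_pages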

Future Work

One thing that I’d love to do with this but I haven’t spent enough time experimenting with is to pull these numbers continuously so I can plot them. I think it’d be really cool to see how these numbers change over time and also, within individual collections, how the memory usage changes. If you could measure the difference between consecutive snapshots with a sufficiently small period (possibly non-trivial since it takes around a minute or two to run on a large DB), you could get a plot of page faults by collection. Could be interesting to see where your faults are coming from (and how they change over time / in response to various events).

Another thing that I think would be cool is to break the data down further so we can see per-collection and per-index numbers (right now a collection in the tool counts as data + index). Sadly, there’s no command to get the extent map broken down by data and index, but if 10gen can add this feature, the tool could also give information about which indexes are memory hogs.

Thanks

Just want to give a shout out to Eliot over at 10gen for pointing me to collStats with verbose: true, which saved me a lot of trouble getting the data I needed for this, and to David Stainton for the Python mincore wrapper I needed to pull everything together.

-adam, adam@wish.com