Diversity at Ginkgo

A lot of tech companies have issues with diversity. We don’t want to be like other tech companies.

There are countless studies that show that diverse teams perform better, that workplaces that are inclusive and welcoming of people of all genders, races, ages, abilities, and orientations are more productive and engaged, and that companies with more diverse leadership make more money. Fostering a diverse and inclusive environment is also just the right thing to do, and something that I’ve cared deeply about for a long time.

2016 was a year of incredible growth for our team—in the past 12 months we’ve nearly tripled in size. As we started to ramp up our hiring in the second half of 2015, however, we noticed a disturbing trend. Our gender ratio, which had long hovered around 27% women (not a great starting point to begin with), was beginning to fall. In the second half of 2015, we hired only 23% women.

We quickly recognized that if we continued to ignore diversity and kept along this trajectory, we would rapidly cement a skewed gender ratio that would have a lasting impact on our team and culture. The more we looked into it, the more we realized that early in a company’s growth is the right time to act on diversity: as a small team, a small number of hires can make a large impact on percentages. Not to mention that it would really suck to someday succeed at our mission to make biology easier to engineer but then wake up and realize we built just another technology company full of white dudes.

We also recognized how hard it is to make positive changes when it comes to diversity and inclusion in tech. Many companies make promises in good faith to improve diversity but fail to yield any subsequent quantifiable improvements. So we made the decision to focus specifically on reversing our gender diversity numbers in 2016 because we wanted to not just pay lip service to diversity but actually make meaningful change. It was a difficult decision but we felt an initial focus on one axis of diversity was important to increase our likelihood of success.

Around the same time as these Ginkgo team discussions on diversity, Barry and I happened to watch the movie Moneyball. Moneyball tells the story of Billy Beane and the 2002 Oakland Athletics baseball team. Faced with a very limited budget, Billy Beane has to figure out how to put together a winning baseball team. To do so, he relies on sabermetrics to identify very good players who are undervalued by the typical criteria used by other teams. The A’s go on to have a record-breaking 20-game winning streak during the 2002 season. (In fact, the Red Sox used a similar approach to break the Curse of the Bambino and win the 2004 World Series.)

In watching this movie, it was very apparent to me that the driving theme of the sabermetrics approach is to look for talent that is being systematically undervalued by the competition. As almost any study will tell you, women and underrepresented minorities hold fewer positions and make less money in our industry than other demographic groups. Since I believe that these female and minority candidates are every bit as smart and talented as their counterparts in other demographic groups, by definition this means that women and underrepresented minorities are being systematically undervalued in the marketplace. Thus if we can make Ginkgo a place that these folks want to be (say by identifying and addressing our own biases, creating an inclusive culture, and providing fair, equal pay), then we can create a competitive edge in attracting talent from these demographic groups. To be very clear, in our case this isn’t about fielding a team for less money or paying anybody less—it’s about hiring great people who are being unfairly passed over by other companies. Recruiting and retaining great talent is one of the key challenges any startup faces. Suddenly our commitment to diversity became not just about doing the right thing but also about creating a sustained competitive advantage for Ginkgo.

We still have a long way to go, but I want to share our statistics from 2016. Please note that these data are based on the Federal Equal Employment Opportunity reporting categories and are imperfect representations of the spectrum of diversity. Nevertheless, I think they paint a promising and optimistic picture about how our team is growing.

The first set of graphs below shows the change in gender diversity during 2016. The top line shows the relative percentage of men and women among new hires in each quarter. In the first quarter of 2016, even after we had begun discussing issues of diversity, our hires were only 13% women. This was extremely disappointing and made us redouble our efforts.

The second quarter of 2016 saw our biggest jump in hires, with 30 new people joining, 46.7% of whom were women. Now at the end of 2016, we are so pleased to see that it wasn’t a statistical fluke and that we’ve kept our gender diversity up for new hires in each quarter since then, up to 61.5% in the last three months, for a total of 44% during all of 2016.

graphs showing gender diversity at Ginkgo improving over the course of 2016

Altogether, this is slowly making an impact on our overall gender diversity. We started the year with 27% women, which dipped down to 23% by the end of the first quarter of 2016. At the time of this writing, we’ve reached 37.5% women and are headed in the right direction. Breaking it down further, the trend is generally upward across the company’s different teams and across levels, from interns to senior leadership (see third graph, below).

We’re encouraged by these statistics but we’re far from declaring success. We’re committed to expanding our efforts in 2017, to keep that upward trend until we reach 50% and to consider other aspects of diversity including age, sexual orientation, race, ethnicity, national origin, and educational background, among others. Here’s where we’re starting 2017: the graph below shows data for racial and ethnic diversity at Ginkgo at the end of 2015 vs. the end of 2016. We started the year with a team that was 67% White, 29% Asian, and 4% two or more races. By the end of 2016, our diversity was broader, but still heavily skewed—66% White, 26% Asian, 3% two or more races, 2% Hispanic or Latino, 3% Black or African American, and 1% Native American or Alaskan Native. We have a lot more work to do here next year.

graphs showing breakdown of ethnic diversity at Ginkgo in 2015 and 2016

And finally, here is where we start 2017 looking at race and gender together for the team as a whole, for technical vs. non-technical positions (non-technical is defined as management, business development, marketing, and operations, most of whom have a strong technical background), for leadership (founders, managers, team leads, and program leads), and for padawans, our one-year paid interns. The numbers are still relatively small so it’s difficult to draw too much from these overall, but I’m pleased to see that our ratio of women in technical (35.8%) and leadership positions (34.5%) is close to our overall percentage (37.5%). Moving forward, it’s also important to track the many intersecting forces that have kept women and minorities excluded from technology fields over time.

Diversity statistics by position and rank

So what did we do to reverse the trend on gender diversity? How did we make such a dramatic difference so quickly? We started by talking. We held open discussions where team members could candidly discuss issues of diversity and inclusion that were important to them. Importantly, our leadership attended those meetings and made a strong commitment to hiring and retaining a diverse team.

After those first meetings we started to pursue a range of concrete strategies, particularly:

  • look beyond and expand our own networks to focus on actively recruiting from more diverse pools
  • take a close look at and rewrite our job postings to be more inclusive
  • make sure for each opening to bring in a diverse set of people for on-site interviews
  • work towards a culture of inclusivity that makes Ginkgo an awesome place to work for everyone
  • continually check in to see how we’re doing and to keep diversity and inclusion an active project for all of us

We’re always challenging ourselves to raise the bar and build the best team to achieve our mission to make biology easier to engineer. We hope to do even better in 2017.

Posted By: Reshma Shetty

The latency of gene synthesis – 2013 update

It’s been a while so I thought it was about time to do an updated post on observed turn times for commercial gene synthesis.  After all, Rob Carlson posted updated cost and productivity curves for DNA sequencing and synthesis.  As in my original post on this topic, I plot turn time vs length for orders placed at Ginkgo.  The y-axis is turn time in days from when an order is initiated by Ginkgo to the day the company ships DNA back to us.  The x-axis is the length of the synthetic DNA (synthon) in base pairs.  Based on a request in the comments, I’ve also scaled the size of the data marker by how long ago the synthon order was placed/delivered to see if there are any observable trends in turn time over time (I don’t see any but it does show how we’ve tried out different providers over time 😉 ).  The data has all the same caveats as last time – so for convenience, I’ve re-listed them at the bottom of this post.

Plot of gene synthesis turn times by vendor

I suspect that one of the most valuable aspects of this data, together with that from Rob, is that it shows how imperfect our benchmarks for the gene synthesis industry really are.  For example, we don’t have a good metric for assessing companies on both cost and turn time.  DNA2.0 has better turn times than Genscript but that comes at the expense of a 2X price premium – which company’s service is “better”?  Selecting and then celebrating the right benchmarks is important because that’s where the industry will place its resources to improve the underlying technology.  To date, the industry has largely been driven to reduce the per base pair cost of gene synthesis because cost per bp is the de facto comparable.  This metric has pushed synthesis companies away from standardizing and competing on “library” offerings for sets of rationally designed synthons (many companies offer this but you have to get a custom quote which has a high transaction cost associated with it).  Given where I suspect the engineering of organisms is going, making it even modestly more difficult to order libraries is probably detrimental to the field.

To help combat this problem, I’d love to see the development of a true benchmark test for the gene synthesis industry – i.e. a set of synthons spanning different lengths, GC contents, sequence complexities, etc. gets designed and the orders are placed simultaneously at all vendors so that there can be a true side-by-side comparison of the performance of all providers.
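To sketch what assembling such a panel might involve, here’s a minimal illustration of characterizing candidate benchmark synthons by length and GC content (the sequences and panel names below are made up for illustration, not a real benchmark set):

```python
def gc_content(seq):
    """Fraction of G and C bases in a DNA sequence."""
    seq = seq.upper()
    return (seq.count("G") + seq.count("C")) / len(seq)

# Hypothetical candidates spanning different lengths and GC contents.
panel = {
    "short_balanced": "ATGC" * 25,   # 100 bp, 50% GC
    "mid_at_rich": "AATT" * 125,     # 500 bp, 0% GC
    "long_gc_rich": "GGCC" * 250,    # 1000 bp, 100% GC
}

for name, seq in panel.items():
    print(name, len(seq), round(gc_content(seq), 2))
```

A real benchmark set would also need to vary sequence complexity (repeats, hairpins, homopolymer runs), which is harder to capture in a one-line metric.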

Probably the biggest update in the world of commercial gene synthesis over the last year and a half is that multiple companies now have linear gene synthesis offerings.  In no particular order, IDT offers gBlocks up to 750 bp for $139, Gen9 offers GeneBytes up to 3000 bp in length (they charge per bp but their rates aren’t posted online), and Life Technologies offers Strings up to 1000 bp in length for $149.  With these offerings, gene synthesis companies have finally been able to break through the ~$0.30-$0.35/bp floor that they’d been stagnating at from 2008-2012 by skipping the cloning and sequencing steps and shifting that technical risk onto their customers.  The hope would be that these offerings could also result in lower turn times, but the jury’s still out.


Caveats to the data:

  • Orders at the providers weren’t placed contemporaneously and we didn’t place orders for the same synthon at multiple providers.
  • We design all our synthons using in-house software – so the measured turn times reflect only the ability of providers to build our synthons and not the quality of their gene design tools.
  • By design, nearly all of our synthons pass all provider sequence checks and qualify as “low complexity” sequences from a synthesis standpoint.
  • The synthons generally don’t result in protein expression, which might adversely impact clonability by providers.
  • To calculate turn time, I “start the clock” when we input the order to the provider website or send it to their sales rep and I “stop the clock” on the day the provider ships the gene back to me.  So if there is a delay in order processing by the provider or rep, that counts against their turn time.
  • There are a handful of synthons that providers couldn’t synthesize and/or clone.  The 500 bp/80 day outlier for Blue Heron is one but there are also three others from IDT (unmarked).  For those failures, I “stopped the clock” when the provider emailed me to say that they couldn’t make the synthon and weren’t going to try any longer.
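As a concrete illustration of these clock rules, here’s a minimal sketch of the turn-time arithmetic (the vendor names and dates are made up; in practice this comes from our order records):

```python
from datetime import date

def turn_time_days(placed, shipped):
    # The clock starts the day the order is input to the provider (or sent
    # to their rep) and stops the day the provider ships the DNA back, so
    # any order-processing delay counts against the provider's turn time.
    return (shipped - placed).days

# Hypothetical orders: (vendor, synthon length in bp, placed, shipped).
orders = [
    ("VendorA", 450, date(2013, 3, 4), date(2013, 3, 25)),
    ("VendorB", 1500, date(2013, 3, 4), date(2013, 4, 15)),
]

for vendor, length_bp, placed, shipped in orders:
    print(vendor, length_bp, turn_time_days(placed, shipped), "days")
```

For a failed synthon, the same arithmetic applies with the ship date replaced by the day the provider declared they’d stopped trying.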

Posted By: Reshma Shetty

The lag phase of commercial gene synthesis

From synthetic biology’s earliest days, DNA synthesis, and more specifically gene synthesis, has been touted as the central, enabling technology of the field.  Gene synthesis is part of what lets us make the leap from the ad hoc cut-and-paste of genetic engineering to the systematic design that is [or will be] the hallmark of synthetic biology.  Given its central importance, it’s not surprising that many of us in the field keep a close eye on both gene synthesis technology and the gene synthesis industry as a whole.  Yet most of the discussion focuses just on the cost of gene synthesis.  Cost is important.  But I’d argue that turn times are equally important in terms of how gene synthesis is used in the field – more on this below.

Below is a chart summarizing turn times versus length for DNA orders at Ginkgo.  On the y-axis is the turnaround time (in days) and on the x-axis is the length of the synthetic DNA (or synthon) in base pairs.  [I refer to all synthesized DNA fragments as synthons rather than genes since we don’t only synthesize genes.] Data points are colored by gene synthesis provider.

First, a few caveats:

  • Orders at the three providers weren’t placed contemporaneously and we didn’t place orders for the same synthon at multiple providers.  A more controlled experiment would of course be to place orders for a large set of identical synthons at multiple providers simultaneously and compare turn times.  We didn’t do that.
  • We design all our synthons using in-house software – so the measured turn times reflect only the ability of providers to build our synthons and not the quality of their gene design tools.
  • By design, nearly all of our synthons pass all provider sequence checks and qualify as “low complexity” sequences from a synthesis standpoint.
  • Almost none of the cloned synthons result in protein expression, which might adversely impact clonability by providers.
  • To calculate turn time, I “start the clock” when we input the order to the provider website or send it to their sales rep and I “stop the clock” on the day the provider ships the gene back to me.  So if there is a delay in order processing by the provider or rep, that counts against their turn time.
  • We rarely pay the 2X+ price premium to synthesize genes at DNA2.0 – hence the limited number of datapoints for them.
  • There are a handful of synthons that providers couldn’t synthesize and/or clone.  The 500 bp/80 day outlier for Blue Heron is one but there are also three others from IDT (unmarked).  For those failures, I “stopped the clock” when the provider emailed me to say that they couldn’t make the synthon and weren’t going to try any longer.

OK caveats finished.  What can we infer from this chart?  Here are two takeaways.

Lesson 1: Gene synthesis can’t be a part of the design-build-test loop until turn times improve dramatically

Based on our data, turn times are highly variable and show little to no correlation with length overall.  This means that when you place an order, you have no idea if you’ll get it back in 2 weeks or 5 weeks.  From an engineering process standpoint, I’d argue that the unpredictably long turn times mean that it is crazy to include outsourced commercial gene synthesis in the design-build-test loop as you try to engineer an organism.  Instead, do gene synthesis orders up front as a batch (thereby hopefully eliminating gene synthesis from your cycle time) and then mix and match the synthesized parts via a DNA assembly technology with a faster turn time.  Or alternatively, try to achieve faster turn times by doing gene synthesis in house from oligos.

Lesson 2: Different providers specialize in different types of orders

IDT appears to be quite fast at making sub-500bp synthons.  This is not too surprising given that IDT leverages their ultramer oligo synthesis tech to offer flat rate pricing on so-called minigenes (< 400bp synthons).  At that length scale, you can also opt to stitch together oligos to make the part yourself.  But between the costs of oligos, cloning reagents, sequencing and your own time, you might not do much better than the cost of a minigene (even factoring in cheap grad student/postdoc labor!).  For synthons in the 500-1500 bp range, Blue Heron seems to be a reasonable compromise choice in terms of turn times versus costs.  You get industry competitive pricing with decent turn times.   Overall, DNA2.0 appears to have the best turn times for >1 kb synthons.  Admittedly this is based on a very limited sample size but anecdotal rumors from folks in the field back it up.  So if you’re in a rush and can tolerate the 2X price difference, DNA2.0 could be the way to go.

I’ll close by saying that this post is in no way an attempt to rag on gene synthesis providers.  Building DNA is tough.  And building DNA for customers is even tougher.  But it’s important to think hard about what the realities of costs and turn times of commercial gene synthesis mean for developing best practices for engineering organisms going forward.

Posted By: Reshma Shetty

Cambridge city council hearings on recombinant DNA technology

Jason Bobe pointed out on the DIYbio mailing list that video is now available online from the 1976 Cambridge, MA city council hearings on recombinant DNA technology.


I saw this video a few years ago on VHS and was fascinated by the obvious parallels between the debates then on recombinant DNA and the debates today around synthetic biology. It’s great to see this piece of history now online for all to see.

Posted By: Reshma Shetty

Ginkgo organism engineers and the pipe

In a previous post, I discussed different platforms for organism engineering that were presented at SB5.0.

Here I’ll try to give a high-level picture of Ginkgo’s pipeline for organism engineering. If you’ve checked out our webpage, you’ll see that we have several different organism engineering projects happening at Ginkgo that span several different hosts. Our goal was to build out a single shared pipeline that could support the engineering of all these very different organisms for very different purposes. To accomplish this goal, we deliberately opted to decouple design from fabrication. Ginkgo organism engineers place requests via our CAD/CAM/LIMS software system. Those requests are then batched and run on Ginkgo’s robots.

Ginkgo's organism engineering pipeline
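To make the request-batching idea concrete, here’s a deliberately simplified sketch (this is not our actual LIMS data model; all names and the fixed-size batching policy are illustrative only):

```python
from dataclasses import dataclass, field
from typing import List

@dataclass
class Request:
    # A unit of work placed by an organism engineer, e.g. "assemble these parts".
    requester: str
    operation: str

@dataclass
class Run:
    # A batch of requests executed together on the robots.
    requests: List[Request] = field(default_factory=list)

def batch_into_runs(requests, batch_size):
    # Group pending requests into fixed-size batches for automated execution.
    return [Run(requests[i:i + batch_size])
            for i in range(0, len(requests), batch_size)]

pending = [Request(f"engineer{i % 3}", "assembly") for i in range(10)]
runs = batch_into_runs(pending, batch_size=4)
print(len(runs))  # 10 requests in batches of 4 -> 3 runs
```

The point of the sketch is the decoupling: engineers only ever create Request objects, while the pipeline decides how to group them into Runs.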

By decoupling design from fabrication and pushing construction and testing through a shared, automated pipeline, we’ve been able to achieve a level of productivity that would have been unattainable if we used conventional, manual molecular biology approaches. Below we show a plot of requests (placed either by Ginkgo organism engineers or by other pipeline processes), samples (physical objects containing DNA/strains/reagents), molecules (abstract objects corresponding to unique DNA sequences including but not limited to standardized parts), and runs (batches of multiple requests that have been completed via the Ginkgo pipe).

Ginkgo's pipe drives productivity

Hence, Ginkgo organism engineers are free to focus on design and analysis of novel organisms rather than mindless pipetting operations better done by robots than PhDs. We’re building a team of organism engineers each of whom

  • cares passionately about engineering organisms
  • has a strong opinion about how to engineer organisms
  • understands that their opinion is just that – an opinion – and that the only way to develop general design principles for organism engineering is to design a bunch of organisms successfully

If you seek to be one of the best organism engineers on the planet and don’t want to be limited in the complexity of the organisms that you engineer by how fast you pipet, you should come talk to us. See the website for details.

Posted By: Reshma Shetty

Emerging platforms for organism engineering

Synthetic biology 5.0 wrapped up a couple weeks ago and attending the conference reinforced for me that the field has developed sufficiently over the past few years that we are now seeing different platforms and/or schools of thought in how to engineer organisms start to coalesce.

Chris Voigt gave a nice talk about how his lab harvests large, functional operons from nature, like the nitrogen fixation gene cluster, and goes through a process of refactoring to standardize the control and gene expression elements in order to gain complete control over the pathway. (The refactoring approach was first pioneered by Drew, Sri and Leon.) Unfortunately, refactoring currently appears to lead to a gene cluster that has less activity than what nature provided, though there is less concern over unknown or cryptic biology. Interestingly, Chris says that in all his lab’s refactoring efforts (which involved several years of design-construction-debugging by Karsten), they never really discovered new or interesting biology but rather tended to get tripped up by errors in the sequence databases or incorrectly annotated start sites for genes.

John Glass and Dan Gibson both gave talks about genome synthesis and genome upload technologies that came out of JCVI (see PMIDs 17600181, 18218864, 19073939, 19363495, 19696314, 20211840, 20488990, 20935651). The JCVI/Synthetic Genomics platform might be thought of as combining (meta)genome sequencing and genome synthesis to make organisms from scratch.

Zach Serber discussed the Automated Strain Engineering (ASE) platform at Amyris. They are able to build 1500 yeast strains start to finish in 3 weeks (though they do pipeline their process). They have a library of 12,000 parts which they draw from to make up to 6 gene constructs using sewing PCR and then integrate into yeast. He didn’t go into their assay platforms but briefly mentioned that they do a combination of high-throughput screening and ‘omics analysis.

Doug Densmore, Jake Beal and Ron Weiss are working on Bio-Design Automation: namely, the ability to translate a high-level functional specification to successively lower abstraction levels (i.e. devices, parts etc.) until you get the actual DNA sequence that you then construct using automated DNA assembly.

And while it wasn’t presented in detail in a talk at SB5.0, Jef Boeke, Jean Peccoud and collaborators are developing a platform for yeast chromosome redesign. Finally, of course there is the Tom Knight/iGEM/Registry of Standard Biological Parts approach to synthetic biology which inspired aspects of many of the above platforms.

I imagine that at least a fraction of the would-be biological engineers out there might find the platform or tools aspects of synthetic biology mundane and prefer to focus on the organisms that they can design and build. But I’d argue that every synthetic biologist should care deeply about what the platforms look like. There’s a better than even chance that the future of synthetic biology lies in decoupling design from fabrication and testing. If so, the organism engineers in the future will submit their designs to centralized facilities where designs get batched, fabricated on robots and then [maybe] undergo a preliminary analysis. Hence, the platforms that get designed today are going to dictate the design constraints to which organism engineers will be forced to adhere tomorrow. Over time, the relative merits of each platform’s design constraints will be judged based on the complexity and utility of the engineered organisms that they produce.

Given all the above, you might be asking, what exactly is Ginkgo’s platform for organism engineering? I’ll cover that in a subsequent post …

Posted By: Reshma Shetty

Pilgrimage to Taq polymerase’s roots

Our friend Pete Carr, who’s at MIT’s Media Lab, came by for a visit last week. Apparently, Pete used to work for Cetus and through his connections there managed to pinpoint the exact location in Yellowstone National Park of Mushroom Pool, where Thomas Brock first discovered Thermus aquaticus in 1966. The thermophile Thermus aquaticus is of course the source of the heat-tolerant Taq polymerase which was key to making PCR work robustly and easily.

On his recent trip to Yellowstone, Pete’s family waited patiently while Pete made his pilgrimage to this key place in biology and biological engineering’s history. Below are a couple photos that Pete had from his trip at Mushroom Pool. (Many thanks to Pete for letting us post them here.)

Pete Carr at Mushroom Pool in Yellowstone National Park

Mushroom Pool at Yellowstone National Park

Posted By: Reshma Shetty