Immersion cooling of servers is always fun, and it has evolved in the 20 or so years since I first saw it done with special 3M liquids costing $300 per gallon. In 2019, every enterprise trade show features a few servers cooled this way, despite the different infrastructure that data centers need in order to support it. To simplify adoption, TMGcore has developed fully self-contained, physically dense server containers. Not only that, but ‘OTTO’ is supposed to be better for the environment too.

The traditional picture we all have of a data center is racks upon racks of servers, perhaps arranged into hot and cold aisles for airflow, a big HVAC system, a ton of noise, and networking and power cables everywhere. Over the last few years, we have seen more and more efforts to bring the power efficiency of these data centers down to more reasonable numbers, and one of those methods has been two-phase immersion cooling: rather than using air, you submerge the whole server/rack in a liquid with a low boiling point and use the phase transition, along with convection, as your heat removal system.


A GIGABYTE Demo at CES 2017

Managing the infrastructure needs for two-phase immersion cooling is different from a traditional data center. There are the liquids, the heat exchangers, the power, the maintenance, and the fact that not a lot of people are used to having big expensive hardware dipped in what looks like water. This is why an immersion demonstration at a trade show usually draws a crowd: despite it appearing year after year, there are plenty of people who haven't seen it. TMGcore has solved most of these issues by removing the infrastructure and maintenance requirements completely.


60 kW Unit in 16 sq ft

The OTTO is a self-contained, automated, two-phase liquid immersion data center unit. All a data center needs to add is connections to power, network, and water lines. The family of products from TMGcore, built with partners, is designed so that once the hardware is installed, it doesn't need adjusting by the buyer. Units come in different sizes, and customers can scale their needs simply by ordering more units. Hardware hotswapping is done either locally or remotely by the internal system, energy is reused by the heat exchangers, and the typical ‘PUE’ metric that describes the power efficiency of a data center is only 1.028, compared to 1.05-1.06 for some of the most efficient air-cooled data centers. This means that for every megawatt of HPC compute, TMGcore claims that its OTTO systems draw only 1.028 megawatts of energy.
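As a sanity check, the PUE arithmetic is simple enough to sketch. This is a minimal illustration using the figures quoted above; the function names are my own:

```python
def pue(total_facility_kw: float, it_load_kw: float) -> float:
    """Power Usage Effectiveness: total facility power / IT equipment power."""
    return total_facility_kw / it_load_kw

def overhead_kw(it_load_kw: float, pue_value: float) -> float:
    """Non-IT power (cooling, distribution) implied by a given PUE figure."""
    return it_load_kw * (pue_value - 1.0)

# Comparing the quoted figures for a 1 MW IT load:
otto_overhead = overhead_kw(1000.0, 1.028)  # ~28 kW of overhead
air_overhead = overhead_kw(1000.0, 1.06)    # ~60 kW of overhead
print(otto_overhead, air_overhead)
```

Even at these already-low numbers, the claimed difference is roughly half the cooling and distribution overhead per megawatt of compute.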


Six units with stacking

Another claim from TMGcore is compute density: up to 3.75 kW per square foot. This means that the three main sizes of OTTO, at 60 kW, 120 kW, and 600 kW, come in self-contained footprints of 16 sq ft, 32 sq ft, and 160 sq ft respectively. The goal here is to provide compute capacity where space is at a premium. The units can also be stacked where required, or housed in portable containers where facilities exist. Customers with specific requirements can request custom builds.
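The density claim checks out against the quoted figures; a quick sketch (the dictionary layout and unit names are my own shorthand):

```python
# Checking the claimed density of 3.75 kW per square foot across the
# three quoted OTTO sizes: (power in kW, footprint in sq ft).
units = {"OTTO 60": (60, 16), "OTTO 120": (120, 32), "OTTO 600": (600, 160)}

for name, (kw, sqft) in units.items():
    print(f"{name}: {kw / sqft:.2f} kW/sq ft")
# Each size works out to exactly 3.75 kW/sq ft.
```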

Each unit is fitted with TMGcore's own blade infrastructure, aptly named the ‘OTTOblade’. One example blade the company provides is a dual-socket Intel Xeon system with dual 100G Ethernet, 512 GB of DRAM, eight SSDs, and 16 V100 GPUs, all within 6 kW. Ten of these can go into one of the 60 kW units, affording 160 V100 GPUs in 16 sq ft.
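Scaling that example blade up to a full unit is straightforward arithmetic; a small sketch using the figures quoted above:

```python
# Back-of-the-envelope totals for a 60 kW OTTO filled with the example
# OTTOblade configuration (per-blade figures from the article).
blade = {"power_kw": 6, "v100_gpus": 16, "dram_gb": 512, "ssds": 8}
blades_per_unit = 10  # 10 x 6 kW fills the 60 kW unit

totals = {key: value * blades_per_unit for key, value in blade.items()}
print(totals)
# {'power_kw': 60, 'v100_gpus': 160, 'dram_gb': 5120, 'ssds': 80}
```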

Obviously one of the key criticisms of self-contained, sealed, automated hardware is that when hardware fails, replacing it is a pain. One of the ideas behind two-phase cooling is that the temperature of the hardware can be closely monitored to extend its lifespan. As for other out-of-the-box failures, some can be managed by the automated systems, while others will require engineers on site. The idea is that because these units are far easier to manage, operational expenses should be severely reduced regardless.

TMGcore is working with partners for initial deployments, and we’re hoping to see one in action this week at Supercomputing. I have an open offer to visit the R&D facility next time I’m in Dallas.

Source: TMGcore



20 Comments


  • imaheadcase - Tuesday, November 19, 2019 - link

    So while this sounds like a neat way to go about it, the fact that you still need to MOVE the water or the servers is a failure point, unless it's some form of gravity-fed system. I don't know, seems kinda like a product Apple would make: self-contained, so only they know how to fix it, and they charge a lot for something that can be done cheaper.
  • Father Time - Tuesday, November 19, 2019 - link

    "Low boiling point" - the liquid boils when it gets hot, taking the heat away in its gaseous state - the heat is then transferred into the exchanger at the top, the gas turns back to liquid and re-enters the pool.

    No need for any active "moving" of water, just the need to make sure you don't create "pockets" in which gaseous coolant can get trapped instead of moving up to the exchanger at the top - actually a more difficult issue than it might first seem.
  • ads295 - Wednesday, November 20, 2019 - link

    I don't suppose latent heat of vaporization is of any significance then, here? I kept thinking about water evap coolers that cooled through the sole property of water to require so much heat to vaporise.
  • Lord of the Bored - Wednesday, November 20, 2019 - link

    It is. Latent heat of vaporization is relevant to all vaporization.

    All the energy being carried away by the gas will be the vaporization energy(with the fluid as a whole sitting at the boiling point, much as a boiling pot of water won't go over 212 no matter how high the fire is).
    It just happens at a much lower threshold here than it does with water(exact temperature to be determined), so it will maintain a constant computer-safe temperature.
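To put rough numbers on that point: assuming a fluorocarbon coolant with a latent heat of vaporization around 100 kJ/kg (an illustrative figure I have assumed; it is not from the article or from TMGcore), the boil-off rate needed to carry away a blade's heat can be estimated like so:

```python
# Mass of coolant that must boil per second to carry away a 6 kW
# blade's heat, given the fluid's latent heat of vaporization.
def boil_off_rate_kg_s(heat_w: float, latent_heat_j_kg: float) -> float:
    return heat_w / latent_heat_j_kg

fluoro = boil_off_rate_kg_s(6000, 100e3)   # assumed fluorocarbon: ~0.06 kg/s
water = boil_off_rate_kg_s(6000, 2257e3)   # water, for comparison: ~0.0027 kg/s
print(fluoro, water)
```

The low latent heat of fluorocarbons means far more mass boils per watt than water would, but in a sealed loop that vapor recondenses at the top and falls back into the pool, so no pumping of the dielectric itself is required.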
  • twotwotwo - Tuesday, November 19, 2019 - link

    I wonder how close you get to a useful thing if you spin some rack units around into a drawer that's 36" tall instead of deep, take out some heatsinks and fans, roughen heatspreader surfaces, pick reasonable components, then fill the thing up and put the condenser setup on top?

    I'd guess the answer is "not very close" but I'd be fascinated to know more about why.

    Also curious what happened with ZTE saying they were looking into cheaper fluids than Novec in the earlier story. Guessing, again, not much; optimistic claims are cheap but fancy hydrocarbon liquids are (sadly) not.
  • FunBunny2 - Tuesday, November 19, 2019 - link

    "The idea is that because these units are a lot easier to manage, operational expenses will be severely reduced regardless."

    I continue to wonder about that assertion. In the early days of the industrial revolution, millions of folks were made redundant by capital, thus gaining profit for the capitalist. Most production these days has so little labour in it, I don't see the basis for such an assertion. Just what is the dollar cost of labor per server per year?

    Other than people, what are the "operational expenses" to be reduced? That is, have to be paid with conventional data centre infrastructure? I just don't see it. A solution in search of problem.

    And, by the bye, the laws of thermodynamics demand that such an indirect method must be less efficient. Whether that inefficiency is compensated by other savings is another question.
  • Lord of the Bored - Wednesday, November 20, 2019 - link

    Indirect? It is pretty much the most direct cooling you get.
    Liquid in contact with hot components vaporizes, bubble rises to surface, carrying heat. Vapor cools, condenses, falls back down into tank.
  • FunBunny2 - Wednesday, November 20, 2019 - link

    it's indirect since, finally, the heat-absorbing liquid is cooled by air. the laws say that heat exchange is less than 100% efficient at each interface. now, one could use a geothermal engine, which might be more globally efficient. and, there's all that additional electricity needed to move the liquid around. the ultimate radiators will need just as much (the laws say more) air flow to dissipate the removed heat.

    beyond the laws, though, is how 'operational expenses' are reduced. no explanation yet offered.
  • destorofall - Wednesday, November 20, 2019 - link

    My brain hurts from trying to understand what you just typed... the dielectric fluid very very very likely boils at ~56°C. The condenser loop very very very likely runs water at or around 40-45°C; the heated water can then be pumped through a liq-air hx that needs little airflow at relatively mild ambient temperatures of 20-30°C to complete the circuit of heat exchange. Energy transfers passively to the boiling heat sinks -> fluid through evaporation -> condenser through condensation -> water loop pumped -> liq-air hx -> environment. So the pump and potentially the fans are added energy consumption devices, but the pump can be sized in a manner to use little energy (compared to overall IT power) at the needed mass flow rates, likewise with the fans.
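Taking those assumed loop temperatures at face value, the condenser water flow for a 60 kW unit follows from q = ṁ · cp · ΔT (all figures here are illustrative assumptions, not vendor numbers):

```python
# Condenser water flow needed to carry away a 60 kW unit's heat,
# using q = m_dot * cp * dT with an assumed 5 K rise in the water loop
# (e.g. water entering at 40 C and leaving at 45 C).
CP_WATER = 4186.0  # specific heat of water, J/(kg*K)

def water_flow_kg_s(heat_w: float, delta_t_k: float) -> float:
    return heat_w / (CP_WATER * delta_t_k)

flow = water_flow_kg_s(60_000, 5.0)
print(f"{flow:.2f} kg/s")  # about 2.87 kg/s
```

A flow on the order of a few kilograms per second is well within what a small, modestly powered pump can deliver, which is the crux of the argument that the pump and fan overhead stays tiny next to the IT load.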
  • PeachNCream - Thursday, November 21, 2019 - link

    The way this is advertised, it probably factors HVAC operating costs into conventional air cooling, which simply dumps waste heat into the data center. The HVAC is then responsible for removal of said heat. Citing thermodynamics alone ignores the physical placement of the system and the method by which cool air is provided to said system, along with a bunch of other factors that are more nuanced, unique to each data center, and unknown based on the information we have available.
