Transient and Steady Experimental and Numerical Thermal Studies towards Energy Efficiency Improvements in Data Centers
[Thesis]
Tradat, Mohammad
Sammakia, Bahgat
State University of New York at Binghamton
2019
173 p.
Ph.D.
Data centers have become an essential part of modern life. Cloud processing, storage, and the growing global transfer of information have placed a demand on data centers to expand in both number and size. To ensure the functionality and reliability of IT equipment (servers, network switches, and storage devices), a data center's rooms must be maintained within the recommended environmental conditions (temperature and humidity) that follow the vendor's specifications and ASHRAE thermal guidelines. Therefore, cooling and energy efficiency are critical to the successful operation of modern large data centers. In addition, the energy consumption of cooling has rapidly increased and has become a major concern for data center operators and managers. The trend is to maximize data center utilization while reducing the energy consumption of all active and passive components by using the cooling system more efficiently. This necessitates practical measurement tools and predictive modeling to capture the temperature, pressure, and complex flow fields of cooling and heating sources. Considering this, today's data centers increasingly rely on environmental data collection and analysis, which are needed to operate the cooling infrastructure as efficiently as possible in order to maintain the reliability of IT equipment. The work presented in this dissertation focuses on three main aspects of efficient cooling system energy consumption in data centers. The first section experimentally emphasizes the importance of the quality of the collected environmental data and its relevance to the overall operation of the data center by analyzing and comparing two different environmental monitoring methods. The comparison considers the quality and relevance of the collected data and investigates their effect on key performance and operational metrics. The results provided by the IPMI interface have shown a large variation of server inlet temperatures.
On the other hand, the discrete sensor measurements have shown much more reliable results, with minimal variation in server inlet temperatures inside the cold aisle. These results highlight the potential difficulty in using IPMI inlet temperature data to evaluate the thermal environment inside the contained cold aisle. Furthermore, this section focuses on how common methods for managing cooling efficiency can be affected by the approach to data collection. Results have shown that using preheated IPMI inlet temperature data can lead to unnecessarily low cooling set points, which in turn reduces the potential cooling energy savings. One case shows that using discrete sensor data provides 20% more energy savings than using IPMI inlet temperature data. In addition, a steady-state experimental analysis of the effect of elevated temperature in a data center is provided. The results show that the IT equipment fans ramp up at a supply air temperature (SAT) of 24℃, which for this configuration results in an airflow demand that is greater than the available supply, causing negative pressure inside containment. Hence, containment allows raising the SAT up to 24℃ without any negative impact on the IT equipment airflow demand (neutral pressure maintained). This section may help data center operators decide which monitoring or control scheme to use. The second section of the presented work addresses thermal and IT equipment performance issues during power outages and establishes guidelines for implementing pressure relief mechanisms in a contained environment during blower failure. The impact of pressure relief on IT availability and an experiment-based analysis of the Ride Through Time (RTT) of servers inside containment during blower failure are presented. The results show that for all three classes of servers tested, pressure relief is not required. On the contrary, during blower failure, cold aisle containment (CAC) helps keep the servers cooler for longer.
Containment provides a barrier between the hot and cold air streams and causes negative pressure to build up, which allows the servers to pull cold air from the underfloor plenum. The data further illustrated that the servers could pull air from the plenum through the cooling unit, taking advantage of its inherent cooling storage capability, which is due to the thermal mass of its various components (heat-exchanger coils, cold water, etc.), thus providing a longer RTT. Finally, an experimental and numerical characterization is presented of the backward-curved, independently driven blowers (EC plug fans) installed at the ES2 data center laboratory, using practical measurement methods, including CRAH and perforated tile airflow measurements. A full physics-based CFD model is created using the Future Facilities 6SigmaRoom CFD tool to predict the measured flow and pressure fields induced by backward- and forward-curved blowers. The parameters and sensitivity of the baseline model are investigated and considered. Measurements are taken for the EC plug fan cooling unit (i.e., CRAH2) for four experimental flow constraint scenarios, which are applied at the room level. The calibrated model is then used to predict the measured data, and the results show good agreement, with a maximum mismatch of 8%. A numerical investigation is performed to examine the positive impact of selectively placed obstructions (purpose-built air directors), referred to as partitions. A quantitative and qualitative analysis is conducted for the underfloor plenum pressure field, perforated tile airflow rate, and rack inlet temperature with and without partitions. This was done using two CFD models built with the Future Facilities 6SigmaRoom CFD tool. First, a simple data center model has been used to quantify the partitions' benefits for two different systems: hot aisle containment (HAC) and an open configuration.
Second, the investigation is expanded using a physics-based, experimentally validated CFD model of a medium-size data center (with a more complicated geometry) to compare the different types of proposed partitions. The results of both models have shown that partition type I (partition height of 2/3 of the plenum depth, measured from the subfloor) eliminates vortices in the underfloor plenum, allowing for a uniform pressure differential across the perforated tiles, which drives more uniform airflow rates. In addition, the impact of the proposed partitions on rack inlet temperature is reported through a comparison of open versus hot aisle containment. The results show that the partitions have a minor effect on the rack inlet temperature for the HAC system. However, the partitions significantly improve the tile flow rate. For the open system, the partitions improve the tile airflow rate and rack inlet temperature, thus improving overall data center performance by eliminating the formation of hot spots at the computer rack inlet. A key scientific contribution of this dissertation is the quantitative and qualitative comparison of two different methods of environmental monitoring. The discrepancy in the server inlet air temperature measured by discrete sensors versus IPMI internal sensors has been found to be caused by internal hot air recirculation, which is strongly related to the server's internal design. Furthermore, the impact of cold aisle containment pressure relief on IT availability is addressed for the first time in the literature. Finally, this work also provides general guidelines for using Novel Underfloor Air-Directors (deliberately placed partitions) and illustrates their positive effect on the hydrodynamic and thermal performance of the data center.