Abstract
A significant number of investigations have been performed to develop and optimize cold plates for direct-to-chip cooling of processor packages. Many investigations have reported computational simulations using commercially available computational fluid dynamics tools that are compared to experimental data. Generally, the simulations and experimental data are in qualitative agreement but often not in quantitative agreement. Frequently, the experimental characterizations have high experimental uncertainty. In this study, extensive experimental evaluations are used to demonstrate the errors in experimental thermal measurements and the experimental artifacts during testing that lead to unacceptable inconsistency and uncertainty in the reported thermal resistance. By comparing experimental thermal data, such as the temperature at multiple positions on the processor lid, and using that data to extract a meaningful measure of thermal resistance, it is shown that the data uncertainty and inconsistency are primarily due to three factors: (1) inconsistency in the thermal boundary condition supplied by the thermal test vehicle (TTV) to the cold plate, (2) errors in the measurement and interpretation of the surface temperature of a solid surface, such as the heated lid surface, and (3) errors introduced by improper contact between the cold plate and the TTV. A standard thermal test vehicle (STTV) was engineered and used to provide reproducible thermal boundary conditions to the cold plate. An uncertainty analysis was performed to discriminate between the sources of inconsistency in the reporting of thermal resistance, including parameters such as mechanical load distribution and the methods for measuring the cold plate base and TTV surface temperatures. A critical analysis of the classical thermal resistance definition was performed to emphasize its shortcomings for evaluating the performance of a cold plate.
It is shown that the thermal resistance of cold plates based on heat exchanger theory better captures the physics of the heat transfer process when cold plates operate at high thermodynamic effectiveness.
1 Introduction
where the driving temperature difference is usually defined as the difference between the hot surface (or case) temperature Tc and the inlet coolant temperature Tin. This definition is a legacy metric from early models developed to predict the thermal resistance of low-density, finned air heat sinks. As shown by Moffat [6], this definition is useful only if the coolant temperature does not change through the cold plate, which requires an infinite flowrate. This condition is not satisfied in practice. Designing a cold plate with this definition of the thermal resistance may lead to design errors, especially because most correlations for the heat transfer coefficient use the log-mean temperature difference (LMTD) in defining the overall heat transfer coefficient [7]. Furthermore, the inlet-to-base temperature difference assumes a constant case temperature, which is not generally true for high-power electronic components [8,9]. When applying Eq. (1) to define R, there is ambiguity related to the definition of the surface temperature because neither the cold plate base nor the electronic component lid (or case) is isothermal. This lack of consistency is due to poor control of the boundary conditions rather than experimental measurement error.
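For reference, the inlet-based definition discussed above is commonly written as shown below. This is a sketch of the form implied by the text rather than a reproduction of the source's Eq. (1); the symbol $Q$ for the heat flow follows the Nomenclature, and the subscripts $c$ and $in$ denote the case and the coolant inlet.

```latex
R = \frac{T_c - T_{in}}{Q}
```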
Measuring the thermal resistance of a cold plate introduces sources of uncertainty that depend on whether it is defined in terms of the component case temperature or the cold plate base temperature. In the first case, the thermal resistance of the thermal interface material (TIM) is necessarily included, and a thorough uncertainty analysis must account for the repeatability of the TIM application procedure and the testing parameters, such as mechanical load, that may affect TIM performance. TIMs may be categorized as thermal greases, phase-change materials (PCMs), gels, thermal pads, and thermal pastes [10]. Based on these groups, the TIM material selection and the test conditions are crucial for achieving consistency and repeatability in a comparative study between cooling technologies. In an extensive review, Fletcher [11] found that increasing the joining pressure reduces the thermal resistance of thermal greases, and that the resistance is even lower on smooth surfaces. Despite the excellent thermal performance of grease, the author notes that greases are not reliable for applications where the operational temperatures are high or where long-term contact can cause vaporization or migration to other surfaces. Zhao et al. [12] studied parameters such as surface roughness, temperature, and pressure that affect the thermal resistance of thermal pads, PCMs, and low melting point alloys (LMPAs). The authors pointed out that PCM thermal resistance is not sensitive to roughness but is highly dependent on temperature and pressure; this is also true of LMPAs, whose thermal conductivity is even higher. Because LMPAs are electrically conductive and can corrode aluminum, they are not recommended for electronic cooling. For pads, the authors achieved good thermal performance when the loading pressure was significant, which might affect the mechanical integrity of the chip [13].
Ramakrishna and Prabhu [14] presented a detailed review of TIM challenges and the future requirements these materials must meet.
Experimentally measuring the case temperature with a point sensor introduces at least two unavoidable errors: the sensor error and the perturbation of the true surface temperature caused by the presence of the sensor. Moffat [15] refers to these as the Zeroth and the First Order uncertainties. Accurate measurement of the surface temperature of a solid is difficult. Azar [16] noted that when thermocouples are used, the wire should be 36 gauge and type-K because it gives the lowest conduction error. However, type-K thermocouples are challenging to solder to a highly conductive surface, such as copper. Type-T thermocouples can induce a high error because they are made of highly conductive material, but they are easier to attach to the surface by soldering or other methods. Kozarek [17] notes that experimentally obtaining the junction-to-case thermal resistance Rjc presents challenges due to the error in temperature measurement. He demonstrated that Rjc standards such as MIL-883C [18], JEDEC [19], and SEMI [20] introduce erroneous surface temperatures that affect the accuracy of Rjc. Therefore, Kozarek [17] developed an experimental technique to measure Rjc using a liquid cold plate as a heat sink, where the case temperature is measured by a FluorOptic temperature probe inserted in a through-hole in the center of the cold plate. Unfortunately, the same methodology cannot be applied to evaluate cold plate thermal performance because the probe is intrusive and will interfere with the flow distribution in the cold plate. The Open Compute Project [21] initiated an effort on cold plate development and qualification, but no recommendations or standards for experimental thermal measurements have been published.
This work aims to clarify and quantify the errors in reported thermal resistance introduced by (1) improper measurement procedures, (2) poor control of the experimental conditions, and (3) misinterpretation of the case temperature when it is measured with thermocouples. In particular, this study focuses on the sensitivity of the case temperature measurement to thermocouple placement procedures, with recommendations for best practices based on a standard thermal test vehicle (STTV) engineered to provide a controlled thermal boundary condition. A review of the various thermal resistance definitions is carried out to further illustrate the errors introduced in reporting thermal resistance based on experimental measurements due to ambiguities and assumptions in the definitions.
2 Experimental Methods
2.1 Experimental Apparatuses.
The flow loop shown in Fig. 1 was used to deliver coolant (25 vol. % Propylene Glycol) at a controlled pressure, temperature, and flowrate to the cold plates. The coolant is pumped by a positive displacement gear pump (Diener, Extreme 4000). Downstream of the reservoir, the coolant is pumped through a secondary plate heat exchanger, where its temperature is adjusted to reach the desired inlet temperature. After exiting the heat exchanger, the coolant flows through the electromagnetic flowmeter (IFM SM6004) and enters the test section. Downstream of the test section, the coolant passes through a primary heat exchanger, rejecting the heat to process chilled water and completing the loop by returning to the reservoir.
The standard thermal test vehicle (STTV) shown in Fig. 2(a) was made of copper and designed so that the three-dimensional effects produced by the change in geometry from the cylindrical cartridge heaters to the heat-flux-meter-bar (HFMB) do not persist into the HFMB. The idea is to provide a one-dimensional heat flux in the HFMB to the cold plate. To achieve this, four cartridge heaters were inserted into 15.88-mm diameter through holes machined in the base of the STTV. Each heater was 76.2 mm long and had a maximum capacity of 750 W. Thus, the STTV has a maximum capacity of 3000 W (120 W/cm2) when the heaters are connected in parallel. The HFMB, which corresponds to the upper block of the STTV, is 50 mm in height, and its cross section is 50 mm by 50 mm. Four duplex type-K thermocouples were installed vertically along the HFMB to measure the heat flux delivered to the cold plate. The individual thermocouple wires had a diameter of 0.3 mm and were inserted into 1.0 mm bore holes machined in the HFMB by electrical discharge machining at a spacing of 6.35 mm.
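The quoted capacity can be checked with a few lines of arithmetic. The sketch below assumes the heat flux is referenced to the 50 mm by 50 mm HFMB cross section; the variable names are illustrative and not part of the original work.

```python
# Nominal STTV capacity, assuming the flux is referenced to the
# 50 mm x 50 mm HFMB cross section (an assumption of this sketch).
N_HEATERS = 4
HEATER_POWER_W = 750.0
SIDE_CM = 5.0  # 50 mm expressed in cm

total_power_w = N_HEATERS * HEATER_POWER_W       # total heater power, W
area_cm2 = SIDE_CM * SIDE_CM                     # cross-sectional area, cm^2
max_flux_w_per_cm2 = total_power_w / area_cm2    # nominal maximum heat flux

print(f"{total_power_w:.0f} W over {area_cm2:.0f} cm^2 "
      f"-> {max_flux_w_per_cm2:.0f} W/cm^2")
```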
A numerical analysis was performed to ensure that the thermocouple placement matched the one-dimensional conduction zone necessary to measure the heat flux. The three-dimensional computational domain (Fig. 2(a)) was discretized using an unstructured mesh of 837,327 elements. ANSYS Fluent was used to solve the steady-state energy equation through a finite difference analysis. The boundary conditions listed in Table 1 correspond to the experimental conditions implemented to validate the thermal performance of the STTV. The validation experiment was performed by placing a commercially available cold plate of unknown internal geometry onto the STTV to maintain its heated area as isothermal as possible. The cartridge heaters were powered at 1000 W, and a volumetric coolant flowrate of 2 LPM was supplied to the cold plate. The inlet temperature of the coolant was set at 32 °C, giving a thermal resistance of 0.0198±0.0021 KW−1, whose corresponding unit thermal resistance is the value implemented as a boundary condition on the top surface of the STTV. Figure 2(b) shows the numerical results for the vertical temperature distribution along the centerline of the STTV, where a linear behavior can be observed within the region classified as HFMB. In contrast, the temperature distribution at the bottom of the device exhibits a nonlinear behavior due to the spreading thermal resistance resulting from changing the geometry of the heat source from cylindrical to cubical. The numerical results exhibited acceptable agreement with the experimental measurements of the centerline temperature, with a discrepancy within 7.5% between the average experimental and numerical values. The difference can be attributed to modeling the vertical surfaces of the STTV as perfectly insulated boundaries, which is not entirely true in the actual apparatus.
Additionally, inserting the thermocouples into the copper bar will perturb the surrounding region, locally reducing the temperature around the thermocouple tip, which was not considered in the numerical model. Despite the differences, the numerical and experimental results follow a linear temperature distribution, demonstrating one-dimensional heat conduction within the HFMB and validating its use as a device to experimentally determine the heat flux going in the direction of the cold plate.
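The conversion from the measured thermal resistance to the unit thermal resistance applied on the top surface can be sketched as follows. The heated area $A_H$ is assumed here to be the 50 mm by 50 mm HFMB cross section; the source does not state the area used, so the resulting value is illustrative rather than the entry of Table 1.

```latex
R'' = R\,A_H
    = \left(0.0198\ \mathrm{K\,W^{-1}}\right)\left(2.5\times10^{-3}\ \mathrm{m^{2}}\right)
    \approx 4.95\times10^{-5}\ \mathrm{K\,m^{2}\,W^{-1}}
```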
2.2 Experimental Procedure.
The test section of the flow loop (Fig. 3) includes the cold plate, the STTV, and the thermal interface material used to guarantee thermal contact between the pieces. A commercial microchannel cold plate (Cold plate S) previously documented in Ortega et al. [22] was used to evaluate thermal resistance measurements, with parametric variations in the thermocouple placement procedures. The cold plate is a side-in, side-out single-phase microchannel cold plate whose heat sink geometrical features are listed in Table 2. The cold plate was secured to the heated area using four spring-loaded bolts that applied a repeatable loading pressure. The springs were experimentally characterized by compression tests performed using an electromechanical testing machine (MTS Criterion model 41). From the results shown in Fig. 4, the average spring constant was 10.205±0.005 kN/m. Two loading pressure levels were considered for this study, whose magnitude was controlled by measuring the spring compression length. To complete the study regarding the influence of thermal contact on thermal characterization, Honeywell PTM 7950 and Arctic Silver grease MX-4 were used as TIMs. Although the manufacturers of these TIMs specify them for even higher loading pressures, loading pressure levels of 10 PSI and 15 PSI were chosen to avoid mechanical failure of the test section components and to safely observe the effect of pressure on the overall thermal resistance of the system.
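The relation between a target loading pressure and the spring compression length can be sketched as below. This is a minimal sketch, assuming the load is shared equally by four identical linear springs and that the pressure is referenced to the 50 mm by 50 mm heated area; neither assumption is stated in the source, and the function name is hypothetical.

```python
PSI_TO_PA = 6894.757             # 1 psi expressed in pascals
SPRING_K_N_PER_M = 10.205e3      # average spring constant reported in the text
N_SPRINGS = 4
CONTACT_AREA_M2 = 0.050 * 0.050  # assumed: 50 mm x 50 mm heated area


def spring_compression_mm(pressure_psi: float) -> float:
    """Compression per spring (mm) for a target mean contact pressure."""
    total_force_n = pressure_psi * PSI_TO_PA * CONTACT_AREA_M2
    force_per_spring_n = total_force_n / N_SPRINGS
    return force_per_spring_n / SPRING_K_N_PER_M * 1e3


for p in (10.0, 15.0):
    print(f"{p:.0f} psi -> {spring_compression_mm(p):.2f} mm per spring")
```

Under these assumptions, the compression scales linearly with the target pressure, so the two pressure levels differ by a factor of 1.5 in compression length.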
Parameter | Value |
---|---|
Channel width, mm | 0.2 |
Channel height, mm | 4 |
Channel length, mm | 43 |
Number of channels | 120 |
Fin thickness, mm | 0.2 |
The volumetric flowrate was varied from 1 LPM to 4 LPM. Under steady-state conditions, three inlet temperatures were tested: 22 °C, 32 °C, and 42 °C. To measure the temperatures at the inlet and the outlet of the test section, 3.2 mm diameter type-K thermocouple probes were inserted into the copper tubing connected to the cold plate. The temperature of the case (i.e., the top surface of the STTV, Fig. 2(a)) was measured with three butt-welded type-K thermocouples (0.08 mm diameter) inserted into grooves machined into the STTV surface. The distance between grooves was 15 mm in the flow direction. As can be seen in Fig. 5, these thermocouples were positioned diagonally with respect to the flow direction to account for the impact of the manifold on the channel flow distribution.
The influence of the thermocouple placement on the case temperature measurements was studied by setting two groove depth levels (0.46 mm and 1.09 mm) for a fixed width (0.53 mm), obtaining two aspect ratios: 0.45 and 1.05. Since the thermocouple is not large enough to fill the groove, a thermal adhesive compound was used as a filler to avoid air voids that might perturb the measured temperatures. The thermal adhesive compound (GENNEL G109) was chosen over epoxy and solder because it can be easily reworked and allows thermocouple replacement. The dimensions of the grooves were measured with a digital microscope (Celestron, 5 MP Digital Microscope Pro), and the images were processed in the open-source software ImageJ Fiji.
A KEYSIGHT N8762A DC power supply unit was used to power the cartridge heaters at the base of the TTV. A power of 1000 W applied to the base of the TTV was used for all cases. The thermocouples were connected to a data acquisition system (NI cDAQ-9174) referenced to an external ice bath.
3 Data Reduction
3.1 Thermal Resistance.
where the gradient is determined by curve-fitting the four thermocouple measurements on the HFMB. The thermal conductivity was considered constant with respect to temperature, using the value provided by Lees et al. [23] and published at the NIST Thermodynamics Research Center (382.836±38.2836 Wm−1K−1).
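The fit described above can be sketched in a few lines. This is a minimal illustration, not the authors' data reduction code: it assumes a straight-line least-squares fit of temperature against position, a coordinate `z` measured toward the cold plate, and hypothetical thermocouple readings and surface location. Only the conductivity value and the 6.35 mm spacing come from the text.

```python
import numpy as np

K_COPPER = 382.836  # W m^-1 K^-1, value quoted in the text


def hfmb_heat_flux(z_m, temps_c, z_surface_m):
    """Fit T(z) = a*z + b; return (heat flux in W/m^2, extrapolated surface T)."""
    slope, intercept = np.polyfit(z_m, temps_c, 1)
    heat_flux = -K_COPPER * slope                 # Fourier's law along +z
    t_surface = slope * z_surface_m + intercept   # linear extrapolation
    return heat_flux, t_surface


# Hypothetical readings at the 6.35 mm spacing; z increases toward the cold plate.
z = np.array([0.0, 6.35e-3, 12.70e-3, 19.05e-3])
temps = np.array([90.0, 83.4, 76.7, 70.1])        # illustrative values only
q, t_case = hfmb_heat_flux(z, temps, z_surface_m=25.0e-3)
print(f"q'' = {q / 1e4:.1f} W/cm^2, extrapolated case temperature = {t_case:.1f} degC")
```

The extrapolated surface value plays the role of the case temperature used in the Fourier-based resistance, which avoids placing a sensor on the heated surface.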
3.2 Heat Exchanger Analogy.
As shown by Ortega et al. [22], the effectiveness is a measure of the cold plate performance. The higher the NTU, the more effectively the mass flow is used for cooling.
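For orientation, the standard single-stream heat exchanger relations for an isothermal case temperature can be summarized as below. This is a sketch under stated assumptions and not a reproduction of the source's numbered equations: the symbols $\dot{m}$ (coolant mass flow rate), $c$ (coolant specific heat), and $\varepsilon$ (effectiveness) are introduced here and do not appear in the Nomenclature.

```latex
\begin{aligned}
\varepsilon &= \frac{T_{out} - T_{in}}{T_c - T_{in}}, &
\mathrm{NTU} &= \frac{1}{R\,\dot{m}\,c}, &
\varepsilon &= 1 - e^{-\mathrm{NTU}}, \\
\mathrm{LMTD} &= \frac{T_{out} - T_{in}}{\ln\!\left[\dfrac{T_c - T_{in}}{T_c - T_{out}}\right]}, &
R_{LMTD} &= \frac{\mathrm{LMTD}}{Q}
\end{aligned}
```

With $Q = \dot{m}\,c\,(T_{out} - T_{in})$, using $R_{LMTD}$ in the NTU definition gives $\mathrm{NTU} = -\ln(1 - \varepsilon)$, which is the exponential relation above.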
3.3 Uncertainty Analysis.
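A common way to combine the uncertainties listed in the Appendix is root-sum-square propagation through the sensitivity coefficients. The sketch below assumes that approach for the heat flow written as $Q = kA\,|dT/dz|$; the area (50 mm by 50 mm), the gradient implied by a 1000 W heat flow, and the regression standard error are assumptions of this example, and the function name is hypothetical.

```python
import math


def heat_flow_uncertainty(k, area, grad, u_k, u_area, u_grad):
    """Root-sum-square uncertainty of Q = k * A * |dT/dz|."""
    terms = (
        area * grad * u_k,   # (dQ/dk) * U_k
        k * grad * u_area,   # (dQ/dA) * U_A
        k * area * u_grad,   # (dQ/d(grad)) * U_grad
    )
    return math.sqrt(sum(t * t for t in terms))


k, u_k = 382.836, 38.2836           # W m^-1 K^-1, values quoted in the text
area, u_area = 2.5e-3, 3.5355e-8    # m^2; area assumed to be 50 mm x 50 mm
grad = 1000.0 / (k * area)          # K/m, gradient implied by Q = 1000 W
u_grad = 10.0                       # K/m, hypothetical regression standard error

u_q = heat_flow_uncertainty(k, area, grad, u_k, u_area, u_grad)
print(f"U_Q = {u_q:.1f} W ({u_q / 1000.0:.1%} of 1000 W)")
```

With these inputs the conductivity term dominates, since the quoted conductivity uncertainty is 10% of its value while the relative area uncertainty is several orders of magnitude smaller.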
4 Results and Discussion
4.1 Effect of Case Temperature Measurement Method.
Experiments were performed with a groove of aspect ratio 0.45 machined into the heated area of the STTV. For this case, the groove width was 0.53 mm and the depth was 0.46 mm. Results in Fig. 6(a) show that for a constant loading pressure, TIM, and inlet coolant temperature, the different definitions of the thermal resistance lead to an average discrepancy of 42.9%. This level of disagreement was observed between RFo and RAve. The STTV surface temperature found by extrapolation of the linear temperature profile is assumed to be the true case temperature. Placing thermocouples on the heated surface of the STTV introduces significant error in the case temperature measurements, which negatively impacts the determination of thermal resistance.
Comparing the average case temperature and the center case temperature, the results show an average discrepancy of 3.3%, which experimentally validates the isothermal surface assumption implicit in the one-dimensional conduction behavior in the HFMB. It can be concluded that the discrepancy is thus primarily due to the absolute value of the measured surface temperature and not to its distribution.
To investigate the impact of the depth of the thermocouple groove on the measured case temperature, deeper grooves were machined on the heated surface of the STTV, resulting in an aspect ratio of 1.05. In this case, the groove width was 0.53 mm and the depth was 1.09 mm. When tested under the same conditions as before, the thermal resistance of the cold plate/TIM system exhibited better agreement, as shown in Fig. 6(b). On average, the discrepancy between RFo and RAve was 9.0%. The improvement obtained with a deeper groove is most likely the result of compensating for the surface temperature depression observed in a shallow groove by embedding the thermocouple in a region that has a higher temperature. Nevertheless, the temperature measured in the deeper groove better represents the true surface temperature.
4.2 Numerical Simulation of Thermocouple Placement.
A numerical study was carried out to gain more insight into the effect of placing the thermocouple on the STTV heated surface. The two-dimensional computational domain, Fig. 7(a), was discretized using a nonuniform mesh of 9066 elements, setting a 1000 W heat flow on the bottom, axisymmetry on the left surface, an adiabatic condition on the right surface, and the boundary condition given in Table 1 for the top surface. As can be seen in Fig. 7(a), the thermocouple induces a temperature depression in the area surrounding the groove due to the lower thermal conductivity of the thermal adhesive compound (1.2 Wm−1K−1) and the abrupt change in geometry introduced by the groove.
Case temperature error, °C, as a function of groove aspect ratio and Biot number BiW

Groove aspect ratio | BiW = 214 | BiW = 21.4 | BiW = 2.14 | BiW = 0.214 | BiW = 0.0214 |
---|---|---|---|---|---|
0.375 | 8.88 | 7.45 | 3.39 | 0.89 | 0.07 |
0.500 | 3.57 | 3.38 | 2.13 | 0.54 | −0.01 |
0.625 | 1.80 | 1.75 | 1.24 | 0.28 | −0.06 |
0.750 | 0.81 | 0.80 | 0.62 | 0.10 | −0.11 |
0.875 | 0.17 | 0.19 | 0.19 | −0.04 | −0.15 |
1.000 | −0.23 | −0.20 | −0.10 | −0.15 | −0.19 |
1.125 | −0.52 | −0.49 | −0.31 | −0.23 | −0.22 |
1.250 | −0.73 | −0.68 | −0.47 | −0.30 | −0.26 |
1.375 | −0.87 | −0.83 | −0.58 | −0.35 | −0.29 |
As seen in Fig. 7(b), the temperature perturbation due to the thermocouple and the groove can be rectified by increasing the depth of the groove for all the Biot numbers, reaching an ideal aspect ratio for which the error goes to zero. The temperature measured with the ideal aspect ratio is a rectified value that better represents the true undisturbed case temperature. Furthermore, the optimal aspect ratio is independent of the Biot number, meaning that it is robust over a large variation in boundary conditions, which would represent the imposed cold plate resistance. This means that regardless of the type of cold plate, coolant flowrate, TIM, or thermal adhesive compound, the groove aspect ratio will rectify the case temperature measurement, which is clearly seen in Fig. 7(b). The error due to the groove depth is increasingly large for shallow grooves, and its magnitude depends on the Biot number.
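The tabulated errors can be summarized by the worst-case magnitude across Biot numbers at each groove geometry. The sketch below transcribes the table values and assumes that the first column of the table is the groove aspect ratio and that the entries are temperature errors in °C; the variable names are illustrative.

```python
import numpy as np

aspect_ratio = np.array([0.375, 0.500, 0.625, 0.750, 0.875,
                         1.000, 1.125, 1.250, 1.375])
biot = [214, 21.4, 2.14, 0.214, 0.0214]

# Rows follow aspect_ratio, columns follow biot (temperature error, degC).
error_c = np.array([
    [8.88, 7.45, 3.39, 0.89, 0.07],
    [3.57, 3.38, 2.13, 0.54, -0.01],
    [1.80, 1.75, 1.24, 0.28, -0.06],
    [0.81, 0.80, 0.62, 0.10, -0.11],
    [0.17, 0.19, 0.19, -0.04, -0.15],
    [-0.23, -0.20, -0.10, -0.15, -0.19],
    [-0.52, -0.49, -0.31, -0.23, -0.22],
    [-0.73, -0.68, -0.47, -0.30, -0.26],
    [-0.87, -0.83, -0.58, -0.35, -0.29],
])

# Largest error magnitude over all Biot numbers for each aspect ratio.
worst_case = np.abs(error_c).max(axis=1)
for ar, w in zip(aspect_ratio, worst_case):
    print(f"aspect ratio {ar:.3f}: max |error| = {w:.2f} degC")
```

In the tabulated data, the worst-case magnitude stays below 0.25 °C for aspect ratios of 0.875 and 1.000, compared with 8.88 °C for the shallowest groove.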
4.3 Effect of Thermal Interface Material, Loading Pressure, and Coolant Temperature.
To ensure thermal contact between the cold plate and the STTV, grease and phase change material (PTM) were used as TIMs. Their thermal resistance depends on the loading pressure, among other factors [24]. The sensitivity of the cold plate/TIM system thermal resistance to loading pressure is shown in Fig. 8(a), where thermal resistances are obtained from Eq. (5). Although both data groups follow the same trend, the thermal resistance at 10 PSI loading pressure is 34.2% higher than at 15 PSI. For both cases, the same amount of grease was applied using a serrated tool that removes the excess material, leaving stripes of constant cross section (2.4 mm × 2.4 mm) on the heated surface of the STTV.
The coolant inlet temperature did not have a significant influence on the measured thermal resistance, as seen in Fig. 8(b). In this range of temperatures, the temperature dependence of the PG-25 viscosity, density, and thermal conductivity is minor. The temperature dependence of the TIMs is also minor. The curves corresponding to the experiments performed with grease overlap each other, and the minor discrepancies can be explained by the tendency of the grease to flow due to its decreased viscosity at higher temperatures. The difference is 6.05% on average over the range of 22 °C to 42 °C. Similar results were found for the PTM, which had a 4.2% difference in that temperature range. Since both the grease and the PTM produce consistent results for the coolant inlet temperature range, a consistent characterization can be carried out using either grease or PTM. On average, the PTM leads to a 36.6% lower overall thermal resistance (Tables 4–9).
4.4 Cold Plate Effectiveness.
Following the approach of Ortega et al. [22], Moffat [25], and Webb [6], the thermal resistance was experimentally evaluated in a manner that is consistent with compact heat exchanger theory. As suggested by Webb [6], the LMTD (Eq. (10)) is used to define thermal resistance instead of the inlet temperature difference of the traditional model (Eq. (2)).
To demonstrate the consistency of the LMTD resistance definition and quantify the error introduced by using an erroneous thermal resistance definition, a blind test was performed on a second commercially available cold plate (cold plate M), whose geometrical parameters and flow configuration were unknown. Figure 9(a) shows that the thermal resistance calculated using the classical definition, RFo, is higher than that calculated using the LMTD definition, RLMTD. In fact, the difference reached a maximum of 46% for cold plate M and 24.2% for cold plate S, which occurred at higher magnitudes of the thermal resistance, where the case temperature and the coolant outlet temperature were higher. These results align with the trend depicted in Fig. 9(b), where the effectiveness is plotted against the NTU. As the effectiveness and the thermal resistance increase, the data based on RFo deviate from the theoretical model (Eq. (12)). Conversely, the data based on RLMTD consistently adhere to the theoretical model. The divergence occurs because using RFo to compute NTU underestimates the number of transfer units for a given cold plate, which can be interpreted as an artificial reduction of the heat transfer capacity of the cold plate itself. This is because the traditional definition of thermal resistance, RFo, does not account for the temperature rise in the flow.
The effectiveness of a cold plate depends on the temperature rise of the coolant between the inlet and outlet. Since the inlet temperature is constant, effectiveness will increase as the outlet temperature increases. Hence, it is expected that a thermal resistance that only considers the inlet temperature in its definition becomes less able to capture the physics of the heat transfer process at higher magnitudes of NTU and effectiveness, when the flow temperature rise is significant. In other words, the higher the NTU and the effectiveness of the cold plate (regardless of its flow configuration), the less meaningful the traditional thermal resistance based on (Tc − Tin) and the more critical it is to define the thermal resistance in terms of the LMTD for characterizing a given cold plate.
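The growing gap between the two definitions can be illustrated numerically. The sketch below assumes an isothermal case temperature and uses hypothetical operating points in which only the coolant outlet temperature changes; the function name and all numerical inputs are illustrative and are not taken from the experiments.

```python
import math


def resistances(t_case, t_in, t_out, q):
    """Return (R_Fo, R_LMTD, effectiveness) for an isothermal case at t_case."""
    r_fo = (t_case - t_in) / q                        # inlet-based definition
    lmtd = (t_out - t_in) / math.log((t_case - t_in) / (t_case - t_out))
    r_lmtd = lmtd / q                                 # LMTD-based definition
    effectiveness = (t_out - t_in) / (t_case - t_in)
    return r_fo, r_lmtd, effectiveness


# Hypothetical operating points (degC, W); only the outlet temperature varies.
for t_out in (34.0, 38.0, 42.0, 46.0):
    r_fo, r_lmtd, eps = resistances(t_case=52.0, t_in=32.0, t_out=t_out, q=1000.0)
    excess = (r_fo - r_lmtd) / r_lmtd
    print(f"effectiveness {eps:.2f}: R_Fo exceeds R_LMTD by {excess:.1%}")
```

Because the LMTD is always smaller than the inlet temperature difference when the coolant warms up, the inlet-based resistance is the larger of the two, and the relative excess grows with effectiveness in this sketch.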
5 Conclusions
An experimental procedure for the thermal characterization of single-phase cold plates was developed. The methodology considered different definitions of the thermal resistance commonly applied to cold plates and the sources of error that affect its magnitude.
A standard thermal test vehicle that delivers a measurable heat flux and a known case temperature to the cold plate was designed. The device allows a precise measurement of the case temperature without the need for thermocouples installed on the surface.
The method for installing a thermocouple on the case was experimentally and numerically studied. The case temperature measured with a thermocouple installed in a surface groove introduces a significant measurement uncertainty that depends on the Biot number and the groove aspect ratio.
Case temperature error is minimized for a groove aspect ratio of 1.0 for all cases. This “ideal” aspect ratio is independent of the Biot number. The Biot number encompasses the cold plate resistance and the thermal conductivity of the thermal adhesive compound.
Both PTM and grease were used to characterize the thermal resistance of the cold plate. While PTM exhibits lower thermal resistances than grease, both were unaffected by the coolant inlet temperature within the range studied. Conversely, the joining pressure shows a significant effect on the thermal resistance of the system.
A thermal resistance based on the LMTD was evaluated and compared with the traditional inlet temperature difference-based definition. The definition based on the inlet temperature may underestimate the true thermal performance of a given cold plate.
The thermal resistance based on the LMTD is consistent with the effectiveness-NTU theory of heat exchangers and therefore embodies known physical limits for cold plate performance.
Acknowledgment
Any opinions, findings, and conclusions, or recommendations expressed in this material are those of the author(s) and do not necessarily reflect the views of the National Science Foundation.
Funding Data
National Science Foundation Center for Energy Smart Electronic System (ES2) (Grant No. IIP 1738782; Funder ID: 10.13039/100000001).
Data Availability Statement
The datasets generated and supporting the findings of this article are obtainable from the corresponding author upon reasonable request.
Nomenclature
- A = area, m2
- H = height, m
- k = thermal conductivity, Wm−1K−1
- L = groove depth, m
- LMTD = logarithmic mean temperature difference, K
- NTU = number of transfer units
- Q = heat flow, W
- q″ = heat flux, Wm−2
- R = thermal resistance, KW−1
- R″ = unit thermal resistance, Km2W−1
- T = temperature, K
- U = uncertainty
- = overall heat transfer coefficient, Wm−2K−1
- W = groove width, m
- ΔT = temperature difference, K
Subscripts
- Ave = based on the average temperature
- b = bulk value
- c = case
- cp = cold plate
- Ct = based on the center temperature
- Fo = based on the extrapolated temperature following Fourier's Law
- G = groove
- g = thermal adhesive compound
- H = heated area of the STTV
- in = inlet
- jc = junction to case
- m = mean
- out = outlet
- s = surface
Appendix
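Sensitivity coefficient | Uncertainty |
---|---|
$\partial Q/\partial k = A\,(dT/dz)$ | Uk = 38.2836 Wm−1K−1 |
$\partial Q/\partial A = k\,(dT/dz)$ | UA = 3.5355 × 10−8 m2 |
$\partial Q/\partial (dT/dz) = kA$ | $U_{dT/dz}$ is the standard error corresponding to the linear regression applied to the HFMB temperature measurements |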