As processors, memories, and other components of today's embedded systems are pushed to higher performance in more enclosed spaces, processor thermal management is quickly becoming a limiting design factor. While previous proposals mostly approached this thermal management problem from circuit and architecture angles, software can also play an important role in identifying and eliminating thermal hotspots as it is the main factor that shapes the order and frequency of accesses to different hardware components in the chip. This is particularly true for compiler-scheduled Very Long Instruction Word (VLIW) datapaths. In this paper, we focus on a compiler-based approach to make the thermal profile more balanced in the integer functional units of VLIW architectures. For balanced thermal behavior and peak temperature minimization, we propose techniques based on load balancing across the integer functional units with or without rotation of functional unit usage. As leakage power is exponentially dependent on temperature and temperature is dependent on total power (i.e., switching and leakage), in our techniques, we also consider leakage power optimization by IPC tuning (instructions issued per cycle). By taking a code that is already scheduled for maximum performance as input, our scheduling strategies modify this performance-oriented schedule for balanced thermal behavior with negligible performance degradation. We simulate our scheduling strategies using a framework that consists of the Trimaran infrastructure, a power model, and the HotSpot. Our experimental results using several benchmark programs reveal that the peak temperature can be reduced through compiler scheduling.