Collaborative Filtering (CF) is a core task in recommender systems: predicting a user's preferences from the behaviors and preferences of similar users. Traditional CF techniques have achieved significant success by learning low-dimensional vector representations of users and items from historical interaction data, and recent approaches leverage graph neural networks (GNNs) to better capture collaborative signals from the topology of the user-item interaction graph. Although GNN-based methods such as LightGCN and UltraGCN have advanced the field, they often overlook heterophilic patterns, in which users engage with items from diverse categories. In addition, stacking multiple GNN layers leads to over-smoothing, which limits the ability to capture high-order interactions. In this work, we propose WaveHDNN, a novel wavelet-based hypergraph diffusion framework designed to address both challenges. The model fuses two channels: a Heterophily-aware Collaborative Encoder, which adapts to heterophilic patterns, and a Multi-scale Group-wise Structure Encoder, which uses wavelet transforms for flexible, localized structure learning. We also employ cross-view contrastive learning to keep embeddings consistent across views. Extensive experiments on popular recommendation datasets demonstrate the superior performance of WaveHDNN and highlight its ability to capture both heterophilic patterns and localized topological information.
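To make the cross-view contrastive objective concrete, the following is a minimal sketch of an InfoNCE-style loss between two sets of embeddings. The abstract does not specify the loss used in WaveHDNN, so the InfoNCE form, the function names `normalize` and `info_nce`, the temperature of 0.2, and the random toy embeddings are all assumptions made for illustration. Row i of each view is treated as the same user or item seen through a different encoder, and all other rows serve as negatives.

```python
import math
import random


def normalize(vec):
    """Scale a vector to unit length (left unchanged if it is all zeros)."""
    norm = math.sqrt(sum(x * x for x in vec)) or 1.0
    return [x / norm for x in vec]


def info_nce(view_a, view_b, temperature=0.2):
    """InfoNCE-style loss (an assumed form, not taken from the paper).

    view_a[i] and view_b[i] are embeddings of the same node from two views;
    every other row of view_b acts as a negative for view_a[i].
    """
    a = [normalize(v) for v in view_a]
    b = [normalize(v) for v in view_b]
    total = 0.0
    for i, a_i in enumerate(a):
        # Cosine similarity to every row of the other view, scaled by temperature.
        logits = [sum(x * y for x, y in zip(a_i, b_j)) / temperature for b_j in b]
        # Numerically stable log-sum-exp for the softmax denominator.
        peak = max(logits)
        log_denominator = peak + math.log(sum(math.exp(l - peak) for l in logits))
        # Negative log-probability assigned to the matching row.
        total += log_denominator - logits[i]
    return total / len(a)


# Toy embeddings: 8 nodes, 16 dimensions (hypothetical data).
random.seed(0)
z = [[random.gauss(0, 1) for _ in range(16)] for _ in range(8)]
noisy = [[x + 0.01 * random.gauss(0, 1) for x in row] for row in z]

aligned = info_nce(z, noisy)           # positives match row by row
misaligned = info_nce(z, noisy[::-1])  # positives deliberately mismatched
print(f"aligned={aligned:.4f}  misaligned={misaligned:.4f}")
```

In this sketch the loss is small when the two views agree row by row and grows when the pairing is broken, which is the sense in which a contrastive term encourages consistent embeddings across views.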