Abstract: Caching popular content on wireless nodes has recently been proposed as a means to reduce congestion in the backbone of cellular networks and to improve Quality of Service. From a network point of view, the goal is to offload as many users as possible from the backbone network to the wireless caches while still offering good service to cache-unrelated users. Aggressive offloading, however, can lead to an unbalanced user association: some wireless nodes become overloaded by cache-related traffic while the resources of others remain underused. Given a fixed content placement, this work proposes an efficient distributed algorithm to control and balance the association of cache-related traffic among cellular cache memories. The algorithm allows the network to achieve the globally optimal solution and can be executed on base stations with only a limited amount of information exchange between them. It is based on a novel procedure we call Bucket-filling. The solution limits the number of cache users per node by balancing the total load among the nodes in a fair way. The improvement over common user assignment policies is highlighted for single-tier as well as multi-tier random networks.