The choice of dictionaries of computational units suitable for efficient computation of binary classification tasks is investigated. To deal with exponentially growing sets of tasks on increasingly large domains, a probabilistic model is introduced. The relevance of tasks for a given application area is modeled by a product probability distribution on the set of all binary-valued functions. Approximate measures of network sparsity are studied in terms of variational norms tailored to dictionaries of computational units. Bounds on these norms are proven using the Chernoff-Hoeffding bound on sums of independent random variables that need not be identically distributed. Consequences of these probabilistic results for the choice of dictionaries of computational units are derived. It is shown that when a priori knowledge about the type of classification tasks is limited, sparsity can be achieved only at the expense of large dictionaries.
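
For orientation, one standard form of the Chernoff-Hoeffding bound for independent, not necessarily identically distributed, random variables is recalled here. The symbols $X_i$, $a_i$, $b_i$, $S_n$, and $t$ are notation assumed for this illustration only, and the exact variant used in the proofs may differ. Let $X_1, \dots, X_n$ be independent random variables with $a_i \le X_i \le b_i$ almost surely, and let $S_n = \sum_{i=1}^{n} X_i$. Then for every $t > 0$,

\[
\Pr\bigl( \lvert S_n - \mathbb{E}[S_n] \rvert \ge t \bigr) \;\le\; 2 \exp\!\left( - \frac{2 t^2}{\sum_{i=1}^{n} (b_i - a_i)^2} \right).
\]

The bound depends on the individual distributions only through the ranges $b_i - a_i$, which is why it applies to sums whose terms are not identically distributed, as is the case under a product probability distribution.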