The importance of surrogate modeling techniques in the design of antenna structures has grown steadily in recent years. Perhaps the most important reason is the high cost of full-wave electromagnetic (EM) analysis of antenna systems. Although full-wave analysis is imperative for reliable evaluation, it entails considerable computational expense. This expense is especially pronounced in EM-driven design tasks such as geometry parameter tuning or uncertainty quantification, both of which require repetitive simulations. Conducting some design procedures, e.g., global search or yield optimization, directly at the level of simulation models may be prohibitive. The use of fast replacement models (or surrogates) may alleviate these difficulties; yet, accurate modeling of antenna structures faces its own challenges. The two major obstacles are the curse of dimensionality, which manifests itself in a rapid growth of the number of training data samples necessary to render a reliable model as the number of antenna parameters increases, and the high nonlinearity of antenna characteristics. Recently, the concept of performance-driven modeling has been introduced, where the modeling process is focused on a small region of the parameter space that contains high-quality designs with respect to the considered performance figures. The most advanced variation in this class of methods is nested kriging, where both the model domain and the surrogate itself are constructed through kriging interpolation. Domain confinement is realized using a set of preoptimized reference designs and allows for significant improvement of the model's predictive power while using a limited number of training data samples. In this work, the constrained modeling concept is coupled with a novel pyramidal deep regression network (PDRN) surrogate, which offers improved handling of highly nonlinear antenna responses.
Three microstrip antenna examples are used to demonstrate the advantages of constrained PDRN metamodels over nested kriging surrogates: the average accuracy is improved by a factor of two without increasing the training dataset cardinality.
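
The domain confinement idea described above can be illustrated with a minimal sketch. It is not the authors' implementation: linear interpolation (`np.interp`) stands in for the kriging-based first-level model, a single performance figure (an operating frequency) is assumed, and the reference designs, the `thickness` parameter, and all function names are hypothetical values chosen for illustration only. The sketch maps a target performance figure to a point on the manifold spanned by the reference designs and then draws training samples only from a thin neighborhood of that manifold, rather than from the full box-constrained parameter space.

```python
import numpy as np


def first_level_model(f_ref, x_ref, f):
    """Map performance figure(s) f to point(s) on the reference-design manifold.

    Assumption: linear interpolation replaces the kriging interpolation
    mentioned in the text. f_ref must be sorted in increasing order.
    """
    f = np.atleast_1d(f)
    # Interpolate each geometry parameter separately as a function of f.
    return np.column_stack(
        [np.interp(f, f_ref, x_ref[:, j]) for j in range(x_ref.shape[1])]
    )


def sample_confined_domain(f_ref, x_ref, n_samples, thickness, rng):
    """Draw training samples from a thin region around the manifold."""
    # Random target performance figures within the range covered by the references.
    f = rng.uniform(f_ref.min(), f_ref.max(), n_samples)
    centre = first_level_model(f_ref, x_ref, f)
    # Lateral extent of the domain, relative to the spread of the references.
    span = x_ref.max(axis=0) - x_ref.min(axis=0)
    offset = rng.uniform(-1.0, 1.0, centre.shape) * thickness * span
    return centre + offset


# Hypothetical reference designs: four antennas preoptimized for
# operating frequencies of 2 to 5 GHz, three geometry parameters each (mm).
rng = np.random.default_rng(0)
f_ref = np.array([2.0, 3.0, 4.0, 5.0])
x_ref = np.array(
    [
        [30.0, 12.0, 1.5],
        [24.0, 10.0, 1.2],
        [19.0, 8.5, 1.0],
        [15.0, 7.0, 0.9],
    ]
)
samples = sample_confined_domain(f_ref, x_ref, n_samples=200, thickness=0.05, rng=rng)
```

The resulting `samples` array would then be evaluated with the EM solver and used to train the second-level surrogate (kriging in the nested kriging framework, or a regression network such as the PDRN). Because the sampled region is far smaller than the original parameter space, the same number of samples covers it much more densely.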