Social norms, including values, beliefs and even perceptions about the world, are created and preserved through repeated interactions between individuals. However, whereas neuro-cognitive research on social norms has used the “unilateral influence” paradigm, which focuses on people’s reactions to extant standards, little is known about how our basic perceptions and judgments are shaped into new norms through bilateral interaction. Here, using a simple estimation task, we investigated the formation of perceptual norms in two experiments coupled with computational modeling. In the behavioral experiment, participants in dyads repeatedly estimated the number of dots on a screen and viewed each other’s answers. In the fMRI experiment, we manipulated the interaction process by pairing each participant with a computer agent that either adjusted its estimates reciprocally in response to the participant’s estimates (bilateral agent) or did not (unilateral agent). The results indicated that only the bilateral interaction yielded convergence, within a pair, of participants’ covert psychophysical functions (relations between subjective estimates and the actual number of dots) as well as of their overt behavioral responses. Bilateral interaction also increased the stability (reliability) of the covert function within each individual after interaction. Neural activity in the mentalizing network (right temporoparietal junction and dorsomedial prefrontal cortex) during interaction modulated the stabilization of the psychophysical function. These results imply that bilateral interaction helps people to cognitively anchor their views to each other’s. Such spontaneous perspective sharing can yield a shared covert “generative model” that enables endogenous agreement on totally new targets, one of the key features of social norms.