<p>A voice conference device detects the direction of a talker precisely and picks up the voice from that direction with a high S/N ratio. A detection beam generating unit (811) performs delay-and-sum processing on the sound pickup signals (SS104)-(SS113) of the microphones (MIC104)-(MIC113), which are densely arranged at the central portion of the array along its arrangement direction, to generate detection sound pickup beam signals (MB101)-(MB114). An output beam generating unit (812) performs delay-and-sum processing on the sound pickup signals (SS101)-(SS116) of all microphones (MIC101)-(MIC116) arranged in that direction to generate output sound pickup beam signals (MB101')-(MB114'). A sound pickup beam selecting unit (19) detects the direction data (MS) corresponding to the detection beam signal with the highest signal intensity among (MB101)-(MB114) and supplies it to an output beam selecting unit (813). The output beam selecting unit (813) selects the output sound pickup beam signal corresponding to the direction data (MS).</p>
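<p>The two-stage scheme described above (detect the direction with the dense central microphones, then output the beam formed from all microphones) can be illustrated with a minimal Python sketch. It is not the device's actual implementation: the microphone positions, sample rate, beam angles, whole-sample delay rounding, mean-square intensity measure, and every function and constant name below are assumptions chosen for illustration only.</p>

```python
import math

SPEED_OF_SOUND = 343.0   # m/s
SAMPLE_RATE = 48000      # Hz (assumed; not given in the source)

# Hypothetical 16-microphone line array (MIC101..MIC116), positions in metres:
# ten densely spaced central microphones, three sparser ones on each side.
MIC_POSITIONS = ([-0.21, -0.17, -0.13]
                 + [-0.09 + 0.02 * k for k in range(10)]
                 + [0.13, 0.17, 0.21])
DETECT_IDX = list(range(3, 13))  # MIC104..MIC113, the dense central group

# Hypothetical steering angles for the 14 beams (MB101..MB114): -65..+65 degrees.
BEAM_ANGLES = [math.radians(-65 + 10 * k) for k in range(14)]


def steering_delays(positions, angle_rad):
    """Whole-sample delays that align a plane wave arriving from angle_rad."""
    scale = math.sin(angle_rad) / SPEED_OF_SOUND * SAMPLE_RATE
    delays = [round(x * scale) for x in positions]
    base = min(delays)
    return [d - base for d in delays]  # shift so every delay is >= 0


def delay_and_sum(signals, delays):
    """Advance each channel by its delay and average the aligned samples."""
    n_out = min(len(s) - d for s, d in zip(signals, delays))
    return [sum(s[n + d] for s, d in zip(signals, delays)) / len(signals)
            for n in range(n_out)]


def intensity(beam):
    """Mean-square value, used here as the signal-intensity measure."""
    return sum(v * v for v in beam) / len(beam)


def select_output_beam(signals, positions, angles, detect_idx):
    """Return (direction index MS, selected output beam)."""
    # Detection beams: central microphones only (role of unit 811).
    det_sig = [signals[i] for i in detect_idx]
    det_pos = [positions[i] for i in detect_idx]
    det_beams = [delay_and_sum(det_sig, steering_delays(det_pos, a))
                 for a in angles]
    # Direction data MS: index of the strongest detection beam (role of unit 19).
    powers = [intensity(b) for b in det_beams]
    ms = max(range(len(angles)), key=powers.__getitem__)
    # Output beams: all microphones (role of unit 812), then pick MS (unit 813).
    out_beams = [delay_and_sum(signals, steering_delays(positions, a))
                 for a in angles]
    return ms, out_beams[ms]


def simulate_plane_wave(source, positions, angle_rad):
    """Test helper: delayed copies of `source` as each microphone would see it."""
    delays = steering_delays(positions, angle_rad)
    return [[0.0] * d + source[:len(source) - d] for d in delays]
```

<p>To try the sketch, feed <code>simulate_plane_wave</code> a broadband test signal (for example, seeded pseudo-random noise) arriving from one of <code>BEAM_ANGLES</code> and pass the resulting channels to <code>select_output_beam</code>. The split into a detection stage and an output stage mirrors the text: only the densely spaced microphones take part in direction detection, while the selected output beam uses the full array.</p>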